(IDE) Make Inform more SCM compatible *VW*
I can't help it ... I'm a software engineer first and an IF author much, much, much further down the list. I want to keep my Inform project in SCM (Source Code Management). I use Git, which copes with many of the usual problems (spaces in the file names, etc.).
However, it's a little hit-or-miss figuring out which files are (re)generated from the source, and which are not. Some guidelines would be nice, and if all generated files were put in a subdirectory, that would be even better.
The biggest challenge is the Skein. It's stored as a single file, which means that it gets changed on each execution of the game. This would work better if there were two files: a temporary file (not for SCM) that contains recent runs, and a blessed file that contains the portions of the tree that have been locked or blessed.
I have to say, I like being able to see the history (hits and misses) of my IF over time ... just like with my other program source code.
So the next version of I7 is going to make this hugely easier, but I would like to say what I think the ideal would be for a complicated project like Kerkerkruip. Kerkerkruip has its own extensions, as well as ATTACK, and some updated regular extensions. We track all of these separately in three git repositories. It would be nice to use git submodules to include them all automatically in Kerkerkruip's Materials/Extensions folder, but they are overlapping. So the ideal for us (and probably only for us) would be if Inform would look for extensions directly in the Extensions folder and another level down. We could then have Materials/Extensions/extensions be a submodule loading the i7/extensions repository with all the Author name folders beneath it. But as I said it's probably overkill for everyone other than us, so I have no expectation that it should be implemented.
I've been using raw story.ni (and extension.i7x) files in Git for a while now, and it's worked acceptably. The problem with word-wrapping is that most changes will rewrap the line, so you wind up with a diff for the entire block of text either way.
Your scheme will work better for tables that contain very long lines, though.
Coming back to this, I did build a simple tool this week to do a simple conversion back and forth between the story.ni format and what I'm referring to as a "stanza" form (because it looks more like poetry than the typical "prose" story.ni). It's a fairly simple transformation: original newlines get decorated with pilcrows, word wrap at 72 characters, add newlines before tabs, split into multiple files at header lines. However, it will look a lot better to a source control system (smaller files with shorter lines should lead to better diffs and merges).
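For anyone who wants to play with the idea, here is a minimal sketch of the first two steps only (pilcrow-marking the original newlines and wrapping at 72 characters). It is not the code from the blog post: the names `to_stanza` and `from_stanza` are made up for illustration, and it assumes the source contains no pilcrow characters of its own. It leaves out the newline-before-tab step and the splitting into multiple files.

```python
PILCROW = "\u00b6"  # marks a newline that was present in the original source


def to_stanza(source, width=72):
    """Mark every original newline with a pilcrow, then wrap long lines.

    A line is only ever broken by replacing a single space with a newline,
    so the transformation can be undone exactly. Assumes the source itself
    contains no pilcrow characters.
    """
    out = []
    for line in source.split("\n"):
        line += PILCROW
        while len(line) > width:
            # Prefer the last space that keeps the piece within the width.
            cut = line.rfind(" ", 1, width + 1)
            if cut == -1:
                # No such space: break at the next space, if there is one.
                cut = line.find(" ", width + 1)
            if cut == -1:
                break  # one very long word; leave it as it is
            out.append(line[:cut])
            line = line[cut + 1:]
        out.append(line)
    return "\n".join(out)


def from_stanza(stanza):
    """Undo to_stanza: pilcrow lines end a source line, others rejoin with a space."""
    parts = []
    for line in stanza.split("\n"):
        if line.endswith(PILCROW):
            parts.append(line[:-1] + "\n")
        else:
            parts.append(line + " ")
    # Every source line gained a newline above, including the last; drop that one.
    return "".join(parts)[:-1]
```

Because tabs and leading whitespace are never touched, indented rule bodies and tables survive the round trip unchanged; only long prose lines get broken up.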
At the moment I've just posted the Python source to my blog: http://blog.worldmaker.net/2013/sep/11/musdex-inform-7-stanza-handler/
You can work with it standalone from a Python console, but it also works as an extension to my tool called musdex (http://pythonhosted.org/musdex/) which you can set up to auto-extract and auto-combine in association with your source control system.
To use this extension with musdex, ``pip install musdex``, then copy the code from my blog into i7stanza.py, place that in your site-packages (or another PYTHONPATH-accessible location), and use the --handler flag of the add command: ``musdex add --handler i7stanza.I7StanzaHandler story.ni``.
It's a "quick hack" project this week, but it seems to do everything I expected it to do. It doesn't solve the Skein XML or extensions, but maybe it is enough of a place to start to figure out what the next steps might be.
David Cornelson commented
Since this suggestion was introduced, services like SkyDrive, DropBox, and iCloud now allow anyone to store files in a central repository and some of them can track changes. It would seem to me that the IDE should simply get out of the way of file handling entirely. If I7 has a group of file types it knows and how it uses them, that logic should be open to configuration by the user. So a list of project folder locations. Then for each project, a configuration of where to find extensions and where to put results.
Of course if we had an extension packaging system similar to nuget or apt-get, that would be extremely helpful as well. The only question is whether existing extensions have "requires" statements properly stated. There's a lot of work behind something like this I suppose.
Personally, I'd like to be able to simply configure paths to projects and then within projects define where to look for extensions and where output should be stored. That alone would allow me to do what I want to do. Of course I'm a programmer, not a writer, so I can see how a writer might view this process entirely differently.
"Could I put an Inform7 project under git, but use .gitignore to ignore everything but story.ni in the Source directory?"
Yes, this is what I do. (You should also preserve the uuid.txt file, which contains the IFID string.)
Ryan Collins commented
Could I put an Inform7 project under git, but use .gitignore to ignore everything but story.ni in the Source directory?
What I'm also thinking is some sort of script that would allow you to break story.ni up into several files for collaboration under version control, and would reassemble those files to story.ni to run from Inform7.
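A rough sketch of what such a script could look like, assuming the split points are lines that begin with one of Inform 7's heading words (Volume, Book, Part, Chapter, Section). The function names and the numbered `.ni` file naming are invented for this example. The heading test is deliberately naive: an ordinary sentence that starts with "Part of ..." would also trigger a split, which is harmless for reassembly but gives an odd file boundary.

```python
import re
from pathlib import Path

# Assumption: a heading is any line that starts with one of these words.
HEADING = re.compile(r"^(Volume|Book|Part|Chapter|Section)\b", re.IGNORECASE)


def split_source(text):
    """Split source text into chunks, starting a new chunk at each heading line."""
    chunks = [[]]
    for line in text.splitlines(keepends=True):
        if HEADING.match(line) and chunks[-1]:
            chunks.append([])
        chunks[-1].append(line)
    return ["".join(chunk) for chunk in chunks]


def join_source(chunks):
    """Reassemble the chunks into a single story.ni text."""
    return "".join(chunks)


def write_chunks(chunks, directory):
    """Write each chunk to a zero-padded, numbered file so sorting restores order."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for index, chunk in enumerate(chunks):
        (directory / f"{index:04d}.ni").write_text(chunk, encoding="utf-8")


def read_chunks(directory):
    """Read the numbered chunk files back in order."""
    paths = sorted(Path(directory).glob("*.ni"))
    return [path.read_text(encoding="utf-8") for path in paths]
```

A real tool would also need to clear out stale chunk files when a heading is removed, and would probably want file names derived from the heading text so that inserting a chapter does not renumber everything after it.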
Kevin Norris commented
Adding my vote for extensions support. Right now it's a major pain to work with extensions and SCM. hg really requires each project to have its own directory, and I would guess other SCMs work similarly. The Gnome IDE also needs to better support things changing behind its back (esp. extensions).
Sarah Smith commented
Like 3-4 of the other commenters here, I think the Extensions are the first issue to resolve for version control. In my view fixing that would provide the biggest improvement to SCM compatibility, at the smallest implementation cost.
I have created a set of extensions for my game, and on my Mac machines (laptop and desktop) those get stored in a per-user folder under ~/Library
I tried symlinking the "Sarah Smith" directory from my SCM-controlled project into ~/Library/Inform/Extensions/, but that gave Inform a real headache. It produced a System Error 2, and was not able to compile at all.
The simplest thing that could possibly work would be to have some sort of configurable "extensions path" which could be added to the per-project settings. So alongside setting whether or not it was a glulx or z-code project, have a field for "extensions path", with a "browse" button on the UI. When the compiler is resolving extensions it searches this path first, before looking in the ~/Library/Inform/Extensions/. hierarchy.
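To make the lookup order concrete, here is a small sketch of that resolution rule. It says nothing about how the compiler actually resolves extensions: `find_extension` is a hypothetical helper, and the `Author/Title.i7x` layout under each root is an assumption based on the author-name folders described in this thread.

```python
from pathlib import Path


def find_extension(author, title, search_path):
    """Return the first matching extension file found on search_path, or None.

    search_path is an ordered list of root folders; earlier entries win, so a
    per-project folder listed first shadows the shared per-user folder.
    Assumes each root is laid out as <root>/<Author>/<Title>.i7x.
    """
    for root in search_path:
        candidate = Path(root) / author / f"{title}.i7x"
        if candidate.is_file():
            return candidate
    return None
```

With a search path such as `[project_extensions, Path.home() / "Library/Inform/Extensions"]`, a copy kept in the project (and therefore in SCM) would be picked up before the per-user one, and anything not found in the project would still fall back to the shared folder.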
David C commented
I've circled back to this problem and I think I know of a fairly simple solution.
Instead of coming at it from an SCM perspective, maybe altering the way the GUI sees files is the better way to go...
The easiest way to look at this is that there is a physical set of files. This includes the project and the relevant extensions. What if we added a logical layer to the project...like a sort of project index. The index would contain the physical locations for all the parts of that project. So if I select My Project, the GUI first looks up where the project is, where the related extensions are located, and then handles everything accordingly.
In this way, the user is allowed to map projects and extensions in any way we wish, with any source control we wish. I7 wouldn't care...it just needs to know where those two things are (project folder and extension folder).
I'm not at all sure if this is a Mac compatible concept, but it would certainly work for Windows.
As I said, please add a new suggestion to handle integration if you think it's important. It's a much more involved and complex issue than perhaps you realise.
However, this first step of making the project folder play nice with SCM tools is very manageable and is also a necessary step towards what you are after. Adding extra features to this basic issue will only make it less likely to be implemented, which is why you need to log a separate issue.
Note that a lot of people are very comfortable using SCM tools from the command-line, or via a standalone GUI interface (e.g. Tortoise) and for all these people sorting out the project store would make things instantly usable.
I should also note that, in my opinion, Eclipse-style integration is outside the scope of the Inform project, due to the huge number of different SCM systems out there (SVN/CVS/Git/Mercurial/etc.). To spend time writing and maintaining tools to integrate with each of the popular SCM systems would take far too much development time away from the core project aims.
What might be within the scope of Inform would be to provide the necessary hooks to allow third-party plugins that provide this functionality, but again, that's a separate issue.
Johan Paz commented
Without real integration between the SCM and the IDE, that will not be so useful.
> And please, some kind of integration between SVN and the IDE?
I think that is too specific and tangential a tool to be appropriate for the current Inform IDE. If Inform played nice with source control, then it's easy enough to use other existing tools to do the actual commits/updates/etc.
That said, if you feel strongly about integrating SVN then by all means create a new suggestion to deal with that. This one is purely about being compatible with SCM, not to integrate the toolset.
Johan Paz commented
And please, some kind of integration between SVN and the IDE?
Personally, I already use an SCM (SVN) for my Inform work, and the source code is not a problem at all. Perhaps this is because I use a graphical diff tool (WinMerge) which highlights changes at the word level (or even character level) rather than the line level. From my point of view, automatic re-wrapping would introduce problems where currently there are none, as it would mean that inserting a few words at the start of a paragraph would change each of the subsequent auto-wrapped lines, thus showing a lot of unrelated diffs. Currently I would just be shown the changes I had actually made. Andrew Plotkin has already made this point, but I just wanted to back it up with a real-life example.
The things that make SCM a *real* pain have all been mentioned, but just to re-iterate:
* Problems with unexpected diffs between skein revisions, largely due to Node IDs changing with each save.
* Multiple directories for each project (i.e. the main story folder and the separate 'materials' folder).
* Lack of clear guidelines about what needs SCM and what is regenerated automatically.
* Co-mingling of files that need to be source-controlled (story file, skein, settings, etc.) and files that don't because they will always be regenerated (story index, compilation output, metadata, etc.). This one isn't a show-stopper, but it would make setting up new projects easier if we could just ignore a folder, rather than manually specify a whole bunch of items to ignore.
* Centralised extensions are a problem, as it means that (a) versioning of extensions is kept separate from versioning of story files, and (b) you can't have multiple versions of the same extension in different projects.
There have also been reports of entire directories being deleted and re-created, which would completely break SVN. I've not experienced this myself (the report said it was a Mac issue), but if it is really happening, it is probably the most serious issue of all.
Finally, there is one other problem which I suspect can't be resolved, which is that the compiler version can have an important impact on building the project, therefore if you need to re-compile an old project for any reason, you need to either roll back to the version of Inform you used to compile it previously, or make whatever changes are necessary to bring the project up-to-date, and do a full re-test. My solution to this is to version the entire Inform7 program directory (minus the large documentation folder and a few other non-relevant items) so I can easily roll back to any given point of time if a recompile is necessary. A solution might be to include older compiler versions in the I7 distribution, and ask whether to use the old version or the latest when compiling an older project.
Well, linebreaks are sensitive in reStructuredText in many of the same ways that they are in I7: for table formatting and header declarations, and things like that. Any "automatic" wrapping/unwrapping tool in I7 should be able to use the parser (or a minimal subset of the parser) so that it knows that it can only wrap/unwrap paragraphs and not change "hard linebreaks".
I realize that paragraph reflow may not be the most efficient solution, but it is no less efficient than the current long lines in the worst case, and in the best cases it can be more efficient.
If we want to talk about the most efficient techniques, I think the most efficient storage means for I7 would be a format that puts no more than one sentence per line, and possibly as fine grained as one clause per line. In such a storage scheme you'd probably want explicit paragraph markers like HTML's P tag or TeX's \para or something.
Also, there are a few standardized tools out there for XML formatting like xmllint (which I just remembered that I've used to various degrees of success in combination with my musdex tool for source controlling zip archives). Such a tool could be bundled or used as a shared library for cross-platform formatting needs.
Speaking of musdex, now that my mind's on the topic, it would be possible to create custom handlers for it to experiment with potential storage formats in musdex's existing source-control aware transformation engine, before deciding if any of those approaches is right for the IDEs to be aware of them.
I am sympathetic to the "long lines are awkward with source control" problem, but my guess is that automatic wrapping/unwrapping would be error-prone and would not gain you that much. I7 source (unlike Restructured) *does* care where line breaks are. Tabs and indentation are also significant, so the wrapping/unwrapping algorithm would have to be very precise to avoid breaking things; breakage might be inevitable. (And then, if you edited the file in another editor -- which is a related request -- it might not maintain the correct formatting guarantees.)
Plus, if you edit more than a few words in a long paragraph of text, it winds up rewrapped -- so the SCM would see the entire section as different *anyhow*. That's why I say it might not gain you much.
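The rewrapping effect is easy to measure. The following sketch (the helper name `changed_lines` is invented here) wraps a paragraph before and after an edit using Python's standard textwrap module and counts how many wrapped lines a line-based diff would report as added or removed.

```python
import difflib
import textwrap


def changed_lines(before, after, width=72):
    """Count the wrapped lines a line-based diff reports as added or removed."""
    old = textwrap.wrap(before, width)
    new = textwrap.wrap(after, width)
    diff = difflib.ndiff(old, new)
    # ndiff prefixes removed lines with "- " and added lines with "+ ";
    # its "? " hint lines and unchanged lines are not counted.
    return sum(1 for entry in diff if entry.startswith(("- ", "+ ")))
```

Calling it with a multi-line paragraph and the same paragraph with a few words inserted at the front will typically report most of the wrapped lines as changed, whereas an edit near the end of the paragraph touches only the last line or two.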
I've used SCM with other line-wrapping IDEs (Xcode, for example) and the inefficiencies really are minor.
XML formatting is a good point. (That's a factor for Xcode too -- its project files are all XML.) I agree that it's worth normalizing as much as possible. If Mac and Windows XML toolkits normalize differently, well, that only affects a handful of cross-platform users and they're still no worse off than they are today.
Something that concerned me when working with Inform in my SCM of choice (darcs) and related tools (I built an Inform 7 lexer for Pygments as a toy: http://blog.worldmaker.net/2009/feb/17/code-snippet-moment-inform-7-lexer-pygments/) is the way that lines are stored in I7 files. I like the proportional font approach and the word wrapping as it exists in Inform's editor, but it doesn't play well with SCMs in particular. Most SCMs use line-based diffs for several key pieces of functionality. (Darcs is patch-oriented and provides some nice behaviors when patches are well-formatted, which makes it all the more obvious how much it matters how source code lines are stored, but every SCM I know of is susceptible to this to some degree.)
Which is to say that most SCMs today expect source code to be stored with plenty of line breaks and a relatively low number of columns per line. This is because most of the diff, merge, and patch tools that SCMs rely on (to varying degrees) come from the old "fixed-width" terminal paradigm, for the most part.
I7's files don't particularly "break" this approach, but they do eventually tend to inflame various inefficiencies in modern SCMs. These are the sorts of things that tend to particularly be noticeable when multiple people are touching the same file (merge problems, conflict problems, and the like).
One suggestion to the IDEs that should be relatively "easy" to accomplish, and may help SCMs immensely, would be to save the ASCII output with paragraph wrapping at, say, column 72. The IDEs would wrap long paragraphs when saving, but would be free/encouraged to unwrap them back on load to allow the standard GUI wrapping to do its job. (I use paragraph wrapping like this in working with reStructuredText files in SCM: my editor wraps columns to 72 characters when saving/editing, and reST ignores newlines inside a paragraph...)
I do agree that global extensions can be a version management problem with SCM development. (Even traditional programming languages have over time seemed to trend away from global extensions and search paths.) Another suggestion, in addition to the other good ones here, would be to add a manifest file (or perhaps use/improve the existing manifest) that would not only describe which versions of which extensions are in use, but would provide install URLs for them such that Inform on loading a project might check for new/changed extensions and prompt/auto-install them.
As for the skein: storing XML in an SCM can often be a fragile/tough thing to do. The best advice is generally to make the XML as "stable" as possible: normalize and format it consistently; change as few of the nodes as possible on every save; change as few of the existing lines in the saved document as possible. Unfortunately this can be tough, particularly if using different platforms' XML libraries. There are some things that can be done at the data-structure level that might be tried to help assuage this. Because it is a file meant for usage with a tool and with lots of auto-generated pieces, you may never get the Skein to be the most well behaved in an SCM. On the other hand, because it is mostly worked with by tools and semi-automated procedures, there should theoretically be a lot of flexibility to change and affect the storage process (and subsequently the SCM footprint) in future versions.
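As one illustration of "normalize and format it consistently", here is a sketch using only Python's standard library (it needs Python 3.9 or later for `ET.indent`). The helper name `stable_xml` is made up, and nothing here is specific to the real Skein schema. Note two caveats: `strip_text=True` discards leading and trailing whitespace in text content, which may matter for transcripts, and documents that use XML namespaces will come out with generated prefixes.

```python
import xml.etree.ElementTree as ET


def stable_xml(xml_text):
    """Return a consistently formatted version of an XML document.

    Canonicalization gives a fixed attribute order and quoting style; the
    re-indent step then puts one element per line so that line-based diffs
    stay small. Caveat: strip_text=True trims whitespace around text content.
    """
    canonical = ET.canonicalize(xml_text, strip_text=True)
    root = ET.fromstring(canonical)
    ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode") + "\n"
```

Run as a pre-commit step, two saves that differ only in attribute order, quoting, or indentation would produce identical output, so the SCM would see no change at all.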
David Cornelson commented
I've recently been reorganizing Textfyre source and find the entire source layout within I7 enormously painful. For one, extensions in I7 are much more fluid than in other programming projects. Coding IF has always had a sort of unwritten rule that if things don't work the way you like them, you just crack open the library or extension code and change it per-game. With I6 I would start with a basic library and then keep that library for each game I was working on. I think I7 extensions start being "global", but to assume they'll stay that way is a mistake. We should assume extensions will be modified by the author on a game by game basis. If they get out of sync with the official extension release and version, so be it.
So my recommendation would be to have a central storage for "original" extensions, but when an extension is included in a game, it should be copied to an extension folder within the game folder. There could be some reporting on comparisons, but that would be an additional feature. The issue with the skein isn't as important on Windows or with Subversion since we can set ignore to things we're not concerned with (I never use the skein and am actively working on an external unit testing tool).
Distinguishing un-changed original extensions as a sort of Inform7 Library and then placing per-game extensions in a game folder seems like a decent way to resolve some of the problems.
AdminAaronReed (Admin, Inform 7) commented
Another issue with making I7 more SCM compatible is the requirement that extensions are stored in a single system-wide "extensions" folder... as I understand it (and like Graham, I also am not as familiar with this technology as I ought to be) this means that each time you check out or check in an extension, you'd need to manually copy it between your SCM project directory and your system's extension directory. Ideally, you could keep project-specific extensions in the same folder as the main I7 project file, to avoid this step.
One of the issues with the skein is that the IDE (at least the Windows one) likes to change all the nodeId values every time it gets saved. So even though there are no (or only a few) logical changes to the skein, a revision control system sees one change every 10 lines or so throughout the entire file. This makes it difficult to have multiple developers using the same repository and merging their skein files.
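Until the IDE stops doing that, one possible workaround is to renumber the IDs deterministically before each commit. The sketch below is a guess at how that might work, not a tested tool for real Skein files: it assumes the IDs appear as `nodeId="..."` attributes, that references to a node are other attributes holding the same value, and that nodes are written in the same order on every save. It works on the raw text with regular expressions so that the rest of the file's formatting is left alone.

```python
import re

# Assumption: node identifiers are declared as nodeId="..." attributes.
NODE_ID = re.compile(r'nodeId="([^"]*)"')
# Any double-quoted attribute value, so references to a node are rewritten too.
ATTR_VALUE = re.compile(r'(=\s*")([^"]*)(")')


def renumber_node_ids(xml_text):
    """Replace node IDs with stable ones numbered by order of first appearance."""
    mapping = {}
    for old in NODE_ID.findall(xml_text):
        if old not in mapping:
            mapping[old] = f"node-{len(mapping) + 1:04d}"

    def swap(match):
        value = match.group(2)
        return match.group(1) + mapping.get(value, value) + match.group(3)

    return ATTR_VALUE.sub(swap, xml_text)
```

If two saves differ only in the IDs the IDE happened to generate, both renumber to the same text, so the diff collapses to the real changes. Being regex-based, it can be fooled by text content that happens to look like an attribute, so it is best treated as a starting point.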