The version control system I want, and might build someday
I have a number of short stories and one novel in various states of unfinishedness. I write on the ’puter, because Emacs is faster than longhand. I like to keep old versions around, because then I can be merciless in my pruning: if I decide that 1,000 words in the middle just don’t work, I can chop them out; those 1,000 words are saved in an earlier version, and I know I can put them back if I change my mind a few months down the road. In practice, I have never gone back and put the pruned text back in, but knowing that I can gives me the confidence to delete it.
The tool I’m using right now to keep track of those versions is Perforce. Perforce is designed for large software projects, and makes a number of design decisions which are the right decisions for large software projects with a large number of developers, but which are not necessarily ideal for a single person keeping track of a relatively small set of files.
If only one user will be using the system and the files being edited are English text, then optimizing for disk space and performance becomes less important (the complete works of Shakespeare take up about 5 MB; the King James Version of the Bible, about 4 MB). To ensure that versions and information about changes will be accessible in the future, (a) metadata about changes–when the change was made, whatever comments I put in about the change, &c.–would be stored in a widely used, human-readable format, e.g., XML, as opposed to in a database*; and (b) each version would be stored in full as its own file, as opposed to as a list of changes**.
The goal is that the entire set of versions and metadata about the changes would be human-readable: not just readable to the extent that, years after the software which created it is long gone, a human could write a program which could turn it into something which a human could interpret, but readable to the extent that a human being armed with a text editor or word processor and nothing else could make sense of it. This makes one as certain as one can be that the versions and metadata will be accessible in 15 or 20 years.
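A minimal sketch of that storage scheme, in Python, might look like the following. Everything named here is made up for illustration: the function save_version, the v0001.txt numbering, and the XML element names (change, file, version, date, comment) are assumptions, not a design I have settled on. Each call copies the whole file into the store and writes a small XML file beside it.

```python
import shutil
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from pathlib import Path


def save_version(store: Path, source: Path, comment: str) -> Path:
    """Copy `source` in full into `store` and describe the change in XML.

    Hypothetical layout: store/<file name>/v0001.txt plus v0001.xml.
    """
    doc_dir = store / source.name
    doc_dir.mkdir(parents=True, exist_ok=True)

    # The version number is just a count of the full copies already stored.
    number = len(list(doc_dir.glob("v*.txt"))) + 1
    copy_path = doc_dir / f"v{number:04d}.txt"
    shutil.copyfile(source, copy_path)  # whole file, not a diff

    # Metadata goes in a plain XML file that any text editor can open.
    change = ET.Element("change")
    ET.SubElement(change, "file").text = source.name
    ET.SubElement(change, "version").text = str(number)
    ET.SubElement(change, "date").text = (
        datetime.now(timezone.utc).isoformat(timespec="seconds")
    )
    ET.SubElement(change, "comment").text = comment
    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9 or later
        ET.indent(change)
    ET.ElementTree(change).write(
        copy_path.with_suffix(".xml"), encoding="utf-8", xml_declaration=True
    )
    return copy_path
```

Calling save_version(Path("store"), Path("novel.txt"), "cut the middle chapter") a few times would leave a directory of numbered full copies, each next to an XML file saying when and why it was saved. Reading them back requires no tool at all.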
I have other goals: it would be nice if the server (the part of the system which keeps track of the versions) and client (the part that the user interacts with) were separate, and the server could run either on one’s own machine or on a separate machine; it would be nice if the client and server could run on Mac OS X, Linux, and Windows; it would be nice if the UI were a little less complex than the Perforce client’s; it would be nice if I could easily get Emacs to act as a client, rather than launching a separate app. The primary goal, though, is that it be future-proof.
* Perforce stores its metadata about changes in a database format which (a) is proprietary and (b) differs between versions of the server and between processor architectures.
** Perforce stores all the versions of a file in a single file, in a format which is open (RCS) and which looks as though it would not be difficult to reverse-engineer even if it were not open. However, if space is not a priority and reading the files in the future is, it makes sense to remove even light obstacles to reading the versions.