TeX Live VCS History and Statistics – Perforce, Subversion, Git

TeX Live is a project with a long history, starting back in the 1990s with CDs distributed within user groups and continuing to today's net-based distribution and updates. Discussion about using a VCS started very early, in 1999. This post recalls a bit of the VCS history of TeX Live, and reports on the current status of the Subversion and Git (svn mirror) repositories.

Perforce

Sebastian Rahtz, then editor of TeX Live, asked in Nov 1999 about using a VCS, in particular Perforce. The proposal drew some opposition from those favoring an open source VCS like CVS, but despite this a Perforce server was set up at Dante in Feb 2000. From then on Perforce was used, and in May 2001 Sebastian announced the 1000th change to the Perforce repository.

Switch to Subversion

In 2004 the source part was checked into CVS at Berlios, and remained there for a long time, but was not actually used for development. In Feb 2004 there was a short discussion about moving to Subversion and some hosting service (Sarovar or Sourceforge), but this was postponed until after the release of TL2004. After the release, Karl Berry started to move from Perforce to CVS on Sarovar in Jan 2005; the experience was rather bad and led to the decision to set up a Subversion server on tug.org.

The first Subversion check-ins of TeX Live appear around Jan 2006, with timing tests and slow progress in getting everything into the Subversion repository. The final announcement of the move to Subversion came in Mar 2006.

Since then Subversion has been our main VCS, and most of our scripts for network distribution etc. require Subversion, in particular its strictly increasing revision numbers.

Git mirror

Although I had used Subversion for many years and in most of my projects, the appearance of Git and its ease of use made me switch practically all of them over. With Perforce I hated that one needed a server connection to commit anything (a no-go on trains etc.); with Subversion I hated that branching was a PITA and costly. With Git I could finally branch easily, develop new features, and merge them later.

Over time I felt the need to use Git for TeX Live development as well, and made a git-svn checkout which I used privately for TeX Live Manager development. Several attempts to push the repo (about 30 GB) to GitHub or other hosting services were rather unsuccessful, so in March 2017 I pushed the repo to my own server and announced it (the URLs there are wrong, though!).
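
A side note for readers unfamiliar with git-svn: by default it appends a `git-svn-id:` line to every commit message, recording the Subversion URL, the revision number, and the repository UUID. So the Subversion revision stays recoverable from the git side of such a checkout. Here is a minimal sketch of reading it back, assuming the default metadata was kept; the function name `svn_rev_from_message` is made up for illustration:

```shell
# Print the Subversion revision recorded in a git-svn commit message.
# Reads the commit message on stdin; prints nothing if there is no
# git-svn-id line (for example when metadata was disabled).
svn_rev_from_message() {
    # Line format written by git-svn:  git-svn-id: <URL>@<revision> <UUID>
    sed -n 's/^git-svn-id: .*@\([0-9][0-9]*\) .*$/\1/p' | tail -n 1
}
```

Inside a git-svn checkout, `git log -1 --format=%B | svn_rev_from_message` would print the revision of the current commit, and adding `-- some/path` to the `git log` call would give the revision of the last commit touching that path.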

Recently I was asked to also include all the branches and tags from the Subversion history. This was a bit of a pain to set up, but in the end the git repository now has branches for the branches in Subversion, and the releases are properly tagged back to TeX Live 2007. The history is the same as in Subversion, going back to 2005. Unfortunately it seems that the Perforce history is lost.

So here they are: web interface, anonymous git checkout

Statistics

Some numbers to close off this post: we are currently at around 47000 commits, from 2005 to today. The busiest committer is Karl Berry, who does the biggest bunch of work updating packages from CTAN, which accounts for most of the changes in total.

An interesting detail is that the number of commits per year is actually decreasing, with the top year being 2008, when the new TeX Live infrastructure and TeX Live Manager were introduced.

More stats generated from the git repository on 20180110 can be found at https://www.texlive.info/tlstats/.

Closing

Although I am very happy with the current setup of git-svn and the git repository, git will not replace Subversion in the foreseeable future, due to the reliance of most of our distribution scripts on the strictly increasing revision numbers of Subversion. I have done some work to support both git and Subversion, but that work is far from complete, and as long as nobody shows interest I guess it will not happen.
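
To give an idea of what is involved: besides a global counter, the scripts need, for each file, the revision in which it last changed. On a linear history, such as the one git-svn produces, one way to get a Subversion-like number is to walk the log once from oldest to newest and remember, per path, the position of the last commit that touched it. The following is only a sketch under those assumptions, not the code used for TeX Live; the function name and the `commit ` marker are chosen for illustration:

```shell
# Read the output of
#   git log --reverse --format='commit %H' --name-only
# on stdin and print "<counter><TAB><path>" for every path, where <counter>
# is the 1-based position of the last commit that touched the path.
# Assumes no path starts with the marker "commit ".
last_change_counters() {
    awk '
        /^commit / { n++; next }   # a new commit: advance the counter
        NF == 0    { next }        # skip blank separator lines
        { last[$0] = n }           # newer commits overwrite older entries
        END { for (p in last) printf "%d\t%s\n", last[p], p }
    '
}
```

A single pass like this avoids running one `git log` per file, which matters with hundreds of thousands of files. The output order is unspecified, and the counters only agree between checkouts if everybody counts from the same root commit.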

This shows that, depending on the usage and integration, a distributed VCS like git is not necessarily always the optimal solution. Bridging the two systems so that they work nicely together helps, but I still need to keep my Subversion checkout available for emergency cases.

3 Responses

  1. smcv says:

    `git describe` can give you a strictly increasing counter by counting commits since some old date, if that helps? For example in dbus.git:

    % git show --pretty=oneline 47229f733179ed2f9f5e8baffe855f09530ff94f
    47229f733179ed2f9f5e8baffe855f09530ff94f New repository initialized by cvs2svn.
    % git tag the-first-commit 47229f733179ed2f9f5e8baffe855f09530ff94f
    % git describe --tags --match the-first-commit
    the-first-commit-5257-g8bd8f200
    % git describe --tags --match the-first-commit | sed -e 's/the-first-commit-//;s/-.*//'
    5257
    5257

    Not exactly equivalent to a svn revision, but maybe good enough?

    • Hi Simon,
      Yes, I know this method and have already implemented our revision computation mechanism on top of it, but it is still shaky, and we need to find the last-change revision for hundreds of thousands of files, which complicates things a bit (while in svn it is trivial).
      It is doable, as I said, and I have started to work on that, but there are too many scripts that need to be checked, fixed, adapted …

