I think the process of deploying code at the Wikimedia Foundation is interesting, a little because it's part of the process that keeps Wikipedia going and a little because it's an example of an organically grown process that has become reasonably efficient. I mean organic in that there isn't an overall plan for how everything should be deployed. No architecture.

Even without a plan there is consistency. All production deployments follow a pattern:

  1. Merge code to a magic branch in a git repository
  2. Take some manual action to make that code available to the servers

For MediaWiki that magic branch is a "deployment", aka "wmf", branch. These branches are named like so: 1.26wmf14. This doesn't map to semver in any way. The 1.26 is incremented every six months and the wmf14 is usually incremented every week. 1.26 is a major release and wmf14 is a Wikimedia Foundation internal release. The closest analogy to wmf that you might be used to is a beta. 1.26 comes after 1.26wmf26 and is meant to be more stable.
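
To make the naming concrete, here's roughly how one cycle's branch names progress (the exact number of weekly branches varies):

    1.26wmf1     # first weekly WMF branch of the 1.26 cycle
    1.26wmf2     # one week later
    ...
    1.26wmf26    # last weekly branch of the cycle
    1.26         # the major release, cut after the wmf branches and more stable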

In any case, these wmf releases are something of a moving target in that they are a branch and not a tag. Every week a new wmf branch is cut from master and deployed. If you have to deploy code outside of that schedule, you merge it to the wmf branch.
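
In git terms that looks roughly like the sketch below. The branch name follows the example above, the remote name is illustrative, and in reality the branch cut is done by tooling and the merges go through code review rather than direct pushes:

    # Weekly: cut a new wmf branch from master
    git checkout master
    git pull
    git checkout -b 1.26wmf14
    git push origin 1.26wmf14

    # Mid-week: get an urgent fix onto the branch that's actually deployed
    git checkout 1.26wmf14
    git cherry-pick <commit-with-the-fix>
    git push origin 1.26wmf14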

Now I should mention that I'm doing this from memory and this is in no way going to be a perfect recollection. This should contain up-to-date documentation on all of this long after this post is out of date.

Getting back to my list of actions, I've described getting the code into a magic branch, so on to the manual action! In this case the action is called scap, which stands for, uh, "sync-common-all-php" and has a super awesome logo named Scappy. You log in to a special deployment host, use regular git commands to pull the code into a special subdirectory, and then run scap or one of its derivatives, which makes sure that the files you put in that special subdirectory are synced to all the servers running MediaWiki.
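
Roughly, that dance looks like this. The hostname and paths are illustrative rather than the real production values, and scap's exact invocation has changed over time:

    # On the deployment host (hostname made up)
    ssh deployment.example.wikimedia.org

    # Pull the merged wmf branch into the staging directory that scap syncs from
    cd /srv/mediawiki-staging/php-1.26wmf14    # path illustrative
    git pull

    # Sync the staging directory out to every MediaWiki server,
    # with a message that ends up in the deployment log
    scap "deploy fixes for 1.26wmf14"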

It's not perfect, but it's pretty good. You could totally wrap it in a big red button. It's just that no one has yet because it's really not hard enough to be worth automating. At least I think that is the reasoning, and I don't disagree. It's pretty simple. The hardest part is that when the instructions change you have to make sure everyone who's committed them to muscle memory relearns them.

Now MediaWiki extensions work just like MediaWiki core: someone cuts a wmf branch every week, and deploying code to production is just merging it to the wmf branch, waiting for the bot to make the submodule update in MediaWiki core, and then doing the git/scap dance on the deployment host. MediaWiki configuration works just like MediaWiki extensions except that the special branch is always master.
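
For an extension, that flow sketched with made-up names ("Foo" is not a real extension, and the review and bot steps are glossed over):

    # 1. Get the fix onto the extension's wmf branch
    cd extensions/Foo
    git checkout 1.26wmf14
    git cherry-pick <commit-with-the-fix>
    git push origin 1.26wmf14

    # 2. Wait for the bot to bump the submodule pointer in MediaWiki core's
    #    wmf branch, then on the deployment host pull and sync as before
    cd /srv/mediawiki-staging/php-1.26wmf14
    git pull
    git submodule update --init extensions/Foo
    scap "update Foo"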

Puppet code is pretty similar to MediaWiki except that it uses a different deployment host, the special branch is called production, and the manual action is slightly different. The list of people who can merge to production and take that action is different too: it's the operations team rather than the teams that write MediaWiki code.
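
From memory, the shape of it is something like this. I don't remember the exact command operations runs to apply the merge, so the last step is a stand-in:

    # Operations merges the change to the production branch
    git checkout production
    git cherry-pick <commit-with-the-change>
    git push origin production

    # Then, on the puppet deployment host, pull the branch and apply it
    ssh puppet-deploy.example.wikimedia.org    # hostname made up
    puppet-merge                               # stand-in for the real wrapper command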

Java code is more complex because Java code is almost always deployed as packaged JARs and git isn't great at dealing with large binary files. So deploying Java code is done with a combination of git-deploy and git-fat. There is still a magic branch in git, but that branch contains git-fat pointers to the artifacts in a private releases repository in an Apache Archiva instance hosted in production. In this case the manual action is running git-deploy's commands. You have to prepare the artifacts and get them into Archiva, so it's more work than PHP code, but Java's always been more work than PHP so... yeah. Anyway, another difference here is that rather than being deployed at the Wikimedia Foundation first, much of that Java code is released to Maven Central before it's deployed at the Foundation.
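
From memory, a Java deploy looks roughly like this once the artifacts are in Archiva. The repository path is made up and the git-deploy subcommands may not be exactly right:

    # On the deployment host, in the deploy repository for the service
    cd /srv/deployment/foo/deploy    # made-up path and repository name

    git deploy start    # tell git-deploy a deployment is beginning
    git pull            # brings in the updated git-fat pointers
    git fat pull        # fetch the actual JARs the pointers reference
    git deploy sync     # push the new state out to the target servers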

So, yeah, this isn't perfect, but it gets the job done and I've certainly seen worse. One advantage I think it has over a big red button style is that it's easier to get inside. You can go read the scripts that run the deployment and, if not understand them, at least reason about them.