Finally, the promise of a centralized localization interface for Drupal modules and themes looks to be coming true. I started work on this project around two years ago under Google Summer of Code sponsorship and have continued maintaining and improving it ever since. While I was spreading the word on it, not many people signed up to help clean up some possible performance problems, so it has not made it onto drupal.org yet.
However, earlier this year I got reviews from some key people in the infrastructure team, especially Gerhard Killesreiter, who persuaded me that setting this up is more important than having it perfect first. Software is evolving matter anyway, and we should improve it as we see the problems. So I've started to set up localize.drupal.org. While we work out some of the kinks like single sign-on with drupal.org (one of the promises of the drupal.org redesign which will be delivered here), I thought it would be a good idea to discuss the implications.
So why do we need a localization server?
To recap, the underlying issues prompting us to set up a web interface for localization are manifold:
- We can increase participation in translation a great deal if we don't require translators to use CVS (or depend on module maintainers to commit translations on the submitters' behalf). Also, translators now need to generate their translation templates themselves, since the templates generated by module maintainers are outdated. We can eliminate most (in many cases all) of the gettext tools too and get translators to just focus on the text.
- When a module is released, translations are currently packaged with that. Since modules usually make string changes late in the release cycle as well, and usually give no previous heads up to translators on the release, the translations are not scheduled with the releases. Translators work after the fact, and their work is only delivered to users when a new release of the module is tagged and built.
- Although Drupal itself shares translations of strings among modules, the gettext file based solution does not allow for this, except if translators merge the strings directly. It is easily possible that modules translate strings differently or that translators need to translate strings again.
- CVS does not provide a submission and review workflow for translations. Unlike module code and patches, there is no way to review translation changes and maintain an (optional) approval workflow as it applies to module code. Translation updates are rarely submitted as patches, and asking translators to submit patches would just widen the toolset they should use again.
Despite all the disadvantages of the CVS and gettext .po file based translation system, we should still maintain a system where the transport mechanism is .po files, since that is the file format used by all versions of Drupal (including 7) to import and export translations.
So the localization server solves the above problems by using a centralized database of translations where equal strings and their translations are shared among modules (just like in Drupal). There is an optional approval workflow where moderators can approve suggestions, or where certain or all people can be allowed to change translations of strings directly. It also provides a web service interface on top of this for Localization client users to submit translations directly from their workflow of translating English strings on their own site.
Since it will run with a single sign-on setup with drupal.org, anybody with a drupal.org account will be able to log in (not yet), and contribute. How's that for lowering the barrier from CVS and gettext tools?
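To make the shared string database and the optional approval workflow described above a bit more concrete, here is a minimal in-memory sketch. It is not the localization server's actual schema or API: the class names, method names and the `direct` flag are all made up for illustration, and "views" and "cck" just stand in for any two projects that happen to use the same string.

```python
from dataclasses import dataclass, field


@dataclass
class SourceString:
    """One English source string, stored once however many projects use it."""
    text: str
    projects: set = field(default_factory=set)
    suggestions: dict = field(default_factory=dict)  # language -> [translations]
    approved: dict = field(default_factory=dict)     # language -> translation


class TranslationStore:
    """Hypothetical central store; names here are illustrative assumptions."""

    def __init__(self):
        self.strings = {}

    def register(self, project, text):
        # Equal strings from different projects map to the same entry.
        entry = self.strings.setdefault(text, SourceString(text))
        entry.projects.add(project)
        return entry

    def suggest(self, text, language, translation, direct=False):
        entry = self.strings[text]
        entry.suggestions.setdefault(language, []).append(translation)
        if direct:
            # Users allowed to change translations directly skip moderation.
            entry.approved[language] = translation

    def approve(self, text, language, translation):
        entry = self.strings[text]
        if translation not in entry.suggestions.get(language, []):
            raise ValueError("no such suggestion")
        entry.approved[language] = translation

    def translation(self, text, language):
        # Only approved translations are handed out; None if there is none yet.
        return self.strings[text].approved.get(language)


store = TranslationStore()
store.register("views", "Save")
store.register("cck", "Save")
store.suggest("Save", "hu", "Mentés")
pending = store.translation("Save", "hu")   # still waiting for a moderator
store.approve("Save", "hu", "Mentés")
```

In this sketch the string "Save" is translated once and both projects get the approved translation, which is the sharing behavior the gettext file based setup lacks.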
What happens to packaging when not in CVS?
Thus the localization of projects on drupal.org will be removed from CVS, .pot and .po files will not be hosted there, but instead generated and packaged independently from the database of localize.drupal.org. And now comes the task not solved yet. Projects have branches and releases in those branches, so let's say a module has module-6.x-1.x and module-6.x-2.x branches with releases like module-6.x-1.0, module-6.x-1.4 and module-6.x-2.3. The different releases can have different sets of strings, even on the same branch. This is all nicely flattened on the localization server user interface, so people can filter for specific versions or just translate all versions.
But when thinking of the translation packages for users, we'd need to support translation packages for at least the latest stable versions of these branches, updated as changes to translations are made. Think module-6.x-1.4-translations-1 which could become module-6.x-1.4-translations-2 when updated (if we think of incrementing version numbers) or module-6.x-1.4-translations-20090730 which would become module-6.x-1.4-translations-20090806 next week (if we think of timestamped snapshots of translations each week). There'd be module-6.x-2.3-translations-X packages as well.
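The two naming schemes above can be sketched in a few lines. This only illustrates the naming idea from this post; the function names are made up and nothing like this exists in the packaging scripts yet.

```python
from datetime import date, timedelta


def next_incremental(package):
    """Bump the trailing counter of an incrementally versioned package name."""
    prefix, _, number = package.rpartition("-")
    return f"{prefix}-{int(number) + 1}"


def snapshot_name(release, day):
    """Name a timestamped translation snapshot for a given project release."""
    return f"{release}-translations-{day:%Y%m%d}"


bumped = next_incremental("module-6.x-1.4-translations-1")
this_week = snapshot_name("module-6.x-1.4", date(2009, 7, 30))
next_week = snapshot_name("module-6.x-1.4", date(2009, 7, 30) + timedelta(days=7))
```

Either way the project release stays part of the name, so a package always says which module version it belongs to.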
I'm assuming we can leave users of older stable versions without updated translations, since they do not update their code for the bugfixes either. This limits the number of "translation branches" to the number of branches the module has.
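As a rough sketch of that assumption, picking the one release per branch that would get updated translation packages could look like the following. The release name pattern is inferred from the examples in this post, and the function name is hypothetical.

```python
import re

# Matches stable release names such as "module-6.x-1.4"; names with extra
# suffixes (for example "-beta1" or "-dev") do not match and are skipped.
RELEASE = re.compile(
    r"^(?P<project>.+)-(?P<core>\d+\.x)-(?P<major>\d+)\.(?P<minor>\d+)$"
)


def latest_stable_per_branch(releases):
    """Return {branch: latest stable release name} for a list of releases."""
    latest = {}
    for name in releases:
        m = RELEASE.match(name)
        if m is None:
            continue
        branch = f"{m['project']}-{m['core']}-{m['major']}.x"
        minor = int(m["minor"])
        if branch not in latest or minor > latest[branch][0]:
            latest[branch] = (minor, name)
    return {branch: name for branch, (_, name) in latest.items()}


targets = latest_stable_per_branch([
    "module-6.x-1.0",
    "module-6.x-1.4",
    "module-6.x-2.3",
    "module-6.x-2.4-beta1",
])
```

With the example releases from above, only module-6.x-1.4 and module-6.x-2.3 would get translation packages.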
These branches need versioning, however. We can either do snapshotting on a given time period or let the translation teams push new versions for projects when the time comes. However, since packaging each language for each project release would produce a disastrous number of packages, keeping all translations for a module release in one versioned package seems more logical. So all language teams would release an updated translation at the same time. While this might sound limiting, it is still way ahead of the current situation where a module maintainer chooses the release date.
Ok, then how do site maintainers update translations?
Let's consider Drupal 7 first. If we think of translations as separate packages from modules, themes and install profiles, then we'd need some kind of container for them in the Drupal file system. A top level "translations" directory maybe. As long as we version translations (and we have .info files for them), we can rely on update status module to provide update information and then people with tools like Drush or Plugin manager can update their translations that way. Translations are special among Drupal projects in that they only serve as a transport mechanism for database data, there is no living code, so we'd only use the packaging and updating infrastructure to facilitate the versioning. What might sound tricky here is that we'd always need to grab the latest version of translations for the versions of Drupal modules we use. The great relief is that Drupal 7 just started to support version level dependencies today, so we can say what module version our translation is dependent on, so the right one can be picked.
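To illustrate what "the right one can be picked" means, here is a small sketch of the selection logic, assuming each translation package declares the exact module release it depends on. The data layout and function name are made up for this example and say nothing about how update status would actually implement it.

```python
def pick_translation(installed_release, packages):
    """Pick the newest translation package that depends on the installed release.

    `packages` is a list of (depends_on_release, translation_version) pairs,
    a hypothetical stand-in for what the packages' .info files would declare.
    """
    versions = [
        version for release, version in packages if release == installed_release
    ]
    if not versions:
        return None
    return f"{installed_release}-translations-{max(versions)}"


available = [
    ("module-6.x-1.4", 1),
    ("module-6.x-1.4", 2),
    ("module-6.x-2.3", 1),
]
chosen = pick_translation("module-6.x-1.4", available)
missing = pick_translation("module-6.x-1.0", available)
```

A site running an older release simply finds no matching package here, which matches the assumption above that older stable versions are left without updated translations.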
What happens to Drupal 6? One option is that we keep supporting the CVS based translation interface for Drupal 6, but that would require that people actually commit translations from time to time from the localize.drupal.org database to the CVS repository (individually per project release). That sounds quite painful, so we can maybe take a cue from the Drupal 7 ideas (which were by the way based on the clever way Features packages its contents and how Drupal 7 install profiles provide their dependencies). So we can pretend that translations are some type of project Drupal supports (probably a module) and build a glue module which would hide them on the modules page and support version level dependency for them on Drupal 6. We could then roll this system out incrementally, and migrate multilingual sites over to this system. We can combine this with letting people commit .po files to CVS, so those not willing to migrate to a new system to get translations will get some stuff, but will probably be quickly left out in the cold if the new system proves to be as convenient as it looks.
Ok, this might sound a bit crazy, but we are reinventing how translations work, and while I do not have a live localize.drupal.org instance to guide you through for more background, we should figure this out while Drupal 7 is open for development, so we have an updated translation deployment in place. There are numerous existing localization server instances on the internet used by various translation teams, so you can check those out for now.
Wouldn't you like to just select from a list of languages pulled from a remote web service on Drupal installation (if an internet connection is available) and have Drupal download the translations you picked for you? Would you like automated translation updates when modules update (which does not happen in Drupal 6)?
Let's reality check the above plan so we can run down this path and deliver streamlined localization support in Drupal 7 (and maybe even backport much of the goodness to Drupal 6 via contributed modules)!