In the second part of my article series, before we got on a developer detour, we discussed that Drupal's software interface translation can be pre-provided and collaborated on by the community, but this time we turn to your own content. What's considered content on a Drupal site? Well, in a broad sense, anything that you enter beyond the software user interface translation. For this article, we will limit our discussion to nodes only, and move on to the rest of the structure and page building elements in later pieces.
Once you enable the built-in Locale module, a small but interesting feature becomes available that we have not talked about yet. When you go to edit any content type (at Administration » Structure » Content types), you'll find a Multilingual support selection widget on the (probably not so obvious) Publishing options vertical fieldset item. This setting really relates to submission options, not publishing options, but that is where it is now (see the related core issue).
But what does enabling multilingual support on a content type mean? Well, you'll be able to assign a language to each node of the enabled node types. This is useful if you'd like to run a personal multilingual blog for example, where you occasionally post in alternate languages, but you have no intention of translating the posts to different languages, just of marking them as such.
Once you language-enable a content type, a language selection dropdown will appear on node submission. This will include all enabled languages and a special item called Language neutral (which is made the default). The intent of this option is to let you submit nodes without relation to specific languages. Think of nodes you use for images for example, like photos of landscapes that belong to a category of taxonomy terms. The listing or explanations for the gallery would contain translatable text, but the photos could be just the pictures themselves, without translatable content, so you'd use language neutral nodes. (Using entire nodes for image galleries might not be best practice anymore; see the Media project, which works with custom entity types.)
Language neutral uses the language code 'und', which stands for undetermined and is set aside for such cases by the ISO 639 standard. Drupal exposes it as the LANGUAGE_NONE constant. You'll see this language code all around Drupal source code, so it is good to know.
Once you language-enable a content type, conceptually you should be able to use Views to limit listings per language, for example. Drupal core node listings are not language-aware, so merely enabling language support will not limit your front page content listings to nodes in a certain language as you switch languages. Unfortunately Views will not let you filter by language at this stage either, due to a bug: it bundles the language field filter with node translation. Once you have node translation (see below), Views will provide language filtering options.
Drupal core carries on the built-in functionality from Drupal 6 to handle translations for nodes in basic ways. Once you enable the core bundled Content translation module on your site, you'll be able to go back to Administration » Structure » Content types and choose Enabled, with translation for multilingual support.
This will in itself not change anything on the node submission user interface. However, once you submit a node in a specific language with that type, a Translation tab will appear on the node edit form. This tab will let you get an overview of all languages on the site, and whether translations of the node are available in those. You'll also be able to add translations for the node or edit any translation. When you choose to translate the base node into a different language, some default fields and the language field will be prefilled for you.
When you edit the original node afterwards, you'll find a new Flag translations as outdated option that marks all translations as outdated. It is meant to be used when substantial edits are made to the base node. Core and contributed modules use this flag to highlight outdated translations and help content providers fix up their translations. When a translation is updated, the This translation needs to be updated checkbox on the translation should be unchecked. This flag allows translations to be edited without the requirement to incorporate all source updates at once (which would be the case if we could only compare last update timestamps).
The built-in node translation does not do much else. It alters the language switcher block links to point to different nodes when switching languages where translated variants are available, and it displays links to the other language versions next to the nodes themselves. However, there was a whole set of modules built around this model in Drupal 6, including the nifty Translation overview module that is still being worked on for Drupal 7 at the time of this writing.
It is important to understand the concept behind this translation feature. The built-in translation module forms translation sets of nodes, where each set has a base node that is translated to the other languages. So the general idea is that you submit nodes in the source language first and translations come afterwards. Translation set relations are managed very simply in the database, but I have not yet seen contributed modules that expose advanced editing of this data (such as assigning existing nodes to existing translation sets, or switching to a different base node for the translation set). It should be fairly easy to write such a module though. In summary, the general concept is that translation sets are formed on top of nodes.
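As a rough illustration of how simple that bookkeeping is: in Drupal 7 each node row carries a tnid column (the node ID of the set's source node, or 0 when the node is in no set) and a translate flag (the outdated marker discussed above). The sketch below works on invented rows in plain PHP rather than on a real database, and demo_translation_sets() is a hypothetical helper, not a core function.

```php
<?php
// Invented rows, shaped like a few columns of the node table.
$rows = array(
  array('nid' => 8,  'language' => 'en',  'tnid' => 8, 'translate' => 0),
  array('nid' => 12, 'language' => 'hu',  'tnid' => 8, 'translate' => 1),
  array('nid' => 15, 'language' => 'de',  'tnid' => 8, 'translate' => 0),
  array('nid' => 9,  'language' => 'und', 'tnid' => 0, 'translate' => 0),
);

/**
 * Hypothetical helper: groups rows into translation sets keyed by tnid,
 * then by language. Rows with tnid 0 belong to no set and are skipped.
 */
function demo_translation_sets(array $rows) {
  $sets = array();
  foreach ($rows as $row) {
    if (!empty($row['tnid'])) {
      $sets[$row['tnid']][$row['language']] = $row;
    }
  }
  return $sets;
}

$sets = demo_translation_sets($rows);
foreach ($sets[8] as $langcode => $row) {
  // The source node is the one whose nid equals the set's tnid.
  $role = ($row['nid'] == $row['tnid']) ? 'source' : 'translation';
  $state = $row['translate'] ? ', needs update' : '';
  print $langcode . ': node/' . $row['nid'] . ' (' . $role . $state . ")\n";
}
```

In this simplified model, the advanced editing mentioned above would boil down to changing tnid values: attaching an existing node to a set means giving it the set's tnid, and switching the base node means rewriting the tnid on every member. A real module would of course need validation on top of that.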
Key advantages of this approach include well-built support for node level content handling all across Drupal core and contrib. The administrative node listing in Drupal allows for language filtering, Views has built-in support, rich node level access checking can be employed to limit permissions on translations, workflows and rules can be set up for nodes, and so on.
Key disadvantages include the above-mentioned missing support for advanced translation set handling (which could be implemented in contributed modules) and that sharing data between nodes is cumbersome (and requires more contributed modules). Think of an image gallery again, but this time with captions. The images should be the same, but translations should be made available for titles and captions.
In an effort to solve some shortcomings of node level language support, Drupal 7 now supports languages on the field level! With a huge set of CCK functionality now built into Drupal core, many things on nodes, users, taxonomy terms, comments, etc. are fields. Drupal core has built-in support for field languages; you'll find the body field value in code via
$node->body['und'][0]['value'] or for custom fields such as
$node->field_name['und'][0]['value'] (for nodes in undefined languages; the 0 is the index of the first field item). Field API lets developers mark fields as translatable, as is done for the default body field; however, no built-in user interface is provided for translatable fields.
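To make the nesting concrete, here is a minimal plain-PHP sketch that mimics how Drupal 7 lays out field values on an entity: keyed first by language code, then by a numeric delta (the item index), then by column. The node object, its values, the field_photo_credit field and the demo_field_value() helper are all made up for illustration. On a real site the node would come from node_load(), and you would probably reach for field_get_items() instead of poking at the array directly.

```php
<?php
// Stand-in for a loaded node; the values are invented for this example.
$node = new stdClass();
$node->nid = 8;
$node->language = 'en';
// A translatable field holds one list of items per language.
$node->body = array(
  'en' => array(0 => array('value' => 'Hello world', 'format' => 'filtered_html')),
  'hu' => array(0 => array('value' => 'Helló világ', 'format' => 'filtered_html')),
);
// A non-translatable field keeps its items under 'und' (LANGUAGE_NONE in Drupal).
$node->field_photo_credit = array(
  'und' => array(0 => array('value' => 'Jane Doe')),
);

/**
 * Hypothetical helper: returns the first item's value in the given language,
 * falling back to 'und', or NULL if neither exists.
 */
function demo_field_value($entity, $field_name, $langcode) {
  $field = isset($entity->$field_name) ? $entity->$field_name : array();
  foreach (array($langcode, 'und') as $candidate) {
    if (isset($field[$candidate][0]['value'])) {
      return $field[$candidate][0]['value'];
    }
  }
  return NULL;
}

print demo_field_value($node, 'body', 'hu') . "\n";               // Helló világ
print demo_field_value($node, 'field_photo_credit', 'hu') . "\n"; // Jane Doe
```

The fallback to 'und' in the helper is a choice made for this sketch only; it is not a description of how core resolves field languages.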
Let's compare the concept of field translation to node translation first! As explained above, node translation sets up a set of nodes as a translation set, so all node level tools can be reused for individual translations. Field translation pushes translations inside the node, which allows for handling the whole set including all translations as one single node, and is more applicable when our use case calls for that approach. However, not all pieces of node information are fields.
To provide a user interface on top of field translation, the contributed Translation module suite was built. The suite includes a translation_node.module which replicates the core functionality, and a translation_upgrade.module providing a migration path from node translations to field translations. In a possibly confusing way, instead of leaving the core translation module alone, the base translation.module in the suite replaces the core translation module once enabled (due to using the same filename). That also means you need to run update.php (if you ran core translation.module before) to actually install the module, because technically there is no new module enabled from Drupal's perspective, it just 'moves' to a different place. There is a discussion about renaming the base module.
Now a separate content language detection option appears in Administration » Configuration » Regional and language » Languages, that you can just set to Interface for now, so it uses the same language as the interface. (I'd also suggest you use the URL method for interface as a start, and experiment with the rest later).
You'll also find new options titled Content translation settings under Administration » Configuration » Regional and language. You'll be able to translation-enable certain entity types in Drupal core here. Unlike Drupal 6's CCK, Drupal 7 comes with field support beyond nodes, and defines a generic base concept called entities, of which nodes are just one type. We'll limit our discussion to nodes here, so just enable translation for nodes for now.
Remember, above I said the Drupal core translation module lets you choose Enabled, with translation for node types. Now, the contributed content translation module allows you to choose Enabled, with content translation and Enabled, with node translation (if the translation_node.module is enabled). Pick the former. Once you content-translation enable a content type, fields will have a Users may translate this field switch, which lets you turn on translation field by field. This is the major value-add of the module: you can cherry-pick the fields that need to differ between translations.
For a configured content type, adding a base node looks the same as before, and even the translation tab on the node will look familiar, but once you go to add or edit a translation, you are not actually submitting another node. The translatable fields of the node will be displayed, but most of the other values will not. The module lets you assign a separate URL alias for the translation (Drupal supports different URL aliases for the same path per language), but you'll not be able to edit the author of the translation, provide a submission date or do other administrative editing on the translation.
Once you have translations submitted, the language switcher block can be used to switch between the translations. If you chose to have different criteria for content and interface language, both will have their respective blocks. You'll notice that your node URL keeps being node/X, but the language of the node displayed will change based on the identified content language.
I'd explain the concept behind this method of translating content like this: it creates a kind of "language based subentities" under the node entity type. There are "the Hungarian translation of node 8" and "the German translation of node 8", all under node 8. Some non-field properties can be made different in translations because they have specific implementations in the module, but not all.
There are other limitations and characteristics of the module as implemented currently which might change / improve over time. It does not yet support revisioning of translations. Because translations are second class citizens here, when you view the translated node, hitting the edit tab goes to edit the original node, not the translation.
The module keeps track of the submitter user, creation date, last update date, publication status, etc. of the translations in its own database table, and the original node retains the base node values of course. I looked for but could not find Views integration for these translations. I think Views possibly needs to know about an entire "translatable node" type, which takes the language code and the node number as a compound identifier when making listings, so it could list the individual translations. Currently I could not find ways to list different translations of the same node in a view.
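To show what I mean by a compound identifier, here is a small plain-PHP sketch that expands nodes carrying field translations into one list row per node ID and language code pair. The data and the demo_translation_rows() helper are invented for illustration; this is not how Views or the translation module works, only the shape of listing I have in mind.

```php
<?php
// Invented data: per-language values of a translatable body field, keyed by nid.
$nodes = array(
  8 => array('body' => array('en' => 'Lake at dawn', 'hu' => 'Tó hajnalban', 'de' => 'See im Morgengrauen')),
  9 => array('body' => array('en' => 'Old bridge')),
);

/**
 * Hypothetical helper: returns one row per (nid, langcode) pair,
 * keyed by a compound identifier such as "8:hu".
 */
function demo_translation_rows(array $nodes) {
  $rows = array();
  foreach ($nodes as $nid => $node) {
    foreach ($node['body'] as $langcode => $text) {
      $rows[$nid . ':' . $langcode] = $text;
    }
  }
  return $rows;
}

$rows = demo_translation_rows($nodes);
foreach ($rows as $id => $text) {
  print $id . ' => ' . $text . "\n";
}
```

Node 8 shows up three times in the result, once per language, which is exactly what a listing keyed on the node ID alone cannot express.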
Nodes also have a pretty strong toolset for permissions, which fields are far from matching. Contributed modules as well as custom code can be used to fine-tune node level access for viewing, editing and deleting, but the same is not true for fields (yet?). Workflows and rules exist for nodes, so you can set up different translation nodes in different translation states and fire actions on transitions. With field translation, these tools would again need to treat the node identifier plus the language code as a compound identifier, and each translation as a separate item, to match the feature set that node level translations offer.
Finally, there are also relations of nodes to other things which are neither fields nor duplicated by the translation module. There is some special support for displaying only comments in the same language as the content language, but the last comment timestamp is maintained as one single value for all languages. The same holds for menu items. The node form has means to assign the node to a menu item, but given this is a single node, it only has one menu item relation, so translations need to participate in the same menu too.
My understanding is that lots of tools need to be re-imagined to work with language variants under nodes, versus languages assigned to nodes. And it affects functionality all across the contributed module space. Many tools were made translation-set friendly based on the Drupal 6 translation feature set earlier, and translatable fields turn the model upside-down, introducing a completely different methodology to which they still need to adapt. And once (or if) they fully adapt, translated versions of nodes will need to replicate all of the node's functionality to compete with the feature set, so they could have been nodes to begin with.
Keep the limitations of the two systems in mind when choosing your way of doing translations on your new Drupal 7 sites, and keep an eye on updates to tools to support either method better. To recap: node level translation comes with hard to share fields and hard to manage translation sets and field translation comes with missing functionality across existing common tools.
Third option: shared fields would be the ultimate solution?
My understanding is that to take advantage of all existing tools (Views, access checking, workflows, rules, etc.), it would make sense to work on shared field instances. We could then keep using the listing, access, workflow and editing tools on separate entities, and just share data between those entities on the field level, for the fields where we'd like the data to be shared. That would sit between the (as far as I see) entirely too separate node translation way and the (in my view) way too integrated field translation method that is still in its infancy.
I do not have resources to work on this unfortunately. I wish I did.
Site settings and layout up next
Huh, this was a lengthy piece, wasn't it? I hope it provided you with good insight into where things stand with node translation. I'm sorry there is no golden answer yet. As we move forward, there are more and more moving parts.
In the next piece, I'm planning to look at site settings and layout (blocks and friends). There are lots of interesting things there, I can assure you.