Here’s what I was working on… It was waiting for more answers (that never came) and it wasn’t polished yet, so it was left to gather dust. It’s actually two texts - first one to guide, second about need/fixes. A tad long. If there is anything usable, it can be copied to wiki [I don’t want to do it as it’s nice and concise atm].
What changed in translations and how to do them now?
Without going too deep into the history (see the thread “Zanata translation server is down” and the links in there), the system and workflow for localizing Librem 5 have changed. The old translation tool, Zanata, has been replaced with Damned Lies (DL), which is used for translating various Gnome apps. With this text I’m trying to ease the transition to the new system from the volunteer translator’s perspective, and also to help those who haven’t yet started but are looking for a way to help out (not just Librem 5 but other open source projects too) - or are just looking for something useful to do. One translator has the potential to help millions.
For clarification, translation is a big part of localization (“l10n” for short: first and last letter with the number of characters in between). The terms are sometimes used as synonyms (along with internationalization, “i18n”), but since the work often includes more than direct translation of words (as in, meanings and ideas), the more general term can be used to express this. In some cases, a described method or piece of information may differ for local reasons (different commercial systems or regulatory requirements, for example). Also, sometimes translations do not fit well for technical reasons (for example, they are too long) and a similar-enough replacement may have to be created. More on this: https://en.wikipedia.org/wiki/Internationalization_and_localization
The “new” system for managing files and statistics is called “Damned Lies”, abbreviated DL (there is a Wikipedia page for that name, if you are interested). For clarity: DL is a translation support system, but I may also refer to “the Gnome system” and the Gnome Translation Project (GTP), as those encompass a bit more, like how things, data and people are organized - a way of thinking and doing things.
And to emphasize, DL works fine for Gnome related projects, but they are also more directly connected to the DL system and get the translations automatically, if I’ve understood that right – unlike L5/Purism. It does not seem impossible to use DL, it just doesn’t seem to have been made to be used quite like this. And I guess in a hurry, the process wasn’t tested, evaluated and documented beforehand thoroughly. Thus, some hints and advice are needed (probably even after this novella).
Getting started
I hadn’t tried the DL site or workflow previously, so I’ve tried to describe here what getting set up is like for newbies, and how it differs from before, for the benefit of those who have already tried Zanata or a similar website-based service/tool. The only caveat I can think of is that some larger language groups may have a more deliberate process for guiding translations and translators, while some minor languages or localities (remembering that the same language may differ between areas) may not have representation, or it may depend on one person’s activity. This is because there are already a lot of organized (and less organized) translators. In general, they are a helpful bunch and eagerly accept your help, even if it’s limited to just L5. Here is the process description for Gnome translations (https://l10n.gnome.org/help/vertimus_workflow/), where you’ll notice the challenges in having separate people for each step and how the disconnect of L5 in the commit/merge stage needs attention.
My suggestion is that you first go through DL’s wiki pages on translation, which are pretty concise (but do seem to have old advice here and there, as well as info that is not relevant for basic translations).
Suggested for starters:
- Joining: https://wiki.gnome.org/TranslationProject/JoiningTranslation
- General info: https://wiki.gnome.org/TranslationProject/ContributeTranslations
- Workflow visual: https://l10n.gnome.org/help/vertimus_workflow/
- What are po-files (see middle of the page, ignore top and bottom): https://wiki.gnome.org/TranslationProject/DevGuidelines/Localize%20using%20gettext%20and%20intltool
- More extensive: https://wiki.gnome.org/TranslationProject/LocalisationGuide
More in-depth and additional info for later, links:
- https://en.wikipedia.org/wiki/Internationalization_and_localization#Standard_locale_data
- https://wiki.gnome.org/TranslationProject
- https://wiki.gnome.org/TranslationProject/Scripts
- https://wiki.gnome.org/TranslationProject/LanguageCompletionStatus
- https://wiki.gnome.org/TranslationProject/DevGuidelines
- https://wiki.gnome.org/Home
- https://en.wikipedia.org/wiki/Gettext
All the languages and their codes, and all the language teams (191 at the moment, see the “survey” link for an overview):
https://l10n.gnome.org/languages/
https://wiki.gnome.org/TranslationProject/Survey (circa 2011)
… and ask questions on their channels about general stuff (app-related questions go to the app-specific addresses).
After reading a bit, sign up for an account, join your language team and wait for the acceptance notification from the team coordinator.
My experience was good, even though DL and the Gnome system seem a bit of a step back compared to web GUI translation sites. But even though this changes things a bit for the translator, it helps the developers a lot, if I understand correctly: lots of volunteers to translate (although no guarantee that they translate L5 related modules), a stable system (with a lot more languages and locales supported than previously) and better management of translation files for a project this complex (good for upstream, sharing, etc.).
Translation workflow and the process
The main difference in the work is that, where Zanata had the strings/messages (both terms are used in general, meaning phrases that consist of words or other symbols) that needed translating presented separately in a graphical web interface with the tools, DL gives you a .po textfile (portable object, a way to separate program and content strings/messages - https://en.wikipedia.org/wiki/Gettext) that you open in an editor. Zanata did have .po-files available too, but it is assumed most did not use them. A .po translation is based on a .pot-file which is the original/template source but ordinary translators should not need to work with them.
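To give a rough idea of what is inside a .po file, here is a minimal Python sketch that reads the msgid/msgstr pairs from a tiny snippet. The snippet and its Finnish translation are made up for illustration, and the reader only handles simple single-line entries; real files also contain a header, multi-line strings, plural forms and contexts, which the dedicated editors handle for you.

```python
# Made-up .po content for illustration only (not from a real Librem 5 module).
SAMPLE_PO = '''\
# A translator comment
msgid "Settings"
msgstr "Asetukset"

msgid "Call"
msgstr ""
'''

def read_entries(po_text):
    """Return (msgid, msgstr) tuples from simple single-line entries."""
    entries = []
    current_id = None
    for line in po_text.splitlines():
        line = line.strip()
        if line.startswith('msgid "') and line.endswith('"'):
            # The source (English) string sits between the quotes.
            current_id = line[len('msgid "'):-1]
        elif (line.startswith('msgstr "') and line.endswith('"')
              and current_id is not None):
            # The translation sits between the quotes; empty means untranslated.
            entries.append((current_id, line[len('msgstr "'):-1]))
            current_id = None
    return entries

entries = read_entries(SAMPLE_PO)
for source, translation in entries:
    status = "translated" if translation else "untranslated"
    print(f"{source!r} -> {translation!r} ({status})")
```

The point is only that each source string (msgid) is paired with a translation (msgstr), and an empty msgstr means the string still needs translating.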
The workflow process for translation is approximately:
- log in to DL
- go to the module (app) page and to your language (if it does not exist, request it)
- reserve the module in your language page
- download the .po
- use your editor to translate the file
- upload the .po
- make sure the file is not reserved anymore
To get the po-file that you want to work on (in this case one of the Librem 5 apps): go to your language team’s page --> see the listing of different releases (don’t get confused by that term) --> see “Librem 5 - Purism” --> click the numbers (links) next to it to open the list of modules (the term means an app/piece of software) --> go to the app.
Alternatively you can open a list of all the possible modules in DL from the last line of the different releases, but be careful to go to the current master version. From the selected module page you reserve the translation task and download the po (see wiki for details and/or ask team coordinator and/or channels). You can also download a .po just to look at it or for practice. There is no harm in that, as long as you don’t reserve it (which may prevent others from working on it).
Translation, editing tools
A module’s file, named “<nameofapp.version.languagelocalecode>.po”, can be edited in a text editor, but it is better to use a translation application like Poedit, Lokalize or Gtranslator (Gnome Translation Editor - nothing to do with that search engine company or their translation site). Some of them have extra features to help with the work. I’m going to comment briefly on first impressions of Gtranslator and Poedit here, as they were easily available to me. Web-based tools (which hide the .po files) have been tried, and it is possible some languages use these for some translations.
Gtranslator is a very simple editor for Linux. It shows everything in neat boxes and marks missing and fuzzy translations. After you’ve signed up to DL and installed the editor, open a po-file. At first it helps you create a profile with all the relevant info (note: you can get the “nplurals…” value from your team page). Poedit is available for most platforms. There are only small differences in usability between the two now that the newest version of Gtranslator also has the personal translation memory feature, which can be used to translate recurring strings easily and may be handy over time.
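As a rough illustration of where the “nplurals…” value ends up: in a .po file it is part of the Plural-Forms line in the file header. The small Python sketch below pulls the number out of such a line. The header line here is an assumed example for a language with two plural forms (like English); the right value for your language comes from your team page.

```python
import re

# Assumed example of a Plural-Forms header line as stored in a .po file.
HEADER_LINE = '"Plural-Forms: nplurals=2; plural=(n != 1);\\n"'

def parse_nplurals(header_line):
    """Return the nplurals number from a Plural-Forms line, or None if absent."""
    match = re.search(r"nplurals\s*=\s*(\d+)", header_line)
    return int(match.group(1)) if match else None

print(parse_nplurals(HEADER_LINE))  # prints 2
```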
When editing the .po as a plain text file (without a translation editor), you add your translation after the msgstr (“message string”). Don’t touch the msgid lines of the original source (English) strings in the file, or the translation file will be corrupt. Notice also that there may be extra spaces here and there at the ends of words/strings. At least Poedit’s “validate” shows if there are unexpected technical differences that you need to check before saving and uploading.
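The extra spaces at the ends of strings are easy to miss by eye. Here is a minimal Python sketch of the kind of check that could catch them; it is my own illustration and not how Poedit’s “validate” is implemented. The function names and sample strings are made up.

```python
def edge_whitespace(text):
    """Return the (leading, trailing) whitespace of a string."""
    leading = text[:len(text) - len(text.lstrip())]
    trailing = text[len(text.rstrip()):]
    return leading, trailing

def whitespace_matches(msgid, msgstr):
    """True if the translation keeps the same spaces at both ends as the source.

    An empty msgstr (not translated yet) is not treated as a problem.
    """
    if not msgstr:
        return True
    return edge_whitespace(msgid) == edge_whitespace(msgstr)

# "Name: " ends with a space, so the translation should too.
print(whitespace_matches("Name: ", "Nimi: "))  # prints True
print(whitespace_matches("Name: ", "Nimi:"))   # prints False
```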
While you translate, you may see peculiar (“C-format”) strings like %d, %H:%M, \n, \t, “something” etc. They are placeholders and control codes for things the program will fill in or interpret (\n, for instance, is a line break and \t a tab). These are similar to what you may have seen in Zanata as well. Just copy them and translate around them, but in general do not change them. At some point you may understand what they do and can adjust the translation accordingly.
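A rough way to double-check that nothing went missing is to compare the placeholders in the source and in the translation. The Python sketch below does that with a simplified pattern of my own for printf-style placeholders (%d, %s, %.2f, %1$s and so on); it is not the exact rule set that gettext or the editors use, and the sample strings are made up. It compares the placeholders in order, because changing their order is normally only safe with the numbered forms like %1$s.

```python
import re

# Simplified pattern for printf-style placeholders (an assumption for this
# sketch, not the full C format specification).
PLACEHOLDER = re.compile(
    r"%(?:\d+\$)?[-+0#]*\d*(?:\.\d+)?(?:hh|h|ll|l)?[diouxXeEfgGcs%]"
)

def placeholders(text):
    """Return the placeholders found in a string, in order of appearance."""
    return PLACEHOLDER.findall(text)

def placeholders_preserved(msgid, msgstr):
    """True if the translation has the same placeholders in the same order."""
    return placeholders(msgid) == placeholders(msgstr)

source = "Copied %d files to %s"
print(placeholders(source))  # prints ['%d', '%s']
print(placeholders_preserved(source, "Kopioitiin %d tiedostoa kohteeseen %s"))  # True
print(placeholders_preserved(source, "Kopioitiin tiedostoja kohteeseen %s"))    # False
```

A False result only means “have a second look”, not that the translation is necessarily wrong.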
Some information on time formats can be found at https://developer.gnome.org/glib/stable/glib-GDateTime.html (under “g_date_time_format”). An example of a translation that may need localization is “%Y-%m-%d”, which shows the date in the format “year-month-day”, like 2020-07-25 (one standard format). But if it’s a format that your region does not use (in mine, mostly everything is in day.month.year, as in 25.07.2020), you could (if you are absolutely sure) change the order and separating characters (because it would be more readable in your region), provided that it’s not something that the app is expecting to be in that specific order (in which case do not change it, just copy it as is). A similar one, but one which you need to be careful with, is changing the representation of time between the 12h (AM/PM) and 24h formats. If you do not know what it does, do not change it (just mark it “fuzzy”/“needs work”). You can ask about it afterwards and update the translation when you understand it.
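To see what such format strings actually produce, here is a small Python sketch. It uses Python’s own strftime rather than GLib’s g_date_time_format, on the assumption that the common codes used here (%Y, %m, %d, %H, %M, %I) mean the same in both; check the GLib page above for anything beyond these.

```python
from datetime import datetime

sample = datetime(2020, 7, 25, 15, 5)

iso_style = sample.strftime("%Y-%m-%d")  # year-month-day
dmy_style = sample.strftime("%d.%m.%Y")  # day.month.year, reordered for the region
time_24h = sample.strftime("%H:%M")      # 24-hour clock
time_12h = sample.strftime("%I:%M")      # 12-hour clock (normally shown with AM/PM)

print(iso_style)  # prints 2020-07-25
print(dmy_style)  # prints 25.07.2020
print(time_24h)   # prints 15:05
print(time_12h)   # prints 03:05
```

The same codes appear in both date strings; only the order and the separators change.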
Quality and support
After you upload the .po back, the translation is (hopefully) checked/proofread by a reviewer from your language team. There may be corrections, and the .po may even need additional translation (if not everything was translated). After this, it is submitted to be committed as the new version of your language. The Gnome/DL system is hierarchical and, for quality and security purposes, tries to prevent one person from doing all these steps. Of course, this requires enough active volunteers.
As always, it may be difficult to identify the context or meaning of a string (especially if it’s one word) in the .po. It would be good if the translator had the latest version of the app available for reference (for instance by running an image on a virtual machine). The developers should use comments and context information as much as possible (see Gnome dev documentation). If understanding a meaning or identifying a function is difficult, the owners of the modules have left their emails on the module page, but it’s better to file a bug report. Team coordinators are helpful too.
Another option (besides not translating it) is “fuzzy” (or “needs work” as it’s named in Poedit), which means the translation is known to be not entirely correct and needs to be looked at. A translation may also become “fuzzy” when the original string or its function changes. In addition to marking something as “fuzzy”, the editors have the option of adding notes. These can be very helpful when several people translate the same file, sometimes years apart and long after the previous person. A short description of possible variations that could be used, of the unclear context or of some other problem can be added here. But also ask about the problematic string somewhere to get it solved, because no one may find the note if they are not looking for it.
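In the .po file itself, a fuzzy entry is marked with a “#, fuzzy” line above it, and a translator’s note is a line starting with “# ”. As a minimal sketch, the Python below counts translated, fuzzy and untranslated entries in a made-up snippet. It assumes simple single-line entries separated by blank lines and no file header, so treat it as an illustration rather than a replacement for the statistics DL or the editors show.

```python
# Made-up .po content for illustration only.
SAMPLE_PO = '''\
msgid "Settings"
msgstr "Asetukset"

# Note for the next translator: unsure if this is a verb or a noun here
#, fuzzy
msgid "Dial"
msgstr "Soita"

msgid "Call"
msgstr ""
'''

def summarize(po_text):
    """Count translated, fuzzy and untranslated single-line entries."""
    counts = {"translated": 0, "fuzzy": 0, "untranslated": 0}
    for block in po_text.strip().split("\n\n"):
        lines = block.splitlines()
        msgstr_lines = [line for line in lines if line.startswith("msgstr ")]
        if not msgstr_lines:
            continue  # not an entry (for example a stray comment block)
        is_fuzzy = any(line.startswith("#,") and "fuzzy" in line for line in lines)
        if msgstr_lines[0] == 'msgstr ""':
            counts["untranslated"] += 1
        elif is_fuzzy:
            counts["fuzzy"] += 1
        else:
            counts["translated"] += 1
    return counts

summary = summarize(SAMPLE_PO)
print(summary)  # prints {'translated': 1, 'fuzzy': 1, 'untranslated': 1}
```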
Once the new .po is uploaded back, DL will display the new file and also provide a comparison page that shows the differences between old and new, if that interests you.
There seem to be ways to use translation dictionaries and machine translations to help, but it is the job of the translator to make sure the meaning is conveyed, not just the literal translation. For good, consistent quality and similarity, the translator should use the same translations that have been used in other programs (“stock”), so check examples from them too. Using identical translations helps users, as it gives familiarity with the app. Checking the result (testing) is encouraged (you may also find bugs to report to programmers).
Since there isn’t a built-in possibility for language-specific discussions, new channels for that are needed. Some language teams have set up their own chats, forums and sites with localization info beyond Gnome projects. Use them as much as possible. For questions regarding DL, workflow and tools, check the wiki, or contact your language group or one of the Gnome community channels. If there is a dispute or a non-responsive coordinator, there is a process for that too in GTP. The https://forums.puri.sm/t/translations-and-virtual-touch-keyboards-tracking-localization thread will continue as the place for more general and L5-specific translation and localization related things.
In time, there will be new iterations of translations: fixing spelling errors, inventing better phrases for previously translated strings, or translating new strings/messages. Changes may be reported via different channels: the gitlab issue tracker system or developer email, or they may appear in the .po via other means. Therefore it is important to keep an eye out for these messages and changes regularly. As the app updates, the translations often need to update too. If the apps do not update, it is good to see which of the other Gnome modules might need work.
And read the wiki articles listed in the beginning.
Problems, challenges, todo-items – wish list to make it better
There are several interconnected and separate things that I’m picking (and nitpicking) here. Things that should be commented on in some official way by Purism, some sooner, some later. To some of these there may not be an answer and some may be buried under something called “Dogwood” that’s apparently (hopefully) taking all the resources. Also, as I said, I’m looking at this from the outside and have no idea if there are reasons based on other processes that direct translation aspects. I’m only learning DL and I’m not sure Purism is quite up to speed on how to use it effectively either. Or, these thoughts have been shared somewhere else and I’ve missed them. I also want to set limits to my level of involvement, effort and time I’m willing to donate (and hope for as painless a process as possible to help), which I think is reasonable for others too, and why I’ve only gone “this deep” into some of this.
Translations page (https://developer.puri.sm/Librem5/Contact/Contributing/Translations.html) is a bit… too concise. Surely we can have better than pointing to DL, which can’t help with a lot of things.
The FAQ, Get Involved (main Puri.sm pages) and Contribute sections are missing translations/internationalization info and links. And there isn’t a link to the developer wiki, so it’s difficult to find this mode of participation. Also, the “Get involved” page is not in drop down menus (or under contacts, which could also be logical for it - as is with community forum info), but only in the footer, which to me just seems weird.
L5 docs (as well as Purism About page and other product pages) could do with a description of internationalization efforts / policy / strategy. Some questions to answer could be: how long is there support for a language, how is it supported, what can Purism guarantee, what is the connection of understanding and connecting people to safety and security, which apps are part of Purism responsibility and which are someone else’s (and what for instance does Gnome do), how to check which languages are supported (translations, localization, input methods) and to what extent, how does Purism try to support quality, what are the overall processes, can some languages be supported more and how, what to expect / vision in the future (if a path exists), etc. It’s not wrong to also describe the limitations and challenges.
The translation effort would benefit greatly from an announcement page (maybe a separate thread, possibly updated by staff only). I’d prefer a public page instead of an email list but I suppose that could work too. For instance, it was a surprise to find calls available to translate. But what’s more, it will be important for active participants to know when files have been updated and when there are deadlines (like when a new version is pushed out). Or when other events and changes occur.
The DL module pages have personal contact emails for questions, which may be normal, but I’d like to suggest some generic “translations@puri.sm” mail for more concerted effort and support (especially if there are going to be more apps to translate). There may be things that would be good to tackle as one and handle identically in all apps - no need to invent the word for wheel several times. And should a person lose their email or stop working on an app, there would still be a working contact.
A guide for getting a community app translated is needed, as otherwise it’s likely that a lot of those L5 specific apps make the whole localization haphazard (image of L5 quality and all that), even with a large number of well translated Gnome apps. The dev examples have a page about translation files (https://developer.puri.sm/Librem5/Apps/Examples/General/Treasure/Overview/Translation_Files.html#examples-treasure-po-dir) but it could be expanded to cover how to get translations and how to take translating/internationalization into account (for example: keep contexts simple and logical, keep texts concise, keep strings short but descriptive, expect translations to need more space, offer additional context for translators who may not know anything about the app, offer a contact where to ask about translations etc.). See also: the guidelines for developers regarding translations (which should be promoted) https://wiki.gnome.org/TranslationProject/DevGuidelines as well as the process of adding translation support to an existing application https://people.gnome.org/~malcolm/i18n/
The Damned Lies translation site, with regards to L5 translations (which are of course not part of Gnome system/apps development), helps to manage translated files, but helps less with the translation work. It was not built for that, and the current users recognize this as a shortcoming too. In this case the abnormal file management process (because this is not part of normal Gnome processes) may lead to a dead end. Translators normally upload their translations to the l10n server and may leave them there to be picked up (merged), or not bother at all if they notice that the files should be sent/uploaded/committed elsewhere. It’s a show stopper. Guides and hints on what to do are missing from there and from the Purism pages. And if someone’s old translation is ready to be committed, others cannot work on the newer versions (I’m unsure if and how this can be resolved).
It seems there are a few developers who have been making translation commits to L5 because they use both systems, are more familiar with them and have access, but several languages are in limbo now. The move from Zanata may have worked for those who were already involved in DL and L5 development, but the current situation is a disservice to the many who are less coding-inclined but would like to participate. This most likely will not be the end of this, as the process and information can be improved in many ways, as I’ve tried to hint here and there. Those knowledgeable few who have managed to merge translations, and know what to do, could write something and tell what the situation is with their languages.
Unlike in Zanata, there is no translation memory that is built over time, and the Gnome/DL system relies on proofreaders (https://l10n.gnome.org/help/vertimus_workflow/), which they already have organized. (A side note: they do not fully replace one another and Zanata could also use proofreaders, but they solve the same challenge). This may affect quality positively (checks, usually by safe seasoned translators) in languages where there are proofreaders, but also negatively (no knowledge of L5, harder to keep translations in line/similar). For quality control and easy/quick fixes, a contact (email/emails) needs to be available and preferably also shown visibly in the OS. [edit: Poedit does create a local translation memory that can be shared and exported]
In addition to above, Purism should pay professional translators to do (sometimes random) checks every now and then. A yearly translation audit would be a nice addition to any inclusiveness / corporate responsibility reporting. A professional service could take a look at the coordination, analyze quality, give recommendations on how to improve. And maybe have a couple of less active (or not started but interesting) languages professionally ”jump started”. This is not just for L5 but all of Purism products.
For those among us who wish to participate: to be effective and to bring the most benefit to users, translators can prioritize all the short one- or two-word strings which people mostly use. This should also improve your language’s statistics. But do not leave the longer strings for too long, as it’s usually in the more difficult aspects of an app where good translations are really needed. These are also often the ones with security implications, as a user might not clearly understand what effect their selection might have (or they may avoid certain functions because they are unsure). There is no absolute number for when an app is “translated enough” to be usable, even though some 80/20 rule may apply. But, again, it may be the inverse, as most users may know a few of a language’s basic words and the need may be in the less used, more complex strings. Mixed / partial translations may also add to confusion and lessen the feeling of control (and thus security) of the device, as the user may not fully understand what a dialog or a setting is affecting / connected to.