Translations and virtual touch keyboards - tracking localization

JR-Fi · November 2, 2019, 3:42pm

This thread combines information from several sources to give a better picture of the localization effort (translations and languages, regional variations as well) at the moment. Both input and output need to be (at least mostly) translated for an average user (most of us here understand English well enough, so we are not average).

Currently there are virtual touchscreen keyboard layouts for the following languages in Librem 5:

de / German
el / Greek
es / Spanish
it / Italian
ja (kana) / Japanese
se / Swedish [sv]
us / English
fi / Finnish
no / Norwegian
(fr / French pending)
(ru / Russian in the works)

More conversations on them in Using non-latin language on Librem 5, How to translate Librem 5 software? and Librem 5 available languages.

And for comparison, a table:

The core (most used) programs, phosh and chatty (or chats, as the name seems to be changing), are currently conveniently translatable by the community with the translation tool Zanata.
In addition to those, I’ve also included translation tracking info (only clear UI strings - no “fuzzy” ones included, which might increase the percentage and usability) for a few of the most used programs: contacts management (gnome-contacts), web browser (Gnome epiphany), email (Gnome evolution) and Pure maps (in Transifex).
To condense the table, some languages that have variations etc. have been combined if they have essentially equal percentages.
And for a better overall picture, after selecting those that seemed most translated, there are the general Gnome translation percentages by category as well (which include epiphany and evolution). Gnome has plenty more languages than are in this L5 list (and they have their own translation teams that can be joined too).

(edit: table updated 28th Nov 2019)

The points that are hopefully clarified from this:

A better idea of which keyboard layouts are most wanted/needed and where activity is needed (as you need an input method as well as translated apps - that most early adopters probably know English is beside the point).
Inversely: if there is a layout, it’s more motivating to translate, so activity is then needed on the translation side (identifying “gaps” that can be easily fixed).
A better idea of how international / compatible the L5 is (currently) and who could comfortably buy/use it.

These are all in development and subject to swift changes (for instance, chats [formerly chatty] has just updated with a few strings and many languages now need to update translations). The listed languages cover almost 4 billion people but several large language groups are missing - as are many more of the smaller.

I expect some coordination and up-to-date tracking/info is needed in this department now that phones are about to be distributed to the masses globally. It would be nice if something similar could crop up somewhere official too - preferably automated (one can wish).

For those wanting to contribute to translations: https://developer.puri.sm/Librem5/Contact/Contributing/Translations.html, and as an opinion, it’s pretty easy for anyone - no programming skills needed. Good tips on making keyboard layouts in this thread!

For those wanting to track the issue with Zanata [has reliability issues occasionally but works now]: Zanata translation server is down and zanatas future? (Purism staff is considering long term solutions).

(edit to add: I’ll update the table at some point next month)

nhu · November 2, 2019, 4:14pm

As far as I know Swedish and Finnish keyboards are identical.

JR-Fi · November 2, 2019, 4:20pm

Not in every case, but here, yes. I’ve asked for it to be used. Very convenient. I wonder if there are other languages that can similarly copy / have identical layouts…?

nhu · November 2, 2019, 5:57pm

I have been using Danish and even German keyboards for Swedish and it is no problem as long as you remember a few keys that are different.

JR-Fi · November 2, 2019, 6:08pm

Well… There’s always what you can “make do” with, but I doubt that’s viable or convenient in the long run. The keyboard is one of the most important parts of the user experience. Yet, there will probably always be some groups that have to default to something else - especially until someone has the time/knowledge/language-skill/motivation combo to do something about it.

Factoid: only five more language groups are needed to include 1.5 billion more people, and only five people (one for each) could unlock those users by translating. One can potentially do much for many.

amosbatto · November 2, 2019, 11:15pm

Thanks @JR-Fi for the status update. You shamed me into updating the Spanish translation. The Zanata server seems to be working fine right now.

Unfortunately, it doesn’t have Quechua in the list of languages to translate.

Quarnero · November 3, 2019, 10:57am

Yes, but “making do” with the original language is simply the better option if one’s translation is not as precise as it should be (in the long run). If we take the Croatian-language site (upper right tab, fina.hr/finadigicert) of the Financial Agency (Fina), the leading Croatian provider of financial and electronic services, as an authority, then it is obvious that digital(ni) certifikat is used as the common word for certificate. Yet within https://l10n.gnome.org/POT/epiphany.master/epiphany.master.hr.po, for example, we find certificate translated simultaneously as vjerodajnica (pl=vjerodajnice), potvrda (pl=potvrde) and certifikat (pl=certifikata), even though the direct translation of certificate into Croatian is uvjerenje.

Why not accept and introduce the Croatian word certifikat within Gnome, as the simplest and most direct replacement for certificate? Should I draw a picture of a certificate for my Croatian fellows who think they know Croatian better? Otherwise I cannot help (when time is available). @JR-Fi, do you have a proposal for how to coordinate this, so that (maybe) I or someone else can focus on doing some other (missing) translations?

To sum up, Croatians have four words for a (digital) certificate in the singular: certifikat, uvjerenje (not used), potvrda and even elektroničku vjerodajnicu (eID). I would like to see the first one (certifikat) used, as otherwise the whole translation is broken (someone finds each term partially appropriate, but they are not used consistently throughout the hr.po file/files). Currently this is just for the record (for myself it is irrelevant), but I hope raising this issue helps somehow, if someone has the good will (and logic) to translate correctly / do an important job properly (as it is/would be for someone else). @JR-Fi, please don’t judge every word that I wrote here, as this initiative of yours is something great, thank you!

Quarnero · November 3, 2019, 11:02am

@amosbatto why not take the https://l10n.gnome.org/POT/epiphany.master/epiphany.master.pot (or https://l10n.gnome.org/POT/epiphany.master/epiphany.master.es.po) and https://l10n.gnome.org/POT/evolution.master/evolution.master.pot (or https://l10n.gnome.org/POT/evolution.master/evolution.master.es.po) files, save those as epiphany.master.qc.po / evolution.master.qc.po (qc is just a replaceable abbreviation for Quechua) and upload them here? This might be accepted by the Purism/Gnome community as well, I don’t know.
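As a rough sketch of what “saving the .pot as a .po” involves: a PO file for a new language is essentially the template with its header filled in and the msgstr lines still empty. The Python below only fills the Language header field. The sample template text and the function name start_translation are made up for illustration, and gettext’s msginit is the usual tool for doing this properly.

```python
import re

# Hypothetical, shortened template text; a real .pot file has many more
# header fields and entries.
SAMPLE_POT = r'''msgid ""
msgstr ""
"Project-Id-Version: example\n"
"Language: \n"
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Certificate"
msgstr ""
'''

# Matches the header line "Language: ...\n" as it is written inside a PO file
# (the backslash and the n are literal characters there).
LANGUAGE_HEADER = re.compile(r'^"Language:[^"\\]*\\n"$', flags=re.MULTILINE)


def start_translation(pot_text: str, lang_code: str) -> str:
    """Return PO text for a new language based on a POT template.

    Only the Language header field is filled in; every msgstr stays empty
    so that translators can start from a clean file.
    """
    return LANGUAGE_HEADER.sub(
        lambda match: f'"Language: {lang_code}\\n"', pot_text, count=1
    )


# "qu" is the ISO 639-1 code for Quechua; swap in whatever code the
# translation server ends up using.
quechua_po = start_translation(SAMPLE_POT, "qu")
print(quechua_po)
```

The result would then be saved under the language’s file name (for example epiphany.master.qu.po) before translating the entries.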

ruff · November 3, 2019, 11:13am

I’m pretty sure it is subjective, depending on the person making the translation. Some people are used to borrowed words from other languages, others insist on using local (pure) wording.
When you read a text written in a common style, you usually do not notice that and simply accept the author’s vocabulary. But I agree that when the same text (or interface, in this case) uses mixed wording, one would assume it is done to reflect different semantic meanings behind the words - e.g. uvjerenje for a paper identity certificate and certifikat for the PKI digital cert.
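Mixed wording like this can be spotted mechanically. The Python below is only a sketch under loud assumptions: the PO fragment is invented for illustration (only the Croatian terms come from the posts above), the stems are a crude way to catch inflected forms, and the simple regular expression handles single-line msgid/msgstr pairs only, not plural forms, msgctxt or multi-line strings.

```python
import re
from collections import Counter

# Hypothetical PO fragment: the entries are invented for illustration,
# only the three Croatian terms come from the discussion above.
SAMPLE_PO = '''
msgid "Certificate"
msgstr "Vjerodajnica"

msgid "Certificates"
msgstr "Potvrde"

msgid "Certificate:"
msgstr "Certifikat:"

msgid "Close"
msgstr "Zatvori"
'''

# Candidate translations, keyed by dictionary form, with a stem that also
# matches the plural forms quoted in the thread.
STEMS = {
    "certifikat": "certifikat",
    "potvrda": "potvrd",
    "vjerodajnica": "vjerodajnic",
}

# Single-line msgid immediately followed by a single-line msgstr.
ENTRY = re.compile(r'^msgid "(.*)"\nmsgstr "(.*)"$', flags=re.MULTILINE)


def term_usage(po_text, source_term, stems):
    """Count which candidate translation is used wherever source_term occurs."""
    usage = Counter()
    for msgid, msgstr in ENTRY.findall(po_text):
        if source_term not in msgid.lower():
            continue
        for name, stem in stems.items():
            if stem in msgstr.lower():
                usage[name] += 1
    return usage


usage = term_usage(SAMPLE_PO, "certificate", STEMS)
for name, count in usage.most_common():
    print(name, count)
if len(usage) > 1:
    print("inconsistent: more than one translation in use")
```

Run against a real hr.po file, a tally like this would show how often each term is used, which gives translators something concrete to discuss.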

Quarnero · November 3, 2019, 3:05pm

Thanks @ruff for your reply and the actual correction! You are absolutely right (uvjerenje=paper) and your Croatian is obviously better than mine. I actually proposed the same thing: to use certifikat and avoid potvrda or vjerodajnica when replacing certificate, in order to reach more consistency and achieve clear continuity. Do you have some time to properly coordinate what is requested here (at least the epiphany and evolution translations)? If you wish, I can do some of the drafting and send it to you (or post it here as a draft … in more than several days), or vice versa, revise your drafts, but my time for this will be tight and therefore I cannot take the lead/charge. Please contact the author as necessary.

JR-Fi · November 3, 2019, 3:37pm

@Quarnero not a bad question. You just mixed up two different things: keyboard (input) and translations (“output”). Most languages with Latin-based alphabets can “make do” with a US or other Latin-based keyboard but have difficulties with, for instance, accented characters. Translation is the area where you have the challenge of selecting the right word, and I’ll try to answer that:

This - selecting the best possible term for a word that A) has several options B) has only almost matching options C) has other meanings as well as the original D) changes differently according to context E) all of them - is a common challenge in translations and localizations (as different locations/areas may have different cultural implications or preferences). This is normal, so don’t worry.

The main goal is that the translations are mostly understandable as a whole - sometimes you just have to live with “close enough”. We can only aim towards perfection but in reality translations are never exact - cultural ideas do not match, for instance, language does not work that way. From there on, it’s about trying to better the quality.

(edit to add: The term “fuzzy” is also used as a label to mark words/phrases that aren’t exact and are either “good for now” and/or need attention or more info)

The tools, like Zanata (Transifex is a similar tool), offer several ways to sort out the best option, and it’s a somewhat iterative process (as in, a couple of tries/versions over time [as you get better ideas or feedback from developers and users]).

First, as several programs are translated into your preferred language, some common terms start to form a database (I think it was called translation memory in Zanata - I could remember wrong though) of how they are usually translated, and that helps to establish a logical translation - the same term for the same use every time (which helps users to better understand the logic). A good way to get that started is to copy what has been used in similar programs and situations before.
Second, there is a comment section for each word (below the translations in Zanata), where you can put notes, alternatives or cite examples if something similar has been done elsewhere or if you are not sure - especially because sometimes the words or phrases do not give enough information about what they are associated with (what actions, what images etc.). Sometimes it feels like we are doing this blind, so again, learn to accept that “close enough” is sometimes the best you can do, and you (or someone who comes after you) can use those comments to select something else. It is not entirely impossible that you might even have to come up with a new term/word, and then it’s good to have some notes on why it’s good.
Third, there is a chat channel (a button in Zanata on the upper right side), where you can ask about and discuss the best-sounding option for something - if there are others working in your language on that same program.
Fourth, with a larger group of volunteers, it is good to organize into groups: translators and inspectors (with an optional coordinator, if the group is large and there are many apps to translate, just to make sure everything gets equal attention). Translators do the bulk of the work and inspectors check the details and debate semantics. Not all languages have this luxury, in terms of both quality assurance and the possibility of different kinds of participation and different kinds of work.

To sum up, you do a good enough translation first, then you make it better. In my experience, try to be consistent and if necessary make an exception to make sure the user will understand (even if it’s odd or long). Only after that you can start making it short, concise, cool or whatever. Talk to your group/friends or check other apps or official dictionaries. Sometimes those are needed, as in:

long text does not fit easily to display,
long text may not be read or understood,
abbreviation may be better when a long word covers an image (takes up space on display),
official words/terms in your language may be/sound worse than “normal speak” or borrowed terms from other languages etc.

And one more thing: remember, there are some terms that are technical or part of code or similar and need to be left alone or be very specific (type or length).

Did this help with your problem?

I’m hoping that sooner, rather than later, there will be some coordination from Purism regarding translations. But expect that we can be pretty self-organized and dependent on our own activity within our own language groups.

(edit to add: I noticed you already had an answer from @ruff as well. I do suggest, though, that you try to keep language-specific and word-specific discussions in Zanata (or whatever tool is used) so that they are stored there A) for easy use and finding for all, and B) so as not to clog threads here [or set up a separate forum thread just for translations in your language])

JR-Fi · November 3, 2019, 3:55pm

@amosbatto Well, if it activated even one person, this has not been a waste. Although I was hoping some other emotion would be the motivator.

I think there is a way to get Quechua, if you want it. I think it’s the app maintainer (or whoever put the translation project/files into Zanata) who can add that language there. Quechua’s language code is QU, isn’t it? I’ll send a message and see if I can find the right person to do that (if you know better, you can do that yourself too).

JR-Fi · November 3, 2019, 4:21pm

Regarding this: I have that same feeling with my language. I had to kind of force myself to make the “official” translation a generic one and use “pure” wording as much as possible (instead of “common speech” phrasing). My hope is that at some point I can copy the file and work on a “sub-language” translation (think en → en_NZ, en_GB, en_US etc. or even → en_US_NY_BR as in English - American - New York - Brooklyn accent/attitude). For someone more tech savvy, a translation that has error messages in English and most things abbreviated could be more convenient in daily use. But it would not work for the average user.

dcz · November 3, 2019, 4:25pm

Thank you for putting this table together! I also appreciate the effort to give us info of what people are looking for.

You make some good points, your post is awesome, and I will bring it up internally.

Quarnero · November 3, 2019, 5:21pm

Clear and as educational as it should be! I agree that my writing/problem here was a kind of provocation just to make things right, and you didn’t get me wrong. I have never visited Zanata, but now I understand how these translation efforts work. I’ve learned here that everyone with some other language background should join Zanata ASAP and concentrate on what is to be translated correctly (without introducing terms that are not in common usage in software language), as this makes things easier for younger ones who are not yet familiar with English.

leo-numethik · November 3, 2019, 6:54pm

Is there any chance that a French keyboard layout will come out one day? (azerty or bépo)
We have some specific letters/accents that we use quite often: ç, ù, à, â, æ, ê, ë, œ, etc.

antpanlinux · November 3, 2019, 6:57pm

Hi,
take the Italian keyboard (which has similar symbols) and change what you want.
Regards

JR-Fi · November 3, 2019, 7:14pm

@antpanlinux that’s a good idea if they fit. Can @leo-numethik make sense of the layout below (taken from https://source.puri.sm/Librem5/squeekboard/blob/master/data/keyboards/it.yaml - there are several views, each with three rows of characters, and I expect you’d need to change the last view) and work it into a layout that works? If it doesn’t fit, we’ll need someone smarter… [to add more places for characters]

The idea being, that with a layout proposition, someone (?) can make the necessary files and submit for approval. But for this step a native speaker is needed. I suggest also looking at current layouts in other devices for your particular language.

(old version folded to save space)

views:
    base:
        - "q w e r t y u i o p"
        - "a s d f g h j k l"
        - "Shift_L   z x c v b n m  BackSpace"
        - "show_numbers show_eschars preferences         space        , period Return"
    upper:
        - "Q W E R T Y U I O P"
        - "A S D F G H J K L"
        - "Shift_L   Z X C V B N M  BackSpace"
        - "show_numbers show_eschars preferences         space        ? period Return"
    numbers:
        - "1 2 3 4 5 6 7 8 9 0"
        - "@ # € % & - _ + ( )"
        - "show_symbols   , \" ' colon ; ! ?  BackSpace"
        - "show_letters show_eschars preferences         space        ? period Return"
    symbols:
        - "~ ` | · √ π τ ÷ × ¶"
        - "© ® £ $ ¥ ^ ° * { }"
        - "show_numbers   \\ / < > = [ ]  BackSpace"
        - "show_letters show_eschars preferences         space        ? period Return"
    eschars:
        - "á é í ó ú Á É Í Ó Ú"
        - "à è ì ò « » ù ! { }"
        - "show_numbers   \\ / < > = [ ]  BackSpace"
        - "show_letters show_eschars preferences         space        « » Return"

leo-numethik · November 3, 2019, 7:33pm

Most of what we need is indeed accessible in the Italian mapping (just a matter of reordering some letters), except for ç, the circumflex ^ (â, ê, î, ô, û), æ (very rare in our language) and œ.

Unfortunately, as it is, it cannot be used in an ordinary French conversation (except by deliberately making phonetically equivalent mistakes).

I hope some developer will work on it in the future.

JR-Fi · November 3, 2019, 7:48pm

Do I understand right, that you need all of these (lower and upper case) characters?

â ê î ô û Â Ê Î Ô Û
à è ì ò ù À È Ì Ò Ù
á é í ó ú Á É Í Ó Ú
ç, æ œ [uppercases for all of these too?]

Can you find / take a picture of a touch keyboard where they have been laid out in a sensible way and post it here? This is just a case of calculating the number of characters first and then seeing where to find space for them.
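The counting step can be roughed out in Python. This is only a sketch: the requested characters are the ones listed above, the eschars rows are copied from the it.yaml quoted earlier in the thread, and two things are my own assumptions: that upper-case Ç, Æ and Œ are wanted too, and that any key name longer than one character (show_numbers, BackSpace and so on) is an action key rather than a character. The helper names are made up.

```python
import unicodedata

# Characters asked about above, one string per line of the list.
requested_rows = [
    "â ê î ô û Â Ê Î Ô Û",
    "à è ì ò ù À È Ì Ò Ù",
    "á é í ó ú Á É Í Ó Ú",
    "ç æ œ",
    "Ç Æ Œ",  # assumption: upper-case forms are wanted as well
]

# The three character rows of the "eschars" view from it.yaml.
eschars_view = [
    "á é í ó ú Á É Í Ó Ú",
    "à è ì ò « » ù ! { }",
    "show_numbers   \\ / < > = [ ]  BackSpace",
]


def keys(row):
    """Split a row into keys, normalising accented letters to one code point."""
    return unicodedata.normalize("NFC", row).split()


def character_keys(row):
    """Return only the single-character keys of a row (no action keys)."""
    return [key for key in keys(row) if len(key) == 1]


requested = {key for row in requested_rows for key in keys(row)}
view_characters = [key for row in eschars_view for key in character_keys(row)]

available_slots = len(view_characters)
already_present = requested & set(view_characters)
missing = sorted(requested - already_present)

print(f"requested: {len(requested)}, character slots in one view: {available_slots}")
print(f"already in eschars: {len(already_present)}, still missing: {len(missing)}")
print(" ".join(missing))
```

With these inputs the request comes to 36 characters against 27 character slots in one view, so they would not all fit into a single view even if every existing symbol were replaced.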