[[!meta title="Translation platform"]]
Until 2019, our website translation infrastructure relied on
translators [[knowing how to use
Git|contribute/how/translate/with_Git]]. This was a high barrier to
entry for new translators, especially those who are not familiar with
Git or the command line.
This is the technical design documentation of our setup.
We also provide a dedicated [[documentation for translators on how to use
Weblate|contribute/how/translate/with_Weblate]] to contribute
translations. (This link will work once [[!tails_ticket 11763]] is
Terms used in this document
- Canonical Git repository: the main Tails Git repository that our
website relies on, in scripts often called "main repository" or "main
- Production server: the server that hosts our website
- translate.lizard: the VM that hosts our Weblate web interface, the
corresponding Git repositories, as well as the staging website.
[Corresponding tickets on Redmine](https://redmine.tails.boum.org/code/projects/tails/issues?query_id=321)
Setup of our translation platform & integration with our infrastructure
We are using our own [Weblate instance](https://translate.tails.boum.org/).
Weblate uses a clone of the Tails main Git repository, to which
translations get committed, once they have been approved by a user with
reviewer status. Non-approved translations live on Weblate's database
only, until they get reviewed. A staging website allows translators to
preview non-reviewed translations in context.
Approved changes are automatically fed back into our canonical Git
repository. This presents a major challenge: we need to ensure that
- no merge conflicts occur:
- such conflicts often occur on PO file headers which prevents Weblate
from automatically merging changes
- many contributors work on the same code base using different tools
(PO files can be edited by hand, using translation software such as
POedit, or they are generated by ikiwiki itself, which results in
- only PO files are committed.
- the committed PO files comply with shared formatting standards.
- no compromised code is introduced.
In order to integrate Weblate and the work done by translators into our
process, we have set up this scheme:
[[!img "lib/design/git_repository_details.svg" link="no"]]
Website and Weblate
Our website uses ikiwiki and its PO plugin. It uses markdown files for
the English original language and carries a PO file for each translated
language. We distinguish languages that are activated on our
website from languages that have translations but are not yet activated
on the website because they do not [[cover enough of
our core pages|contribute/how/translate/team/new/]] to be considered
We have defined [[a list of tier-1
languages|contribute/how/translate#tier-1-languages]] that we consider
to be of importance to our user base. No further languages shall be
activated in Weblate, because our main Git repository carries the reviewed,
and thus approved, translations of all languages enabled on the Weblate
platform, while only some of them are active on the website.
Each PO file corresponds to a single component in Weblate, in order to
appear in the Weblate interface. For example, the component:
relates to the files support.mdwn, support.es.po, support.de.po, support.pot,
The repository used by Weblate is cloned and updated from the Tails main
repository, and its master branch. Changes generated on Weblate's copy
of the Tails main Git repository, located on the VM which hosts the
Weblate platform, are fed back to the Tails main repository, into the
master branch, automatically. This happens through a number of scripts,
checks, and cronjobs that we'll describe below.
There are several languages enabled, some of them with few or no
translations. As everything is fed back to the Tails canonical
repository, all files are available when cloning this repository:
git clone https://git-tails.immerda.ch/tails
If needed, in exceptional cases, Weblate's Git repository can be cloned
or added as a remote:
git clone https://translate.tails.boum.org/git/tails/index/
At the server the repository is located in
Weblate can commit to its local repository at any time, whenever
translations get approved. Changes done in the canonical repository by
Tails contributors via Git and changes done in Weblate thus need to be
merged in a safe place. This happens in an integration repository:
On the VM (translate.lizard), a third repository is used for the staging
Automatic merging and pushing
The integration work between the different repositories is done by a
script which is executed on the VM hosting Weblate as [a cronjob every
5 minutes](https://git-tails.immerda.ch/puppet-tails/tree/manifests/weblate.pp). The script
has the following steps which we will explain:
1. Canonical → Integration:
Update the integration repository with changes made on the
canonical repository (called "main" in the script).
2. Make Weblate commit pending approved translations locally
3. Weblate → Integration:
Integrate committed changes from Weblate into the integration repository
4. Integration → Canonical:
Push up-to-date integration repository to canonical repository.
5. Canonical → Weblate:
Pull from canonical and update the Weblate components.
6. Update Weblate's index for fulltext search
Whenever a contributor modifies a markdown (`*.mdwn`) file, and pushes
to master, the corresponding POT and PO files are updated, that is: the
translatable English strings within those files are updated. This
update happens on the production server itself, when [[building the
wiki|contribute/build/website]] for languages that are enabled on the
On the translation platform server, we need to ensure that PO files for
additional languages (that are enabled on Weblate but not on the
production website) are equally updated, committed locally, and pushed to
the canonical Git repository. On top of this, we need to update Weblate's
database accordingly, so that translations can be added for new or
modified English strings in those files, in all languages.
### Step 1: Canonical → Integration
**Update the integration repository with changes made on the canonical
The script fetches from the canonical (remote) repository and tries to
merge changes into the (local) integration repository. The merge
strategy used for this step is defined in [`update_weblate_git.py`](https://git-tails.immerda.ch/puppet-tails/tree/files/weblate/scripts/update_weblate_git.py): (XXX: Shouldn't this script be called update_integration_git.py according to this documentation?)
When this script is executed, it merges changes in PO files based on
single translation units (`msgids`). Merge conflicts occur when the same
translation unit has been changed in the canonical and the integration
repository (in the latter case, this would mean that the change has been
done via Weblate). In such a case, we always prefer the canonical
version. This makes sure that Tails developers can fix issues in
translations and have priority over Weblate.
This procedure ensures that we never end up with broken PO files. However, we
may lose a translation done on Weblate.
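The per-unit merge described above can be illustrated with a minimal Python sketch. It is not the code of `update_weblate_git.py`: the function name `merge_units`, the representation of a PO file as a dictionary mapping each msgid to its msgstr, and the explicit common ancestor `base` are all assumptions made for illustration.

```python
def merge_units(base, canonical, weblate):
    """Three-way merge of translation units, keyed by msgid.

    Hypothetical helper: each argument maps msgid -> msgstr, and `base`
    is the last version both sides had in common. When both sides changed
    the same unit, the canonical version wins.
    Returns the merged units and the msgids whose Weblate change was lost.
    """
    merged = {}
    lost = []
    # Assumption: the canonical side decides which msgids exist at all.
    for msgid, canonical_str in canonical.items():
        base_str = base.get(msgid)
        weblate_str = weblate.get(msgid, base_str)
        canonical_changed = canonical_str != base_str
        weblate_changed = weblate_str != base_str
        if canonical_changed:
            # Canonical always has priority over Weblate.
            merged[msgid] = canonical_str
            if weblate_changed and weblate_str != canonical_str:
                lost.append(msgid)
        elif weblate_changed:
            # Only Weblate touched this unit, so its change is kept.
            merged[msgid] = weblate_str
        else:
            merged[msgid] = canonical_str
    return merged, lost
```

With this sketch, a unit changed on both sides ends up with the canonical text and is reported in the second return value, which makes the "lost translation" case visible instead of silent.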
Up to this point, only PO files of languages that are activated on our
production website will be merged, as the production website, i.e. the
canonical Git repository does not regenerate PO files of non activated
Because of this limitation of ikiwiki, once the PO files of activated
languages are merged, the script checks whether the PO files of additional
languages, which are not activated on the production website, need updating.
We do this by generating POT files out of the PO file of a language that
we've previously defined as the default language. We do this for all
components. If the actual POT
file, generated on the production server, differs from the POT file we've
just created, then every additional language PO file needs to be
On top of this, if the PO file of the default language (that is, its
markdown file) has been renamed, moved or deleted, then the PO files of
additional languages need to follow suit.
In summary, our script applies all changes detected on the default
language to the additional languages.
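As a rough illustration of how renames and deletions on the default language could be mirrored to the additional languages, here is a small Python sketch. The `page.lang.po` naming follows the file names used in this document (for example `support.es.po`); the function names and the tuple format of the change list are hypothetical and not taken from our scripts.

```python
def po_path(page, lang):
    """Build the PO file name for a page, e.g. ('support', 'es') -> 'support.es.po'."""
    return f"{page}.{lang}.po"


def mirror_changes(changes, extra_langs):
    """Translate changes seen on the default language into per-language operations.

    `changes` is a list of ('rename', old_page, new_page) or ('delete', page)
    tuples (an assumed format). The result lists the file operations to apply
    to the PO files of every additional language.
    """
    operations = []
    for change in changes:
        kind = change[0]
        for lang in extra_langs:
            if kind == "rename":
                _, old_page, new_page = change
                operations.append(
                    ("rename", po_path(old_page, lang), po_path(new_page, lang))
                )
            elif kind == "delete":
                operations.append(("delete", po_path(change[1], lang)))
            else:
                raise ValueError(f"unknown change type: {kind}")
    return operations
```

Computing the operations first and applying them afterwards keeps the decision logic testable without touching a Git working tree.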
The mechanisms described above always `touch` files and change metadata, such
as `mtime`. Because of this, Git would normally create a new commit for
every such change, even though those commits often don't change the content of
files. In order to avoid these unnecessary empty commits, our script also
detects when a `fast-forward` is possible (the master branch is updated
to HEAD of either the canonical or the integration branch). If only
Weblate, or only the canonical repository, has introduced new
commits, a fast-forward can be done.
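The fast-forward condition boils down to an ancestry check: a branch can be fast-forwarded when its current head is an ancestor of the head it should move to. The following Python sketch shows that check on a toy commit graph. The dictionary-based graph and both function names are assumptions for illustration; on a real repository the same question can be answered with `git merge-base --is-ancestor`.

```python
def ancestors(parents, commit):
    """Return every commit reachable from `commit`, including itself.

    `parents` maps a commit id to the list of its parent commit ids.
    """
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


def can_fast_forward(parents, local_head, other_head):
    """True if `local_head` can simply be moved forward to `other_head`."""
    return local_head in ancestors(parents, other_head)
```

If both sides have introduced commits, neither head is an ancestor of the other, the check fails in both directions, and a real merge is needed.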
### Step 2: Trigger commits
Because Weblate tries to reduce the number of commits (aka. "lazy
commits"), we need to ask Weblate explicitly to commit every component
that has had outstanding changes for more than 24 hours.
This is done by triggering Weblate to commit pending approved
translations using the internal command ([`manage.py
### Step 3: Weblate → Integration
**Merging changes from Weblate's Git repository into the integration
Weblate's Git repository is not bare. Hence we need to pull changes
committed to Weblate's Git repository and merge them into the
integration repository. This is done by the script
Changes already present in the integration repository are preferred over
the changes from the remote Weblate repository. This allows PO files to be
fixed manually via the canonical Git repository.
Again, PO file merges are done on translation units (`msgids`).
Furthermore, we make sure via the script that Weblate has only modified
PO files; indeed we automatically reset everything else to the version
that exists in canonical.
### Step 4: Integration → Canonical
**Pushing from the integration repository to our canonical repository,
After updating the Integration repository, we push the changes back to
Canonical aka puppet-git.lizard. After this the Canonical repository has
everything integrated from Weblate.
On the side of the canonical Git repository, gitolite has a special
to make sure that Weblate is only allowed to push changes on PO files.
This hook also checks and verifies the committer of each commit, to make
sure that only translations made on the Weblate platform are automatically
pushed, and that no changes other than those to PO files are accepted. Otherwise
the push is rejected, for security reasons.
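The two checks described above (committer identity and PO-only changes) can be sketched in Python as follows. This is not the actual gitolite hook: the function name, the commit dictionaries, and the committer string used in the usage note are hypothetical.

```python
def check_push(commits, allowed_committer):
    """Return the reasons to reject a push; an empty list means "accept".

    `commits` is an iterable of dicts with the keys 'id', 'committer' and
    'paths' (an assumed format). Every commit must come from the expected
    committer and may only touch PO files.
    """
    errors = []
    for commit in commits:
        if commit["committer"] != allowed_committer:
            errors.append(
                f"{commit['id']}: unexpected committer {commit['committer']}"
            )
        for path in commit["paths"]:
            if not path.endswith(".po"):
                errors.append(f"{commit['id']}: non-PO file changed: {path}")
    return errors
```

For example, a push containing one commit by `weblate@example.org` (a made-up identity) that modifies `support.de.po` yields no errors, while the same commit touching `support.mdwn` would be rejected.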
### Step 5: Canonical → Weblate
**Integrating the changes made in the Canonical Git repository into
After having merged changes from the canonical Git repository into the
integration Git repository, and integrated changes from Weblate there,
we can assume that every PO file is now up-to-date (in the Integration
and Canonical repositories). Hence we can try to pull from the Canonical
repository using a fast-forward only (`git pull --ff-only`). The Canonical and
Weblate repositories can get new commits at any time, including while the cronjob is
running. A new commit on either side (Canonical or Weblate) can therefore
make it impossible to perform a fast-forward. If we can't fast-forward the Git
repository, the cronjob runs again 5 minutes later anyway. Steps 1, 3 and 4 of
that run then resolve whatever prevented the fast-forward this time.
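The "try a fast-forward, otherwise wait for the next run" behaviour can be sketched in Python like this. Only `git pull --ff-only` comes from this document; the function name, the repository path argument, and the injectable `run` parameter (used here so the logic can be exercised without a real repository) are assumptions.

```python
import subprocess


def pull_fast_forward(repo_dir, run=subprocess.run):
    """Try `git pull --ff-only` in `repo_dir` and report whether it worked.

    Nothing is retried or repaired here: on failure the caller simply gives
    up for this run and relies on the next cron run to try again.
    """
    result = run(
        ["git", "-C", repo_dir, "pull", "--ff-only"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

A caller would only continue with follow-up work when the function returns `True`. Returning `False` instead of raising keeps a failed fast-forward a normal, expected outcome rather than an error.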
If the fast-forward was successful, we need to update Weblate's components
to reflect the modifications that happened on the side of Git, such as
string and file updates, removals, renames, or additions. This is
handled by another script:
We are not the only process touching the Weblate repository, as Weblate itself
creates commits and updates the master branch. That's why the script uses
its own Git remote named `cron` to keep track of which commits it needs to look at
for Weblate component changes. This remote name is set in the
and used in the cronjob `update_weblate_components.py
### Step 6
This updates Weblate's index for fulltext search. Weblate recommends
running it every 5 minutes.
In order to allow translators to see their non-committed suggestions as
well as languages which are not activated on https://tails.boum.org, we
have put in place a [staging website](https://staging.tails.boum.org/).
It is a clone of our production website and is regularly refreshed.
On top of what our production website has, it includes:
- all languages available on Weblate, even those that are not enabled
on our production website yet;
- all translation suggestions made on Weblate.
This allows:
- translators to check how the result of their work will look
on our website;
- reviewers to check how translation suggestions look on the
website, before validating them.
- check the sanity-check-website report:
### What is done behind the scenes to generate a new version of the staging website?
This cronjob calls a script that extracts suggestions from Weblate's
database and applies them to a local clone of Weblate's Git repository,
after having updated the clone with newer data from Weblate's VCS.
After that we run `ikiwiki --refresh` using a dedicated `ikiwiki.setup`
file for the staging website.
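The idea of overlaying suggestions on top of the committed translations can be shown with a short Python sketch. It does not reflect how our script reads Weblate's database: the function name, the msgid-to-msgstr dictionary, and the rule that the most recent suggestion wins are all assumptions.

```python
def apply_suggestions(units, suggestions):
    """Overlay suggestions on committed translations for the staging build.

    `units` maps msgid -> msgstr as found in the Git clone. `suggestions`
    is a list of (msgid, text) pairs ordered from oldest to newest, so a
    later suggestion overrides an earlier one (assumed policy). Suggestions
    for msgids that no longer exist are ignored.
    """
    staged = dict(units)  # work on a copy, the input is left untouched
    for msgid, text in suggestions:
        if msgid in staged:
            staged[msgid] = text
    return staged
```

Working on a copy mirrors the fact that the staging data is throwaway: the committed translations stay unchanged.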
None of the changes on this repository clone are fed back anywhere and they
Access control on the Weblate platform
- Every translation change must be reviewed by another person before
it's validated (and thus committed by Weblate and pushed to our
- This requirement must be enforced via technical means, for
translators that are not particularly trusted (e.g. new user
accounts). For example, it must be impossible for an attacker to
pretend to be that second person and validate their own changes,
simply by creating a second user account.
- It's acceptable that this requirement is enforced only via social
rules, and not via technical means, for a set of
- We need to be able to bootstrap a new language and give its
translators sufficient access rights so that they can do their job,
even without anyone at Tails personally knowing any of them.
- Suggested translations are used to build the [[staging
Currently implemented proposal
- In Weblate lingo, we use the [dedicated
workflow: it's the only one that protects us against an adversary
who's ready to create multiple user accounts.
- When not logged in, a visitor is in the `Guests` group and is
only allowed to suggest translations.
- Every logged in user is in the `Users` group. Members of this group
are allowed to suggest translations but not to accept suggestions
nor to directly save new translations of their own.
- A reviewer, i.e. a member of the `@Review` group in Weblate, is
allowed to accept suggestions.
- Technically, reviewers are also allowed to directly save new
translations of their own, edit existing translations, and
accept their own suggestions; we ask them in our
documentation to use this privilege sparingly, only to fix
important and obvious problems.
Even if we forbade reviewers to accept their own suggestions,
nothing would prevent them from creating another account, making
the suggestion from there, and then accepting it with their
- Reviewer status is global to our Weblate instance, and not
per-language, so technically, a reviewer can very well accept
suggestions for a language they don't speak. We will ask them in
our documentation to _not_ do that, except to fix important and
obvious problems that don't require knowledge of that language
(for example, broken syntax for ikiwiki directives).
If this ever causes actual problems, this could be fixed with
- How one gets reviewer status:
- We will port to Weblate semantics the pre-existing trust
relationship we already have towards translation teams that have
been using Git so far: they all become reviewers.
To this aim, we have asked them to create an account on Weblate
and tell us what their user name is.
- One can request reviewer status from Weblate administrators, who
1. Accept this request if, and only if, a sufficient amount of
work was done by the requesting translator (this can be checked on
the user's page, e.g.
In other words, we use proof-of-work to increase the cost of attacks.
2. Let <firstname.lastname@example.org> and all the other Weblate reviewers
know about this status change.
- Bootstrapping a new language
As a result of this access control setup, translators for a new
language can only make suggestions until they have done a sufficient
amount of work and two of them are granted reviewer status. In the
meantime, they can see the output of their work on the [[staging
- Is the resulting UX good enough? Would it help if we allowed them
to vote up suggestions, even if this does not result in the
suggestion being accepted as a validated translation?
(At the moment, suggestion voting is disabled.)
This is important because it saves time for the translators, especially
in cumbersome documents, and helps us to be consistent not only with our
translations but, for example, with the Debian locales if we feed them
to the tmserver.
It is a very subtle way of increasing the quality of our translations.
It should give suggestions when you are translating, under the translation
window, tab 'Machine translation'.
We use tmserver for 'Machine translation'. You find the documentation
In order to update the suggestions, we run
[`update_tm.sh`](templates/weblate/update_tm.sh.erb) via a monthly cronjob.
The tmserver can be queried like this [(see
- [[Enabling a new language|contribute/l10n_tricks#weblate-administration]]
Make sure to enable languages only if they are part of our tier-1
list or discuss the matter on the l10n-mailing list.
- Sysadmin: This documentation currently still lives in
translate-server.git and should be moved somewhere else.
Manually fix issues
We have our weblate codebase at
If commands have to be run, they should be run as user weblate (`sudo -u weblate $COMMAND`).
However, this VM is supposed to run smoothly without human
intervention, so be careful with what you do and please document
modifications you make so that they can be fed back to puppet.git or other
places if necessary.
Reload translations from VCS and cleanup orphaned checks and suggestions
The following commands may be run manually if something went wrong:
- Reload all translations from the proper folder on disk into Weblate's
components (e.g. in case you made some updates in the VCS repository):
`sudo -u weblate ./manage.py loadpo --all`
Weblate installation and maintenance
A hybrid approach
The Tails infrastructure uses Puppet to make it easier to enforce and
replicate system configuration, and usually relies on Debian packages to
ensure stability of the system. But the effort to maintain a stable
system somewhat conflicts with installing and maintaining Weblate, a
Python web application, which requires using up-to-date versions of
Weblate itself and of its dependencies.
Having that in mind, and taking into account that we already started
using Docker to replicate the translation server environment to
experiment with upgrading and running an up-to-date version of Weblate,
it can be a good trade-off to use Puppet to provide an environment to
run Docker, and to use a Docker container to actually run an up-to-date
From the present state of the Docker image, which currently uses
(slightly modified/updated) Puppet code to configure the environment and
then sets up Weblate, the following steps could be taken to achieve a
new service configuration as described above:
* Move the database to a separate Docker service.
* Remove all Puppet code from the Docker image: inherit from the
simplest possible Docker image and setup a Weblate Docker image with
all needed dependencies.
* Modify the Puppet code to account for setting up an environment that
has Docker installed and that runs the Weblate Docker image.
* Set up persistence for the Weblate git repository and configuration.
* Set up persistence and backups for the database service.
* Update the Puppet code to run tmserver (if/when it's needed -- latest
Weblate accounts for basic suggestions using its own database).
After that, we should have a clear separation between stable
infrastructure maintenance using Debian+Puppet on one side and
up-to-date Weblate application deployment using Docker on the other
side. The Docker image would have to be constantly maintained to account
for Weblate upgrades, but that should be easier and cleaner than deploying
Weblate directly on the server.
Long-term maintenance plan
This is work in progress. A plan for the future maintenance of our
Weblate instance will be worked on in November 2019 and laid out here
before the end of the year.
Choosing a translation web platform
These are the requirements that we have defined for our translation web platform.
* provide a usable, easy web interface
* be usable from Tor Browser
* automatic pull from main Git repo
* provide a common glossary for each language, easy to use and improve
* allow translators to view, in the correct order, all strings that
come from the entire page being translated, both in English and in
the target language
* make it easy to view the translations in context i.e. when translating
an entire page, all strings to be translated should only come from
this page. Translators should be able to view the page in context.
* provide user roles (admin, reviewer, translator)
* be "privacy sensitive", i.e. be operated by a non-profit
* allow translators to push translations through Git (so that core
developers only have to fetch reviewed translations from there)
* provide support for our standard Git development branches (devel, stable,
and testing), although we could also agree to translate only master
through this interface
* provide checks for inconsistent translations
* provide a feature to write/read comments between translators
* allow translating topic branches without confusing translators,
causing duplicate/premature work, fragmentation or merge conflicts
-- e.g. present only new or updated strings in topic branches;
* provide a feature to easily see what is new, what needs updating, and what the translation priorities are
* provide the possibility to set up new languages easily
* send email notifications
- to reviewers whenever new strings have been translated or updated
- to translators whenever a resource is updated
* respect authorship (different committers?)
* provide statistics about the percentage of translated and fuzzy strings
* let translators report problems in original strings, e.g.
with a "Report a problem in the original English text" link, that
e.g. results in an email being sent to -l10n@ or -support-private@.
If we don't have that, then [[contribute/how/translate]] MUST
document how to report issues in the original English text.