feature/killsdf branch merged

 

What was SDF and why did we kill it?

The LibreOffice source code contains translatable content in various file formats. It was desirable to present that content to translators in a single file format, so that they did not have to learn how to edit each one. Therefore, back in the OpenOffice.org era, the SDF file format was invented. It was a simple tab-separated text file. Localization tools extracted translatable content into SDF, and localized SDF files were merged back into the source code during the build.
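
To make the "simple tab-separated text file" idea concrete, here is a minimal Python sketch of reading such records. The five-column layout, the sample strings, and the names `FIELDS`, `read_records`, and `translations_for` are illustrative assumptions; the real SDF format had its own column layout, which is not described in this post.

```python
import io

# Hypothetical sample data: this column layout is an assumption made for
# illustration and is NOT the real SDF column specification.
SAMPLE = (
    "sw\tsource/ui/app.src\tSTR_HELLO\ten-US\tHello\n"
    "sw\tsource/ui/app.src\tSTR_HELLO\thu\tSzia\n"
)

FIELDS = ["module", "source_file", "string_id", "language", "text"]


def read_records(stream):
    """Yield one dict per well-formed tab-separated line."""
    for line in stream:
        row = line.rstrip("\n").split("\t")
        if len(row) != len(FIELDS):
            continue  # skip malformed lines instead of guessing
        yield dict(zip(FIELDS, row))


def translations_for(records, language):
    """Map string_id -> text for a single language."""
    return {r["string_id"]: r["text"] for r in records if r["language"] == language}


records = list(read_records(io.StringIO(SAMPLE)))
hungarian = translations_for(records, "hu")
```

A format like this is trivial to process with scripts, which is exactly why it suited programmers better than translators.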

One cannot imagine a simpler file format than tab-separated text. When I translated OpenOffice.org into Hungarian, I built a tool set around it and translated happily. But not every translator was a programmer or comfortable with scripts. Translators demanded the PO file format, which is a quasi-standard for localization in the open source world. Translate Toolkit and Pootle became part of the localization process: the en-US SDF file was converted to POT (PO Template) files, and translated PO files were converted back to SDF manually.

When the LibreOffice project started, I wanted to improve the process for two reasons.

  1. Huge SDF files bloated the git repository. Git does not work well with large, frequently updated text files.
  2. Manual conversion from PO to SDF was a tedious job.

Therefore I changed the workflow to store PO files in git and convert them to SDF automatically during the build. In LibreOffice 3.4 we used po2oo from Translate Toolkit for the back-conversion. In LibreOffice 3.5 po2oo was replaced by po2lo, written by Miklos Vajna, which was 30 times as fast as po2oo.

But why convert translations from one format to another? Why not use PO files directly? This is what was developed on the feature/killsdf branch over the past few months. Most of the programming work was done by a young developer, Tamas Zolnai, who was a trainee at Novell Hungary in the summer and is now a fellow of the FSF.hu Foundation (FSF.hu sponsors his work on LibreOffice). Now the localization tools extract translatable content from the source code directly into PO files, and they read PO files directly during the build. SDF files are no longer generated. At the same time, the localization tools were refactored. The Perl scripts are gone; all tools are in C++ now. Further cleanup and optimizations are under way on the master branch. I fixed the last remaining issues after the merge at the Munich Hackfest 2012.
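
As a rough illustration of what reading PO files directly involves, the sketch below parses a small PO fragment in Python. It is not the C++ code from the branch: the sample entries, the `msgctxt` values, and the function names `unquote` and `parse_po` are hypothetical, and only the basic `msgctxt`/`msgid`/`msgstr` keywords, continuation lines, and a few escapes are handled.

```python
# Hypothetical PO fragment; the msgctxt values are made up for illustration.
SAMPLE_PO = '''\
#. hypothetical extracted comment
#: app.src
msgctxt "app.src#STR_HELLO"
msgid "Hello"
msgstr "Szia"

msgctxt "app.src#STR_LONG"
msgid ""
"First line\\n"
"second line"
msgstr ""
"Első sor\\n"
"második sor"
'''

_ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def unquote(token):
    """Strip the surrounding quotes and resolve a few C-style escapes."""
    body = token.strip()[1:-1]
    out, i = [], 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            out.append(_ESCAPES.get(body[i + 1], body[i + 1]))
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


def parse_po(text):
    """Return a list of dicts with msgctxt/msgid/msgstr keys."""
    entries, current, field = [], {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:  # a blank line ends the current entry
            if "msgid" in current:
                entries.append(current)
            current, field = {}, None
        elif line.startswith("#"):  # comments and source references
            continue
        elif line.startswith('"'):  # continuation of the previous field
            if field is not None:
                current[field] += unquote(line)
        else:
            keyword, _, rest = line.partition(" ")
            if keyword in ("msgctxt", "msgid", "msgstr"):
                field = keyword
                current[field] = unquote(rest)
    if "msgid" in current:  # the last entry may lack a trailing blank line
        entries.append(current)
    return entries


entries = parse_po(SAMPLE_PO)
```

With entries in this form, a build step can look up a translation by context and source string without any intermediate format.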

About the author

Tímár András began working on free software localization in 1999. He joined the FSF.hu Foundation when it was founded and also took on a leadership role there. Since 2002 he has worked on the Hungarian versions of OpenOffice.org (LibreOffice from 2010). He has been a full-time LibreOffice developer since 2011 and currently works for Collabora Productivity Ltd.

Comments

  1. Congratulations to all involved in this gargantuan work!

  2. Olivier Hallot says:

    Back in 2002, circa OpenOffice release 1.0.x, I worked on the translation into Brazilian Portuguese. That huge SDF file was loaded into an OO Calc file and I started translating inside the spreadsheet. To give an idea of the tweak, the UI SDF file was larger than 32,000 lines, so I had to keep only the en-US part (25K lines) and translate in place.

    Later, with the advent of the PO format, I used KBabel, and the major advancement was really the Pootle server, which relieved me of keeping and managing versions of the translation.

    Heroic days at that time

    Thanks Andras and Tamas for the nice work

  3. Super work. I was under way with a similar project under the AOO label; I have made some analysis of the current situation and proposed a whole new workflow. The analysis is available on wiki.openoffice.org (http://wiki.openoffice.org/wiki/Localization_AOO, http://wiki.openoffice.org/wiki/Localization_AOO/new_proposal)

    I look forward to seeing the sources, and to finding out whether some of the enhancements I have been thinking of can be integrated into your tool.

    Would you be interested in cooperating to make it available for AOO as well?

    Thanks for your initiative, good work.
    Jan I.

    • Hi Jan,

      Thanks for your kind words.

      The source code of these tools is in the LibreOffice git repository. You can reuse it; it is licensed under MPLv2. This is not entirely new code: we have just refactored the existing l10ntools. The classes that handle .po files are new, and we have rewritten a few Perl scripts in C++. Having said that, the tools depend on sal, which may be incompatible with the AOO sal. Given the nature of the refactoring, it would have been a waste of time to rewrite everything to use only standard C++ classes. So porting this to AOO would require a reasonable effort from you. I don’t think we have the resources to release it as a standalone tarball or something like that, but if you are interested, you could work with us to make the tools work with both code bases.

      Best regards,
      Andras Timar

  4. I would for sure like to work with you to make it a tool available for both platforms. I looked in the git core repository but cannot find the code; I expected it to be in l10n. Can you tell me where it is hidden?

    Can I e-mail you directly, as I do not get any notice when this page is updated?

    thanks in advance.
    jan I