What was SDF and why did we kill it?
LibreOffice source code contains translatable content in various file formats. It was desirable to present translatable content to translators in a single file format, so that they did not have to learn how to edit different file formats. Therefore, back in the OpenOffice.org era, the SDF file format was invented. It was a simple tab-separated text file. Localization tools extracted translatable content into the SDF file format, and localized SDF files were merged back into the source code during the build.
One cannot imagine a simpler file format than tab-separated text. When I translated OpenOffice.org into Hungarian, I built a tool set around it and translated happily. But not every translator was a programmer or capable of using scripts. Translators demanded the PO file format, which is a quasi-standard file format for localization in the open source world. Translate Toolkit and Pootle became part of the localization process: the en-US SDF file was converted to POT (PO Template) files, and translated PO files were converted back to SDF manually.
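To make the conversion step concrete, here is a minimal sketch of turning one tab-separated record into a PO template entry. The four-column layout (source file, string ID, language, text) and the function names are assumptions made for illustration; the real SDF format had more columns, and this is not how Translate Toolkit implements the conversion.

```python
def po_escape(text: str) -> str:
    """Escape a string for use inside a double-quoted PO string literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\t", "\\t")
                .replace("\n", "\\n"))


def record_to_pot_entry(line: str) -> str:
    """Convert one tab-separated record into a POT entry.

    Hypothetical layout: source file, string ID, language, text.
    """
    source_file, string_id, language, text = line.rstrip("\n").split("\t", 3)
    if language != "en-US":
        raise ValueError("templates are built from en-US records only")
    return (
        f"#: {source_file}\n"
        f'msgctxt "{po_escape(string_id)}"\n'
        f'msgid "{po_escape(text)}"\n'
        f'msgstr ""\n'
    )


if __name__ == "__main__":
    print(record_to_pot_entry('sw/menu.src\tFILE_OPEN\ten-US\tOpen "File"\n'))
```

The template entry carries an empty msgstr; a translator fills it in, and the back-conversion writes the translated text into a record for the target language.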
When the LibreOffice project started, I wanted to change the process for two reasons.
- Huge SDF files bloated the git repository. Git does not work well with large, frequently updated text files.
- Manual conversion from PO to SDF was a tedious job.
Therefore I implemented storing PO files in git and converting them to SDF automatically during the build. In LibreOffice 3.4 we used po2oo from Translate Toolkit for the back-conversion. In LibreOffice 3.5 po2oo was replaced by po2lo, written by Miklos Vajna, which was 30x as fast.
But why convert translations from one format to another? Why not use PO files directly? This is what was developed on the feature/killsdf branch over the past few months. Most of the programming work was done by a young developer, Tamas Zolnai, who was a trainee at Novell Hungary in the summer and is now a fellow of the FSF.hu Foundation (FSF.hu sponsors his work on LibreOffice). Now the localization tools extract translatable content from the source code directly into PO files, and they read PO files directly during the build. SDF files are no longer generated. At the same time the localization tools were refactored: the Perl scripts are gone, and all tools are now in C++. Further cleanup and optimizations are under way on the master branch. I fixed the last remaining issues after the merge at the Munich Hackfest 2012.
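As an illustration of what reading PO files directly involves, the sketch below parses simple PO entries into a lookup table. It is written in Python for brevity and is not the C++ code from the LibreOffice tools; the names po_unquote and read_po are hypothetical, and plural forms and obsolete entries are deliberately ignored.

```python
def po_unquote(literal: str) -> str:
    """Decode one double-quoted PO string literal, handling common escapes."""
    body = literal.strip()[1:-1]
    escapes = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}
    out, i = [], 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            out.append(escapes.get(body[i + 1], body[i + 1]))
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


def read_po(text: str) -> dict:
    """Return {(msgctxt, msgid): msgstr} for the simple entries in `text`."""
    entries = {}
    current = {}
    field = None

    def flush():
        # Store the entry collected so far, if any, and start a new one.
        if "msgid" in current:
            key = (current.get("msgctxt"), current["msgid"])
            entries[key] = current.get("msgstr", "")
        current.clear()

    for raw in text.splitlines():
        line = raw.strip()
        if not line:                 # a blank line ends the current entry
            flush()
            field = None
        elif line.startswith("#"):   # comments and source references
            continue
        elif line.startswith('"'):   # continuation of the previous field
            if field is not None:
                current[field] += po_unquote(line)
        else:
            keyword, _, rest = line.partition(" ")
            if keyword in ("msgctxt", "msgid", "msgstr"):
                if keyword != "msgstr" and "msgstr" in current:
                    flush()          # new entry without a blank separator
                field = keyword
                current[field] = po_unquote(rest)
            else:
                field = None         # unsupported keyword, e.g. plurals
    flush()
    return entries
```

A build step could then look up a translation with a key such as ("FILE_OPEN", "Open") and fall back to the en-US text when the msgstr is empty. Note that the PO header, if present, ends up under the key (None, "").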