Tuesday, February 5, 2008

Localizing Cocoa Software

Localizing software (the process of translating it into multiple languages) is an amazingly time-intensive process.  Apple's bundle layout, bundle loading, and localizable strings files are all very helpful, but things always manage to find a way to go south.  Today, for example, I switched all of the localized .lproj directories to the ISO 639-1 two-letter naming convention.  Instead of the older "English," "French," and "Spanish," I now have "en," "fr," and "es."  Great, right?  Well, not quite.  Turns out I forgot to change the CFBundleDevelopmentRegion key in Info.plist from "English" to "en," so any time a localization for a language was incomplete, the nibs failed to load or the localized strings files were ignored.  Oops.  You get the idea.  ;-)
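A mismatch like that is easy to check for mechanically.  Here is a minimal sketch in Python, assuming the .lproj directories all sit in one resources folder and that you know where Info.plist lives.  The function name and the folder layout are my own illustration, not part of any Apple tool.

```python
import plistlib
from pathlib import Path


def development_region_problem(info_plist, resources_dir):
    """Return a message if CFBundleDevelopmentRegion has no matching .lproj, else None."""
    with open(info_plist, "rb") as f:
        info = plistlib.load(f)

    region = info.get("CFBundleDevelopmentRegion")
    if region is None:
        return "CFBundleDevelopmentRegion is not set"

    # "en.lproj" -> "en", "French.lproj" -> "French"
    lprojs = sorted(p.stem for p in Path(resources_dir).glob("*.lproj") if p.is_dir())
    if region not in lprojs:
        found = ", ".join(lprojs) or "none"
        return f"development region {region!r} has no .lproj directory (found: {found})"
    return None
```

Running something like this as a build step would have flagged "English" as soon as the directories were renamed to "en," "fr," and "es."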
Consider the process for localizing a single nib file (a product like TubeTV contains about ten of them).
1.  Look up command line options for "ibtool" since it's impossible to remember them exactly from one run to the next.
2.  Run "ibtool" to extract the localizable strings from the nib.
2. (a) Optionally, delete about 90% of the strings ibtool extracts since they don't really need to be translated and make the translator's job more difficult.  Sorry to the current translators for TubeTV as I didn't realize this until after I sent you the files!  ;-)
3.  Submit files to people around the world (in different time zones) who have agreed to translate them for you.
4.  Wait for files to come back, look for portions which may have been missed, go back to step 3 if necessary.
5.  Look up different command line options for ibtool to import the translated strings into a new nib file making use of the English nib as a basis.
5. (a) If parse errors occur in the strings file (which they often do if the localizers missed a semicolon or quote), use the plutil command line tool to find them.  Correct and move back to step 5.
6.  Correct sizing on controls and GUI elements in the nib since translations are often longer than English.  This can be very time consuming depending on the layout and must be done every time you update anything.
7.  Test.
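Parts of steps 4 and 5(a) can be pre-screened before anything goes near ibtool.  The following is a rough sketch in Python, assuming the strings file has already been decoded to text and holds one "key" = "value"; entry per line with comments on their own lines.  The function names are hypothetical, and plutil remains the authority on real parse errors.

```python
import re

# One entry per line, e.g.  "Cancel" = "Annuler";
ENTRY = re.compile(r'^\s*"((?:[^"\\]|\\.)*)"\s*=\s*"((?:[^"\\]|\\.)*)"\s*;\s*$')


def parse_strings(text):
    """Return (entries, bad_line_numbers) for simple one-entry-per-line .strings text."""
    entries, bad = {}, []
    in_comment = False
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if in_comment:
            # Still inside a /* ... */ block that started on an earlier line.
            in_comment = "*/" not in stripped
            continue
        if not stripped or stripped.startswith("//"):
            continue
        if stripped.startswith("/*"):
            in_comment = "*/" not in stripped
            continue
        match = ENTRY.match(line)
        if match:
            entries[match.group(1)] = match.group(2)
        else:
            # Likely a missing semicolon or an unbalanced quote.
            bad.append(number)
    return entries, bad


def missing_keys(english, translated):
    """Keys present in the English entries but absent from the translation."""
    return sorted(set(english) - set(translated))
```

Feeding it the English file and a returned translation gives a list of suspicious line numbers plus the keys the translator skipped, which is enough to decide whether to go back to step 3.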
That was the procedure for one nib.  Multiply that by the number of nibs in the project, and additionally, by the nibs in any internally developed included frameworks (such as our version checking system).  Want to change anything in any of your nibs?  No problem, just change the English language nib, delete all the translated nibs, and go back to step 5!  Or you could opt to change each localized nib by hand!  Ouch.
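To give a feel for how the work multiplies, here is a small Python sketch that only builds the step 5 command lines for every nib and language without running them.  The --strings-file and --write options are the ones I remember from the ibtool man page, so check "man ibtool" for your version of the developer tools.  The nib names and the one-strings-file-per-nib layout are assumptions for illustration.

```python
def import_commands(nib_names, languages, base="en"):
    """Build one ibtool import command (as an argument list) per language and nib."""
    commands = []
    for lang in languages:
        if lang == base:
            continue  # the base nib is the source, not a target
        for nib in nib_names:
            commands.append([
                "ibtool",
                "--strings-file", f"{lang}.lproj/{nib}.strings",  # translated strings
                "--write", f"{lang}.lproj/{nib}.nib",             # localized nib to create
                f"{base}.lproj/{nib}.nib",                        # nib used as the basis
            ])
    return commands
```

Ten nibs and three translations already come to thirty ibtool runs, and each resulting nib still needs the manual resizing pass from step 6.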
Perhaps now you have a better understanding of why much of the freeware and shareware for the Mac isn't localized: it's a huge time commitment and it makes creating future versions much more difficult.  It would be interesting to see if this process is easier or harder than the process on other platforms such as Windows (or Java).  Anyone who has experience with those, feel free to post something in the comments.

1 comment:

Kuba Suder said...

Regarding the comparison with other platforms - I played with Trolltech's Qt some time ago, and I must say I'm shocked that the localization in Cocoa takes so much effort. The difference is that in Qt, you never position widgets/controls manually, you just put them in vertical or horizontal containers and set their alignment, scaling behaviour etc. If you want to do localization, you extract strings, translate them, put them in separate language bundles, but there's no such thing as separate UIs for each language! Strings are just replaced automatically, and the widgets are repositioned/rescaled according to the rules you set, depending on specific phrase lengths.

The mere thought of having 10 separate UI designs (NIBs) differing only in the language used for the labels, which all need to be updated separately after each change in the application, is making me shiver. It's as non-DRY as possible. If you have 10 methods that do almost the same thing, do you keep them separate and copy-paste each change to all of them, or do you refactor this so that there's only one, configured by parameters?...

I still can't believe they couldn't come up with something more sensible...