In the post on weak verbs and passive voice, I mentioned that technical documentation benefits from attempts to prevent localization or translation problems. In this post, I want to say a bit more about what can happen if you don’t consider localization as you write documentation, and what steps you can take to make localization or translation easier and less expensive. Plus, you really don’t want to feature in one of the nightmare stories that get passed around as sad, cautionary tales. This post is not as specifically focused on API documentation as my prior post; it applies to most technical documentation.
Beware the “No Localization” promises
Requests for technical content should always include information on whether a document will be localized. Any time you are told not to worry about localization, don’t count on it. First, “we won’t localize” is only as accurate as that moment in time and the knowledge of the person or persons making the statement. It could change at any time. Second, and more importantly, even if the content producers don’t localize the documentation, its users may still rely on machine translation tools like Google Translate to take it from its original language to a language they prefer.
Even though that user-driven translation may not technically be supported, you do the users of your documentation and your product no favors by not taking it into account. The more difficult a product is to understand, the more users will be driven away.
Whenever you are producing technical documentation, you should keep a few major localization concerns in mind and work to lessen any impact on future localization and translation efforts.
Some considerations are:
I’ve written before about the need to use clear and concise verbs in technical documentation and this is especially true for localization and translation. Choosing simple words with little nuance rather than more complex or nuanced words can make translation a lot easier.
For example, the word “use” has an equivalent in almost every language I know of, and that equivalent is equally free of nuance. It’s clear, easy to translate, and hard to mishandle even when relying on machine translation. On the other hand, one of the alternatives I was once forced to use is the word “invoke.” Putting aside the fact that the word is not correct in API documentation anyway, consider that “invoke” has strong connotations of religion and magic. How would that be translated? How would that be understood? There is a huge amount of room, just from that one word substitution, to create problems.
There is also a whole list of words that carry connotations for cultural reasons. One example is using the word “above” instead of “preceding” or “prior”. In some languages and cultures, the connotation of one thing being “above” another causes real problems.
You should also avoid contractions and possessives. These are difficult to localize, so it’s better to avoid the problem entirely. That can be hard when you are aiming for a more conversational tone, but avoid them as much as possible.
I have worked at companies that have extensive information in their style guides on words to avoid because of localization problems. I now also keep a private list of the words and phrases that cause particular problems. Outside of that, I attempt to avoid words with unnecessary nuances and write as clearly and precisely as possible. When in doubt, I try running a word or sentence through something like Google Translate into various other languages, then back to English in a different translation tool, to see what I end up with. If there is no translation, or the translation I get is unclear, nuanced, or complex, I look for a simpler word or phrase.
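A word list like this can also be checked automatically. The following is only a minimal sketch, not a real style checker: the `AVOID_TERMS` table and the `find_localization_risks` function are hypothetical names, and the two entries simply reuse the examples from this post. It also flags the common contractions and possessives mentioned earlier.

```python
import re

# Hypothetical avoid-list: flagged term -> suggested replacement.
# The two entries reuse the examples from this post; a real list would be longer.
AVOID_TERMS = {
    "invoke": "use",
    "above": "preceding",
}

# Matches common English contractions and singular possessives,
# such as "don't" or "user's", with a straight or curly apostrophe.
# Plural possessives such as "users'" are not covered by this sketch.
APOSTROPHE_WORD = re.compile(r"\b\w+['’](?:s|t|re|ve|ll|d|m)\b", re.IGNORECASE)


def find_localization_risks(text):
    """Return a list of (line_number, matched_text, note) tuples."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        # Whole-word, case-insensitive search for each avoided term.
        for term, suggestion in AVOID_TERMS.items():
            pattern = rf"\b{re.escape(term)}\b"
            for match in re.finditer(pattern, line, re.IGNORECASE):
                findings.append(
                    (line_number, match.group(0), f"consider '{suggestion}'")
                )
        for match in APOSTROPHE_WORD.finditer(line):
            findings.append(
                (line_number, match.group(0), "contraction or possessive")
            )
    return findings


if __name__ == "__main__":
    sample = "Invoke the method shown above.\nDon't rely on the user's locale."
    for line_number, matched, note in find_localization_risks(sample):
        print(f"line {line_number}: {matched!r} ({note})")
```

A check like this only catches what is already on the list, so it supplements the round trip through translation tools rather than replacing it.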
Consistency and Content Reuse
This ties in a little to the issue of word choice, which I discussed in the previous section. Always keep your documentation consistent in vocabulary and terminology. Consistency is important anyway, but it has additional value in localization. The fewer ways you refer to aspects of the product or product use in the documentation, the easier it will be to translate.
Content reuse helps both consistency and translation efforts because you can translate the reused content once and not every time it is used. You should look for opportunities to write content once and use it many times.
Illustration Captions and Text
Illustrations and screenshots can be a big issue with localization. Considerations include the language in use in the screenshot or illustration, the language on the call-outs, captions, or notations, and the actual content of any illustrations.
If you are using a set of tools that can accommodate it, putting all call-outs, captions, and notations on a separate layer, so only that layer has to be switched out for localization, can help. If you do not have a tool set that can accommodate this, you can number the areas in a screenshot or illustration, and then create a numbered legend that can then be localized.
Screenshots should be in the localized language if the product itself is localized. If the product is not localized but you are producing translated or localized documentation, you may want to put extra effort into explaining what the screenshot is showing.
Whenever you use photos or illustrations, you should also be concerned about cultural issues you can run afoul of. There are an enormous number of possible issues in this arena but many can be lessened by a few guidelines. Some of these guidelines can be challenging, depending on the project you are documenting, but my basic list is:
- Don’t use drawings or photos of people or parts of people.
- Don’t use maps of the world or portions of it.
- Don’t use specific languages, countries, flags, or political information or symbols.
- Don’t use animals or food products.
You may still run across issues, but these guidelines should help you avoid the worst of them.
Although actual APIs are not usually translated from their original language, API documentation is translated. Because APIs often return values or information that can have locale-specific content, you need to take specific care to clearly document things like:
- Default time zone(s) and any daylight saving time offsets.
- Default dateline.
- Input parameters.
- Output formats.
I know at least some people looked at this list and thought that all of these should already be documented. Trust me, in a lot of cases they aren’t. In other cases the documentation may say “returns current date / time” but doesn’t specify what format (local or GMT) or which date and time (local or server).
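To make the date and time point concrete, here is a small sketch of a return value that leaves nothing to guess. It assumes the API chooses to return ISO 8601 timestamps in UTC, which is one common convention and not the only valid one, and `format_timestamp_utc` is a hypothetical helper name. Whatever convention is chosen, the documentation can then state it exactly.

```python
from datetime import datetime, timedelta, timezone


def format_timestamp_utc(moment):
    """Return an ISO 8601 string in UTC, for example 2024-03-10T06:30:00Z.

    The input must be timezone-aware. A naive datetime is rejected because
    there is no way to tell whether it means local time or server time.
    """
    if moment.tzinfo is None:
        raise ValueError("naive datetime: the time zone is ambiguous")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    # 01:30 at a fixed offset of UTC-05:00 is 06:30 in UTC.
    local_moment = datetime(2024, 3, 10, 1, 30, tzinfo=timezone(timedelta(hours=-5)))
    print(format_timestamp_utc(local_moment))
```

With a format like this, “returns current date / time” can become “returns the server’s current date and time in UTC, formatted as ISO 8601”, which translates cleanly and cannot be misread as a local time.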
In addition to these specific documentation needs, a lot of API documentation is produced directly from the code. This sometimes involves tools like Swagger, but sometimes information is simply pulled from the source code, slapped into a technical documentation format, and sent out. Plus, there are messages returned directly from the API, like status messages or error codes. All of these also need to be examined and made as localization- and globalization-friendly as possible. Because of the nature of API documentation, these are often neglected and can end up causing localization nightmares.
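One way to make API messages easier to localize is to return a stable code alongside the text and keep the text itself in a catalog that translators can work on. The sketch below is an illustration only: the `MESSAGES` catalog, the `ERR_FILE_NOT_FOUND` code, and the `build_error` function are all hypothetical and do not come from any particular API.

```python
# Hypothetical message catalog: stable error code -> locale -> message template.
# The code never changes; only the templates are translated.
MESSAGES = {
    "ERR_FILE_NOT_FOUND": {
        "en": "The file {name} was not found.",
        "de": "Die Datei {name} wurde nicht gefunden.",
    },
}

DEFAULT_LOCALE = "en"


def build_error(code, locale, **params):
    """Build an error payload with a stable code and a localized message.

    Falls back to the default locale when no translation exists.
    """
    translations = MESSAGES[code]
    template = translations.get(locale, translations[DEFAULT_LOCALE])
    return {"code": code, "message": template.format(**params)}


if __name__ == "__main__":
    print(build_error("ERR_FILE_NOT_FOUND", "de", name="report.txt"))
    # No French entry exists in this sketch, so this falls back to English.
    print(build_error("ERR_FILE_NOT_FOUND", "fr", name="report.txt"))
```

Because the code stays the same in every language, the documentation can describe each error once by its code, and users can still search for it even if the message text they see is translated.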
I make it a practice to look at every piece of documentation, no matter the source, and determine whether or not it is localization ready.
Always keep globalization and localization in mind in any documentation you write. It will save a lot of pain and rewriting on your part, cost on your company’s part for translations, and frustration on the part of your customers.