Internationalisation, or i18n as a cooler abbreviation, is the process of developing a piece of software or app so that it can easily be translated and localised into other languages – and it’s much easier if you do it right from the beginning.
But why is it necessary?
Internationalisation can be seen as the process that enables a company to operate in foreign markets, beyond the boundaries of its domestic market. Done as a repeatable activity, it can further differentiate your company as a problem-solver and let you reach new markets faster and more efficiently.
Internationalization consists of the planning and execution that must be built into software development so that the software can support multiple languages and locale-specific formatting (such as number formats, dates, times, currencies, postal addresses and more). Applications not only have to be capable of displaying any language, but must also correctly handle the input, storage, processing and retrieval of multilingual, multi-locale data.
However, even though internationalization can help bring your content to the surface of a bigger lake of target customers, there are some common mistakes that can prove extremely costly to fix if the company wishes to keep swimming peacefully among the other fish.
Trying to localize an application that has not first been internationalized can only lead to a bitter end (both for the application itself and for your pocket). As Adam Asnes, CEO of Lingoport, has very clearly put it:
“Chances are very high the software is going to break. […] If, by sheer luck, the application still works, they will not be able to leverage the translation when they go to a new version. There is no way this is going to have a happy ending in the long run.”
Every character you see on the screen corresponds to a set of zeros and ones that gets “interpreted” into what you read. How an application handles character encoding determines whether it will actually work in a given language, such as Chinese, Japanese, French or German. This is where terms like Unicode or ISO-Latin apply. If your application does not recognize Chinese characters, for example, then when you switch to a Chinese locale you will see empty squares and unprintable characters instead of the Chinese characters you expect to see.
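As a small illustration of the point above, the following Python sketch encodes the same two Chinese characters with UTF-8 and with ISO-8859-1 (Latin-1). The sample text and variable names are chosen only for this example; the codec calls are standard Python.

```python
# Two Chinese characters, written as escapes so the source file encoding does not matter.
text = "\u4e2d\u6587"

# UTF-8 can represent every Unicode character, so the round trip is lossless.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# ISO-8859-1 (Latin-1) has no code points for Chinese, so strict encoding fails.
try:
    text.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# Asking the codec to replace unencodable characters loses the text entirely.
lossy = text.encode("latin-1", errors="replace").decode("latin-1")

print(round_trip == text)  # True
print(latin1_ok)           # False
print(lossy)               # ??
```

An application that stores or transmits text in a single-byte Latin encoding would hit the second or third case as soon as a Chinese locale is enabled.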
Graphics and images are a moving trap when it comes to internationalization, as they can cause costly losses if not chosen wisely. Images that contain text or numbers are a poor choice. The text (and therefore the graphic) must be localized for each locale, while numbers can carry different connotations in different locales (the number 13 has a negative connotation in Europe and the United States, while the number 4 is associated with death in both Japan and China). Also, images depicting elements of the social culture (hand gestures, body parts, sexual and gender-specific content, ethnic, racial or religious references, etc.) or material culture (money, food, streets, road signs, even date and time references) of a certain locale are difficult to localize, or even offensive in other locales, and therefore must be replaced or omitted.
Along with externalizing all user-visible strings, localization best practices include never hard-coding strings and avoiding concatenated strings. Hard-coded elements are difficult to localize because they do not show up until the localized software is compiled and executed. It is equally important to keep each sentence in a single string. When a sentence is broken up into several strings, the pieces do not necessarily appear consecutively in the localizer’s string table, and word order often differs between languages, so translation issues follow. Both of these common mistakes can cause a great deal of headache during localization and thus cost more money.
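The difference between concatenated fragments and a single externalized string can be sketched in Python as follows. The `MESSAGES` table, the `translate` helper and the message key are hypothetical names invented for this illustration, not part of any particular localization library, and the sketch deliberately ignores plural forms.

```python
# Hypothetical string table: one complete sentence per key and per locale.
MESSAGES = {
    "en": {"files_deleted": "{user} deleted {count} files."},
    "de": {"files_deleted": "{user} hat {count} Dateien gelöscht."},
}


def translate(locale: str, key: str, **params) -> str:
    """Look up the whole sentence for a locale and fill in its named placeholders."""
    template = MESSAGES[locale][key]
    return template.format(**params)


def concatenated(user: str, count: int) -> str:
    """Anti-pattern: hard-coded fragments glued together in English word order."""
    return user + " deleted " + str(count) + " files."


print(translate("en", "files_deleted", user="Anna", count=3))
print(translate("de", "files_deleted", user="Anna", count=3))
print(concatenated("Anna", 3))
```

In the German sentence the verb moves to the end, which the translator can express freely because the whole sentence is one string with named placeholders. The concatenated version fixes the English word order in code and leaves the translator with fragments.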
Each programming language has its own set of functions or methods that behave in a particular way, such as how a date is interpreted or how many bytes a character can occupy. For example, there are limitations in C/C++, and there are dependencies on the choice of character encoding (e.g. Unicode UTF-8). Other programming languages such as Java and C# have fewer of these issues, but still have their own hidden snares. These factors and limitations should be examined up front, because during localization some functions may need to be replaced so that the locale requirements are supported. The wrong choice of programming language can cause multiple localization problems and prove wasteful in both time and money.
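Two of the pitfalls mentioned above, date interpretation and bytes per character, can be shown with a short Python sketch. The sample date and characters are picked only for illustration.

```python
from datetime import datetime

# The same input means different days depending on the convention assumed.
raw = "03/04/2021"
month_first = datetime.strptime(raw, "%m/%d/%Y")  # US-style reading: 4 March
day_first = datetime.strptime(raw, "%d/%m/%Y")    # European-style reading: 3 April

print(month_first.date().isoformat())  # 2021-03-04
print(day_first.date().isoformat())    # 2021-04-03

# One character is not always one byte: UTF-8 uses between 1 and 4 bytes per character.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in ("a", "\u00e9", "\u4e2d")}
print(byte_lengths)
```

Code that assumes one date layout, or that sizes buffers and database columns by character count rather than byte count, tends to work in the source locale and fail in others.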
Building software on top of other application components, such as databases, reporting mechanisms, email generators and more, is common practice. However, these components have their own internationalization support issues, which can be another pain in the software developer’s neck. Therefore, the integrated components must be chosen with great consideration and with cost-effectiveness in mind.
The user interface (UI) of the application should always be flexible and neutral. A good practice to minimize the amount of resizing work needed during localization is to design the source-language dialog boxes with as much room to spare as possible. One way to do this is to extend text frames with extra room to accommodate text expansion during translation while preserving their look. The importance of this is easy to understand when you compare the linguistic structures of different languages. For European languages, for example, a typical translated sentence is normally about 30-40% longer than the English original, and the increase may be as much as 100-200% for short UI terms. Chinese translations, by contrast, are usually shorter than the English; however, if the English application uses abbreviations for UI terms, the Chinese translation will likely need more space to display. If these factors are not taken into consideration, your quality assurance process will be long and costly.
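The expansion figures above can be turned into a rough space budget. The Python sketch below uses the upper ends of the quoted ranges (40% for sentences, 200% for short UI terms). The 10-character cut-off for a "short" term and the `reserved_width` name are assumptions made for this example, and the result is a character count, not a measured pixel width.

```python
import math

# Assumption for this sketch: strings up to this length count as "short UI terms".
SHORT_TERM_MAX_CHARS = 10

# Upper ends of the expansion ranges quoted above, expressed as multipliers.
SENTENCE_FACTOR = 1.4    # up to 40% longer
SHORT_TERM_FACTOR = 3.0  # up to 200% longer


def reserved_width(source_text: str) -> int:
    """Estimate how many characters to reserve for a translation of source_text."""
    length = len(source_text)
    if length <= SHORT_TERM_MAX_CHARS:
        factor = SHORT_TERM_FACTOR
    else:
        factor = SENTENCE_FACTOR
    return math.ceil(length * factor)


print(reserved_width("Save"))                              # 12
print(reserved_width("Please enter your email address."))  # 45
```

A budget like this is only a starting point for layout. Actual widths depend on the font and on the specific target languages.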