Locale handling is central to most of xIUA. It follows the setlocale model, except that the locale settings are thread specific. It also differs from setlocale in that locale resources must be freed when the application is through with a locale, so xIUA opens and closes locales rather than setting them. A further difference is that xIUA may have several open locales per thread.

The locale manager uses thread-local storage to manage locales. This avoids having to pass this information to every function that may use xIUA or ICU services. xIUA has two separate structures: one per thread and one per locale. The thread structure is used not only to track the locales but also to manage work area storage. The storage manager is designed to minimize malloc/free usage by reusing storage, yet it does not retain very large buffers. This is tied into the xIUA function, which calculates the working storage needs, if any, of the specific call.
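The storage policy described above (reuse a per-thread work area, but do not hold on to very large buffers) can be sketched in plain C11. This is only an illustration of the idea, not xIUA's code: ThreadCtx, work_acquire, work_release, work_capacity and the 4096-byte WORK_RETAIN_MAX threshold are all hypothetical names and values chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical retention cap: buffers larger than this are freed on release. */
#define WORK_RETAIN_MAX 4096u

/* Hypothetical per-thread state; xIUA's real thread structure is not shown in the text. */
typedef struct {
    void  *work;      /* reusable work area */
    size_t capacity;  /* bytes currently allocated for the work area */
} ThreadCtx;

/* One instance per thread, so no context pointer has to be passed around. */
static _Thread_local ThreadCtx tls_ctx = { NULL, 0 };

/* Return a work area of at least `need` bytes, reusing the thread's buffer
   when it is already large enough. Returns NULL if allocation fails. */
void *work_acquire(size_t need)
{
    assert(need > 0);
    if (need > tls_ctx.capacity) {
        void *p = realloc(tls_ctx.work, need);
        if (p == NULL)
            return NULL;            /* the old buffer is still valid */
        tls_ctx.work = p;
        tls_ctx.capacity = need;
    }
    return tls_ctx.work;
}

/* Called when a function is done with the work area: small buffers are
   kept for the next call, very large ones are returned to the heap. */
void work_release(void)
{
    if (tls_ctx.capacity > WORK_RETAIN_MAX) {
        free(tls_ctx.work);
        tls_ctx.work = NULL;
        tls_ctx.capacity = 0;
    }
}

/* Current size of the retained buffer for the calling thread. */
size_t work_capacity(void)
{
    return tls_ctx.capacity;
}
```

A function would compute its storage need, call work_acquire, do its work, and call work_release before returning. Because the state is thread-local, no locking is required.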

Functions that are locale dependent use the current locale. If no locales are open for the thread, a locale based on the operating system default locale is created. xIUA locales contain more than most locales. In addition to the standard language, country, and character set information, they also contain the time zone and the data type (UTF-32, UTF-16, UTF-8 or code page).

Locale information can be passed between systems by calling xiua_GetLocaleName, which produces a string containing the locale information. Another system can then pass that string to xiua_OpenLocale. The format of the string is:

ll[_CC][.MM][@VV][#TT]

ll = language, CC = country, MM = charmap, VV = variant, TT = time zone

This follows the same format as the POSIX locale except for the addition of the time zone information. For example: en_US.windows-1252#America/Los_Angeles
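To make the format concrete, here is a minimal sketch of a parser that splits such a string into its five fields. It is not xIUA's parser: LocaleParts and parse_locale_name are hypothetical names, the field sizes are arbitrary, and the sketch assumes the fields appear in the order given above, with everything after '#' treated as the time zone (so that names such as America/Los_Angeles, which contain '_' and '/', are kept whole).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the parsed fields; absent fields are empty strings. */
typedef struct {
    char language[16];
    char country[16];
    char charmap[64];
    char variant[32];
    char tzone[64];
} LocaleParts;

/* Copy `len` bytes starting at `start` into dst, truncating to fit `cap`. */
static void copy_field(char *dst, size_t cap, const char *start, size_t len)
{
    if (len >= cap)
        len = cap - 1;
    memcpy(dst, start, len);
    dst[len] = '\0';
}

/* Split "ll[_CC][.MM][@VV][#TT]" into its parts.
   Returns 0 on success, -1 if the mandatory language field is missing. */
int parse_locale_name(const char *name, LocaleParts *out)
{
    assert(name != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    const char *p = name;
    size_t n = strcspn(p, "_.@#");          /* language ends at the first delimiter */
    if (n == 0)
        return -1;
    copy_field(out->language, sizeof out->language, p, n);
    p += n;

    if (*p == '_') {                        /* country */
        p++;
        n = strcspn(p, ".@#");
        copy_field(out->country, sizeof out->country, p, n);
        p += n;
    }
    if (*p == '.') {                        /* charmap */
        p++;
        n = strcspn(p, "@#");
        copy_field(out->charmap, sizeof out->charmap, p, n);
        p += n;
    }
    if (*p == '@') {                        /* variant */
        p++;
        n = strcspn(p, "#");
        copy_field(out->variant, sizeof out->variant, p, n);
        p += n;
    }
    if (*p == '#') {                        /* time zone: the rest of the string */
        p++;
        copy_field(out->tzone, sizeof out->tzone, p, strlen(p));
    }
    return 0;
}
```

Applied to en_US.windows-1252#America/Los_Angeles, the sketch yields language "en", country "US", charmap "windows-1252", an empty variant, and time zone "America/Los_Angeles".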

The charmap specification in the locale does not represent the data format if the data is in Unicode. Instead it names the code page to use when converting data between Unicode and code page data. In addition to this locale information, the xiua_OpenLocale call must specify the actual type of data (UTF-32, UTF-16, UTF-8 or code page).

It is best to use UTF-8 to exchange data between systems. With other Unicode formats such as UTF-16 and UTF-32, the data can be in either big-endian or little-endian format, depending on the way the specific computer hardware stores integers in memory. UTF-8 is a sequence of single bytes and does not have this problem.
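The byte-order issue is easy to see with a single character. The sketch below is independent of xIUA and uses two helper functions written for this example (utf16_unit_to_bytes and utf8_encode are hypothetical names). It serializes one UTF-16 code unit in both byte orders and encodes the same code point as UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write one UTF-16 code unit as two bytes in the requested byte order. */
void utf16_unit_to_bytes(uint16_t unit, int big_endian, unsigned char out[2])
{
    assert(out != NULL);
    unsigned char hi = (unsigned char)(unit >> 8);
    unsigned char lo = (unsigned char)(unit & 0xFF);
    out[0] = big_endian ? hi : lo;
    out[1] = big_endian ? lo : hi;
}

/* Encode a Unicode scalar value as UTF-8.
   Returns the number of bytes written (1 to 4), or 0 for an invalid value. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* surrogates are not scalar values */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For U+3042 (HIRAGANA LETTER A), the UTF-16 bytes are 30 42 in big-endian order and 42 30 in little-endian order, so the receiver must know which order the sender used. The UTF-8 form is always E3 81 82, whatever the hardware.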

The xiua_OpenLocale function opens a new locale if none exists with the same language, country, variant and data type, and sets it as the current locale. If the locale is already open, it simply sets that locale as the current locale. In both cases it returns a handle that can be used to refer to the locale. A newly opened locale uses the time zone of the locale that was current at the time it was opened. If no time zone was set in that locale, the system default time zone is used.

The time zones are the same ones that are used for Java. The three or four letter time zone abbreviations are not used because they are neither unique nor explicit. AST can be Alaska Standard Time or Atlantic Standard Time. MST can be Moscow Summer Time, or Mountain Standard Time as observed in Denver, which has Daylight Saving Time, or as observed in Phoenix, which does not. Call xiua_SetTZone to set the time zone of any open locale.

To make it easier to select locales and time zones, xIUA provides routines to enumerate the available locales and time zones. It also provides routines to display the locale and time zone entries. For example, you can display a list of locales in the language of the locale or in English. There is also an option to list each entry in both English and the local language. This option is useful if the user selects a language that either has a script that will not display on the client's system or is a language that the client cannot read. It is presumed that most users can recognize the name of their own language in English.

When done with a locale the thread must close it. When the locale is closed and another locale is open, one of the other locales will become the current locale. The locale that is selected to become the next current locale is either the locale designated as the default locale or the locale that was current when the locale was opened. There are uses for both types of closes.

Closing and returning to the default locale is the normal way of processing. But suppose you have a subroutine that does file I/O using a specific locale and data type. You open this new locale, convert the data to the new locale type, and then close the locale, reverting to the locale that was active when this locale was opened.

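XIUA_Locale * Data_locale;
XIUA_Locale * File_locale;
UBool new_locale;
UErrorCode * icu_err;
/* buffer1 (input string) and buffer2 (1024-byte output area) are declared elsewhere */
icu_err = xiua_CurrentStatus();
/* No handle is passed, so this returns the handle of the current (data) locale */
Data_locale = xiua_SetLocaleHdl(NULL);
/* Open the file locale; it becomes the current locale */
File_locale = xiua_OpenLocale("ja_JP.EUC-JP",XDFCPUNIX);
/* U_ZERO_ERROR means this call opened the locale; a reopen reports a warning instead */
new_locale = (*icu_err == U_ZERO_ERROR);
/* Convert buffer1 from the data locale's encoding to the native (current) encoding, EUC-JP */
xiua_LocaletoNative(buffer2,1024,buffer1,-1,Data_locale);
... Process the data ...
/* FALSE: revert to the locale that was current at open time, not to the default locale */
if (new_locale) xiua_CloseLocale("ja_JP.EUC-JP",XDFCPUNIX,FALSE);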

The xiua_SetLocaleHdl function is used to make a locale current by specifying its locale handle. Since no handle is passed here, it just returns a handle to the current locale. The xiua_OpenLocale function opens a new locale and sets it as the current locale. If the locale is already open, the status is set to U_LOCALE_REOPENED_WARNING instead of U_ZERO_ERROR. The xiua_LocaletoNative function converts the string of data in buffer1 to EUC-JP data in buffer2. Native refers to the current locale encoding, which is now EUC-JP because the last open made this locale the current locale. Because FALSE is specified on the xiua_CloseLocale call, the default locale is not set as the current locale. Instead the locale that was active at the time of the open becomes current again.
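The open and close rules described above can be modeled with a small registry. The code below is a toy model for illustration only and says nothing about how xIUA is implemented: ToyLocale, ToyRegistry, toy_init, toy_open and toy_close are hypothetical names, locales are matched on the whole name string plus data type for simplicity, and the model assumes that the first locale opened is the default locale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8

typedef struct {
    char name[64];
    int  data_type;
    int  in_use;
    int  opener;       /* slot that was current when this one was opened, or -1 */
} ToyLocale;

typedef struct {
    ToyLocale slot[TOY_MAX];
    int current;       /* index of the current locale, or -1 */
    int default_idx;   /* assumption: the first locale opened is the default */
} ToyRegistry;

void toy_init(ToyRegistry *r)
{
    assert(r != NULL);
    memset(r, 0, sizeof *r);
    r->current = -1;
    r->default_idx = -1;
}

/* Open a locale, or reuse it if already open, and make it current.
   Sets *reopened to 1 on reuse. Returns the slot index, or -1 if full. */
int toy_open(ToyRegistry *r, const char *name, int data_type, int *reopened)
{
    assert(r != NULL && name != NULL && reopened != NULL);
    int free_idx = -1;
    for (int i = 0; i < TOY_MAX; i++) {
        ToyLocale *l = &r->slot[i];
        if (l->in_use && l->data_type == data_type && strcmp(l->name, name) == 0) {
            *reopened = 1;
            r->current = i;
            return i;
        }
        if (!l->in_use && free_idx < 0)
            free_idx = i;
    }
    if (free_idx < 0)
        return -1;
    ToyLocale *l = &r->slot[free_idx];
    strncpy(l->name, name, sizeof l->name - 1);
    l->name[sizeof l->name - 1] = '\0';
    l->data_type = data_type;
    l->in_use = 1;
    l->opener = r->current;              /* remember who was current at open time */
    if (r->default_idx < 0)
        r->default_idx = free_idx;
    r->current = free_idx;
    *reopened = 0;
    return free_idx;
}

/* Close a locale and choose the next current locale: the default locale if
   to_default is nonzero, otherwise the locale that was current at open time.
   Falls back to any open locale. Returns the new current index, or -1. */
int toy_close(ToyRegistry *r, int idx, int to_default)
{
    assert(r != NULL && idx >= 0 && idx < TOY_MAX && r->slot[idx].in_use);
    int next = to_default ? r->default_idx : r->slot[idx].opener;
    r->slot[idx].in_use = 0;
    if (next < 0 || !r->slot[next].in_use) {
        next = -1;
        for (int i = 0; i < TOY_MAX; i++) {
            if (r->slot[i].in_use) { next = i; break; }
        }
    }
    if (r->default_idx == idx)
        r->default_idx = next;
    r->current = next;
    return next;
}
```

In this model, a subroutine that opens a file locale and closes it with to_default set to 0 leaves its caller's locale current again, which is the behavior the sample above relies on.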

Locales can be opened in the following data modes. Note that you can easily tailor these mode definitions for your own specific requirements.

Mode          Data
XDFUTF16      UCS-2/UTF-16 data
XDFUTF32      UCS-4/UTF-32 data
XDFUTF8       UTF-8 data
XDFCODEPAGE   Code page data
XDFCPWIN      Windows code page data
XDFCPUNIX     Unix code page data
XDFCPMAC      Mac code page data
XDFCPOSA      Other OS (A) code page data
XDFCPOSB      Other OS (B) code page data

xIUA provides several code page data types so that you can simultaneously open separate locales for different character sets. Normally you would use XDFCODEPAGE as the data type for code page locales. But if you were using the locale-to-locale data conversion routines in xIUA to convert from EUC-JP to Shift_JIS, you would probably open a "ja_JP.EUC-JP",XDFCPUNIX locale and a "ja_JP.Shift_JIS",XDFCPWIN locale. All conversions to or from a code page require ICU converters. This type of conversion requires two separate ICU converters because it goes from one code page to another. The transformations from one Unicode format to another are handled by xIUA itself, so no ICU converter is needed for them. If you converted from UTF-8 to UTF-8, xIUA would just copy the data, since the formats are the same. It even detects when you open two locales with different names for the same character set. For example, it knows that "cp1252" and "windows-1252" are the same code page.
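One way to recognize that two names refer to the same character set is to normalize both names and look them up in an alias table. The sketch below shows that idea in plain C. It is an assumption about the general technique, not a description of how xIUA or ICU resolves aliases, and the tiny alias table and the function names (normalize_name, charset_key, same_charset) are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Lowercase the name and drop everything except letters and digits,
   so "Shift_JIS", "shift-jis" and "SHIFTJIS" all become "shiftjis". */
static void normalize_name(const char *name, char *out, size_t cap)
{
    size_t n = 0;
    for (; *name != '\0' && n + 1 < cap; name++) {
        unsigned char ch = (unsigned char)*name;
        if (isalnum(ch))
            out[n++] = (char)tolower(ch);
    }
    out[n] = '\0';
}

/* Illustrative alias table (normalized alias, normalized canonical name).
   A real table would be far larger. */
static const struct {
    const char *alias;
    const char *canonical;
} alias_table[] = {
    { "cp1252",  "windows1252" },
    { "sjis",    "shiftjis"    },
    { "mskanji", "shiftjis"    },
    { "latin1",  "iso88591"    },
};

/* Produce a comparison key for a charset name: its normalized canonical form. */
static void charset_key(const char *name, char *key, size_t cap)
{
    normalize_name(name, key, cap);
    for (size_t i = 0; i < sizeof alias_table / sizeof alias_table[0]; i++) {
        if (strcmp(key, alias_table[i].alias) == 0) {
            strncpy(key, alias_table[i].canonical, cap - 1);
            key[cap - 1] = '\0';
            return;
        }
    }
}

/* Return 1 if both names resolve to the same character set, else 0. */
int same_charset(const char *a, const char *b)
{
    char ka[64], kb[64];
    assert(a != NULL && b != NULL);
    charset_key(a, ka, sizeof ka);
    charset_key(b, kb, sizeof kb);
    return strcmp(ka, kb) == 0;
}
```

With this table, same_charset("cp1252", "windows-1252") is true, while same_charset("EUC-JP", "Shift_JIS") is false. A locale-to-locale conversion could use such a check to decide that a plain copy is enough.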

