Solaris Internationalization Guide For Developers
CHAPTER 2

Contents of the Base Solaris Product



Summary of the Base Product

The base English Solaris 2.6 product includes a number of partial European locales as well as the en_US.UTF-8 locale.
To the user, the en_US.UTF-8 locale looks the same as the English locale. Unlike the partial European locales, however, it can handle text from different sets of languages in a single application.
The File System Safe Universal Transformation Format, or UTF-8, is an encoding defined by X/Open as a multi-byte representation of Unicode. The en_US.UTF-8 locale is the first locale in the Solaris system that uses UTF-8 as its codeset to support multiple scripts. UTF-8 is an encoding of Unicode 2.0. The en_US.UTF-8 locale provides input and output support for the scripts of all Solaris single-byte locales.
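As an illustration of the multi-byte representation, the following Python sketch encodes a few characters from different scripts as UTF-8 and reports how many bytes each one needs. Python and the sample characters are chosen here for illustration only; neither is part of the Solaris product.

```python
# Sample characters from scripts covered by different ISO 8859 codesets.
# The selection is an assumption made for illustration.
samples = {
    "A": "Latin capital A (ASCII)",
    "é": "Latin small e with acute (ISO 8859-1)",
    "ő": "Latin small o with double acute (ISO 8859-2)",
    "Ж": "Cyrillic capital zhe (ISO 8859-5)",
    "π": "Greek small pi (ISO 8859-7)",
}


def utf8_bytes(ch: str) -> bytes:
    """Return the UTF-8 byte sequence for a single character."""
    return ch.encode("utf-8")


for ch, description in samples.items():
    encoded = utf8_bytes(ch)
    print(f"U+{ord(ch):04X} {description}: {len(encoded)} byte(s) {encoded.hex()}")
```

ASCII characters keep their single-byte values in UTF-8, which is why UTF-8 text remains usable in file names and by tools that expect 7-bit ASCII.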
The partial locales provide the basic mechanism for entering, displaying, and printing local languages. Messages appear in English.
The partial locales can be split into two groups: the core set and the extended set. The core set is packaged in SUNWploc (operating system locale) and SUNWplow (window system locale). Since these packages are part of the end user cluster, they are installed automatically. The extended set of locales is packaged in SUNWploc1 (operating system locale) and SUNWplow1 (window system locale). SUNWpldte has CDE support for the Eastern European locales.
SUNWploc1 and SUNWplow1 are available in the entire cluster only. They must be added to your system before you can use the locales in the extended set.

Core Set of Locales

The core set of locales is installed automatically. The locales in the core set are listed in TABLE 2-1.
TABLE 2-1 SUNWploc and SUNWplow
Locale       Language   Country         Encoding
de           German     Germany         iso-8859-1
en_AU        English    Australia       iso-8859-1
en_CA        English    Canada          iso-8859-1
en_UK        English    United Kingdom  iso-8859-1
en_US        English    United States   iso-8859-1
en_US.UTF-8  English    United States   UTF-8
es           Spanish    Spain           iso-8859-1
es_AR        Spanish    Argentina       iso-8859-1
es_BO        Spanish    Bolivia         iso-8859-1
es_CL        Spanish    Chile           iso-8859-1
es_CO        Spanish    Colombia        iso-8859-1
es_CR        Spanish    Costa Rica      iso-8859-1
es_EC        Spanish    Ecuador         iso-8859-1
es_GT        Spanish    Guatemala       iso-8859-1
es_MX        Spanish    Mexico          iso-8859-1
es_NI        Spanish    Nicaragua       iso-8859-1
es_PA        Spanish    Panama          iso-8859-1
es_PE        Spanish    Peru            iso-8859-1
es_PY        Spanish    Paraguay        iso-8859-1
es_SV        Spanish    El Salvador     iso-8859-1
es_UY        Spanish    Uruguay         iso-8859-1
es_VE        Spanish    Venezuela       iso-8859-1
fr           French     France          iso-8859-1
it           Italian    Italy           iso-8859-1
sv           Swedish    Sweden          iso-8859-1

Extended Set of Locales

The extended set of locales is not installed automatically. To use the locales listed in TABLE 2-2, you need to install the SUNWploc1 and SUNWplow1 packages manually.
TABLE 2-2 SUNWploc1 and SUNWplow1
Locale  Language    Country         Encoding
cz      Czech       Czech Republic  iso-8859-2
da      Danish      Denmark         iso-8859-1
de_AT   German      Austria         iso-8859-1
de_CH   German      Switzerland     iso-8859-1
el      Greek       Greece          iso-8859-7
en_IE   English     Ireland         iso-8859-1
en_NZ   English     New Zealand     iso-8859-1
et      Estonian    Estonia         iso-8859-1
fr_BE   French      Belgium         iso-8859-1
fr_CA   French      Canada          iso-8859-1
fr_CH   French      Switzerland     iso-8859-1
hu      Hungarian   Hungary         iso-8859-2
lt      Lithuanian  Lithuania       iso-8859-4
lv      Latvian     Latvia          iso-8859-4
nl      Dutch       Netherlands     iso-8859-1
nl_BE   Dutch       Belgium         iso-8859-1
no      Norwegian   Norway          iso-8859-1
pl      Polish      Poland          iso-8859-2
pt      Portuguese  Portugal        iso-8859-1
pt_BR   Portuguese  Brazil          iso-8859-1
ru      Russian     Russia          iso-8859-5
su      Finnish     Finland         iso-8859-1
tr      Turkish     Turkey          iso-8859-9

New Unicode Locale: en_US.UTF-8

The en_US.UTF-8 locale lets programs input and output text in the scripts of multiple single-byte locales. This is the first locale with this capability in the Solaris operating environment. For more detailed information, see Chapter 6, "Internationalization Framework in Solaris 2.6."
This locale uses UTF-8 (Universal Character Set Transformation Format for 8 bits) encoding, which was developed by the X/Open-Uniforum Joint Internationalization Working Group (XoJIG). This standard has been adopted by the Unicode Consortium, the International Organization for Standardization, and the International Electrotechnical Commission as a part of Unicode 2.0 and ISO/IEC 10646-1. The en_US.UTF-8 locale supports the CDE environment only, including the Motif and CDE libraries. This locale is part of the developer cluster.
The locale supports computation for every code point value that is defined in Unicode 2.0 and ISO/IEC 10646-1. In Solaris 2.6, language script support is limited to pan-European locales, and input method support has been enabled for the language scripts of these locales only. Due to limited font resources, Solaris 2.6 software includes only character glyphs from the following codesets:
  • ISO 8859-1 (most Western European languages, such as English, French, Spanish, and German)
  • ISO 8859-2 (most Central European languages, such as Czech, Polish, and Hungarian)
  • ISO 8859-4 (Scandinavian and Baltic languages)
  • ISO 8859-5 (Russian)
  • ISO 8859-7 (Greek)
  • ISO 8859-9 (Turkish)
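One way to see why a single UTF-8 locale is useful: the same byte value stands for a different character in several of the single-byte codesets listed above, whereas Unicode gives each character its own code point. The sketch below uses Python's built-in codecs for illustration. The codec names (iso8859_1 and so on) are Python's names, not Solaris codeset names, and the byte value 0xF0 is an arbitrary example.

```python
# Python codec names for the codesets listed above.
CODESETS = ["iso8859_1", "iso8859_2", "iso8859_4",
            "iso8859_5", "iso8859_7", "iso8859_9"]


def decode_in_all(byte_value: int) -> dict:
    """Decode one byte value under every codeset in CODESETS."""
    raw = bytes([byte_value])
    return {name: raw.decode(name) for name in CODESETS}


# Byte 0xF0 is an arbitrary example value from the upper half of the table.
for name, ch in decode_in_all(0xF0).items():
    print(f"{name}: byte 0xF0 -> U+{ord(ch):04X} {ch}")
```

A file that mixes, say, Greek and Turkish text cannot be labeled with any one of these codesets, but it can be stored unambiguously in UTF-8.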

New User Locales in Base Solaris

The base English Solaris 2.6 includes the following new locale support:
TABLE 2-3
Country         Locale-Name                    ISO codeset
Austria         de_AT (German Partial Locale)  8859-1
Estonia         et                             8859-1
Czech Republic  cz                             8859-2
Hungary         hu                             8859-2
Poland          pl                             8859-2
Latvia          lv                             8859-4
Lithuania       lt                             8859-4
Russia          ru                             8859-5
Greece          el                             8859-7
Turkey          tr                             8859-9
These locales are supported through the SUNWploc1 (operating system support), SUNWplow1 (OpenWindows support), and SUNWpldte (CDE support) packages, which are part of the entire cluster. The font packages for these locales are named SUNWiXxf, where:
  • iX identifies the ISO 8859 codeset (for example, i1 for ISO 8859-1).
  • xf indicates whether the font is optional (of) or required (rf).
For example, SUNWi1rf contains the required font and SUNWi1of contains the optional font for an ISO 8859-1 codeset locale. These packages are in different clusters; install the entire cluster or selectively add the appropriate packages. After the packages have been installed, users can log in through dtlogin to either OpenWindows or CDE and use the characters associated with their locale.

Multiple Key Compose Sequences for New Locales

The Solaris 2.6 operating environment supports compose sequences to create the diacritical marks used in writing the scripts covered in the following codesets:
  • ISO 8859-2 (Latin2) Czech, Polish, and Hungarian
  • ISO 8859-4 (Latin4) Latvian and Lithuanian
  • ISO 8859-9 (Latin5) Turkish
The new diacritical marks are created by pressing the Compose key, then the base letter, then one of the following keys:
  • diaeresis = quotation mark ( " ) (for example, Compose + A + " = Ä)
  • caron = v (for example, Compose + E + v = Ě)
  • breve = u
  • ogonek = a
  • double acute = > (greater-than sign)
The following symbols are created with the Compose key and two further keys:
  • degree symbol = O + 0 (oh plus zero)
  • currency symbol = 0 + x (zero plus x)
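The pairing of base letter and diacritic key can be modeled as a small lookup. The sketch below is only an illustration in Python: it maps each diacritic key from the list above to a Unicode combining mark and lets NFC normalization produce the precomposed character. Using combining marks and the name compose are assumptions of this sketch; it does not describe how the Solaris X server implements compose sequences.

```python
import unicodedata

# Key typed after the base letter -> Unicode combining mark.
# The key assignments follow the list above; the use of combining marks
# and NFC normalization is an assumption made for this illustration.
DIACRITIC_KEYS = {
    '"': "\u0308",  # diaeresis
    "v": "\u030C",  # caron
    "u": "\u0306",  # breve
    "a": "\u0328",  # ogonek
    ">": "\u030B",  # double acute
}


def compose(base: str, key: str) -> str:
    """Return the precomposed character for Compose + base + key."""
    combined = base + DIACRITIC_KEYS[key]
    return unicodedata.normalize("NFC", combined)


print(compose("A", '"'))  # Compose + A + "
print(compose("E", "v"))  # Compose + E + v
print(compose("o", ">"))  # Compose + o + >
```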

Keyboard Mapping for Greek and Russian Scripts

The Solaris 2.6 operating environment supports new keyboard mapping for Greek and Russian, which allows Greek or Russian script input with the appropriate Sun keyboard.
  • ISO 8859-5 Russian
  • ISO 8859-7 Greek

New Keyboard Support in Solaris 2.6

The following locales have keyboard layouts for SPARC (X server) and x86 (X server plus console):
  • Czech
  • Hungary
  • Poland
  • Latvia
  • Lithuania
  • Russia
  • Greece
  • Turkey
Here, the X server covers CDE and OpenWindows (OW), and the console is the command line.

Changing Between Keyboards on SPARC

On SPARC, the keyboard layout is determined by the dip switches on the underside of the keyboard, and changing those switches is the only supported way to change the layout. A list of keyboard layouts and their corresponding dip-switch settings is in /usr/openwin/share/etc/keytables/keytable.map.
The following table is for a type 4 keyboard (1=switch up 0=switch down).
TABLE 2-4
Dip Switch in Hex  Keyboard        Setting in Binary
51                 Hungary5.kt     110011
52                 Poland5.kt      110100
53                 Czech5.k        110101
54                 Russia5.kt      110110
55                 Latvia5.k       110111
56                 Turkey5.kt      111000
57                 Greece5.kt      111001
58                 Lithuania5.kt   111011
To change the layout from US/UK to Czech, for example, set the dip switches to the setting defined for Czech in the file and then reboot. The file defines the settings in hex, so each value must be converted to binary, as was done in TABLE 2-4.
Russian and Greek keyboard support can be toggled on and off using the Compose key on SPARC (Ctrl+Shift+F1 on x86).
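The conversion from a layout number to a dip-switch pattern can be sketched in Python as follows. One assumption needs stating: although the text describes the numbers as hex, the binary strings in TABLE 2-4 match the first-column values when they are read as decimal (for example, 51 gives 110011), so the sketch parses them as decimal by default. If the values in your keytable.map really are hexadecimal, pass base=16 instead. The function name and the six-switch width are also assumptions taken from the table layout.

```python
def dip_switch_bits(value: str, base: int = 10, width: int = 6) -> str:
    """Convert a layout number from keytable.map into a dip-switch pattern.

    Each character of the result is one switch: 1 = up, 0 = down.
    """
    return format(int(value, base), f"0{width}b")


# Layout numbers taken from the first column of TABLE 2-4.
for number in ["51", "52", "53", "54", "55", "56", "57"]:
    print(number, dip_switch_bits(number))
```

For 58 the same arithmetic gives 111010, which differs in the last bit from the table entry for Lithuania, so check that setting against keytable.map on your system before relying on it.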

Changing Between Keyboards on x86

On x86, a keyboard is selected during the kdmconfig part of install. To change this at any time after installation, use kdmconfig:
  1. Exit CDE or OpenWindows to the command line

  2. Type kdmconfig -u (in other words, kdmconfig unconfigure)

  3. Type kdmconfig to run the program

  4. Follow instructions to get a new keyboard layout

Apart from standard UNIX tools such as xmodmap and pcmapkeys, ELC bundles no utilities into Solaris 2.6 for switching keyboards on either SPARC or x86.

New Locales in the Base Installation

The installation window in the base Solaris 2.6 offers several English language locales. To use 8-bit characters, install one of the en_XX options. The locale used in the installation becomes the default system locale.
TABLE 2-5
Locale Name  Language/Territory  Codeset
C            American English    7-bit
en_AU        Australian English  8-bit
en_CA        Canadian English    8-bit
en_UK        UK English          8-bit
en_US        American English    8-bit

Using Jumpstart

To enable Jumpstart for the new 8-bit locales, add the line locale xx to the Jumpstart profile file, substituting the appropriate 8-bit locale for xx (for example, locale en_US). For complete instructions, see Chapter 4 of Automating Solaris Installation, available from SunSoft Press. Current Jumpstart users should set the default locale to bypass the language prompt during installation.

How to Use the iconv Command

The iconv command converts the characters or sequences of characters in a file from one codeset to another and writes the results to standard output. If there is no conversion for a particular character, it is converted into an underscore ( _ ) in the target codeset. See the iconv man page for more information.
The following options are supported:
  • -f fromcode Symbol of the input codeset.
  • -t tocode Symbol of the output codeset.
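To convert a mail file from one encoding into another, use the iconv command and redirect the output to a new file. Do not redirect the output onto the input file, because the shell truncates that file before iconv reads it. The file names below are placeholders:

  example% iconv -f from_codeset -t to_codeset mail.from_codeset > mail.to_codeset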
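The behavior described above, including the underscore substitution, can be imitated in a few lines of Python for experimentation on systems without the Solaris iconv tables. This is a rough analogue and not the iconv implementation: the function name convert is hypothetical, and the codec names are Python's, which differ from the codeset names that Solaris iconv accepts.

```python
def convert(data: bytes, from_code: str, to_code: str) -> bytes:
    """Re-encode data, writing '_' for characters the target cannot represent.

    Encodes one character at a time, so it suits stateless codesets
    such as ISO 8859-x, ASCII, and UTF-8.
    """
    out = bytearray()
    for ch in data.decode(from_code):
        try:
            out += ch.encode(to_code)
        except UnicodeEncodeError:
            # No conversion exists for this character in the target codeset.
            out += "_".encode(to_code)
    return bytes(out)


latin1_mail = "café".encode("iso8859_1")
print(convert(latin1_mail, "iso8859_1", "utf-8"))
print(convert(latin1_mail, "iso8859_1", "ascii"))
```

The second call shows the substitution: é has no ASCII equivalent, so it comes out as an underscore.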