Chapter 4 Supported Asian Locales

This chapter provides localization information for the Japanese, Indic, and Thai languages. The sections in this chapter are: Japanese Localization, Indic Localization, and Thai Localization.

Japanese Localization

This section describes Japanese locale-specific information.

Japanese Locales

Four Japanese locales, which support different character encodings, are available in the current Solaris environment. The ja and ja_JP.eucJP locales are based on Japanese EUC. The ja_JP.eucJP locale conforms to the UI-OSF Japanese Environment Implementation Agreement Version 1.1, and the ja locale conforms to the traditional specification from earlier Solaris releases. The ja_JP.PCK locale is based on PC-Kanji code (known as Shift_JIS), and the ja_JP.UTF-8 locale is based on UTF-8. See the eucJP(5) man page for a map showing Japanese EUC and the character set. See the PCK(5) man page for a map showing PC-Kanji code and the character set.

Japanese Character Sets

The supported Japanese character sets include:
JIS X 0212–1990 is not supported in the ja_JP.PCK locale. JIS X 0213–2000 is supported in the ja_JP.UTF-8 locale only. Not all characters defined in JIS X 0213–2000 are available; only those that are also defined in the Unicode 4.0 character set can be used. Vendor-defined characters (VDC) and user-defined characters (UDC) are also supported. VDCs occupy unused (reserved) code points of JIS X 0208–1990 or JIS X 0212–1990. UDCs occupy the same kind of unused code points, excluding those already allocated for VDCs.

Japanese Fonts

Three Japanese font formats are supported: bitmap, TrueType, and Type1. The Japanese Type1 font includes only JIS X 0212 for printing. The Type1 font is also used by UDC. Japanese bitmap fonts are described in the following table.

Table 4–1 Japanese Bitmap Fonts
Japanese TrueType fonts are described in the following table. Table 4–2 Japanese TrueType Fonts
Japanese Input Systems

ATOK12 is the default Japanese input system in the current Solaris environment. ATOK12 is available for all of the Japanese locales and all of the UTF-8 locales when the Japanese locale is installed. The Wnn6 Japanese input system is also available for all of the Japanese locales. You can switch the input system from the desktop menu. The kkcv Japanese input system is available for Japanese Solaris 1.x BCP support. The following procedure describes how to enter Japanese text with the ATOK12 input method.

How to Use the ATOK Input Method
Terminal Setting for Japanese Terminals

To use Japanese locales on a character-based terminal (TTY), you must use terminal settings to make line editing work correctly.
Japanese iconv Module

Several Japanese code set conversions are supported with iconv(1) and iconv(3). See the iconv_ja(5) man page for details.

User-Defined Character Support

The user-defined character utility sdtudctool handles both outline (Type1) and bitmap (PCF) fonts. Some utilities are also available to migrate the UDC fonts that were created by old utilities in prior releases, such as fontedit, type3creator, and fontmanager.

Differences Between Partial and Full Locales

The following components are only available in the Japanese full locale environment with the Languages CD:
Indic Localization

A phonetic lookup-based input method (Shabdalipi) and a continuous phonetic input method are available for all Indic languages that are supported in the UTF-8 locale. The input methods and virtual keyboards allow you to enter Indic text in all of the CDE applications. The following data flow illustrates the workings of the Indic input process.

How to Use the Indic Input Methods
Indic Keyboards

The following figures show the keyboard layouts that are available for the Indic input method.

The following figure shows the layout of the Bengali keyboard.
The following figure shows the layout of the Devanagari keyboard.
The following figure shows the layout of the Gujarati keyboard.
The following figure shows the layout of the Gurmukhi keyboard.
The following figure shows the layout of the Kannada keyboard.
The following figure shows the layout of the Malayalam keyboard.
The following figure shows the layout of the Tamil keyboard.
The following figure shows the layout of the Telugu keyboard.
Understanding the Mappings

The images in Mapping for the Continuous Phonetic Based Input Method show the mappings between English tokens and their equivalent codepoints in each of the supported target scripts. The CONSONANT category maps English tokens to consonants of the script. The VOWEL category maps English tokens to vowels of the script. The OTHER category maps characters that do not behave like consonants or vowels, that is, characters whose form does not change depending on the surrounding characters. The keywords CONSONANT, VOWEL, and OTHER also indicate that these characters are part of the Unicode standard.

The SPECIAL CONSONANT, SPECIAL VOWEL, and SPECIAL OTHER sections contain characters that behave like consonants, vowels, or others but are not officially part of the Unicode standard and are font dependent. They are assigned codepoint values in the Unicode Private Use Area. They are supported in Solaris UTF-8 locales, and the mapping may not work on a different platform.

The map files shown here are not identical to the ones on your system. They have been lightly edited to remove keywords that are not needed for this discussion. In the VOWELS and SPECIAL VOWELS sections, an independent form and a dependent form are displayed for the same English token; which form is used depends on the context. See How the Continuous Phonetic Input Method Works. The Malayalam script contains a special ‘CHILLU’ section, which is actually the SPECIAL OTHER category.

Mapping for the Continuous Phonetic Based Input Method

The following figures show the existing mappings from English to the phonetic equivalent characters in the target Indic scripts. Use these illustrations as a reference until you know all the mappings for the script that you use. The mappings are intended to be intuitive, so you should be able to input most of the characters without looking up the illustration.
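The categories described above, together with the rules in How the Continuous Phonetic Input Method Works, can be illustrated with a small sketch. The Python code below is not the Solaris input method engine and does not read the Solaris map files. The token tables are a tiny hypothetical subset for Devanagari, chosen only for illustration, and the function name `transliterate` is likewise an assumption. The sketch shows a consonant being emitted with a virama, the virama being replaced when a vowel follows, and a vowel taking its independent form when it does not follow a consonant.

```python
# Illustrative sketch only: the tables below are a small hypothetical subset,
# not the mappings shipped with Solaris.
VIRAMA = "\u094d"  # DEVANAGARI SIGN VIRAMA

# CONSONANT category: English token -> consonant codepoint
CONSONANTS = {
    "k": "\u0915", "kh": "\u0916", "g": "\u0917", "t": "\u0924",
    "n": "\u0928", "m": "\u092e", "r": "\u0930", "s": "\u0938",
}

# VOWEL category: English token -> (independent form, dependent form)
VOWELS = {
    "a": ("\u0905", ""),          # inherent vowel: no dependent sign
    "aa": ("\u0906", "\u093e"),
    "i": ("\u0907", "\u093f"),
    "ii": ("\u0908", "\u0940"),
    "u": ("\u0909", "\u0941"),
    "e": ("\u090f", "\u0947"),
    "o": ("\u0913", "\u094b"),
}

# OTHER category: characters whose form does not depend on context
OTHERS = {"M": "\u0902"}          # anusvara

# Longest tokens first so that "kh" wins over "k" and "aa" over "a".
TOKENS = sorted(set(CONSONANTS) | set(VOWELS) | set(OTHERS),
                key=len, reverse=True)


def transliterate(text):
    """Convert English tokens to Devanagari codepoints (sketch)."""
    out = []
    after_consonant = False  # True when the last item ends in a virama
    i = 0
    while i < len(text):
        for tok in TOKENS:
            if text.startswith(tok, i):
                break
        else:
            # Unmapped input (digits, punctuation) passes through unchanged.
            out.append(text[i])
            after_consonant = False
            i += 1
            continue

        if tok in CONSONANTS:
            # A consonant is first emitted in its ready-to-combine form.
            out.append(CONSONANTS[tok] + VIRAMA)
            after_consonant = True
        elif tok in VOWELS:
            independent, dependent = VOWELS[tok]
            if after_consonant:
                # Replace the virama with the dependent vowel sign.
                out[-1] = out[-1][:-1] + dependent
            else:
                out.append(independent)
            after_consonant = False
        else:
            out.append(OTHERS[tok])
            after_consonant = False
        i += len(tok)
    return "".join(out)
```

With these tables, `transliterate("namaste")` yields the codepoints U+0928 U+092E U+0938 U+094D U+0924 U+0947. Layout and rendering of the resulting sequence are left to other components, as in the method described in this chapter.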
Note – In these mappings, special characters such as ‘.’ and ‘|’ that are included as part of the mapping are escaped with a ‘\’ character. If not escaped, the ‘|’ character acts as a separator when more than one token represents the same UTF-8 character.

Figure 4–1, Figure 4–2, and Figure 4–3 show the English to Bengali mappings for consonants, vowels, and others.

Figure 4–1 Map for Bengali Consonants
Figure 4–2 Map for Bengali Vowels
Figure 4–3 Map for Bengali Others
Figure 4–4, Figure 4–5, and Figure 4–6 show the English to Gujarati mappings for consonants, vowels, and others. Figure 4–4 Map for Gujarati Consonants
Figure 4–5 Map for Gujarati Vowels
Figure 4–6 Map for Gujarati Others
Figure 4–7, Figure 4–8, and Figure 4–9 show the English to Gurmukhi mappings for consonants, vowels, and others. Figure 4–7 Map for Gurmukhi Consonants
Figure 4–8 Map for Gurmukhi Vowels
Figure 4–9 Map for Gurmukhi Others
Figure 4–10, Figure 4–11, and Figure 4–12 show the English to Hindi mappings for consonants, vowels, and others. Figure 4–10 Map for Hindi Consonants
Figure 4–11 Map for Hindi Vowels
Figure 4–12 Map for Hindi Others
Figure 4–13, Figure 4–14, and Figure 4–15 show the English to Kannada mappings for consonants, vowels, and others. Figure 4–13 Map for Kannada Consonants
Figure 4–14 Map for Kannada Vowels
Figure 4–15 Map for Kannada Others
Figure 4–16, Figure 4–17, and Figure 4–18 show the English to Malayalam mappings for consonants, vowels, and others. Figure 4–16 Map for Malayalam Consonants
Figure 4–17 Map for Malayalam Vowels
Figure 4–18 Map for Malayalam Others
Figure 4–19 and Figure 4–20 show the English to Tamil mappings for consonants and vowels. Figure 4–19 Map for Tamil Consonants
Figure 4–20 Map for Tamil Vowels
Figure 4–21, Figure 4–22, and Figure 4–23 show the English to Telugu mappings for consonants, vowels, and others. Figure 4–21 Map for Telugu Consonants
Figure 4–22 Map for Telugu Vowels
Figure 4–23 Map for Telugu Others
How the Continuous Phonetic Input Method Works

For each Indic script, a ‘virama’ or equivalent sign combined with a consonant gives the half form (or ready-to-combine form) of the consonant. Whenever a key combination corresponding to a consonant is typed, the consonant + virama form is output, indicating that the character is ready to combine. A consonant therefore assumes its half form on initial input and becomes a full syllable, or a variation of it, when followed by a vowel. Two consecutive consonants remain in their ready-to-combine half forms. The layout engine can convert half forms into a single combined character, or the half forms can remain as independent forms, which are also syntactically valid for every language.

Any vowel that begins a word or follows another vowel appears in its independent form. A vowel that immediately follows a consonant assumes its dependent form.

Characters that do not change shape in any context are called others. These characters are neither consonants nor vowels. Digits and punctuation marks that do not form part of a character are mapped one to one.

Based on these principles, a parser sorts the input into these categories and outputs the language-specific Unicode codepoints. The continuous phonetic input method engine does not deal with layout or rendering, which are handled by other modules in the system.

Thai Localization

The current Solaris environment supports three Thai input levels and four Thai keyboard layouts.

Thai Input Methods

The following Thai input methods are supported in this release. These input methods are specified in the Thai IT Standard for character sequence checking.
The passthrough level, with no sequence check, is the default in this release, as it was in previous Solaris releases. You can use the F2 function key to switch from one input level to the next.

Thai Keyboard Layouts

Four different keyboard layouts are supported for the Thai input method.
Thai Input Method Auxiliary Window

The Thai input method auxiliary window supports the following functions and utilities: