Part I Features Overview
This part introduces the features of the Simplified Chinese
Solaris Operating System (Solaris OS).
Chapter 1 Overview of Features
The Simplified Chinese Solaris Operating System (Solaris OS) is an internationalized
and localized version of the current Solaris Operating System and the Common Desktop Environment
(CDE) window system.
This chapter describes the new features and the language support that
are available in the Simplified Chinese Solaris release.
New Localized Features
New to this release are the Wubi input method, support for version 3.2
of the Unicode Standard, and improvements to the mp print
filter. The mp filter replaces the xetops and the xutops utilities.
-
Wubi Input Method. One of the main advantages of Wubi and
other shape-based input methods is a very low repetition rate. A single Wubi
code seldom represents more than one character, meaning that you can enter
text more quickly.
Under the authorization of Wangma Company, the following Wubi features
are available in the Solaris 10 release:
-
GB18030-2000 character set support – GB18030 is the national character
encoding standard that the Chinese government issued in 2000. The Wubi input method
supports the GB18030-2000 character set and makes working with the smaller
character sets contained in GB18030-2000 easier.
-
Easy character set switching – Solaris Wangma Wubi divides
GB18030 into three character sets: GB2312, GBK and GB18030. You can use keyboard
shortcuts to switch between character sets as you type.
-
New radical mechanism for Simplified and Traditional Chinese –
Patented by Professor Wang Yongmin, who invented the Wubi input method, this
new mechanism was developed from the old radical system, version 86. With
no additional training, users of Wubi version 86 can access three times more
characters with the same encoding and the same typing rules.
-
Unicode 3.2 support. The zh_CN.UTF-8
(zh.UTF-8) locale has been updated to support the new 3.2
version of the Unicode Standard. The new version adds
1,016 characters and contains various normative and informative changes.
Unicode 3.2 also makes invalid any UTF-8 byte sequence that has both of the following:
-
0xED as the first byte.
-
0xA0 to 0xBF as the second byte.
These sequences would otherwise encode the surrogate code points between U+D800 and
U+DFFF, which are now excluded from UTF-8. To comply with the new definition, the Simplified Chinese UTF-8 iconv modules have been enhanced to detect the
newly invalid byte sequences.
-
In the current Solaris release,
the mp printing utility replaces the xetops
and the xutops utilities.
Note –
The xetops and xutops printing
utilities are no longer supported in the Solaris Operating System. The utilities were formerly
used to convert Simplified Chinese text files to PostScript. The conversion enabled
the printing of Simplified Chinese characters to PostScript printers with no resident
Asian fonts. The xetops utility was used in the zh_CN.EUC/zh locale and in the zh_CN.GBK/zh.GBK locale. The xutops
utility was used in the zh_CN.UTF-8/zh.UTF-8
locale.
The mp printing utility was first released with the
Solaris 9 Operating System.
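The byte sequences that Unicode 3.2 makes invalid, as described under Unicode 3.2 support above, can be illustrated with a short check. The following Python sketch is not the Solaris iconv implementation. The function name has_surrogate_sequence is hypothetical, and the scan only looks for the 0xED lead byte followed by a second byte from 0xA0 to 0xBF.

```python
def has_surrogate_sequence(data: bytes) -> bool:
    """Return True if data contains 0xED followed by a byte in 0xA0..0xBF.

    Hypothetical helper for illustration only. In UTF-8, 0xED can appear
    only as the first byte of a three-byte sequence, so a pairwise scan
    is enough to find the sequences that would encode U+D800..U+DFFF.
    """
    for first, second in zip(data, data[1:]):
        if first == 0xED and 0xA0 <= second <= 0xBF:
            return True
    return False


def strict_decoder_rejects(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder refuses the input."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False
```

For example, has_surrogate_sequence(b"\xed\xa0\x80") returns True because those bytes would encode U+D800, while b"\xed\x9f\xbf" (U+D7FF, the last code point before the surrogate range) passes the check.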
Language Support
The current Solaris release builds inherent internationalization features
into every localized product. Localization facilities support the ANSI C recommendations
for internationalization and localization that define the locale and related
categories.
Locale Attributes
A locale contains the culturally specific information
and conventions of the language for a particular global region. Each process
in the Solaris Operating System has the following set of locale attributes:
-
Locale settings, which you list and set with the locale
and setlocale commands
before you start a process from the command line.
For example, the Simplified Chinese locales and the English/ASCII locale
both have a category that defines the display of time and date according to
the cultural format, as well as the actual Simplified Chinese or English/ASCII
characters for the time and date.
-
Code sets, which support coding conventions for the GB2312
and the GB18030 character sets. These sets enable you to input, display, and
print Simplified Chinese text in file names, in system messages, and in terminal (TTY),
email, and data file content.
-
htt input method server, which handles Simplified Chinese
input for the Solaris Operating System. The htt server receives your
keyboard input and converts it to Simplified Chinese characters that are used
in Simplified Chinese applications.
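The way a process queries and sets one of its locale attributes can be sketched from a program as well as from the command line. The following Python example is an illustration only: format_date is a hypothetical helper, the date format string is an arbitrary choice, and the fallback to the "C" locale when the requested locale is not installed is an assumption made so that the sketch runs on any system.

```python
import datetime
import locale


def format_date(day: datetime.date, locale_name: str) -> str:
    """Format a date using the time and date conventions of locale_name.

    Hypothetical helper. Falls back to the "C" locale when the requested
    locale is not installed, and restores the previous setting afterward.
    """
    # Calling setlocale with no new value only queries the current setting.
    saved = locale.setlocale(locale.LC_TIME)
    try:
        try:
            locale.setlocale(locale.LC_TIME, locale_name)
        except locale.Error:
            locale.setlocale(locale.LC_TIME, "C")
        return day.strftime("%A %d %B %Y")
    finally:
        locale.setlocale(locale.LC_TIME, saved)
```

On a system where a Simplified Chinese locale is installed, passing a name such as "zh_CN.UTF-8" should produce the weekday and month names in Simplified Chinese, while "C" produces the English/ASCII names.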
Simplified Chinese Locales
The Simplified Chinese Solaris Operating System provides simultaneous support
for the locales in the following table. The locales look the same to the end
user, but the internal character encoding is different.
Table 1–1 Simplified Chinese Locales
| Locale | Description |
|---|---|
| zh_CN.EUC (zh) | Simplified Chinese EUC (GB2312) |
| zh_CN.GBK (zh.GBK) | Simplified Chinese GBK |
| zh_CN.GB18030 | Simplified Chinese GB18030-2000 |
| zh_CN.UTF-8 (zh.UTF-8) | Simplified Chinese UTF-8 (Unicode 3.2) |
Simplified Chinese Code Sets
The following table lists supported code sets for each Simplified Chinese
locale.
Table 1–2 Simplified Chinese Code Sets
| Locale | Code Set |
|---|---|
| zh_CN.EUC (zh) | gb2312 |
| zh_CN.GBK (zh.GBK) | GBK |
| zh_CN.GB18030 | GB18030-2000 |
| zh_CN.UTF-8 (zh.UTF-8) | UTF-8 |
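To make the difference in internal encoding concrete, the following Python sketch encodes the same text in each code set and reports which locales can represent a given character. The mapping from Solaris locale names to Python codec names (gb2312, gbk, gb18030, utf-8) is an assumption made for illustration, and both function names are hypothetical.

```python
# Assumed mapping from each Simplified Chinese locale to a Python codec
# that implements the corresponding code set.
CODE_SETS = {
    "zh_CN.EUC": "gb2312",
    "zh_CN.GBK": "gbk",
    "zh_CN.GB18030": "gb18030",
    "zh_CN.UTF-8": "utf-8",
}


def encode_for_locale(text: str, locale_name: str) -> bytes:
    """Encode text in the code set assumed for locale_name."""
    return text.encode(CODE_SETS[locale_name])


def supporting_locales(text: str) -> list:
    """Return the locales whose code set can represent every character."""
    result = []
    for name, codec in CODE_SETS.items():
        try:
            text.encode(codec)
        except UnicodeEncodeError:
            continue
        result.append(name)
    return result


# U+4E2D is in all four code sets, but its bytes differ between
# the GB family and UTF-8.
print(encode_for_locale("\u4e2d", "zh_CN.EUC").hex())
print(encode_for_locale("\u4e2d", "zh_CN.UTF-8").hex())
# U+9555 is in GBK but not in GB2312; U+3400 is in GB18030 but not in GBK.
print(supporting_locales("\u9555"))
print(supporting_locales("\u3400"))
```

The sketch shows why text that displays identically in two locales can still be stored as different bytes, and why a character from a larger set cannot always be saved in a locale that uses a smaller one.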
Simplified Chinese Input Methods and Fonts
The Simplified Chinese Solaris Operating System provides input methods and
fonts for the locales shown in the lists and tables in this section.
The following input methods are supported for the zh
locale:
The following input methods are supported for the zh_CN.GB18030 locale:
For a complete list of fonts supported for the Simplified Chinese locales,
see Bitmap and TrueType Fonts.
Input Method Auxiliary Window
The input method auxiliary window supports the following functions.
Locale Categories
In the Simplified Chinese Solaris Operating System, you can use the following general
and specific categories as defined by ANSI C for the Simplified Chinese and English
locales:
-
The general LC_ALL setting, which applies to all
of the categories for locale-related aspects of the environment.
-
Specific settings for particular aspects of the environment,
which include the following categories:
-
LC_CTYPE
-
LC_TIME
-
LC_NUMERIC
-
LC_MONETARY
-
LC_COLLATE
-
LC_MESSAGES
For example, the Simplified Chinese and the English/ASCII locales both have the LC_TIME category, which defines the display of the time and date according
to the cultural format, as well as the actual Simplified Chinese or English/ASCII
characters used in the display.
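The relationship between the general LC_ALL setting and the specific categories can be sketched as a lookup. The precedence used here (LC_ALL first, then the category's own variable, then LANG) follows the usual POSIX environment rules rather than anything stated in this chapter, the final fallback to "C" is an assumption, and effective_locale is a hypothetical function name.

```python
CATEGORIES = (
    "LC_CTYPE",
    "LC_TIME",
    "LC_NUMERIC",
    "LC_MONETARY",
    "LC_COLLATE",
    "LC_MESSAGES",
)


def effective_locale(category: str, env: dict) -> str:
    """Return the locale name that applies to one category.

    Hypothetical helper: checks LC_ALL, then the category variable,
    then LANG. Unset and empty variables are both skipped.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"  # assumed default when nothing is set


# Example environment: Simplified Chinese overall, but "C" for time and date.
env = {"LANG": "zh_CN.UTF-8", "LC_TIME": "C"}
for category in CATEGORIES:
    print(category, effective_locale(category, env))
```

With this environment, only LC_TIME resolves to "C". Adding LC_ALL to the dictionary would override every category at once, which is the sense in which the general setting covers all of the specific ones.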