Appendix A Codeset Conversions
A.1 Codeset Conversions
Table A-1 lists the supported codeset conversions.
Note -
In the table, Unicode* stands for any of the following codesets: UTF-8, UCS-2,
UCS-2BE, UCS-2LE, UCS-4, UCS-4BE, UCS-4LE, UTF-16, UTF-16BE, UTF-16LE.
ISO 8859 codesets can also be referenced without the ISO prefix; for
example, ISO 8859-1 can be written as 8859-1.
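The two naming conventions in the note can be made concrete with a small sketch. The helper names below (`expand_codeset`, `strip_iso_prefix`, `conversion_pairs`) are hypothetical and are not part of any conversion library; they only show how one bidirectional table row expands into individual from/to pairs under the rules stated in the note.

```python
from typing import List, Tuple

# Codesets covered by the "Unicode*" shorthand, copied from the note.
UNICODE_STAR = (
    "UTF-8", "UCS-2", "UCS-2BE", "UCS-2LE",
    "UCS-4", "UCS-4BE", "UCS-4LE",
    "UTF-16", "UTF-16BE", "UTF-16LE",
)


def expand_codeset(name: str) -> Tuple[str, ...]:
    """Expand the 'Unicode*' shorthand; return any other name unchanged."""
    return UNICODE_STAR if name == "Unicode*" else (name,)


def strip_iso_prefix(name: str) -> str:
    """Return the short form of an ISO 8859 name ('ISO 8859-1' -> '8859-1')."""
    prefix = "ISO "
    if name.startswith(prefix + "8859-"):
        return name[len(prefix):]
    return name  # other names, such as 'ISO 646', are left alone


def conversion_pairs(left: str, right: str) -> List[Tuple[str, str]]:
    """List every (from, to) pair implied by one bidirectional table row."""
    pairs = []
    for a in expand_codeset(left):
        for b in expand_codeset(right):
            pairs.append((a, b))  # left-to-right direction
            pairs.append((b, a))  # right-to-left direction
    return pairs
```

For instance, `conversion_pairs("Unicode*", "ISO 8859-1")` yields 20 pairs: each of the ten Unicode codesets in both directions.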
Table A-1 Supported code conversions
| Code | Code | Description |
|---|---|---|
| Unicode* | ISO 646 | Unicode* <--> ISO 646 (ASCII) |
| Unicode* | ISO 8859-1 | Unicode* <--> ISO 8859-1 (Latin-1) |
| Unicode* | ISO 8859-2 | Unicode* <--> ISO 8859-2 (Latin-2) |
| Unicode* | ISO 8859-3 | Unicode* <--> ISO 8859-3 (Latin-3) |
| Unicode* | ISO 8859-4 | Unicode* <--> ISO 8859-4 (Latin-4) |
| Unicode* | ISO 8859-5 | Unicode* <--> ISO 8859-5 (Cyrillic) |
| Unicode* | ISO 8859-6 | Unicode* <--> ISO 8859-6 (Arabic) |
| Unicode* | ISO 8859-7 | Unicode* <--> ISO 8859-7 (Greek) |
| Unicode* | ISO 8859-8 | Unicode* <--> ISO 8859-8 (Hebrew) |
| Unicode* | ISO 8859-9 | Unicode* <--> ISO 8859-9 (Latin-5) |
| Unicode* | ISO 8859-10 | Unicode* <--> ISO 8859-10 (Latin-6) |
| Unicode* | ISO 8859-13 | Unicode* <--> ISO 8859-13 |
| Unicode* | ISO 8859-14 | Unicode* <--> ISO 8859-14 |
| Unicode* | ISO 8859-15 | Unicode* <--> ISO 8859-15 |
| Unicode* | KOI8-R, KOI8-U, koi8-r, koi8-u | Unicode* <--> KOI8-R, KOI8-U, koi8-r, koi8-u (Cyrillic) |
| UTF-7 | UCS-2, UCS-4, UTF-8 | UTF-7 <--> UCS-2, UCS-4, UTF-8 |
| UTF-8 | UCS-2, UCS-4, UTF-16 | UTF-8 <--> UCS-2, UCS-4, UTF-16 |
| UTF-8 | UCS-2BE, UCS-2LE, UCS-4BE, UCS-4LE, UTF-16BE, UTF-16LE | UTF-8 <--> UCS-2BE, UCS-2LE, UCS-4BE, UCS-4LE, UTF-16BE, UTF-16LE |
| UCS-4, UCS-4BE, UCS-4LE | UCS-2, UCS-2BE, UCS-2LE, UTF-16, UTF-16BE, UTF-16LE | UCS-4, UCS-4BE, UCS-4LE <--> UCS-2, UCS-2BE, UCS-2LE, UTF-16, UTF-16BE, UTF-16LE |
| UTF-8 | UTF-EBCDIC | UTF-8 <--> UTF-EBCDIC |
| UTF-8 | IBM-037, -273, -277, -278, -280, -284, -285, -297, -420, -424, -500, -850, -852, -855, -856, -857, -862, -864, -866, -869, -870, -875, -880, -921, -922, -1025, -1026, -1046, -1112, -1122 | UTF-8 <--> various IBM code pages (PC and EBCDIC) |
| UTF-8 | CP850, CP852, CP855, CP857, CP862, CP864, CP866, CP869, CP874, CP1250, CP1251, CP1252, CP1253, CP1254, CP1255, CP1256, CP1257, CP1258 | UTF-8 <--> various Microsoft code pages |
| UTF-8 | eucJP | UTF-8 <--> Japanese EUC (JIS X0201-1976, JIS X0208-1983 and JIS X0212-1990) |
| UTF-8 | PCK | UTF-8 <--> Japanese PC Kanji (a.k.a. SJIS) |
| UTF-8 | ISO-2022-JP | UTF-8 <--> Japanese MIME charset |
| UTF-8-Java | eucJP | UTF-8-Java to Japanese EUC (JIS X0201-1976, JIS X0208-1983 and JIS X0212-1990) |
| UTF-8-Java | PCK | UTF-8-Java to Japanese PC Kanji (a.k.a. SJIS) |
| UTF-8-Java | ISO-2022-JP.RFC1468 | UTF-8-Java to Japanese MIME charset (one-way conversion) |
| UTF-8 | ko_KR-euc | UTF-8 <--> Korean EUC (KS C 5636 and KS C 5601-1987) |
| UTF-8 | ko_KR-johap | UTF-8 <--> Korean Johap (of KS C 5601-1987) |
| UTF-8 | ko_KR-johap92 | UTF-8 <--> Korean Johap (of KS C 5601-1992) |
| UTF-8 | ko_KR-iso2022-7 | UTF-8 <--> Korean MIME charset (ISO-2022-KR) |
| UTF-8 | ko_KR-cp933 | UTF-8 <--> IBM MBCS CP933 ko_KR-euc |
| UTF-8 | gb2312 | UTF-8 <--> Simplified Chinese EUC (GB 1988-1980 and GB 2312-1980) |
| UTF-8 | iso2022 | UTF-8 <--> Simplified Chinese MIME charset (ISO-2022-CN) |
| UTF-8 | GBK | UTF-8 <--> Simplified Chinese GBK |
| UTF-8 | zh_TW-euc | UTF-8 <--> Traditional Chinese EUC (CNS 11643-1992) |
| UTF-8 | zh_TW-big5 | UTF-8 <--> Traditional Chinese Big5 |
| UTF-8 | zh_TW-iso2022-7 | UTF-8 <--> Traditional Chinese MIME charset (ISO-2022-TW) |
| UTF-8 | zh_TW-cp937 | UTF-8 <--> IBM MBCS CP937 |
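As a rough illustration of what a bidirectional (`<-->`) entry means in practice, the sketch below performs a round trip between UTF-8 and ISO 8859-1. It uses Python's built-in codecs, not the conversion modules this appendix documents; the codec names (`iso8859-1`, `utf-8`) are Python's own, the `convert` helper is hypothetical, and the mapping tables may differ in detail from the ones listed in the table.

```python
def convert(data: bytes, from_code: str, to_code: str) -> bytes:
    """Convert bytes between two codesets by decoding to text and re-encoding."""
    return data.decode(from_code).encode(to_code)


# Round trip for a row such as "UTF-8 <--> ISO 8859-1".
utf8_bytes = "café".encode("utf-8")                       # é takes two bytes in UTF-8
latin1_bytes = convert(utf8_bytes, "utf-8", "iso8859-1")  # é takes one byte in Latin-1
round_trip = convert(latin1_bytes, "iso8859-1", "utf-8")

# A conversion can still fail for an individual character: the euro sign
# has no code point in ISO 8859-1, so Python raises UnicodeEncodeError.
try:
    convert("€".encode("utf-8"), "utf-8", "iso8859-1")
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False
```

A supported conversion between two codesets therefore does not guarantee that every character survives: characters outside the target codeset's repertoire have to be rejected or substituted, and how that is handled depends on the conversion implementation.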