ISO/IEC 8859-11:2001, Information technology — 8-bit single-byte coded graphic character sets — Part 11: Latin/Thai alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 2001. It is informally referred to as Latin/Thai. It is nearly identical to the national Thai standard TIS-620 (1990). The sole difference is that ISO/IEC 8859-11 allocates non-breaking space to code 0xA0, while TIS-620 leaves it undefined. (In practice, this small distinction is usually ignored.)
ISO-8859-11 is not a primary registered IANA charset name, even though it follows the usual naming pattern for IANA charsets based on the ISO 8859 series. Instead, it is defined as an alias of the nearly equivalent TIS-620. The TIS-620 label can be used for ISO/IEC 8859-11 text without problems, because the no-break space occupies a code that TIS-620 leaves unallocated. Microsoft has assigned code page 28601 (also known as Windows-28601) to ISO-8859-11 in Windows. An earlier draft of the standard placed the Thai letters at different code positions.
As with all varieties of ISO/IEC 8859, the lower 128 codes are equivalent to ASCII. The additional characters, apart from no-break space, are found in Unicode in the same order, only shifted from 0xA1 to U+0E01 and so forth.
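The fixed offset described above can be illustrated with a short Python sketch. It assumes CPython's bundled `iso8859_11` codec is available; the names `THAI_OFFSET`, `thai_byte_to_codepoint` and `codec_agrees` are hypothetical and serve only to show the arithmetic.

```python
# Hypothetical helper: map an ISO/IEC 8859-11 byte in the Thai range to its
# Unicode code point using the fixed offset implied by 0xA1 -> U+0E01.
THAI_OFFSET = 0x0E01 - 0xA1  # 0x0D60


def thai_byte_to_codepoint(byte: int) -> int:
    """Return the Unicode code point for a Thai-range byte (0xA1..0xFB)."""
    if not 0xA1 <= byte <= 0xFB:
        raise ValueError("byte is outside the Thai range 0xA1..0xFB")
    if 0xDB <= byte <= 0xDE:
        raise ValueError("byte is unassigned in ISO/IEC 8859-11")
    return byte + THAI_OFFSET


def codec_agrees() -> bool:
    """Cross-check the offset rule against Python's built-in codec."""
    for byte in range(0xA1, 0xFC):
        if 0xDB <= byte <= 0xDE:
            continue  # unassigned positions have no Unicode equivalent
        decoded = bytes([byte]).decode("iso8859_11")
        if ord(decoded) != thai_byte_to_codepoint(byte):
            return False
    return True


print(hex(thai_byte_to_codepoint(0xA1)))
print(codec_agrees())
```

The no-break space at 0xA0 is deliberately excluded from the helper, since it maps to U+00A0 rather than following the Thai offset.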
Microsoft Windows code page 874 and MacThai, the code page used in the Thai version of the Apple Macintosh, are both variants of TIS-620, but they are incompatible with each other.
Character set
   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
0x | ||||||||||||||||
1x | ||||||||||||||||
2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7x | p | q | r | s | t | u | v | w | x | y | z | { | | | } | ~ | |
8x | ||||||||||||||||
9x | ||||||||||||||||
Ax | NBSP | ก | ข | ฃ | ค | ฅ | ฆ | ง | จ | ฉ | ช | ซ | ฌ | ญ | ฎ | ฏ |
Bx | ฐ | ฑ | ฒ | ณ | ด | ต | ถ | ท | ธ | น | บ | ป | ผ | ฝ | พ | ฟ |
Cx | ภ | ม | ย | ร | ฤ | ล | ฦ | ว | ศ | ษ | ส | ห | ฬ | อ | ฮ | ฯ |
Dx | ะ | ั | า | ำ | ิ | ี | ึ | ื | ุ | ู | ฺ | | | | | ฿ |
Ex | เ | แ | โ | ใ | ไ | ๅ | ๆ | ็ | ่ | ้ | ๊ | ๋ | ์ | ํ | ๎ | ๏ |
Fx | ๐ | ๑ | ๒ | ๓ | ๔ | ๕ | ๖ | ๗ | ๘ | ๙ | ๚ | ๛ |
Code values D1, D4-DA, E7-EE are combining characters.
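The list of combining positions can be reproduced with a minimal Python sketch. It assumes CPython's bundled `iso8859_11` codec and treats the Unicode general category `Mn` (nonspacing mark) as the test for a combining character; the function name `combining_bytes` is hypothetical.

```python
import unicodedata


def combining_bytes() -> list:
    """Return the ISO/IEC 8859-11 byte values that decode to nonspacing marks."""
    result = []
    for byte in range(0xA0, 0x100):
        try:
            char = bytes([byte]).decode("iso8859_11")
        except UnicodeDecodeError:
            continue  # unassigned position in the code table
        if unicodedata.category(char) == "Mn":
            result.append(byte)
    return result


print([f"{b:02X}" for b in combining_bytes()])
```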
Vendor extensions
Code page 874 (IBM) / 9066
IBM code page 874 (CP874, IBM-874, x-IBM874), also known as code page 9066 (IBM-9066), differs from ISO/IEC 8859-11 in only nine code positions: 0xA0, 0xDB to 0xDE, and 0xFC to 0xFF, as the following table shows:
   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
Ax | ่ | ก | ข | ฃ | ค | ฅ | ฆ | ง | จ | ฉ | ช | ซ | ฌ | ญ | ฎ | ฏ |
Bx | ฐ | ฑ | ฒ | ณ | ด | ต | ถ | ท | ธ | น | บ | ป | ผ | ฝ | พ | ฟ |
Cx | ภ | ม | ย | ร | ฤ | ล | ฦ | ว | ศ | ษ | ส | ห | ฬ | อ | ฮ | ฯ |
Dx | ะ | ั | า | ำ | ิ | ี | ึ | ื | ุ | ู | ฺ | ้ | ๊ | ๋ | ์ | ฿ |
Ex | เ | แ | โ | ใ | ไ | ๅ | ๆ | ็ | ่ | ้ | ๊ | ๋ | ์ | ํ | ๎ | ๏ |
Fx | ๐ | ๑ | ๒ | ๓ | ๔ | ๕ | ๖ | ๗ | ๘ | ๙ | ๚ | ๛ | ¢ | ¬ | ¦ | NBSP |
Code page 1161
Code page 1161 (CP1161, IBM-1161) is a variant of IBM code page 874. The only difference is the euro sign (€) at position 0xDE (222).
Code page 874 (Microsoft) / 1162
Windows code page 874 (windows-874, MS874, x-windows-874), known as Code page 1162 (CP1162, IBM-1162) by IBM, is used by Microsoft Windows. It differs from ISO/IEC 8859-11 only by adding the nine symbols shown in the following table:
   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
8x | € | | | | | … | | | | | | | | | | |
9x | | ‘ | ’ | “ | ” | • | – | — | | | | | | | | |
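The nine added symbols can be listed by comparing two codecs byte by byte. The sketch below assumes that CPython's `cp874` codec implements Windows code page 874 and that its `iso8859_11` codec implements ISO/IEC 8859-11; the function name `windows_874_additions` is hypothetical.

```python
import unicodedata


def windows_874_additions() -> dict:
    """Map each byte that cp874 decodes differently from iso8859_11 to its character."""
    additions = {}
    for byte in range(0x100):
        raw = bytes([byte])
        try:
            windows_char = raw.decode("cp874")
        except UnicodeDecodeError:
            continue  # cp874 leaves this byte undefined
        try:
            iso_char = raw.decode("iso8859_11")
        except UnicodeDecodeError:
            iso_char = None  # undefined in ISO/IEC 8859-11
        if windows_char != iso_char:
            additions[byte] = windows_char
    return additions


for byte, char in windows_874_additions().items():
    print(f"0x{byte:02X} -> U+{ord(char):04X} {unicodedata.name(char)}")
```

Under these assumptions, the differences all fall in the 0x80 to 0x9F range, where ISO/IEC 8859-11 has no graphic characters.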
Mac OS Thai
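This is the variant used on the classic Mac OS. Compared with ISO/IEC 8859-11, it adds characters in the 0x80 to 0x9F range and at 0xDB to 0xDE, and it replaces the Thai characters at 0xEE, 0xFA and 0xFB with ™, ® and ©.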
   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
8x | « | » | … | ่ | ้ | ๊ | ๋ | ์ | ่ | ้ | ๊ | ๋ | ์ | “ | ” | ํ |
9x | • | ั | ็ | ิ | ี | ึ | ื | ่ | ้ | ๊ | ๋ | ์ | ‘ | ’ | ||
Ax | NBSP | ก | ข | ฃ | ค | ฅ | ฆ | ง | จ | ฉ | ช | ซ | ฌ | ญ | ฎ | ฏ |
Bx | ฐ | ฑ | ฒ | ณ | ด | ต | ถ | ท | ธ | น | บ | ป | ผ | ฝ | พ | ฟ |
Cx | ภ | ม | ย | ร | ฤ | ล | ฦ | ว | ศ | ษ | ส | ห | ฬ | อ | ฮ | ฯ |
Dx | ะ | ั | า | ำ | ิ | ี | ึ | ื | ุ | ู | ฺ | WJ | ZWSP | – | — | ฿ |
Ex | เ | แ | โ | ใ | ไ | ๅ | ๆ | ็ | ่ | ้ | ๊ | ๋ | ์ | ํ | ™ | ๏ |
Fx | ๐ | ๑ | ๒ | ๓ | ๔ | ๕ | ๖ | ๗ | ๘ | ๙ | ® | © |
External links
- ISO/IEC 8859-11:2001
- ISO/IEC 8859-11:1999 - 8-bit single-byte coded graphic character sets, Part 11: Latin/Thai character set (draft dated June 22, 1999; superseded by ISO/IEC 8859-11:2001, published December 15, 2001)
- Windows code page 874
- ISO-IR 166 Thai character set (July 13, 1992, from Thai Standard TIS 620-2533 (1990))
- Standardization and Implementations of Thai Language (PDF, 175 KB)