ISO/IEC 8859-11

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Thai character encoding, based on ASCII

ISO/IEC 8859-11:2001, Information technology — 8-bit single-byte coded graphic character sets — Part 11: Latin/Thai alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 2001. It is informally referred to as Latin/Thai. It is nearly identical to the national Thai standard TIS-620 (1990). The sole difference is that ISO/IEC 8859-11 allocates non-breaking space to code 0xA0, while TIS-620 leaves it undefined. (In practice, this small distinction is usually ignored.)
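The 0xA0 difference can be inspected directly. The following is a minimal sketch in Python, assuming the standard library's bundled codecs named `iso8859_11` and `tis_620`; the helper `decode_or_none` is a hypothetical name introduced here for illustration. How a given implementation treats 0xA0 under the TIS-620 label may vary, so the sketch reports the result rather than assuming it.

```python
from typing import Optional


def decode_or_none(byte: int, encoding: str) -> Optional[str]:
    """Decode a single byte, returning None if the codec leaves it undefined."""
    try:
        return bytes([byte]).decode(encoding)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    # Report how each codec treats code 0xA0.
    for name in ("iso8859_11", "tis_620"):
        ch = decode_or_none(0xA0, name)
        label = "undefined" if ch is None else f"U+{ord(ch):04X}"
        print(f"{name}: 0xA0 -> {label}")
```

Returning `None` instead of raising keeps the comparison loop simple; production code would more likely pass an `errors=` policy to `bytes.decode`.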

ISO-8859-11 is not a main registered IANA charset name, despite following the usual naming pattern for IANA charsets based on the ISO 8859 series. It is, however, defined as an alias of the close equivalent TIS-620. Because the no-break space occupies a code that TIS-620 leaves unallocated, the TIS-620 label can be used for ISO/IEC 8859-11 data without problems. Microsoft has assigned code page 28601 (also known as Windows-28601) to ISO-8859-11 in Windows. A draft of the standard placed the Thai letters at different code positions.

As with all varieties of ISO/IEC 8859, the lower 128 codes are equivalent to ASCII. The additional characters, apart from the no-break space, appear in Unicode in the same order, shifted so that 0xA1 maps to U+0E01, 0xA2 to U+0E02, and so forth.
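Because the Thai characters keep their Unicode order, decoding reduces to adding a constant offset. The sketch below illustrates this in Python. The names `THAI_OFFSET`, `thai_byte_to_char`, and `decode_thai` are hypothetical, and the assigned ranges 0xA1–0xDA and 0xDF–0xFB are an assumption taken from the Unicode Thai block layout rather than from the text above.

```python
# Offset between an ISO/IEC 8859-11 Thai code and its Unicode code point.
THAI_OFFSET = 0x0E01 - 0xA1  # 0x0D60


def thai_byte_to_char(byte: int) -> str:
    """Map one Thai code to its Unicode character by a fixed shift."""
    # Assumed assigned ranges: 0xA1-0xDA and 0xDF-0xFB.
    if 0xA1 <= byte <= 0xDA or 0xDF <= byte <= 0xFB:
        return chr(byte + THAI_OFFSET)
    raise ValueError(f"0x{byte:02X} is not an assigned Thai code")


def decode_thai(data: bytes) -> str:
    """Decode ASCII, the no-break space, and Thai codes; reject anything else."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))        # lower half is identical to ASCII
        elif b == 0xA0:
            out.append("\u00a0")      # no-break space is the one exception
        else:
            out.append(thai_byte_to_char(b))
    return "".join(out)
```

For example, `decode_thai(b"\xca\xc7\xd1\xca\xb4\xd5")` yields the Thai greeting สวัสดี. In practice a library codec would normally be used; the hand-written version only makes the fixed shift visible.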

Microsoft Windows code page 874 and MacThai, the code page used in the Thai version of the Apple Macintosh, are both variants of TIS-620, but they are incompatible with each other.

Character set

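ISO/IEC 8859-11
0 1 2 3 4 5 6 7 8 9 A B C D E F
0x
1x
2x SP ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~
8x
9x
Ax NBSP ก ข ฃ ค ฅ ฆ ง จ ฉ ช ซ ฌ ญ ฎ ฏ
Bx ฐ ฑ ฒ ณ ด ต ถ ท ธ น บ ป ผ ฝ พ ฟ
Cx ภ ม ย ร ฤ ล ฦ ว ศ ษ ส ห ฬ อ ฮ ฯ
Dx ะ ั า ำ ิ ี ึ ื ุ ู ฺ · · · · ฿
Ex เ แ โ ใ ไ ๅ ๆ ็ ่ ้ ๊ ๋ ์ ํ ๎ ๏
Fx ๐ ๑ ๒ ๓ ๔ ๕ ๖ ๗ ๘ ๙ ๚ ๛ · · · ·
(· marks an unassigned code; rows 0x, 1x, 8x, and 9x contain no graphic characters.)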

Code values D1, D4–DA, and E7–EE are combining characters.
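The list of combining codes can be cross-checked against the Unicode character database. The following Python sketch assumes the standard library's `iso8859_11` codec and treats a character as combining when its general category is `Mn` (nonspacing mark); the function name `combining_codes` is hypothetical. The general category is used rather than `unicodedata.combining`, because several Thai marks have a canonical combining class of 0.

```python
import unicodedata


def combining_codes(encoding: str = "iso8859_11") -> list:
    """Return the upper-half codes that decode to nonspacing marks."""
    codes = []
    for b in range(0xA0, 0x100):
        try:
            ch = bytes([b]).decode(encoding)
        except UnicodeDecodeError:
            continue  # unassigned code, nothing to classify
        if unicodedata.category(ch) == "Mn":
            codes.append(b)
    return codes


if __name__ == "__main__":
    print(" ".join(f"{b:02X}" for b in combining_codes()))
```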

Vendor extensions

Code page 874 (IBM) / 9066

IBM code page 874 (CP874, IBM-874, x-IBM874), also known as Code page 9066 (IBM-9066), differs from ISO/IEC 8859-11 in only nine code points (A0, DB–DE, and FC–FF), as the following table shows:

IBM code page 874/9066 (differences from ISO-8859-11)
0 1 2 3 4 5 6 7 8 9 A B C D E F
Ax ่ ก ข ฃ ค ฅ ฆ ง จ ฉ ช ซ ฌ ญ ฎ ฏ
Bx ฐ ฑ ฒ ณ ด ต ถ ท ธ น บ ป ผ ฝ พ ฟ
Cx ภ ม ย ร ฤ ล ฦ ว ศ ษ ส ห ฬ อ ฮ ฯ
Dx ะ ั า ำ ิ ี ึ ื ุ ู ฺ ้ ๊ ๋ ์ ฿
Ex เ แ โ ใ ไ ๅ ๆ ็ ่ ้ ๊ ๋ ์ ํ ๎ ๏
Fx ๐ ๑ ๒ ๓ ๔ ๕ ๖ ๗ ๘ ๙ ๚ ๛ ¢ ¬ ¦ NBSP
  Differences from ISO 8859-11: A0, DB–DE, FC–FF

Code page 1161

Code page 1161 (CP1161, IBM-1161) is a variant of IBM code page 874. The only difference is the euro sign (€) at position 0xDE (decimal 222).
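Since both IBM variants are described as small deltas, their decoding tables can be derived from ISO/IEC 8859-11 plus a handful of overrides. The Python sketch below shows one way to do this. The names `build_table`, `IBM874_OVERRIDES`, `CP1161_OVERRIDES`, and `decode_with` are hypothetical, and the exact override positions (A0, DB–DE, FC–FF) are an assumption drawn from the IBM code chart. Codes 0x80–0x9F are left undefined in this sketch.

```python
def build_table(overrides: dict) -> dict:
    """Start from ISO/IEC 8859-11 and apply vendor-specific overrides."""
    table = {b: chr(b) for b in range(0x80)}       # ASCII half
    for b in range(0xA0, 0x100):
        try:
            table[b] = bytes([b]).decode("iso8859_11")
        except UnicodeDecodeError:
            pass                                   # unassigned in the base set
    table.update(overrides)
    return table


# Nine code points where IBM code page 874 departs from ISO/IEC 8859-11
# (positions assumed from the IBM chart).
IBM874_OVERRIDES = {
    0xA0: "\u0e48",
    0xDB: "\u0e49", 0xDC: "\u0e4a", 0xDD: "\u0e4b", 0xDE: "\u0e4c",
    0xFC: "\u00a2", 0xFD: "\u00ac", 0xFE: "\u00a6", 0xFF: "\u00a0",
}

# Code page 1161 is code page 874 with the euro sign at 0xDE.
CP1161_OVERRIDES = {**IBM874_OVERRIDES, 0xDE: "\u20ac"}


def decode_with(table: dict, data: bytes) -> str:
    """Decode bytes with a table; a KeyError signals an undefined code."""
    return "".join(table[b] for b in data)
```

Expressing each variant as a dictionary of overrides keeps the delta from the base standard explicit, which mirrors how the variants are described above.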

Code page 874 (Microsoft) / 1162

Windows code page 874 (windows-874, MS874, x-windows-874), known to IBM as Code page 1162 (CP1162, IBM-1162), is used by Microsoft Windows. It differs from ISO/IEC 8859-11 only by adding nine symbols in the 0x80–0x9F range, shown in the following table:

Code page 1162 (IBM) / 874 (Microsoft): difference from ISO-8859-11
0 1 2 3 4 5 6 7 8 9 A B C D E F
8x € · · · · … · · · · · · · · · ·
9x · ‘ ’ “ ” • – — · · · · · · · ·
  Differences from ISO 8859-11: 80, 85, 91–97 (· marks a position with no added symbol)
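The added symbols can also be found by comparing two codecs byte by byte. This Python sketch assumes that the standard library's `cp874` codec corresponds to Microsoft's code page 874 and that its `iso8859_11` codec maps 0x80–0x9F to C1 control characters. The function name `added_symbols` is hypothetical.

```python
def added_symbols(extended: str = "cp874", base: str = "iso8859_11") -> dict:
    """Map each code to the character that `extended` defines differently from `base`."""
    added = {}
    for b in range(0x100):
        raw = bytes([b])
        try:
            ext_ch = raw.decode(extended)
        except UnicodeDecodeError:
            continue                  # undefined in the extended code page
        try:
            base_ch = raw.decode(base)
        except UnicodeDecodeError:
            base_ch = None            # undefined in the base set
        if ext_ch != base_ch:
            added[b] = ext_ch
    return added


if __name__ == "__main__":
    for code, ch in added_symbols().items():
        print(f"0x{code:02X} U+{ord(ch):04X} {ch}")
```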

Mac OS Thai

This is the variant used on the Classic Mac OS.

Mac OS Thai
0 1 2 3 4 5 6 7 8 9 A B C D E F
8x « » ่ ้ ๊ ๋ ์ ่ ้ ๊ ๋ ์ ํ
9x ั ็ ิ ี ึ ื ่ ้ ๊ ๋ ์
Ax NBSP
Bx
Cx
Dx ั ำ ิ ี ึ ื ุ ู ฺ  WJ  ZWSP ฿
Ex ็ ่ ้ ๊ ๋ ์ ํ
Fx ® ©
  Differences from ISO 8859-11
