KSC and UHC

KS C 5601-1987 (Wansung)
- EUC-KR
- ISO-2022-KR
Unified Hangul Code (Extended Wansung)
Links

KS C 5601-1987

A 94x94 character set for Korean, also known as Wansung. KS stands for Korean Standard. Microsoft uses the name KS C 5601-1987 to refer to Unified Hangul Code.

KS C 5601-1987 has the following encoding forms:

EUC-KR (default)
ISO-2022-KR


EUC-KR

Extended Unix Code for Korean. An 8-bit encoding form, the default encoding of KS C 5601-1987.

Byte ranges:

     single-byte ASCII:             0x21-0x7E
     double-byte KSC (both bytes):  0xA1-0xFE
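
The byte ranges above are enough to split an EUC-KR byte stream into characters. The following Python sketch is illustrative only: the function name `split_euc_kr` is hypothetical, and it treats every byte in 0x00-0x7F as single-byte ASCII (an assumption, slightly wider than the printable range listed above, so that spaces and newlines pass through).

```python
def split_euc_kr(data):
    """Split EUC-KR bytes into a list of one- or two-byte characters."""
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead <= 0x7F:
            # Single-byte ASCII (assumption: control bytes are accepted too).
            chars.append(data[i:i + 1])
            i += 1
        elif 0xA1 <= lead <= 0xFE:
            # Double-byte KSC: the second byte must be in the same range.
            if i + 1 >= len(data) or not 0xA1 <= data[i + 1] <= 0xFE:
                raise ValueError("invalid trail byte at offset %d" % (i + 1))
            chars.append(data[i:i + 2])
            i += 2
        else:
            raise ValueError("invalid lead byte 0x%02X at offset %d" % (lead, i))
    return chars


sample = "A한글".encode("euc_kr")
print(split_euc_kr(sample))  # [b'A', b'\xc7\xd1', b'\xb1\xdb']
```

Because the lead byte alone tells a single-byte character from a double-byte one, the stream can be scanned left to right without any state.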

Test your browser by selecting Korean in View : Character Coding, or View : Encoding. The sample text below should then display as the Korean word "Hangul":

     한글


ISO-2022-KR

A 7-bit encoding form of KSC used in email. It uses the same bytes (0x21-0x7E) to encode single-byte ASCII and double-byte KSC characters.

An ISO-2022-KR encoded plain text file or the body of an ISO-2022-KR encoded email message must begin with the following escape sequence:

<Esc>$)C

where <Esc> is the Escape byte (0x1B).

Bytes are interpreted as ASCII characters unless you "shift out" to KSC with the Shift-Out byte (0x0E). You can return to ASCII with the Shift-In byte (0x0F).
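
As a rough illustration of the rules above, the Python sketch below builds an ISO-2022-KR byte string by hand. The function name `to_iso2022_kr` is hypothetical. The sketch assumes that the 7-bit form of a KSC character is its EUC-KR form with the high bit of each byte cleared, and it does not handle line-oriented details such as returning to ASCII before a line break.

```python
DESIGNATOR = b"\x1b$)C"  # <Esc>$)C, required at the start
SHIFT_OUT = b"\x0e"      # switch to double-byte KSC
SHIFT_IN = b"\x0f"       # switch back to ASCII


def to_iso2022_kr(text):
    """Encode text as ISO-2022-KR using EUC-KR as an intermediate step."""
    out = bytearray(DESIGNATOR)
    in_ksc = False
    for ch in text:
        raw = ch.encode("euc_kr")
        if raw[0] >= 0xA1:
            # KSC character: shift out if needed, then clear the high bits.
            if not in_ksc:
                out += SHIFT_OUT
                in_ksc = True
            out += bytes(b & 0x7F for b in raw)
        else:
            # ASCII character: shift in if needed, then copy the byte.
            if in_ksc:
                out += SHIFT_IN
                in_ksc = False
            out += raw
    if in_ksc:
        out += SHIFT_IN  # end in ASCII mode
    return bytes(out)


print(to_iso2022_kr("A한글B"))  # b'\x1b$)CA\x0eGQ1[\x0fB'
```

The output contains only 7-bit bytes, which is what makes this form suitable for mail transports that are not 8-bit clean.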

If you are using Mozilla or Netscape 6, you can decode the sample text below by selecting Korean (ISO-2022-KR) in View : Character Coding, or View : Encoding. Shown here without its non-printing control bytes, the sample is the ISO-2022-KR encoding of 한글:

     $)CGQ1[

Internet Explorer and Netscape Communicator can display ISO-2022-KR encoded web pages only if the encoding is specified in a META tag.


Unified Hangul Code

UHC, or Extended Wansung, is a superset of KS C 5601-1987, incorporating all the Hangul characters of Johab. It has an 8-bit encoding form with the following byte-ranges:

     Single-byte ASCII:       0x21-0x7E
     UHC first byte range:    0x81-0xFE
     UHC second byte ranges:  0x41-0x5A, 0x61-0x7A, 0x81-0xFE
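
The ranges above can be turned into a simple range check, sketched below in Python. The helper names `is_uhc_pair` and `is_ksc_pair` are hypothetical, and the example assumes that Python's `cp949` codec (Microsoft code page 949) implements UHC. Passing the range check is necessary but not sufficient: not every byte pair inside these ranges is assigned a character.

```python
def is_uhc_pair(lead, trail):
    """True if (lead, trail) lies within the UHC double-byte ranges."""
    trail_ok = (0x41 <= trail <= 0x5A
                or 0x61 <= trail <= 0x7A
                or 0x81 <= trail <= 0xFE)
    return 0x81 <= lead <= 0xFE and trail_ok


def is_ksc_pair(lead, trail):
    """True if (lead, trail) lies within the original EUC-KR ranges."""
    return 0xA1 <= lead <= 0xFE and 0xA1 <= trail <= 0xFE


# The first syllable is in KS C 5601-1987; the second is only in UHC.
for ch in "한똠":
    lead, trail = ch.encode("cp949")
    print(ch, hex(lead), hex(trail),
          is_uhc_pair(lead, trail), is_ksc_pair(lead, trail))
```

Every pair accepted by `is_ksc_pair` is also accepted by `is_uhc_pair`, which reflects UHC being a superset of KS C 5601-1987.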
