
Hello!

Welcome to another very early alpha release of Unicode::Map.

It can be found at: http://wwwwbs.cs.tu-berlin.de/~schwartz/perl/


FURTHER MODULES

   You will need the module "Startup.pm" to run the "map" utility that
   comes with this distribution. It is also available at the address above.

   By coincidence, Gisle Aas and I have been working on the same task.
   We will coordinate our work to avoid confusion. Gisle's module is called
   Unicode::Map8 and can be found at your favorite CPAN site.


DESCRIPTION

   This module converts strings to and from the 2-byte Unicode UCS2 format.
   The available character sets, their names and their aliases are defined
   in the file "REGISTRY" in the Unicode::Map hierarchy.
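
   To make the idea of a 2-byte conversion concrete, here is a minimal
   sketch in Python. It is not this module's Perl interface. The table
   LATIN1_TO_UNICODE, the function names and the big-endian byte order are
   assumptions made for the illustration only.

```python
import struct

# Hypothetical mapping table: byte value -> Unicode code point.
# For ISO-8859-1 every byte maps to the code point with the same number.
LATIN1_TO_UNICODE = {b: b for b in range(256)}


def to_ucs2(data: bytes, table: dict) -> bytes:
    """Map each input byte to a code point and pack it as 2 bytes.

    Big-endian order (">H") is an assumption of this sketch.
    """
    return b"".join(struct.pack(">H", table[b]) for b in data)


def from_ucs2(data: bytes, table: dict) -> bytes:
    """Reverse direction: read 2-byte units and map them back to bytes."""
    if len(data) % 2:
        raise ValueError("UCS2 input must have an even number of bytes")
    reverse = {cp: b for b, cp in table.items()}
    units = struct.unpack(">%dH" % (len(data) // 2), data)
    return bytes(reverse[u] for u in units)


ucs2 = to_ucs2(b"Hi", LATIN1_TO_UNICODE)
roundtrip = from_ucs2(ucs2, LATIN1_TO_UNICODE)
```

   Each source character becomes exactly two bytes, which is why a plain
   lookup table per character set is enough for the common case.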

   Characters are mapped according to the data in binary mapfiles in the
   Unicode::Map hierarchy. Binary mapfiles can also be created with this
   module, so you can install your own character sets. A special utility,
   "mkmapfile", is provided to ease this task.

   Normally it is sufficient to map one character to one Unicode character
   and vice versa. Apple defines some mappings from one character to n
   Unicode characters, so these are supported as well.
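
   A one-to-n mapping can be pictured as a table whose values are
   sequences of code points. The following Python sketch illustrates the
   concept only. The table entries are made up, the function names are
   hypothetical, and nothing here reflects how the module stores or looks
   up such mappings.

```python
# Hypothetical table: one source byte maps to one or several code points.
TABLE = {
    0x41: (0x0041,),         # 'A' -> U+0041
    0x80: (0x0041, 0x0308),  # made-up entry: one byte -> 'A' + combining diaeresis
}


def map_to_unicode(data: bytes, table: dict) -> list:
    """Expand every source byte into its sequence of code points."""
    out = []
    for b in data:
        out.extend(table[b])
    return out


def map_from_unicode(code_points: list, table: dict) -> bytes:
    """Reverse mapping that tries the longest code point sequence first."""
    reverse = {seq: b for b, seq in table.items()}
    longest = max(len(seq) for seq in reverse)
    out = bytearray()
    i = 0
    while i < len(code_points):
        for n in range(min(longest, len(code_points) - i), 0, -1):
            seq = tuple(code_points[i:i + n])
            if seq in reverse:
                out.append(reverse[seq])
                i += n
                break
        else:
            raise KeyError("no mapping for code point %#06x" % code_points[i])
    return bytes(out)


expanded = map_to_unicode(b"\x41\x80", TABLE)
collapsed = map_from_unicode(expanded, TABLE)
```

   Trying the longest sequence first matters on the way back. Otherwise
   the 'A' that starts the two-character sequence would be matched alone.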

   Performance of this module is acceptable, except when loading East Asian
   map files. The module's structure will be improved.

   You should have a look at the "map" utility that comes with this
   distribution.


CPAN

   I have not put the module on CPAN yet. I propose to register Map.pm as:

   Unicode::Map  aupO  Maps characters from and to unicode


Contact: Martin schwartz@cs.tu-berlin.de

Comments welcome! 


Defined character sets:

01: ADOBE-DINGBATS
02: ADOBE-STANDARD (Adobe-Standard-Encoding, csAdobeStandardEncoding)
03: ADOBE-SYMBOL
04: APPLE-ARABIC
05: APPLE-CNTEURO
06: APPLE-CROATIAN
07: APPLE-CYRILLIC
08: APPLE-DINGBAT
09: APPLE-GREEK
10: APPLE-HEBREW
11: APPLE-ICELAND
12: APPLE-JAPAN
13: APPLE-ROMAN
14: APPLE-ROMANIA
15: APPLE-SYMBOL
16: APPLE-THAI
17: APPLE-TURKISH
18: APPLE-UKRAINE
19: BIG5
20: CP037 (csIBM037, ebcdic-cp-ca, ebcdic-cp-nl, ebcdic-cp-us, ebcdic-cp-wt)
21: CP1026 (IBM1026, csIBM1026)
22: CP1250 (windows-1250)
23: CP1251 (windows-1251)
24: CP1252 (windows-1252)
25: CP1253 (windows-1253)
26: CP1254 (windows-1254)
27: CP1255 (windows-1255)
28: CP1256 (windows-1256)
29: CP1257 (windows-1257)
30: CP1258 (windows-1258)
31: CP437 (437, csPC8CodePage437)
32: CP500 (csIBM500, ebcdic-cp-be, ebcdic-cp-ch)
33: CP737
34: CP775 (IBM775, csPC775Baltic)
35: CP850 (850, IBM850, csPC850Multilingual)
36: CP852 (852, IBM852, csPCp852)
37: CP855 (855, IBM855, csIBM855)
38: CP857 (857, IBM857, csIBM857)
39: CP860 (860, IBM860, csIBM860)
40: CP861 (861, IBM861, cp-is, csIBM861)
41: CP862 (862, IBM862, csPC862LatinHebrew)
42: CP863 (863, IBM863, csIBM863)
43: CP864 (IBM864, csIBM864)
44: CP865 (865, IBM865, csIBM865)
45: CP866 (866, IBM866, csIBM866)
46: CP869 (869, IBM869, cp-gr, csIBM869)
47: CP874
48: CP875
49: CP932
50: CP936
51: CP949
52: CP950
53: GB12345-80
54: GB2312-80
55: IBM038 (CP038)
56: ISO-8859-1 (CP819, IBM819, ISO-IR-100, ISO_8859-1:1987, L1, LATIN1)
57: ISO-8859-10 (ISO-IR-157, ISO_8859-10:1993, L6, LATIN6)
58: ISO-8859-2 (ISO-IR-101, ISO_8859-2:1987, L2, LATIN2)
59: ISO-8859-3 (ISO-IR-109, ISO_8859-3:1988, L3, LATIN3)
60: ISO-8859-4 (ISO-IR-110, ISO_8859-4:1988, L4, LATIN4)
61: ISO-8859-5 (CYRILLIC, ISO-IR-144, ISO_8859-5:1988)
62: ISO-8859-6 (ARABIC, ASMO-708, ECMA-114, ISO-IR-127, ISO_8859-6:1987)
63: ISO-8859-7 (ECMA-118, ELOT_928, GREEK, GREEK8, ISO-IR-126, ISO_8859-7:1987)
64: ISO-8859-8 (HEBREW, ISO-IR-138, ISO_8859-8:1988)
65: ISO-8859-9 (ISO-IR-148, ISO_8859-9:1989, L5, LATIN5)
66: JIS-X-0201
67: JIS-X-0208
68: JIS-X-0212
69: MS-CYRILLIC
70: MS-GREEK
71: MS-ICELAND
72: MS-LATIN2
73: MS-ROMAN
74: MS-TURKISH
75: NEXT (NEXTSTEP, NeXT)
76: Shift-JIS
77: US-ASCII (ANSI_X3.4-1968, ANSI_X3.4-1986, ASCII, IBM367, ISO646-US, ISO_646.irv:1991, cp367, csASCII, iso-ir-6, us)
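
The names in parentheses above are aliases of the name in front of them.
Resolving an alias to its character set can be sketched in Python as
follows. The dictionary is a small stand-in copied from the list above, not
the format of the "REGISTRY" file, and the case-insensitive comparison is
an assumption of this sketch.

```python
# Stand-in for a few registry entries: canonical name -> aliases.
REGISTRY = {
    "ISO-8859-1": ["CP819", "IBM819", "ISO-IR-100", "ISO_8859-1:1987", "L1", "LATIN1"],
    "CP1252": ["windows-1252"],
    "NEXT": ["NEXTSTEP", "NeXT"],
}


def build_index(registry: dict) -> dict:
    """Build a lookup from every name or alias (upper-cased) to the canonical name."""
    index = {}
    for name, aliases in registry.items():
        for label in [name] + aliases:
            index[label.upper()] = name
    return index


def resolve(label: str, index: dict):
    """Return the canonical character set name, or None if the label is unknown."""
    return index.get(label.upper())


INDEX = build_index(REGISTRY)
```

With this index, resolve("latin1", INDEX) and resolve("IBM819", INDEX) both
lead to "ISO-8859-1".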

