FAQ: Discussion of Custom-Made Characters for Chinese Buddhist Texts

---------- Forwarded message ----------
Date: Sat, 2 Sep 1995 13:29:57 +0800 (CST)
From: David Chiou <b83050@cctwin.ee.ntu.edu.tw>
Subject: Chinese Characters FAQ (about Buddhism)

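Below are the more important of the documents on Chinese character
encodings that I have recently collected from various places.  The choice
of encoding for Buddhist texts, and the problem of characters that have to
be created by hand, are currently the bottleneck in text input; the
information below is offered for your reference.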

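ps. If anyone on this mail alias works at a temple, or is keenly interested
    in information about Chinese input (encodings, character creation,
    etc.), please reply and let me know, so that I can add you to the mail
    alias for Buddhist institutions doing text input.  Some technical
    questions about Chinese encodings will not be posted to the present
    mail alias.
    (At present that list is only carbon-copied to the Center for Buddhist
      Studies at National Taiwan University, Ven. Tzu-yen (自衍法师) of
      Hsiang Kuang Temple (香光寺), Ven. Kuo-kuang (果光法师) of Nung Chan
      Monastery (农禅寺) and a few other venerables, plus the accounts of a
      few especially enthusiastic members.)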


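The following is the important FAQ material on Chinese encodings for
Buddhist texts:
(Last time I forwarded several dozen related messages to the Buddhist
  institutions above for reference.  If you are particularly interested,
  you can ask me for more detailed documents, or simply join the Buddhist
  institutions list.)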


=========================================================================
Date: Sat, 13 May 1995 10:07:34 +0800
From: Shann Wei-Chang <shann@math.ncu.edu.tw>
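About the author: Prof. Shann Wei-Chang (单维彰) of the Department of
          Mathematics, National Central University, has a strong interest
          in classical Chinese studies, knows UNIX systems very well, and
          has taken part in online discussions of encodings for many years.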
Subject: internal code

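Da-gang (大刚),

I have just read your report on the rare characters encountered when
entering Buddhist texts.  As you know, I have been on the CCNET and CHPOEM
mailing lists for a long time, and we often discuss problems of this kind.
There is in fact no agreed, settled solution, and my own views have changed
over time (whether they have grown more mature remains to be seen).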

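Let me tell you what I think at present, for your reference.  First, I do
not like Big-5 or the group of people who originally designed it; it is a
classic case of bad money driving out good.  However, as I come to
recognize and compromise with the facts (this is probably related to age),
I have begun to accept that any Chinese electronic document meant to
circulate widely must be compatible with Big-5: directly compatible, with
no transcoding or special handling required.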

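Output of special characters is easier than input (input for search, for
instance).  However, an electronic document usually carries only character
codes and not the glyphs (the bitmap binary file, or other formats).  If
the document circulates on diskette or CD-ROM this is a smaller problem,
but we would always like the same document to be publishable on the network
with very few changes.  At that point the document and the reader software
are two separate things.  This is where the most effort is needed.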

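My current idea is to use Big-5 as the base and, when a rare character is
encountered, to set it off with an escape sequence, as is done in the HZ
code commonly used by students overseas, in the Japanese JIS standard, and
in the EUC supported by most UNIX workstations.  If the reader software on
the user's side cannot recognize the escape sequence, or has no
corresponding font, the reader may see a string of garbage, but such
characters should normally be few and should not affect the content of the
text as a whole.  As for which strings to use as escape sequences: our
national CNS code is already registered internationally, and we should
follow that standard as far as we can.  Where we cannot, we should use the
power of mass communication on the network, together with political
lobbying, to get the escape sequences we choose made into a standard.  As
for how the rare characters should be encoded, we should likewise first
consult the standard interchange code published by the National Bureau of
Standards in 1992.  Its layout conforms to international standards.  It
currently has seven planes, with plenty of room for expansion, and each
plane holds 94*94 codes as the international standard prescribes (two
bytes, each byte is between 33 and 126, decimal inclusive).  The characters
chosen for planes 1 and 2 are basically the same as in Big-5, but with
several (perhaps all) of the errors corrected.  Planes 3 through 7 define
more than thirty thousand rare characters, alternate forms, variant forms,
and some very odd characters that appear only in fortune-tellers' naming
lore: both their codes and their glyphs.  Planes 8 through 16 are empty;
plane 12 is user defined.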
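
As a rough illustration of the 94*94 arithmetic above, the following
Python sketch maps a two-byte code (each byte between 33 and 126) to its
position within one plane and back.  The function and constant names are
made up for this example and do not come from any CNS software.

```python
# Sketch of the 94 x 94 plane layout: two bytes, each in 33..126.
# All names here are illustrative assumptions, not part of any CNS tool.

FIRST = 33                 # lowest allowed byte value
LAST = 126                 # highest allowed byte value
SIZE = LAST - FIRST + 1    # 94 possible values per byte


def cell_index(byte1: int, byte2: int) -> int:
    """Return the 0-based position of a two-byte code within its plane."""
    for b in (byte1, byte2):
        if not FIRST <= b <= LAST:
            raise ValueError(f"byte {b} is outside {FIRST}..{LAST}")
    return (byte1 - FIRST) * SIZE + (byte2 - FIRST)


def cell_bytes(index: int) -> tuple:
    """Inverse of cell_index: recover the two bytes from a position."""
    if not 0 <= index < SIZE * SIZE:
        raise ValueError("index is outside the plane")
    row, col = divmod(index, SIZE)
    return row + FIRST, col + FIRST


if __name__ == "__main__":
    print(SIZE * SIZE)        # number of codes in one plane
    print(7 * SIZE * SIZE)    # number of codes in seven planes
```

Which plane a code belongs to is not part of the two bytes themselves; it
has to be signalled separately, which is why the escape sequences matter.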

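I am not learned enough to judge whether the characters in planes 3 through
7 are complete or properly ordered, because they are all characters I do
not recognize.  If the sutras contain characters that cannot be found
there, I suggest not using plane 12, but instead using the political
strength of the Buddhist organizations to secure a plane, for example
plane 13, as a plane for rare religious characters.  Anything called
"user defined" is bound to end up as a useless mess.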

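As for inputting the rare characters, it is obvious that matching Chinese
input software and fonts must be developed.  On X window there is already
an established way of doing this, and there should be no technical
difficulty on other systems either.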

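I do not know what our government has been doing.  For all that Taiwan
styles itself a computer kingdom, our national interchange code was first
published only in 1986, and it was communicated so poorly that nobody in
the market paid any attention to it (ignoring the government seems to be a
trait shared by modern Chinese on both sides of the strait).  I suspect
that even now many people in the field have never heard of this standard,
or have heard of it but never considered using it.  The Institute for
Information Industry and some public agencies, on the other hand, have
begun (perhaps under compulsion) to use it, and some foreign companies have
begun to support it, because it is after all a national standard code that
is registered internationally.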

I wrote this in a hurry and there are some wrong characters, but they are
hard to correct in this editor; please forgive me.

-Shann

========================================================================
Date: Mon, 28 Aug 1995 22:57:15 +0800 (CST)
From: David Chiou <b83050@cctwin.ee.ntu.edu.tw>
Subject: Recommend Chinese Code -- CNS



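What follows is an introduction to the various encodings, taken from the
Hanazono University Zen WWW site:
http://www.iijnet.or.jp/iriz/irizhtml/irizhome.htm

(I have added Chinese translations of some of the important passages as I
  went along, but I cannot guarantee that they are free of mistakes.
  The original text is authoritative in every case.)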

     _________________________________________________________________

Chinese character codes: an update
中文内码的探索：修改版

    by Christian Wittern
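    About the author: a senior staff member of the Zen research center at
              Hanazono University in Kyoto, Japan (the publisher of the
              journal "Electronic Bodhidharma").  The center has been
              coordinating worldwide contacts on the digitization of
              Buddhist texts since before 1992, and can be called the
              largest such network in the world today.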

     _________________________________________________________________

    Summary

   This article presents an update to Christian Wittern's and Urs App's
   articles concerning Chinese character codes (Electronic Bodhidharma
   No. 3). In those articles, Urs App argued that database creators must
   make the most crucial distinction between master data and user data.
   Master data should be of the highest quality, recording even minute
   detail like studio recording equipment. User data, on the other hand,
   must conform to what codes and equipment we presently have. Christian
   Wittern's article compared different codes and concluded that CCCII, a
   very large Taiwanese code that also includes Japanese and Korean
   letters, seems to be the best choice for the master data set of
   Chinese text databases.

   摘要

   本文是对 Christian Wittern 先生和 Urs App 关於中文内码的文章
   （刊载於「电子达摩」期刊第三期）的更新。在该文中， Urs App 表示资料库
   的建立者必须对 master data 与 user data 作出至为关键的区分。
   Master data 必须具有最高的品质，如同录音室的录音器材一般，连细微之处
   都记录下来；另一方面， user data 必须配合我们目前现有的内码与设备。

   Christian Wittern 先生的文章比较了几种不同的内码，结论是：
   「 CCCII（一种非常庞大的台湾的内码，并且包含了日本及韩国字）
      似乎是中文内码的 master data 的最佳选择。」


   We shelled out US $ 2000 for a CCCII board, only to discover that both
   the code itself and its implementation are seriously flawed. We thus
   had to continue using Big-5 for all practical purposes while looking
   for better solutions. Finally, Christian decided that the only
   practical approach at this time was to build on Big-5 (and other
   national codes such as JIS) and extend them through code references
   that are both stable and portable. His ingenious approach forms the
   basis of the IRIZ KanjiBase and its encoding scheme -- a scheme which
   will be as useful after the introduction of Unicode as it proves to be
   right now. (U.A.)

   我们花下了美金 2000 元，买了一个 CCCII 的板面，结果发现该码本身及
   它的附属设备，都具有严重的瑕疵。因此，我们在实际的状况上，只好继续
   使用 BIG-5内码，等著继续寻找更好的解决方案。最後， Christian 先生
   决定了，现时唯一实际可行的方法是建立在 BIG-5 （及日本国内普遍流行的
   JIS 码）上面，并且藉由既稳定又具可携性的「内码参照表」（code references）
   来扩展它们。他的这项聪明提议产生了「IRIZ 汉字库」的基础，以及「IRIZ
   汉字库」的「转译器」——一种在将来 Unicode 引进後，能够如同现在我们
   证明它有够实用的转译器。
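
As a rough illustration (not part of the original article), the idea of
keeping rare characters as portable references in the master text, and
resolving them through a table when user data is produced, might be
sketched in Python as follows.  The reference syntax `&M-0001;`, the table
contents, and the function name are all hypothetical and are not the
actual IRIZ KanjiBase format.

```python
import re

# Hypothetical reference syntax for a character missing from the base code.
REF = re.compile(r"&M-([0-9A-F]{4});")

# Hypothetical table: reference id -> substitute to show in each user format.
# "\u3013" is the geta mark, used here only as a visible placeholder.
USER_TABLE = {
    "0001": {"big5": "[M-0001]", "unicode": "\u3013"},
}


def to_user_data(master: str, target: str) -> str:
    """Replace each reference in the master text with the target's substitute.

    References that are not in the table (or have no entry for the target)
    are left untouched, so nothing in the master data is silently lost.
    """
    def substitute(match):
        entry = USER_TABLE.get(match.group(1))
        if entry is None or target not in entry:
            return match.group(0)
        return entry[target]

    return REF.sub(substitute, master)


if __name__ == "__main__":
    master_text = "如是&M-0001;聞"
    print(to_user_data(master_text, "big5"))
    print(to_user_data(master_text, "unicode"))
```

The master text itself never changes; only the table grows as new target
formats appear, which is the point of separating master data from user
data.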

     _________________________________________________________________

     * Some kanji codes for computers
         1. Japanese JIS Codes
         2. Taiwanese Big5
         3. Taiwanese CNS
         4. CCCII and EACC
         5. Unicode

     ＊一些电脑上的汉字内码：
         1. 日本 JIS 内码
         2. 台湾 BIG-5 内码
         3. 台湾中央标准局 CNS 内码
         4. CCCII 及 EACC 内码
         5. Unicode

     * More information is available at ifcss.org in Ross Patterson's
       document CJK Codes and in Ken Lunde: Understanding Japanese
       Information Processing p35ff.

     ＊在 ifcss.org 上有更多资讯，见 Ross Patterson 先生的「CJK Codes」
       一文；另见 Ken Lunde 先生的 Understanding Japanese Information
       Processing 一书第 35 页起。

     _________________________________________________________________

Development of kanji codes for computers
电脑汉字内码的发展

  Japanese JIS Codes
日本 JIS 码

   The first character code designed to make the processing of
   ideographic characters on computers possible was the JIS C 6226-1978.
   It was developed according to the guidelines laid down in the ISO
   standard 2022-1973 and became the model for most other code standards
   used today in East Asia (the most notable exception is Big5). Covering
   approximately 6500 characters, this standard has been revised two
   times, in 1983 and 1990, when the assignment of some characters was
   changed and a few were added. Revising a standard is about the worst
   thing a standards body can do and has caused much grief and headache
   among manufacturers and users alike. Today we finally have fonts that
   bear the year of the standard they cover in their name, so that users
   can know which version is encoded in that font and select it
   accordingly.
   Our texts and tools are based on the latest version.

   The version of 1990 has become known under the name JIS X 0208-1990
   and, together with an additional set of 5800 characters (JIS X 0212),
   has been the base of the Japanese contribution to Unicode.

   The JIS code is almost never used in computers as it was defined;
   rather, some changes are made in the way the code numbers are
   represented. This is necessary to allow JIS to be mixed with ASCII
   characters and, as in the case of Shift-JIS (or MS-Kanji, the most
   popular encoding on personal computers), with earlier Japanese
   encodings of half-width kana. East Asian text is thus most frequently
   based on a multibyte encoding, a character stream that contains a
   mixture of characters represented by one single byte and of characters
   represented by two bytes.
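
As a rough illustration (not part of the original article), Python's
built-in codecs can show how one JIS code number ends up as different byte
sequences, and how one-byte and two-byte characters mix in a single
stream. The codec names are Python's; the helper function is made up for
this sketch.

```python
# One ASCII letter plus one kanji, encoded three ways with built-in codecs.
text = "A漢"

iso2022 = text.encode("iso2022_jp")   # 7-bit bytes; escapes switch sets
euc = text.encode("euc_jp")           # kanji bytes have the high bit set
sjis = text.encode("shift_jis")       # kanji bytes are shifted differently


def kanji_bytes_from_iso2022(data: bytes) -> bytes:
    """Return the raw JIS bytes found between the two escape sequences."""
    start = data.index(b"\x1b$B") + 3      # ESC $ B selects the kanji set
    end = data.index(b"\x1b(B", start)     # ESC ( B returns to ASCII
    return data[start:end]


jis = kanji_bytes_from_iso2022(iso2022)

for name, data in (("ISO-2022-JP", iso2022),
                   ("EUC-JP", euc),
                   ("Shift-JIS", sjis)):
    print(f"{name:12} {len(data)} bytes: {data.hex(' ')}")
```

In EUC-JP the kanji bytes are simply the JIS bytes with the high bit set,
whereas Shift-JIS rearranges them, which is one reason conversion between
the representations needs care.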

   In addition to the characters in the national standard, many Japanese
   vendors have added their own private characters to JIS, making the
   conversion between these different encodings difficult beyond belief.

  Big5
（中文 BIG-5 码）


   There are different legends about the beginnings of Big5; some say
   that the code had been developed for an integrated application with 5
   parts, and others say it was an agreement of five big vendors in the
   computer industry. No matter which one is true (and it might as well
   be something else), the Taiwanese government did not realize the need
   for a practical encoding of Chinese characters in time.
   Government agencies had apparently been involved also in the
   development of Big5, but it was only in 1986 that an official code was
   announced, a time by which Big5 was already a de facto standard with
   numerous applications in daily use.

   关於 BIG-5 内码开始的传说，有许多不同的版本：有人说此内码是由一个
   整合五个部份的应用软体所产生的，又有人说它是五个大型的电脑厂商所
   共同约定的。不管哪一个传说是真的，台湾政府并未即时了解中文内码
   的重要性及须求性。虽然政府机关很明显地也参与了 BIG-5 的开发工作，
   不过直到 1986 年，官方的内码才正式对外宣布，这时 BIG-5 内码早已是
   为数极多的日常应用软体所采用的标准了。


   Big5 defines 13051 Chinese characters, arranged in two parts according
   to their frequency of usage. The arrangement within these parts is by
   number of strokes, then Kangxi radical. As Big5 was apparently
   developed in a great hurry, some mistakes were made in the stroke
   count (and thus placement) of characters, and two characters are
   represented twice. On the other hand, some frequently used characters were
   left out and were later implemented by individual companies.

   All implementations agree on the core part of Big5, but different
   extensions by individual vendors acquired much weight, most notably in
   the case of the ETEN Chinese system that was very popular in the late
   eighties and early nineties. As there is no document that defines Big5
   apart from the documentation provided by the vendors with their
   products, it is impossible to single out one standard Big5. This was
   actually a big problem in the process of designing Unicode -- and it
   remains one even today.

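   (Note: this paragraph describes the serious problem that Big5 cannot be
     pinned down to one standard. It caused trouble when Unicode was being
     designed, and it is still a problem today.)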
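
As a rough illustration (not part of the original article), the lack of
one standard Big5 is still visible in the fact that a modern language
runtime ships several Big5-family codecs. The sketch below only asks each
codec whether it can encode a given character. The codec names are
Python's; the sample character is, to my knowledge, one of the characters
added by the ETEN extension, and the result for each codec should be
checked on your own system rather than assumed.

```python
# Sketch: ask several Big5-family codecs whether they know a character.

def encodes_in(ch: str, codec: str) -> bool:
    """Return True if `codec` has a code for the character `ch`."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    sample = "碁"    # assumed to be a vendor-extension character
    for codec in ("big5", "cp950", "big5hkscs"):
        print(codec, encodes_in(sample, codec))
```

Text that uses such characters can therefore be readable under one "Big5"
and broken under another, which is why a project has to record exactly
which variant its data follows.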



  CNS X-11643-1986 and CNS X-11643-1992
（中央标准局 CNS X-11643-1986 及 CNS X-11643-1992）

   This is the Chinese National Code for Taiwan. In the form published in
   1992, it defines the glyph-shape, stroke count and radical heading for
   48027 characters. For all these characters a reference font in a 40 by
   40 grid (and for most of them also in a 24 by 24 grid) is available
   from the issuing body. These characters are assigned to 7 levels with
   the more frequent at the lower levels and the variant forms at the two
   top levels. The whole architecture reserves space for five more
   standard levels, and four levels are reserved for non-standard, private
   encoding, bringing the total to 16 levels, with a hypothetical space
   for roughly 120 000 ideographs. On top of the currently defined ones,
   one more level with about 7000 characters is currently under revision
   and expected to be published in the course of 1995. This will bring
   the total number of assigned characters to roughly 55000.

   这是台湾的中央标准码。在 1992 年发布的格式上，它为 48027 个中文字
   定义了 glyph-shape，stroke count，以及 radical heading 。对於这些
   所有的中文字，并有相应的 40 x 40 格子的字型（大部份的亦有24 x 24
   字型）附在发表的内容上。

   这些中国字被分配至七个字面，以最常用的字摆在下层字面，以及变异的
   字体摆在上面二层字面。中央标准码的技术，使它保留了五个以上的标准字面
   以及四个非标准、私人用字面，使得它总共可以有 16 个字面，并且对於粗略
   算来 120 000 个字号有个假设的空间。

   在目前已定义的最上层字面（第七层），一层多的字面（具有约 7000 个字）
   正在加以重新审核，并且打算在 1995 年公布。这将使得它所指定的中文字元
   可达到将近 55000 个字。


   The overall structure has already been outlined; but how does the CNS
   code relate to other code sets in use in East Asia, e.g. the Korean
   KSC, the Japanese JIS, and the mainland Chinese GB? And what about
   Unicode?

   这整体的结构已经被勾画出来了。但是 CNS 码与其它东亚所用的内码
   （例如韩国 KSC 码、日本 JIS 码、中国大陆简体 GB 码等）有什麽
   关系呢? 和 Unicode 的关系又如何呢?


   The answer to this is somewhat disappointing: Although CNS defines
   roughly eight times the number of characters, more than three hundred
   characters present in the Japanese JIS are still missing from the CNS.
   In relation to GB, the CNS misses roughly 1800 simplified characters.
   With this it is also clear that the CNS code will miss quite a number
   of Unicode Han characters. Upon closer examination, the reason is soon
   obvious: CNS in its higher levels occasionally defines some
   abbreviated forms, but in general it does not include characters
   created as a result of the modern character reforms. I consider this a
   serious drawback and an obstacle to a true universal character set.
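   But this seems to h

   [...] to handle this requirement. Practical work has shown how
   important it is to go on using the working environment one is already
   used to (together with its fonts, editors, and so on). I therefore now
   advocate using an encoding that is currently in wide international use
   (Taiwan's Big5 or Japan's JIS) together with the IRIZ KanjiBase, as a
   better solution than adopting CCCII.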


     _________________________________________________________________



    1. Before launching large database projects, one ought to find out
       what has already been done in the area and study its strengths and
       shortcomings. Often one learns much by asking programmers and database
       designers what they would do differently if they could start all
       over again. In the field of Buddhist studies, the Electronic
       Buddhist Text Initiative tries to help in this coordination and
       learning process.

     This may sound trite, but it is a fact that even major projects in
     the field are unaware of what is happening elsewhere -- and sometimes
     even in their own institution. On the recent field trip organized by
     the Electronic Buddhist Text Initiative, we found for example that
     the people managing the Chinese University of Hong Kong concordance
     project were not aware of the very similar effort in Oslo; and a
     long-time resident scholar at the Academia Sinica found out through
     us that important materials for a Chinese text he has been
     translating are on his institute's computer. That electronic
     versions of a text exist does not mean much in itself; one must
     evaluate data quality, accessibility, and suitability for one's
     project.
    2. One must classify data input projects by the amount of data
       involved and their destination. Thus one must distinguish between
       small amounts of data and large amounts of data, data destined for
       individual users or small groups and data destined for large user
       groups and institutions, etc. The present guidelines apply to
       large input projects that contain many full-form Chinese
       characters and are aimed at a large and diverse group of users.

     Failure to make such distinctions may lead to inadequate demands for
     data quality, search strategies, etc. For example, certain automatic
     or half-automatic methods of scanner input can be quite useful and
     efficient for an individual user prepared to spend a substantial
     amount of time for data correction; but the very same method may
     prove totally inadequate for large-scale institutional data input
     because of the high cost of error correction. Similarly, a
     relatively high number of mistakes may not bother some users but is
     unacceptable for data that are to be distributed to other users.
     Again, the use of many self-defined characters can be acceptable for
     individuals but not for institutions.
    3. It is of the greatest importance to make basic decisions at the
       beginning of a project and to discuss them with specialists. In
       making these decisions, both present and future possibilities of
       use must be kept in mind. This applies particularly to the choice
       of source text, text editing, annotation, basic data character
       (character encoding, data format, non-standard character handling,
       etc.), and hard/software environments. Such questions must be
       discussed by a team of specialists at the outset of a large
       project, i.e. before the main input activity starts, and an action
       plan should be approved by the whole team.

     Failure to do this can result in gigantic waste of money. Several
     Chinese text databases I know of started out with little planning;
     mostly they were designed to fit the hardware and software
     environment of some years ago at a specific location. Later, when
     trying to convert the data to present requirements and for use by
     other institutions, they found that automatic conversion was not
     possible or corrupted the data set. Prior planning and consultation
     with specialists could have prevented this. Another example: tagging
     data during the input or correction / editing process can improve
     the value of a database enormously, for example in making it
     possible to look for all plant names or place names in the whole
     Pali canon. Doing something like this at a later point would be
     another major enterprise that could have been avoided through
     careful planning.
    4. If the electronic text is (or may at a later point in time be)
       destined for international users and a variety of hardware and
       software environments, it is necessary to make a basic data set
       (master data set) that can later be automatically converted into
       any necessary code or format. It is important to treat this master
       data set as a separate entity whose input conditions, character
       code, hardware environment, etc. can be very different from that
       of the eventual user, just as studio quality music recording and
       editing equipment is different from the reproduction equipment of
       the consumer.

     With Chinese text, the difference shows particularly in the way rare
     characters and different national standards are handled.
     Institutions that do not separate master data and user data
     invariably produce data that follow the low standards of character
     codes now used on PCs (JIS, GB, BIG-5, etc.; see the article in this
     number by C. Wittern). Of the institutions visited on the recent
     field trip, those who did not distinguish between master and user
     data all suffer from data quality problems which will become even
     more serious as larger codes become available. Those who were wise
     enough to make this distinction are: the libraries of Taiwan
     National University and Hong Kong University of Science and
     Technology (both use master data in CCCII code and user data in
     BIG-5) and the Chinese Academy of Social Sciences (master data in
     their own 45,000 character code, user data in various formats). Just
     like master tapes in the music business, master data must be of such
     quality that it can be used in many different environments, present
     and future. Most of the Chinese text data so far input in Japan,
     Korea, and mainland China will have about as much future as the
     recording of a concert made on a Walkman.
    5. In order to assure such convertibility and adaptability, the
       master data must contain the greatest possible amount of
       information. This is an important factor of data quality. In the
       case of Chinese, Korean, or Japanese data (or any other text set
       that maip, we
     met programmers who admitted that they have never actually used the
     database they have been working on for years...
    9. Databases are made for users; therefore the wishes, working
       environment, and likely working habits of users must be carefully
       studied and respected. For example, most users search while
       writing a paper or book; therefore it must be possible to use the
       database concurrently with a word processing program. Any large
       text database should also let the user attach notes and tags to
       the main text. Such notes should also be searchable, printable
       (together with the text or separately), savable as separate files
       with location tags, and portable to updated versions of the
        electronic text. Search engines must also be adapted to many
        users' needs. Therefore they must be flexible and adaptable to a
        variety of users' preferences (just like word processing programs)
        rather than hard-coded. Search results should be viewable and
        printable and file saveable in a variety of formats according to
        the user's wishes. Since the main aim of databases is the retrieval of
       information, such retrieval should be carefully planned with many
       options for the user.

     In projects whose input takes many years of work, one must make
     programmers produce multiple test versions of search software and
     have scholars and other prospective users evaluate it even while
     input is going on. If necessary, data structure decisions have to be
     reevaluated. Users should have a say in all important software
     decisions, and programmers should assist users to evaluate test
     versions and to formulate their wishes by telling them about
     alternative possibilities.
    Author:Urs App
    Last updated: 95/04/23



==========================================================================
Date: Mon, 24 Jul 1995 23:39:11 +0800
From: Shann Wei-Chang <sq

[...] judging from his document, there seems to be no entirely optimistic
solution.  It is indeed vexing.

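For the next week or two I will be putting all my effort into writing a
user manual for Chinese TeX, and after that I have to help the student
assistants and the computer center write their accounting scripts.  Well,
there is no point in saying more; in short, I would very much like to help
but really am not able to.

>     However, Mr. Liu Ming-wei (刘明威), the engineer at ETen, said that
> he will start modifying the program only after a certain number of
> Buddhist organizations have backed this extension idea, so that he does
> not end up having worked for nothing.
>
>     Seen this way, won't this improved version of Big-5 be problematic?
> Impractical?  Ordinary users would then still be using the old Big-5...
> So this version neither offers "all" the created characters the way
> CCCII, Unicode, etc. do, nor circulates as widely as Big-5; it seems it
> can only serve as a stopgap?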

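I do not quite understand what this passage means.  Wittern has already
explained the problems with CCCII very clearly (I was not so clear about
them before; I only thought, by theoretical reasoning, that it was not a
good idea, and now Wittern has supplied definite technical material showing
that it is not a good idea).  But I do not think Unicode can provide all
the created characters either.  It is, after all, a code table of fixed
size, 256*256, so there is an upper limit on the number of characters; and
this code has to be shared by the whole world, so it can hardly hand all of
its space over to us.  Also, when you say "stopgap", what are you referring
to?  The improved Big-5?  But did you not just say that ETen cannot bring
it out for use at present?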

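I very much agree with what Wittern's article (or the one written by the
other author; in any case the one you attached) says: data should be
divided into internal code (master data) and external code (user data).  If
you accept this idea, you can immediately pick whichever code is most
suitable for producing the master data, without even having to heed any
standard code.  My personal suggestion (for someone who is not taking part
in the work to offer so many suggestions does make me uneasy) is the same
as before: use CNS as far as possible, define the missing characters
yourselves, and mark your specially created characters with escape codes.
On the PC there are many character-creation programs available to you, and
on UNIX everyone can simply use the X Window bitmap or BDF format.  Once
the internal code has been established, its correspondence with an external
code is only a matter of a table.


>    May I ask: can documents entered in the CNS standard be viewed under
> BIG-5?

Both bytes of a standard CNS code are low bytes (0 .. 127); this is the ISO
standard.  Different CNS planes are selected with escape codes, so in
principle it is entirely different from Big-5.  On the PC, however, ETen
offers a "CNS code", by which it means shift-CNS (like shift-JIS).  It uses
only planes 1 and 2 of CNS: for plane 1 the first byte is shifted into
128...255, and for plane 2 both bytes are shifted.  Strictly speaking,
therefore, the CNS code that ETen provides is not the standard one either.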
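
The shifting just described can be sketched in Python as follows.  This is
only my reading of the description above (plane 1: first byte shifted;
plane 2: both bytes shifted) and has not been checked against ETen's actual
implementation; the function name is made up for the example.

```python
# Sketch of "shift-CNS" as described in the message above.
# Assumption: "shifting" a byte means setting its high bit (adding 128).

def shift_cns(plane: int, byte1: int, byte2: int) -> bytes:
    """Turn a 7-bit CNS code in plane 1 or 2 into its shifted two-byte form."""
    for b in (byte1, byte2):
        if not 0x21 <= b <= 0x7E:
            raise ValueError(f"byte {b:#04x} is outside 0x21..0x7E")
    if plane == 1:
        return bytes([byte1 | 0x80, byte2])          # first byte only
    if plane == 2:
        return bytes([byte1 | 0x80, byte2 | 0x80])   # both bytes
    raise ValueError("only planes 1 and 2 are covered by this scheme")


if __name__ == "__main__":
    print(shift_cns(1, 0x44, 0x21).hex(" "))
    print(shift_cns(2, 0x44, 0x21).hex(" "))
```

Because the second byte tells the two planes apart, no escape sequence is
needed, but characters from planes 3 and above cannot be represented at
all.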

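Moreover, the first two planes of CNS and Big-5 are not related by an
order-preserving one-to-one mapping, so even shift-CNS is not the same as
Big-5.  Last year I spent at least an afternoon working out the differences
between Big-5 and CNS planes 1 and 2 and pinning down where Big-5 is wrong.
I wrote a report and posted it to CCNET-L, but I have no time now to dig
out the old draft.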

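There is, however, a program called betty that can convert shift-CNS to
Big-5 (and vice versa) on the fly, but it runs only on UNIX.

>    May I ask how one can obtain a CNS Chinese system?

You have stumped me.  Apart from the shift-CNS on ETen I have not seen any
other implementations.  That is of course not a PD program.  I would guess
that the Institute for Information Industry and certain government agencies
must have such software, but it has no foothold at all in the market, so
ordinary users never see products of this kind.  On UNIX, I think I know
how to implement a shift-CNS Chinese environment together with CXTERM; as
for handling the escape codes for self-created characters, I think the
betty program could be modified to implement that.  The author of betty is
at National Tsing Hua University (I hope he has not graduated yet) and
could be asked for guidance.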

-Shann


/End of lin