cjk-latex.el and UTF-8 (was: Re: [Cjk] CJK and babel problem)
韓達耐 Danai SAE-HAN
danai.sae-han at skynet.be
Mon May 12 13:20:21 CEST 2003
From: jsbien at mimuw.edu.pl (Janusz S. Bień)
Subject: Re: [Cjk] CJK and babel problem
Date: 12 May 2003 06:02:25 +0200
Message-ID: <87vfwgke26.fsf at mimuw.edu.pl>
Dear all
jsbien> > There is something bizarre with Emacs though (version 21.3 with
jsbien> > Mule-UCS, on Debian/unstable): pasting for example a Traditional
jsbien> > Chinese name in a utf-8-unix encoded file and saving it, will save the
jsbien> > pasted characters in ``Chinese'', and Emacs remembers this in the
jsbien> > buffer. That file will have no problems recognizing the need for a
jsbien> > Big5 font.
jsbien>
jsbien> I don't think I understand you. You say that the pasted Chinese
jsbien> characters are saved as Chinese, but what about other characters? Are
jsbien> there any in the buffer?
jsbien>
jsbien> I am afraid that you are confusing the codings of files
jsbien> and buffers. Buffers are always coded in Emacs internal
jsbien> representation, `buffer coding system' means only the coding to be
jsbien> tried first when saving the file.
Heh, sorry if my explanation was too cryptic. I have put a few
example files at http://users.skynet.be/so000618/cjk/utf8_cjklatex.tar.bz2
to clarify my case.
Reproduction of the error:
As an example file I took that Buddhist text again because it is one
of my shorter documents. ;)
If you examine the file foxuejingshiyu.tex in emacs-mule encoding, you
will notice that it mostly contains Simplified Chinese characters and
PinYin. Line 32 is the only one containing Traditional Chinese
characters. I processed it with cjk-write-file, and had no problems
compiling it (make sure you change the font names before processing).
Now go back to Emacs, type `C-x RET f' and choose `utf-8-unix'
(I have Mule-UCS 0.84.99rc3 installed). Change the `coding' variable
in the Local Variables section below accordingly, save the file under
another name (e.g. foxuejingshiyu-utf-1.tex), and process it again
with cjk-write-file. Strangely enough, the .cjk file compiled without
errors, using all of my GB and Big5 font definition (.fd) files.
Now close that buffer, load foxuejingshiyu-utf-1.tex again and save
it under another name (e.g. foxuejingshiyu-utf-2.tex), or just copy
it. Process the copy with cjk-write-file, and you will probably get
either errors about missing fonts or a DVI that uses HBF (bitmap)
fonts (jsso12) for those characters that also exist in Kanji form.
Characters that exist only in simplified form, like yu3 [语] and
sha1 [杀], still use the fonts from the GB and Big5 .fd files
(cf. foxuejingshiyu-utf-2.ps).
Plausible cause:
Why did foxuejingshiyu-utf-1.cjk compile while foxuejingshiyu-utf-2.cjk
did not, or at least did not use the GB and Big5 font definitions the
way foxuejingshiyu-utf-1.cjk did?
My guess is that cjk-latex.el produces the .cjk file from Emacs'
internal buffer rather than from the stored file. That is why the
first .cjk file still used the GB and Big5 font definitions for every
character, and the second .cjk file didn't.
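The ambiguity behind this guess can be illustrated outside Emacs. A
UTF-8 file stores only Unicode code points, so once the buffer's
per-character charset information is gone, a character that exists in
several legacy charsets no longer says which font encoding it wants.
The following is only a sketch in Python, not part of cjk-latex.el or
Mule-UCS; the function name legacy_charsets and the choice of the
codecs gb2312, big5 and shift_jis as stand-ins for the GB, Big5 and
Kanji font encodings are my own assumptions.

```python
# Illustrative sketch only: which legacy charsets can hold a character?
# The codec names below are stand-ins (an assumption) for the GB, Big5
# and JIS/Kanji font encodings discussed in this message.
CANDIDATES = ("gb2312", "big5", "shift_jis")


def legacy_charsets(ch, candidates=CANDIDATES):
    """Return the candidate codecs that can encode the character ch."""
    usable = []
    for name in candidates:
        try:
            ch.encode(name)
        except UnicodeEncodeError:
            # This charset has no code for the character.
            continue
        usable.append(name)
    return usable


if __name__ == "__main__":
    # A character shared by Chinese and Japanese, then a simplified one.
    for ch in "中语":
        print(ch, legacy_charsets(ch))
```

A character such as 中 is reported for all three codecs, so something
has to pick one of them when the file is read back, whereas yu3 [语]
is not available in the Japanese codec at all. That would be consistent
with the simplified-only characters keeping their GB fonts in the
second .cjk file.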
Proposition:
What I propose is to emulate the behaviour of
foxuejingshiyu-utf-1.cjk: pretend that a certain part of the file
is written in a specific "encoding", so that we can use font
definitions other than those specified in the c70*.fd files. This
could be done with switches in the style of \CJKenc.
Example:
\newswitch{GB}
\CJKencfamily{GB}{gkai}
...
...
\newswitch{Bg5}
...
I personally think such a command would be useful, especially because
UTF-8 is slowly but steadily becoming the encoding of preference for
most people.
Yours sincerely
/Danai Sae-Han
韓達耐