[Cjk] conversion frontend

PILCH Hartmut phm at a2e.de
Tue Jan 18 14:00:32 CET 2000


Since I now write everything in UTF-8, every TeX file I write has to go
through a filter before being processed.  Here is the script I use to
handle this.  The file suffixes .uctex, .ujtex etc. could become standardised.
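
To illustrate how such suffixes could drive the choice of filter, here is a rough sketch. The function name infix_of is made up for this example and is not part of the script; it only assumes the naming scheme file.<infix>tex.

```shell
#!/bin/bash
# Hypothetical helper (not part of cjkxxconv): derive the infix from a
# file name that follows the scheme file.<infix>tex, e.g. paper.uctex.
infix_of () {
  local name=${1##*/}            # strip any directory part
  [[ $name == *.* ]] || return 1 # no suffix at all
  local ext=${name##*.}          # text after the last dot, e.g. "uctex"
  case "$ext" in
    (?*tex) printf '%s\n' "${ext%tex}";;  # "uctex" -> "uc"
    (*)     return 1;;                    # plain .tex or unrelated suffix
  esac
}
```

With this, something like `cjkxxconv "$(infix_of paper.uctex)" < paper.uctex > paper.tex` would pick the filter from the file name alone.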

I would have liked to use 'recode -d' to convert accented characters into
TeX notation, but unfortunately recode insists on converting everything and
either fails on the CJK characters or, when I give the -f option, omits them.
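
As a stopgap, a selective filter can be sketched with sed. This is only an assumption about what such a filter might look like, not something the script below does: it covers a handful of characters, the name accents_to_tex is invented for the example, and everything it does not know (including the CJK text) passes through untouched.

```shell
#!/bin/bash
# Hypothetical filter: rewrite a few accented Latin characters in TeX
# notation and leave all other bytes (e.g. UTF-8 encoded CJK) alone.
accents_to_tex () {
  sed -e 's/ä/\\"a/g' -e 's/ö/\\"o/g' -e 's/ü/\\"u/g' \
      -e 's/ß/{\\ss}/g' -e "s/é/\\\\'e/g"
}
```

It reads standard input and writes standard output, so it can be put in front of the conversion step in a pipeline.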

#!/bin/bash
usage ()
{
cat <<EOF 
cjkxxconv v 0.1
by Hartmut Pilch <phm at a2e.de>

   cjkxxconv $INFIX < file.${INFIX}tex > file.tex

convert from an editable TeX file to a file that can be compiled with
Werner Lemberg's CJK TeX macro package, depending on the INFIX switch ($INFIX).

Valid values of the INFIX switch ($INFIX) are

	 b5:	Bg5
	 sj:	SJIS
	 cf:	CEF
	 c5:	CEF5
	 cs:	CEFS
	 ub:	UTF-8 encoded Bg5
	 us:	UTF-8 encoded SJIS
	 uc:	UTF-8 encoded GB
	 uj:	UTF-8 encoded JIS
	 uk:	UTF-8 encoded KS

This routine is recommended for use in makefiles that specify the
dependency of TeX documents (generating everything from editable TeX
to PS, EPS, G3-Fax etc based on what was modified last).  Since LaTeX
expects *.tex files, whenever LaTeX needs something different from
what you want to edit, I suggest that you name your editable file
*.${INFIX}tex and use a makefile-based menu system like my TEXM to
create and view your target document.  TEXM uses cjkxxconv.

cjkxxconv is only a frontend to various conversion programs created
by Werner Lemberg as well as the iconv utility, which is present on
GNU/Linux systems starting from GLibC version 2.1, and on most
commercial Unices.
EOF
}

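# UTF-8 input: convert to the legacy encoding first, then run the CJK preprocessor.
ub5conv () { iconv -f utf-8 -t big5 "$@" | bg5conv; }
usjconv () { iconv -f utf-8 -t sjis "$@" | sjisconv; }

# msgexit CODE MESSAGE...: remember the message for the EXIT trap and exit with CODE.
msgexit () { local E=$1; shift; MSG="$*"; exit "$E"; }
# EXIT trap: report on stderr only, so that stdout carries nothing but the converted text.
exitmsg () { local E=$?; if test "$E" = 0; then echo OK >&2; else { echo KO; echo "$MSG"; usage; } >&2; fi; exit "$E"; }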

trap exitmsg EXIT

INFIX=${1:-xx};
shift || msgexit 10 "need an infix";

case "$INFIX" in 
  (b5) CONV=bg5conv;; 
  (sj) CONV=sjisconv;; 
  (cf) CONV=cefconv;; 
  (c5) CONV=cef5conv;; 
  (cs) CONV=cefsconv;; 
  (ub) CONV=ub5conv;; 
  (us) CONV=usjconv;; 
  (uc) CONV="iconv -f utf-8 -t euc-cn";;
  (uj) CONV="iconv -f utf-8 -t euc-jp";;
  (uk) CONV="iconv -f utf-8 -t euc-kr";;
  (*) msgexit 8 "wrong infix $INFIX";;
esac;

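# $CONV is left unquoted on purpose so that it splits into command and options;
# "$@" keeps file names with spaces intact, and no eval is needed.
$CONV "$@";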
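
The converter selection above could alternatively keep each command in a bash array, which avoids relying on word splitting altogether. The following is only a sketch under that assumption: the function name set_conv is invented here, and only the three plain iconv infixes are shown.

```shell
#!/bin/bash
# Sketch: store the converter command as an array, one element per word.
# set_conv is a hypothetical name; only the plain-iconv infixes are covered.
set_conv () {
  case "$1" in
    (uc) CONV=(iconv -f utf-8 -t euc-cn);;
    (uj) CONV=(iconv -f utf-8 -t euc-jp);;
    (uk) CONV=(iconv -f utf-8 -t euc-kr);;
    (*)  return 8;;   # same exit code the script uses for a wrong infix
  esac
}
```

The command would then be run as `set_conv uc && "${CONV[@]}" "$@"`, with every option and file name passed as exactly one argument.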




