Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!rphroy!ox.com!msen.com!emv
From: jwb@monu6.cc.monash.edu.au (Jim Breen)
Newsgroups: comp.archives
Subject: [sci.lang.japan] New Version of JDIC released
Message-ID: <1991Jun20.132608.28851@ox.com>
Date: 20 Jun 91 13:26:08 GMT
Sender: emv@msen.com (Edward Vielmetti, MSEN)
Reply-To: jwb@monu6.cc.monash.edu.au (Jim Breen)
Followup-To: sci.lang.japan
Organization: Monash University, Melbourne, Victoria, Australia
Lines: 293
Approved: emv@msen.com (Edward Vielmetti, MSEN)
X-Original-Date: 18 Jun 91 00:53:48 GMT
X-Original-Newsgroups: sci.lang.japan

Archive-name: text/japanese/jdic/1991-06-18
Archive: monu6.cc.monash.edu.au:/pub/Nihongo/jdic*.zoo [130.194.32.106]
Original-posting-by: jwb@monu6.cc.monash.edu.au (Jim Breen)
Original-subject: New Version of JDIC released
Reposted-by: emv@msen.com (Edward Vielmetti, MSEN)

Version 1.3 of JDIC (Simple Japanese/English Dictionary Display
Program) is released, and is now in the pub/Nihongo directory on
monu6.cc.monash.edu.au (130.194.32.106). (This directory also contains
a copy of MOKE1.1)

V1.3 fixes known bugs, interworks properly with MOKE's environment and
includes immediate romaji -> kana translation when searching by
yomikata.

I will NOT be emailing the distribution; there is so much trouble with
mungeing mailers that it does not seem to be worth the effort.

For those interested, I am including the .doc file below:

----------------------------------------------------------------

                        J D I C

        Simple English Japanese Dictionary Display
        ==========================================

                Version 1.3 (June 1991)

Introduction
------------

This program provides a simple English/Japanese (kana & kanji) display
of selected entries of a dictionary file.
While it will work (more or less) with any text file containing a mix
of Japanese and English words, it has been designed specifically to
operate on a dictionary in the "EDICT" format used by the MOKE (Mark's
Own Kanji Editor) Japanese text editor.

The executable code and documentation of JDIC is hereby released to
the "public domain". All usage of this program is at the user's risk,
and there is no warranty on its performance.

All the Japanese displayed is in kana and kanji, so if you cannot read
at least hiragana and katakana, this is not the program for you.

Installation
------------

This program is distributed as a "zoo" archive (jdic13.zoo) containing
the following files:

    jdic.exe    (the executable)
    jdic13.doc  (this documentation file)
    edict       (a sample Japanese/English dictionary file)
    *.bgi       (Borland Graphics drivers for various cards)

The files will need to be unpacked and copied into a directory on your
hard disk. If you are storing them in the same directory as your MOKE
files (e.g. \kanji) be careful not to overwrite MOKE's "edict" file.

In addition, the 16-bit JIS font files "k16jis1.fnt" and "k16jis2.fnt"
must be in this directory. These latter files are not included in this
distribution. If you use MOKE, you have them already. If not, you will
need to track them down at one of the FTP sites.

The executable (jdic.exe) will need to be stored in a directory on
your path if you wish to invoke JDIC from any directory. The simplest
approach is to add \kanji to your path.

The following environment variables may be set (note that they are
exactly the same environment variables used by MOKE.)

    bgi       (the directory containing the bgi files. E.g. c:\tc or
              c:\kanji. If this is not present, the bgi files must be
              in the directory in which JDIC is invoked.)

    mokerc    (the directory containing the moke.rc file. E.g.
              c:\kanji. If this is not present, the current directory
              will be searched for a file called moke.rc, and the
              directory details extracted.)
    jgraphic  (set this to ATT400 if you have an AT&T high-resolution
              card. Otherwise it will default to CGA. NB: MOKE does
              not use this variable.)

If you wish to operate JDIC from other directories, you must have a
file "moke.rc" containing the following line:

    kanjipath directory-path      (e.g. C:\KANJI)

to tell the program the location of the control and font files. The
environment variable "mokerc" must be used to specify the directory
containing "moke.rc". (If you use MOKE, you will have a moke.rc file
already.)

Operation
---------

JDIC requires a PC or AT with a graphics card. It has been written
using Turbo C 2.0, and has been tested on VGA, CGA, ATT and HERC
cards. Auto-detection is used to determine the type of graphics card.

The invocation of JDIC is:

    jdic            [uses a dictionary called "edict"]
or
    jdic dicname    [where dicname is the name of your dictionary]

The default dictionary "edict" is, of course, the name of MOKE's
English/Japanese dictionary file. It will be located in the directory
specified in your moke.rc file. If you use an alternative dictionary,
it can be in any directory.

JDIC also needs an index file ".jdx". If it is not present it will be
created. JDIC saves the length of the dictionary file and the JDIC
version in the .jdx file, and if it detects that either has changed,
it will insist on recreating the index file. Otherwise the dictionary
look-up will be useless.

Operation is very simple. After loading the dictionary, index and font
files, the full-screen working window is displayed with the "Enter
Search String:" prompt. Type a few letters from the *start* of the
word(s) you are seeking. JDIC does not match on strings in the middle
of words. The scan is case-insensitive.

A multi-line display is produced for all the matches against the
string. The display format is:

    Japanese(in kanji and/or kana) [yomikata in kana] english1, english2, etc.

A line is only displayed once per search, regardless of the number of
hits.
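
The search behaviour described above (match only from the start of a
word, ignore case, show each line once) can be sketched roughly as
follows. This is an illustration in Python, not JDIC's code: JDIC is
written in Turbo C and its source is not part of this distribution, so
the names build_index, prefix_search and SAMPLE are hypothetical, and
the sample entries are made up for the example. Note also that Python
sorts the Japanese strings by code point rather than in EUC order.

```python
import bisect

def build_index(lines):
    """Return a sorted list of (lowercased word, line number) pairs."""
    index = []
    for lineno, line in enumerate(lines):
        # Treat the "/" and "[ ]" delimiters as spaces so every word is indexed.
        for ch in "/[]":
            line = line.replace(ch, " ")
        for word in line.split():
            index.append((word.lower(), lineno))
    index.sort()
    return index

def prefix_search(index, lines, key):
    """Return each line containing a word that starts with key, once only."""
    key = key.lower()
    # The 1-tuple (key,) sorts before every (word, lineno) whose word
    # starts with key, so this finds the first candidate.
    start = bisect.bisect_left(index, (key,))
    shown, hits = set(), []
    for word, lineno in index[start:]:
        if not word.startswith(key):
            break  # the index is sorted, so no later word can match
        if lineno not in shown:
            shown.add(lineno)
            hits.append(lines[lineno])
    return hits

# Made-up sample entries in the EDICT layout.
SAMPLE = [
    "猫 [ねこ] /cat/",
    "犬 [いぬ] /dog/",
    "辞書 [じしょ] /dictionary/",
    "子犬 [こいぬ] /puppy/pup/",
]
INDEX = build_index(SAMPLE)
```

With this sample, searching for "pu" hits both "pup" and "puppy" but
returns the entry once, and searching for "いぬ" returns only the
"dog" entry, because "こいぬ" contains the string in the middle of the
word rather than at the start.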
After a search, a further prompt occurs at the bottom of the screen
giving you the option of quitting (Q), requesting another search (A)
or, if there is still more information to display, requesting the next
screen-full (M).

You will notice an "(A)" in the top left-hand corner of the screen.
This is to indicate you are entering search strings in ascii (i.e. in
English). If you press F3 before entering a string, you toggle between
(A)scii, (H)iragana and (K)atakana. (Why F3? Well, that is the key
that MOKE uses for this function.)

To enter a search string in kana, type it in romaji and it will be
converted to kana as you type. The romaji->kana translation is almost
identical to that used in MOKE, i.e. for a small "tsu" you can type
either a double consonant, e.g. "shippai", or "t-", e.g. "shit-pai",
and for "n" you can type "n'" if necessary (e.g. as in "kon'yaku").
Most of the time just typing ordinary Hepburn or kunrei romaji works.
Note that the romaji must follow the kana style for long vowels. Tokyo
must be toukyou, NOT tookyoo.

The matching of kana strings is insensitive to whether they are
katakana or hiragana. The ONE difference between them is that typing a
"-" in hiragana gets a "u", and in katakana gets a "-", just as in
MOKE.

The display is in "dictionary" order for the words matched, i.e.
alphabetical for the ascii search, and EUC order for the kana search.
EUC order is very close to the "gojun" kana order in Japanese
dictionaries except that it separates the syllables with nigori and
maru.

There is also an "Unlimited Display Mode" which is invoked by pressing
F1 before or during the entering of the search string. In this mode
you will just keep scrolling through the dictionary instead of
stopping when you run out of matching strings. Also in this mode
entries are displayed every time there is a match in the index table
(normally an entry is displayed once only.) This mode is useful for
doing maintenance on the dictionary, and for just browsing.
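
One way to get kana matching that ignores the hiragana/katakana
distinction, together with the EUC ordering mentioned above, is to
fold katakana onto hiragana at the byte level before comparing. In
EUC-JP, hiragana use the lead byte 0xA4 and katakana the lead byte
0xA5, and corresponding kana share the same second byte. The sketch
below is an assumption about how such folding could be done, written
in Python for brevity; it is not taken from JDIC, and fold_kana and
kana_key are hypothetical names.

```python
def fold_kana(euc):
    """Map katakana to hiragana in an EUC-JP byte string."""
    out = bytearray()
    i = 0
    while i < len(euc):
        b = euc[i]
        if b == 0x8F:                   # three-byte supplementary character
            width = 3
        elif b == 0x8E or b >= 0xA1:    # half-width kana or two-byte character
            width = 2
        else:                           # plain ASCII
            width = 1
        chunk = bytearray(euc[i:i + width])
        # Katakana 0xA5A1..0xA5F3 have hiragana twins at 0xA4A1..0xA4F3.
        # The last three katakana (0xA5F4..0xA5F6) have none and stay as-is.
        if len(chunk) == 2 and chunk[0] == 0xA5 and 0xA1 <= chunk[1] <= 0xF3:
            chunk[0] = 0xA4
        out += chunk
        i += width
    return bytes(out)

def kana_key(text):
    """Sort/match key: EUC-JP bytes with katakana folded to hiragana."""
    return fold_kana(text.encode("euc_jp"))
```

Sorting words with kana_key as the key gives plain EUC byte order, in
which a syllable with nigori (e.g. "が") falls between "か" and "き"
instead of being merged with its unvoiced form.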
Dictionary
----------

Clearly to be of any use, JDIC must have a reasonably good dictionary.
Unfortunately there are no good machine readable dictionary files in
the public domain yet. Included with this distribution is the tiny
EDICT file from MOKE 1.1 (the shareware version). There is a bigger,
but still rather limited EDICT supplied with the MOKE 2.0 release;
however, Mark Edwards, the author, has not placed it in the public
domain. JDIC's author is compiling a supplement to MOKE (2.0)'s EDICT
which will fill in the gaps, but unless you buy MOKE 2.0 (after all,
it's only $US50) you will miss out on a lot.

(If anyone feels like contributing to a public domain dictionary in
EDICT format, the author is willing to collate and distribute it. Just
email the pieces.)

The dictionary file must use the "EUC" coding for Japanese characters.
MOKE's EDICT does this, so that was the coding adopted in JDIC. Files
using JIS codings can be converted to EUC using MOKE itself, or Ken
Lunde's "JIS.C" program.

The format of each entry of EDICT is:

    Japanese [yomikata] /english1/english2/..../

If the word is in kana alone, the yomikata is omitted.

Technical
---------

JDIC holds the complete dictionary in RAM, along with the first 3490
bitmaps of the JIS character set and the index table. The index table
contains an entry for each word in the dictionary, sorted in
alpha/kana order. This enables a fast search to be done, and for the
display to be in alphabetical order by keyword. Common words like:
"of", "to", "the", etc. and grammatical terms like: "adj", "vi", "vt",
etc. are not indexed.

If a kanji is required that is not in the ~3000 most common ones, it
is read from disk into a circular cache buffer. This happens rarely.

JDIC can cope with dictionaries up to about 180 kbytes (MOKE's EDICT
is about 60k). If a larger dictionary ever becomes available, another
version could operate with the dictionary on disk.
The parsing and sort to set up the index table would be slower, but
the searching will still be quite fast.

Changes in Version 1.1
----------------------

o   ATT graphics card handling.

o   fixes to the parsing of kanji/kana strings. The result is that the
    .jdx file is about 20% larger than in V1.0.

Changes in Version 1.2
----------------------

o   fixes to the kana->romaji code to handle "nyu" properly.

o   facility to use dictionaries other than "edict".

o   Unlimited Display Mode.

Changes in Version 1.3
----------------------

o   immediate romaji->kana conversion (suggested by David Cowhig).

o   examination of the "bgi" and "mokerc" environment variables, and
    the "moke.rc" control file.

It Doesn't Work!
----------------

Oh dear. If you do not get the introductory message, you probably have
a corrupted .exe. Try and get a clean copy. Also your environment
might have trouble with the output of a Turbo C 2.0 compilation/link.

If you actually get started, but cannot find anything, even when you
put "a" as a search key, delete your .jdx file and start again. If it
still doesn't work, mail the author a sample of your dictionary.

Acknowledgements
----------------

A message from the author:

I wrote this program to gain experience in handling and displaying the
Japanese character set, and to exploit the dictionary that came with
my copy of MOKE. I also wanted to brush up my C skills. I make no
claims for it, but I am pleased with how it turned out. I will
consider releasing the source (if anyone is actually interested in it)
at a later date. I welcome suggestions, comments and constructive
criticism.

I wrote about two-thirds of this program. Great lumps of it were
lifted with minor modifications from "KD" (Kanji Driver), which was
written by Izumi Ohzawa at Berkeley, in particular the JIS handling
module (kjis.c) which was a port of "jis.pas" by Seiichi Nomura and
Seke Wei.

Ken Lunde's "japan.inf" and his elegant "jis.c" explained the workings
of EUC and old/new JIS codes.
Mark Edwards' MOKE remains the tour de force in this field, and an
inspiration for us all. I regard JDIC as a humble and minor accessory
to MOKE. (I use tables lifted from two of the ".hlp" files in MOKE to
drive the romaji->kana code.)

Jim Breen
Department of Robotics & Digital Technology
Monash University
Melbourne, Australia
(jwb@monu6.cc.monash.edu.au)

May-June 1991
--
Jim Breen                       AARNet:jwb@monu6.cc.monash.edu.au
Department of Robotics & Digital Technology. Monash University.
PO Box 197 Caulfield East VIC 3145 Australia
(ph) +61 3 573 2552 (fax) +61 3 573 2745 JIS:$B%8%`!!%V%j!<%s(J

-- comp.archives file verification
monu6.cc.monash.edu.au
-rw-r--r--  1 886      729         88014 Jun 13 17:35 /pub/Nihongo/jdic13.zoo
found jdic ok
monu6.cc.monash.edu.au:/pub/Nihongo/jdic*.zoo