Xref: utzoo comp.text:8021 comp.text.tex:5447
Path: utzoo!mnetor!tmsoft!torsqnt!news-server.csri.toronto.edu!cs.utexas.edu!uwm.edu!wuarchive!mit-eddie!bloom-beacon!eru!hagbard!sunic!kth.se!cyklop.nada.kth.se!news
From: grodan@cyklop.nada.kth.se (Mats G L|fdahl)
Newsgroups: comp.text,comp.text.tex
Subject: Re: MS-Word <--> (La)Tex???
Message-ID:
Date: 13 Feb 91 09:51:21 GMT
References: <37343@netnews.upenn.edu> <6787@dnlunx.pttrnl.nl>
Sender: news@nada.kth.se (Mr News)
Organization: Royal Institute of Technology, Stockholm, Sweden
Lines: 69
In-reply-to: stan@dnlunx.pttrnl.nl's message of 11 Feb 91 23:16:42 GMT

stan@dnlunx.pttrnl.nl (Stan van de Burgt) writes:

   matsuda@linc.cis.upenn.edu (Kenjiro Matsuda) writes:
   >Hi, sorry for cross-posting this lazy novice question. Does anyone
   >know any programs that can convert the MS-Wordly formatted files to
   >LaTex format ones automatically and vice versa, maybe on Mac? I was
   >just informed that there is one that does this sort of thing between
   >MacWrite <--> troff, so I gather there must be a similar kind of
   >program somewhere in the cyberspace.

   I've seen this question before on the mac groups and on this group.
   Up to now I've seen no answer.

   I'd like to know more about this question. Are you looking for such
   a utility just for printing purposes, i.e. should the output of
   (La)TeX just resemble the Word output as closely as possible? Or
   should the LaTeX source be as well-structured and readable as
   possible? The latter is not as trivial as you might think! Also, how
   should things like pictures, tables, formulas, etc. be processed?

I've been looking for this kind of program, too. I would like it to be
as smart as possible about logical constructs. It doesn't need to
finish the job, just to make the translation process easier for me.

I think the proper medium to start from is a file output from MS-Word
in the Rich Text Format (RTF, called "interchange format" in the
MS-Word menu).
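
To illustrate why RTF looks like a workable starting point: an RTF file
is plain text made of groups in braces, control words such as \par or
\i (optionally followed by a number), and ordinary text, so it can be
split into tokens with very little code. What follows is only a rough
sketch in Python, not a real converter. The names TOKEN and tokenize
are my own, and mapping \'hh escapes straight to Latin-1 characters is
an assumption (files written on a Mac use a different character set).

```python
import re

# One alternative per kind of RTF token (hypothetical helper, not a full parser).
TOKEN = re.compile(
    r"\\([a-zA-Z]+)(-?\d+)? ?"   # control word, optional number, optional delimiter space
    r"|\\'([0-9a-fA-F]{2})"      # hex-escaped character, e.g. \'f6
    r"|\\([^a-zA-Z])"            # control symbol, e.g. \{ or \\
    r"|([{}])"                   # group open or close
    r"|([^\\{}]+)"               # run of ordinary text
)


def tokenize(rtf):
    """Yield (kind, value, parameter) tuples for an RTF string."""
    for m in TOKEN.finditer(rtf):
        word, param, hexchar, symbol, brace, text = m.groups()
        if word is not None:
            yield ("control", word, int(param) if param is not None else None)
        elif hexchar is not None:
            # Assumption: treat the byte value as a Latin-1 code point.
            yield ("char", chr(int(hexchar, 16)), None)
        elif symbol is not None:
            yield ("symbol", symbol, None)
        elif brace is not None:
            yield ("group", brace, None)
        else:
            # Line breaks in the RTF source carry no meaning, so drop them.
            text = text.replace("\r", "").replace("\n", "")
            if text:
                yield ("text", text, None)


if __name__ == "__main__":
    for token in tokenize(r"{\rtf1\ansi Hello {\i world}\par}"):
        print(token)
```

A translator would then walk this token stream, keeping track of which
group it is in, and decide per control word whether to emit LaTeX or to
throw the control word away.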

The basic capabilities of the translation program should be something
like:

1) Identifying paragraphs, and if possible items.
2) Finding TeX/LaTeX control sequences for special characters.
3) Identifying section headings, if possible with the proper
   section/subsection/subsubsection nesting.
4) Identifying figures, producing figure environments with captions.
5) Identifying tables, producing table environments with captions, and
   if possible some rudimentary table with the entries in the right
   positions, and in math mode if needed.
6) Finding and replacing RTF logical constructs with the appropriate
   LaTeX logical constructs.
7) Finding font changes, especially to italics, that could be
   translated into {\em ...}.
8) Doing its best with formulas. Apart from translating special
   characters, as mentioned above, it would be nice if it could
   distinguish between inline formulas and displayed ones. Fractions
   and roots might also be possible to handle correctly.
9) Deleting all other MS-Word control sequences.

Figures and complicated tables could be left out. They can be input by
hand in LaTeX, included as PostScript files via \special, or handled in
any other way the user chooses. With long tables, however, there would
be no harm done if the program did its best. At least one would not
have to type in all the entries in a long table a second time. If
translating into the correct table structure is too difficult for the
program, just lines with all the entries of a row, preceded by %
characters, would be a great help.

If you or anyone else were to write such a program, I'd be very
interested in the result, and would be happy to assist with testing.
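
To make items 2 and 7 and the table fallback a bit more concrete, here
is a small sketch in Python of what I have in mind. It is only an
illustration under my own assumptions: the function names
(escape_latex, emphasize, table_fallback) are invented for this
example, the table is assumed to have already been extracted as a list
of rows of cell strings, and the \textbackslash, \textasciitilde and
\textasciicircum replacements assume a LaTeX version that provides
those commands.

```python
# Characters that LaTeX treats specially, with a replacement for each.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "{": r"\{",
    "}": r"\}",
    "$": r"\$",
    "&": r"\&",
    "#": r"\#",
    "%": r"\%",
    "_": r"\_",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text):
    """Replace special characters one at a time, so nothing is escaped twice."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def emphasize(text):
    """Wrap a run of italic text in {\\em ...}, escaping its contents."""
    return r"{\em " + escape_latex(text) + "}"


def table_fallback(rows):
    """Emit each table row as a commented-out line with all its entries.

    The user can later remove the % characters and wrap the lines in a
    tabular environment by hand.
    """
    lines = []
    for row in rows:
        cells = " & ".join(escape_latex(cell) for cell in row)
        lines.append("% " + cells + r" \\")
    return "\n".join(lines)


if __name__ == "__main__":
    print(escape_latex("50% of R&D_costs"))
    print(emphasize("a_b"))
    print(table_fallback([["Star", "Mag"], ["Vega", "0.03"]]))
```

Escaping the cell contents even inside the commented lines means the
entries are already valid LaTeX once the % characters are removed.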

--
-----------------------------------------------------------------------------
Mats Lofdahl, Stockholm Observatory, S-133 36 Saltsjobaden | +46 - 8 16 44 75
-----------------------------------------------------------------------------
Internet: lofdahl@astro.su.se | Bitnet: grodan@sekth | Sunet: royacs::lofdahl
-----------------------------------------------------------------------------