Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!thunder.mcrcim.mcgill.edu!snorkelwacker.mit.edu!bloom-beacon!eru!hagbard!sunic!mcsun!ukc!mucs!cns!usenet
From: rogersh%p2h@uk.ac.man.cs
Newsgroups: comp.sys.acorn,eunet.micro.acorn
Subject: Re: DOS/UNIX/etc <-> ADFS Filename Mapping
Message-ID: <1991Jan16.103434.797@cns.umist.ac.uk>
Date: 16 Jan 91 10:34:34 GMT
References: <7850@castle.ed.ac.uk>
Sender: usenet@cns.umist.ac.uk (News System)
Organization: Murder Inc.
Lines: 69

In article <7850@castle.ed.ac.uk> as@castle.ed.ac.uk (A Stevens) writes:
>One serious complication when porting code or file-structures
>to and from the Arch is its unfortunate combination
>of short (10 char) filenames and an absence of filename extensions.
>
>To convert filenames from DOS (effectively 11 chars if you
>count the extension) or UNIX or VMS some kind of transformation
>has to be performed. The catch is not that this is impossible,
>just that there are so many different ways to do it,
>so programs tend not to work in the same way.

I have already had to tackle this problem. Basically there are several
stages to converting a UNIX filename (or an MSDOS one, since apart from
the delimiter there is no difference) to ADFS.

1) Sort out the directory info. This means dealing with things like:

        .././../.info
        /.././../info
        /etc/../tmp/info
        etc.

   At this stage it is also necessary to do something about characters
   in the filenames that ADFS does not allow. The method I use is to
   convert all filenames with certain common UNIX/MSDOS single character
   extensions (e.g. x.c, x.s, x.h) to c.x, s.x, h.x (i.e. file x in
   directory c, s or h), and otherwise just convert the '.' to a '_'
   along with all other ADFS-illegal characters ('#$@' etc.). E.g.

        info.Z                  @.Z.info
        info.tmp                @.info_tmp
        /.././etc/../tmp/info   $.tmp.info
        tmp/sort.c              @.tmp.c.sort
        ../Makefile_src         @.^.Makefile_src
        ./124$$.etc             @.124___etc

2) Sort out long names in the resulting path. This is done by first
   removing vowels (but never the first letter). E.g.
        Makefile_tmp            Mkfile_tmp
        Afternoon_test_data     Aftrnn_tst_dt

   Then, if the component is still too long, chop out sufficient
   characters from the 2nd character onwards:

        Aftrnn_tst_dt           Ann_tst_dt

   In almost all cases this preserves uniqueness among multiple
   filenames with the same root and different extensions, and it also
   preserves the maximum meaning in the filename because the vowels
   are removed selectively first.

3) Check if we ought to create a directory for a single character
   suffix filename. If the file is to be opened for creation (perhaps
   implicitly) then we need to create the directory. Otherwise the
   access operation merely fails and we don't need to bother (indeed
   if we did it would waste disk space).

Unfortunately 3) implies that the conversion needs to be integrated into
a set of common file access routines. In unixlib it is called by open(),
creat(), stat(), etc. and performs filename conversion transparently to
the user. There is a global flag which can turn conversion on and off,
and the routine can also be called directly. Starting any unixlib program
with the environment variable UNIX set automatically turns conversion on;
otherwise it is off by default.

[ H.J.Rogers  (INTERNET: rogersh%p4%cs.man.ac.uk@cunyvm.cuny.edu)       ]
[   ,_,       (BITNET/EARN: rogersh%p4%cs.man.ac.uk@UKACRL.BITNET)      ]
[ :-(_)-o     (UUCP: ...!uunet!cunyvm.cuny.edu!cs.man.ac.uk!p4!rogersh) ]
[  _} {_      (JANET: rogersh%p4@uk.ac.man.cs)                          ]
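
For anyone who wants to experiment with stages 1 and 2 described above,
here is a rough sketch in Python. It is not the unixlib code, and it
leaves out the directory handling ('..', '@', '$', '^') and stage 3
entirely. All the names in it (convert_leaf, shorten_component,
SWAP_SUFFIXES, ILLEGAL) are made up for illustration. It assumes that
vowel removal works left to right and stops as soon as the name fits in
10 characters, which is what the Makefile_tmp example suggests. The
suffix and illegal-character sets hold only what appears in the examples,
so they are certainly incomplete.

```python
ADFS_MAX = 10                         # name length limit quoted in the thread
VOWELS = set("aeiouAEIOU")
SWAP_SUFFIXES = {"c", "s", "h", "Z"}  # assumption: only the suffixes in the examples
ILLEGAL = set(".#$@")                 # assumption: partial list, the post says "etc."


def sanitize(name):
    """Replace every character listed in ILLEGAL with '_'."""
    return "".join("_" if ch in ILLEGAL else ch for ch in name)


def convert_leaf(leaf):
    """Stage 1 for a single name: return the ADFS components it maps to."""
    stem, dot, ext = leaf.rpartition(".")
    if dot and stem and ext in SWAP_SUFFIXES:
        # x.c becomes c.x, i.e. file x inside directory c
        return [ext, sanitize(stem)]
    return [sanitize(leaf)]


def shorten_component(name, limit=ADFS_MAX):
    """Stage 2: squeeze one path component into at most `limit` characters."""
    chars = list(name)
    # Pass 1: drop vowels left to right, never the first character,
    # stopping as soon as the name fits.
    i = 1
    while len(chars) > limit and i < len(chars):
        if chars[i] in VOWELS:
            del chars[i]
        else:
            i += 1
    # Pass 2: still too long, so cut the surplus out from the 2nd character on.
    excess = len(chars) - limit
    if excess > 0:
        del chars[1:1 + excess]
    return "".join(chars)
```

With these definitions, ".".join(["@"] + convert_leaf("sort.c")) builds
the string @.c.sort, and shorten_component can then be applied to each
component of a converted path in turn.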