Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!uunet!ogicse!iwarp.intel.com!news
From: merlyn@iwarp.intel.com (Randal L. Schwartz)
Newsgroups: comp.lang.perl
Subject: Re: Simplifying paths. A hairy regex
Message-ID: <1991Mar2.202521.29658@iwarp.intel.com>
Date: 2 Mar 91 20:25:21 GMT
References: <11249@cae780.csi.com>
Sender: news@iwarp.intel.com
Reply-To: merlyn@iwarp.intel.com (Randal L. Schwartz)
Organization: Stonehenge; netaccess via Intel, Beaverton, Oregon, USA
Lines: 61
In-Reply-To: muir@cae780.csi.com (David Muir Sharnoff)

In article <11249@cae780.csi.com>, muir@cae780 (David Muir Sharnoff) writes:
| It's 10pm, everyone else has gone home, I just have to
| share this... The most twisted regex that I've had to build.
|
| I wanted to get rid of extra junk from unix filenames.
|
| To that end, I transform:
|
|     a//c      -> a/c
|     a/./c     -> a/c
|     a/c/.     -> a/c
|     a/b/../c  -> a/c
|     a/c/d/..  -> a/c
|
| Note: I do not transform //a/c -> /a/c as that would break Apollo filenames.
|
| ------------ perl starts here ----------
| sub simplify
| {
|     for $p (@_) {
|         while($p =~ s!(/\.(/))|(^(.+/)/)|(/\.$)|([^/.]+/\.\./)|(/[^/.]+/\.\.$)!\2\4!) {;}
|     }
| }
| ------------ perl ends here ----------

I'd deal with it more like what it is... a series of commands to execute:

    sub simplify {
        local(@source,@dest);
        local($body);

        for $p (@_) {
            @source = split(/\//, $p);
            @dest = ();
            # start each path with an empty stack
            $body = 0;
            # ($body > 0) means have seen non-null entries
            for (@source) {
                push(@dest,$_) unless $body && /^\.{0,2}$/;
                # don't push body entries that are null,
                # single dot, or double dot
                pop(@dest) if $body && /^\.\.$/;
                # double dot in body means back up one
                $body++ if length;
                # enter the body after initial null entries
            }
            $p = join("/",@dest);
        }
    }

Hmm. After writing this code, I know it breaks on "../..". Yuck. But
I'm late for my next appointment. The flash I just had (if someone
wants to make it work) is to have an inviolate "prefix string"
consisting of all the null and ".."
entries from the head of the string as in /^((\.\.)?\/)+/, and then
push and pop the rest as above. When you reassemble the string, you
glue together the prefix and the stack. Maybe I'll try finishing that
off later this evening.

print "Just another Perl hacker", # on a tight schedule today... durn.
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Intel: putting the 'backward' in 'backward compatible'..."====/
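
The "prefix string" idea from the post could be sketched roughly as
follows. This is only an illustration under stated assumptions, not the
finished version the post mentions: the name simplify_path is
hypothetical, it returns a new string instead of modifying its
arguments in place, and it uses later Perl idioms (my, use strict)
rather than the local() style of the code above. Two behaviours are
guesses at what is wanted: a ".." that finds nothing to back up over is
kept on the stack, and a path that collapses to nothing comes back as
".".

```perl
use strict;
use warnings;

# Hypothetical helper: peel off an inviolate prefix, then push and pop
# the remaining entries on a stack, then glue prefix and stack together.
sub simplify_path {
    my ($path) = @_;
    return $path if $path eq '';

    # The prefix is every leading "/" or "../", taken verbatim, so that
    # "//a/c" keeps its double slash and "../.." keeps its leading "..".
    my $prefix = '';
    $prefix = $1 if $path =~ s{^((?:(?:\.\.)?/)+)}{};

    my @stack;
    for my $entry (split m{/}, $path) {
        # Null entries and single dots carry no information in the body.
        next if $entry eq '' || $entry eq '.';

        if ($entry eq '..' && @stack && $stack[-1] ne '..') {
            # Double dot backs up over the previous real entry.
            pop @stack;
        } else {
            # Assumption: a ".." with nothing to back up over is kept.
            push @stack, $entry;
        }
    }

    # Assumption: a relative path that collapses entirely becomes ".".
    return '.' if $prefix eq '' && !@stack;
    return $prefix . join('/', @stack);
}
```

Under these assumptions, simplify_path("../..") stays "../..",
simplify_path("a/b/../c") gives "a/c", and simplify_path("//a/c") is
left alone, as the original article wanted for Apollo filenames.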