Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!know!zaphod.mps.ohio-state.edu!math.lsa.umich.edu!math.lsa.umich.edu!emv
From: hakanson@ogicse.ogi.edu (Marion Hakanson)
Newsgroups: comp.archives
Subject: [perl] Re: Quoting and Splitting
Message-ID: <1990Sep25.221435.3803@math.lsa.umich.edu>
Date: 25 Sep 90 22:14:35 GMT
Sender: emv@math.lsa.umich.edu (Edward Vielmetti)
Reply-To: hakanson@ogicse.ogi.edu (Marion Hakanson)
Followup-To: comp.lang.perl
Organization: Oregon Graduate Institute (formerly OGC), Beaverton, OR
Lines: 32
Approved: emv@math.lsa.umich.edu (Edward Vielmetti)
X-Original-Newsgroups: comp.lang.perl

Archive-name: dnsparse/25-Sep-90
Original-posting-by: hakanson@ogicse.ogi.edu (Marion Hakanson)
Original-subject: Re: Quoting and Splitting
Archive-site: cse.ogi.edu [129.95.10.2]
Archive-directory: /pub
Reposted-by: emv@math.lsa.umich.edu (Edward Vielmetti)

In article adler@betwixt..caltech.edu (B. Thomas Adler) writes:
>. . .
>spacing. My question is, is there a way to have split() split on
>white-space, while respecting the restrictions imposed by any double quoting?
>
>ie, I'd like the line
>    Field_1 parm_1 "This is example one"
>
>to split into three components, rather than 6.

This has been discussed several times before. If you allow the
quotes to be escaped (with a backslash, which can also be escaped
by a backslash, etc.), then you aren't going to be able to do this
with a regular expression. Even if you don't, the r.e. will be ugly.

Since you mentioned nameserver files, you may find the approach I
took to be of use to you. Use anonymous FTP to retrieve from host
cse.ogi.edu the file pub/dnsparse-2.0.tar.Z. Briefly, there is a
lexical analyzer (tokenizer) written in C, which is used by Perl
code to fully parse DNS master files. The lex-er deals with quotes,
etc., and the Perl code does the rest.

-- 
Marion Hakanson         Domain: hakanson@cse.ogi.edu
                        UUCP  : {hp-pcd,tektronix}!ogicse!hakanson
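
The tokenizer idea described in the post above (scan the line one character at a time instead of relying on split() with a regular expression) can be illustrated with a minimal sketch. This is not the dnsparse code, which is a C lexer driven from Perl; it is written in Python purely for illustration, and the function name split_quoted and its escaping rules are assumptions: a backslash makes the next character literal, double quotes group whitespace into one field, and an unterminated quote is treated as an error.

```python
def split_quoted(line):
    """Split a line on whitespace, keeping double-quoted runs together.

    Hypothetical helper, not taken from dnsparse. Assumed rules:
    - a backslash makes the following character literal (so \\" and \\\\ work);
    - double quotes toggle a mode in which whitespace does not split;
    - the quotes themselves are dropped from the resulting field.
    """
    tokens = []
    current = []        # characters of the field being built
    in_token = False    # True once the current field has started
    in_quotes = False   # True while inside a double-quoted run
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            # Escaped character: take the next character literally.
            current.append(line[i + 1])
            in_token = True
            i += 2
            continue
        if ch == '"':
            # Toggle quoting; an empty "" still counts as a field.
            in_quotes = not in_quotes
            in_token = True
        elif ch.isspace() and not in_quotes:
            # Unquoted whitespace ends the current field, if any.
            if in_token:
                tokens.append("".join(current))
                current = []
                in_token = False
        else:
            # Ordinary character (a trailing lone backslash also lands here).
            current.append(ch)
            in_token = True
        i += 1
    if in_quotes:
        raise ValueError("unterminated double quote")
    if in_token:
        tokens.append("".join(current))
    return tokens


if __name__ == "__main__":
    print(split_quoted('Field_1 parm_1 "This is example one"'))
```

Run on the line from the quoted question, this yields three fields rather than six. Because the scanner carries explicit state, escaped quotes and escaped backslashes need no special pattern, which is the practical appeal of a lexer over a single split() expression.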