Path: utzoo!utgpu!news-server.csri.toronto.edu!cs.utexas.edu!usc!elroy.jpl.nasa.gov!jpl-devvax!lwall
From: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Newsgroups: comp.lang.perl
Subject: Re: perl memory usage?
Message-ID: <7556@jpl-devvax.JPL.NASA.GOV>
Date: 26 Mar 90 22:19:37 GMT
References: <1990Mar19.210743.15896@chinet.chi.il.us> <7480@jpl-devvax.JPL.NASA.GOV> <1990Mar26.174959.20102@chinet.chi.il.us>
Reply-To: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Organization: Jet Propulsion Laboratory, Pasadena, CA
Lines: 61

In article <1990Mar26.174959.20102@chinet.chi.il.us> les@chinet.chi.il.us (Leslie Mikesell) writes:
: Perhaps a funky regexp would work better anyway, but I couldn't come
: up with one.  I'm trying to merge items like:
:
: identifier (used for key in associative array)
: text (multi-line)
: SUMMARY:
: summary-text (multi-line)
: STATUS:
: text ...
:
: If I find an updated item without the SUMMARY: entry, I want to grab the
: summary-text from the old entry and insert it into the new above the
: STATUS line.  My first attempt at pattern-matching with bracketed substrings
: failed on these multi-line strings, so I switched to the $` and $' and
: some tmp variables.  Is there a better way?  Note that I don't know which
: (if either) entry contains the SUMMARY: or that an old entry even exists,
: so the ability to test the success of the individual matches is handy.

I'd probably write this as

    $* = 1;
    if ($new !~ /\nSUMMARY:\n/) {
        if (($was) = ($old =~ /^SUMMARY:\n([^\0]*)^STATUS:/)) {
            substr($new,index($new,"STATUS:\n"),0) = $was;
        }
    }

or some such.  The thing to remember is that . doesn't match newline, so use
[^\0] to match newlines too.  (On older patchlevels you may have to say \000
instead.)

Depending on the relative sizes of the text sections, it might be faster to
do it all with index, since [^\0]* has to match all the way to the end and
then back off.

    if (index($new, "\nSUMMARY:\n") < $[) {
        $beg = index($old, "\nSUMMARY:\n");
        if ($beg >= $[) {
            $end = index($old, "\nSTATUS:\n");
            # $beg + 10 skips "\nSUMMARY:\n"; the length runs through
            # the newline that precedes STATUS:
            substr($new,index($new,"STATUS:\n"),0) =
                substr($old,$beg + 10, $end - $beg - 9);
        }
    }

If there are more headers than that, it often becomes worthwhile to take a
parsing pass on it and put the entries into separate variables or entries
in an associative array.  Then you end up with wonderful statements like

    $new{'SUMMARY'} = $old{'SUMMARY'} unless $new{'SUMMARY'};

A funky split like

    @new = split(/(^[A-Z]+):\n/,$new);
    unshift(@new,"FRONTSTUFF");
    %new = @new;    # alternating keys and values

comes to mind.  But that's probably not worthwhile for your thing.

Larry
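
For readers on a current Perl 5, the parsing pass described above might look
roughly like the sketch below.  It is only an illustration, not code from the
post: the sub names parse_entry and merge_summary are made up, the /m modifier
stands in for $* = 1, and it assumes the "SUMMARY:" header line should be
carried over together with its text.  Like the split in the post, it treats
any line consisting of capital letters followed by a colon as a header.

```perl
use strict;
use warnings;

# Hypothetical helper: split one entry into alternating header names and
# bodies.  Text before the first header goes under the placeholder key
# FRONTSTUFF, as in the post.
sub parse_entry {
    my ($text) = @_;
    # /m lets ^ match at the start of every line; the capturing parens
    # make split return the header names along with the bodies.
    my @parts = split /^([A-Z]+):\n/m, $text;
    unshift @parts, 'FRONTSTUFF';
    push @parts, '' if @parts % 2;   # last header had an empty body
    return @parts;
}

# Hypothetical helper: copy SUMMARY from the old entry into the new one,
# placing it just above STATUS, when the new entry lacks it.
sub merge_summary {
    my ($new_text, $old_text) = @_;
    my @new_parts = parse_entry($new_text);
    my %new = @new_parts;
    my %old = parse_entry($old_text);

    return $new_text if exists $new{SUMMARY};      # already has one
    return $new_text unless exists $old{SUMMARY};  # nothing to copy

    # Rebuild the entry in its original order.
    my (undef, $out) = splice(@new_parts, 0, 2);   # the FRONTSTUFF text
    my $inserted = 0;
    while (my ($key, $body) = splice(@new_parts, 0, 2)) {
        if ($key eq 'STATUS' && !$inserted) {
            $out .= "SUMMARY:\n$old{SUMMARY}";
            $inserted = 1;
        }
        $out .= "$key:\n$body";
    }
    # No STATUS header in the new entry: append the summary at the end.
    $out .= "SUMMARY:\n$old{SUMMARY}" unless $inserted;
    return $out;
}

my $old_entry = "item42\nold text\nSUMMARY:\nold summary\nline two\nSTATUS:\nopen\n";
my $new_entry = "item42\nnew text\nSTATUS:\nclosed\n";
print merge_summary($new_entry, $old_entry);
```

The hash makes the "is there a SUMMARY?" tests trivial, but it loses the
order of the headers, which is why the sketch keeps the list returned by
split around for rebuilding the entry.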