Path: utzoo!attcan!uunet!wuarchive!sdd.hp.com!elroy.jpl.nasa.gov!jpl-devvax!lwall
From: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Newsgroups: comp.lang.perl
Subject: Re: Cute hack around tiny bug (Re: Help a perl apprentice)
Message-ID: <10149@jpl-devvax.JPL.NASA.GOV>
Date: 29 Oct 90 18:16:49 GMT
References: <18840001@hp-lsd.COS.HP.COM> <21380@orstcs.CS.ORST.EDU> <MDB.90Oct28220540@kosciusko.ESD.3Com.COM>
Reply-To: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
Organization: Jet Propulsion Laboratory, Pasadena, CA
Lines: 38

In article <MDB.90Oct28220540@kosciusko.ESD.3Com.COM> mdb@ESD.3Com.COM (Mark D. Baushke) writes:
: I sent Larry private e-mail about this and suggested a one character
: change to the script: change
: 
:      $line{$avail} = sprintf("%30.30s%10.1f%6s\n", ...);
: to
:      $line{$avail} .= sprintf("%30.30s%10.1f%6s\n", ...);
: 
: This works fine if you don't care about the order of two entries with
: the same amount of available space.
: 
: Larry agreed and then suggested that another way to get around it
: would be to use
: 
:      $line{$avail.'.'.$seq++} = sprintf("%30.30s%10.1f%6s\n", ...);
: 
: Paul> It occurred to me that a good way to handle this was to add a
: Paul> small number, say 0.1, to $avail if $line{$avail} already
: Paul> existed.  But what if there were *three* (or more) file systems
: Paul> with the exact same available space?
: 
: Using a linearly increasing sequence number is probably a better
: solution in this case. Even using rand() there is no guarantee that
: you will not arrive at duplicate keys.

Personally, I like the .= approach the best--it's exactly equivalent to
the sequence number approach, and looks clean.  (Almost too clean--it's
easy to overlook the dot.)  And it's a trick I've used quite a bit in the
past, so you can bet you'll run into the situation again.  In fact, if
you want to give the problem a label, it's "inverting on a non-unique key".
(I told you we should put it in the Book, Randal!  We still can...)
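
To make the comparison concrete, here is a minimal sketch of both ways of
inverting on a non-unique key.  It is written in present-day Perl 5 syntax,
and the filesystem names and free-space figures are made up for illustration;
it assumes $avail is a whole number (say, kilobytes), so that a key such as
"12500.2" still sorts numerically next to its ties.

```perl
use strict;
use warnings;

# Hypothetical data: filesystem => available kilobytes (made-up values).
my %avail = ('/usr' => 12500, '/tmp' => 40000, '/home' => 12500, '/var' => 3200);

# Approach 1: append with .= so entries sharing a key pile up in one value.
my %line;
for my $fs (sort keys %avail) {
    $line{ $avail{$fs} } .= sprintf("%-10s%10d\n", $fs, $avail{$fs});
}
my $report_append = join '', map { $line{$_} } sort { $a <=> $b } keys %line;

# Approach 2: make every key unique by tacking on a sequence number.
my %seqline;
my $seq = 0;
for my $fs (sort keys %avail) {
    $seqline{ $avail{$fs} . '.' . $seq++ } = sprintf("%-10s%10d\n", $fs, $avail{$fs});
}
my $report_seq = join '', map { $seqline{$_} } sort { $a <=> $b } keys %seqline;

# With this data, both reports list the tied filesystems together.
print $report_append;
```

On this data the two reports come out identical; the .= version just keeps
one hash entry per distinct key instead of one per filesystem.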

The Monte Carlo approach is fun, though.  Given the nature of rand(), I
don't think you're going to get duplicate keys until you get 2**n
identically empty filesystems, where n is typically something like 16, 31
or 32.  But it does generate longer keys than a sequence number.
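
A rough sketch of the key-length point, again in present-day Perl 5 syntax.
The filesystem names and the free-space figure are invented, and
int(rand(2**16)) is only an assumed stand-in for a 16-bit random suffix; the
thread never spells out the exact form of the random key.

```perl
use strict;
use warnings;

# Three hypothetical filesystems, all with the same made-up free space.
my @tied  = ('/home', '/scratch', '/usr');
my $avail = 12500;

my (%by_rand, %by_seq);
my $seq = 0;
for my $fs (@tied) {
    # Random suffix (assumed form): a number in 0 .. 2**16 - 1.
    my $rkey = $avail . '.' . int(rand(2**16));
    $by_rand{$rkey} .= "$fs\n";    # .= so even a collision loses nothing

    # Sequence suffix: short, and unique by construction.
    $by_seq{ $avail . '.' . $seq++ } = "$fs\n";
}

printf "random keys:   %s\n", join ' ', sort keys %by_rand;
printf "sequence keys: %s\n", join ' ', sort keys %by_seq;
```

The random keys vary from run to run, so no sample output is shown.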

Larry