Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!eecae!tank!uxc!uxc.cso.uiuc.edu!m.cs.uiuc.edu!wsmith
From: wsmith@m.cs.uiuc.edu
Newsgroups: comp.unix.questions
Subject: Computational complexity of rm & ls
Message-ID: <9000012@m.cs.uiuc.edu>
Date: 8 Mar 89 23:46:00 GMT
Lines: 26
Nf-ID: #N:m.cs.uiuc.edu:9000012:000:1213
Nf-From: m.cs.uiuc.edu!wsmith    Mar  8 17:46:00 1989


I did a test of the computational complexity of the ls and rm programs.

I created directories with a large number of files in them (large = 1000-2000).

Then, I timed ls commands in these directories.   It appeared that the
program was quadratic in the amount of system time that it took and n*log n
in user time.  (This is on a encore 4.2 berkeley Unix.)  When I tried it
under 4.3 Berkeley Unix on a VAX, it appeared to be order n in system time
and n log n user time.   The 4.3 version is obviously better.
How do other versions of unix perform on large directories?

To clean up my mess, I removed the files in two steps.  First deleting
half with `rm *a` rmand then deleting the rest with `rm *` The 4.3 Unix 
appeared to be quadratic in system time for this operation, while the 4.2 Unix
was cubic or worse in the amount of system time required!  
(I'm only measuring the curve at 2 points, so it is hard to get a good 
estimate of the complexity.)

Surely rm can do better than cubic, can't it?   "rm *" seems to be a common
enough idiom that rm could be optimized to handle that case better.   I would
like rm to be linear in system time.  Is that possible?

Bill Smith
wsmith@cs.uiuc.edu
uiucdcs!wsmith