Path: utzoo!attcan!uunet!microsoft!w-colinp From: w-colinp@microsoft.UUCP (Colin Plumb) Newsgroups: comp.sys.amiga Subject: Re: Getting 1.5 MB on a standard 880K floppy without compression Message-ID: <40@microsoft.UUCP> Date: 23 Mar 89 14:29:39 GMT References: <3631@sugar.hackercorp.com> <94323@sun.Eng.Sun.COM> Reply-To: w-colinp@microsoft.uucp (Colin Plumb) Organization: very little Lines: 208 cmcmanis@sun.UUCP (Chuck McManis) wrote: > In article <3631@sugar.hackercorp.com> (Karl Lehenbauer) writes: >> In the Roomers column of the March 1989 issue of Amazing Computing, the >> Bandito says that certain game companies are getting 1.5 MB unformatted >> on standard 880K floppies. >> Can anyone verify if this is true and if so, explain how it works? > > [They seem to use RLL.] RLL isn't quite the answer. It's doable, but it's more interesting than that... On a disk, bits are encoded as magnetic domains. If you could see the north poles of the bits of oxide along a track, they'd be pointing alternately clockwise and counterclockwise. A spot where the direction of magnetisation changes is a 1; no change in magnetic field is a 0. When the read head passes over, it detects the 1's as a current pulse. 0's have to be figured out by timing. Since the standard bit time for 3.5" (really, 90mm) floppies is 2us, if two pulses are detected 2us apart, that was 2 consecutive 1's. If the gap was 4us, that was 101, etc. If the gap is too long, the controller can mis-count and get a wrong number of 0's. If the gap is too short, there's a really tiny section on the disk that's pointing the opposite way to its neighbours. It can get swamped and vanish. Very bad. At 300 rpm and on the inner tracks (track 79), the minimum time between 1's works out to 4us. On the outer tracks (track 0), you can achieve the same linear density by speeding up the bit clock (a la Commodore PET and C-64), or by slowing down the drive motor (Macintosh). Or, like the Amiga, ST, and IBM PC, you can not bother. Now, I mentioned that the bit clock rate is 2us. But you can't put 1's closer than 4us apart. So we have to make sure we always put at least one 0 between 1's. Of course, we also have to be careful not to have too long a string of 0's, either. (Side note: 1200 bps and up modems have problems with long strings of 0's, too. They solve it by XORing in a pseudo-random bit stream which the modems work out between themselves. This is referred to as the scrambler polynomial.) This is where RLL comes in. RLL is Run Length Limited. The first disk drives used a 4us bit clock, alternating data bits and 1 bits for synchronisation. Thus, the longest string of 0's was of length 1. I.e. runs of 0's were limited to length 1. But because they could have adjacent 1's (the lower bound on run lengths was 0), they needed to use a slow bit clock for this 0,1 RLL scheme. This is known as Frequency Modulation. (It is, in fact, just like FM stereo. A stream of 1's would get writtten 1111111111, each 1 standing for a polarity change. The signal would end up looking like a square wave of period 8us. A stream of 0's would get written as 010101010101, a square wave of half the frequency.) Then someone worked out a better scheme, MFM (Modified FM), which is still the most common today, in which 1's were always separated by at least 1 zero, and by at most 3. Thus, 1,3 RLL. It's just like FM (data, clock, data, clock), except that the clock 1's are only inserted between two consecutive 0's. This makes it harder to figure out what's clock and what's data (in FM, if you find a 0, you know it's data), but there are a few patterns that still are 1,3 RLL but aren't generated by this scheme. The standard sync mark is A1, 10100001 which, with clock pulses inserted (using i and o for 1 and 0 so you can tell the difference) is 1o0o1o0i0i0i0o1o. By suppressing the second clock pulse (the i after the third 0), you get the raw bit pattern 100010010001001, which cannot appear in normal data (try it!). Because the floppy controller (the trackdisk.device, in this case) eats the clock pulses, it reads as A1. But it's really magic. Because you always have a 0 between adjacent 1's, and you can write 1's 4 us apart, you can write raw bits 2 us apart and stay within the capabilities of the media. 2us per bit, and 0.2 seconds per revolution (300 rpm = 5 rps) gives you 100,000 bits per revolution. 80 tracks, times two sides, gives 16,000,000 bits per disk. Since it takes two raw bits to produce one data bit, that's 8,000,000 data bits, or 1,000,000 (not 1 Meg = 1024x1024) bytes per floppy. There are, however, more complex schemes. What people refer to as RLL in the IBM world uses a 2,7 RLL code. Because you always have 2 0 bits between 1's, you can use a 1.333 us bit clock and still keep your 1's far enough apart. (Actually, it's used for hard drives, which can take 1 bits much closer than 4 us, but the idea is the same.) The only loss is that your read circuitry has to be able to keep time for 8 bit periods as it skips 7 0's. (Also, your media has to stop the boundary from "wandering" as much, since you have less tolerance. But this is usually pretty easy compared with making it accept 2.666 us gaps between 1's.) The 2,7 RLL code still manages to encode 1 data bit as 2 "raw" bits, although it's not nearly as simple as data, clock, data. Even if you wanted to go to all the trouble to make the trackdisk.device use this encoding scheme, it wouldn't help, since you still can't write raw bits any faster than 2 us. But you can, however, use a different 1,n RLL scheme. If you apply a bit of math, it turns out that 1,3 RLL can give you 0.55146 data bits per raw bit. 0.5 of that is used by MFM, and a bit extra is used by the A1 sync pattern. 2,7 RLL gives you 0.51736 data bits per raw bit. Agian, 0.5 of that is used for data, and some of the rest is used for a sync marker. A 1,6 RLL scheme will give you 0.66903 data bits per raw bit... enough to spend two thirds of a raw bit encoding data and have a little left over for a unique sync pattern. However, wringing the last few drops of efficiency from an RLL code is hard (you have about 0.00237 bits to fit your sync marker in here, i.e. it's got to be 1/0.00237 is more than 40 bits long), so I'd use a 1,7 RLL code, which gives 0.67928 bits per bit, giving you 0.01262 bits to fit a sync pulse in, resulting in a more reasonable 8-bit sync pattern to look for. I still haven't actually come up with an encoding scheme, but I expect that just playing with it a bit will produce something. There's also an out I haven't mentioned: perhaps you can push the capabilities of the disk a little, and write adjacent 1 bits on the outer tracks. I think this would put them closer than the disk is rated for, but high-quality floppies can probably take it. But playing by the rules, I's use 1,7 RLL. Now, you're wondering how I came up with that figure of 0.67928 bits per raw bit. Here's where we get into Combinatorics. I goes like this: the null string is 1,6 RLL. No bits, no runs of the wrong length. Furthermore, and 1,6 RLL string can be followed by any of 10, 100, 1000, 10000, 100000, or 1000000 and produce a 1,6 RLL string. So, given a string of length n, we can produce strings of length n+2, n+3, n+4, n+5, n+6, and n+7. These are the only strings we can produce directly, so we can turn it around: we can make a string of length n from any string of length n-2, n-3, n-4, n-5, n-6, or n-7. So S(n), the number of strings of length n, is S(n-2)+S(n-3)+S(n-4)+S(n-5)+S(n-6)+S(n-7). And we know that there is one string of length 0, and none of length <0. S(0) = 1, S(-1) = S(-2) = ... = 0. From this, you can probably see how to write a program to compute the number of strings of length n that are 1,6 RLL. For any p,q RLL, actually. S(n) = S(n-p)+S(n-p-1)+...+S(n-q-1). All I did was use really long strings (about a million bits) and take the log base 2 (log base x of n is log(n)/log(x) for log() in any other base) of the number of possible strings, which gives the number of bits of information you can encode in such strings, and divide this by the number of bits I used (about a million). A few other numbers: 0,1 RLL: 0.69424 /* FM gets 0.5 of this - inefficient! */ 0,2 RLL: 0.87914 0,3 RLL: 0.94677 0,4 RLL: 0.97522 0,5 RLL: 0.98810 0,6 RLL: 0.99419 0,7 RLL: 0.99713 0,8 RLL: 0.99857 0,9 RLL: 0.99929 0,10 RLL: 0.99964 /* 0,infinity RLL is, of course, 1 */ 1,1 RLL: 0 /* I assume you can figure out why */ 1,2 RLL: 0.40568 1,3 RLL: 0.55146 1,4 RLL: 0.61744 1,5 RLL: 0.65089 1,6 RLL: 0.66903 1,7 RLL: 0.67928 1,8 RLL: 0.68525 1,9 RLL: 0.68878 1,10 RLL: 0.69091 2,3 RLL: 0.28776 2,4 RLL: 0.40568 2,5 RLL: 0.46495 2,6 RLL: 0.49790 2,7 RLL: 0.51736 2,8 RLL: 0.52933 2,9 RLL: 0.53691 2,10 RLL: 0.54179 When doing this computation, you'll quickly overflow 32 bits and even double- precision floating point. (After all, a 1 million bit long string in 2,7 RLL has over 2^500,000 possibilities.) I cobbled up my own pseudo floating-point, where if the sum S(n) ever exceeeded 1<<24, I divided it by 2 and incremented an exponent word (32 bits, of course). Adding two numbers in this form isn't too tricky, expecially since in this computation, you always know which is the larger (S(n) > S(n-1), after a few initial bobbles), and you can say: big += little>>(big_exponent-little_exponent); if (big > (2<<24)) { big >>= 1; big_exponent++; } I just kept a circular buffer of the number of strings of length n-1 through n-q-1, and printed out (exponent+log_2(mantissa))/n every thousand times through the loop. (If you do it every time, keep a check for mantissa==0, which is true for length=1 if the minimum run length > 0.) If someone else would like to cobble up a routine and verify my numbers, great. (If you remember enough combinatorics to get a closed-form expression for the number of bits, even better. I tried, but fell back on numerical approximations.) So, is everything clear now? Good. -- -Colin (uunet!microsoft!w-colinp) "Don't listen to me. I never do." - The Doctor