Newsgroups: comp.sys.amiga.programmer Path: utzoo!utgpu!watserv1!watdragon!rose!ccplumb From: ccplumb@rose.uwaterloo.ca (Colin Plumb) Subject: Re: Bizarre Fantasy: RLL Flopies? Message-ID: <1991Apr4.091810.5359@watdragon.waterloo.edu> Keywords: MFM, RLL, Floppies, Blitter, Clueless, Fantasy, etc. Sender: news@watdragon.waterloo.edu (News Owner) Organization: University of Waterloo References: <1991Apr2.061633.17980@kessner.denver.co.us> Date: Thu, 4 Apr 1991 09:18:10 GMT Lines: 238 chris@kessner.denver.co.us (Chris Hanson [Lord Xenon]) wrote: > RLL 2,7 typically gives an approximate > 150% storage space increase, presumably by using only 3 bits to encode > 2 input bits, instead of MFM's piggish 4 bits out for every 2 bits in. Nope, 2,7 RLL also encodes 1 "data" bit as 2 "raw" bits. You want something I've spent a while thinking about, 1,7 RLL. > Now, I know that RLL hard drives use more than just a different type > of encoding algorithm, as you can not always use a MFM HD with and RLL > controller, successfully. Am I missing something? Is anyone out there > familliar with the RLL 2,7 algorithm? Okay, here's a tutorial on magnetic recording technology. When you record something on a disk, the north poles of the ferrite particles are aligned either with the direction of rotation, or against it. There is "vertical recording" technology in the labs that points them up and down, but this is how it works for now. Now, a "raw" 1 bit is written as a *change* in the direction of magnetisation, while a raw 0 bit is written as no change. Thus, a track is made up of a bunch of little bar magnets of various lengths, each corresponding to a gap between adjacent 1 bits. If the gap is too small, the magnet is weak and easily erased. This is not good. Standard floppies, spinning at 300 rpm, are engineered to use a 4us gap size. Now, on the outer tracks, this is a lot more length than on the inner ones, so you can usually get away with a much smaller (2us) gap and have it work pretty reliably... a number of copy-protection schemes do this. But it's not guaranteed to and is Bad Practice. On the other hand, while reading the disk, the computer has to measure the lenths of the gaps to determine how many 0 bits to put in each. It synchronises with the 1 bits and, if it hasn't received another 1 bit by the time 1.5 bits have rolled around, assumes it's a 0 bit and goes on. These two parameters specify minimum and maximum sizes on runs of zeros between adjacent 1's. This is what a run-lengh-limited code is all about. MFM is a (1,3) RLL code. GCR is a (0,2) RLL code. There are lots of others. Now, these restrictions mean that one raw bit is not completely arbitrary, so can't carry one complete bit's worth of information. In general, the more zero bits that must follow a 1 bit, the less dense a code is, and the more zero bits that can, the more dense a code is. A bit of elementary combinatorics (omitted), comes up with the following theoretical limits on data bits per raw bit for various RLL restrictions: (0,infinity) 1.00000000 (0,2) 0.87914642 (1,infinity) 0.69424191 (1,7) 0.67928626 (1,6) 0.66903164 (1,3) 0.55146308 \ > Yes, these two are identical (2,infinity) 0.55146308 / (2,7) 0.51736957 Now, The actual densities achieved are .8 for GCR (0,2), .5 for MFM (1,3), and .5 for (2,7). Seen in this light, MFM doesn't do so badly.. it's about 90% of optimal. (2,7) RLL is within 3.5%, while GCR is also about 90% of optimal. Now, the first digit (0, 1, 2 in the above) is the shortest run of 0's allowed. Adding a 1 to terminate the run should leave at least 4us between 1 bits, so a (0,n) RLL code must be written a 4us/bit, a (1,n) RLL code can be written at 2us/bit, and a (2,n) RLL code can be written at 1.33us/bit. The ratio (6/4) of bit times between (1,n) RLL and (2,n) RLL codes is the 50% density increase you get with RLL. However, as the second number goes up, you have to be more accurate at counting bit times to make sure you don't read 7 0's as 6 or 8. There are all kinds of sources of error in reading magnetic media that conspire to make it difficult. The first observation: going to (2,7) RLL lets you use shorter bit times. Since the Amiga can't write at other than 2 and 4 us/bit, that doesn't gain us anything. However, there is a reason I included the (1,6) and (1,7) entries... since they're (1,n) RLL codes, they are designed to be written at 2us/bit, but they have densities greater than 2/3, so we can get 2 data bits per 3 raw bits, i.e. 4 per 6, versus 3 per 6 for MFM. *This* is a possiblity. Well, there is one little problem... complexity. To get close to the theoretical density is difficult, and while MFM and GCR are fairly simple, (2,7) RLL is rather more complex, and (1,6) RLL, which is less than 1% off optimal, is not really practical. There is, however, a practical (1,7) RLL code, with density 2/3. You can flip bits around and reassign combinations if you like, but it's basically: 00 -> 010 01 -> 001 10 -> 100 11 -> 001 So far, it's a lot like MFM... insert an extra bit between data bits, which is 1 if and only if they're both zero. But we don't insert extra bits between these bit pairs, so we have four possible problem cases where we'd get adjcent 1's, a no-no: 01.10 -> 001.100 01.11 -> 001.101 11.10 -> 101.100 11.11 -> 101.101 We replace these with special patterns: 01.10 -> 010.000 01.11 -> 001.000 11.10 -> 100.000 11.11 -> 101.000 Since each of these patterns ends in 0, the following bit group cannot conflict with it. Proceeding along in this way, we can translate pairs of bits into triples. A run of 7 0's is achieved when we have 0.11.10.01.0, which translates into 0.100.000.001.0. We can get two consecutive runs of length 6 (0.11.10.01.10.01.0 -> 0.100.000.010.000.001.0), but the longest sustained run length we can get is of length 5. Anyway, armed with this encoding scheme, let's see what we can fit on an Amiga floppy. Each track is 60s/300rpm = .2s long, and at 2us/bit, that's 100,000 bits. Except the bits are actually 2.5% faster than that, but may be up to +/- 5% due to genlocking, and the disk may be spinning at +/- 1.5% of 300 rpm, so the final worst-case track length is, depending on whether one interprets n% fast as (1+n/100) or (1-n/100), between 95914 and 96236 bits. Now, using a density-2/3 code, one amiga sector (512 bytes plus 16 bytes header area = 528 bytes) takes 792 bytes, or 6336 bits. 15 of these are 95040 bits, leaving between 874 and 1194 bits for bookkeeping. H'm... that's 72.866 to 99.5 bytes... not much. what can we fit in there? We need sync marks, patterns that are legal (1,7) RLL but not produced by our encoding scheme (in MFM, it's 100010010001, which is at the heart of $4489 = 00100010010001001). An adjacent 6-run and 7-run will do, 1000000100000001 or 1000000010000001, which conveniently just fit into 16 bits, letting us use the wordsync register of the Amiga. But we can have one sync mark per pair of registers or so if we like. Let's call that 8 sync marks per track, for something over 128 bits, or 10 raw bytes. What to do with the remaining 62 to 89 bytes? My idea is ECC. We should have at least a 32-bit CRC on the track, cutting us down to 52 to 79 bytes. But a Reed-Solomon error-correcting code can correct one byte in error out of 257, including the two check bytes. 255 goes into our 528*15=7920 data bytes 31.0588 times, so we need 32 pairs of check bytes, which may or may not fit into that available space. It depends on whether the available space per track is 100000*.975*.95*.985 or 100000/1.025/1.05/1.015 bits. Interleaving the Reed-Solomon ECC lets us correct burst errors up to 32 bytes long. At 24us/byte, that's 768us, or quite a big glitch. Oh, we also need a way to tell which sector on a track is number 1; we can let the others follow in numberical order. Sector number 16 will be short and contain the 16-byte sector headers, 4 bytes of CRC and 64 bytes of ECC (308 bytes instead of 512). Well, our sync mark needs to be padded out with some 0's so it can abut legal patterns... let's precede it with 0 and follow it with 010 for a normal sector and 000 for the short one. So that makes 960 bits of overhead of the 874 to 1194 we have room for. We need a few more bits to detect the track gap, which I can be sneaky on by using the behaviour of the Amiga's wordsync register, and I think that's it. So, the format is basically: Bits Total 18 18 01000000010000001010 12288 12306 1024 bytes of data encoded 18 12324 01000000010000001010 12288 24612 1024 bytes of data encoded 18 24630 01000000010000001010 12288 36918 1024 bytes of data encoded 18 36936 01000000010000001010 12288 49224 1024 bytes of data encoded 18 49242 01000000010000001010 12288 61530 1024 bytes of data encoded 18 61548 01000000010000001010 12288 73836 1024 bytes of data encoded 18 73854 01000000010000001010 12288 86142 1024 bytes of data encoded 18 86160 01000000010000001000 9840 96000 512+308 = 820 bytes of data encoded 96000 bits is 26.7% of the way through our uncertainty zone, so it's probably safe. A few more bits here and there will probably be needed, but it looks like it (barely!) fits. Now, what have we achieved? We've got 15 512-byte sectors por track, per side, for a total of 2400, or 1200K. This is the same capacity as IBM's "1.2 Meg" 5.25" floppies. But for the agressive, we also have the header label area (if you don't know what this is, look up ETD_READ in the RKM's), which is an additional 37.5K, for a grand total of 1237.5K, 8.7K larger than honest 1.2 Meg = 1.2*1024K = 1228.8K. That should keep people happy for a while. Now, does anyone want to *implement* this thing? -- -Colin P.S. Reed-Solomon ECC works like this: you have a primitive element in GF(2^8) (basically, a special kind of math on 8-bit numbers, where addition is exclusive-or (and therefore the same as subtraction) and multiplication is rather wierd, but still obeys all the usual rules), call it p. "Primitive" means that 1, p, p^2, p^3, etc. are all different until you hit p^255 = 1, where it repeats. (Obviously, 0 is not one of these numbers, and it hits every other 8-bit pattern) Given 255 bytes of data, we form the first check byte "a" by xor-ing all the data together, so if one byte is in error, parity computation will tell us what the error is, i.e. the bit pattern to be added to (xor-ed with) one of the bytes to produce the original data. Then, the second check byte "b" is data[0]+p*data[1]+p^2*data[2]+...+p^254*data[254]. If the parity byte tells us we have an error, call it "e", then computing "b" again for the received data, where data[n] is in error, gives us data[0]+p*data[1]+...+p^n*(data[n]+e)+...+p^254*data[254], which is the original b plus p^n*e. Subtracting the two thus gives p^n*e, and then we can divide by e (since e is non-zero) to get p^n, and since n is somewhere in the range 0..254, this uniquely identifies n, and we can exclusive-or data[n] with e to recover the original message. If the parity byte itself is in error, check byte b checks out and we don't correct the data. If the check byte b is in error then, because we're assuming no more than one bad byte, the other bytes are all correct and the parity byte will not report an error. The 32-bit CRC serves as a final check to make sure we did all this correctly, since two or more bytes in error will cause the Reed-Solomon code to miscorrect something. This error correction code is perfect in an information-theoretic sense. We have 16 bits of check data, with 2^16 possible combinations. When we recompute the check data at read time, there are 2^16 possibilities when we xor the old and new together. Assuning there is no more than 1 byte in error, there is 1 case in which there is no error, and 255 possible errors in each of 257 possible bytes. (2^8-1)*(2^8+1) = 2^16-1, plus one for the no-error case gives a unique interpretation for each error syndrome.