I got an email question the other day and thought it'd make a great blog entry. This is, of course, all based on my experience and memory and not recent research, so take it for what it is...
While defragmenting isn't a "Bad" thing, it isn't a necessary one either in most cases.
In the days of the Seagate ST-225 and ST-251, when we talked about random access times of 80ms and all drives were based on the MS-DOS FAT system, having those big clunky stepper-motor-actuated read/write heads moving around took real time. Defragging one of those did two things. First, it put all the parts of the file together sequentially on the drive, and second, it put all the parts of the file sequentially on the physical platter itself within the drive. In those days, the logical sectors that the BIOS was writing to matched the physical locations on the disk. Writing to Platter-0, head-2 meant exactly that.
The typical drive/controller combination in say 1986 would have been a Seagate ST-225 20meg drive (a 5.25" half-height drive with a stepper motor actuator) and a Western Digital WD-1002A-WA2 8-bit controller card. The drive was formatted in MFM format (17 sectors per track) at a 3:1 interleave (three spins of the drive would be required to sequentially read all the sectors on a track in order). Setting the interleave too low, like 1:1 if the card couldn't go that fast, actually slowed things down: if the card wasn't ready to read the next sector by the time it passed under the head, the platter had to spin all the way around again, giving you effectively a 17:1 interleave. I recall seeing about 80k/second throughput on those. I owned a really sweet Adaptec 16-bit controller (don't recall the model number, but 8000 seems familiar) which would format the drive (remember >g=C800:5 anyone?) in an RLL format (50% more sectors per track) and was fast enough to handle a 1:1 interleave. I remember seeing speeds of 800k/second with that controller and an expensive 80meg full-height drive from Micropolis in my 80286. ** all this is from memory 15 years ago, so forgive me if I've gotten some details a bit off **
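If you want to see the interleave arithmetic laid out, here's a little sketch. The numbers are my own assumptions for illustration (3600 RPM was typical for those drives), not pulled from any spec sheet:

```python
# Rough sketch of why interleave mattered on old MFM drives.
# Assumptions (illustrative, not from a datasheet): 3600 RPM spindle,
# 17 sectors per track. At 3600 RPM one revolution takes ~16.7 ms,
# so a sector passes under the head roughly every millisecond.

RPM = 3600
SECTORS_PER_TRACK = 17
rev_time_ms = 60_000 / RPM  # ~16.7 ms per revolution

def track_read_time_ms(interleave):
    """Time to read one full track's sectors in logical order.

    With an N:1 interleave, consecutive logical sectors are N physical
    sectors apart, so reading the whole track takes N revolutions.
    """
    return interleave * rev_time_ms

for factor in (1, 3, 17):
    print(f"{factor}:1 interleave -> {track_read_time_ms(factor):.1f} ms per track")
```

The 17:1 case is the "interleave set too aggressively" disaster from above: the controller misses every sector and waits a full revolution for each one.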
What's different now?
1. FAT vs. inode based file systems like NTFS
In a FAT file system, to read the 28th sector of a file, you had to read the FAT for the link to the first cluster, then the link to the second, and so on and so on: 28 hops through the chain plus the data read itself, roughly 29 reads. That made a huge difference, especially on files where you weren't reading the whole thing at once (a random access file). A fully defragged file would have 17 sectors in a row all on one track of one platter -- which could be read in as little as one revolution of the platter (on a 1:1 interleave controller) or more commonly 3 revolutions (a 3:1 interleave was very common). Modern file systems don't work that way. The files are allocated via 'informational nodes' which contain maps to some or even all of the sectors of a file. Oh, and we also put way more data per sector on most of our drives now.
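A toy model makes the difference obvious. This is purely illustrative (made-up cluster numbers, not real FAT or NTFS on-disk structures), but it shows why reaching cluster N of a chained file costs N lookups while an extent map gets there in one:

```python
# Toy comparison: FAT-style chain vs. inode-style extent map.
# All cluster numbers here are invented for illustration.

# FAT-style: a table where each entry only says where the NEXT cluster is.
FAT_CHAIN = {0: 5, 5: 12, 12: 3, 3: 9, 9: None}  # cluster -> next cluster

def fat_find_cluster(start, n):
    """Find the n-th cluster of a file by walking the chain.

    Returns (cluster, lookups) -- cost grows linearly with n.
    """
    cluster, lookups = start, 0
    for _ in range(n):
        cluster = FAT_CHAIN[cluster]
        lookups += 1
    return cluster, lookups

# Inode-style: the file's node carries a map of all its clusters.
EXTENT_MAP = [0, 5, 12, 3, 9]  # position in file -> cluster on disk

def inode_find_cluster(n):
    """Same question, answered with one indexed lookup."""
    return EXTENT_MAP[n], 1

print(fat_find_cluster(0, 4))   # walk 4 links to reach the 5th cluster
print(inode_find_cluster(4))    # one lookup, same answer
```

Both land on the same cluster; the chain just makes you pay for every hop along the way, which is exactly why random access into a fragmented FAT file hurt so much.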
2. Drive speeds
Modern drives aren't the big clunky things they were in those days. Instead of 80ms random access seek time, we now see 8 milliseconds commonly. There hasn't been anything but a 1:1 interleave in years as far as I know, and the drives spin so fast that at 3.5" there is enough wobble at the ends of the platter to cause errors at the data densities now being achieved (that's why 3.5" is dead now, and all the newest stuff will be smaller and more dense). We no longer measure in k/second. Typically IDE configurations now run around 12 megabytes per second, and my new Serial ATA drive measures out at 30 megs per second in this machine. A 10,000 RPM SCSI drive with a good controller can double that and with RAID the speeds just ROCK.
3. Heads & Sectors aren't where you think they are any more
The ST-506 interface that defined MFM and RLL is still alive. We don't call it that, and we don't have a fat cable and a skinny cable to attach to each drive. We also don't have to "low level format" our hard drives. Now we have "IDE," which is the same thing really, but it puts the controller right on the drive. Since the controller is matched to the drive, no low level format is needed and the controller can be specially matched to enhanced features of the drive itself. The catch is that your PC's BIOS wasn't built to handle the number of sectors we now have on drives. It was built assuming a lot more heads, though. So modern controllers "MAP" this for compatibility. They tell the outside world some number of heads and cylinders that give the PC the same total number of sectors, but there's no direct link to the physical layout. That means your defragmenting program cannot effectively know (on an IDE drive) exactly where the data "really" is. In theory, the drive's "Logical Map" will be optimized in some way so that you're not bouncing all over the drive if you put things in sequential order, but not necessarily.
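That "MAP" is just arithmetic. Here's a sketch of the standard logical CHS-to-LBA translation (the 16-head/63-sector geometry is a common logical fiction drives reported, used here as an assumption). The point is that this is all the PC side ever sees; where the drive physically puts each sector is its own business:

```python
# Logical CHS <-> LBA translation, as a BIOS would compute it.
# The geometry below is the *reported* (logical) geometry, which is
# exactly the point: it need not match anything physical on the platters.

HEADS = 16     # logical heads the drive claims to have
SECTORS = 63   # logical sectors per track (1-based in CHS addressing)

def lba_to_chs(lba):
    """Turn a flat logical block address into (cylinder, head, sector)."""
    cylinder = lba // (HEADS * SECTORS)
    head = (lba // SECTORS) % HEADS
    sector = (lba % SECTORS) + 1  # CHS sector numbering starts at 1
    return cylinder, head, sector

def chs_to_lba(cylinder, head, sector):
    """Inverse translation back to a flat block address."""
    return (cylinder * HEADS + head) * SECTORS + (sector - 1)

lba = 123456
print(lba_to_chs(lba))                       # some logical (c, h, s) triple
print(chs_to_lba(*lba_to_chs(lba)) == lba)   # round-trips cleanly
```

The round trip is perfect on the logical side, and tells you nothing about which platter the bits actually landed on -- which is the defragmenter's problem in a nutshell.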
Dangers of Defragging:
The good news is, it won't wear out your drive. That's a myth today. Those old drives were susceptible to it, though. The metal band that wound and unwound to move the read/write heads when the stepper motor turned could literally stretch and go slack with use. Not anymore. The only danger today is if things get moved around and aren't where the program expects them to be. Good defrag software should prevent that by getting between the OS call to read or write and the actual drive. 32-bit operating systems do not allow software to write directly to the hardware (allowing that was, imo, one of the biggest mistakes in early DOS).
In part 1 we saw that we no longer have to seek sequentially through a long chain of sectors to get at the part of a file we want, unless we need to read the whole file. In part 2 we saw that those random seeks don't take as long as they used to, and in part 3 we saw that we may not be saving as much time as we think on IDE drives. Combine these, and you see that the need for defragging is much lower than it used to be.
So when should you do it?
If you read a big file a lot, and you need to read it quickly and all the way from end to end, it's a good target for defragging. MP3 files don't count. They're big, and you read them end to end, but you rarely need to read them really fast -- just faster than your CD burner or your audio player needs them. The big files that get read from end to end frequently are your program files. For that reason, it's probably a good idea to defrag after you install some big new software package. Also, put your big software stuff on your fastest drive. It matters. If you edit really, really big files (like 100 meg graphics, movies, and such) you may want to have a defrag tool, since you tend to read all of them into memory to work with at once. And if you are low on disk space, defragging can make a big "unfragmented space" to hold the next big thing you put down, like a swap file. That can sometimes help as well.
Hope this helps!