Index of /atarilibrary/atari_cd04/UNIXLIKE/DEFRAG_L
Name Last modified Size Description
BUFFERS.C 18-Oct-1993 18:13 18k
CHANGES 07-Jan-1993 08:48 2k
DEFRAG.8 18-Oct-1993 19:11 2k
DEFRAG.C 18-Oct-1993 18:12 13k
DEFRAG.H 14-Jul-1993 19:39 5k
DEFRAG.TTP 24-Oct-1993 12:41 67k
DEFRAG2.TTP 24-Oct-1993 12:44 66k
GETOPT.H 14-Jul-1993 19:22 1k
HDIO.C 18-Oct-1993 18:30 11k
HDIO.H 18-Oct-1993 18:30 1k
INSTALL 22-Dec-1992 21:43 1k
MAKEFILE 24-Oct-1993 12:35 1k
MAKEFILE.LNX 14-Jul-1993 19:09 1k
MINIX.C 18-Oct-1993 18:14 6k
MINIX.H 15-Jul-1993 12:32 2k
MINIX_FS.H 15-Jul-1993 12:30 2k
MISC.C 18-Oct-1993 18:14 1k
PORTST.C 24-Oct-1993 12:09 3k
PORTST.H 15-Jul-1993 18:21 1k
PUN.H 15-Jul-1993 16:50 1k
TERMIOS.H 14-Jul-1993 19:13 0k
TINYXHDI.C 18-Oct-1993 18:30 7k
VERSION.H 07-Jan-1993 08:48 1k
XHDI.H 18-Oct-1993 18:30 2k
README for the Linux extended file system defragmenter
edefrag emergency release 0.3b alpha
Copyright Stephen C. Tweedie, 1992, 1993 (sct@dcs.ed.ac.uk)
Parts Copyright Remy Card, 1992 (card@masi.ibp.fr)
Parts Copyright Linus Torvalds, 1992 (torvalds@kruuna.helsinki.fi)
This file and the accompanying program may be redistributed under the
terms of the GNU General Public License.
INTRODUCTION: What does it do?
==============================
As a file system is used, data tends to become more and more scattered
over the disk, degrading performance. A disk defragmenter simply
reorganises the data on the disk, so that individual files occupy a
single sequential set of disk blocks, and all the free space on the
disk is collected together in a single region. This generally means
that reading a whole file is more efficient.
The extended file system stores a list of unused disk blocks in a
series of unused blocks scattered over the disk (the "free list").
When blocks are required to store data, they are removed from the head
of the list, and are added back when released (by unlinking or
truncating a file).
However, only the free blocks stored at the head of the list are
available to the extfs at any time. This means that not all the free
space is known to the extfs when it tries to find a free block; as a
result, it does not always find the most efficient way to use free
space.
This is in contrast to the minix file system, in which free space is
stored in a single bitmap, and the file system can allocate free space
from anywhere on the disk.
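To make the contrast concrete, here is a minimal C sketch of the two
schemes.  Neither structure is copied from the kernel sources; the
names and layouts are invented purely for illustration:

    /* extfs-style free list: free blocks are chained through blocks
     * on the disk itself, so only the head of the chain is visible
     * to the file system at any one moment. */
    struct free_list_block {
        unsigned long next;       /* block number of next list block */
        unsigned long count;      /* how many entries below are used */
        unsigned long free[254];  /* block numbers of free blocks    */
    };                            /* (2 + 254) * 4 bytes = one 1KB
                                   * block on a 32-bit machine       */

    /* minix-style bitmap: one bit per data block, so the whole free
     * map is visible at once and any free block can be chosen. */
    #define NBLOCKS 65536
    static unsigned char block_bitmap[NBLOCKS / 8];

    static int block_is_free(unsigned long b)
    {
        return !(block_bitmap[b / 8] & (1 << (b % 8)));  /* clear = free */
    }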
The extended file system's gradual loss of performance is unfortunate,
because the larger partitions and longer filenames it supports are
useful to have around.
So, here is the extended file system defragmenter - recover all that
lost performance from your extfs partition.
For an idea of the performance gains you might obtain - the first time
I defragmented my file system, the time taken to boot my PC (from
switching on until the XDM X windows login prompt stabilises) dropped
from 37 seconds to 27 seconds.
As for the performance of the defragmenter itself - well, that first
version worked, but it thrashed my hard disk solid for over an hour
(this was for a 90MB partition). The current version now runs in not
much over 5 minutes, and most of the accesses are sequential (ie. NO
thrashing). Granted, the fragmentation is no longer severe, but those
5 or 6 minutes still include reading and writing over 70MB of the
partition.
Note - as of release 0.3, minix file systems are also supported.
HOW TO USE: and a few warnings.
===============================
Number one - (this applies to all - repeat, ALL - major file system
operations).
*** BACK UP ANY IMPORTANT DATA BEFORE YOU START. ***
There may be bugs in the defragmenter. You may have undetected errors
on your disk which are undiscovered until edefrag tries to write to a
bad block which has never been accessed before. There may be power
glitches, memory glitches, kernel errors. [e]defrag does some major
reorganisation of disk data, and if for any reason it doesn't finish
its work, most of your file system is likely to be trashed.
*** YOU HAVE BEEN WARNED. ***
*** NEVER try to defragment an active or mounted file system.
It is often safe to use [e]fsck on a mounted fs; don't be conned into
thinking that the same will work for [e]defrag. The file system will
be totally unusable while [e]defrag is working; and if this causes a
kernel crash, or if the fs interferes with the defragmenter as it
runs, you may well lose your entire partition.
This means that in order to defragment a root partition, you will
probably need to run [e]defrag from a boot floppy.
However, it IS totally safe to run [e]defrag in its readonly mode (for
testing) on an active partition.
*** Run [e]fsck on the partition first, to check its integrity.
Although I have been quite careful about the defragmenter's behaviour
on a corrupt file system (it should back down gracefully before doing
anything irreversible), it may well cause a lot of damage if the file
system is invalid in any way.
In particular, there is currently no handling of read/write errors in
the defragmenter. The extfs version DOES understand the bad block
inode (and the special handling now works - as of version 0.3b), so if
you suspect you might have bad blocks, try running efsck -t (test for
bad blocks) before defragmenting.
However, if you have an IDE drive, you needn't worry; you should never
get any hd errors, as IDE drives dynamically remap bad blocks
internally, as they occur. Until I have proper bad block support for
minix, it's probably unwise to try to defragment a suspect, non-IDE
minix partition.
*** Run [e]defrag -r next, just to be sure.
If there are any bugs in the defragmenter, running in readonly mode
first may find them ([e]defrag does quite a lot of self-checking as it
goes) before you lose any data.
*** Reinstall lilo after defragmenting a bootable partition.
Defragmentation moves data around the disk. edefrag knows all of the
file system's internal pointers to this data, so these are adjusted as
needed to keep the file system intact. Lilo, unfortunately, keeps its
own pointers to the location of kernel image files, so that the kernel
can be loaded before the file system is running. (These pointers
are usually kept in /etc/lilo/map.) If you defragment a partition
containing a lilo-bootable kernel image, you MUST reinstall lilo to
rebuild the now-invalid map file.
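Putting these warnings together, a typical session on an example
partition (/dev/hda3, the same device used in the examples further
down) might look like this:

    umount /dev/hda3        (the fs must not be mounted; use a boot
                             floppy if it is your root partition)
    efsck /dev/hda3         (check integrity first; try efsck -t if
                             you suspect bad blocks)
    edefrag -r /dev/hda3    (readonly trial run)
    edefrag /dev/hda3       (the real thing)

and, if the partition holds a lilo-bootable kernel image, reinstall
lilo afterwards to rebuild the map file.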
Usage: edefrag [-Vdrsv] [-p pool_size] /dev/name
-V : Prints the full CVS version id for the release. Send me
this information with any problem reports or suggestions.
-s : Show superblock information.
-v : Verbose. Shows what the program is doing. If used
twice, gives extra progress information.
-r : Readonly. This opens the file system in readonly mode,
which guarantees that your data will not be harmed. This
can be useful for testing purposes, especially for
working out the best buffer pool size to use.
-d : (If enabled at compile-time) Debug mode.
The pool_size is the number of 1KB (disk block) buffers to
allocate to the buffer pool while relocating the file system
data. (Default is 512; it cannot be set below 20.)
Finally, /dev/name should be the device to be defragmented; an
image file may also be used (for debugging purposes), as
edefrag does not check that the file is a block device.
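For instance, a verbose run with a 1024-block buffer pool (the device
name is just an example) could be started as:

    $ edefrag -v -p 1024 /dev/hda3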
HINTS
=====
You may want to experiment with edefrag to find the best memory usage
before defragmenting. Currently, the significant tables held in
memory by edefrag are:
Relocation maps - 8 bytes per block.
Inode table - 64 bytes per inode.
Inode maps - 8 bytes per inode.
The buffer pool must be added on top of this.
For a typical file system, this works out at around 26K of memory
required per MB of disk space, or 2.6MB memory for a 100MB disk
partition; plus the buffer pool.
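For the curious, the 26K figure works out like this: 8 bytes per block
is 8K per MB of disk, and the remaining 18K or so corresponds to about
one inode per 4K of disk space at 64+8 bytes each.  Here is a
throwaway C sketch of the arithmetic; the one-inode-per-4K density is
only an assumption for illustration, and your real inode count will
differ:

    #include <stdio.h>

    int main(void)
    {
        unsigned long disk_mb   = 100;          /* partition size, MB */
        unsigned long pool_bufs = 512;          /* -p pool_size       */

        unsigned long blocks = disk_mb * 1024;  /* 1KB blocks         */
        unsigned long inodes = blocks / 4;      /* assumed density    */

        unsigned long bytes = blocks * 8        /* relocation maps    */
                            + inodes * (64 + 8) /* inode table + maps */
                            + pool_bufs * 1024; /* buffer pool        */

        printf("roughly %lu KB of memory needed\n", bytes / 1024);
        return 0;
    }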
It is safe to use a swap file or partition if memory is tight (but NOT
one on the file system being defragmented!); this may not even affect
performance much, since during its first (mapping) phase, the
defragmenter accesses the inode table but not the buffer pool; during
the second (relocating) phase, the inode table is unused and the
buffer pool comes into play.
(Don't worry about the defragmenter suddenly running out of memory
during its work; all the memory required is allocated and initialised
before it starts operation, so any memory errors should occur before
the file system gets touched.)
The defragmenter tries as hard as possible to group reads and writes
into long sequential accesses. Data being overwritten on the disk
gets put into a rescue buffer, and may soon just get written back
during the normal course of sequential writes. However, if the buffer
pool is too small or the disk is highly fragmented, edefrag tries to
clear out the rescued data by seeing if its final destination is empty
yet. (These are termed "migrate" writes; the data migrates from the
rescue pool to the output pool.) If that fails to free enough space,
edefrag forces some of the rescue buffers out into empty blocks
("forcing" writes), from which the data will have to be re-read at
some point.
The upshot of this is that normal buffer writes are highly sequential
and efficient; "migrate" writes are slightly less sequential, but
still quite efficient; and "forcing" writes cause data to be read
twice, and from this point of view are quite inefficient.
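In code, the policy for a single rescued block amounts to something
like the sketch below.  This is NOT the code from BUFFERS.C; the names
and data structures are invented purely to illustrate the difference
between "migrate" and "forcing" writes:

    #include <stdio.h>

    #define NBLOCKS 16

    /* 1 = block still holds data waiting to be relocated, 0 = vacated */
    static int occupied[NBLOCKS] = { 1,1,0,1, 0,0,1,1, 1,0,1,1, 0,1,1,0 };

    enum write_kind { WRITE_MIGRATE, WRITE_FORCE, WRITE_NONE };

    /* Decide how a rescued block whose final home is 'dest' is flushed. */
    static enum write_kind flush_rescued(unsigned dest, unsigned *where)
    {
        unsigned b;

        if (!occupied[dest]) {      /* final destination already empty:  */
            *where = dest;          /* a "migrate" write - the data goes */
            return WRITE_MIGRATE;   /* straight to where it belongs      */
        }
        for (b = 0; b < NBLOCKS; b++)
            if (!occupied[b]) {     /* otherwise park it in any empty    */
                *where = b;         /* block; it must be re-read later,  */
                return WRITE_FORCE; /* so this is a "forcing" write      */
            }
        return WRITE_NONE;          /* no space at all: stays rescued    */
    }

    int main(void)
    {
        unsigned where;
        if (flush_rescued(3, &where) == WRITE_FORCE)
            printf("data for block 3 forced out to block %u\n", where);
        return 0;
    }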
Running edefrag with the -r option will scan your file system
non-destructively, and will report on the work it would have to do to
defragment the disk. This facility can be used to tune the requested
pool size, trading memory use against defragmenting efficiency.
For example, I have just run:
$ edefrag -r /dev/hda3 [ default 512K buffer pool ]
[ ... superblock statistics deleted ... ]
Relocation statistics:
44807 buffer reads in 91 groups, of which:
14004 read-aheads.
44807 buffer writes in 91 groups, of which:
0 migrations, 0 forces.
$ edefrag -r -p 100 /dev/hda3
[ ... superblock statistics deleted ... ]
45299 buffer reads in 618 groups, of which:
13310 read-aheads.
45299 buffer writes in 618 groups, of which:
202 migrations, 492 forces.
The first result indicates a higher efficiency with 512 buffers
than with 100. However, even the second run would have been quite
quick; 492 forces out of a 90MB file system is not bad. (By the way,
the reason the total amount written is well under 90MB is that much
of my hard disk was already defragmented anyway. 8-)
If, however, my disk had been badly fragmented (as it used to be...) I
would probably have had to allocate around 2000-4000 buffers to get
good efficiency with few forced writes.
The tradeoff is that the less memory you allocate for pool buffers,
the more is available for the kernel to cache reads itself. Since the
kernel reads entire tracks at a time, leaving space to the kernel
effectively gives extra "free" buffer reads.
I'm not yet quite sure whether it is more efficient to leave the
kernel with a healthily large cache for itself, or to allocate as much
memory as possible to edefrag's own (more task-optimised) buffering
scheme. You may want to experiment here, and I would be interested in
hearing any conclusions you reach. I am running with 16MB of RAM, so
if you have less your mileage may vary.
WARRANTY:
=========
NONE. Use at your own risk. BACK UP ANY IMPORTANT DATA BEFORE YOU
START.
I have successfully run edefrag on my own 90MB root extfs partition
at home. It has been tested on particularly hard jobs, such as
defragmenting a 1.44MB floppy with a buffer pool restricted to 20KB -
lots of extra writes are necessary to cope with a tiny buffer pool.
This release has never crashed for me, and has never lost me any data.
I am confident enough to use it fairly regularly, and when I back up
data before using it, I only back up stuff which cannot be reinstalled
from other sources. I have tried as far as possible to ensure that
edefrag will not harm your data. However, I cannot make ANY guarantee
that it won't. Use it and enjoy it, but don't blame me if it ruins
your day.
Having said that, if you DO have problems, let me know and I'll try to
fix them for the next release. (Even better, send me bug fixes!)
TO DO:
======
Proper bad block support for minix file systems is still missing
(basic minix support arrived in release 0.3). Watch this space.
When the mark 2 extfs is released by Remy Card, I should support
that, too.
I currently read in the entire inode table before starting, and write
it out again at the end. This is really a throw-back to edefrag's
origins in efsck. Since I no longer access the inodes at all after
initially calculating the disk relocation maps, I could probably get
away with just accessing inode data as needed, so using less memory.
Otherwise, try sharing memory between the inode table and the buffer
pool, since the two are never used at the same time.
The verbose (-v) option could do with a little rationalisation, and an
interactive (maybe full screen?) mode showing progress would be nice.
The sync() frequency should probably be configurable at run-time.
===
Stephen Tweedie (sct@dcs.ed.ac.uk).