Chapter 40. Disk Data Structure

A disk consists of a linear array of 512-byte sectors, each addressed by a sector number. MMFS aggregates these sectors into blocks, which are typically 128, 256 or 512 KiB in size. These blocks are the basic unit of addressing and allocation for MMFS. File data is transferred between the disk and memory in whole blocks, but metadata is accessed in smaller segments.
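The relationship between sector and block addressing can be sketched as follows. This is a minimal illustration rather than MMFS source code: the function names are hypothetical, and the figure of 512 sectors per block (256 KiB) is just one of the typical sizes mentioned above.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u  /* bytes per sector, as stated in the text */

/* Size of one block in bytes, given the block size in sectors. */
static uint32_t block_bytes(uint32_t sectors_per_block)
{
    return sectors_per_block * SECTOR_SIZE;
}

/* First sector occupied by a given block. */
static uint64_t block_to_sector(uint32_t block, uint32_t sectors_per_block)
{
    return (uint64_t)block * sectors_per_block;
}

/* Block that contains a given sector. */
static uint32_t sector_to_block(uint64_t sector, uint32_t sectors_per_block)
{
    return (uint32_t)(sector / sectors_per_block);
}
```

With 512 sectors per block, block_bytes(512) is 262144 bytes (256 KiB) and block 3 begins at sector 1536.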

The disk is divided into four areas: the directory, the free list, the block allocation tables (BATs), and the data area. The following sections describe these in detail.

40.1. Directory

The directory occupies the first one or two blocks of the disk. It consists of an array of directory entries. Each directory entry contains the following fields:

type

Entry type:

MMFS_TYPE_EMPTY: Unused entry. Available for allocation.

MMFS_TYPE_VOLUME: Volume label. The data field contains a volume label that describes the format of the filesystem.

MMFS_TYPE_FILE: File. A standard data file.

MMFS_TYPE_RESERVED: Reserved entry. An entry that exists only to occupy a directory slot. Used to protect the volume label against overwriting while updating adjacent entries.

bat
The block number of the Block Allocation Table for this file.
size
File size in bytes. For streamed files this reflects the number of data blocks in the BAT. For random access files this is the offset of the last byte of the last block in the BAT.
created
Creation time. A timestamp in seconds since the epoch implemented by the system wallclock. If no wallclock is present then this merely records the time since the last system restart.
state

File state. This records the state of the file and aids in system recovery. The possible states are:

MMFS_STATE_CREATING: The file is open for creation and is being actively extended.

MMFS_STATE_CREATED: The file has been closed. A streamed file can no longer be extended, but a random access file may still be extended in this state.

MMFS_STATE_DELETING: The file is being deleted.

checksum
Checksum over the directory entry. This is used to verify that the directory entry is correct and consistent.
data
Per-entry data. The contents of this field depend on the entry type. For the volume label it contains the filesystem format parameters. For files it may contain user-specified data. It is unused in other entry types.
name
File name. A zero-terminated string naming the file. This field is 64 bytes long so, allowing for the terminating zero, filenames may be at most 63 bytes long. There are no limits on the characters allowed.
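The fields above could be laid out in C along the following lines. This is only a sketch: the text gives the total entry size (256 bytes), the name size (64 bytes) and the data size (160 bytes), but the widths of the remaining fields, the padding word and all identifier names here are assumptions chosen so that the sizes add up.

```c
#include <stdint.h>
#include <string.h>

#define MMFS_NAME_SIZE 64   /* name field size, from the text */
#define MMFS_DATA_SIZE 160  /* data field size, from the text */

/* Hypothetical layout: field widths and the pad word are assumptions. */
struct mmfs_direntry_sketch {
    uint32_t type;      /* one of the MMFS_TYPE_* values */
    uint32_t bat;       /* block number of this file's BAT */
    uint64_t size;      /* file size in bytes */
    uint32_t created;   /* creation time in seconds */
    uint32_t state;     /* one of the MMFS_STATE_* values */
    uint32_t checksum;  /* checksum over the entry */
    uint32_t pad;       /* assumed padding to reach 256 bytes */
    uint8_t  data[MMFS_DATA_SIZE];
    char     name[MMFS_NAME_SIZE];
};

/* The whole entry must match the 256-byte direntry size. */
_Static_assert(sizeof(struct mmfs_direntry_sketch) == 256,
               "directory entry sketch must be 256 bytes");

/* Copy a name into the entry, zero-filling the rest of the field.
 * Returns 0 on success, -1 if the name is longer than 63 bytes. */
static int direntry_set_name(struct mmfs_direntry_sketch *e, const char *name)
{
    size_t len = strlen(name);
    if (len > MMFS_NAME_SIZE - 1)
        return -1;
    memset(e->name, 0, sizeof e->name);
    memcpy(e->name, name, len);
    return 0;
}
```

With 256-byte entries, a 256 KiB block holds 1024 directory entries.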

The data field of a volume label contains the disk format parameters, which consist of the following sub-fields:

signature1
Volume label signature. This is used, together with signature2, to ensure that this is a valid volume label. If these two fields do not contain the expected values then the disk is presumed to be new or corrupt and the filesystem will reformat it.
version
Filesystem version number. Together with the revision number this is used to determine which version of the filesystem formatted this disk.
revision
Filesystem revision number.
sector_size
The size of sectors on this disk. This should match the sector size reported by the disk device itself. At present only sectors of 512 bytes are supported.
phys_block_size
Physical block size. This is the size of the physical blocks supported by this disk. This may differ from the sector size in some cases.
block_size
Size of MMFS blocks in sectors.
disk_size
Total number of blocks on the disk. If the number of sectors on the disk is not an exact multiple of the block_size then the last partial block will be unused.
rootdir_size
Size of the directory in blocks.
freelist_start
Block address of the first block of the free list. This will be just after the directory.
freelist_size
Size of the free list in blocks. This is calculated from disk_size so that the free list is large enough to contain all the blocks on the disk.
bat_size
Size of each Block Allocation Table in blocks. The number of blocks per BAT is set during the formatting process.
bat_count
Number of BATs. The number of BATs is set during the formatting process and defaults to 200.
direntry_size
Size of a directory entry in bytes. This is currently fixed at 256 bytes. It is present to permit changes to the directory entry size in the future.
name_size
Size of the name field in a directory entry. This is currently fixed at 64 bytes. It is present to permit changes to the size of the name field in the future.
data_size
Size of the data field in a directory entry. This is currently fixed at 160 bytes. It is present to permit changes to the size of the data field in the future.
signature2
Second signature word.
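The derived parameters disk_size and freelist_size might be computed at format time roughly as shown here. This is a sketch under stated assumptions: the function names are hypothetical, and the free list is assumed to hold one 32-bit block number per entry, which the text does not specify.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u  /* only 512-byte sectors are supported */

/* Number of whole blocks on the disk. block_size is in sectors;
 * any trailing partial block is left unused. */
static uint32_t calc_disk_size(uint64_t total_sectors, uint32_t block_size)
{
    return (uint32_t)(total_sectors / block_size);
}

/* Blocks needed for a free list with one entry per block on the disk.
 * Assumes each entry is a 32-bit block number. */
static uint32_t calc_freelist_size(uint32_t disk_size, uint32_t block_size)
{
    uint64_t entries_per_block =
        ((uint64_t)block_size * SECTOR_SIZE) / sizeof(uint32_t);
    /* Round up to a whole number of blocks. */
    return (uint32_t)((disk_size + entries_per_block - 1) / entries_per_block);
}
```

For example, a disk of 1,000,000 sectors with 512-sector blocks has a disk_size of 1953 blocks, with the final 64 sectors unused, and under the 32-bit assumption its free list fits in a single block.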

40.2. Free List

The free list occupies a whole number of blocks following the directory. It is viewed as an array of block numbers and is large enough to contain the number of every block on the disk, plus enough spare entries to round it up to a whole number of blocks.

The free list is organized as a circular list with a head and a tail. Blocks are allocated from the head and are freed to the tail. When the head and tail pointers reach the end of the free list, they wrap back around to zero. These pointers are not stored on the disk but are discovered each time the filesystem is mounted by scanning the free list.
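The head and tail behaviour described above can be illustrated with a small in-memory ring. This is a sketch only: the structure, the function names, the tiny capacity and the count field are all assumptions, and the on-disk representation of consumed slots and the mount-time scan are not modelled because the text does not describe them.

```c
#include <stdint.h>

#define FREELIST_SLOTS 8u  /* tiny capacity, for illustration only */

struct freelist_sketch {
    uint32_t slot[FREELIST_SLOTS]; /* block numbers */
    uint32_t head;   /* next slot to allocate from */
    uint32_t tail;   /* next slot to free into */
    uint32_t count;  /* free blocks currently listed (assumed field) */
};

/* Take a block from the head. Returns 0 on success, -1 if none are free. */
static int freelist_alloc(struct freelist_sketch *fl, uint32_t *block)
{
    if (fl->count == 0)
        return -1;
    *block = fl->slot[fl->head];
    fl->head = (fl->head + 1) % FREELIST_SLOTS;  /* wrap back to zero */
    fl->count--;
    return 0;
}

/* Return a block at the tail. Returns 0 on success, -1 if the list is full. */
static int freelist_free(struct freelist_sketch *fl, uint32_t block)
{
    if (fl->count == FREELIST_SLOTS)
        return -1;
    fl->slot[fl->tail] = block;
    fl->tail = (fl->tail + 1) % FREELIST_SLOTS;  /* wrap back to zero */
    fl->count++;
    return 0;
}
```

Because allocation and freeing touch opposite ends of the ring, a block that has just been freed is handed out again only after every block ahead of it has been allocated.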

The free list is organized in this way for several reasons. First, it separates block allocation from freeing. Allocations need to proceed at a rate determined by the streaming of data onto the disk. Blocks are only freed when a file is deleted, so freeing can be handled as a background task. Second, the separation makes recovery of filesystem integrity simpler, since blocks will not be reused immediately after they are freed. Third, blocks that are allocated together in a particular file will be returned to the free list together, preserving locality.

40.3. Block Allocation Tables

The BAT area follows the free list. The size and number of BATs are defined when the filesystem is formatted. Each BAT is an array of block addresses for the blocks that contain the data of a file. The number of BATs gives a hard upper limit to the number of files permitted. Usually this is set to equal the number of directory entries. There is little point in making it larger, but it may be useful to set it smaller if the minimal size of the directory exceeds the desired maximum number of files.

The size of each BAT represents a hard upper limit on the size of a file. BAT size should be set to cover the expected range of file sizes. Larger data sets can be handled at application level by splitting the data across several files.
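The file size limit implied by a given BAT size can be estimated as follows. This sketch assumes that each BAT entry is a 32-bit block address and that the whole BAT is available for entries. Neither detail is stated in the text, and the function name is hypothetical.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Upper bound on file size in bytes implied by the BAT size.
 * bat_size is in blocks and block_size is in sectors.
 * Assumes each BAT entry is a 32-bit block address. */
static uint64_t max_file_bytes(uint32_t bat_size, uint32_t block_size)
{
    uint64_t bytes_per_block = (uint64_t)block_size * SECTOR_SIZE;
    uint64_t entries = (uint64_t)bat_size * bytes_per_block / sizeof(uint32_t);
    return entries * bytes_per_block;
}
```

Under these assumptions a one-block BAT with 256 KiB blocks holds 65536 entries, giving a limit of 16 GiB per file. With 128 KiB blocks the limit drops to 4 GiB.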

40.4. Data Area

The last, and largest, area is the file data area. This comprises the rest of the disk following the last BAT. During formatting each block in this area is added to the free list.
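Putting the four areas together, the starting block of each area follows from the volume label parameters. The sketch below assumes the areas are packed contiguously in the order described in this chapter. The structure and function names are hypothetical.

```c
#include <stdint.h>

struct mmfs_layout_sketch {
    uint32_t dir_start;      /* directory begins at block 0 */
    uint32_t freelist_start; /* just after the directory */
    uint32_t bat_start;      /* just after the free list */
    uint32_t data_start;     /* just after the last BAT */
    uint32_t data_blocks;    /* blocks added to the free list at format */
};

/* Compute area boundaries in blocks.
 * Precondition: the metadata areas fit within disk_size. */
static struct mmfs_layout_sketch
calc_layout(uint32_t disk_size, uint32_t rootdir_size,
            uint32_t freelist_size, uint32_t bat_size, uint32_t bat_count)
{
    struct mmfs_layout_sketch l;
    l.dir_start = 0;
    l.freelist_start = rootdir_size;
    l.bat_start = l.freelist_start + freelist_size;
    l.data_start = l.bat_start + bat_size * bat_count;
    l.data_blocks = disk_size - l.data_start;
    return l;
}
```

For a 1953-block disk with a one-block directory, a one-block free list and 200 one-block BATs, the data area starts at block 202 and contains 1751 blocks.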