Case Study: A Simple Filesystem

Overview
Basic Structure
Creating the Filesystem
Creating files
Deleting files
Fragmentation

Implementation Details
Why the FAT?
Filesystem Tools
Development Tools
The Big Picture
References

At its most basic level a filesystem is simply one very large file that contains... other files. That's it. How these files are stored and retrieved is the subject of this document. By looking at the number of filesystems around (dozens to hundreds), you can see that there are many different techniques used to implement them and all of them have their pros and cons. Some are almost brain-dead simple and others are massively complex. We are going to look at the simpler methods, as this is an introduction.

This concept of a file storing other files is very common and you see/use them everyday without even realizing it. These files can be thought of as "mini-filesystems" because the files they operate on are just wrappers for many other files contained within. For example:

zip files are files that contain other (compressed) files.
Microsoft ".docx" files are basically zip files.
Call of Duty .iwd files are also zip files.

To see this, simply run this command on any of the file types above:

unzip -lv filename

and you'll see a directory of all files in the file. If you want to test the integrity of all of the files in the archive, use the -t option:

unzip -t filename

Applications that distribute many files will often wrap them all into one really large file. It's much easier to manage a single file than thousands of separate ones. Some examples (from my youth!)

Doom WAD Format This is the format used by Doom and related games.
Quake BSP Format This is the format used by Quake and related games.
Valve's BSP Format From Half-Life and Counter Strike games.

The most fundamental property of a filesystem is that it is just a data structure, not unlike a linked list or array. In fact, some of the earliest filesystems were simple arrays and linked lists. The complexity of the data structure is directly related to the size, performance (speed), and information stored in the filesystem. Because not all systems require the same level of performance, there are hundreds of different filesystems.

Some are very simple and others are massively complex. Think of an iPod or TiVo or even a cell phone. The number of files these devices contain may be measured in the thousands (10³). These devices have radically different needs than say the server farms at Google or Facebook, where the number of files are measured in the billions (10⁹) and beyond. Those filesystems contain millions of terabytes of data as well. Most home users don't have anywhere near this kind of requirement.

To help understand the problems and challenges that filesystem developers are trying to solve, we will build a simple, yet effective, filesystem that can work for small-scale devices. After completing it, we will look at many other features that can be added to make it faster, more scalable, and more reliable and resilient.

Here is some of the information that we will need to store about a file:

The file's name. (This is for humans.)
The size of the file (in bytes).
Where on the disk is the file stored? (This is the starting block.)

Things that we are not going to concern ourselves with at this time (to keep things simple):

Time and dates when the file was created, modified, or accessed.
The owner or group that the file belongs to.
Permissions of the file (read, write, execute).
The type of file, e.g. directory, link, etc. (We will deal only with regular files.)
We won't be dealing with directories at all.
Many, many other features...

However, once we have the basic features working, we'll see how easy/difficult it would be to add the other features. Again, the earliest filesystems also didn't concern themselves with much other info, either.

The goal of this is not to create a Real-World^™ Filesystem, but to understand how they work and can be implemented.

Simplified view of a very simple filesystem (One Big File):

The header consists of a few dozen to a few hundred bytes and describes the layout of the filesystem.

e.g. the number of files, the number of data blocks, the size of each data block, etc.

There is one directory entry for each file in the filesystem.

Each entry includes the name of the file, the starting block, and the size.

There are one or more blocks (file contents) for each directory entry in the filesystem.

This is what a real FAT directory entry looks like.

BTW, the reason we are not going to try to implement a modern filesystem (or even a partial one) is simply the sheer complexity. Take a look at what the primary file system is for Windows: NTFS Specs (local copy in case the link is dead.) Just look at the hundreds and hundreds of options and details for that file system. Compare that to a "trivial" FAT-type file system and you can see how far they've come.

Basic Structure

At the most basic level, each file is going to be represented by a singly-linked list of blocks. The disk block is the smallest unit that can be read/written by the filesystem. Typical block sizes in bytes are 512, 1K, 2K, 4K, 8K ... up to 64K. Another term you will hear for block is sector.

So, for example, a file that is 1,400 bytes that is stored in a fileystem with 512 byte blocks will require 3 blocks. The first 2 blocks will be full, totaling 1024 bytes (512 * 2) and the third block will store the "residual" 376 bytes (with 136 bytes unused). Logically, the blocks will be stored as a linked-list of 3 blocks:

Our filesystem is going to be loosely based on the original FAT (File Allocation Table) filesystem that was used by MSDOS back in the 1980s. The FAT filesystem is arguably still the most popular filesystem in use today, although it has been modified somewhat (VFAT, FAT32, exFAT) to deal with the technological demands of newer devices. e.g. larger drives, long filenames, very large files, etc.

We'll call the filesystem SFAT for Simple FAT because, although the original FAT filesystem was very simple, ours is going to be even simpler. Remember, this is an instructional video! You are not going to go out and replace Google's filesystem anytime soon!

The SFAT filesystem will have 4 basic sections. These sections are the

super block - This describes the layout of the entire filesystem, e.g. number of sectors, sectors per cluster, bytes per sector, etc.
directory area - This maps the file's name to the disk blocks that contain the contents.
file allocation table - This keeps track of which data blocks are in-use and which are free using linked-list techniques.
data area - This is the bulk of the filesystem and is where the contents of all of the files are actually stored.

This is a graphical view of the four sections. Note that the sections are not to scale, as the data blocks section likely consumes 95% or more of the filesystem.

The relationship between the directory entry, file allocation table, and data blocks.

Close-up of the FAT entries:

Given this configuration for the filesystem:

Attribute Value

total_sectors 8192

sectors_per_cluster 1

bytes_per_sectors 512

total_direntries 1024

Attribute	Value
total_sectors	8192
sectors_per_cluster	1
bytes_per_sectors	512
total_direntries	1024

Here are some values:

The superblock is always 32 bytes, as that is enough to store all of the information about the filesystem itself, regardless of how many files are actually contained within.

Details of the super block:

Diagram SuperBlock C structure

typedef struct SuperBlock { unsigned short total_sectors; // Total number of sectors unsigned short sectors_per_cluster; // Number of sectors in each block unsigned short bytes_per_sector; // Number of bytes in each sector unsigned short available_sectors; // Number of sectors available unsigned short total_direntries; // Total number of directory entries unsigned short available_direntries; // Number of available dir entries unsigned char fs_type; // File system type (FA for SFAT) unsigned char reserved[11]; // Reserved, all set to 0 unsigned char label[8]; // Not guaranteed to be NUL-terminated }SuperBlock;

There are a total of 8K (8192) sectors.
Each sector is 512 bytes. This means we can store up to 4 MB (8K * 512) bytes of data in the filesystem.
There is one sector per cluster. This keeps things much simpler. (More on this later)
The maximum number of files (directory entries) that can be stored is 1K (1024).
Each directory entry requires 16 bytes to store the name of the file (10 bytes max), the starting block (unsigned short int, 2 bytes) and the file size (unsigned int, 4 bytes).

Details of a directory entry:

There will be an array of directory entries. The number of elements in the array is the number of directory entries specified when the filesystem was created.

Diagram DirEntry C structure

typedef struct DirEntry { unsigned char name[10]; // Not guaranteed to be NUL-terminated unsigned short fat_entry; // First data block unsigned int size; // Size of the file in bytes }DirEntry;

The directory area consumes 16K, which is 1024 (max number of files) * 16 (size of each entry).
The file allocation table consumes 16 K also, which is 8192 (number of sectors) * 2 (size of an unsigned short int).
The data area is 4 MB (4,194,304 bytes) which is 8192 (number of sectors) * 512 (bytes per sector).

If we add up the sizes we can see that the total size of the filesystem is:

 superblock            directory area       FAT area         data area
     32        +            16,384      +    16,384    +     4,194,304	= 4,227,104

You can easily see that by changing the number of sectors and/or the size of each sector, we can store more or less data in the filesystem. Typical filesystems have millions or billions of sectors and each sector may be 2K to 8K in size giving us a filesystem that can hold billions or trillions of bytes! Our simple filesystem won't be doing anything close to that!

We can also see the limitations of the filesystem based on this structure:

The maximum number of files is 1024, which was given when we created the filesystem.
The maximum individual file size is 4 MB (4,194,304). This would mean that the filesystem only contained a single file that consumed the entire data area!
The maximum length of a filename is 10 bytes. Again, chosen when the filesystem was created.

None of these limitations are insurmountable, it's just a matter of requiring more space on the disk. Most modern filesystems have much, much higher limits on these things, e.g. unlimited number of files, filenames up to 255 bytes long, individual files up to 2 TB in size, etc.

The diagram below shows the relationship between the file allocation table and the data area (using the numbers from above). There is exactly one FAT entry for each data block in the filesystem. This is what the FAT and data area would look like immediately after creation (and before any files are stored):

With most linked lists you've been dealing with, you normally have a "next" pointer as part of the node (block). In this case, there are no next pointers stored in each data block. Instead, the file allocation table is used for that purpose. You can think of the FAT as an array of all of the next pointers that would be stored in the data blocks. That's why there is a one-to-one correspondence with the number of FAT entries, and the number of data blocks. We are simply storing the "next" pointers outside of the data blocks themselves. This means that the data blocks are storing only data; not any next pointers or other information. The diagrams below will clarify this.

Note that the size of the super block is always 32 bytes and the size of each directory entry is always 16 bytes. What varies are the number of sectors (total_sectors), sectors_per_cluster, the size of each sector (bytes_per_sector), and the number of directory entries (total_direntries).

In essence:

The size of the fileystem (data blocks) is dependent on the number of sectors and the size of each sector. Simply multiply the two numbers together to get the number of bytes available.
The number of files is dependent on the number of directory entries.

Note: Some filesystems do not put a limit on the number of files that a filesystem can contain because the directory area can grow. To keep things manageable for our simple filesystem, we have a fixed-size directory area that is set when the filesystem is created and cannot grow at a later time. Once the directory entries are all in use, the file system can be considered "full", even if there are more data blocks available.

Given a file named Foo that has a size of 1,400 bytes, it might cause the filesystem to now look like this:

Directory entry (16 bytes in size, 10/2/4):

FAT and data blocks without file fragmentation. A zero indicates an unused block and FFFF (hex) represents the end of the list of blocks.

You may realize that the file is not fragmented, meaning, all of the data blocks are contiguous. This is not guaranteed and is usually not the case, especially with a lot of activity like deleting files and adding files. It is entirely possible that the filesystem could end up looking more like this after a bunch of activity:

Directory entry:

FAT and data blocks (file fragmentation):

Self-check:
1. How does the filesystem know how many bytes are stored in the last block (the "residual" bytes) since that information isn't in the FAT nor the data block itself?
2. What happens with the remaining unused bytes in the last data block?
3. What can you say about the size of file A?
4. What can you say about the size of file B?

Creating the Filesystem

Let's create an empty filesystem so we can add files to it. After creating the empty filesystem using the configuration parameters above, this is what a raw dump of the filesystem looks like. Since there are no files stored yet, only the super block has any information.

Raw filesystem dump (240 bytes) Super block information (32 bytes)

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 08 00 04 00 04 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ dir. entries | 000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ FAT | 000060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ / 000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 0000A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ data area | 0000B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 0000C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 0000D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 0000E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 8 Directory entries: 4 Available direntries: 4 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: When a "real" filesystem is created there is no guarantee that all of the data blocks have been initialized to 0. What will likely happen is that the values will be whatever random garbage happens to be on the disk at that location when the filesystem was created. The directory area and file allocation table (FAT) will likely be initialized to 0.

Raw filesystem dump (240 bytes)		Super block information (32 bytes)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 08 00 04 00 04 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ dir. entries \| 000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ FAT \| 000060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ / 000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 0000A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ data area \| 0000B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 0000C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 0000D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 0000E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................		Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 8 Directory entries: 4 Available direntries: 4 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: When a "real" filesystem is created there is no guarantee that all of the data blocks have been initialized to 0. What will likely happen is that the values will be whatever random garbage happens to be on the disk at that location when the filesystem was created. The directory area and file allocation table (FAT) will likely be initialized to 0.

The first 32 bytes are for the super block.
The next 64 bytes (4 * 16) are for the directory entries.
The next 16 bytes (8 * 2) are for the file allocation table (FAT)
The remaining 128 bytes (8 * 16) is the data area.

Note: Since our filesystem is so small and basic, it only takes a few lines of code (and a fraction of a second) to create an empty filesystem. More complex filesystems can take much longer to create as they must construct much more sophisticated data structures to manage billions and trillions (and more!) of bytes of data. Remember this?

Creating Files

Here are 4 small files that we'll use to demonstrate. We'll look at the entire filesystem after each one has been stored. Each line ends with an invisible newline (<NL>). The number in the name of the file reflects how many characters are in the file.

file9.txt	file16.txt	file23.txt	file61.txt
9 chars.<NL>	16 characters..<NL>	Exactly 23 characters.<NL>	Roses are red. Violets are blue.<NL> This text is 61 chars long.<NL>

This is the configuration of the filesystem:

Attribute Value

total_sectors 8

sectors_per_cluster 1

bytes_per_sectors 16

total_direntries 4

Attribute	Value
total_sectors	8
sectors_per_cluster	1
bytes_per_sectors	16
total_direntries	4

Given this configuration (i.e. 16 bytes per sector) and the size of each file above, we know that:

file9.txt will require 0 full sectors and 1 partial sector.
file16.txt will require 1 full sector and 0 partial sectors.
file23.txt will require 1 full sector and 1 partial sector.
file61.txt will require 3 full sectors and 1 partial sector.

In fact, all files will have some combination of full sectors and partial sectors. The only time you will not have a partial sector is if the size of the file is a multiple of the number of bytes per sector. The overwhelming majority of the files will not be a perfect multiple and will, therefore, require a partial sector. A file will never have more than one partial sector. This partial sector or "wasted" bytes is known as internal fragmentation (just like with memory pages).

Note: As you can imagine, the larger the sector (bytes per sector), the more wasted bytes you are likely to have. On the other hand, smaller sectors lead to many small blocks which require a lot more overhead to manage and therefore will have a negative impact on performance. On average, each file in the filesystem will waste 1/2 of a block. As always in computer science, it's a trade-off.

Note: Since all files require at least one block, even a file that is a single byte will require an entire block to store. More than 99% of the bytes in the block will be wasted.

After creating file9.txt:

Raw filesystem dump (240 bytes) Super block information (32 bytes)

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 07 00 04 00 03 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries | 000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ FAT | 000060 FF FF 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... | 000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 0000A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ data area | 0000B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 0000C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 0000D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 0000E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 7 Directory entries: 4 Available direntries: 3 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: Each new file will decrement the value of Available direntries. Each block that is consumed by a new file will decrement the value of Available sectors.

Raw filesystem dump (240 bytes)		Super block information (32 bytes)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 07 00 04 00 03 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries \| 000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ FAT \| 000060 FF FF 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... \| 000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 0000A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ data area \| 0000B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 0000C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 0000D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 0000E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................		Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 7 Directory entries: 4 Available direntries: 3 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: Each new file will decrement the value of Available direntries. Each block that is consumed by a new file will decrement the value of Available sectors.

When creating a new file, all four sections will require some modifications:

Superblock - Update available_sectors and available_direntries.
Directory entries - Update a directory entry with the filename, the FAT entry, and the file size.
FAT - Update the corresponding FAT entry.
Data area - Use the block(s) pointed to by the FAT entry/entries.

Showing the empty filesystem with the updated one.

After creating file16.txt:

Raw filesystem dump (240 bytes) Super block information (32 bytes)

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 06 00 04 00 02 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries | 000030 66 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 file16.txt...... | 000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ FAT | 000060 FF FF FF FF 00 00 00 00 00 00 00 00 00 00 00 00 ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... | 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... | 000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 0000A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ data area | 0000B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 0000C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 0000D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 0000E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 6 Directory entries: 4 Available direntries: 2 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: Files can't "share" data blocks, so the "extra" space in the last block of file9.txt is wasted and can't be used for any other file.

Raw filesystem dump (240 bytes)		Super block information (32 bytes)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 06 00 04 00 02 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries \| 000030 66 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 file16.txt...... \| 000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ FAT \| 000060 FF FF FF FF 00 00 00 00 00 00 00 00 00 00 00 00 ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... \| 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... \| 000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 0000A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ data area \| 0000B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 0000C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 0000D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 0000E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................		Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 6 Directory entries: 4 Available direntries: 2 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: Files can't "share" data blocks, so the "extra" space in the last block of file9.txt is wasted and can't be used for any other file.

After creating file23.txt:

Raw filesystem dump (240 bytes) Super block information (32 bytes)

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 04 00 04 00 01 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries | 000030 66 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 file16.txt...... | 000040 66 69 6C 65 32 33 2E 74 78 74 02 00 17 00 00 00 file23.txt...... \ 000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ FAT | 000060 FF FF FF FF 03 00 FF FF 00 00 00 00 00 00 00 00 ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... | 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... | 000090 45 78 61 63 74 6C 79 20 32 33 20 63 68 61 72 61 Exactly 23 chara | 0000A0 63 74 65 72 73 2E 0A 00 00 00 00 00 00 00 00 00 cters........... data area | 0000B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 0000C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 0000D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 0000E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 4 Directory entries: 4 Available direntries: 1 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: At this point, there are 3 files in the system and they consume 4 data blocks. There is one directory entry and 64 data blocks remaining. The filesystem will be considered full when we run out of either directory entries or data blocks, whichever runs out first.

Raw filesystem dump (240 bytes)		Super block information (32 bytes)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 04 00 04 00 01 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries \| 000030 66 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 file16.txt...... \| 000040 66 69 6C 65 32 33 2E 74 78 74 02 00 17 00 00 00 file23.txt...... \ 000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ FAT \| 000060 FF FF FF FF 03 00 FF FF 00 00 00 00 00 00 00 00 ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... \| 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... \| 000090 45 78 61 63 74 6C 79 20 32 33 20 63 68 61 72 61 Exactly 23 chara \| 0000A0 63 74 65 72 73 2E 0A 00 00 00 00 00 00 00 00 00 cters........... data area \| 0000B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 0000C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 0000D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 0000E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................		Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 4 Directory entries: 4 Available direntries: 1 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: At this point, there are 3 files in the system and they consume 4 data blocks. There is one directory entry and 64 data blocks remaining. The filesystem will be considered full when we run out of either directory entries or data blocks, whichever runs out first.

After creating file61.txt:

Raw filesystem dump (240 bytes) Super block information (32 bytes)

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 00 00 04 00 00 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries | 000030 66 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 file16.txt...... | 000040 66 69 6C 65 32 33 2E 74 78 74 02 00 17 00 00 00 file23.txt...... \ 000050 66 69 6C 65 36 31 2E 74 78 74 04 00 3D 00 00 00 file61.txt..=... FAT | 000060 FF FF FF FF 03 00 FF FF 05 00 06 00 07 00 FF FF ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... | 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... | 000090 45 78 61 63 74 6C 79 20 32 33 20 63 68 61 72 61 Exactly 23 chara | 0000A0 63 74 65 72 73 2E 0A 00 00 00 00 00 00 00 00 00 cters........... data area | 0000B0 52 6F 73 65 73 20 61 72 65 20 72 65 64 2E 20 56 Roses are red. V | 0000C0 69 6F 6C 65 74 73 20 61 72 65 20 62 6C 75 65 2E iolets are blue. | 0000D0 0A 54 68 69 73 20 74 65 78 74 20 69 73 20 36 31 .This text is 61 \ 0000E0 20 63 68 61 72 73 20 6C 6F 6E 67 2E 0A 00 00 00 chars long.....

Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 0 Directory entries: 4 Available direntries: 0 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: The filesystem is now full. No directory entries and no data blocks are available.

Raw filesystem dump (240 bytes)		Super block information (32 bytes)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 00 00 04 00 00 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries \| 000030 66 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 file16.txt...... \| 000040 66 69 6C 65 32 33 2E 74 78 74 02 00 17 00 00 00 file23.txt...... \ 000050 66 69 6C 65 36 31 2E 74 78 74 04 00 3D 00 00 00 file61.txt..=... FAT \| 000060 FF FF FF FF 03 00 FF FF 05 00 06 00 07 00 FF FF ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... \| 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... \| 000090 45 78 61 63 74 6C 79 20 32 33 20 63 68 61 72 61 Exactly 23 chara \| 0000A0 63 74 65 72 73 2E 0A 00 00 00 00 00 00 00 00 00 cters........... data area \| 0000B0 52 6F 73 65 73 20 61 72 65 20 72 65 64 2E 20 56 Roses are red. V \| 0000C0 69 6F 6C 65 74 73 20 61 72 65 20 62 6C 75 65 2E iolets are blue. \| 0000D0 0A 54 68 69 73 20 74 65 78 74 20 69 73 20 36 31 .This text is 61 \ 0000E0 20 63 68 61 72 73 20 6C 6F 6E 67 2E 0A 00 00 00 chars long.....		Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 0 Directory entries: 4 Available direntries: 0 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: The filesystem is now full. No directory entries and no data blocks are available.

The filesystem is now full. There are no more directory entries available and there are no more data blocks. Any attempt to create another file will result in an error from the filesystem with a message along the lines of "No space left on device."

It turns out that with this example we ran out of directory entries at the same time we ran out of data blocks. This is an unlikely situation. Usually, we either run out of directory entries while still having free data blocks, or, we run out of data blocks while still having free directory entries.

Let's see what the filesystem would look like if the last file that was created was very small. In fact, I'm going to store the first file again, but with a different name:

After creating file9a.txt:

Raw filesystem dump (240 bytes) Super block information (32 bytes)

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 03 00 04 00 00 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries | 000030 66 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 file16.txt...... | 000040 66 69 6C 65 32 33 2E 74 78 74 02 00 17 00 00 00 file23.txt...... \ 000050 66 69 6C 65 39 61 2E 74 78 74 04 00 09 00 00 00 file9a.txt...... FAT | 000060 FF FF FF FF 03 00 FF FF FF FF 00 00 00 00 00 00 ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... | 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... | 000090 45 78 61 63 74 6C 79 20 32 33 20 63 68 61 72 61 Exactly 23 chara | 0000A0 63 74 65 72 73 2E 0A 00 00 00 00 00 00 00 00 00 cters........... data area | 0000B0 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... | 0000C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 0000D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 0000E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 3 Directory entries: 4 Available direntries: 0 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: The filesystem is still full, even though there are 3 data blocks available. Because there are no more directory entries available, no new files can be added to the filesystem. This is not uncommon when a filesystem has many small files.

Raw filesystem dump (240 bytes)		Super block information (32 bytes)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 03 00 04 00 00 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries \| 000030 66 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 file16.txt...... \| 000040 66 69 6C 65 32 33 2E 74 78 74 02 00 17 00 00 00 file23.txt...... \ 000050 66 69 6C 65 39 61 2E 74 78 74 04 00 09 00 00 00 file9a.txt...... FAT \| 000060 FF FF FF FF 03 00 FF FF FF FF 00 00 00 00 00 00 ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... \| 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... \| 000090 45 78 61 63 74 6C 79 20 32 33 20 63 68 61 72 61 Exactly 23 chara \| 0000A0 63 74 65 72 73 2E 0A 00 00 00 00 00 00 00 00 00 cters........... data area \| 0000B0 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... \| 0000C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 0000D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 0000E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................		Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 3 Directory entries: 4 Available direntries: 0 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: The filesystem is still full, even though there are 3 data blocks available. Because there are no more directory entries available, no new files can be added to the filesystem. This is not uncommon when a filesystem has many small files.

The filesystem is still technically full, even though there are 3 data blocks available. But, with no more directory entries available, you can't create a new file. Existing files can grow because they already have a directory entry and there are free data blocks, but no new files can be created.

Suppose that the last file was too big to store. What happens? I'm going to try and store this file: (poem.txt)

Roses are red.<NL>
Violets are blue.<NL>
Some poems rhyme.<NL>
But not this one.<NL>

This file is 69 bytes in size, but there are only 64 bytes (4 * 16) left in the filesystem. This is what the result will be when attempting to store the whole file:

After creating poem.txt:

Raw filesystem dump (240 bytes) Super block information (32 bytes)

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 00 00 04 00 00 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries | 000030 66 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 file16.txt...... | 000040 66 69 6C 65 32 33 2E 74 78 74 02 00 17 00 00 00 file23.txt...... \ 000050 70 6F 65 6D 2E 74 78 74 00 00 04 00 40 00 00 00 poem.txt....@... FAT | 000060 FF FF FF FF 03 00 FF FF 05 00 06 00 07 00 FF FF ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... | 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... | 000090 45 78 61 63 74 6C 79 20 32 33 20 63 68 61 72 61 Exactly 23 chara | 0000A0 63 74 65 72 73 2E 0A 00 00 00 00 00 00 00 00 00 cters........... data area | 0000B0 52 6F 73 65 73 20 61 72 65 20 72 65 64 2E 0A 56 Roses are red..V | 0000C0 69 6F 6C 65 74 73 20 61 72 65 20 62 6C 75 65 2E iolets are blue. | 0000D0 0A 53 6F 6D 65 20 70 6F 65 6D 73 20 72 68 79 6D .Some poems rhym \ 0000E0 65 2E 0A 42 75 74 20 6E 6F 74 20 74 68 69 73 20 e..But not this

Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 0 Directory entries: 4 Available direntries: 0 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: The last file was truncated because the filesystem ran out of data blocks. A truncated file will always have a size that is exactly a multiple of the bytes per sector.

Raw filesystem dump (240 bytes)		Super block information (32 bytes)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 00 00 04 00 00 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries \| 000030 66 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 file16.txt...... \| 000040 66 69 6C 65 32 33 2E 74 78 74 02 00 17 00 00 00 file23.txt...... \ 000050 70 6F 65 6D 2E 74 78 74 00 00 04 00 40 00 00 00 poem.txt....@... FAT \| 000060 FF FF FF FF 03 00 FF FF 05 00 06 00 07 00 FF FF ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... \| 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... \| 000090 45 78 61 63 74 6C 79 20 32 33 20 63 68 61 72 61 Exactly 23 chara \| 0000A0 63 74 65 72 73 2E 0A 00 00 00 00 00 00 00 00 00 cters........... data area \| 0000B0 52 6F 73 65 73 20 61 72 65 20 72 65 64 2E 0A 56 Roses are red..V \| 0000C0 69 6F 6C 65 74 73 20 61 72 65 20 62 6C 75 65 2E iolets are blue. \| 0000D0 0A 53 6F 6D 65 20 70 6F 65 6D 73 20 72 68 79 6D .Some poems rhym \ 0000E0 65 2E 0A 42 75 74 20 6E 6F 74 20 74 68 69 73 20 e..But not this		Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 0 Directory entries: 4 Available direntries: 0 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: The last file was truncated because the filesystem ran out of data blocks. A truncated file will always have a size that is exactly a multiple of the bytes per sector.

You'll notice that the last file was truncated. Only the bytes that could fit were stored. The size of the file in the filesystem is also modified to reflect this fact. A message would also be displayed saying something along the lines of "No space left on device."

This last situation shows what the filesystem looks like if we run out of data blocks, but still have directory entries available. This will likely happen with many large files. Here, I'm going to store one file in the filesystem that takes up all of the data blocks, but only requires one directory entry. It is one long line in the file. There are well over 128 characters, in the file, so it will definitely be truncated. (preamble)

When in the Course of human events, it becomes necessary
for one people to dissolve the political bands which have
connected them with another, and to assume among the powers
of the earth, the separate and equal station to which the 
Laws of Nature and of Nature's God entitle them, a decent
respect to the opinions of mankind requires that they
should declare the causes which impel them to the
separation.<NL>

After creating preamble:

Raw filesystem dump (240 bytes) Super block information (32 bytes)

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 00 00 04 00 03 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 70 72 65 61 6D 62 6C 65 00 00 00 00 80 00 00 00 preamble........ dir. entries | 000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ | 000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ FAT | 000060 01 00 02 00 03 00 04 00 05 00 06 00 07 00 FF FF ................ / 000070 57 68 65 6E 20 69 6E 20 74 68 65 20 43 6F 75 72 When in the Cour | 000080 73 65 20 6F 66 20 68 75 6D 61 6E 20 65 76 65 6E se of human even | 000090 74 73 2C 20 69 74 20 62 65 63 6F 6D 65 73 20 6E ts, it becomes n | 0000A0 65 63 65 73 73 61 72 79 20 66 6F 72 20 6F 6E 65 ecessary for one data area | 0000B0 20 70 65 6F 70 6C 65 20 74 6F 20 64 69 73 73 6F people to disso | 0000C0 6C 76 65 20 74 68 65 20 70 6F 6C 69 74 69 63 61 lve the politica | 0000D0 6C 20 62 61 6E 64 73 20 77 68 69 63 68 20 68 61 l bands which ha \ 0000E0 76 65 20 63 6F 6E 6E 65 63 74 65 64 20 74 68 65 ve connected the

Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 0 Directory entries: 4 Available direntries: 3 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: There is only one file in this filesystem, but the filesystem is full. There are directory entries available, but there are no data blocks available. No new files can be created.

Raw filesystem dump (240 bytes)		Super block information (32 bytes)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 00 00 04 00 03 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 70 72 65 61 6D 62 6C 65 00 00 00 00 80 00 00 00 preamble........ dir. entries \| 000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \| 000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ \ 000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ FAT \| 000060 01 00 02 00 03 00 04 00 05 00 06 00 07 00 FF FF ................ / 000070 57 68 65 6E 20 69 6E 20 74 68 65 20 43 6F 75 72 When in the Cour \| 000080 73 65 20 6F 66 20 68 75 6D 61 6E 20 65 76 65 6E se of human even \| 000090 74 73 2C 20 69 74 20 62 65 63 6F 6D 65 73 20 6E ts, it becomes n \| 0000A0 65 63 65 73 73 61 72 79 20 66 6F 72 20 6F 6E 65 ecessary for one data area \| 0000B0 20 70 65 6F 70 6C 65 20 74 6F 20 64 69 73 73 6F people to disso \| 0000C0 6C 76 65 20 74 68 65 20 70 6F 6C 69 74 69 63 61 lve the politica \| 0000D0 6C 20 62 61 6E 64 73 20 77 68 69 63 68 20 68 61 l bands which ha \ 0000E0 76 65 20 63 6F 6E 6E 65 63 74 65 64 20 74 68 65 ve connected the		Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 0 Directory entries: 4 Available direntries: 3 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3' Comments: There is only one file in this filesystem, but the filesystem is full. There are directory entries available, but there are no data blocks available. No new files can be created.

Like before, you'll notice that the last file was truncated. Only the bytes that could fit were stored. The size of the file in the filesystem is also modified to reflect this fact. A message would also be displayed saying something along the lines of "No space left on device." All of the data blocks have been consumed, yet we only have one file in the filesystem. This can happen if you have very large files.

Things to notice at this point:

Very small files waste a lot of space. Even a 1-byte file must consume an entire block. With larger blocks (e.g. 8K), this is a significant amount of wasted space.
Relating to the previous point; files can't share data blocks. If they could, we could use the left-over wasted space. So, technically, even if you allowed an unlimited number of directory entries (files), you are still limited to the number of data blocks. Other types of filesystems use different techniques to avoid wasted space in the last block.
As long as we just create files (never delete and create more), we will never have fragmented files. Fragmentation happens when you delete files (creating small "holes" between the data blocks) and then fill those holes with blocks from other files. A fragmented file is a file where not all of the data blocks are contiguous.
The example filesystem above is an unrealistically-sized filesystem. But, it was kept small to demonstrate how things work. Once all of the "bugs" are worked out on this small scale, we can easily configure the filesystem to contain millions or billions of bytes, limited only by the actual physical size of the drive.

Deleting Files

It turns out that deleting files requires much less work than creating them. This is because a file's data blocks are not deleted, per se. They are simply marked as deleted, which means that they can be re-used by other files in the future. We'll start with the full filesystem from above, delete each file, and see what the filesystem looks like after each step.

The full filesystem:

Raw filesystem dump (240 bytes) Super block information (32 bytes)

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 00 00 04 00 00 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries | 000030 66 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 file16.txt...... | 000040 66 69 6C 65 32 33 2E 74 78 74 02 00 17 00 00 00 file23.txt...... \ 000050 66 69 6C 65 36 31 2E 74 78 74 04 00 3D 00 00 00 file61.txt..=... FAT | 000060 FF FF FF FF 03 00 FF FF 05 00 06 00 07 00 FF FF ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... | 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... | 000090 45 78 61 63 74 6C 79 20 32 33 20 63 68 61 72 61 Exactly 23 chara | 0000A0 63 74 65 72 73 2E 0A 00 00 00 00 00 00 00 00 00 cters........... data area | 0000B0 52 6F 73 65 73 20 61 72 65 20 72 65 64 2E 20 56 Roses are red. V | 0000C0 69 6F 6C 65 74 73 20 61 72 65 20 62 6C 75 65 2E iolets are blue. | 0000D0 0A 54 68 69 73 20 74 65 78 74 20 69 73 20 36 31 .This text is 61 \ 0000E0 20 63 68 61 72 73 20 6C 6F 6E 67 2E 0A 00 00 00 chars long.....

Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 0 Directory entries: 4 Available direntries: 0 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3'

Raw filesystem dump (240 bytes)		Super block information (32 bytes)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 00 00 04 00 00 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries \| 000030 66 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 file16.txt...... \| 000040 66 69 6C 65 32 33 2E 74 78 74 02 00 17 00 00 00 file23.txt...... \ 000050 66 69 6C 65 36 31 2E 74 78 74 04 00 3D 00 00 00 file61.txt..=... FAT \| 000060 FF FF FF FF 03 00 FF FF 05 00 06 00 07 00 FF FF ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... \| 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... \| 000090 45 78 61 63 74 6C 79 20 32 33 20 63 68 61 72 61 Exactly 23 chara \| 0000A0 63 74 65 72 73 2E 0A 00 00 00 00 00 00 00 00 00 cters........... data area \| 0000B0 52 6F 73 65 73 20 61 72 65 20 72 65 64 2E 20 56 Roses are red. V \| 0000C0 69 6F 6C 65 74 73 20 61 72 65 20 62 6C 75 65 2E iolets are blue. \| 0000D0 0A 54 68 69 73 20 74 65 78 74 20 69 73 20 36 31 .This text is 61 \ 0000E0 20 63 68 61 72 73 20 6C 6F 6E 67 2E 0A 00 00 00 chars long.....		Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 0 Directory entries: 4 Available direntries: 0 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3'

So what does it mean to "delete" a file? Deleting a file simply means changing the first character of the filename (in the directory entry) to a special character. That special character will be the question mark ? character. Then, we will set all of the file's FAT entries to 0. The data blocks will be left intact. BTW, the ? character is an illegal character in a filename (in our system, as well as others), which means it won't conflict with real file names.

This is the algorithm to delete a file in your system:

Lookup (find) the filename in the directory entries.

If not found, emit error message and stop.

Set the first character of the filename to '?'
Set all FAT entries for the file to 0.
Update appropriate counters in the Superblock (available_sectors, available_direntries).

Let's delete file23.txt first.

Raw filesystem dump (240 bytes) Super block information (32 bytes)

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 02 00 04 00 01 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries | 000030 66 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 file16.txt...... | 000040 3F 69 6C 65 32 33 2E 74 78 74 02 00 17 00 00 00 ?ile23.txt...... \ 000050 66 69 6C 65 36 31 2E 74 78 74 04 00 3D 00 00 00 file61.txt..=... FAT | 000060 FF FF FF FF 00 00 00 00 05 00 06 00 07 00 FF FF ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... | 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... | 000090 45 78 61 63 74 6C 79 20 32 33 20 63 68 61 72 61 Exactly 23 chara | 0000A0 63 74 65 72 73 2E 0A 00 00 00 00 00 00 00 00 00 cters........... data area | 0000B0 52 6F 73 65 73 20 61 72 65 20 72 65 64 2E 20 56 Roses are red. V | 0000C0 69 6F 6C 65 74 73 20 61 72 65 20 62 6C 75 65 2E iolets are blue. | 0000D0 0A 54 68 69 73 20 74 65 78 74 20 69 73 20 36 31 .This text is 61 \ 0000E0 20 63 68 61 72 73 20 6C 6F 6E 67 2E 0A 00 00 00 chars long.....

Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 2 Directory entries: 4 Available direntries: 1 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3'

Raw filesystem dump (240 bytes)		Super block information (32 bytes)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 02 00 04 00 01 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 66 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 file9.txt....... dir. entries \| 000030 66 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 file16.txt...... \| 000040 3F 69 6C 65 32 33 2E 74 78 74 02 00 17 00 00 00 ?ile23.txt...... \ 000050 66 69 6C 65 36 31 2E 74 78 74 04 00 3D 00 00 00 file61.txt..=... FAT \| 000060 FF FF FF FF 00 00 00 00 05 00 06 00 07 00 FF FF ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... \| 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... \| 000090 45 78 61 63 74 6C 79 20 32 33 20 63 68 61 72 61 Exactly 23 chara \| 0000A0 63 74 65 72 73 2E 0A 00 00 00 00 00 00 00 00 00 cters........... data area \| 0000B0 52 6F 73 65 73 20 61 72 65 20 72 65 64 2E 20 56 Roses are red. V \| 0000C0 69 6F 6C 65 74 73 20 61 72 65 20 62 6C 75 65 2E iolets are blue. \| 0000D0 0A 54 68 69 73 20 74 65 78 74 20 69 73 20 36 31 .This text is 61 \ 0000E0 20 63 68 61 72 73 20 6C 6F 6E 67 2E 0A 00 00 00 chars long.....		Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 2 Directory entries: 4 Available direntries: 1 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3'

The bytes that are highlighted above are the ones that were modified when the file was deleted. Notice that the data blocks are still intact. The super block has also been updated to reflect the new state of the filesystem.

Note: The fact that the data blocks are left untouched is what allows special utility programs to undelete files that have been deleted. As long as the data blocks do not get re-used, it is possible to recover the data from a deleted file. Some filesystems make this undeletion easier or harder than others. There are also special tools that will perform a secure erase, which means that all of the data blocks are overwritten with zeros or random garbage to prevent any attempt at recovering the data.

This is what the filesystem looks like after deleting (in any order) the remaining files:

Raw filesystem dump (240 bytes) Super block information (32 bytes)

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 08 00 04 00 04 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 3F 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 ?ile9.txt....... dir. entries | 000030 3F 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 ?ile16.txt...... | 000040 3F 69 6C 65 32 33 2E 74 78 74 02 00 17 00 00 00 ?ile23.txt...... \ 000050 3F 69 6C 65 36 31 2E 74 78 74 04 00 3D 00 00 00 ?ile61.txt..=... FAT | 000060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... | 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... | 000090 45 78 61 63 74 6C 79 20 32 33 20 63 68 61 72 61 Exactly 23 chara | 0000A0 63 74 65 72 73 2E 0A 00 00 00 00 00 00 00 00 00 cters........... data area | 0000B0 52 6F 73 65 73 20 61 72 65 20 72 65 64 2E 20 56 Roses are red. V | 0000C0 69 6F 6C 65 74 73 20 61 72 65 20 62 6C 75 65 2E iolets are blue. | 0000D0 0A 54 68 69 73 20 74 65 78 74 20 69 73 20 36 31 .This text is 61 \ 0000E0 20 63 68 61 72 73 20 6C 6F 6E 67 2E 0A 00 00 00 chars long.....

Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 8 Directory entries: 4 Available direntries: 4 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3'

Raw filesystem dump (240 bytes)		Super block information (32 bytes)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- super block / 000000 08 00 01 00 10 00 08 00 04 00 04 00 FA 00 00 00 ................ \ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... / 000020 3F 69 6C 65 39 2E 74 78 74 00 00 00 09 00 00 00 ?ile9.txt....... dir. entries \| 000030 3F 69 6C 65 31 36 2E 74 78 74 01 00 10 00 00 00 ?ile16.txt...... \| 000040 3F 69 6C 65 32 33 2E 74 78 74 02 00 17 00 00 00 ?ile23.txt...... \ 000050 3F 69 6C 65 36 31 2E 74 78 74 04 00 3D 00 00 00 ?ile61.txt..=... FAT \| 000060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ / 000070 39 20 63 68 61 72 73 2E 0A 00 00 00 00 00 00 00 9 chars......... \| 000080 31 36 20 63 68 61 72 61 63 74 65 72 73 2E 2E 0A 16 characters... \| 000090 45 78 61 63 74 6C 79 20 32 33 20 63 68 61 72 61 Exactly 23 chara \| 0000A0 63 74 65 72 73 2E 0A 00 00 00 00 00 00 00 00 00 cters........... data area \| 0000B0 52 6F 73 65 73 20 61 72 65 20 72 65 64 2E 20 56 Roses are red. V \| 0000C0 69 6F 6C 65 74 73 20 61 72 65 20 62 6C 75 65 2E iolets are blue. \| 0000D0 0A 54 68 69 73 20 74 65 78 74 20 69 73 20 36 31 .This text is 61 \ 0000E0 20 63 68 61 72 73 20 6C 6F 6E 67 2E 0A 00 00 00 chars long.....		Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 8 Directory entries: 4 Available direntries: 4 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3'

At this point, the filesystem is empty and 4 new files can be created. When the new files are created, the data blocks will get re-used and the old data will get overwritten. That's what we'll look at next.

Creating/Deleting/Creating = Fragmentation

Up until now, we haven't re-used any data blocks. We either created new files with previously unused data blocks or deleted (marked) files. If we never created another file once we started deleting files, life would be simple. Now, however, we are going to start to develop "holes" in the filesystem when we delete files that are in-between other files. Examples will clarify.

When we want to create a new file, how do we know which blocks to use? This simple filesystem uses a very simple and straight-forward technique: The first block that is available (as determined by the file allocation table) will be the first block for our new file. If the file requires multiple blocks, we will continue searching forward in a linear fashion looking for free blocks in the FAT.

Example: Suppose we have a full filesystem with 8 small files. (Yes, it's unrealistic.) Each file fits into a single block. Assume there are only 8 data blocks and each block is 512 bytes. There are also only 8 directory entries. The data area would look something like this:

Graphically, the filesystem would look something like this:

If we were to delete files FileB, FileD, FileF, and FileH, we'd have this:

and the filesystem would look something like this:

Observations:

The directory entries are marked available/free (because of the '?' character as the first one in the name).
The corresponding FAT entries have been set to 0, indicating that the corresponding data block is free.
The free data blocks are shown in white because, although the data from the previous files are still intact, the blocks are available for reuse, which is what the FAT is indicating.

Now, suppose we have a bunch of data that we want to save to a file called FileZ. The data contains 2048 bytes and requires 4 blocks:

We don't have 2,048 contiguous bytes (4 contiguous blocks) so we are going to have to split up the data into 4 blocks that will look something like this:

This is what the filesystem looks like after adding the file:

You can clearly see that FileZ is fragmented. However, this is a much better outcome than just saying that the disk is full and wasting all of the available blocks.

Self-check: Make sure that you can explain what each value in all of the diagrams represents. This will let you know if you understand how this works.

In a nutshell, this is how file fragmentation happens. Let's see how this will be handled with our simple filesystem.

First, let's create a filesystem with several small files. (Deleting small files and then creating a large file is generally how fragmentation happens.) This is the configuration of the file system.

Attribute Value

total_sectors 8

sectors_per_cluster 1

bytes_per_sectors 16

total_direntries 8

Attribute	Value
total_sectors	8
sectors_per_cluster	1
bytes_per_sectors	16
total_direntries	8

And here is a full filesystem (shown in sections) with 8 files.

Raw filesystem dump (304 bytes) Super block information (32 bytes)

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- 000000 08 00 01 00 10 00 00 00 08 00 00 00 FA 00 00 00 ................ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... 000020 46 69 6C 65 41 00 00 00 00 00 00 00 10 00 00 00 FileA........... 000030 46 69 6C 65 42 00 00 00 00 00 01 00 10 00 00 00 FileB........... 000040 46 69 6C 65 43 00 00 00 00 00 02 00 10 00 00 00 FileC........... 000050 46 69 6C 65 44 00 00 00 00 00 03 00 10 00 00 00 FileD........... 000060 46 69 6C 65 45 00 00 00 00 00 04 00 10 00 00 00 FileE........... 000070 46 69 6C 65 46 00 00 00 00 00 05 00 10 00 00 00 FileF........... 000080 46 69 6C 65 47 00 00 00 00 00 06 00 10 00 00 00 FileG........... 000090 46 69 6C 65 48 00 00 00 00 00 07 00 10 00 00 00 FileH........... 0000A0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................ 0000B0 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA 0000C0 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 BBBBBBBBBBBBBBBB 0000D0 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 CCCCCCCCCCCCCCCC 0000E0 44 44 44 44 44 44 44 44 44 44 44 44 44 44 44 44 DDDDDDDDDDDDDDDD 0000F0 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 EEEEEEEEEEEEEEEE 000100 46 46 46 46 46 46 46 46 46 46 46 46 46 46 46 46 FFFFFFFFFFFFFFFF 000110 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 GGGGGGGGGGGGGGGG 000120 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 HHHHHHHHHHHHHHHH

Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 0 Directory entries: 8 Available direntries: 0 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3'

Raw filesystem dump (304 bytes)		Super block information (32 bytes)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- 000000 08 00 01 00 10 00 00 00 08 00 00 00 FA 00 00 00 ................ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... 000020 46 69 6C 65 41 00 00 00 00 00 00 00 10 00 00 00 FileA........... 000030 46 69 6C 65 42 00 00 00 00 00 01 00 10 00 00 00 FileB........... 000040 46 69 6C 65 43 00 00 00 00 00 02 00 10 00 00 00 FileC........... 000050 46 69 6C 65 44 00 00 00 00 00 03 00 10 00 00 00 FileD........... 000060 46 69 6C 65 45 00 00 00 00 00 04 00 10 00 00 00 FileE........... 000070 46 69 6C 65 46 00 00 00 00 00 05 00 10 00 00 00 FileF........... 000080 46 69 6C 65 47 00 00 00 00 00 06 00 10 00 00 00 FileG........... 000090 46 69 6C 65 48 00 00 00 00 00 07 00 10 00 00 00 FileH........... 0000A0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF ................ 0000B0 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA 0000C0 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 BBBBBBBBBBBBBBBB 0000D0 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 CCCCCCCCCCCCCCCC 0000E0 44 44 44 44 44 44 44 44 44 44 44 44 44 44 44 44 DDDDDDDDDDDDDDDD 0000F0 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 EEEEEEEEEEEEEEEE 000100 46 46 46 46 46 46 46 46 46 46 46 46 46 46 46 46 FFFFFFFFFFFFFFFF 000110 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 GGGGGGGGGGGGGGGG 000120 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 HHHHHHHHHHHHHHHH		Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 0 Directory entries: 8 Available direntries: 0 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3'

There are 8 files that are all 16 bytes in size. This means they each consume exactly one full data block. The contents match the name of the files. This is so you can easily see where one file stops and the next file begins. You'll also be able to easily see the fragmented files because they won't be contiguous.

Self-check: At this point, you should be able to explain what every byte in the output above means. That will let you know if you understand what's been going on here.

Ok, so let's delete 4 files: FileB, FileD, FileF, FileH. This will leave us with 4 data blocks, none of which are contiguous, meaning, they are fragmented:

Raw filesystem dump (304 bytes) Super block information (32 bytes)

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- 000000 08 00 01 00 10 00 04 00 08 00 04 00 FA 00 00 00 ................ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... 000020 46 69 6C 65 41 00 00 00 00 00 00 00 10 00 00 00 FileA........... 000030 3F 69 6C 65 42 00 00 00 00 00 01 00 10 00 00 00 ?ileB........... 000040 46 69 6C 65 43 00 00 00 00 00 02 00 10 00 00 00 FileC........... 000050 3F 69 6C 65 44 00 00 00 00 00 03 00 10 00 00 00 ?ileD........... 000060 46 69 6C 65 45 00 00 00 00 00 04 00 10 00 00 00 FileE........... 000070 3F 69 6C 65 46 00 00 00 00 00 05 00 10 00 00 00 ?ileF........... 000080 46 69 6C 65 47 00 00 00 00 00 06 00 10 00 00 00 FileG........... 000090 3F 69 6C 65 48 00 00 00 00 00 07 00 10 00 00 00 ?ileH........... 0000A0 FF FF 00 00 FF FF 00 00 FF FF 00 00 FF FF 00 00 ................ 0000B0 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA 0000C0 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 BBBBBBBBBBBBBBBB 0000D0 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 CCCCCCCCCCCCCCCC 0000E0 44 44 44 44 44 44 44 44 44 44 44 44 44 44 44 44 DDDDDDDDDDDDDDDD 0000F0 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 EEEEEEEEEEEEEEEE 000100 46 46 46 46 46 46 46 46 46 46 46 46 46 46 46 46 FFFFFFFFFFFFFFFF 000110 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 GGGGGGGGGGGGGGGG 000120 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 HHHHHHHHHHHHHHHH

Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 4 Directory entries: 8 Available direntries: 4 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3'

Raw filesystem dump (304 bytes)		Super block information (32 bytes)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- 000000 08 00 01 00 10 00 04 00 08 00 04 00 FA 00 00 00 ................ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... 000020 46 69 6C 65 41 00 00 00 00 00 00 00 10 00 00 00 FileA........... 000030 3F 69 6C 65 42 00 00 00 00 00 01 00 10 00 00 00 ?ileB........... 000040 46 69 6C 65 43 00 00 00 00 00 02 00 10 00 00 00 FileC........... 000050 3F 69 6C 65 44 00 00 00 00 00 03 00 10 00 00 00 ?ileD........... 000060 46 69 6C 65 45 00 00 00 00 00 04 00 10 00 00 00 FileE........... 000070 3F 69 6C 65 46 00 00 00 00 00 05 00 10 00 00 00 ?ileF........... 000080 46 69 6C 65 47 00 00 00 00 00 06 00 10 00 00 00 FileG........... 000090 3F 69 6C 65 48 00 00 00 00 00 07 00 10 00 00 00 ?ileH........... 0000A0 FF FF 00 00 FF FF 00 00 FF FF 00 00 FF FF 00 00 ................ 0000B0 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA 0000C0 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 BBBBBBBBBBBBBBBB 0000D0 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 CCCCCCCCCCCCCCCC 0000E0 44 44 44 44 44 44 44 44 44 44 44 44 44 44 44 44 DDDDDDDDDDDDDDDD 0000F0 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 EEEEEEEEEEEEEEEE 000100 46 46 46 46 46 46 46 46 46 46 46 46 46 46 46 46 FFFFFFFFFFFFFFFF 000110 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 GGGGGGGGGGGGGGGG 000120 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 HHHHHHHHHHHHHHHH		Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 4 Directory entries: 8 Available direntries: 4 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3'

Now, we create a new file that is 64 bytes in size called FileZ:

Raw filesystem dump (304 bytes) Super block information (32 bytes)

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- 000000 08 00 01 00 10 00 00 00 08 00 03 00 FA 00 00 00 ................ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... 000020 46 69 6C 65 41 00 00 00 00 00 00 00 10 00 00 00 FileA........... 000030 46 69 6C 65 5A 00 00 00 00 00 01 00 40 00 00 00 FileZ.......@... 000040 46 69 6C 65 43 00 00 00 00 00 02 00 10 00 00 00 FileC........... 000050 3F 69 6C 65 44 00 00 00 00 00 03 00 10 00 00 00 ?ileD........... 000060 46 69 6C 65 45 00 00 00 00 00 04 00 10 00 00 00 FileE........... 000070 3F 69 6C 65 46 00 00 00 00 00 05 00 10 00 00 00 ?ileF........... 000080 46 69 6C 65 47 00 00 00 00 00 06 00 10 00 00 00 FileG........... 000090 3F 69 6C 65 48 00 00 00 00 00 07 00 10 00 00 00 ?ileH........... 0000A0 FF FF 03 00 FF FF 05 00 FF FF 07 00 FF FF FF FF ................ 0000B0 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA 0000C0 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A ZZZZZZZZZZZZZZZZ 0000D0 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 CCCCCCCCCCCCCCCC 0000E0 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A ZZZZZZZZZZZZZZZZ 0000F0 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 EEEEEEEEEEEEEEEE 000100 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A ZZZZZZZZZZZZZZZZ 000110 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 GGGGGGGGGGGGGGGG 000120 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A ZZZZZZZZZZZZZZZZ

Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 0 Directory entries: 8 Available direntries: 3 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3'

Raw filesystem dump (304 bytes)		Super block information (32 bytes)
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F -------------------------------------------------------------------------- 000000 08 00 01 00 10 00 00 00 08 00 03 00 FA 00 00 00 ................ 000010 00 00 00 00 00 00 00 00 56 46 53 2D 33 00 00 00 ........VFS-3... 000020 46 69 6C 65 41 00 00 00 00 00 00 00 10 00 00 00 FileA........... 000030 46 69 6C 65 5A 00 00 00 00 00 01 00 40 00 00 00 FileZ.......@... 000040 46 69 6C 65 43 00 00 00 00 00 02 00 10 00 00 00 FileC........... 000050 3F 69 6C 65 44 00 00 00 00 00 03 00 10 00 00 00 ?ileD........... 000060 46 69 6C 65 45 00 00 00 00 00 04 00 10 00 00 00 FileE........... 000070 3F 69 6C 65 46 00 00 00 00 00 05 00 10 00 00 00 ?ileF........... 000080 46 69 6C 65 47 00 00 00 00 00 06 00 10 00 00 00 FileG........... 000090 3F 69 6C 65 48 00 00 00 00 00 07 00 10 00 00 00 ?ileH........... 0000A0 FF FF 03 00 FF FF 05 00 FF FF 07 00 FF FF FF FF ................ 0000B0 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA 0000C0 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A ZZZZZZZZZZZZZZZZ 0000D0 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 CCCCCCCCCCCCCCCC 0000E0 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A ZZZZZZZZZZZZZZZZ 0000F0 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 45 EEEEEEEEEEEEEEEE 000100 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A ZZZZZZZZZZZZZZZZ 000110 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 GGGGGGGGGGGGGGGG 000120 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A 5A ZZZZZZZZZZZZZZZZ		Total sectors: 8 Sectors per cluster: 1 Bytes per sector: 16 Available sectors: 0 Directory entries: 8 Available direntries: 3 Filesystem type: FA Reserved: 00 00 00 00 00 00 00 00 00 00 00 Label: 'VFS-3'

It's clear that FileZ has been fragmented into 4 non-contiguous blocks. Again, this is more desirable than simply truncating the file or not storing any of it at all. This was a conscious design decision that the inventors of the FAT-like filesystem (and most others, too) made.

Some Implementation Details

To make the implementation easier, you should have pointers to each of the four sections in the filesystem.

Notes: (The diagram is not to scale.)

The super block is 32 bytes.

typedef struct SuperBlock
{
  unsigned short total_sectors;        // Total number of sectors
  unsigned short sectors_per_cluster;  // Number of sectors in each cluster
  unsigned short bytes_per_sector;     // Number of bytes in each sector
  unsigned short available_sectors;    // Number of sectors available
  unsigned short total_direntries;     // Total number of directory entries
  unsigned short available_direntries; // Number of available dir entries
  unsigned char fs_type;               // File system type (FA for SFAT)
  unsigned char reserved[11];          // Reserved, all set to 0
  unsigned char label[8];              // Not guaranteed to be NUL-terminated
}SuperBlock;

Each directory entry is 16 bytes.

typedef struct DirEntry
{
  unsigned char name[10];   // Not guaranteed to be NUL-terminated
  unsigned short fat_entry; // First data block 
  unsigned int size;        // Size of the file in bytes
}DirEntry;

Each FAT entry is 2 bytes. (unsigned short)
The size of a data block (sector) varies. (superblock->bytes_per_sector)
The number of directory entries varies. (superblock->total_direntries)
The number of FAT entries varies and must be the same as the data blocks. (superblock->total_sectors)
The number of data blocks varies. (superblock->total_sectors)

Calculating the pointers is simple pointer arithmetic. Remember, the entire filesystem is just an array of unsigned characters (bytes). How they are interpreted depends on which section you are referring to. The pointers are all offsets from the beginning of the array and their types match what they are pointing at.

The super block is simply pointing at the first byte in the array. The type of the pointer is struct SuperBlock. (Offset: 0)
The dir entries pointer comes next. The type of the pointer is pointer to struct DirEntry. The offset is:
```
sizeof(struct SuperBlock)
```
The FAT pointer follows. The type of the pointer is pointer to unsigned short. The offset is:
```
sizeof(struct SuperBlock)) + number_of_direntries * sizeof(struct DirEntry)
```

The data blocks pointer follows. The type of the pointer is pointer to unsigned char. The offset is:

sizeof(struct SuperBlock)) + number_of_direntries * sizeof(struct DirEntry) + total_sectors * sizeof(unsigned short)

Suppose that you had these variable definitions:

struct SuperBlock *superblock_ptr;
struct DirEntry *direntries_ptr;
unsigned short *fat_ptr;
unsigned char *data_blocks_ptr;

Now, if you wanted to access any member in the superblock:

superblock_ptr->member;

To move to the next directory entry, fat entry, or data block, you could do something like this:

direntries_ptr++;                    /* Move to next directory entry */
fat_ptr++;                           /* Move to next FAT entry       */
data_blocks_ptr += bytes_per_sector; /* Move to next data block      */

Or you could use random access with subscripting for the directory entries and FAT:

direntries_ptr[3]; /* 4th directory entry */
fat_ptr[8];        /* 9th fat entry       */

Since the entire filesystem is an array of unsigned characters, you will need to do some casting to get the other pointer types to work properly. This is expected, especially with low-level code such as a filesystem. (Memory is similar in that it's just an array of characters.)

Once you have these pointers set up properly, moving through the various sections of the filesystem to locate objects is trivial and very efficient.

Self-check:
1. Suppose you want to support filenames up to 255 bytes in length. What changes would you have to make? What are the pros and cons of that?
2. Suppose you wanted to support millions of files. What changes would you need to make?
3. Suppose you wanted to handle files that were GBs in size. What changes would you need to make?
4. Supposed you wanted to keep track of a files last modification time/data and owner. What changes would need to be made?
5. Currently, we can only create new files. If you wanted to be able to append to an existing file, how would you go about that?

Why The File Allocation Table?

In essence, the scheme that we're using here is based on a singly-linked list. The difference is that we are storing each next pointer of a "node" (block) in a separate array, instead of in the node/block itself. Why is that? In a word: Performance.

The speed of memory is measured in nanoseconds (1 billionth of a second) and the speed of disks (HDD or SSD) is measured in milliseconds (1 thousandth of a second). That makes memory 1,000,000 times faster than the disk. Or, another way to say it is that the disk is 1,000,000 times slower than memory. There is overhead in accessing memory, so to compensate we'll just say that memory access takes microseconds (1 millionth of a second), so the disk is only 1,000 times slower (and that's being generous.)

Bottom line: We want to avoid accessing the disk unless we absolutely, positively must.

Linked lists are used everywhere in programming, but they are usually all in memory. For example, the list below shows 5 blocks of data (nodes) stored in memory. Each node has a next pointer that points to the location of the next node. We can assume that each block contains a few thousand bytes, so "stealing" a handful (4 or 8 bytes for a pointer) at the end has a negligible impact (low overhead).

Once the first block (head, address 2000) in the list is located, reading the next pointer tells us where the next block is. It is just a matter of reading memory to get the next pointer, so it's very fast. We just continue following the next pointers until we reach the end. Since memory is very fast, traversing the list in memory is efficient.

None of this information is new or enlightening to anyone reading this. However, when the nodes (blocks) are on the disk and not in memory, that's when things get interesting (and glacially slow) so a different technique is required.

Suppose each block of data contains 500 bytes of data. The five blocks combined gives us 2,500 bytes of data. This is where the bytes are stored:

Bytes 000 - 499 are in block #0 starting at address 2000.
Bytes 500 - 999 are in block #1 starting at address 3000.
Bytes 1000 - 1499 are in block #2 starting at address 4000.
Bytes 1500 - 1999 are in block #3 starting at address 5000.
Bytes 2000 - 2499 are in block #4 starting at address 6000.

Now suppose you want to read byte number 2300. You know it's in block #4 (2300 / 500 is 4 with integer division) which is in the fifth block, but you don't know where the fifth block is until you traverse the linked list. Unlike arrays, there is no random access to these blocks so we have to walk through all of the preceding blocks to obtain the next pointer and continue until we get to the byte we want.

Now, imagine that those 5 blocks of data are stored on a disk instead of memory. We'll use the same addresses for demonstration purposes. How much effort (read: disk access) is required to read the 5 blocks? The algorithm is the same as before, except instead of reading the next pointers from memory, we have to read them from the disk. Since the disk is 1,000 times slower (again, that's being generous) than memory, it takes 1,000 times longer to read all of the next pointers! That will make any program completely unusable. 60 frames per second? Try more like 0.06 fps.

The fundamental problem with this strategy (when the blocks/nodes are on the disk) is that we have to read the entire block in order to get the next pointer. And reading disk blocks is extremely slow. We are spending way too much time reading unnecessary data just to get to the next pointers.

Here's an idea: What if we separated the data from the next pointers and kept the data (very large) on the disk and kept the next pointers (very small) in memory? That's where the file allocation table comes in.

Of course, we still have to read the next pointers from the disk, but we can read a lot of them (maybe thousands at-a-time) with one disk read and then traverse them in memory instead of reading each one from disk. Once we find the one we need, we can then just make one additional disk read to get the byte(s) we're interested in.

Back to the original problem:

Showing the "logical next pointers" in the file allocation table:

Filesystem Tools

At this point, we have a very rudimentary filesystem in place. There are really only two things that you can do with it and they are:

Create new files
Delete existing files

That's it! We can't even view a file or get any information about one. We also can't update an existing file. We would have to delete it and then re-create it. However, many files can't be updated (e.g. text files) and the only way to modify them is to do what I just described (delete, re-create). Still, we need to begin adding more functionality. Remember, the goal is to gain some insight into how filesystems work.

Also, we are tracking very little information about each file. In fact, we only track the name and size. We don't keep track of times, dates, owner/group, types (e.g. executable, directory), permissions like read/write, etc. This was just to keep things very simple at the beginning. Later, you will see how to easily add some of this information. We also don't have subdirectories; every file is in the same "global/root" directory, not unlike some older filesystems that didn't have anything but a single, root directory. MS-DOS didn't support hard disks or subdirectories until version 2.0. (It only supported floppies before then and all files were in the "root" directory.)

Some other things we'd like to do at this point.

Get a listing of the files, displaying all of the attributes. For example, a typical filesystem might show this with the command: ls -l
```
-rw-r--r-- 1 chico chico 12,401 Nov 28 15:31 driver.c
-rw-r--r-- 1 chico chico 12,884 Nov 28 18:49 dumpvfs.c
-rw-r--r-- 1 chico chico  7,423 Nov 27 17:28 mkvfs.c
-rw-r--r-- 1 chico chico  9,138 Nov 26 17:37 vfs.c
```
This shows the type, permissions, owner, group, size, date, time, and filename. Our simple filesystem will only be able to display a size and a name.

Get statistics from a file with the stat command like this: stat vfs.c:

  File: vfs.c
  Size: 9138      	Blocks: 24         IO Block: 4096   regular file
Device: 805h/2053d	Inode: 1844721     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/   chico)   Gid: ( 1000/   chico)
Access: 2021-11-29 16:21:59.648300925 -0800
Modify: 2021-11-26 17:37:16.017421021 -0800
Change: 2021-11-27 19:34:19.123915757 -0800
 Birth: -

This gives us more detail about the file's storage. The inode is somewhat analagous to the file allocation table (FAT) that we are using.

Get information about the filesystem itself (using stat):

    ID: 1938f8db12bca83c Namelen: 255     Type: ext2/ext3
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 25066860   Free: 16391150   Available: 15107054
Inodes: Total: 6406144    Free: 5900127

Using tune2fs you can get a lot more information.

Display the contents of a file using cat. For example, if we did cat poem.txt, we might see something like this on the screen:
```
Roses are red.
Violets are blue.
Some poems rhyme.
But not this one.
```
If we have a very large file, we will have to locate all of the blocks (in order) and display the contents of each, one after the other.

How much space is free in the filesystem? Use the df (diskfree) command to find out:

Filesystem             1B-blocks               Used        Available   Use%   Mounted on
/dev/sda1         58,787,778,560     26,084,511,744   29,686,591,488    47%   / 
/dev/sda5        102,673,858,560     35,536,543,744   61,877,657,600    37%   /home
/dev/sdb1        206,289,469,440     40,816,304,128  154,970,628,096    21%   /backups
/dev/sdb2        777,949,659,136     31,258,271,744  707,150,217,216     5%   /extra
/dev/sdc1      1,968,874,242,048  1,315,211,378,688  553,626,198,016    71%   /storage

or df -h (human-readable)

Filesystem      Size  Used Avail  Use%  Mounted on
/dev/sda1        55G   25G   28G   47%  /
/dev/sda5        96G   34G   58G   37%  /home
/dev/sdb1       193G   39G  145G   21%  /backups
/dev/sdb2       725G   30G  659G    5%  /extra
/dev/sdc1       1.8T  1.2T  516G   71%  /storage

There are several other "features" that we'd like to implement, such as:
- Renaming a file
- Defragging the files (They're going to get fragmented.)
- Secure erase: Instead of just marking a file as deleted, zero-out all of the data.
- Checking the filesystem for consistency (i.e. corruption)
- Displaying various internal sections of the filesystem (e.g. directory area, FAT)
- Other things as well...
Some of these features would be for "normal" users (e.g. renaming) and others would be used by an administrator (e.g. checking corruption, defragging) or for debugging by a developer (e.g. displaying the FAT).

Development Tools

Now that you are working with structured binary files, you may need additional custom tools to help you prove to yourselves that your output is correct. This is especially true with binary files. Text files are completely human-readable and so we never needed any special tools to view them. We would generate output and could tell just by looking at it (sometimes) if it was correct. For complicated text output, we used a diff tool to help us find the differences.

However, with binary files, it's more work because we can't just look at the files (easily) to see the differences, nor can we use a text-based diff tool (easily). Fortunately, there are a few tools that we can use.

Here are some suggested tools that I used or created during development of the filesystem:

cmp - Like diff, but for binary files. The output is not very human-friendly, but it does work. Once you've determined that your output is different, you will want to use another tool to display the differences. Of course, if cmp shows no differences, you're done! These tools are built-in to Linux and macOS. You can get them for Windows with the Cygwin tools.

dumpit - Yes, this has been a work-horse of a program for years. Many of my web pages display binary data in hex output and this tool (IMHO) is the best for that job. By "converting" the binary files into text files, you can then use diff to find the differences in a more human-readable way. Just take any filesystem. For example:
```
dumpit vfs3.fs > vfs3.txt
```
will produce a text version of the binary file. Do this with any of the input and output filesystems then you can use diff. Here's the help screen from the program: (The color probably won't work on Windows.)
```
dumpit version 1.01
Usage: dumpit [options] [file1 file2 ...] 

Options:
 -c   --color     Use colorized output.
 -lX  --length=X  Stop after printing X characters. (Default is entire file.)
 -oX  --offset=X  Start at byte X in the file. (Default is 0.)
 -v   --version   Display the version and quit.
 -wX  --width=X   How wide the output is. (Default is 16 chars.)
 -h   --help      Display this information.

If no input files are specified, read from stdin.
```

dumpvfs - This is a tool that prints out a lot of information about a filesystem. For example, running this command:

./dumpvfs vfs.fs

gives this output:

         Filesystem label: 'fat'
          Filesystem type:    FA
          Filesystem size: 526624 bytes
       Size of SuperBlock:    32 bytes
              Size of FAT:  2048 bytes
   Size of directory area:   256 bytes
        Size of data area: 524288 bytes
    Sectors in filesystem:  1024
         Bytes per sector:   512 bytes
        Sectors per block:     1
      Max number of files:    16
     Largest file allowed: 524288 bytes
      Data sectors in use:    84
   Data sectors available:   940
       Data blocks in use:    84
    Data blocks available:   940
             Files in use:     5
          Files remaining:    11
            Size of files: 41942 bytes
  Space consumed by files: 43008 bytes
   Percent usage (blocks):     8.20%
    Percent usage (bytes):     8.00%
   Internal fragmentation:     2.48%
   External fragmentation: TBD

mkvfs - This is a tool that will create filesystems with any number and size of files. It has several options. This is from the help screen:

mkvfs version 1.00
Usage: mkvfs [options] [file1 file2 ...] 

Options:
 -sX  --total-sectors=X         The total number of sectors in the data area. (Default: 16)
 -bX  --bytes-per-sector=X      The number of bytes per sector. (Default: 16)
 -cX  --sectors-per-cluster=X   The number of sectors per cluster. (Default: 1)
 -dX  --directory-entries=X     The total number of directory entries. (Default: 4)
 -lX  --label=X                 The filesystem label. (Default: "SFAT")
 -fX  --fstype=X                The filesystem type. (Default: 0xFA)
 -oX  --output=X                The name of the file to create. (Default: vfs.fs)
 -h   --help                    Display this information.

If no input files are specified, an empty filesystem is created.

Example: Suppose there are two files: pic.png (2395 bytes) and decl.txt (2037 bytes). To create a filesystem that includes both of those files, you could execute a command like this (long and short options):

./mkvfs --total-sectors 12 --bytes-per-sector 512 --directory-entries 8 --output vfsx.fs pic.png decl.txt
./mkvfs -s12 -b512 -d8 -o vfsx.fs pic.png decl.txt

Output:

         filename: vfsx.fs
    total_sectors:   12
 bytes_per_sector:  512
sectors-per-block:    1
directory-entries:    8
            label: SFAT
          fs-type: FA
  Number of files: 2
File 001: pic.png
File 002: decl.txt

Running dumpvfs on the output file:

         Filesystem label: 'SFAT'
          Filesystem type:    FA
          Filesystem size:  6328 bytes
       Size of SuperBlock:    32 bytes
              Size of FAT:    24 bytes
   Size of directory area:   128 bytes
        Size of data area:  6144 bytes
    Sectors in filesystem:    12
         Bytes per sector:   512 bytes
        Sectors per block:     1
      Max number of files:     8
     Largest file allowed:  6144 bytes
      Data sectors in use:     9
   Data sectors available:     3
       Data blocks in use:     9
    Data blocks available:     3
             Files in use:     2
          Files remaining:     6
            Size of files:  4432 bytes
  Space consumed by files:  4608 bytes
   Percent usage (blocks):    75.00%
    Percent usage (bytes):    72.14%
   Internal fragmentation:     3.82%
   External fragmentation: TBD

statvfs - This is a tool that will generate a stat-like output for the files in the filesystem. It requires the filename of the filesystem (e.g. vfs.fs) and the name of a file in the filesystem (e.g. pic.png):

./statvfs vfsx.fs pic.png

Output:

   File: pic.png
   Size: 2395      Blocks: 5       IO Block: 512     FAT entry: 0       
On disk: 2560      Wasted: 6.45%
Sectors: 0 1 2 3 4

If you don't provide a file from the filesystem, all files are processed:

./statvfs vfsx.fs

Output:

   File: pic.png
   Size: 2395      Blocks: 5       IO Block: 512     FAT entry: 0       
On disk: 2560      Wasted: 6.45%
Sectors: 0 1 2 3 4

   File: decl.txt
   Size: 2037      Blocks: 4       IO Block: 512     FAT entry: 5       
On disk: 2048      Wasted: 0.54%
Sectors: 5 6 7 8

lsvfs - This is a tool that will generate an ls-like output for the files in the filesystem. It requires the filename of the filesystem (e.g. vfs9.fs) and the name of a file in the filesystem (e.g. driver.c):

./lsvfs vfs9.fs driver.c

Output:

Blocks    Size     Name
[    31]    15550  driver.c

If you don't provide a file from the filesystem, all files are processed with a summary at the end:

./lsvfs vfs9.fs

Output:

Blocks    Size     Name
[     5]     2395  pic.png
[     4]     2037  decl.txt
[     1]       69  poem.txt
[    48]    24265  notes.odt
[   220]   112522  Doxyfile
[    31]    15550  driver.c
[    92]    47036  typescript
[   107]    54714  foo.c
File count: 8, Blocks used: 508, Size in bytes: 258588, Size on disk: 260096, Overhead: 0.6%

catvfs - This is a tool that will dump out the contents of a file to the screen. For binary files, you should redirect them to a file since it is unlikely that the output is meaningful. It works much like the built-in program cat works. It requires the filename of the filesystem (e.g. vfs9.fs) and the name of a file in the filesystem (e.g. poem.txt):
```
./catvfs vfs9.fs poem.txt
```
Output:
```
Roses are red.
Violets are blue.
Some poems rhyme.
But not this one.
```
This command:
```
./catvfs vfs9.fs pic.png > foo.png
```
creates a file in the "real" filesystem called foo.png. It contains the contents of the file pic.png from the SFAT filesystem called vfs9.fs. The command:
```
ls -l foo.png
```
shows this:
```
-rw-r--r-- 1 mmead mmead 2,395 Mar 30 13:23 foo.png
```
which shows that it is the same size as what's in vfs9.fs. You can now view the image file with any image viewer.

The Big Picture

OK, so how realistic is this filesystem design? In theory, it's a sound starting place for a usable, albeit, inefficient way to store large numbers of files. In practice, it would be too inefficient for all but the smallest systems. However, by understanding the implementation of this simple filesystem you can begin to see how more sophisticated systems can be built.

Let's see what we can, and more importantly, cannot do with our current implementation.

Our Implementation		Real-World^™ Implementations
Files Create Delete Filesystem Create Mount (Load)		Files Read (open/read/close text/binary) Modify (open/write/close text/binary) Grow (append) and shrink (truncate) Rename List (think: ls -l) Timestamps (create/modify/access date/time) Permissions (rwx, user/group/other) Type info (file, link, dir) Longer filenames Secure erase Directories! Filesystem Variable block sizes Verify (corruption) Repair (corruption) Defragment Compression Encryption Grow/shrink (w/o losing data) Hard links (Windows junctions) Soft (symbolic) links (Windows shortcuts) Error handling (e.g. duplicate names) Transactions Copy-on-write Snapshots Deduplication Concurrency (multiple processes accessing the filesystem simultaneously)

Our Implementation

Real-World^™ Implementations

Files
  Create
  Delete










Filesystem
  Create
  Mount (Load)

Files
  Read (open/read/close  text/binary)
  Modify (open/write/close  text/binary)
  Grow (append) and shrink (truncate)
  Rename
  List (think: ls -l)
  Timestamps (create/modify/access date/time)
  Permissions (rwx, user/group/other)
  Type info (file, link, dir)
  Longer filenames
  Secure erase
  Directories!

Filesystem
  Variable block sizes
  Verify (corruption)
  Repair (corruption)
  Defragment
  Compression
  Encryption
  Grow/shrink (w/o losing data)
  Hard links (Windows junctions)
  Soft (symbolic) links (Windows shortcuts)
  Error handling (e.g. duplicate names)
  Transactions
  Copy-on-write
  Snapshots
  Deduplication
  Concurrency (multiple processes accessing the filesystem simultaneously)

Well, that's a humbling list!

So, although our filesystem does almost nothing, you can easily see how you might extend it to support features found in modern filesystems. In fact, here are some questions for you to answer about this simple filesystem. How would you support:

Files:

longer filenames? Unlimited length?
time and date stamps?
permissions?
hard links
symbolic links (shortcuts)?
undelete?
secure erase?
directories?

Filesystem:

compression?
encryption?
defragmentation?
copy-on-write
deduplication

References

Links

File Allocation Table from Wikipedia.
The Design of the FAT system. from Wikipedia.
The FAT filesystem Recommended by the crew from OSTEP.
List of file systems.
Comparison of file systems.
A Five-Year Study of File-System Metadata.pdf