Case Study: A Simple Filesystem

  1. Overview
  2. Basic Structure
  3. Creating the Filesystem
  4. Creating Files
  5. Deleting Files
  6. Fragmentation
  7. Implementation Details
  8. Why the FAT?
  9. Filesystem Tools
  10. Development Tools
  11. The Big Picture
  12. References

Overview

At its most basic level, a filesystem is simply one very large file that contains... other files. That's it. How these files are stored and retrieved is the subject of this document. Judging by the number of filesystems around (dozens to hundreds), you can see that there are many different techniques used to implement them, and all of them have their pros and cons. Some are almost brain-dead simple and others are massively complex. We are going to look at the simpler methods, as this is an introduction.

This concept of a file storing other files is very common, and you see and use such files every day without even realizing it. These files can be thought of as "mini-filesystems" because they are just wrappers around many other files contained within. For example:

To see this, simply run this command on any of the file types above:
unzip -lv filename
and you'll see a listing of all of the files in the archive. If you want to test the integrity of all of the files in the archive, use the -t option:
unzip -t filename

Applications that distribute many files will often wrap them all into one really large file. It's much easier to manage a single file than thousands of separate ones. Some examples (from my youth!)


The most fundamental property of a filesystem is that it is just a data structure, not unlike a linked list or array. In fact, some of the earliest filesystems were simple arrays and linked lists. The complexity of the data structure is directly related to the size, performance (speed), and information stored in the filesystem. Because not all systems require the same level of performance, there are hundreds of different filesystems.

Some are very simple and others are massively complex. Think of an iPod or TiVo or even a cell phone. The number of files these devices contain may be measured in the thousands (10^3). These devices have radically different needs than, say, the server farms at Google or Facebook, where the number of files is measured in the billions (10^9) and beyond. Those filesystems contain millions of terabytes of data as well. Most home users don't have anywhere near this kind of requirement.

To help understand the problems and challenges that filesystem developers are trying to solve, we will build a simple, yet effective, filesystem that can work for small-scale devices. After completing it, we will look at many other features that can be added to make it faster, more scalable, and more reliable and resilient.

Here is some of the information that we will need to store about a file:

  1. The file's name. (This is for humans.)
  2. The size of the file (in bytes).
  3. Where on the disk is the file stored? (This is the starting block.)
There are other things that we are not going to concern ourselves with at this time (to keep things simple). However, once we have the basic features working, we'll see how easy or difficult it would be to add those other features. Again, the earliest filesystems didn't concern themselves with much other information, either.

The goal here is not to create a real-world filesystem, but to understand how filesystems work and how they can be implemented.

Simplified view of a very simple filesystem (One Big File):

This is what a real FAT directory entry looks like.

BTW, the reason we are not going to try to implement a modern filesystem (or even a partial one) is the sheer complexity. Take a look at the primary filesystem for Windows: the NTFS specs (local copy in case the link is dead). Just look at the hundreds and hundreds of options and details for that filesystem. Compare that to a "trivial" FAT-type filesystem and you can see how far they've come.

Basic Structure

At the most basic level, each file is going to be represented by a singly-linked list of blocks. The disk block is the smallest unit that can be read/written by the filesystem. Typical block sizes in bytes are 512, 1K, 2K, 4K, 8K ... up to 64K. Another term you will hear for block is sector.

So, for example, a file that is 1,400 bytes and is stored in a filesystem with 512-byte blocks will require 3 blocks. The first 2 blocks will be full, totaling 1,024 bytes (512 * 2), and the third block will store the "residual" 376 bytes (with 136 bytes unused). Logically, the blocks will be stored as a linked list of 3 blocks:

Our filesystem is going to be loosely based on the original FAT (File Allocation Table) filesystem that was used by MSDOS back in the 1980s. The FAT filesystem is arguably still the most popular filesystem in use today, although it has been modified somewhat (VFAT, FAT32, exFAT) to deal with the technological demands of newer devices. e.g. larger drives, long filenames, very large files, etc.

We'll call the filesystem SFAT for Simple FAT because, although the original FAT filesystem was very simple, ours is going to be even simpler. Remember, this is an instructional exercise! You are not going to go out and replace Google's filesystem anytime soon!

The SFAT filesystem will have 4 basic sections:

  1. super block - This describes the layout of the entire filesystem, e.g. number of sectors, sectors per cluster, bytes per sector, etc.
  2. directory area - This maps the file's name to the disk blocks that contain the contents.
  3. file allocation table - This keeps track of which data blocks are in-use and which are free using linked-list techniques.
  4. data area - This is the bulk of the filesystem and is where the contents of all of the files are actually stored.
This is a graphical view of the four sections. Note that the sections are not to scale, as the data blocks section likely consumes 95% or more of the filesystem.
The relationship between the directory entry, file allocation table, and data blocks.
Close-up of the FAT entries:
Given this configuration for the filesystem:
Attribute            Value
total_sectors        8192
sectors_per_cluster  1
bytes_per_sectors    512
total_direntries     1024
Given these values, if we add up the sizes of the four sections, we can see that the total size of the filesystem is:

  super block:          32
  directory area:   16,384  (1,024 direntries * 16 bytes)
  FAT area:         16,384  (8,192 sectors * 2 bytes)
  data area:     4,194,304  (8,192 sectors * 512 bytes)
  -------------------------
  total:         4,227,104  bytes
You can easily see that by changing the number of sectors and/or the size of each sector, we can store more or less data in the filesystem. Typical filesystems have millions or billions of sectors and each sector may be 2K to 8K in size giving us a filesystem that can hold billions or trillions of bytes! Our simple filesystem won't be doing anything close to that!

We can also see the limitations of the filesystem based on this structure:

None of these limitations are insurmountable; it's just a matter of requiring more space on the disk. Most modern filesystems have much, much higher limits on these things, e.g. an unlimited number of files, filenames up to 255 bytes long, individual files up to 2 TB in size, etc.

The diagram below shows the relationship between the file allocation table and the data area (using the numbers from above). There is exactly one FAT entry for each data block in the filesystem. This is what the FAT and data area would look like immediately after creation (and before any files are stored):

With most linked lists you've been dealing with, you normally have a "next" pointer as part of the node (block). In this case, there are no next pointers stored in each data block. Instead, the file allocation table is used for that purpose. You can think of the FAT as an array of all of the next pointers that would be stored in the data blocks. That's why there is a one-to-one correspondence with the number of FAT entries, and the number of data blocks. We are simply storing the "next" pointers outside of the data blocks themselves. This means that the data blocks are storing only data; not any next pointers or other information. The diagrams below will clarify this.

Note that the size of the super block is always 32 bytes and the size of each directory entry is always 16 bytes. What varies are the number of sectors (total_sectors), sectors_per_cluster, the size of each sector (bytes_per_sector), and the number of directory entries (total_direntries).
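To make the layout concrete, here is one possible C rendering of these two fixed-size records. This is a sketch, not the official layout: the field names are assumptions, and since the on-disk values in the dumps later in this document are little-endian 16-bit and 32-bit integers, fixed-width types (uint16_t/uint32_t) would be the more portable choice.

/* Sketch of the two fixed-size records (field names are assumed).
 * With these member sizes, typical compilers add no padding, so
 * sizeof(struct SuperBlock) == 32 and sizeof(struct DirEntry) == 16. */
struct SuperBlock {
    unsigned short total_sectors;        /* e.g. 8192                       */
    unsigned short sectors_per_cluster;  /* e.g. 1                          */
    unsigned short bytes_per_sector;     /* e.g. 512                        */
    unsigned short available_sectors;    /* free data blocks                */
    unsigned short total_direntries;     /* e.g. 1024                       */
    unsigned short available_direntries; /* free directory entries          */
    unsigned char  fs_type;              /* 0xFA in the dumps shown later   */
    unsigned char  reserved[11];         /* unused                          */
    char           label[8];             /* e.g. "VFS-3", null padded       */
};                                       /* 32 bytes total                  */

struct DirEntry {
    char           filename[10];         /* not null-terminated if 10 chars */
    unsigned short start_block;          /* first data block (FAT index)    */
    unsigned int   size;                 /* file size in bytes              */
};                                       /* 16 bytes total                  */

The 16-byte directory entry splits 10/2/4 into the name, the starting block, and the size, which is exactly what the directory entries in the dumps below contain.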

In essence:

  1. The size of the filesystem (data blocks) depends on the number of sectors and the size of each sector. Simply multiply the two numbers together to get the number of bytes available.
  2. The number of files is dependent on the number of directory entries.

Note: Some filesystems do not put a limit on the number of files that a filesystem can contain because the directory area can grow. To keep things manageable for our simple filesystem, we have a fixed-size directory area that is set when the filesystem is created and cannot grow at a later time. Once the directory entries are all in use, the file system can be considered "full", even if there are more data blocks available.


Given a file named Foo that has a size of 1,400 bytes, it might cause the filesystem to now look like this:

Directory entry (16 bytes in size, split 10/2/4 into name, starting block, and size):


FAT and data blocks without file fragmentation. A zero indicates an unused block and FFFF (hex) represents the end of the list of blocks.
You may notice that the file is not fragmented, meaning that all of its data blocks are contiguous. This is not guaranteed and is usually not the case, especially after a lot of activity like deleting and adding files. It is entirely possible that the filesystem could end up looking more like this after a bunch of activity:

Directory entry:


FAT and data blocks (file fragmentation):

Self-check:
1. How does the filesystem know how many bytes are stored in the last block (the "residual" bytes) since that information isn't in the FAT nor the data block itself?
2. What happens with the remaining unused bytes in the last data block?
3. What can you say about the size of file A?
4. What can you say about the size of file B?

Creating the Filesystem

Let's create an empty filesystem so we can add files to it. After creating the empty filesystem using the configuration parameters above, this is what a raw dump of the filesystem looks like. Since there are no files stored yet, only the super block has any information.

Raw filesystem dump (240 bytes) and super block information (32 bytes):
                      00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
               --------------------------------------------------------------------------
 super block / 000000 08 00 01 00 10 00 08 00  04 00 04 00 FA 00 00 00   ................
             \ 000010 00 00 00 00 00 00 00 00  56 46 53 2D 33 00 00 00   ........VFS-3...

             / 000020 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
dir. entries | 000030 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
             | 000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
             \ 000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
              
         FAT | 000060 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................

             / 000070 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
             | 000080 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
             | 000090 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
             | 0000A0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
   data area | 0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
             | 0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
             | 0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
             \ 0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
       Total sectors:   8
 Sectors per cluster:   1
    Bytes per sector:  16
   Available sectors:   8
   Directory entries:   4
Available direntries:   4
     Filesystem type:  FA
            Reserved: 00 00 00 00 00 00 00 00 00 00 00
               Label: 'VFS-3'



Comments:
When a "real" filesystem is created there is no guarantee
that all of the data blocks have been initialized to 0.
What will likely happen is that the values will be whatever
random garbage happens to be on the disk at that
location when the filesystem was created. The directory
area and file allocation table (FAT) will likely be
initialized to 0.

  1. The first 32 bytes are for the super block.
  2. The next 64 bytes (4 * 16) are for the directory entries.
  3. The next 16 bytes (8 * 2) are for the file allocation table (FAT).
  4. The remaining 128 bytes (8 * 16) are the data area.

Note: Since our filesystem is so small and basic, it only takes a few lines of code (and a fraction of a second) to create an empty filesystem. More complex filesystems can take much longer to create, as they must construct much more sophisticated data structures to manage billions and trillions (and more!) of bytes of data.
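As a rough illustration of those "few lines of code", here is a sketch that builds an empty SFAT image in memory. It reuses the hypothetical struct layout sketched in Basic Structure; the function name and the zero-fill via calloc are my own choices, not part of the original tool.

#include <stdlib.h>
#include <string.h>

/* Sketch: allocate and initialize an empty SFAT image in memory.
 * Returns NULL if the allocation fails. */
unsigned char *sfat_create(unsigned short total_sectors,
                           unsigned short sectors_per_cluster,
                           unsigned short bytes_per_sector,
                           unsigned short total_direntries)
{
    size_t size = sizeof(struct SuperBlock)                      /* super block    */
                + total_direntries * sizeof(struct DirEntry)     /* directory area */
                + total_sectors    * sizeof(unsigned short)      /* FAT area       */
                + (size_t)total_sectors * bytes_per_sector;      /* data area      */

    unsigned char *fs = calloc(size, 1);   /* zero-filled, unlike a real disk */
    if (fs == NULL)
        return NULL;

    struct SuperBlock *sb = (struct SuperBlock *)fs;
    sb->total_sectors        = total_sectors;
    sb->sectors_per_cluster  = sectors_per_cluster;
    sb->bytes_per_sector     = bytes_per_sector;
    sb->available_sectors    = total_sectors;      /* everything is free */
    sb->total_direntries     = total_direntries;
    sb->available_direntries = total_direntries;
    sb->fs_type              = 0xFA;
    strncpy(sb->label, "VFS-3", sizeof(sb->label));
    return fs;
}

For the demonstration configuration above, sfat_create(8, 1, 16, 4) produces exactly the 32 + 64 + 16 + 128 = 240 bytes shown in the dump.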

Creating Files

Here are 4 small files that we'll use to demonstrate. We'll look at the entire filesystem after each one has been stored. Each line ends with an invisible newline (<NL>). The number in the name of the file reflects how many characters are in the file.
file9.txt:   9 chars.<NL>
file16.txt:  16 characters..<NL>
file23.txt:  Exactly 23 characters.<NL>
file61.txt:  Roses are red. Violets are blue.<NL>
             This text is 61 chars long.<NL>

This is the configuration of the filesystem:
Attribute            Value
total_sectors        8
sectors_per_cluster  1
bytes_per_sectors    16
total_direntries     4
Given this configuration (i.e. 16 bytes per sector) and the size of each file above, we know that:
  1. file9.txt will require 0 full sectors and 1 partial sector.
  2. file16.txt will require 1 full sector and 0 partial sectors.
  3. file23.txt will require 1 full sector and 1 partial sector.
  4. file61.txt will require 3 full sectors and 1 partial sector.
In fact, every file will be stored as some combination of full sectors and partial sectors. The only time you will not have a partial sector is when the size of the file is an exact multiple of the number of bytes per sector. The overwhelming majority of files will not be a perfect multiple and will, therefore, require a partial sector. A file will never have more than one partial sector. The "wasted" bytes in that partial sector are known as internal fragmentation (just like with memory pages).
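In code, this split is just integer arithmetic. A small sketch (the helper name is mine, not part of the original tool):

/* Sketch: how many data blocks a file of 'size' bytes consumes. */
unsigned int blocks_needed(unsigned int size, unsigned short bytes_per_sector)
{
    unsigned int full    = size / bytes_per_sector;            /* completely filled blocks */
    unsigned int partial = (size % bytes_per_sector) ? 1 : 0;  /* at most one partial      */
    return full + partial;   /* e.g. 1,400 bytes / 512 -> 2 full + 1 partial = 3 blocks    */
}

With the demo configuration (16 bytes per sector), blocks_needed(9, 16) is 1, blocks_needed(16, 16) is 1, blocks_needed(23, 16) is 2, and blocks_needed(61, 16) is 4, matching the list above.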

Note: As you can imagine, the larger the sector (bytes per sector), the more wasted bytes you are likely to have. On the other hand, smaller sectors lead to many small blocks which require a lot more overhead to manage and therefore will have a negative impact on performance. On average, each file in the filesystem will waste 1/2 of a block. As always in computer science, it's a trade-off.

Note: Since all files require at least one block, even a file that is a single byte will require an entire block to store. More than 99% of the bytes in the block will be wasted.


  1. After creating file9.txt:

    Raw filesystem dump (240 bytes) and super block information (32 bytes):
                          00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
                   --------------------------------------------------------------------------
     super block / 000000 08 00 01 00 10 00 07 00  04 00 03 00 FA 00 00 00   ................
                 \ 000010 00 00 00 00 00 00 00 00  56 46 53 2D 33 00 00 00   ........VFS-3...
    
                 / 000020 66 69 6C 65 39 2E 74 78  74 00 00 00 09 00 00 00   file9.txt.......
    dir. entries | 000030 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 | 000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 \ 000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                  
             FAT | 000060 FF FF 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
    
                 / 000070 39 20 63 68 61 72 73 2E  0A 00 00 00 00 00 00 00   9 chars.........
                 | 000080 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 | 000090 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 | 0000A0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
       data area | 0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 | 0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 | 0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 \ 0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
    
           Total sectors:   8
     Sectors per cluster:   1
        Bytes per sector:  16
       Available sectors:   7
       Directory entries:   4
    Available direntries:   3
         Filesystem type:  FA
                Reserved: 00 00 00 00 00 00 00 00 00 00 00
                   Label: 'VFS-3'
    
    
    
    Comments:
    Each new file will decrement the value of Available
    direntries. Each block that is consumed by a new
    file will decrement the value of Available sectors.
    

    When creating a new file, all four sections will require some modification (a C sketch of these steps appears after this sequence of examples):

    1. Superblock - Update available_sectors and available_direntries.
    2. Directory entries - Update a directory entry with the filename, the FAT entry, and the file size.
    3. FAT - Update the corresponding FAT entry.
    4. Data area - Use the block(s) pointed to by the FAT entry/entries.

    Compare this dump with the empty filesystem shown earlier to see exactly which bytes were modified.


  2. After creating file16.txt:

    Raw filesystem dump (240 bytes) and super block information (32 bytes):
                          00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
                   --------------------------------------------------------------------------
     super block / 000000 08 00 01 00 10 00 06 00  04 00 02 00 FA 00 00 00   ................
                 \ 000010 00 00 00 00 00 00 00 00  56 46 53 2D 33 00 00 00   ........VFS-3...
    
                 / 000020 66 69 6C 65 39 2E 74 78  74 00 00 00 09 00 00 00   file9.txt.......
    dir. entries | 000030 66 69 6C 65 31 36 2E 74  78 74 01 00 10 00 00 00   file16.txt......
                 | 000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 \ 000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                  
             FAT | 000060 FF FF FF FF 00 00 00 00  00 00 00 00 00 00 00 00   ................
    
                 / 000070 39 20 63 68 61 72 73 2E  0A 00 00 00 00 00 00 00   9 chars.........
                 | 000080 31 36 20 63 68 61 72 61  63 74 65 72 73 2E 2E 0A   16 characters...
                 | 000090 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 | 0000A0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
       data area | 0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 | 0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 | 0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 \ 0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
    
           Total sectors:   8
     Sectors per cluster:   1
        Bytes per sector:  16
       Available sectors:   6
       Directory entries:   4
    Available direntries:   2
         Filesystem type:  FA
                Reserved: 00 00 00 00 00 00 00 00 00 00 00
                   Label: 'VFS-3'
    
    
    
    
    
    Comments:
    Files can't "share" data blocks, so the "extra" space
    in the last block of file9.txt is wasted and
    can't be used for any other file.
    


  3. After creating file23.txt:

    Raw filesystem dump (240 bytes) and super block information (32 bytes):
                          00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
                   --------------------------------------------------------------------------
     super block / 000000 08 00 01 00 10 00 04 00  04 00 01 00 FA 00 00 00   ................
                 \ 000010 00 00 00 00 00 00 00 00  56 46 53 2D 33 00 00 00   ........VFS-3...
    
                 / 000020 66 69 6C 65 39 2E 74 78  74 00 00 00 09 00 00 00   file9.txt.......
    dir. entries | 000030 66 69 6C 65 31 36 2E 74  78 74 01 00 10 00 00 00   file16.txt......
                 | 000040 66 69 6C 65 32 33 2E 74  78 74 02 00 17 00 00 00   file23.txt......
                 \ 000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                  
             FAT | 000060 FF FF FF FF 03 00 FF FF  00 00 00 00 00 00 00 00   ................
    
                 / 000070 39 20 63 68 61 72 73 2E  0A 00 00 00 00 00 00 00   9 chars.........
                 | 000080 31 36 20 63 68 61 72 61  63 74 65 72 73 2E 2E 0A   16 characters...
                 | 000090 45 78 61 63 74 6C 79 20  32 33 20 63 68 61 72 61   Exactly 23 chara
                 | 0000A0 63 74 65 72 73 2E 0A 00  00 00 00 00 00 00 00 00   cters...........
       data area | 0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 | 0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 | 0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 \ 0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
    
           Total sectors:   8
     Sectors per cluster:   1
        Bytes per sector:  16
       Available sectors:   4
       Directory entries:   4
    Available direntries:   1
         Filesystem type:  FA
                Reserved: 00 00 00 00 00 00 00 00 00 00 00
                   Label: 'VFS-3'
    
    
    
    Comments:
    At this point, there are 3 files in the system and
    they consume 4 data blocks. There is one directory
    entry and 4 data blocks remaining. The filesystem
    will be considered full when we run out of either
    directory entries or data blocks, whichever runs out
    first.
    


  4. After creating file61.txt:

    Raw filesystem dump (240 bytes) and super block information (32 bytes):
                          00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
                   --------------------------------------------------------------------------
     super block / 000000 08 00 01 00 10 00 00 00  04 00 00 00 FA 00 00 00   ................
                 \ 000010 00 00 00 00 00 00 00 00  56 46 53 2D 33 00 00 00   ........VFS-3...
    
                 / 000020 66 69 6C 65 39 2E 74 78  74 00 00 00 09 00 00 00   file9.txt.......
    dir. entries | 000030 66 69 6C 65 31 36 2E 74  78 74 01 00 10 00 00 00   file16.txt......
                 | 000040 66 69 6C 65 32 33 2E 74  78 74 02 00 17 00 00 00   file23.txt......
                 \ 000050 66 69 6C 65 36 31 2E 74  78 74 04 00 3D 00 00 00   file61.txt..=...
                  
             FAT | 000060 FF FF FF FF 03 00 FF FF  05 00 06 00 07 00 FF FF   ................
    
                 / 000070 39 20 63 68 61 72 73 2E  0A 00 00 00 00 00 00 00   9 chars.........
                 | 000080 31 36 20 63 68 61 72 61  63 74 65 72 73 2E 2E 0A   16 characters...
                 | 000090 45 78 61 63 74 6C 79 20  32 33 20 63 68 61 72 61   Exactly 23 chara
                 | 0000A0 63 74 65 72 73 2E 0A 00  00 00 00 00 00 00 00 00   cters...........
       data area | 0000B0 52 6F 73 65 73 20 61 72  65 20 72 65 64 2E 20 56   Roses are red. V
                 | 0000C0 69 6F 6C 65 74 73 20 61  72 65 20 62 6C 75 65 2E   iolets are blue.
                 | 0000D0 0A 54 68 69 73 20 74 65  78 74 20 69 73 20 36 31   .This text is 61
                 \ 0000E0 20 63 68 61 72 73 20 6C  6F 6E 67 2E 0A 00 00 00    chars long.....
    
           Total sectors:   8
     Sectors per cluster:   1
        Bytes per sector:  16
       Available sectors:   0
       Directory entries:   4
    Available direntries:   0
         Filesystem type:  FA
                Reserved: 00 00 00 00 00 00 00 00 00 00 00
                   Label: 'VFS-3'
    
    
    
    Comments:
    The filesystem is now full. No directory entries and no
    data blocks are available.
    

    The filesystem is now full. There are no more directory entries available and there are no more data blocks. Any attempt to create another file will result in an error from the filesystem with a message along the lines of "No space left on device."

    It turns out that with this example we ran out of directory entries at the same time we ran out of data blocks. This is an unlikely situation. Usually, we either run out of directory entries while still having free data blocks, or, we run out of data blocks while still having free directory entries.


  5. Let's see what the filesystem would look like if the last file that was created was very small. In fact, I'm going to store the first file again, but with a different name:

    After creating file9a.txt:

    Raw filesystem dump (240 bytes) and super block information (32 bytes):
                          00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
                   --------------------------------------------------------------------------
     super block / 000000 08 00 01 00 10 00 03 00  04 00 00 00 FA 00 00 00   ................
                 \ 000010 00 00 00 00 00 00 00 00  56 46 53 2D 33 00 00 00   ........VFS-3...
    
                 / 000020 66 69 6C 65 39 2E 74 78  74 00 00 00 09 00 00 00   file9.txt.......
    dir. entries | 000030 66 69 6C 65 31 36 2E 74  78 74 01 00 10 00 00 00   file16.txt......
                 | 000040 66 69 6C 65 32 33 2E 74  78 74 02 00 17 00 00 00   file23.txt......
                 \ 000050 66 69 6C 65 39 61 2E 74  78 74 04 00 09 00 00 00   file9a.txt......
                  
             FAT | 000060 FF FF FF FF 03 00 FF FF  FF FF 00 00 00 00 00 00   ................
    
                 / 000070 39 20 63 68 61 72 73 2E  0A 00 00 00 00 00 00 00   9 chars.........
                 | 000080 31 36 20 63 68 61 72 61  63 74 65 72 73 2E 2E 0A   16 characters...
                 | 000090 45 78 61 63 74 6C 79 20  32 33 20 63 68 61 72 61   Exactly 23 chara
                 | 0000A0 63 74 65 72 73 2E 0A 00  00 00 00 00 00 00 00 00   cters...........
       data area | 0000B0 39 20 63 68 61 72 73 2E  0A 00 00 00 00 00 00 00   9 chars.........
                 | 0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 | 0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 \ 0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
    
           Total sectors:   8
     Sectors per cluster:   1
        Bytes per sector:  16
       Available sectors:   3
       Directory entries:   4
    Available direntries:   0
         Filesystem type:  FA
                Reserved: 00 00 00 00 00 00 00 00 00 00 00
                   Label: 'VFS-3'
    
    
    
    Comments:
    The filesystem is still full, even though there are 3
    data blocks available. Because there are no more
    directory entries available, no new files can be added
    to the filesystem. This is not uncommon when a
    filesystem has many small files.
    

    The filesystem is still technically full, even though there are 3 data blocks available. But, with no more directory entries available, you can't create a new file. Existing files can grow because they already have a directory entry and there are free data blocks, but no new files can be created.


  6. Suppose that the last file was too big to store. What happens? I'm going to try and store this file: (poem.txt)
    Roses are red.<NL>
    Violets are blue.<NL>
    Some poems rhyme.<NL>
    But not this one.<NL>
    
    This file is 69 bytes in size, but there are only 64 bytes (4 * 16) left in the filesystem. This is what the result will be when attempting to store the whole file:

    After creating poem.txt:

    Raw filesystem dump (240 bytes) and super block information (32 bytes):
                          00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
                   --------------------------------------------------------------------------
     super block / 000000 08 00 01 00 10 00 00 00  04 00 00 00 FA 00 00 00   ................
                 \ 000010 00 00 00 00 00 00 00 00  56 46 53 2D 33 00 00 00   ........VFS-3...
    
                 / 000020 66 69 6C 65 39 2E 74 78  74 00 00 00 09 00 00 00   file9.txt.......
    dir. entries | 000030 66 69 6C 65 31 36 2E 74  78 74 01 00 10 00 00 00   file16.txt......
                 | 000040 66 69 6C 65 32 33 2E 74  78 74 02 00 17 00 00 00   file23.txt......
                 \ 000050 70 6F 65 6D 2E 74 78 74  00 00 04 00 40 00 00 00   poem.txt....@...
                  
             FAT | 000060 FF FF FF FF 03 00 FF FF  05 00 06 00 07 00 FF FF   ................
    
                 / 000070 39 20 63 68 61 72 73 2E  0A 00 00 00 00 00 00 00   9 chars.........
                 | 000080 31 36 20 63 68 61 72 61  63 74 65 72 73 2E 2E 0A   16 characters...
                 | 000090 45 78 61 63 74 6C 79 20  32 33 20 63 68 61 72 61   Exactly 23 chara
                 | 0000A0 63 74 65 72 73 2E 0A 00  00 00 00 00 00 00 00 00   cters...........
       data area | 0000B0 52 6F 73 65 73 20 61 72  65 20 72 65 64 2E 0A 56   Roses are red..V
                 | 0000C0 69 6F 6C 65 74 73 20 61  72 65 20 62 6C 75 65 2E   iolets are blue.
                 | 0000D0 0A 53 6F 6D 65 20 70 6F  65 6D 73 20 72 68 79 6D   .Some poems rhym
                 \ 0000E0 65 2E 0A 42 75 74 20 6E  6F 74 20 74 68 69 73 20   e..But not this 
    
           Total sectors:   8
     Sectors per cluster:   1
        Bytes per sector:  16
       Available sectors:   0
       Directory entries:   4
    Available direntries:   0
         Filesystem type:  FA
                Reserved: 00 00 00 00 00 00 00 00 00 00 00
                   Label: 'VFS-3'
    
    
    
    Comments:
    The last file was truncated because the filesystem ran
    out of data blocks. A truncated file will always have
    a size that is exactly a multiple of the bytes per
    sector.
    

    You'll notice that the last file was truncated. Only the bytes that could fit were stored. The size of the file in the filesystem is also modified to reflect this fact. A message would also be displayed saying something along the lines of "No space left on device."


  7. This last situation shows what the filesystem looks like if we run out of data blocks but still have directory entries available. This will likely happen with many large files. Here, I'm going to store one file in the filesystem that takes up all of the data blocks but only requires one directory entry. The file is one long line. There are well over 128 characters in the file, so it will definitely be truncated. (preamble)

    When in the Course of human events, it becomes necessary
    for one people to dissolve the political bands which have
    connected them with another, and to assume among the powers
    of the earth, the separate and equal station to which the 
    Laws of Nature and of Nature's God entitle them, a decent
    respect to the opinions of mankind requires that they
    should declare the causes which impel them to the
    separation.<NL>
    

    After creating preamble:

    Raw filesystem dump (240 bytes) and super block information (32 bytes):
                          00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
                   --------------------------------------------------------------------------
     super block / 000000 08 00 01 00 10 00 00 00  04 00 03 00 FA 00 00 00   ................
                 \ 000010 00 00 00 00 00 00 00 00  56 46 53 2D 33 00 00 00   ........VFS-3...
    
                 / 000020 70 72 65 61 6D 62 6C 65  00 00 00 00 80 00 00 00   preamble........
    dir. entries | 000030 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 | 000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                 \ 000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
                  
             FAT | 000060 01 00 02 00 03 00 04 00  05 00 06 00 07 00 FF FF   ................
    
                 / 000070 57 68 65 6E 20 69 6E 20  74 68 65 20 43 6F 75 72   When in the Cour
                 | 000080 73 65 20 6F 66 20 68 75  6D 61 6E 20 65 76 65 6E   se of human even
                 | 000090 74 73 2C 20 69 74 20 62  65 63 6F 6D 65 73 20 6E   ts, it becomes n
                 | 0000A0 65 63 65 73 73 61 72 79  20 66 6F 72 20 6F 6E 65   ecessary for one
       data area | 0000B0 20 70 65 6F 70 6C 65 20  74 6F 20 64 69 73 73 6F    people to disso
                 | 0000C0 6C 76 65 20 74 68 65 20  70 6F 6C 69 74 69 63 61   lve the politica
                 | 0000D0 6C 20 62 61 6E 64 73 20  77 68 69 63 68 20 68 61   l bands which ha
                 \ 0000E0 76 65 20 63 6F 6E 6E 65  63 74 65 64 20 74 68 65   ve connected the
    
           Total sectors:   8
     Sectors per cluster:   1
        Bytes per sector:  16
       Available sectors:   0
       Directory entries:   4
    Available direntries:   3
         Filesystem type:  FA
                Reserved: 00 00 00 00 00 00 00 00 00 00 00
                   Label: 'VFS-3'
    
    
    
    Comments:
    There is only one file in this filesystem, but the
    filesystem is full. There are directory entries
    available, but there are no data blocks available. No
    new files can be created.
    

    Like before, you'll notice that the last file was truncated. Only the bytes that could fit were stored. The size of the file in the filesystem is also modified to reflect this fact. A message would also be displayed saying something along the lines of "No space left on device." All of the data blocks have been consumed, yet we only have one file in the filesystem. This can happen if you have very large files.
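Here is the C sketch promised earlier of the four update steps (superblock, directory entry, FAT, data area). It reuses the hypothetical struct layout from Basic Structure and the four section pointers described later in Some Implementation Details; it is a simplified illustration, not the original tool's code.

#include <string.h>

/* The four section pointers, set up as shown in "Some Implementation Details". */
extern struct SuperBlock *superblock_ptr;
extern struct DirEntry   *direntries_ptr;
extern unsigned short    *fat_ptr;
extern unsigned char     *data_blocks_ptr;

#define FAT_FREE 0x0000
#define FAT_EOF  0xFFFF

/* Sketch: store 'size' bytes of 'data' as a new file named 'name'.
 * Returns 0 on success, -1 if no directory entry is free. */
int sfat_create_file(const char *name, const unsigned char *data, unsigned int size)
{
    unsigned short bps = superblock_ptr->bytes_per_sector;
    struct DirEntry *de = NULL;

    /* 2. Directory entries - find a free (or previously deleted) slot. */
    for (unsigned short i = 0; i < superblock_ptr->total_direntries; i++) {
        char c = direntries_ptr[i].filename[0];
        if (c == '\0' || c == '?') { de = &direntries_ptr[i]; break; }
    }
    if (de == NULL)
        return -1;                               /* "No space left on device"  */

    memset(de->filename, 0, sizeof(de->filename));
    strncpy(de->filename, name, sizeof(de->filename));
    de->start_block = FAT_EOF;

    /* 3./4. FAT and data area - claim free blocks, chain them, copy the data. */
    unsigned short prev   = FAT_EOF;
    unsigned int   copied = 0;
    for (unsigned short b = 0; b < superblock_ptr->total_sectors && copied < size; b++) {
        if (fat_ptr[b] != FAT_FREE)
            continue;
        if (prev == FAT_EOF)
            de->start_block = b;                 /* first block of the file    */
        else
            fat_ptr[prev] = b;                   /* link the previous block    */
        fat_ptr[b] = FAT_EOF;                    /* this block is now the tail */

        unsigned int chunk = (size - copied < bps) ? size - copied : bps;
        memcpy(data_blocks_ptr + (size_t)b * bps, data + copied, chunk);
        copied += chunk;

        /* 1. Superblock - keep the counters up to date. */
        superblock_ptr->available_sectors--;
        prev = b;
    }

    de->size = copied;                           /* a truncated file records   */
    superblock_ptr->available_direntries--;      /* only what actually fit     */
    return 0;
}

Because a truncated file only records the bytes that were actually copied, its size ends up as an exact multiple of the bytes per sector, which is what the poem.txt and preamble examples show.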

Things to notice at this point:

Deleting Files

It turns out that deleting files requires much less work than creating them. This is because a file's data blocks are not deleted, per se. They are simply marked as deleted, which means that they can be re-used by other files in the future. We'll start with the full filesystem from above, delete each file, and see what the filesystem looks like after each step.

The full filesystem:

Raw filesystem dump (240 bytes) and super block information (32 bytes):
                      00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
               --------------------------------------------------------------------------
 super block / 000000 08 00 01 00 10 00 00 00  04 00 00 00 FA 00 00 00   ................
             \ 000010 00 00 00 00 00 00 00 00  56 46 53 2D 33 00 00 00   ........VFS-3...

             / 000020 66 69 6C 65 39 2E 74 78  74 00 00 00 09 00 00 00   file9.txt.......
dir. entries | 000030 66 69 6C 65 31 36 2E 74  78 74 01 00 10 00 00 00   file16.txt......
             | 000040 66 69 6C 65 32 33 2E 74  78 74 02 00 17 00 00 00   file23.txt......
             \ 000050 66 69 6C 65 36 31 2E 74  78 74 04 00 3D 00 00 00   file61.txt..=...
              
         FAT | 000060 FF FF FF FF 03 00 FF FF  05 00 06 00 07 00 FF FF   ................

             / 000070 39 20 63 68 61 72 73 2E  0A 00 00 00 00 00 00 00   9 chars.........
             | 000080 31 36 20 63 68 61 72 61  63 74 65 72 73 2E 2E 0A   16 characters...
             | 000090 45 78 61 63 74 6C 79 20  32 33 20 63 68 61 72 61   Exactly 23 chara
             | 0000A0 63 74 65 72 73 2E 0A 00  00 00 00 00 00 00 00 00   cters...........
   data area | 0000B0 52 6F 73 65 73 20 61 72  65 20 72 65 64 2E 20 56   Roses are red. V
             | 0000C0 69 6F 6C 65 74 73 20 61  72 65 20 62 6C 75 65 2E   iolets are blue.
             | 0000D0 0A 54 68 69 73 20 74 65  78 74 20 69 73 20 36 31   .This text is 61
             \ 0000E0 20 63 68 61 72 73 20 6C  6F 6E 67 2E 0A 00 00 00    chars long.....
       Total sectors:   8
 Sectors per cluster:   1
    Bytes per sector:  16
   Available sectors:   0
   Directory entries:   4
Available direntries:   0
     Filesystem type:  FA
            Reserved: 00 00 00 00 00 00 00 00 00 00 00
               Label: 'VFS-3'

So what does it mean to "delete" a file? Deleting a file simply means changing the first character of the filename (in the directory entry) to a special character. That special character will be the question mark ? character. Then, we will set all of the file's FAT entries to 0. The data blocks will be left intact. BTW, the ? character is an illegal character in a filename (in our system, as well as others), which means it won't conflict with real file names.

This is the algorithm to delete a file in your system:

  1. Lookup (find) the filename in the directory entries.
  2. Set the first character of the filename to '?'
  3. Set all FAT entries for the file to 0.
  4. Update appropriate counters in the Superblock (available_sectors, available_direntries).
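In C, with the same hypothetical struct layout and section pointers used in the earlier creation sketch, the delete algorithm might look like this:

#include <string.h>

extern struct SuperBlock *superblock_ptr;   /* section pointers, as in the */
extern struct DirEntry   *direntries_ptr;   /* creation sketch above       */
extern unsigned short    *fat_ptr;

/* Sketch: delete 'name', following the four steps above.
 * Returns 0 on success, -1 if the file is not found. */
int sfat_delete_file(const char *name)
{
    for (unsigned short i = 0; i < superblock_ptr->total_direntries; i++) {
        struct DirEntry *de = &direntries_ptr[i];

        /* 1. Look the filename up in the directory entries. */
        if (strncmp(de->filename, name, sizeof(de->filename)) != 0)
            continue;

        /* 2. Mark the entry as deleted. */
        de->filename[0] = '?';

        /* 3. Free the file's FAT chain (the data blocks are left untouched). */
        unsigned short b = de->start_block;
        while (b != 0xFFFF) {
            unsigned short next = fat_ptr[b];
            fat_ptr[b] = 0x0000;
            superblock_ptr->available_sectors++;     /* 4. update the counters */
            b = next;
        }
        superblock_ptr->available_direntries++;      /* 4. (continued)         */
        return 0;
    }
    return -1;                                       /* no such file           */
}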
Let's delete file23.txt first.

Raw filesystem dump (240 bytes) and super block information (32 bytes):
                      00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
               --------------------------------------------------------------------------
 super block / 000000 08 00 01 00 10 00 02 00  04 00 01 00 FA 00 00 00   ................
             \ 000010 00 00 00 00 00 00 00 00  56 46 53 2D 33 00 00 00   ........VFS-3...

             / 000020 66 69 6C 65 39 2E 74 78  74 00 00 00 09 00 00 00   file9.txt.......
dir. entries | 000030 66 69 6C 65 31 36 2E 74  78 74 01 00 10 00 00 00   file16.txt......
             | 000040 3F 69 6C 65 32 33 2E 74  78 74 02 00 17 00 00 00   ?ile23.txt......
             \ 000050 66 69 6C 65 36 31 2E 74  78 74 04 00 3D 00 00 00   file61.txt..=...
              
         FAT | 000060 FF FF FF FF 00 00 00 00  05 00 06 00 07 00 FF FF   ................

             / 000070 39 20 63 68 61 72 73 2E  0A 00 00 00 00 00 00 00   9 chars.........
             | 000080 31 36 20 63 68 61 72 61  63 74 65 72 73 2E 2E 0A   16 characters...
             | 000090 45 78 61 63 74 6C 79 20  32 33 20 63 68 61 72 61   Exactly 23 chara
             | 0000A0 63 74 65 72 73 2E 0A 00  00 00 00 00 00 00 00 00   cters...........
   data area | 0000B0 52 6F 73 65 73 20 61 72  65 20 72 65 64 2E 20 56   Roses are red. V
             | 0000C0 69 6F 6C 65 74 73 20 61  72 65 20 62 6C 75 65 2E   iolets are blue.
             | 0000D0 0A 54 68 69 73 20 74 65  78 74 20 69 73 20 36 31   .This text is 61
             \ 0000E0 20 63 68 61 72 73 20 6C  6F 6E 67 2E 0A 00 00 00    chars long.....
       Total sectors:   8
 Sectors per cluster:   1
    Bytes per sector:  16
   Available sectors:   2
   Directory entries:   4
Available direntries:   1
     Filesystem type:  FA
            Reserved: 00 00 00 00 00 00 00 00 00 00 00
               Label: 'VFS-3'

The bytes that were modified when the file was deleted are the first byte of file23.txt's directory entry, its two FAT entries, and the counters in the super block. Notice that the data blocks are still intact; the super block has simply been updated to reflect the new state of the filesystem.

Note: The fact that the data blocks are left untouched is what allows special utility programs to undelete files that have been deleted. As long as the data blocks do not get re-used, it is possible to recover the data from a deleted file. Some filesystems make this undeletion easier or harder than others. There are also special tools that will perform a secure erase, which means that all of the data blocks are overwritten with zeros or random garbage to prevent any attempt at recovering the data.

This is what the filesystem looks like after deleting (in any order) the remaining files:

Raw filesystem dump (240 bytes) and super block information (32 bytes):
                      00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
               --------------------------------------------------------------------------
 super block / 000000 08 00 01 00 10 00 08 00  04 00 04 00 FA 00 00 00   ................
             \ 000010 00 00 00 00 00 00 00 00  56 46 53 2D 33 00 00 00   ........VFS-3...

             / 000020 3F 69 6C 65 39 2E 74 78  74 00 00 00 09 00 00 00   ?ile9.txt.......
dir. entries | 000030 3F 69 6C 65 31 36 2E 74  78 74 01 00 10 00 00 00   ?ile16.txt......
             | 000040 3F 69 6C 65 32 33 2E 74  78 74 02 00 17 00 00 00   ?ile23.txt......
             \ 000050 3F 69 6C 65 36 31 2E 74  78 74 04 00 3D 00 00 00   ?ile61.txt..=...
              
         FAT | 000060 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................

             / 000070 39 20 63 68 61 72 73 2E  0A 00 00 00 00 00 00 00   9 chars.........
             | 000080 31 36 20 63 68 61 72 61  63 74 65 72 73 2E 2E 0A   16 characters...
             | 000090 45 78 61 63 74 6C 79 20  32 33 20 63 68 61 72 61   Exactly 23 chara
             | 0000A0 63 74 65 72 73 2E 0A 00  00 00 00 00 00 00 00 00   cters...........
   data area | 0000B0 52 6F 73 65 73 20 61 72  65 20 72 65 64 2E 20 56   Roses are red. V
             | 0000C0 69 6F 6C 65 74 73 20 61  72 65 20 62 6C 75 65 2E   iolets are blue.
             | 0000D0 0A 54 68 69 73 20 74 65  78 74 20 69 73 20 36 31   .This text is 61
             \ 0000E0 20 63 68 61 72 73 20 6C  6F 6E 67 2E 0A 00 00 00    chars long.....
       Total sectors:   8
 Sectors per cluster:   1
    Bytes per sector:  16
   Available sectors:   8
   Directory entries:   4
Available direntries:   4
     Filesystem type:  FA
            Reserved: 00 00 00 00 00 00 00 00 00 00 00
               Label: 'VFS-3'

At this point, the filesystem is empty and 4 new files can be created. When the new files are created, the data blocks will get re-used and the old data will get overwritten. That's what we'll look at next.

Creating/Deleting/Creating = Fragmentation

Up until now, we haven't re-used any data blocks. We either created new files with previously unused data blocks or deleted (marked) files. If we never created another file once we started deleting files, life would be simple. Now, however, we are going to start to develop "holes" in the filesystem when we delete files that sit in between other files. The examples below will clarify.

When we want to create a new file, how do we know which blocks to use? This simple filesystem uses a very straightforward technique: the first available block (as determined by the file allocation table) becomes the first block of the new file. If the file requires multiple blocks, we continue searching forward through the FAT in a linear fashion, looking for free blocks.

Example: Suppose we have a full filesystem with 8 small files. (Yes, it's unrealistic.) Each file fits into a single block. Assume there are only 8 data blocks and each block is 512 bytes. There are also only 8 directory entries. The data area would look something like this:

Graphically, the filesystem would look something like this:

If we were to delete files FileB, FileD, FileF, and FileH, we'd have this:

and the filesystem would look something like this:

Observations:

Now, suppose we have a bunch of data that we want to save to a file called FileZ. The data contains 2048 bytes and requires 4 blocks:

We don't have 2,048 contiguous bytes (4 contiguous blocks) so we are going to have to split up the data into 4 blocks that will look something like this:

This is what the filesystem looks like after adding the file:

You can clearly see that FileZ is fragmented. However, this is a much better outcome than just saying that the disk is full and wasting all of the available blocks.

Self-check: Make sure that you can explain what each value in all of the diagrams represents. This will let you know if you understand how this works.

In a nutshell, this is how file fragmentation happens. Let's see how this will be handled with our simple filesystem.

First, let's create a filesystem with several small files. (Deleting small files and then creating a large file is generally how fragmentation happens.) This is the configuration of the file system.

Attribute            Value
total_sectors        8
sectors_per_cluster  1
bytes_per_sectors    16
total_direntries     8
And here is a full filesystem (shown in sections) with 8 files.

Raw filesystem dump (304 bytes) and super block information (32 bytes):
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 08 00 01 00 10 00 00 00  08 00 00 00 FA 00 00 00   ................
000010 00 00 00 00 00 00 00 00  56 46 53 2D 33 00 00 00   ........VFS-3...

000020 46 69 6C 65 41 00 00 00  00 00 00 00 10 00 00 00   FileA...........
000030 46 69 6C 65 42 00 00 00  00 00 01 00 10 00 00 00   FileB...........
000040 46 69 6C 65 43 00 00 00  00 00 02 00 10 00 00 00   FileC...........
000050 46 69 6C 65 44 00 00 00  00 00 03 00 10 00 00 00   FileD...........
000060 46 69 6C 65 45 00 00 00  00 00 04 00 10 00 00 00   FileE...........
000070 46 69 6C 65 46 00 00 00  00 00 05 00 10 00 00 00   FileF...........
000080 46 69 6C 65 47 00 00 00  00 00 06 00 10 00 00 00   FileG...........
000090 46 69 6C 65 48 00 00 00  00 00 07 00 10 00 00 00   FileH...........

0000A0 FF FF FF FF FF FF FF FF  FF FF FF FF FF FF FF FF   ................

0000B0 41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41   AAAAAAAAAAAAAAAA
0000C0 42 42 42 42 42 42 42 42  42 42 42 42 42 42 42 42   BBBBBBBBBBBBBBBB
0000D0 43 43 43 43 43 43 43 43  43 43 43 43 43 43 43 43   CCCCCCCCCCCCCCCC
0000E0 44 44 44 44 44 44 44 44  44 44 44 44 44 44 44 44   DDDDDDDDDDDDDDDD
0000F0 45 45 45 45 45 45 45 45  45 45 45 45 45 45 45 45   EEEEEEEEEEEEEEEE
000100 46 46 46 46 46 46 46 46  46 46 46 46 46 46 46 46   FFFFFFFFFFFFFFFF
000110 47 47 47 47 47 47 47 47  47 47 47 47 47 47 47 47   GGGGGGGGGGGGGGGG
000120 48 48 48 48 48 48 48 48  48 48 48 48 48 48 48 48   HHHHHHHHHHHHHHHH
       Total sectors:   8
 Sectors per cluster:   1
    Bytes per sector:  16
   Available sectors:   0
   Directory entries:   8
Available direntries:   0
     Filesystem type:  FA
            Reserved: 00 00 00 00 00 00 00 00 00 00 00
               Label: 'VFS-3'

There are 8 files that are all 16 bytes in size. This means they each consume exactly one full data block. The contents match the name of the files. This is so you can easily see where one file stops and the next file begins. You'll also be able to easily see the fragmented files because they won't be contiguous.

Self-check: At this point, you should be able to explain what every byte in the output above means. That will let you know if you understand what's been going on here.

OK, so let's delete 4 files: FileB, FileD, FileF, and FileH. This will leave us with 4 free data blocks, none of which are contiguous; in other words, the free space is fragmented:

Raw filesystem dump (304 bytes) and super block information (32 bytes):
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 08 00 01 00 10 00 04 00  08 00 04 00 FA 00 00 00   ................
000010 00 00 00 00 00 00 00 00  56 46 53 2D 33 00 00 00   ........VFS-3...

000020 46 69 6C 65 41 00 00 00  00 00 00 00 10 00 00 00   FileA...........
000030 3F 69 6C 65 42 00 00 00  00 00 01 00 10 00 00 00   ?ileB...........
000040 46 69 6C 65 43 00 00 00  00 00 02 00 10 00 00 00   FileC...........
000050 3F 69 6C 65 44 00 00 00  00 00 03 00 10 00 00 00   ?ileD...........
000060 46 69 6C 65 45 00 00 00  00 00 04 00 10 00 00 00   FileE...........
000070 3F 69 6C 65 46 00 00 00  00 00 05 00 10 00 00 00   ?ileF...........
000080 46 69 6C 65 47 00 00 00  00 00 06 00 10 00 00 00   FileG...........
000090 3F 69 6C 65 48 00 00 00  00 00 07 00 10 00 00 00   ?ileH...........

0000A0 FF FF 00 00 FF FF 00 00  FF FF 00 00 FF FF 00 00   ................

0000B0 41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41   AAAAAAAAAAAAAAAA
0000C0 42 42 42 42 42 42 42 42  42 42 42 42 42 42 42 42   BBBBBBBBBBBBBBBB
0000D0 43 43 43 43 43 43 43 43  43 43 43 43 43 43 43 43   CCCCCCCCCCCCCCCC
0000E0 44 44 44 44 44 44 44 44  44 44 44 44 44 44 44 44   DDDDDDDDDDDDDDDD
0000F0 45 45 45 45 45 45 45 45  45 45 45 45 45 45 45 45   EEEEEEEEEEEEEEEE
000100 46 46 46 46 46 46 46 46  46 46 46 46 46 46 46 46   FFFFFFFFFFFFFFFF
000110 47 47 47 47 47 47 47 47  47 47 47 47 47 47 47 47   GGGGGGGGGGGGGGGG
000120 48 48 48 48 48 48 48 48  48 48 48 48 48 48 48 48   HHHHHHHHHHHHHHHH
       Total sectors:   8
 Sectors per cluster:   1
    Bytes per sector:  16
   Available sectors:   4
   Directory entries:   8
Available direntries:   4
     Filesystem type:  FA
            Reserved: 00 00 00 00 00 00 00 00 00 00 00
               Label: 'VFS-3'

Now, we create a new file that is 64 bytes in size called FileZ:

Raw filesystem dump (304 bytes) and super block information (32 bytes):
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 08 00 01 00 10 00 00 00  08 00 03 00 FA 00 00 00   ................
000010 00 00 00 00 00 00 00 00  56 46 53 2D 33 00 00 00   ........VFS-3...

000020 46 69 6C 65 41 00 00 00  00 00 00 00 10 00 00 00   FileA...........
000030 46 69 6C 65 5A 00 00 00  00 00 01 00 40 00 00 00   FileZ.......@...
000040 46 69 6C 65 43 00 00 00  00 00 02 00 10 00 00 00   FileC...........
000050 3F 69 6C 65 44 00 00 00  00 00 03 00 10 00 00 00   ?ileD...........
000060 46 69 6C 65 45 00 00 00  00 00 04 00 10 00 00 00   FileE...........
000070 3F 69 6C 65 46 00 00 00  00 00 05 00 10 00 00 00   ?ileF...........
000080 46 69 6C 65 47 00 00 00  00 00 06 00 10 00 00 00   FileG...........
000090 3F 69 6C 65 48 00 00 00  00 00 07 00 10 00 00 00   ?ileH...........

0000A0 FF FF 03 00 FF FF 05 00  FF FF 07 00 FF FF FF FF   ................

0000B0 41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41   AAAAAAAAAAAAAAAA
0000C0 5A 5A 5A 5A 5A 5A 5A 5A  5A 5A 5A 5A 5A 5A 5A 5A   ZZZZZZZZZZZZZZZZ
0000D0 43 43 43 43 43 43 43 43  43 43 43 43 43 43 43 43   CCCCCCCCCCCCCCCC
0000E0 5A 5A 5A 5A 5A 5A 5A 5A  5A 5A 5A 5A 5A 5A 5A 5A   ZZZZZZZZZZZZZZZZ
0000F0 45 45 45 45 45 45 45 45  45 45 45 45 45 45 45 45   EEEEEEEEEEEEEEEE
000100 5A 5A 5A 5A 5A 5A 5A 5A  5A 5A 5A 5A 5A 5A 5A 5A   ZZZZZZZZZZZZZZZZ
000110 47 47 47 47 47 47 47 47  47 47 47 47 47 47 47 47   GGGGGGGGGGGGGGGG
000120 5A 5A 5A 5A 5A 5A 5A 5A  5A 5A 5A 5A 5A 5A 5A 5A   ZZZZZZZZZZZZZZZZ
       Total sectors:   8
 Sectors per cluster:   1
    Bytes per sector:  16
   Available sectors:   0
   Directory entries:   8
Available direntries:   3
     Filesystem type:  FA
            Reserved: 00 00 00 00 00 00 00 00 00 00 00
               Label: 'VFS-3'

It's clear that FileZ has been fragmented into 4 non-contiguous blocks. Again, this is more desirable than simply truncating the file or not storing any of it at all. This was a conscious design decision that the inventors of the FAT-like filesystem (and most others, too) made.

Some Implementation Details

To make the implementation easier, you should have pointers to each of the four sections in the filesystem.

Notes: (The diagram is not to scale.)

Calculating the pointers is simple pointer arithmetic. Remember, the entire filesystem is just an array of unsigned characters (bytes). How they are interpreted depends on which section you are referring to. The pointers are all offsets from the beginning of the array and their types match what they are pointing at.
  1. The super block is simply pointing at the first byte in the array. The type of the pointer is struct SuperBlock. (Offset: 0)
  2. The dir entries pointer comes next. The type of the pointer is pointer to struct DirEntry. The offset is:
    sizeof(struct SuperBlock)
    
  3. The FAT pointer follows. The type of the pointer is pointer to unsigned short. The offset is:
    sizeof(struct SuperBlock) + number_of_direntries * sizeof(struct DirEntry)
    
  4. The data blocks pointer follows. The type of the pointer is pointer to unsigned char. The offset is:
    sizeof(struct SuperBlock) + number_of_direntries * sizeof(struct DirEntry) + total_sectors * sizeof(unsigned short)
    
Suppose that you had these variable definitions:
struct SuperBlock *superblock_ptr;
struct DirEntry *direntries_ptr;
unsigned short *fat_ptr;
unsigned char *data_blocks_ptr;
Now, if you wanted to access any member in the superblock:
superblock_ptr->member;
To move to the next directory entry, fat entry, or data block, you could do something like this:
direntries_ptr++;                    /* Move to next directory entry */
fat_ptr++;                           /* Move to next FAT entry       */
data_blocks_ptr += bytes_per_sector; /* Move to next data block      */
Or you could use random access with subscripting for the directory entries and FAT:
direntries_ptr[3]; /* 4th directory entry */
fat_ptr[8];        /* 9th fat entry       */
Since the entire filesystem is an array of unsigned characters, you will need to do some casting to get the other pointer types to work properly. This is expected, especially with low-level code such as a filesystem. (Memory is similar in that it's just an array of characters.)
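For example, the setup might look something like this (a sketch, assuming the filesystem image has already been read into memory and the variable definitions above are at file scope):

/* Sketch: attach the four section pointers to a raw filesystem image. */
void sfat_attach(unsigned char *fs)
{
    superblock_ptr  = (struct SuperBlock *)fs;

    direntries_ptr  = (struct DirEntry *)(fs + sizeof(struct SuperBlock));

    fat_ptr         = (unsigned short *)((unsigned char *)direntries_ptr
                        + superblock_ptr->total_direntries * sizeof(struct DirEntry));

    data_blocks_ptr = (unsigned char *)fat_ptr
                        + superblock_ptr->total_sectors * sizeof(unsigned short);
}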

Once you have these pointers set up properly, moving through the various sections of the filesystem to locate objects is trivial and very efficient.

Self-check:
1. Suppose you want to support filenames up to 255 bytes in length. What changes would you have to make? What are the pros and cons of that?
2. Suppose you wanted to support millions of files. What changes would you need to make?
3. Suppose you wanted to handle files that were GBs in size. What changes would you need to make?
4. Suppose you wanted to keep track of a file's last modification time/date and owner. What changes would need to be made?
5. Currently, we can only create new files. If you wanted to be able to append to an existing file, how would you go about that?

Why The File Allocation Table?

In essence, the scheme that we're using here is based on a singly-linked list. The difference is that we are storing each next pointer of a "node" (block) in a separate array, instead of in the node/block itself. Why is that? In a word: Performance.

The speed of memory is measured in nanoseconds (1 billionth of a second) and the speed of disks (HDD or SSD) is measured in milliseconds (1 thousandth of a second). That makes memory 1,000,000 times faster than the disk. Or, another way to say it is that the disk is 1,000,000 times slower than memory. There is overhead in accessing memory, so to compensate we'll just say that memory access takes microseconds (1 millionth of a second), so the disk is only 1,000 times slower (and that's being generous.)

Bottom line: We want to avoid accessing the disk unless we absolutely, positively must.

Linked lists are used everywhere in programming, but they are usually all in memory. For example, the list below shows 5 blocks of data (nodes) stored in memory. Each node has a next pointer that points to the location of the next node. We can assume that each block contains a few thousand bytes, so "stealing" a handful (4 or 8 bytes for a pointer) at the end has a negligible impact (low overhead).

Once the first block (head, address 2000) in the list is located, reading the next pointer tells us where the next block is. It is just a matter of reading memory to get the next pointer, so it's very fast. We just continue following the next pointers until we reach the end. Since memory is very fast, traversing the list in memory is efficient.

None of this information is new or enlightening to anyone reading this. However, when the nodes (blocks) are on the disk and not in memory, that's when things get interesting (and glacially slow) so a different technique is required.

Suppose each block of data contains 500 bytes of data. The five blocks combined gives us 2,500 bytes of data. This is where the bytes are stored:

  1. Bytes 000 - 499 are in block #0 starting at address 2000.
  2. Bytes 500 - 999 are in block #1 starting at address 3000.
  3. Bytes 1000 - 1499 are in block #2 starting at address 4000.
  4. Bytes 1500 - 1999 are in block #3 starting at address 5000.
  5. Bytes 2000 - 2499 are in block #4 starting at address 6000.
Now suppose you want to read byte number 2300. You know it's in block #4 (2300 / 500 is 4 with integer division) which is in the fifth block, but you don't know where the fifth block is until you traverse the linked list. Unlike arrays, there is no random access to these blocks so we have to walk through all of the preceding blocks to obtain the next pointer and continue until we get to the byte we want.

Now, imagine that those 5 blocks of data are stored on a disk instead of memory. We'll use the same addresses for demonstration purposes. How much effort (read: disk access) is required to read the 5 blocks? The algorithm is the same as before, except instead of reading the next pointers from memory, we have to read them from the disk. Since the disk is 1,000 times slower (again, that's being generous) than memory, it takes 1,000 times longer to read all of the next pointers! That will make any program completely unusable. 60 frames per second? Try more like 0.06 fps.

The fundamental problem with this strategy (when the blocks/nodes are on the disk) is that we have to read the entire block in order to get the next pointer. And reading disk blocks is extremely slow. We are spending way too much time reading unnecessary data just to get to the next pointers.

Here's an idea: What if we separated the data from the next pointers and kept the data (very large) on the disk and kept the next pointers (very small) in memory? That's where the file allocation table comes in.

Of course, we still have to read the next pointers from the disk, but we can read a lot of them (maybe thousands at-a-time) with one disk read and then traverse them in memory instead of reading each one from disk. Once we find the one we need, we can then just make one additional disk read to get the byte(s) we're interested in.
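As a sketch of why this wins, here is the "which block holds byte N?" question from above answered by walking a FAT that is already in memory; only the final block then has to be read from the disk. (The function name is mine; no bounds checking is shown.)

/* Sketch: find which data block holds byte 'offset' of a file by
 * following the "next" pointers stored in the (in-memory) FAT.
 * Assumes 'offset' is within the file. */
unsigned short block_for_offset(unsigned short start_block,
                                unsigned int   offset,
                                const unsigned short *fat,
                                unsigned short bytes_per_block)
{
    unsigned short block = start_block;
    unsigned int   hops  = offset / bytes_per_block;    /* e.g. 2300 / 500 = 4 */

    while (hops-- > 0)
        block = fat[block];        /* a cheap memory read, not a disk read     */

    return block;                  /* one disk read now fetches the right data */
}

In our filesystem, fat would be fat_ptr and bytes_per_block would come from the super block.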

Back to the original problem:


Showing the "logical next pointers" in the file allocation table:

Filesystem Tools

At this point, we have a very rudimentary filesystem in place. There are really only two things that you can do with it and they are:
  1. Create new files
  2. Delete existing files
That's it! We can't even view a file or get any information about one. We also can't update an existing file; we would have to delete it and then re-create it. In practice, many files (e.g. text files) aren't updated in place anyway, and the only way to modify them is to do what I just described (delete, re-create). Still, we need to begin adding more functionality. Remember, the goal is to gain some insight into how filesystems work.

Also, we are tracking very little information about each file. In fact, we only track the name and size. We don't keep track of times, dates, owner/group, types (e.g. executable, directory), permissions like read/write, etc. This was just to keep things very simple at the beginning. Later, you will see how to easily add some of this information. We also don't have subdirectories; every file is in the same "global/root" directory, not unlike some older filesystems that didn't have anything but a single, root directory. MS-DOS didn't support hard disks or subdirectories until version 2.0. (It only supported floppies before then and all files were in the "root" directory.)

Some other things we'd like to do at this point.

Development Tools

Now that you are working with structured binary files, you may need additional custom tools to help you prove to yourself that your output is correct. Text files are completely human-readable, so we never needed any special tools to view them: we could generate output and (sometimes) tell just by looking at it whether it was correct. For complicated text output, we used a diff tool to help us find the differences.

However, with binary files, it's more work because we can't just look at the files (easily) to see the differences, nor can we use a text-based diff tool (easily). Fortunately, there are a few tools that we can use.
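As one example of the kind of tool that helps, a small routine that prints binary data as offsets, hex bytes, and ASCII (similar to the listings in this document, or to what standard tools such as xxd and od produce) only takes a couple dozen lines of C. This is a hypothetical sketch, not one of the tools listed below:

#include <stdio.h>
#include <ctype.h>
#include <stddef.h>

/* Sketch: print 'len' bytes of 'buf' as offset, hex bytes, and ASCII. */
void hexdump(const unsigned char *buf, size_t len)
{
    for (size_t off = 0; off < len; off += 16) {
        printf("%06lX ", (unsigned long)off);

        for (size_t i = 0; i < 16; i++) {            /* hex column            */
            if (off + i < len)
                printf("%02X ", buf[off + i]);
            else
                printf("   ");                       /* pad a short last line */
            if (i == 7)
                printf(" ");                         /* gap after 8 bytes     */
        }

        printf("  ");
        for (size_t i = 0; i < 16 && off + i < len; i++)   /* ASCII column    */
            putchar(isprint(buf[off + i]) ? buf[off + i] : '.');
        putchar('\n');
    }
}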

Here are some suggested tools that I used or created during development of the filesystem:

The Big Picture

OK, so how realistic is this filesystem design? In theory, it's a sound starting place for a usable, albeit inefficient, way to store large numbers of files. In practice, it would be too inefficient for all but the smallest systems. However, by understanding the implementation of this simple filesystem, you can begin to see how more sophisticated systems are built.

Let's see what we can, and more importantly, cannot do with our current implementation.

Our Implementation
  Files
    Create
    Delete
  Filesystem
    Create
    Mount (Load)

Real-World Implementations
  Files
    Read (open/read/close  text/binary)
    Modify (open/write/close  text/binary)
    Grow (append) and shrink (truncate)
    Rename
    List (think: ls -l)
    Timestamps (create/modify/access date/time)
    Permissions (rwx, user/group/other)
    Type info (file, link, dir)
    Longer filenames
    Secure erase
    Directories!
  Filesystem
    Variable block sizes
    Verify (corruption)
    Repair (corruption)
    Defragment
    Compression
    Encryption
    Grow/shrink (w/o losing data)
    Hard links (Windows junctions)
    Soft (symbolic) links (Windows shortcuts)
    Error handling (e.g. duplicate names)
    Transactions
    Copy-on-write
    Snapshots
    Deduplication
    Concurrency (multiple processes accessing the filesystem simultaneously)
Well, that's a humbling list!

So, although our filesystem does almost nothing, you can easily see how you might extend it to support features found in modern filesystems. In fact, here are some questions for you to answer about this simple filesystem. How would you support:

References

Links