File Systems

  1. Overview
  2. Virtual File System Layer
  3. A Brief Unix/Linux/macOS Example
  4. Directories
  1. Case Study: ext2/ext3/ext4 Filesystem
  2. More inode Details
  3. Extents
  4. References

Overview

Virtual File System Layer

Operating System Concepts - 8th Edition Silberschatz, Galvin, Gagne ©2009  

[Figures: hardware view of a file, software view of a file, structure of a file]

A Brief Unix/Linux/macOS Example

Another view: (annotated view)

Note: The multiple levels of indirection shown above are also how B-Trees (a tree-like data structure for very large data sets) work. Many filesystems are implemented using B-Trees or similar data structures that use extents for more efficiency.

The relationship between directory entries, inodes, and data blocks:
You can think of the directory entries as the Table of Contents of the file system. This is how the filesystem "looks up" the file by name and then follows the pointer (12345 in the example) to get to the metadata (inode), which leads to the data blocks.

The size of a pointer and the size of the disk blocks (either blocks of pointers or blocks of data, as they are the same) determine the maximum size of the disk (filesystem) as well as the maximum size of a file. Given this information and the sizes below, answer the question:

Assume these sizes:
Pointer size   Block size    Max filesystem size*   Max file size
4 bytes        2,048 (2K)    Depends                ???
4 bytes        4,096 (4K)    Depends                ???
8 bytes        4,096 (4K)    Depends                ???
8 bytes        8,192 (8K)    Depends                ???
* This value depends on how many inodes the filesystem has and is sometimes determined when the filesystem is created.

Every file in the system has a number of pointers to its data blocks. Find what the maximum number of pointers is (for a file) and then multiply that by the size of a disk block. That gives the size of the largest file. For a simplistic example, if you had a maximum of 1,000 pointers to data blocks, and each data block was 4,096 (4K) bytes, then the largest file would be:

1,000 x 4,096 = 4,096,000 bytes
It's just simple arithmetic. This is why you need to know the size of a pointer and the size of a disk block, because this tells you exactly what the maximum number of pointers can be, and hence, the maximum file size.
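The arithmetic above can be sketched in a few lines of Python. The function names are made up for illustration, and the inode layout assumed by the second function (12 direct pointers plus one single-, one double-, and one triple-indirect pointer) is an assumption, not something fixed by these notes. Real filesystems vary.

```python
def max_file_size(num_pointers, block_size):
    """Largest file when a file can reference num_pointers data blocks."""
    return num_pointers * block_size


def max_data_pointers(pointer_size, block_size, direct=12):
    """Count the data-block pointers reachable from one inode.

    Assumes a hypothetical layout: `direct` direct pointers plus one
    single-, one double-, and one triple-indirect pointer.
    """
    per_block = block_size // pointer_size  # pointers that fit in one block
    return direct + per_block + per_block**2 + per_block**3


if __name__ == "__main__":
    # The simplistic example from the text: 1,000 pointers, 4K blocks.
    print(f"{max_file_size(1000, 4096):,} bytes")
    # A made-up small case: 4-byte pointers and 1K blocks.
    pointers = max_data_pointers(4, 1024)
    print(f"{pointers:,} pointers, {max_file_size(pointers, 1024):,} bytes")
```

Plugging in the pointer and block sizes from the table is left as part of the exercise.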

Self-check - Given all of this information, answer this question: "What is the maximum number of files that a filesystem can hold?" The answer is not simply a number, it's an explanation. Think of it like this: "How many files of zero length can the filesystem hold?" That will give you the answer. (Hint: It's not unlimited or infinite!)

Bonus: What is the command in Linux that will tell you this information?

Self-check - With multiple levels of indirection, filesystems can be implemented efficiently for fragmented files. However, for non-fragmented (i.e. contiguous) files, this approach is not very efficient. Explain why that is and how a better method can be used.

Self-check - For very small files (just a few bytes), there is a lot of overhead necessary to keep track of it using this scheme. Can you think of a simple optimization that could reduce the overhead for files that are very small, say, less than 100 bytes? Many systems have many very small files and we call them symbolic links or shortcuts.

For reference, this is somewhat related to how using a doubly-linked list to keep track of a single character causes a lot of overhead. Essentially, with 8-byte pointers, each node in the list would require 24 bytes just to hold the single character (plus 2 pointers and padding/alignment). That's essentially 96% overhead!
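One way to check that figure is to let Python's ctypes module lay out such a node the way a C compiler would. This is only a sketch: the Node structure and the overhead_fraction helper are hypothetical names, and the 24-byte result assumes a platform with 8-byte pointers.

```python
import ctypes


class Node(ctypes.Structure):
    """A doubly-linked list node that stores a single character."""


# Fields are assigned after the class exists so the node can point to itself.
Node._fields_ = [
    ("prev", ctypes.POINTER(Node)),
    ("next", ctypes.POINTER(Node)),
    ("value", ctypes.c_char),
]


def overhead_fraction(node_size, payload_size=1):
    """Fraction of the node that is not payload."""
    return (node_size - payload_size) / node_size


if __name__ == "__main__":
    size = ctypes.sizeof(Node)  # includes alignment padding
    print(f"node size: {size} bytes, overhead: {overhead_fraction(size):.0%}")
```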

A simple filesystem implementation.

Directories

Case Study: ext2/ext3/ext4 Filesystem

The first filesystem developed specifically for Linux was the ext filesystem or extended filesystem, which was based on the Unix filesystem (a.k.a. the Berkeley Fast File System or FFS). Then, the ext2 filesystem enhanced ext further with more features from the FFS.

Next came the ext3 filesystem which added more improvements, especially journaling. After that came the ext4 filesystem, which added several more improvements, most notably, extents. Because the data structures (for the most part) have been compatible between the three filesystems (and we aren't interested in the other features yet), talking about ext4 will be very similar to discussing the structure of ext2/ext3 systems.

The ext4 filesystem is a very stable and mature filesystem used by many Linux distributions. It's not the best (if there exists a "best" filesystem) or fastest or the most feature-rich filesystem, but it's fairly efficient and fairly straightforward to understand and implement (if you're an operating systems implementer). Many more powerful/complex filesystems have attributes similar to those of ext4. By understanding the basics of this filesystem, you'll be more likely to understand how other filesystems work and what they have done to improve upon ext4.

So, with that said, let's see just how much work the filesystem must do in order to simply display the contents of a simple text file. We'll use this reference system for the demonstration:

chico@nina ~ $ ls -l /
total 258,048
drwxr-xr-x   2 root root  4,096 Apr  9  2019 bin
drwxr-xr-x   3 root root  4,096 Apr  9  2019 boot
drwxr-xr-x   2 root root  4,096 Aug 23  2015 cdrom
drwxr-xr-x  17 root root  4,640 Oct  1 11:56 dev
drwxr-xr-x 213 root root 12,288 Oct  8 13:17 etc
drwxr-xr-x  10 root root  4,096 Oct  8 13:20 home
drwxr-xr-x   8 root root  4,096 Oct  8 13:20 homes
drwxr-xr-x  27 root root  4,096 Apr 16  2019 lib
drwxr-xr-x   2 root root  4,096 Apr  9  2019 lib32
drwxr-xr-x   2 root root  4,096 Apr  9  2019 lib64
drwxr-xr-x   2 root root  4,096 Apr  9  2019 libx32
drwxr-xr-x   2 root root 16,384 Feb 18  2017 lost+found
drwxr-xr-x   6 root root  4,096 Jul  8  2018 media
[several more lines removed . . .]
chico@nina ~ $

We're going to focus on the user named chico. We will search for and display a text file in his own home directory which is /homes/chico. Let's see what we have in the homes directory.

chico@nina ~ $ ls -l /homes
total 24,576
drwxr-xr-x 2 alvin    alvin    4,096 Oct  8 13:20 alvin
drwxr-xr-x 2 betty    betty    4,096 Oct  8 13:20 betty
drwxr-xr-x 8 chico    chico    4,096 Oct  8 13:20 chico
drwxr-xr-x 2 fred     fred     4,096 Oct  8 13:20 fred
drwxr-xr-x 2 veronica veronica 4,096 Oct  8 13:20 veronica
drwxr-xr-x 2 wilma    wilma    4,096 Oct  8 13:20 wilma
chico@nina ~ $

Note: On a typical Linux system, a user's home directory is in the /home (singular) directory. However, for this example (and for technical reasons), I've created some "artificial" users in /homes (plural) which will make the details a little easier to explain and understand. Just keep that in mind if you're trying to find a /homes directory on your system as it's unlikely to exist.

Let's see what's in chico's directory using the tree command:

chico@nina ~ $ tree /homes/chico
/homes/chico
├── bathroom
├── bedroom
├── garage
└── kitchen
    ├── cupboards
    ├── microwave
    ├── oven
    ├── refrigerator
    │   ├── apples
    │   ├── butter
    │   ├── cake
    │   ├── cheese
    │   ├── chicken
    │   ├── coke
    │   ├── eggs
    │   ├── juice
    │   ├── milk
    │   └── pie
    ├── sink
    └── stove

10 directories, 10 files
chico@nina ~ $

The file we're interested in is cake. The full path to cake is:
/homes/chico/kitchen/refrigerator/cake
and the command that we will use to display the contents:
cat /homes/chico/kitchen/refrigerator/cake
and the output:
eggs
butter
milk
flour
vanilla
icing
strawberries
peaches
lettuce
asparagus
Which is presumably all of the things that are in the cake! (Don't knock it until you've tried it!)

Note: I'm attempting to create an analogy/metaphor here. In chico's home (directory) there is a kitchen (directory), and in the kitchen there is a refrigerator (directory) and in the refrigerator there is a cake (file) that contains ingredients (lines of text).

So, the question is, "How many disk reads are required to locate (search), open (read), and display the file?" To answer that question, this is how we proceed.

  1. We have to locate the root directory, / (the forward slash). All files are within the root directory.
  2. Then, we search the root directory looking for a directory called homes.
  3. Next, we search the homes directory looking for a directory called chico.
  4. Next, we search the chico directory looking for a directory called kitchen.
  5. Next, we search the kitchen directory looking for a directory called refrigerator.
  6. Next, we search the refrigerator directory looking for a file called cake.
  7. Then, we open the file called cake and read in all of the data.
  8. Finally, we display the data on the screen.
The first seven steps all require disk reads. That, in a nutshell, is how we would display the contents of the file. As you can see, the longer the path is, the more work is required by the filesystem. So, as you can imagine, locating the file:
/usr/hostname
is going to require significantly less work than locating this file:
/usr/share/icons/foo/bar/baz/bat/one/more/dir/and/were/done/file.txt
The hostname file above only requires searching the root directory (/) and the usr directory. The file.txt requires searching the root directory and 13 other directories before getting to file.txt! That's a lot of work that must be done every time you access that file. Fortunately for the users, it's all hidden behind the filesystem.
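To make the counting concrete, here is a small sketch that lists the directories that must be searched for a given absolute path. The function name directories_to_search is made up for this example, and it only counts directories. It does not model caching or how many disk blocks each directory occupies.

```python
def directories_to_search(path):
    """Return the directories searched, in order, to locate an absolute path."""
    parts = [p for p in path.split("/") if p]
    # The root is always searched first; the last component is the target.
    return ["/"] + parts[:-1]


if __name__ == "__main__":
    for path in (
        "/usr/hostname",
        "/homes/chico/kitchen/refrigerator/cake",
        "/usr/share/icons/foo/bar/baz/bat/one/more/dir/and/were/done/file.txt",
    ):
        print(len(directories_to_search(path)), path)
```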

The ext4 filesystem accomplishes this work using inodes and data blocks that were described above. Let's go through this step-by-step to see exactly what is going on. I'm going to use real data from one of my systems to show this process.

As you can imagine, there are a bunch of tools on a Linux system that will help us peer into the filesystem's data structures (inodes) and disk blocks. The first and simplest command is our trusty ls command. If you run ls -ld / (on the root directory), it will display something like this:

drwxr-xr-x 29 root root 4,096 Oct  9 13:00 /
If we add -i to the command, it will also show us the inode that contains the information about the root directory:
ls -ldi /
Output:
2 drwxr-xr-x 29 root root 4,096 Oct  9 13:00 /
This tells us that the root directory's inode is inode #2. By the way, the -d option tells ls to just show information about the directory itself, not the contents of the directory. Removing the option makes ls list the contents of the directory instead.

Another way we could have found the inode is with the stat command:

stat /
Output:
  File: '/'
  Size: 4096      	Blocks: 8          IO Block: 4096   directory
Device: 801h/2049d	Inode: 2           Links: 29
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2019-05-18 12:13:53.741950988 -0700
Modify: 2020-10-09 13:00:01.918312997 -0700
Change: 2020-10-09 13:00:01.918312997 -0700
 Birth: -
There's a lot of other information displayed as well, but for now, we're just concerned with the inode. (The IO Block: 4096 is also important as it tells us how big each logical disk block is.)
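The same information is available to programs through the stat system call. As a minimal sketch, assuming a Unix-like system, Python's os.stat exposes the inode number and the preferred I/O block size. The describe helper is a made-up name, and the values printed will depend on the machine and filesystem it runs on.

```python
import os
import stat


def describe(path):
    """Return (inode number, preferred I/O block size, is_directory)."""
    info = os.stat(path)
    return info.st_ino, info.st_blksize, stat.S_ISDIR(info.st_mode)


if __name__ == "__main__":
    inode, blksize, is_dir = describe("/")
    print(f"inode={inode} io_block={blksize} directory={is_dir}")
```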

OK, so we have the inode, but where on the disk is that inode? This is where the next tool comes in handy. It's called debugfs and it's used to help debug (or simply glean information about) the ext2/ext3/ext4 filesystems. This is the command that will map the inode number into a disk block:

sudo debugfs -R 'imap <2>' /dev/sda1
This command essentially runs debugfs and tells it to map inode #2 to its corresponding disk block on /dev/sda1, which is the first partition on the first hard drive in the system. If you want to see all of the partitions on all of the drives, just run the lsblk command and you'll see something like this:
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 931.5G  0 disk 
├─sda1   8:1    0  39.1G  0 part /
├─sda2   8:2    0  15.6G  0 part 
├─sda3   8:3    0  39.1G  0 part 
├─sda4   8:4    0     1K  0 part 
├─sda5   8:5    0 781.3G  0 part /home
├─sda6   8:6    0    41G  0 part /opt
└─sda7   8:7    0  15.5G  0 part [SWAP]
sdb      8:16   0   3.7T  0 disk 
└─sdb1   8:17   0   3.7T  0 part /storage
sdc      8:32   0   3.7T  0 disk 
└─sdc1   8:33   0   3.7T  0 part /media/chico/wd-elements1
sdd      8:48   0   3.7T  0 disk 
└─sdd1   8:49   0   3.7T  0 part /media/chico/wd-elements3
sde      8:64   0   3.7T  0 disk 
└─sde1   8:65   0   3.7T  0 part /media/chico/wd-elements2
sr0     11:0    1  1024M  0 rom  
I've highlighted the partition that we're interested in, which is sda1, the first partition on the first disk.

This output is also telling me that there are 6 "disks" connected to my computer named sda, sdb, sdc, sdd, sde, and sr0 (which is a DVD drive). It also tells me that the first drive has 7 partitions and the other hard drives only have one. Incidentally, these are the types of storage devices in the system:

  1. sda - This is a 1 TB internal solid state mSATA drive.
  2. sdb - This is a 4 TB internal solid state drive (SSD).
  3. sdc - This is a 4 TB external USB drive.
  4. sdd - This is a 4 TB external USB drive.
  5. sde - This is a 4 TB external USB drive.
  6. sr0 - This is an external USB DVD reader/writer.

OK, so back to the command:

sudo debugfs -R 'imap <2>' /dev/sda1
and its output:
debugfs 1.42.9 (4-Feb-2014)
Inode 2 is part of block group 0
	located at block 1057, offset 0x0100
The important information is the last line which tells us that inode #2 is located 256 bytes (0x0100) within disk block #1057. Now, all we have to do is to read the data at that location and we will have read all of the important information about the root directory.
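Part of that mapping is plain arithmetic, which the sketch below imitates. It is not how debugfs is implemented. The function name locate_inode is made up, and the defaults (8,192 inodes per block group, 256-byte inodes, 4K blocks) are assumptions about this particular filesystem. On a real system they can be read from the superblock, for example with dumpe2fs -h. The sketch only finds the block group and the position inside that group's inode table. Turning that into an absolute disk block such as 1057 also requires the starting block of the inode table, which is stored in the group descriptor and is not read here.

```python
def locate_inode(inode, inodes_per_group=8192, inode_size=256, block_size=4096):
    """Return (block group, block index within that group's inode table,
    byte offset within that block) for an inode number.

    The default sizes are assumptions about this particular filesystem.
    """
    index = inode - 1  # inode numbers start at 1
    group, slot = divmod(index, inodes_per_group)
    table_block, offset = divmod(slot * inode_size, block_size)
    return group, table_block, offset


if __name__ == "__main__":
    group, table_block, offset = locate_inode(2)
    print(f"group {group}, table block {table_block}, offset {offset:#06x}")
```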

To help out with my demonstration, I've written my own program that will read any blocks or partial blocks of data from any partition on any device. It's called readblock and you use it like this:

sudo readblock <partition> <block-number> <offset> <bytes-to-read>
So, to read the raw bytes from inode #2, we do this:
sudo readblock /dev/sda1 1057 0x0100 256
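Broken down:

  1. /dev/sda1 - The partition to read from.
  2. 1057 - The block number.
  3. 0x0100 - The offset within the block (256 bytes).
  4. 256 - The number of bytes to read.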

Note: The readblock program is a work-in-progress. It currently reads the device to find out the size of the disk blocks. Generally, the size of the blocks is 4K (4,096) bytes, which is true for all of my partitions. It is important to have the correct block size because that value is used in all of the calculations. The program also allows the user to specify values in hexadecimal (0x prefix) or decimal.

Note: There are existing tools on Linux that will do something similar to my readblock program. However, I wanted to have total control over the output, so I wrote my own. It's only a few lines of code, actually. One such tool on Linux is dd. Very handy, powerful, and dangerous! Read up on it before using it! YOU HAVE BEEN WARNED!

So, the actual bytes that will be read start at byte 4,329,728 and end just before byte 4,329,984. The way we arrived at those numbers was:

BlockNumber * BlockSize + Offset
    1057    *   4096    +  256     = 4,329,472 + 256 = 4,329,728 [starting byte]
                                               + 256 = 4,329,984 [ending byte]
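The calculation is easy to get wrong by hand, so here is a minimal sketch of it in Python, together with a read function in the spirit of readblock. This is not the readblock program itself. The names byte_range and read_block are made up, and the block size is passed in as a parameter rather than detected from the device.

```python
def byte_range(block_number, block_size, offset, count):
    """Return (first byte, one past the last byte) for a read."""
    start = block_number * block_size + offset
    return start, start + count


def read_block(path, block_number, offset, count, block_size=4096):
    """Read count bytes starting offset bytes into the given block."""
    start, _ = byte_range(block_number, block_size, offset, count)
    with open(path, "rb") as device:
        device.seek(start)
        return device.read(count)


if __name__ == "__main__":
    start, end = byte_range(1057, 4096, 0x0100, 256)
    print(f"{start:,} to {end:,}")
```

Reading from a real partition such as /dev/sda1 this way requires root privileges. The same function works on an ordinary file, which is a safe way to experiment.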
Now, because the information in the inode is mostly binary, when displaying it on the screen it will just look like garbage:
�AqY�\Qπ_Qπ7�!$ �P�ɬP��0尦��X
However, it really did read and display (or try to display) 256 bytes of binary data. One thing you can do is to redirect the output to a file:
sudo readblock /dev/sda1 1057 0x0100 256 > inode2.bin
On the disk you'll see that it's exactly 256 bytes:
ls -l inode2.bin
Output:
-rw------- 1 chico chico 256 Oct  9 14:46 inode2.bin
Now, you can just use any of the bajillion hex viewers to look at it, such as hexdump or od (octal dump):
od -x inode2.bin
Output:
0000000 41ed 0000 1000 0000 5971 5ce0 cf51 5f80
0000020 cf51 5f80 0000 0000 0000 001d 0008 0000
0000040 0000 0008 3714 0000 f30a 0001 0004 0000
0000060 0000 0000 0000 0000 0001 0000 2421 0000
0000100 0000 0000 0000 0000 0000 0000 0000 0000
*
0000200 0020 0000 50ac c9c5 50ac c9c5 1830 b0e5
0000220 ffa6 58ac 0000 0000 0000 0000 0000 0000
0000240 0000 0000 0000 0000 0000 0000 0000 0000
*
0000400
Or, better yet, how about the trusty old dumpit program:
dumpit inode2.bin
Output:
inode2.bin:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 ED 41 00 00 00 10 00 00  71 59 E0 5C 51 CF 80 5F   .A......qY.\Q.._
000010 51 CF 80 5F 00 00 00 00  00 00 1D 00 08 00 00 00   Q.._............
000020 00 00 08 00 14 37 00 00  0A F3 01 00 04 00 00 00   .....7..........
000030 00 00 00 00 00 00 00 00  01 00 00 00 21 24 00 00   ............!$..
000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000060 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000070 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000080 20 00 00 00 AC 50 C5 C9  AC 50 C5 C9 30 18 E5 B0    ....P...P..0...
000090 A6 FF AC 58 00 00 00 00  00 00 00 00 00 00 00 00   ...X............
0000A0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000F0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
This is showing us the actual raw binary data that is stored in the disk block.
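If neither dumpit nor od is at hand, a dump in roughly the same layout takes only a few lines. The sketch below is not the dumpit program. The hex_dump name is made up and the column layout is just imitated from the output above.

```python
def hex_dump(data, width=16):
    """Return lines of hex bytes followed by their printable ASCII form."""
    lines = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        left = " ".join(f"{b:02X}" for b in chunk[:8])
        right = " ".join(f"{b:02X}" for b in chunk[8:])
        # Non-printable bytes are shown as dots.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{start:06X} {left:<23}  {right:<23}   {text}")
    return lines


if __name__ == "__main__":
    sample = bytes.fromhex("ED 41 00 00 00 10 00 00 71 59 E0 5C 51 CF 80 5F")
    print("\n".join(hex_dump(sample)))
```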

In fact, let's skip the temporary file creation and just pipe the output of readblock directly into dumpit:

sudo readblock /dev/sda1 1057 0x0100 256 | dumpit
That will produce the same output! Yeah, pipes are a wonderful thing! (If you don't have access to the dumpit program, you can just use od or something similar.)
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 ED 41 00 00 00 10 00 00  71 59 E0 5C 51 CF 80 5F   .A......qY.\Q.._
000010 51 CF 80 5F 00 00 00 00  00 00 1D 00 08 00 00 00   Q.._............
000020 00 00 08 00 14 37 00 00  0A F3 01 00 04 00 00 00   .....7..........
000030 00 00 00 00 00 00 00 00  01 00 00 00 21 24 00 00   ............!$..
000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000060 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000070 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000080 20 00 00 00 AC 50 C5 C9  AC 50 C5 C9 30 18 E5 B0    ....P...P..0...
000090 A6 FF AC 58 00 00 00 00  00 00 00 00 00 00 00 00   ...X............
0000A0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000F0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
Most of the entries are zeros, but there is a bunch of other stuff. Specifically, those values represent permissions (read/write/execute and owner/group) as well as time/date of the file, how big it is, what type of file/directory it is, etc. However, what we are interested in is the contents of the root directory. Remember, our goal in all of this is to locate and display this file:
/homes/chico/kitchen/refrigerator/cake
Currently, we've just found the root directory's inode. Now, with this, we need to get the contents of the root directory because that's where the homes directory is located. I've highlighted some bytes in the output above: the four bytes at offset 0x3C (21 24 00 00). The hex number 00 00 24 21 is the one. (My system is little-endian, so that's why the bytes appear to be reversed.) That number is a pointer (block number) to another block that contains the contents (i.e. the filenames) of the root directory.

Aside: There is a lot of information encoded in that inode and most of it is not necessary to understand in order to learn how the filesystem works. I will point out some other useful bits of information later. For now, the only piece we are interested in is the location (read: pointer) of the contents of the directory. That's what is highlighted. There are links below that will describe the layout of the inode and all of its data fields in excruciating detail.
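As a rough illustration, a few of those fields can be pulled out of the raw bytes with Python's struct module. The bytes below are the first 64 bytes of the root directory's inode from the dump above. The field offsets (mode at 0x00, size at 0x04, link count at 0x1A, and the block pointer at 0x3C) are assumptions based on the ext4 on-disk layout and happen to agree with what stat reported. The decode_inode_head name is made up. The pointer at 0x3C is only in that position for a simple inode like this one.

```python
import stat
import struct

# First 64 bytes of the root directory's inode, copied from the dump above.
INODE_HEAD = bytes.fromhex(
    "ED 41 00 00 00 10 00 00 71 59 E0 5C 51 CF 80 5F "
    "51 CF 80 5F 00 00 00 00 00 00 1D 00 08 00 00 00 "
    "00 00 08 00 14 37 00 00 0A F3 01 00 04 00 00 00 "
    "00 00 00 00 00 00 00 00 01 00 00 00 21 24 00 00"
)


def decode_inode_head(raw):
    """Decode a few little-endian fields; the offsets are assumptions."""
    (mode,) = struct.unpack_from("<H", raw, 0x00)   # type and permissions
    (size,) = struct.unpack_from("<I", raw, 0x04)   # size in bytes (low 32 bits)
    (links,) = struct.unpack_from("<H", raw, 0x1A)  # hard link count
    (block,) = struct.unpack_from("<I", raw, 0x3C)  # pointer to the contents
    return {"mode": stat.filemode(mode), "size": size,
            "links": links, "block": block}


if __name__ == "__main__":
    fields = decode_inode_head(INODE_HEAD)
    print(fields, hex(fields["block"]))
```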

Ok, so how do we read the contents? Simple. We use the readblock program again:
sudo readblock /dev/sda1 0x2421 0 512 | dumpit
I'm just reading the first 512 bytes from data block #0x2421 (9249 in decimal), as that will contain what we're looking for. Of course, all data blocks are 4,096 bytes in length and if I showed every byte, all of the bytes at the end would be 0.
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 02 00 00 00 0C 00 01 02  2E 00 00 00 02 00 00 00   ................
000010 0C 00 02 02 2E 2E 00 00  0B 00 00 00 14 00 0A 02   ................
000020 6C 6F 73 74 2B 66 6F 75  6E 64 00 00 0C 00 00 00   lost+found......
000030 14 00 0A 07 69 6E 69 74  72 64 2E 69 6D 67 00 00   ....initrd.img..
000040 0D 00 00 00 10 00 07 07  76 6D 6C 69 6E 75 7A 00   ........vmlinuz.
000050 01 00 24 00 0C 00 03 02  62 69 6E 00 01 00 08 00   ..$.....bin.....
000060 0C 00 04 02 62 6F 6F 74  01 00 0C 00 10 00 05 02   ....boot........
000070 63 64 72 6F 6D 00 00 00  01 00 0A 00 0C 00 03 02   cdrom...........
000080 64 65 76 00 01 00 02 00  0C 00 03 02 65 74 63 00   dev.........etc.
000090 01 00 0E 00 0C 00 04 02  68 6F 6D 65 01 00 14 00   ........home....
0000A0 0C 00 03 02 6C 69 62 00  01 00 20 00 10 00 05 02   ....lib... .....
0000B0 6C 69 62 33 32 00 00 00  01 00 10 00 10 00 05 02   lib32...........
0000C0 6C 69 62 36 34 00 00 00  01 00 04 00 10 00 06 02   lib64...........
0000D0 6C 69 62 78 33 32 00 00  01 00 16 00 10 00 05 02   libx32..........
0000E0 6D 65 64 69 61 00 00 00  01 00 06 00 0C 00 03 02   media...........
0000F0 6D 6E 74 00 01 00 22 00  0C 00 03 02 6F 70 74 00   mnt...".....opt.
000100 01 00 18 00 0C 00 04 02  70 72 6F 63 01 00 1C 00   ........proc....
000110 0C 00 04 02 72 6F 6F 74  01 00 1A 00 0C 00 03 02   ....root........
000120 72 75 6E 00 01 00 1E 00  0C 00 04 02 73 62 69 6E   run.........sbin
000130 02 00 08 00 0C 00 03 02  73 72 76 00 02 00 04 00   ........srv.....
000140 0C 00 03 02 73 79 73 00  02 00 06 00 0C 00 03 02   ....sys.........
000150 74 6D 70 00 02 00 0A 00  0C 00 03 02 75 73 72 00   tmp.........usr.
000160 02 00 02 00 0C 00 03 02  76 61 72 00 02 00 0C 00   ........var.....
000170 0C 00 04 02 77 65 62 6D  66 77 06 00 14 00 07 02   ....webmfw......
000180 73 74 6F 72 61 67 65 6F  46 71 57 41 0E 00 00 00   storageoFqWA....
000190 10 00 05 01 2E 68 63 77  64 00 00 00 1A 0E 02 00   .....hcwd.......
0001A0 10 00 07 02 2E 63 6F 6E  66 69 67 74 D2 05 14 00   .....configt....
0001B0 10 00 05 02 68 6F 6D 65  73 31 77 76 0F 00 00 00   ....homes1wv....
0001C0 44 0E 10 01 77 65 62 6D  69 6E 2D 73 65 74 75 70   D...webmin-setup
0001D0 2E 6F 75 74 12 00 00 00  2C 0E 12 01 2E 69 73 6D   .out....,....ism
0001E0 6F 75 6E 74 2D 74 65 73  74 2D 66 69 6C 65 00 00   ount-test-file..
0001F0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
I've highlighted the name of the directory we're searching for (homes) as well as a few other things. The 05 is the length of the filename, as these are not NUL-terminated strings (like C/C++). Also, the 02 is the type of file (0x02 means it's a directory). Files can be of these types:
Code   Type of file
0      Unknown
1      regular file
2      directory
3      character device
4      block device
5      FIFO
6      socket
7      symbolic link

Lastly, and most importantly, I've highlighted the number D2 05 14 00, as this is the inode (little endian) for the homes directory. Remember, in addition to finding and searching the root directory, we also have to find and search the homes, chico, kitchen, and refrigerator directories. This is what's happening "behind the scenes" every time you try to access any file on the system.

Incidentally, the 2E and 2E 2E values at the top of the output correspond to the current directory (just a single dot .) and the parent directory (2 dots ..), which are two directories you will find in every directory (even the root, which has no parent!).
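The directory block can be walked programmatically as well. The sketch below parses the first 48 bytes of the root directory dump shown earlier. It assumes each entry starts with a 4-byte inode number, a 2-byte record length (the distance to the next entry), a 1-byte name length, and a 1-byte file type, all little-endian, followed by the name. The record length field is not discussed in the text above, so treat that part of the layout as an assumption taken from the ext2/ext4 directory entry format. The parse_entries name is made up.

```python
import struct

FILE_TYPES = {0: "unknown", 1: "regular file", 2: "directory",
              3: "character device", 4: "block device", 5: "FIFO",
              6: "socket", 7: "symbolic link"}

# First 48 bytes of the root directory block, copied from the dump above.
ROOT_DIR_HEAD = bytes.fromhex(
    "02 00 00 00 0C 00 01 02 2E 00 00 00 02 00 00 00 "
    "0C 00 02 02 2E 2E 00 00 0B 00 00 00 14 00 0A 02 "
    "6C 6F 73 74 2B 66 6F 75 6E 64 00 00 0C 00 00 00"
)


def parse_entries(block):
    """Yield (inode, name, type) for each complete entry in block."""
    pos = 0
    while pos + 8 <= len(block):
        inode, rec_len, name_len, ftype = struct.unpack_from("<IHBB", block, pos)
        if rec_len < 8 or pos + 8 + name_len > len(block):
            break  # truncated or malformed entry
        name = block[pos + 8:pos + 8 + name_len].decode("ascii", "replace")
        if inode != 0:  # inode 0 marks an unused entry
            yield inode, name, FILE_TYPES.get(ftype, "unknown")
        pos += rec_len


if __name__ == "__main__":
    for inode, name, kind in parse_entries(ROOT_DIR_HEAD):
        print(inode, name, kind)
```

Feeding it a whole 4K directory block and stopping at the wanted name would give the lookup step used throughout this walkthrough.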

Ok, so now it's time to search through the contents of the homes directory and see if we can locate the directory named chico.

First, we have to read the inode for the homes directory. We know that the inode number is 0x001405D2 because that's what we found in the root directory. Converting the hex to decimal we get 1312210. To verify that we are actually correct, we can simply stat the homes directory:

stat /homes
Output:
  File: '/homes'
  Size: 4096      	Blocks: 8          IO Block: 4096   directory
Device: 801h/2049d	Inode: 1312210     Links: 8
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-10-08 13:15:10.002908649 -0700
Modify: 2020-10-08 13:20:28.402906431 -0700
Change: 2020-10-08 13:20:28.402906431 -0700
 Birth: -
Of course, we could have done this as well:
ls -ldi /homes
Output:
1312210 drwxr-xr-x 8 root root 4,096 Oct  8 13:20 /homes
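The conversion from the little-endian bytes in the dump to the decimal inode number can also be checked in code. This is a small sketch, and the helper name inode_from_dump is made up.

```python
def inode_from_dump(hex_bytes):
    """Convert space-separated little-endian hex bytes to an integer."""
    return int.from_bytes(bytes.fromhex(hex_bytes), byteorder="little")


if __name__ == "__main__":
    # The four bytes found next to the name "homes" in the root directory.
    print(inode_from_dump("D2 05 14 00"))
```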
Ok, let's dump that inode using readblock. First, we have to find out where (read: in which disk block) the inode resides. Using debugfs again to map the inode number to a disk block:
sudo debugfs -R 'imap <1312210>' /dev/sda1
Output:
Inode 1312210 is part of block group 160
	located at block 5243005, offset 0x0100
Using this information, we can read the block:
sudo readblock /dev/sda1 5243005 0x0100 256 | dumpit
Output:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 ED 41 00 00 00 10 00 00  4E 73 7F 5F 8C 74 7F 5F   .A......Ns._.t._
000010 8C 74 7F 5F 00 00 00 00  00 00 08 00 08 00 00 00   .t._............
000020 00 00 08 00 07 00 00 00  0A F3 01 00 04 00 00 00   ................
000030 00 00 00 00 00 00 00 00  01 00 00 00 AF 2B 50 00   .............+P.
000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000060 00 00 00 00 2D AD E8 08  00 00 00 00 00 00 00 00   ....-...........
000070 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000080 1C 00 00 00 FC 74 0F 60  FC 74 0F 60 A4 87 B1 00   .....t.`.t.`....
000090 4E 73 7F 5F A4 87 B1 00  00 00 00 00 00 00 00 00   Ns._............
0000A0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000F0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
This is the inode for the homes directory. We need to see the content (read: filenames) in the directory. I've highlighted the pointer to the contents above. Now, read that block to get the contents:
sudo readblock /dev/sda1 0x00502BAF 0 256 | dumpit
Output:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 D2 05 14 00 0C 00 01 02  2E 00 00 00 02 00 00 00   ................
000010 0C 00 02 02 2E 2E 00 00  29 60 14 00 10 00 05 02   ........)`......
000020 63 68 69 63 6F 00 00 00  38 60 14 00 10 00 05 02   chico...8`......
000030 61 6C 76 69 6E 00 00 00  39 60 14 00 10 00 08 02   alvin...9`......
000040 76 65 72 6F 6E 69 63 61  3A 60 14 00 10 00 05 02   veronica:`......
000050 62 65 74 74 79 00 00 00  3B 60 14 00 0C 00 04 02   betty...;`......
000060 66 72 65 64 3C 60 14 00  9C 0F 05 02 77 69 6C 6D   fred<`......wilm
000070 61 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   a...............
000080 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000090 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000A0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000F0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
Aw, yeah! Now we're cookin' with gas! I've highlighted the name (chico) and its corresponding inode (0x00146029). Remember, this is what's in /homes:

chico@nina ~ $ ls -l /homes
total 24,576
drwxr-xr-x 2 alvin    alvin    4,096 Oct  8 13:20 alvin
drwxr-xr-x 2 betty    betty    4,096 Oct  8 13:20 betty
drwxr-xr-x 8 chico    chico    4,096 Oct  8 13:20 chico
drwxr-xr-x 2 fred     fred     4,096 Oct  8 13:20 fred
drwxr-xr-x 2 veronica veronica 4,096 Oct  8 13:20 veronica
drwxr-xr-x 2 wilma    wilma    4,096 Oct  8 13:20 wilma
chico@nina ~ $

We can find the block that contains this inode for /homes/chico:
sudo debugfs -R 'imap <0x00146029>' /dev/sda1
Output:
Inode 1335337 is part of block group 163
	located at block 5244450, offset 0x0800
Then dump the inode:
sudo readblock /dev/sda1 5244450 0x0800 256 | dumpit
Output:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 ED 41 EA 03 00 10 00 00  73 73 7F 5F B4 74 7F 5F   .A......ss._.t._
000010 A4 74 7F 5F 00 00 00 00  EB 03 08 00 08 00 00 00   .t._............
000020 00 00 08 00 13 00 00 00  0A F3 01 00 04 00 00 00   ................
000030 00 00 00 00 00 00 00 00  01 00 00 00 B8 2B 50 00   .............+P.
000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000060 00 00 00 00 32 AD E8 08  00 00 00 00 00 00 00 00   ....2...........
000070 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000080 1C 00 00 00 90 8C 6F E8  68 5E 5E 07 90 A3 64 82   ......o.h^^...d.
000090 73 73 7F 5F 90 A3 64 82  00 00 00 00 00 00 00 00   ss._..d.........
0000A0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000F0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
To get the contents of the /homes/chico directory, we have to follow the pointer that is highlighted above (0x00502BB8) and dump the first few bytes of the block:
sudo readblock /dev/sda1 0x502BB8 0 256 | dumpit
Output:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 29 60 14 00 0C 00 01 02  2E 00 00 00 D2 05 14 00   )`..............
000010 0C 00 02 02 2E 2E 00 00  2A 60 14 00 10 00 07 02   ........*`......
000020 2E 63 6F 6E 66 69 67 00  2B 60 14 00 10 00 08 02   .config.+`......
000030 2E 6D 6F 7A 69 6C 6C 61  B7 15 14 00 1C 00 11 01   .mozilla........
000040 2E 63 6F 6D 70 74 6F 6E  2D 74 64 65 2E 63 6F 6E   .compton-tde.con
000050 66 50 30 00 B6 15 14 00  14 00 0C 01 2E 62 61 73   fP0..........bas
000060 68 5F 6C 6F 67 6F 75 74  B9 15 14 00 18 00 0B 01   h_logout........
000070 2E 78 63 6F 6D 70 6D 67  72 72 63 4C 53 63 74 6E   .xcompmgrrcLSctn
000080 B8 15 14 00 10 00 08 01  2E 70 72 6F 66 69 6C 65   .........profile
000090 3D 60 14 00 10 00 07 02  6B 69 74 63 68 65 6E 67   =`......kitcheng
0000A0 3E 60 14 00 10 00 07 02  62 65 64 72 6F 6F 6D 00   >`......bedroom.
0000B0 3F 60 14 00 10 00 08 02  62 61 74 68 72 6F 6F 6D   ?`......bathroom
0000C0 40 60 14 00 40 0F 06 02  67 61 72 61 67 65 00 00   @`..@...garage..
0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000F0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
As a reminder, this is what's in /homes/chico. There are 4 visible directories there. You'll also see a few hidden files/directories, but we can ignore those.
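The readblock and dumpit commands used throughout this walkthrough are small helper tools. For readers who don't have them, here is a minimal Python sketch of the same idea. It assumes, based only on how readblock is invoked in these notes, that its arguments are the device, a block number, a byte offset within that block, and a byte count, and that blocks are 4,096 bytes. The names read_block and hexdump are mine and are not part of any standard tool.

```python
BLOCK_SIZE = 4096  # assumption: the filesystem in this walkthrough uses 4,096-byte blocks


def read_block(path, block, offset, count, block_size=BLOCK_SIZE):
    """Return `count` bytes starting `offset` bytes into block number `block`."""
    with open(path, "rb") as dev:
        dev.seek(block * block_size + offset)
        return dev.read(count)


def hexdump(data, width=16):
    """Format bytes roughly the way the dumps in these notes are laid out."""
    lines = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk).ljust(width * 3)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{start:06X} {hex_part} {text}")
    return "\n".join(lines)


# Hypothetical usage (needs root; /dev/sda1 is the device used in these notes):
# print(hexdump(read_block("/dev/sda1", 0x502BB8, 0, 256)))
```

Reading a block is nothing more than a seek to block number times block size, plus the offset, followed by a read.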

This time, I've highlighted the kitchen directory and its inode (0x0014603D) because now we need to find the refrigerator directory in the kitchen directory.

Now, find the block that contains the inode:

sudo debugfs -R 'imap <0x0014603D>' /dev/sda1
Output:
Inode 1335357 is part of block group 163
	located at block 5244451, offset 0x0c00
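What debugfs is doing with imap can be approximated with a little arithmetic. The sketch below is illustrative only. It assumes 256-byte inodes, 4,096-byte blocks, and 8,192 inodes per block group. The inodes-per-group value is not stated anywhere in these notes; I inferred it from the imap outputs, so treat it as an assumption (the real values live in the superblock). The sketch also can't produce the absolute block number, because that depends on where each group's inode table starts.

```python
INODE_SIZE = 256          # assumption: matches the 256 bytes dumped for each inode
BLOCK_SIZE = 4096         # block size used throughout these notes
INODES_PER_GROUP = 8192   # assumption inferred from the imap outputs, not read from the superblock


def locate_inode(inode_number):
    """Return (block group, block index within the inode table, byte offset in that block)."""
    index = inode_number - 1                 # inode numbers start at 1
    group = index // INODES_PER_GROUP
    byte_in_table = (index % INODES_PER_GROUP) * INODE_SIZE
    return group, byte_in_table // BLOCK_SIZE, byte_in_table % BLOCK_SIZE


inode = int("0x0014603D", 16)                # the kitchen inode from the directory entry
group, table_block, offset = locate_inode(inode)
print(inode, group, table_block, hex(offset))   # 1335357 163 3 0xc00
```

The group (163) and the offset (0x0c00) agree with the debugfs output above.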
Then, dump the inode:
sudo readblock /dev/sda1 5244451 0x0c00 256 | dumpit
Output:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 ED 41 00 00 00 10 00 00  A4 74 7F 5F CE 74 7F 5F   .A.......t._.t._
000010 CE 74 7F 5F 00 00 00 00  00 00 08 00 08 00 00 00   .t._............
000020 00 00 08 00 07 00 00 00  0A F3 01 00 04 00 00 00   ................
000030 00 00 00 00 00 00 00 00  01 00 00 00 FE 23 50 00   .............#P.
000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000060 00 00 00 00 D2 AD E8 08  00 00 00 00 00 00 00 00   ................
000070 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000080 1C 00 00 00 CC FD DF 63  CC FD DF 63 68 5E 5E 07   .......c...ch^^.
000090 A4 74 7F 5F 68 5E 5E 07  00 00 00 00 00 00 00 00   .t._h^^.........
0000A0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000F0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
To get the contents of the /homes/chico/kitchen directory, we need to follow the highlighted pointer (block) above and dump the first few bytes of that block:
sudo readblock /dev/sda1 0x005023FE 0 256 | dumpit
Output:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 3D 60 14 00 0C 00 01 02  2E 00 00 00 29 60 14 00   =`..........)`..
000010 0C 00 02 02 2E 2E 00 00  41 60 14 00 14 00 0C 02   ........A`......
000020 72 65 66 72 69 67 65 72  61 74 6F 72 42 60 14 00   refrigeratorB`..
000030 0C 00 04 02 73 69 6E 6B  43 60 14 00 14 00 09 02   ....sinkC`......
000040 63 75 70 62 6F 61 72 64  73 00 00 00 44 60 14 00   cupboards...D`..
000050 0C 00 04 02 6F 76 65 6E  45 60 14 00 10 00 05 02   ....ovenE`......
000060 73 74 6F 76 65 00 00 00  46 60 14 00 98 0F 09 02   stove...F`......
000070 6D 69 63 72 6F 77 61 76  65 00 00 00 00 00 00 00   microwave.......
000080 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000090 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000A0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000F0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
Again, this is what's in /homes/chico/kitchen. You'll see 6 visible directories. Now that we've located the refrigerator directory, it's time to find out what's in it.
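Picking the entries out of a directory block by eye gets tedious, so here is a minimal sketch that does it in Python. It assumes the usual ext2/ext4 linear directory entry layout: a 4-byte inode number, a 2-byte record length, a 1-byte name length, a 1-byte file type (1 for a regular file, 2 for a directory), and then the name, all little-endian. The function name parse_dir_entries is mine. Only the first 56 bytes of the kitchen block are included to keep the example short.

```python
import struct

# First 56 bytes of the /homes/chico/kitchen directory block, copied from the dump above.
BLOCK = bytes.fromhex(
    "3D6014000C0001022E000000"                    # "."
    "296014000C0002022E2E0000"                    # ".."
    "4160140014000C02726566726967657261746F72"    # "refrigerator"
    "426014000C00040273696E6B"                    # "sink"
)


def parse_dir_entries(block):
    """Return a list of (name, inode number, file type) tuples."""
    entries = []
    off = 0
    while off + 8 <= len(block):
        inode, rec_len, name_len, ftype = struct.unpack_from("<IHBB", block, off)
        if rec_len < 8:          # a record can't be smaller than its own header
            break
        if inode != 0:           # inode 0 marks an unused entry
            name = block[off + 8:off + 8 + name_len].decode("ascii", "replace")
            entries.append((name, inode, ftype))
        off += rec_len           # the record length says where the next entry begins
    return entries


for name, inode, ftype in parse_dir_entries(BLOCK):
    print(f"{name:<14} 0x{inode:08X}  type {ftype}")
```

The refrigerator line shows inode 0x00146041, the value that we look up next.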

As usual, find the block that contains the inode for refrigerator:

sudo debugfs -R 'imap <0x00146041>' /dev/sda1
Output:
Inode 1335361 is part of block group 163
	located at block 5244452, offset 0x0000
Then dump the inode:
sudo readblock /dev/sda1 5244452 0x0000 256 | dumpit
Output:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 ED 41 EA 03 00 10 00 00  CE 74 7F 5F 5E 75 7F 5F   .A.......t._^u._
000010 10 75 7F 5F 00 00 00 00  EB 03 02 00 08 00 00 00   .u._............
000020 00 00 08 00 0B 00 00 00  0A F3 01 00 04 00 00 00   ................
000030 00 00 00 00 00 00 00 00  01 00 00 00 22 24 50 00   ............"$P.
000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000060 00 00 00 00 DD AD E8 08  00 00 00 00 00 00 00 00   ................
000070 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000080 1C 00 00 00 1C 92 58 83  A8 26 C4 13 CC FD DF 63   ......X..&.....c
000090 CE 74 7F 5F CC FD DF 63  00 00 00 00 00 00 00 00   .t._...c........
0000A0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000F0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
Then follow the pointer to get the contents of refrigerator:
sudo readblock /dev/sda1 0x00502422 0 256 | dumpit
Output:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 41 60 14 00 0C 00 01 02  2E 00 00 00 3D 60 14 00   A`..........=`..
000010 0C 00 02 02 2E 2E 00 00  FB 15 14 00 0C 00 04 01   ................
000020 6D 69 6C 6B FC 15 14 00  0C 00 04 01 65 67 67 73   milk........eggs
000030 FD 15 14 00 10 00 06 01  62 75 74 74 65 72 00 00   ........butter..
000040 FE 15 14 00 10 00 05 01  6A 75 69 63 65 00 00 00   ........juice...
000050 FF 15 14 00 10 00 06 01  63 68 65 65 73 65 00 00   ........cheese..
000060 00 16 14 00 0C 00 04 01  63 6F 6B 65 01 16 14 00   ........coke....
000070 10 00 06 01 61 70 70 6C  65 73 00 00 02 16 14 00   ....apples......
000080 10 00 07 01 63 68 69 63  6B 65 6E 00 03 16 14 00   ....chicken.....
000090 0C 00 04 01 63 61 6B 65  04 16 14 00 68 0F 03 01   ....cake....h...
0000A0 70 69 65 00 00 00 00 00  00 00 00 00 00 00 00 00   pie.............
0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000F0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
Reminder, this is what's in /homes/chico/kitchen/refrigerator. You'll see 10 visible files (not directories) this time. Now that we've located the cake file, it's time to find out what's in it. The process is the same as it is for directories.

We have the inode for cake and it's inode 0x00141603. Let's dump out the inode. First, get the block that contains it:

sudo debugfs -R 'imap <0x00141603>' /dev/sda1
Output:
Inode 1316355 is part of block group 160
	located at block 5243264, offset 0x0200
Then, dump the inode:
sudo readblock /dev/sda1 5243264 0x0200 256 | dumpit
Output:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 A4 81 EA 03 4C 00 00 00  10 75 7F 5F 79 C7 80 5F   ....L....u._y.._
000010 79 C7 80 5F 00 00 00 00  EB 03 01 00 08 00 00 00   y.._............
000020 00 00 08 00 01 00 00 00  0A F3 01 00 04 00 00 00   ................
000030 00 00 00 00 00 00 00 00  01 00 00 00 08 20 3E 00   ............. >.
000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000060 00 00 00 00 ED AD E8 08  00 00 00 00 00 00 00 00   ................
000070 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000080 1C 00 00 00 54 B3 DE 60  54 B3 DE 60 A8 26 C4 13   ....T..`T..`.&..
000090 10 75 7F 5F A8 26 C4 13  00 00 00 00 00 00 00 00   .u._.&..........
0000A0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000F0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
Additionally, I've highlighted 4 bytes above (4C 00 00 00 at offset 0x0004) because they have a specific meaning. Those bytes are the actual size of the file (0x0000004C is 76 in decimal). We'll return to this value shortly.

If we follow the pointer above, it will take us to the contents of the cake file, which is the actual text that is in the file.

sudo readblock /dev/sda1 0x003E2008 0 128 | dumpit
Output:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 65 67 67 73 0A 62 75 74  74 65 72 0A 6D 69 6C 6B   eggs.butter.milk
000010 0A 66 6C 6F 75 72 0A 76  61 6E 69 6C 6C 61 0A 69   .flour.vanilla.i
000020 63 69 6E 67 0A 73 74 72  61 77 62 65 72 72 69 65   cing.strawberrie
000030 73 0A 70 65 61 63 68 65  73 0A 6C 65 74 74 75 63   s.peaches.lettuc
000040 65 0A 61 73 70 61 72 61  67 75 73 0A 00 00 00 00   e.asparagus.....
000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000060 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000070 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
This time, I chose to just dump out 128 bytes, because the file is smaller than that. Because all of the data in the cake file is text, I don't need to use dumpit as I can just display it:
sudo readblock /dev/sda1 0x003E2008 0 128 
Output:
eggs
butter
milk
flour
vanilla
icing
strawberries
peaches
lettuce
asparagus
The output stops after asparagus because that's the end of the printable characters, and character 0 doesn't print anything. This is the exact output we got from our original command:
cat /homes/chico/kitchen/refrigerator/cake
This was the entire purpose of this demonstration: To show exactly what is going on behind the scenes. So, to get back to the original question: "How many disk reads were required to locate (search), open (read), and display the file?"

Now, you should be able to answer that.

If we use ls to look at the file:

ls -l /homes/chico/kitchen/refrigerator/cake
Output:
-rw-r--r-- 1 chico chico 76 Oct  9 13:26 /homes/chico/kitchen/refrigerator/cake
We can, in fact, see that the file size is 76 bytes, which is part of the inode (metadata) that is associated with this file that was shown above.
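The size stored in the inode is what tells the system where the file ends inside its 4,096-byte block. A small sketch of that idea, using the bytes from the data block dump above (the variable names are mine):

```python
# The first 80 bytes of the data block for cake, copied from the dump above.
block = (
    b"eggs\nbutter\nmilk\nflour\nvanilla\nicing\n"
    b"strawberries\npeaches\nlettuce\nasparagus\n"
) + bytes(4)                        # the rest of the block is zero bytes

size_from_inode = 0x4C              # the 4 size bytes at offset 0x0004 of the inode

contents = block[:size_from_inode]  # only this many bytes belong to the file
print(len(contents))                # 76
print(contents.decode("ascii"), end="")
```

Slicing the block at the inode's size gives exactly the 76 bytes that cat and ls report, with none of the zero padding.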

Remember these two files from before? I said that this file:

/usr/hostname
is going to require significantly less work than locating this file:
/usr/share/icons/foo/bar/baz/bat/one/more/dir/and/were/done/file.txt
It should be clear and obvious why that is. Each directory adds 2 additional disk reads to the process. One read is for the inode and one is for the contents. So, files stored very deep in the hierarchy are much more expensive to read than ones that are shallow. To reach the hostname file, the system only has to read 2 directories (root and usr), but to reach file.txt it has to read 14 directories! (The root directory plus 13 subdirectories.)
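The directory counts above are easy to check with a few lines of Python. This is only a sketch of the counting argument, not of how the kernel resolves paths: it assumes an absolute path, 2 reads per directory (its inode, then its contents), that each directory fits in one block, and that nothing is cached. The function names are mine.

```python
def directories_to_read(path):
    """Count the directories searched to resolve an absolute path."""
    parts = [p for p in path.split("/") if p]
    # The root directory plus every component except the final file name.
    return 1 + (len(parts) - 1)


def directory_disk_reads(path):
    # Assumption: one read for each directory's inode and one for its contents.
    return 2 * directories_to_read(path)


shallow = "/usr/hostname"
deep = "/usr/share/icons/foo/bar/baz/bat/one/more/dir/and/were/done/file.txt"

print(directories_to_read(shallow), directory_disk_reads(shallow))   # 2 4
print(directories_to_read(deep), directory_disk_reads(deep))         # 14 28
```

These counts cover only the directories. The reads for the file's own inode and data come on top of them.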

Notes:

More Inode Details

We saw that there was quite a bit of information in the inodes that we were ignoring. We were basically just interested in using the inode to find the data block(s) associated with the file/directory. Let's look a little closer at the inode for the cake file:
ls -li /homes/chico/kitchen/refrigerator/cake
Output:
1316355 -rw-r--r-- 1 chico chico 76 Oct 12 14:24 /homes/chico/kitchen/refrigerator/cake
Output annotated:
1316355  -rw-r--r--  1  chico  chico  76  Oct 12 14:24  /homes/chico/kitchen/refrigerator/cake
^^^^^^^  ^^^^^^^^^^  ^  ^^^^^  ^^^^^  ^^  ^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 inode     perms     |    |      |    |     date/time            fullpath of the file
           type      |    |      |    |
                    /     |      |     \
                   /      |      |      \
               links    owner  group    size
The ls command shows us a lot of information that is all in the inode for the file. Let's break it down. These are the fields (from left to right) and their meanings:
Field         Description
1316355       This is the inode for the file.
-rw-r--r--    These are the permissions for the user, group, and others, as well as the type of file.
1             The number of hard links to this file.
chico         The owner (user) of the file.
chico         The group that the file belongs to.
76            The size of the file (in bytes).
Oct 12 14:24  The date/time that the contents of the file were last modified.
Using the same technique to find the inode and dump out its contents:
sudo debugfs -R 'imap <1316355>' /dev/sda1
Output:
Inode 1316355 is part of block group 160
	located at block 5243264, offset 0x0200
Dump the inode:
sudo readblock /dev/sda1 5243264 0x200 256 | dumpit
Output:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 A4 81 EA 03 4C 00 00 00  10 75 7F 5F 76 C9 84 5F   ....L....u._v.._
000010 76 C9 84 5F 00 00 00 00  EB 03 01 00 08 00 00 00   v.._............
000020 00 00 08 00 01 00 00 00  0A F3 01 00 04 00 00 00   ................
000030 00 00 00 00 00 00 00 00  01 00 00 00 F8 60 24 00   .............`$.
000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000060 00 00 00 00 ED AD E8 08  00 00 00 00 00 00 00 00   ................
000070 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000080 1C 00 00 00 E4 98 9D 1D  E4 98 9D 1D A8 26 C4 13   .............&..
000090 10 75 7F 5F A8 26 C4 13  00 00 00 00 00 00 00 00   .u._.&..........
0000A0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000F0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
  1. The 2 bytes (0x81A4) at offset 0x0000 are the permissions and type. We know that the file is a regular file and the permissions are rw-r--r--. In octal, these permissions would be 644. The inode stores the same information as a single number: 0x81A4 is 0100644 in octal, where the leading 100 marks a regular file and the trailing 644 is the permissions.
  2. The next 2 bytes (0x03EA) at offset 0x0002 are the owner (user) of the file. In decimal, this is user ID 1002. To verify this, run this command:
    cat /etc/passwd | grep chico
    
    and you'll see this (on my system):
    chico:x:1002:1003:Chico Escuela:/home/chico:/bin/bash
    
    You can plainly see that the third field (1002) is the user ID of chico. The fourth field (1003) is the ID of the group that chico is in.

  3. The next 4 bytes (0x0000004C) at offset 0x0004 are the size of the file. In this case 0x0000004C is 76 in decimal, which is what the ls command shows. Actually, these 4 bytes are just the low 32 bits of the size. For files larger than what can fit into 32 bits, 4 additional bytes hold the high 32 bits. That gives a theoretical maximum file size of 2^64 bytes, which is pretty large, although in practice 16 TB is the (current) limit.
  4. At offset 0x0018 (0x03EB) is the group that the file belongs to. To verify that ID 1003 is chico, run this command:
    cat /etc/group | grep chico
    
    and you'll see this (on my system):
    chico:x:1003:
    
  5. The 2 bytes after the group at offset 0x001A are the link count. This number is how many references there are to the file. There can be more than one because you can give multiple names to the same file. This is somewhat analogous to how references in C++ work.
  6. The bytes that are underlined are the timestamp of the file's last modification: the date and time, stored as seconds since January 1, 1970 (offset 0x0010, 0x5F84C976), and extra bits that add sub-second precision to that time (offset 0x0088, 0x1D9D98E4).
Here again is the output from the ls command:
1316355 -rw-r--r-- 1 chico chico 76 Oct 12 14:24 /homes/chico/kitchen/refrigerator/cake
The rest of the information encodes things like the creation date/time, last access date/time, checksums, version, high 32 bits of the size, and several other obsolete, reserved, and advanced pieces of information.
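The fields described above can be pulled out of the dump with Python's struct module. This is a minimal sketch that decodes only the first 32 bytes of the inode for cake. The field order is my reading of the ext4 on-disk inode layout (mode, uid, size, access time, change time, modification time, deletion time, gid, link count, block count), and the variable names are mine.

```python
import stat
import struct

# First 32 bytes of the inode for cake, copied from the dump above.
INODE = bytes.fromhex(
    "A481EA034C00000010757F5F76C9845F"
    "76C9845F00000000EB03010008000000"
)

# Little-endian fields: mode, uid, size (low 32 bits), atime, ctime, mtime,
# dtime, gid, link count, and the count of 512-byte blocks (low 32 bits).
(mode, uid, size_lo, atime, ctime, mtime,
 dtime, gid, links, blocks_lo) = struct.unpack_from("<HHIIIIIHHI", INODE, 0)

print(stat.filemode(mode), oct(mode & 0o777))   # -rw-r--r-- 0o644
print(uid, gid, links, size_lo)                 # 1002 1003 1 76
print(hex(mtime))                               # 0x5f84c976
```

Every value that ls -li printed, apart from the name, comes out of these 32 bytes.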

So, in a nutshell, the inode stores all of the information about a file with the exception of the filename. We saw that the filenames are stored in a directory's contents. This information in the inode is called metadata.

Extents

What about the actual contents of a file or directory? We saw that every inode has a pointer (block number) to the actual data blocks that store the contents. However, we know that, traditionally, an inode has several block pointers (15 to be exact). Recall the inode diagram and the (partial) inode for the cake file:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 A4 81 EA 03 4C 00 00 00  10 75 7F 5F 76 C9 84 5F   ....L....u._v.._
000010 76 C9 84 5F 00 00 00 00  EB 03 01 00 08 00 00 00   v.._............
000020 00 00 08 00 01 00 00 00  0A F3 01 00 04 00 00 00   ................
000030 00 00 00 00 00 00 00 00  01 00 00 00 F8 60 24 00   .............`$.
000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000060 00 00 00 00 ED AD E8 08  00 00 00 00 00 00 00 00   ................
000070 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000080 1C 00 00 00 E4 98 9D 1D  E4 98 9D 1D A8 26 C4 13   .............&..
000090 10 75 7F 5F A8 26 C4 13  00 00 00 00 00 00 00 00   .u._.&..........
0000A0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000B0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000C0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000D0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000E0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0000F0 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
Where are all (15) of the block pointers? Up until now I've just "magically" been saying that the data can be found by following the 4 bytes at offset 0x003C (in bold) and that this is the pointer to the data block (singular). Also, since all of our data (contents of blocks) thus far has been less than 4,096 bytes, we've never needed more than one pointer/block. Yes, there appear to be a lot of "empty" pointers that follow it, but there aren't 15 of them. Remember this self-check from above?

Self-check - With multiple levels of indirection, filesystems can be implemented efficiently for fragmented files. However, for non-fragmented (i.e. contiguous) files, this approach is not very efficient. Explain why that is and how a better method can be used.

First, we need to see why the "old" inode scheme of many levels of indirection is good for fragmented files, but bad for non-fragmented (contiguous) files. Once we understand this, a better "solution" is obvious, and the solution is extents.

Many older and less sophisticated filesystems suffered from fragmented files, so the original inode scheme made sense. However, many modern filesystems have very few fragmented files, so this scheme is sub-optimal.

As an example, let's assume we have a file called file.txt that is 18,000 bytes in size. With block sizes of 4,096 bytes, the contents of the file will require 5 blocks, with the first 4 blocks being full and the last block containing 1,616 bytes.

  1       2       3       4       5
4,096 + 4,096 + 4,096 + 4,096 + 1,616 = 18,000
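The same arithmetic in a few lines of Python, in case you want to try other file and block sizes (the variable names are mine):

```python
BLOCK_SIZE = 4096
file_size = 18_000

# divmod gives the number of completely full blocks and the bytes left over.
full_blocks, last_block_bytes = divmod(file_size, BLOCK_SIZE)
blocks_needed = full_blocks + (1 if last_block_bytes else 0)

print(blocks_needed, full_blocks, last_block_bytes)   # 5 4 1616
```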
With linked allocation, we would have something like this:
With indexed allocation, we would have something like this:
The blue number above the block is the (arbitrary) byte address of the block. The numbers inside the blocks are the size of the data in the block. Because the data blocks are not contiguous, the file is fragmented.

For each of these two different schemes, answer these questions:

Now, suppose the file was not fragmented (e.g. all blocks are contiguous).

Linked allocation:

Indexed allocation:

Answer the same questions: With linked allocation, you still must chase pointers as there is no random access regardless of the fragmentation of the file. With a naive indexed allocation, which is shown above, we also don't get a lot of improvement from the file system. (However, we will get some improvement from the hardware itself due to the locality of the non-fragmented blocks, e.g. prefetching.)

To see just how poorly a fragmented disk can perform, here is a forum post that I made (from July 2005). I've always been a big fan of Microsoft's Flight Simulator and have about 1,000,000 files (photo-realistic textures for parts of the United States). Because the frame rate depends so much on reading many files per second from the slow disk, any fragmentation is going to make things even worse. You can see the significant improvements by 1) defragging the MFT (Master File Table) and 2) moving important files (e.g. textures) to the outside tracks of the spinning disks. This demonstrates that the outer tracks are moving much faster under the heads than the inner tracks (the angular velocity is the same everywhere on the disk, but the linear velocity is higher at the outside), thereby increasing the performance.

Remember, non-fragmented blocks act more like arrays than linked lists because all of the data is contiguous. We can take advantage of this fact by using extents.
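The array-versus-linked-list comparison can be sketched in a few lines. The block numbers here are hypothetical, made up for the 5-block file.txt example and not taken from a real disk, and the function names are mine.

```python
# Hypothetical linked allocation: each block stores the number of the next block.
NEXT = {812: 47, 47: 1300, 1300: 95, 95: 660, 660: None}


def linked_lookup(first_block, n):
    """Linked allocation: reach block n of the file by following n pointers."""
    block, reads = first_block, 1          # the first block has to be read too
    for _ in range(n):
        block = NEXT[block]                # the pointer lives inside the block just read
        reads += 1
    return block, reads


def extent_lookup(extent_start, extent_len, n):
    """Extent: the blocks are contiguous, so block n is found with arithmetic."""
    if not 0 <= n < extent_len:
        raise IndexError("block is outside this extent")
    return extent_start + n


print(linked_lookup(812, 4))        # (660, 5)
print(extent_lookup(2000, 5, 4))    # 2004
```

With linked allocation, finding the last block costs one read per block along the way. With an extent, it costs one addition.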

Using extents, physical view:

Using extents, logical view:
So, looking back at the (partial) inode for the cake file we can see the extents that are in use:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 A4 81 EA 03 4C 00 00 00  10 75 7F 5F 76 C9 84 5F   ....L....u._v.._
000010 76 C9 84 5F 00 00 00 00  EB 03 01 00 08 00 00 00   v.._............
000020 00 00 08 00 01 00 00 00  0A F3 01 00 04 00 00 00   ................
000030 00 00 00 00 00 00 00 00  01 00 00 00 F8 60 24 00   .............`$.
000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000060 00 00 00 00 ED AD E8 08  00 00 00 00 00 00 00 00   ................
The 2 bytes in blue (0x0001 at offset 0x0038) are the number of blocks that are present in this extent. For files less than or equal to 4,096 bytes, it will always be 1. Larger files will have more blocks in the extent.

The 2 bytes in red (0x0000 at offset 0x003A) are the high (upper) 16 bits of the address of the data block and will only be used for very large filesystems.

As an example, let's look at /usr/bin/zip which is clearly larger than a single block:

ls -li /usr/bin/zip
Output:
661483 -rwxr-xr-x 1 root root 188,296 Oct 21  2013 /usr/bin/zip
Find out which block contains inode 661483:
sudo debugfs -R 'imap <661483>' /dev/sda1
Output:
Inode 661483 is part of block group 80
	located at block 2621854, offset 0x0a00
And then dump the inode:
sudo readblock /dev/sda1 2621854 0xa00 256 | dumpit
Partial output:
       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 ED 81 00 00 88 DF 02 00  2A 01 AD 58 2A 01 AD 58   ........*..X*..X
000010 DF 38 65 52 00 00 00 00  00 00 01 00 70 01 00 00   .8eR........p...
000020 00 00 08 00 01 00 00 00  0A F3 01 00 04 00 00 00   ................
000030 00 00 00 00 00 00 00 00  2E 00 00 00 ED F2 2A 00   ..............*.
000040 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000060 00 00 00 00 CD C7 2E D2  00 00 00 00 00 00 00 00   ................
The inode is telling us that the extent starts with block 0x002AF2ED and extends for 46 (0x002E) blocks. If you do the arithmetic:
4,096 * 46 = 188,416
we can see that there are exactly 120 (188,416 - 188,296) bytes that are unused in the last block. We can also see how many blocks the file used by running the stat command:
stat /usr/bin/zip
  File: '/usr/bin/zip'
  Size: 188296    	Blocks: 368        IO Block: 4096   regular file
Device: 801h/2049d	Inode: 661483      Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-02-21 19:10:34.696610844 -0800
Modify: 2013-10-21 07:23:27.000000000 -0700
Change: 2017-02-21 19:10:34.696610844 -0800
 Birth: -
It tells us the file consumes 368 blocks. But, wait, the inode said there were only 46 blocks. What gives? The stat command is telling us how many 512-byte blocks are used by the file. Since the filesystem uses 4,096-byte blocks, just divide the value from stat by 8 and you'll get 46.
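Here is the block arithmetic for /usr/bin/zip as a short Python check. All of the numbers come from the ls, stat, and inode output above; only the variable names are mine.

```python
BLOCK_SIZE = 4096        # filesystem block size
STAT_UNIT = 512          # stat reports "Blocks" in 512-byte units

file_size = 188_296      # from ls and stat
extent_blocks = 0x002E   # extent length from the inode
stat_blocks = 368        # "Blocks" value from stat

allocated = extent_blocks * BLOCK_SIZE
print(extent_blocks, allocated, allocated - file_size)   # 46 188416 120
print(stat_blocks * STAT_UNIT // BLOCK_SIZE)             # 46
```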

OK, but what if, for some reason, not all of the blocks are contiguous? Maybe you have a really large file that does have "gaps" between the extents. Let's look at this file on my system:

ls -li /usr/bin/rosegarden
Output:
669122 -rwxr-xr-x 1 root root 15,863,224 Oct 22  2013 /usr/bin/rosegarden
This rosegarden file is over 15 megabytes in size and may not be 100% contiguous. Let's run stat on it first to see the output:
stat /usr/bin/rosegarden
  File: '/usr/bin/rosegarden'
  Size: 15863224  	Blocks: 30984      IO Block: 4096   regular file
Device: 801h/2049d	Inode: 669122      Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2018-09-23 12:20:30.000000000 -0700
Modify: 2013-10-22 05:47:12.000000000 -0700
Change: 2018-09-23 12:20:32.525504593 -0700
 Birth: -
We can see that there are 30,984 512-byte blocks or 3,873 I/O blocks (4,096 bytes). Using the debugfs command, we can see some information about the extents:
sudo debugfs -R 'stat /usr/bin/rosegarden' /dev/sda1
Output:
Inode: 669122   Type: regular    Mode:  0755   Flags: 0x80000
Generation: 1883018732    Version: 0x00000000:00000001
User:     0   Group:     0   Size: 15863224
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 30984
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x5ba7e780:7d4a4144 -- Sun Sep 23 12:20:32 2018
 atime: 0x5ba7e77e:00000000 -- Sun Sep 23 12:20:30 2018
 mtime: 0x526673d0:00000000 -- Tue Oct 22 05:47:12 2013
crtime: 0x5ba7e780:60ae0954 -- Sun Sep 23 12:20:32 2018
Size of extra inode fields: 28
EXTENTS:
(0-2047):6352896-6354943, (2048-3872):6356992-6358816
This command produces a lot more information. The lines at the bottom tell us that there are 2 extents (i.e. 2 contiguous sets of blocks). The first extent is 2,048 blocks in length (with the corresponding block addresses) and the second extent is 1,825 blocks in length. If you add those numbers together (2,048 + 1,825) you'll get 3,873, the number of 4,096-byte I/O blocks used by the file.
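To make the EXTENTS line concrete, here is a sketch of how a block position within the file could be translated to a block on disk using those two ranges. The numbers are copied from the debugfs output. The function name and the tuple layout are mine.

```python
# (first logical block, last logical block, first physical block), from the EXTENTS line.
EXTENTS = [
    (0, 2047, 6352896),
    (2048, 3872, 6356992),
]


def physical_block(logical):
    """Translate a block index within the file into a block number on disk."""
    for first, last, start in EXTENTS:
        if first <= logical <= last:
            return start + (logical - first)
    raise ValueError("logical block is not mapped")


total = sum(last - first + 1 for first, last, _ in EXTENTS)
print(total, 30984 // 8)            # 3873 3873
print(physical_block(2047))         # 6354943
print(physical_block(3872))         # 6358816
```

Two small range checks replace what would otherwise be thousands of individual block pointers.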

Here's the partial inode for the rosegarden file:

       00 01 02 03 04 05 06 07  08 09 0A 0B 0C 0D 0E 0F
--------------------------------------------------------------------------
000000 ED 81 00 00 B8 0D F2 00  7E E7 A7 5B 80 E7 A7 5B   ........~..[...[
000010 D0 73 66 52 00 00 00 00  00 00 01 00 08 79 00 00   .sfR.........y..
000020 00 00 08 00 01 00 00 00  0A F3 02 00 04 00 00 00   ................
000030 00 00 00 00 00 00 00 00  00 08 00 00 00 F0 60 00   ..............`.
000040 00 08 00 00 21 07 00 00  00 00 61 00 00 00 00 00   ....!.....a.....
000050 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000060 00 00 00 00 EC 95 3C 70  00 00 00 00 00 00 00 00   ......<p........
The 2 bytes highlighted in red on the third line (0x0002 at offset 0x002A) tell us that there are 2 extents in this file.

The 2 bytes highlighted in blue on the third line (0x0004 at offset 0x002C) tell us that this inode can hold at most 4 extents.

The file size (highlighted on the first line, 0x00F20DB8) is decimal 15,863,224, which is what the other commands told us.

The first extent starts at block 0x0060F000 (decimal 6352896) and consumes 2,048 (0x0800) blocks. The second extent starts at block 0x00610000 (decimal 6356992), and its entry begins with the 4 bytes 0x00000800 (decimal 2,048). Wait. What? If that were another length of 2,048 blocks, we'd have 4,096 blocks, but the file is only 3,873 blocks. What gives?

The short answer is that those 4 bytes are not a length. Every extent entry begins with a logical block number, which is the position within the file of the first block that the extent covers. The first extent covers blocks 0-2047 of the file (its first 4 bytes, at offset 0x0034, are 0), so the second extent picks up at block 2,048 of the file. This matches the (2048-3872) range in the debugfs output above.

The length of the second extent is in the 2 bytes highlighted in red on the fifth line (offset 0x0044): 0x0721 is decimal 1825. Together, 2,048 + 1,825 gives the 3,873 blocks that we expected.

The filesystem does have ways of reserving extra blocks so that a file can grow without getting fragmented, but that is more complicated and beyond the scope of this introduction. Follow some of the links below, if you're interested.

Keep in mind that the filesystem (via the inode) knows how large the file is, so it isn't going to "accidentally" read past the valid bytes in the last block of the last extent.
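Reading extent entries by hand is error-prone, so here is a small sketch that decodes the ones in the rosegarden inode above. It assumes the ext4 on-disk extent format: a 12-byte header (magic number 0xF30A, number of entries, maximum entries, tree depth, generation), followed by 12-byte entries that each hold the first logical block covered, the length in blocks, the high 16 bits of the physical block, and the low 32 bits of the physical block. The variable names are mine.

```python
import struct

# Bytes 0x28-0x4B of the rosegarden inode: a 12-byte header and two 12-byte extents.
RAW = bytes.fromhex(
    "0AF302000400000000000000"    # header
    "000000000008000000F06000"    # first extent
    "000800002107000000006100"    # second extent
)

magic, entries, max_entries, depth, _generation = struct.unpack_from("<HHHHI", RAW, 0)
assert magic == 0xF30A            # every extent header starts with this value

extents = []
for i in range(entries):
    logical, length, start_hi, start_lo = struct.unpack_from("<IHHI", RAW, 12 + 12 * i)
    extents.append((logical, length, (start_hi << 32) | start_lo))

print(entries, max_entries, depth)   # 2 4 0
print(extents)                       # [(0, 2048, 6352896), (2048, 1825, 6356992)]
```

The decoded values line up with the EXTENTS line that debugfs printed: blocks 0-2047 of the file start at 6352896, and blocks 2048-3872 start at 6356992.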

Some obvious questions:

  1. What if the file needs more than 4 extents? (Very large files)
  2. What if the file (even a not-so-large file) is badly fragmented?

Notes:

References

Links Files