
How does read(2) in Linux C work?

Tags: c, linux, system

According to the man page, we can specify the number of bytes we want to read from a file descriptor.

But in the implementation of read(), how many read requests are created to perform a single read?

For example, if I want to read 4MB, will it create only one request for 4MB, or will it split it into multiple small requests, such as 4KB per request?

Michael Tong asked Aug 31 '25 02:08


2 Answers

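  • read(2) is a system call, so the C library wrapper has to enter the kernel to dispatch it. In very old times this was done with a software interrupt (int 0x80 on x86), but nowadays there are faster ways of dispatching system calls: on x86-64 the wrapper executes the syscall instruction directly, while on 32-bit x86 it goes through an entry stub that the kernel provides in the vDSO shared library.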

  • inside the kernel the call is first handled by the vfs (virtual file system); the virtual file system provides a common interface for inodes (the structures that represent files) and open files, and a common way of interfacing with the underlying file system.

  • the vfs dispatches to the underlying file system (the mount(8) program will tell you which mount points exist and which file system is used at each). See http://www.inf.fu-berlin.de/lehre/SS01/OS/Lectures/Lecture16.pdf for more information.

  • the file system can do its own caching, so the number of disk reads depends on what is already present in the cache, on how the file system allocates blocks for a particular file, and on how the file is divided into disk blocks. The answers to all of these depend on the particular file system.

  • If you want to do your own caching, open the file with the O_DIRECT flag; in this case the kernel makes an effort not to use the cache. However, the buffer address, the file offset and the transfer size all have to be aligned, typically to multiples of 512 bytes (the usual logical block size of the underlying storage), so that your buffer can be transferred via DMA to and from the backing store (see http://www.quora.com/Why-does-O_DIRECT-require-I-O-to-be-512-byte-aligned ).
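The alignment rules for O_DIRECT are easier to see in code. The following is only a minimal sketch: the helper names (`round_up_to`, `alloc_direct_buffer`, `read_direct`) and the `DIRECT_ALIGN` value of 4096 are assumptions made for illustration, not anything prescribed by the kernel. 4096 is chosen because it is a multiple of 512 and of most logical block sizes; a given device or file system may still reject O_DIRECT entirely.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed alignment for buffer address, offset and length (hypothetical choice). */
#define DIRECT_ALIGN 4096

/* Round n up to the next multiple of align; align must be a power of two. */
size_t round_up_to(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Allocate a buffer whose address and length are both multiples of DIRECT_ALIGN.
 * The rounded-up length is stored in *len_out. Returns NULL on failure. */
void *alloc_direct_buffer(size_t want, size_t *len_out)
{
    void *buf = NULL;
    size_t len = round_up_to(want, DIRECT_ALIGN);

    if (posix_memalign(&buf, DIRECT_ALIGN, len) != 0)
        return NULL;
    *len_out = len;
    return buf;
}

/* Read from the start of `path` with O_DIRECT. At most `want` rounded up to
 * DIRECT_ALIGN bytes are read. Returns the byte count, or -1 on error
 * (for example EINVAL when the file system does not support O_DIRECT).
 * On success the caller owns *buf_out and must free() it. */
ssize_t read_direct(const char *path, void **buf_out, size_t want)
{
    size_t len = 0;
    void *buf = alloc_direct_buffer(want, &len);
    if (buf == NULL)
        return -1;

    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    ssize_t n = pread(fd, buf, len, 0);   /* offset 0 is trivially aligned */
    close(fd);
    if (n < 0) {
        free(buf);
        return -1;
    }
    *buf_out = buf;
    return n;
}
```

A caller would use it as `read_direct("/path/to/file", &buf, 1000)` and should be prepared for it to fail on file systems that do not implement direct I/O.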

MichaelMoser answered Sep 02 '25 15:09


It depends on how deep you go.

The C library just passes the size you gave it straight to the kernel in one read() system call, so at that level it's just one request.
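From user space, "one request" means one read() call with the full size. That call may still return fewer bytes than requested, so callers usually loop. Here is a minimal sketch of such a loop; `read_full` and its `calls_out` counter are hypothetical names added for illustration, so you can observe how many system calls were actually needed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly `count` bytes unless end of file or an error comes first.
 * If calls_out is non-NULL it receives the number of read() calls issued.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count, int *calls_out)
{
    assert(buf != NULL);
    char *p = buf;
    size_t done = 0;
    int calls = 0;

    while (done < count) {
        /* Each iteration is one system call asking for everything still missing. */
        ssize_t n = read(fd, p + done, count - done);
        calls++;
        if (n < 0) {
            if (errno == EINTR)
                continue;              /* interrupted by a signal: retry */
            if (calls_out)
                *calls_out = calls;
            return -1;
        }
        if (n == 0)
            break;                     /* end of file */
        done += (size_t)n;
    }
    if (calls_out)
        *calls_out = calls;
    return (ssize_t)done;
}
```

For a regular file that is fully available, the loop typically finishes after a single call; pipes, sockets and terminals are where short reads are common.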

Inside the kernel, for an ordinary file in standard buffered mode, the 4MB you requested is copied from multiple pagecache pages (4kB each), which are unlikely to be contiguous in memory. Any file data that isn't already in the pagecache has to be read from disk first. The file might not be stored contiguously on disk either, so that 4MB could result in multiple requests to the underlying block device.
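To put a number on "multiple pagecache pages", the arithmetic can be sketched as below. This is just an illustration of how many pages a byte range overlaps; `pages_spanned` and `system_page_size` are hypothetical helpers, and 4MB is interpreted as 4 MiB (4194304 bytes).

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Number of pages overlapped by the byte range [offset, offset + len). */
size_t pages_spanned(size_t offset, size_t len, size_t page_size)
{
    assert(page_size != 0);
    if (len == 0)
        return 0;
    size_t first = offset / page_size;              /* page holding the first byte */
    size_t last  = (offset + len - 1) / page_size;  /* page holding the last byte  */
    return last - first + 1;
}

/* Page size of the running system; falls back to 4096 as an assumption
 * if sysconf() cannot report it. */
size_t system_page_size(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    return ps > 0 ? (size_t)ps : 4096;
}
```

With 4 KiB pages, `pages_spanned(0, 4194304, 4096)` is 1024, and an unaligned start such as `pages_spanned(100, 4194304, 4096)` touches one extra page, 1025.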

caf answered Sep 02 '25 17:09