
How to get the inode with boost::filesystem?

Tags: c++, boost

I want to detect if I saw a file already and would like to identify it with something unique. Under Linux there is the inode number together with the device id (see stat() or fstat()). I assume under Windows I would find something similar.

To start with, boost::filesystem offers convenient methods, e.g. I can use boost::filesystem::recursive_directory_iterator to traverse the directory tree. The file_status tells me whether an entry is a regular file, but it does not give me the inode number.

The closest thing I found was boost::filesystem::equivalent() taking two paths. I guess this is also the most portable design.

The thing is that I would like to put the inode numbers into a database to have a quick lookup. I cannot do this with equivalent(): I would have to call it against every path already in the database.

Am I out of luck and boost will not provide me such information due to portability reasons?

(edit) The intention is to detect duplicates via hardlinks during one scan of a folder tree. equivalent() does exactly that, but using it would make the algorithm quadratic.
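For illustration, the lookup described above could be sketched on POSIX as follows. This is only a sketch under stated assumptions: it calls stat() directly rather than going through boost::filesystem, an in-memory std::set stands in for the database, and the names FileKey, file_key and SeenFiles are hypothetical.

```cpp
#include <sys/stat.h>

#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <utility>

// (device id, inode number): together these identify a file on a POSIX system.
using FileKey = std::pair<std::uint64_t, std::uint64_t>;

// Hypothetical helper: fills `key` from stat() and returns true on success.
bool file_key(const std::string& path, FileKey& key) {
    struct stat st;
    if (::stat(path.c_str(), &st) != 0) {
        return false;  // path missing or not accessible
    }
    key = FileKey(static_cast<std::uint64_t>(st.st_dev),
                  static_cast<std::uint64_t>(st.st_ino));
    return true;
}

// Hypothetical stand-in for the database: remembers every key seen so far.
class SeenFiles {
public:
    // Returns true if `path` refers to a file that was already recorded,
    // e.g. because it is another hard link to the same inode.
    bool seen_before(const std::string& path) {
        FileKey key;
        if (!file_key(path, key)) {
            return false;  // cannot stat: treat as not seen
        }
        // insert().second is false when the key was already present.
        return !seen_.insert(key).second;
    }

private:
    std::set<FileKey> seen_;
};
```

During a traversal, seen_before() would be called once per regular file, so each file costs one stat() and one set lookup instead of a comparison against every earlier path.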

asked Oct 23 '25 18:10 by Borph


1 Answer

The Windows CRT implementation of stat always reports zero for the inode, so you will have to roll your own. This is because on Windows FindFirstFile is faster than GetFileInformationByHandle, so stat uses FindFirstFile, which does not return inode information. If you don't need the inode, that is a welcome performance win. If you do, the following will help.

The NTFS equivalent of the inode is the MFT record number, otherwise known as the file ID. Its properties differ slightly, but for practical purposes it can be used the same way as an inode, i.e. to identify whether two paths point to the same file.

You can use GetFileInformationByHandle or GetFileInformationByHandleEx to retrieve this information. You will first have to call CreateFile to obtain the file handle.

  • You need FILE_READ_ATTRIBUTES rights only to get the file ID.
  • You should specify FILE_SHARE_READ|FILE_SHARE_WRITE|FILE_SHARE_DELETE as the share mode.
  • You should specify OPEN_EXISTING as the disposition.

Once you have the handle, use one of the GetFileInformation functions to obtain the file ID, then close the handle.

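The information you need is in the nFileIndexLow and nFileIndexHigh members of BY_HANDLE_FILE_INFORMATION. As with the inode and device ID on Linux, the file index is only unique within a volume, so pair it with dwVolumeSerialNumber. If ReFS is in use, a 128-bit file ID may be in use instead; to obtain it you must use the newer function, GetFileInformationByHandleEx, with the FileIdInfo information class.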
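The steps above could be sketched roughly as follows. Treat it as an untested outline rather than a definitive implementation: FileIdentity and file_identity are hypothetical names, the Windows branch uses the 64-bit file index only (no 128-bit ReFS IDs), and the stat() branch is included only so that the same function compiles on POSIX systems.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/stat.h>
#endif

// Hypothetical value type holding the two parts of a file's identity.
struct FileIdentity {
    std::uint64_t volume = 0;  // volume serial number (Windows) or st_dev (POSIX)
    std::uint64_t index = 0;   // 64-bit file index (Windows) or st_ino (POSIX)

    bool operator==(const FileIdentity& other) const {
        return volume == other.volume && index == other.index;
    }
};

// Hypothetical helper: fills `id` and returns true on success.
bool file_identity(const std::string& path, FileIdentity& id) {
#ifdef _WIN32
    // Only FILE_READ_ATTRIBUTES is requested; all sharing is allowed.
    HANDLE h = ::CreateFileA(
        path.c_str(), FILE_READ_ATTRIBUTES,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        nullptr, OPEN_EXISTING,
        FILE_FLAG_BACKUP_SEMANTICS,  // lets CreateFile open directories too
        nullptr);
    if (h == INVALID_HANDLE_VALUE) {
        return false;
    }
    BY_HANDLE_FILE_INFORMATION info;
    const BOOL ok = ::GetFileInformationByHandle(h, &info);
    ::CloseHandle(h);  // the handle is only needed for the query
    if (!ok) {
        return false;
    }
    id.volume = info.dwVolumeSerialNumber;
    // Combine the high and low halves into one 64-bit file index.
    id.index = (static_cast<std::uint64_t>(info.nFileIndexHigh) << 32) |
               static_cast<std::uint64_t>(info.nFileIndexLow);
    return true;
#else
    struct stat st;
    if (::stat(path.c_str(), &st) != 0) {
        return false;
    }
    id.volume = static_cast<std::uint64_t>(st.st_dev);
    id.index = static_cast<std::uint64_t>(st.st_ino);
    return true;
#endif
}
```

Two paths that yield equal FileIdentity values can then be treated as the same file, and the pair of integers is small enough to store as a lookup key.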

answered Oct 26 '25 07:10 by Ben