I need to detect which files contain a given string. The files can be larger than 4 GB, so I cannot simply use something like file_get_contents(), because it tries to load the whole file into RAM.
How can I do this? With standard PHP? With Elasticsearch or another external search engine?
If you are on a Linux-based machine, you can use the grep command:
shell_exec( 'grep "text string to search" /path/to/file');
The output contains every line of the file that matches your text.
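If you only need a yes/no answer for one file, and the search text may come from user input, something along these lines could work. This is a rough sketch, assuming grep is installed and on the PATH; fileContainsString is a name made up for this example, not a PHP built-in. It quotes both arguments with escapeshellarg() and relies on grep's exit status instead of its output.

```php
<?php
// Hypothetical helper (not a built-in): true if $path contains $needle literally.
function fileContainsString(string $path, string $needle): bool
{
    // -F: treat the pattern as fixed text, not as a regular expression
    // -q: print nothing and stop at the first match
    // --: end of options, so a needle starting with "-" is not read as a flag
    $command = 'grep -F -q -- ' . escapeshellarg($needle) . ' ' . escapeshellarg($path);
    exec($command, $output, $exit_code);

    // grep exits with 0 on a match, 1 on no match, and 2 on an error
    if ($exit_code > 1) {
        throw new \RuntimeException("grep failed for $path");
    }
    return $exit_code === 0;
}
```

Because of -q, grep stops reading as soon as it finds the first match, which helps with multi-gigabyte files. Keep in mind that grep works line by line, so a search string containing a newline will not behave as one literal string.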
Here you can find an easy tutorial on using grep.
If you need to find all files in a directory that contain some text, you can use:
shell_exec( 'grep -rl "text string to search" /path/to/dir' );
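-r stands for "recursive", so grep looks in every file under the directory, including subdirectories.
-l stands for "list files": grep prints only the names of the matching files, not the matching lines.
As a result, you get the names of all matching files, one per line.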
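To get that list as a PHP array rather than one big string, exec() can collect the output lines for you. Again just a sketch, assuming grep is available; filesContainingString is a hypothetical name chosen for this example.

```php
<?php
// Hypothetical helper (not a built-in): files under $dir that contain $needle literally.
function filesContainingString(string $dir, string $needle): array
{
    // -r: recurse into subdirectories
    // -l: print only the names of matching files
    // -F: treat the pattern as fixed text, not as a regular expression
    $command = 'grep -rlF -- ' . escapeshellarg($needle) . ' ' . escapeshellarg($dir);
    exec($command, $lines, $exit_code);

    // Exit status 2 means grep hit an error (for example an unreadable file).
    // This sketch treats that as a failure; you may prefer to keep partial results.
    if ($exit_code > 1) {
        throw new \RuntimeException("grep failed for $dir");
    }
    return $lines; // one filename per array element
}
```

A call such as filesContainingString('/path/to/dir', 'text string to search') returns an empty array when nothing matches. Filenames that themselves contain a newline would be split across elements, so this is only suitable for ordinary file names.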
You could use something like this in plain PHP. It is not optimized or thoroughly tested, and it may contain bugs I have not noticed, but you should get the idea:
function findInFile($file_name, $search_string, $chunk_size = 1024) {
    // Each pass searches the previous chunk plus the current one, so a
    // match that straddles a chunk boundary is still found. This only
    // works if the search string is no longer than one chunk.
    if (strlen($search_string) > $chunk_size) {
        throw new \RuntimeException('Size of search string should not exceed size of chunk');
    }
    $file = new \SplFileObject($file_name, 'r');
    $last_buffer = '';
    while (!$file->eof()) {
        $chunk = $file->fread($chunk_size);
        if ($chunk === false) {
            break; // read error
        }
        $buffer = $last_buffer . $chunk;
        // strpos() gives the byte offset of the match within $buffer
        $position_in_buffer = strpos($buffer, $search_string);
        if ($position_in_buffer !== false) {
            // $buffer begins strlen($buffer) bytes before the current
            // file position, so this is the offset of the match in the file
            return $file->ftell() - strlen($buffer) + $position_in_buffer;
        }
        $last_buffer = $chunk;
    }
    // The search string was not found
    return null;
}
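Since the question is about finding which files contain the string, the same chunked-read idea can be wrapped in a directory walk. Here is a sketch of that, with a small variation: instead of keeping the whole previous chunk, it carries over only the last strlen($needle) - 1 bytes, which is enough to catch a match that crosses a chunk boundary. The names fileHasString and findFilesWithString and the 64 KB default chunk size are my own assumptions, not anything built in.

```php
<?php
// Hypothetical helper: true if the file contains $needle, reading it in chunks.
function fileHasString(string $path, string $needle, int $chunk_size = 65536): bool
{
    if ($needle === '' || $chunk_size < 1) {
        throw new \InvalidArgumentException('Need a non-empty needle and a positive chunk size');
    }
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new \RuntimeException("Cannot open $path");
    }
    $overlap = strlen($needle) - 1; // bytes to carry across a chunk boundary
    $tail = '';
    $found = false;
    while (!$found && !feof($handle)) {
        $chunk = fread($handle, $chunk_size);
        if ($chunk === false) {
            break; // read error
        }
        $buffer = $tail . $chunk;
        $found = strpos($buffer, $needle) !== false;
        // Keep only the end of the buffer for the next pass
        $tail = $overlap > 0 ? substr($buffer, -$overlap) : '';
    }
    fclose($handle);
    return $found;
}

// Hypothetical helper: paths of all regular files under $dir that contain $needle.
function findFilesWithString(string $dir, string $needle): array
{
    $matches = [];
    $iterator = new \RecursiveIteratorIterator(
        new \RecursiveDirectoryIterator($dir, \FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file_info) {
        if ($file_info->isFile() && fileHasString($file_info->getPathname(), $needle)) {
            $matches[] = $file_info->getPathname();
        }
    }
    return $matches;
}
```

Memory use stays at roughly one chunk plus the overlap per file, regardless of file size. Unlike the grep approach this needs no external program, though it will likely be slower on large directory trees.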