
Does readFully() come at the risk of choking?

I noticed that when I use readFully() on a file instead of read(byte[]), processing time is reduced greatly. However, it occurred to me that readFully() may be a double-edged sword. If I accidentally try to read in a huge, multi-gigabyte file, could it choke?

Here is a function I am using to generate an SHA-256 checksum:

public static byte[] createChecksum(File log, String type) throws Exception {
    DataInputStream fis = new DataInputStream(new FileInputStream(log));
    Long len = log.length();
    byte[] buffer = new byte[len.intValue()];
    fis.readFully(buffer); // TODO: readFully may come at the risk of
                            // choking on a huge file.
    fis.close();
    MessageDigest complete = MessageDigest.getInstance(type);
    complete.update(buffer);
    return complete.digest();
}

If I were to instead use:

DataInputStream fis = new DataInputStream(new BufferedInputStream(new FileInputStream(log)));

Would that alleviate this risk? Or is the best option (in situations where you can't guarantee the data size) to always control the number of bytes read and use a loop until all bytes are read?

(Come to think of it, since the MessageDigest API takes in the full byte array at once, I'm not sure how to obtain a checksum without stuffing all the data in at once, but I suppose that is another question for another thread.)

asked Nov 21 '25 10:11 by E.S.


2 Answers

You should just allocate a decently sized buffer (65536 bytes, perhaps) and loop, reading up to 64 KB at a time and calling complete.update() inside the loop to feed each chunk to the digest. Be careful to process only the number of bytes actually read on each pass; the last block in particular will probably be shorter than 64 KB.
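The loop described above could look roughly like the following. This is a minimal sketch, not the asker's final code: the class name ChunkedChecksum is made up for illustration, and the 65536-byte buffer is just the size suggested above. It relies on the three-argument MessageDigest.update(byte[], int, int), which lets the digest be fed one chunk at a time, so the whole file never has to sit in memory.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical class name, used only for this illustration.
final class ChunkedChecksum {

    // Computes a digest of the file by streaming it through a fixed-size buffer.
    static byte[] createChecksum(File file, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        byte[] buffer = new byte[65536]; // 64 KB, as suggested in the answer

        // try-with-resources closes the stream even if reading fails.
        try (InputStream in = new FileInputStream(file)) {
            int bytesRead;
            // read() returns the number of bytes placed in the buffer, or -1 at end of file.
            while ((bytesRead = in.read(buffer)) != -1) {
                // Feed only the bytes actually read on this pass.
                digest.update(buffer, 0, bytesRead);
            }
        }
        return digest.digest();
    }
}
```

Memory use stays at the buffer size regardless of how large the file is, for example when called as ChunkedChecksum.createChecksum(log, "SHA-256").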

answered Nov 23 '25 01:11 by faffaffaff


Reading the file will take as long as it takes, whether you use readFully() or not.

Whether you can actually allocate gigabyte-sized byte arrays is another question. There is no need to use readFully() at all when downloading files. It's for use in wire protocols where, say, the next 12 bytes are an identifier followed by another 60 bytes of address information, and you don't want to have to keep writing read loops.
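To illustrate the kind of use readFully() is meant for, here is a small sketch of reading one fixed-layout record. The 12-byte identifier and 60-byte address come from the example above; the class name RecordReader, the method name, and the idea of returning the two fields as a pair of arrays are assumptions made purely for illustration.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical reader for a made-up fixed-layout record.
final class RecordReader {
    static final int ID_LENGTH = 12;      // identifier size from the example above
    static final int ADDRESS_LENGTH = 60; // address size from the example above

    // Returns { identifier bytes, address bytes } read from the stream.
    static byte[][] readRecord(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        byte[] id = new byte[ID_LENGTH];
        byte[] address = new byte[ADDRESS_LENGTH];

        // readFully() blocks until the whole array is filled and throws
        // EOFException if the stream ends first, so no manual loop is needed.
        data.readFully(id);
        data.readFully(address);
        return new byte[][] { id, address };
    }
}
```

With a plain read(byte[]), each of those two fields would need its own loop, because a single call may return fewer bytes than requested.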

answered Nov 23 '25 00:11 by user207421


