I have some code that operates on a file object in Python.
Since Python 3 separated str from bytes, file.read() returns bytes if the file was opened in binary mode.
Conversely, if the file was opened in text mode, file.read() returns str.
In my code, file.read() is called many times, so it is not practical to check the result type after every call, e.g.:
def foo(file_obj):
    while True:
        data = file_obj.read(1)
        if not data:
            break
        if isinstance(data, bytes):
            # do something for bytes
            ...
        else:  # isinstance(data, str)
            # do something for str
            ...
What I would like to have instead is some way of reliably checking up front what the result of file.read() will be, e.g.:
def foo(file_obj):
    if is_binary_file(file_obj):
        # do something for bytes
        while True:
            data = file_obj.read(1)
            if not data:
                break
            ...
    else:
        # do something for str
        while True:
            data = file_obj.read(1)
            if not data:
                break
            ...
A possible way would be to check file_obj.mode, e.g.:
import io
def is_binary_file(file_obj):
    return 'b' in file_obj.mode
print(is_binary_file(open('test_file', 'w')))
# False
print(is_binary_file(open('test_file', 'wb')))
# True
print(is_binary_file(io.StringIO('ciao')))
# AttributeError: '_io.StringIO' object has no attribute 'mode'
print(is_binary_file(io.BytesIO(b'ciao')))
# AttributeError: '_io.BytesIO' object has no attribute 'mode'
which fails for in-memory objects from io such as io.StringIO() and io.BytesIO(), because they have no mode attribute.
Another way, which also works for io objects, would be to check for the encoding attribute, e.g.:
import io
def is_binary_file(file_obj):
    return not hasattr(file_obj, 'encoding')
print(is_binary_file(open('test_file', 'w')))
# False
print(is_binary_file(open('test_file', 'wb')))
# True
print(is_binary_file(io.StringIO('ciao')))
# False
print(is_binary_file(io.BytesIO(b'ciao')))
# True
Is there a cleaner way to perform this check?
I have a version of this in Astropy (for Python 3; a Python 2 version can be found in older releases of Astropy if needed for some reason).
It's not pretty, but it works reliably enough for most cases (I took out the part that checks for a .binary attribute since that's only applicable to a class in Astropy):
import io

def fileobj_is_binary(f):
    """
    Returns True if the given file or file-like object has a file open in binary
    mode. When in doubt, returns True by default.
    """
    # Text streams (including io.StringIO) always derive from io.TextIOBase
    if isinstance(f, io.TextIOBase):
        return False
    mode = fileobj_mode(f)
    if mode:
        return 'b' in mode
    else:
        return True
where fileobj_mode is:
import gzip

def fileobj_mode(f):
    """
    Returns the 'mode' string of a file-like object if such a thing exists.
    Otherwise returns None.
    """
    # Go from most to least specific--for example gzip objects have a 'mode'
    # attribute, but it's not analogous to the file.mode attribute

    # gzip.GzipFile -like
    if hasattr(f, 'fileobj') and hasattr(f.fileobj, 'mode'):
        fileobj = f.fileobj
    # astropy.io.fits._File -like, doesn't need additional checks because it's
    # already validated
    elif hasattr(f, 'fileobj_mode'):
        return f.fileobj_mode
    # PIL-Image -like investigate the fp (filebuffer)
    elif hasattr(f, 'fp') and hasattr(f.fp, 'mode'):
        fileobj = f.fp
    # FILEIO -like (normal open(...)), keep as is.
    elif hasattr(f, 'mode'):
        fileobj = f
    # Doesn't look like a file-like object, for example strings, urls or paths.
    else:
        return None
    return _fileobj_normalize_mode(fileobj)

def _fileobj_normalize_mode(f):
    """Takes care of some corner cases in Python where the mode string
    is either oddly formatted or does not truly represent the file mode.
    """
    mode = f.mode
    # Special case: Gzip modes:
    if isinstance(f, gzip.GzipFile):
        # GzipFiles can be either readonly or writeonly
        if mode == gzip.READ:
            return 'rb'
        elif mode == gzip.WRITE:
            return 'wb'
        else:
            return None  # This shouldn't happen?
    # Sometimes Python can produce modes like 'r+b' which will be normalized
    # here to 'rb+'
    if '+' in mode:
        mode = mode.replace('+', '')
        mode += '+'
    return mode
You might also want to add a special case for io.BytesIO. Again, ugly, but works for most cases. Would be great if there were a simpler way.
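For objects that come from the standard io hierarchy, a shorter check may be enough. The following is only a sketch, not what Astropy does: is_binary_file is the hypothetical helper name from the question. It relies on the documented base classes io.TextIOBase, io.RawIOBase and io.BufferedIOBase, and falls back to a zero-length read for duck-typed objects, assuming such objects are readable and that read(0) does not consume data.

```python
import io


def is_binary_file(file_obj):
    """Return True if reads from file_obj yield bytes, False if they yield str.

    Hypothetical helper; not part of the standard library or Astropy.
    """
    # Text streams (open(..., 'r'), io.StringIO, ...) derive from TextIOBase.
    if isinstance(file_obj, io.TextIOBase):
        return False
    # Binary streams (open(..., 'rb'), io.BytesIO, ...) derive from
    # BufferedIOBase, or from RawIOBase when opened with buffering=0.
    if isinstance(file_obj, (io.RawIOBase, io.BufferedIOBase)):
        return True
    # Assumption: anything else is a readable duck-typed object, so a
    # zero-length read returns an empty bytes or str without consuming data.
    return isinstance(file_obj.read(0), bytes)


print(is_binary_file(io.StringIO('ciao')))
# False
print(is_binary_file(io.BytesIO(b'ciao')))
# True
```

The isinstance checks cover io.StringIO and io.BytesIO without any special casing. The read(0) fallback raises for write-only or non-file objects, so wrap it in a try/except if those can reach this function.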