I'm currently sending pickled Python (3.8) objects via sockets between two running programs. I have a buffer of bytes I want to reconstruct into their corresponding objects on the receiving end.
To my understanding, the socket.recv method is not guaranteed to return all the bytes that were sent, and it's up to the caller to call socket.recv again to pick up the rest of the data. So, at any given time, my buffer could contain partial messages.
Additionally, due to my use of threading, I could receive multiple messages before I check the buffer.
Here's my question:
Given that I'm receiving a byte stream of arbitrary length, which may contain fewer or more than one pickle object, what is the best way to reassemble them? Is there a character I can use as a terminator that is guaranteed to not conflict with pickle?
Is there a character I can use as a terminator that is guaranteed to not conflict with pickle?
Unfortunately there isn't. Pickle packs data in binary form, therefore any sequence of bytes can appear inside a pickled object.
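As a quick illustration of why no terminator is safe, the sketch below pickles a few candidate terminators as bytes objects and checks whether each one shows up verbatim in the output. The helper name appears_in_pickle and the candidate values are made up for this example, and it assumes pickle protocol 4 (the default in Python 3.8), which stores bytes objects raw.

```python
import pickle

def appears_in_pickle(candidate: bytes) -> bool:
    """Return True if `candidate` occurs verbatim inside its own pickle.

    Hypothetical helper for illustration; protocol 4 is assumed because
    it writes the contents of a bytes object unmodified.
    """
    payload = pickle.dumps(candidate, protocol=4)
    return candidate in payload

# Terminators someone might be tempted to use (illustrative values).
candidates = [b"\n", b"\x00", b"\xff", b"<END>"]

for candidate in candidates:
    print(candidate, appears_in_pickle(candidate))
```

Each candidate is found inside its own pickle, so a receiver that split the stream on any of them could cut a message in half.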
what is the best way to reassemble them?
The most common (and probably also the easiest) thing to do when dealing with this kind of problem is to send a fixed-size header that indicates the size of the data that is going to be received.
You can use struct.pack() to create an 8-byte header containing the binary representation (as an 8-byte network-endian unsigned integer) of the size of the pickled object, and send it before the actual data. On the receiving end, you'll first receive the 8-byte header, then you'll decode it to know the size of the data that was sent, and finally receive exactly that number of bytes.
Here's a (simplified) example:
Sender:
    import pickle
    import struct

    class Example:
        pass

    data = pickle.dumps(Example())
    size = len(data)
    # "!Q" = network-endian, unsigned 8-byte integer
    header = struct.pack("!Q", size)

    # open socket...
    sock.sendall(header)
    sock.sendall(data)
Receiver:
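    import pickle
    import struct

    class Example:
        pass

    def receive_exactly(sock, n):
        # Keep calling recv() until exactly n bytes have arrived.
        data = b''
        while n > 0:
            chunk = sock.recv(n)
            if not chunk:
                # recv() returns b'' once the peer has closed the connection;
                # without this check the loop would never terminate.
                raise ConnectionError("socket closed before all data arrived")
            n -= len(chunk)
            data += chunk
        return data

    # open socket...
    header = receive_exactly(sock, 8)
    size = struct.unpack("!Q", header)[0]
    data = receive_exactly(sock, size)
    e = pickle.loads(data)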
Note that the two snippets above only serve as simple examples; you should do proper error checking and handling when using sendall() and recv().
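Since the question describes a buffer that another thread fills and that may hold a partial message or several messages at once, the same length-prefix scheme can also be applied to the buffer directly instead of to the socket. The following is only a sketch under that assumption: the names frame and extract_messages are invented for this example, and the buffer is assumed to be a bytearray that the receiving code appends recv() results to.

```python
import pickle
import struct

# 8-byte network-endian unsigned length prefix, same "!Q" format as above.
HEADER = struct.Struct("!Q")

def frame(obj):
    """Pickle obj and prepend its length header (hypothetical helper)."""
    data = pickle.dumps(obj)
    return HEADER.pack(len(data)) + data

def extract_messages(buffer):
    """Remove every complete message from the front of `buffer` (a bytearray).

    Returns the unpickled objects in order. A trailing partial header or
    partial body is left in the buffer for the next call.
    """
    messages = []
    while len(buffer) >= HEADER.size:
        (size,) = HEADER.unpack_from(buffer, 0)
        end = HEADER.size + size
        if len(buffer) < end:
            break  # body not fully received yet
        messages.append(pickle.loads(bytes(buffer[HEADER.size:end])))
        del buffer[:end]  # drop the consumed message
    return messages

# Simulate two messages arriving split at an arbitrary point.
stream = frame({"a": 1}) + frame([1, 2, 3])
buffer = bytearray()

buffer += stream[:10]             # header plus 2 bytes of the first body
first = extract_messages(buffer)  # nothing complete yet

buffer += stream[10:]             # the rest arrives
second = extract_messages(buffer) # both messages are now complete
```

If the buffer is shared between threads, the append and the call to extract_messages would need to be guarded by the same lock.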