I'm currently building a machine that uses an Arduino Mega2560 as its main controller. The Arduino is connected over serial; it receives a command, executes it, and spits out a bunch of measurement data every 1 ms. I have a Raspberry Pi running Python to give the user a nice GUI to send the command, and to present the data in a readable form.
The problem I face: the Arduino is able to spit out 15 bytes of data each millisecond (so that's only 15 kB/s), but the code I'm running can only cope with about 15 bytes every 10 milliseconds, so 1.5 kB/s.
When I run cat /dev/ttyACM0 > somefile, I nicely see all datapoints.
I have the following slimmed-down Python code:
# Reset Arduino by starting serial
microprocBusy = True
serialPort = serial.Serial("/dev/ttyACM0", baudrate=460800, timeout=0)
time.sleep(0.22);
serialPort.setDTR(False);
time.sleep(0.22);
serialPort.setDTR(True);
time.sleep(0.10);
logfile = open(logfilenamePrefix + "_" + datetime.datetime.now().isoformat() + '.txt', 'a')
# Bootloader has some timeout, we need to wait for that
serialPort.flushInput()
while(serialPort.inWaiting() == 0):
    time.sleep(0.05)
# Wait for welcome message
time.sleep(0.1)
logfile.write(serialPort.readline().decode('ascii'))
logfile.flush()
# Send command
serialPort.write((command + '\n').encode('ascii'))
# Now, receive data
while(True):
    incomingData = serialPort.readline().decode('ascii')
    logfile.write(incomingData)
    logfile.flush() 
    if(incomingData[:5] == "FATAL" or incomingData[:6] == "HALTED" or incomingData[:5] == "RESET"):
        break;
    elif(incomingData[:6] == "RESULT"):
            resultData = incomingData;
logfile.flush() 
When I run this, the first ~350 datapoints come in, then I see some mangled data and miss about 2000 datapoints, after which I see another 350 or so datapoints. The CPU usage is at 100% during the process.
What is going wrong? Is PySerial poorly optimized, or is there some mistake in my code I missed? I could just run cat /dev/ttyACM0 > somefile from Python and then read that file, but that's not really a nice solution, is it?
Thanks a lot :)
I realize that this is an old thread, but it has been viewed 3000 times as of this writing and I would hate for someone to be turned off of pySerial on just this encounter.
I believe the most likely culprit for the author's problem is the implicit parsing that is going on between reads:
incomingData = serialPort.readline().decode('ascii')
The readline() method tells pySerial to parse up to the next line. You are also calling decode() in the middle of your receive cycle. All of this happens right in the middle of your stream.
A better approach might look like this:
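buffer = []  # accumulated characters; create this once, before the receive loop
waiting = port.in_waiting  # number of bytes currently waiting in the input buffer
buffer += [chr(c) for c in port.read(waiting)]  # read them, convert each byte to a character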
# ...keep accumulating the buffer for as long as is reasonable...
processSerialData(buffer)  # whatever processing needs to happen, split your
                           # lines, log, or whatever else *after* you get
                           # your data
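As a rough illustration of the "process afterwards" step, here is a minimal sketch of how accumulated bytes could be split into complete lines once they have been read. The helper name split_complete_lines is hypothetical (it is not part of pySerial), and the sketch assumes newline-terminated records as in the question. It keeps the buffer as a bytearray rather than a list of characters, which is a swapped-in choice and not what the snippet above does.

```python
def split_complete_lines(buffer: bytearray) -> list:
    """Remove and return every complete, newline-terminated line from buffer.

    Any trailing partial line stays in buffer so the next read can finish it.
    """
    end = buffer.rfind(b"\n")
    if end < 0:
        return []  # no complete line yet
    complete = bytes(buffer[:end + 1])
    del buffer[:end + 1]  # keep only the unfinished tail
    return complete.splitlines(keepends=True)


# Example with two reads that cut a line in half (made-up data).
buf = bytearray()
buf += b"DATA 1\nDATA 2\nDA"
first = split_complete_lines(buf)
buf += b"TA 3\n"
second = split_complete_lines(buf)
```

In a real receive loop, each port.read(...) result would be appended to buf and the helper called afterwards, so that logging and decoding happen on whole lines only.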
A very good solution to this can be found here.
The author states:
The code below gives me 790 kB/sec while replacing the code with pyserial's readline method gives me just 170kB/sec.
There is no statement about the baud rate set for this comparison. The value of 9600 baud in the example below is only for testing.
This solution also avoids having 100 % CPU usage.
import serial

class ReadLine:
    def __init__(self, s):
        self.buf = bytearray()
        self.s = s
    
    def readline(self):
        i = self.buf.find(b"\n")
        if i >= 0:
            r = self.buf[:i+1]
            self.buf = self.buf[i+1:]
            return r
        while True:
            i = max(1, min(2048, self.s.in_waiting))
            data = self.s.read(i)
            i = data.find(b"\n")
            if i >= 0:
                r = self.buf + data[:i+1]
                self.buf[0:] = data[i+1:]
                return r
            else:
                self.buf.extend(data)
ser = serial.Serial('COM7', 9600)
rl = ReadLine(ser)
while True:
    print(rl.readline())
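To tie this back to the loop in the question, the prefix checks can be made directly on the returned bytes, so decoding can wait until after the time-critical part. This is only a minimal sketch: classify is a hypothetical helper, the prefixes are the ones used in the question's code, and the sample lines are made up.

```python
STOP_PREFIXES = (b"FATAL", b"HALTED", b"RESET")  # prefixes taken from the question


def classify(line) -> str:
    """Return 'stop', 'result' or 'data' for one received line (bytes or bytearray)."""
    if line.startswith(STOP_PREFIXES):
        return "stop"
    if line.startswith(b"RESULT"):
        return "result"
    return "data"


# Made-up sample lines standing in for what rl.readline() would return.
sample = [bytearray(b"123,456\n"), bytearray(b"RESULT ok\n"), bytearray(b"HALTED\n")]
kinds = [classify(line) for line in sample]
```

The receive loop would then break on 'stop', remember the line on 'result', and write the raw bytes to a log file opened in binary mode.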
I've switched from PySerial to PyTTY, which solves my problem. Just plugging it into this code (with some small changes, like replacing serialPort.inWaiting() == 0 by serialPort.peek() == b'' for example) makes my code able to handle the datastream and not get above 50% CPU usage, which means it is at least 10x as fast. I'm still using PySerial to set the DTR lines though.
So, I guess the answer to the question is that PySerial is indeed poorly optimised.