I may have spotted one last low-hanging fruit in the orchard, and it may be gold but there’s a chance it’s just brass.
Currently motion and sampling processing run in series on a single thread. Critically, the sampling code waits for the next data ready interrupt. That’s wasted time that the motion processor could use gainfully for a bit more processing, but it can’t because the code is blocked waiting for that interrupt. When 10 sets of samples have been gathered, the sampling code hands back to the motion code; the motion code takes 3ms, and during that time, nobody is waiting for the data ready interrupt and so valid samples are lost.
Running sampling and motion processing on separate Python threads doesn’t work because they share a single CPU, with the Python interpreter’s own scheduling running the show. Throw in the GIL, and they’re effectively still running serially.
Now many many years ago, I wrote a kernel driver for an SDLC device. It had an ISR – an interrupt service routine – which got called at higher priority when a new packet arrived; it did a tiny bit of processing and exited, allowing the main thread to process the new data the ISR had caught and cached.
That was in the kernel but I think something very similar is possible using the GPIO library in user-land. It’s possible to register a callback that is made when the data ready interrupt occurs. In the meantime, motion processing runs freely. The interrupt handler / callback runs on a separate OS (not Python) thread, so hopefully the GIL won’t spoil things.
So here’s how it should work:
- A data_ready callback gets registered to read the sensors – it’s triggered by the rising edge of the hardware data ready interrupt.
- Once installed, the callback fires every 1ms, and each time it reads a new batch of sensor data.
- The callback normally just caches the data but every 10 samples, it copies the batch into some memory the motion thread is watching, and kicks it into life by sending a unix signal (SIGUSR1).
- The motion thread sits waiting for that signal (signal.pause()) – just like the “threading” code does now.
- Once received it has 10ms to process the data before the next batch comes through from the callback – that’s plenty of time.
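
Here’s a rough sketch of how those steps might fit together. It is not the real flight code: the Sampler class, kick_motion(), run(), and the read_sensors / process_batch arguments are all names I’ve made up for illustration, and the pin number is whatever the data ready line happens to be wired to. The only library calls it leans on are RPi.GPIO’s add_event_detect() with a callback, os.kill() and signal.pause().

```python
import os
import signal
import threading


class Sampler:
    """Caches sensor reads and hands over a full batch every batch_size samples.

    read_sensors and notify are hypothetical hooks supplied by the caller.
    """

    def __init__(self, read_sensors, notify, batch_size=10):
        self._read_sensors = read_sensors
        self._notify = notify
        self._batch_size = batch_size
        self._cache = []          # samples gathered since the last hand-over
        self._ready = None        # latest full batch, watched by the motion code
        self._lock = threading.Lock()

    def data_ready(self, channel):
        """Callback for the data ready edge; runs on the GPIO library's thread."""
        self._cache.append(self._read_sensors())
        if len(self._cache) >= self._batch_size:
            with self._lock:
                self._ready = self._cache
            self._cache = []
            self._notify()        # kick the motion code into life

    def take_batch(self):
        """Called by the motion code; returns the batch once, then None."""
        with self._lock:
            batch, self._ready = self._ready, None
        return batch


def kick_motion():
    """Wake the main thread by sending SIGUSR1 to our own process."""
    os.kill(os.getpid(), signal.SIGUSR1)


def run(pin, read_sensors, process_batch):
    """Wire it all up on a Raspberry Pi (assumes RPi.GPIO is installed)."""
    import RPi.GPIO as GPIO

    sampler = Sampler(read_sensors, kick_motion)

    # A do-nothing handler is enough: it stops SIGUSR1 killing the process
    # and lets signal.pause() return when the signal arrives.
    signal.signal(signal.SIGUSR1, lambda signum, frame: None)

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(pin, GPIO.IN)
    GPIO.add_event_detect(pin, GPIO.RISING, callback=sampler.data_ready)
    try:
        while True:
            signal.pause()                 # sleep until the callback kicks us
            batch = sampler.take_batch()
            if batch is not None:
                process_batch(batch)       # motion processing goes here
    finally:
        GPIO.cleanup()
```

One thing to watch in a loop like this: if the signal turns up while the motion code is still busy, signal.pause() won’t see it and will sleep until the following one, so the motion processing really does need to finish inside its window.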
The subtle difference is that the waiting is happening in the kernel scheduler rather than in the python scheduler, meaning the python motion code can run in between each new data ready interrupt callback.
From looking at the GPIO code, the callback thread is not a python thread and should not be affected by the GIL nor the overhead of python threading. Which means the motion processing can happen at python level while sensor sampling happens on a ‘C’ level thread.
It’s definitely worth a go.
If it works, it also allows me to climb nearer to the moral high ground: I’ll be able to revert to the standard RPi.GPIO library rather than my version, which performance-tuned the wait_for_edge() call. The callback function doesn’t have the same inefficiencies. One of my guiding principles of this project was to keep it pure, so it feels good to return to the one true path to purity and enlightenment. Fingers crossed!