Zoe++

Zoe is now running my split cluster gather + process code for the RaspiCam video macro-blocks.  She has super-bright LEDs from Broadcom with ceramic heatsinks so the frame doesn’t melt and she’s running the video at 400 x 400 px at 10fps.
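For anyone wanting to play with the macro-blocks themselves, here's a minimal sketch of the gather side using the picamera motion-vector API – the MotionGrabber class name and the empty analyse() body are placeholders for illustration, not my split cluster code:

    import picamera
    import picamera.array

    class MotionGrabber(picamera.array.PiMotionAnalysis):
        # Called once per frame with a numpy record array of macro-blocks;
        # each element carries 'x', 'y' (motion vector) and 'sad' fields.
        def analyse(self, a):
            pass   # hand the vectors off to the processing side here

    with picamera.PiCamera() as camera:
        camera.resolution = (400, 400)
        camera.framerate = 10
        with MotionGrabber(camera) as motion:
            # The video frames go to the bin; only the motion vectors are kept.
            camera.start_recording('/dev/null', format='h264',
                                   motion_output=motion)
            camera.wait_recording(10)
            camera.stop_recording()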

The results?:

And this, peeps, is nearly as good as it can be without more CPU cores or (heaven forbid) moving away from interpreted CPython to pre-compiled C*.  Don’t get me wrong, I can (will?) probably add minor tweaks to process compass data – the code is already collecting it; adding intentional lateral motion to the flight plan costs absolutely nothing – hovering stably in a steady headwind is identical processing to intentional forwards movement in no wind.  But beyond that, I need more CPU cores without significant additional power requirements to support GPS and Scanse Sweep.  I hope that’s what the A3 eventually brings.

I’ve updated everything I can on GitHub to represent the current (and perhaps final) state of play.


* That’s not quite true; PyPy is Python with a just-in-time (JIT) compiler. Apparently, it’s the dogs’ bollocks, the mutts’ nuts, the puppies’ plums. Yet when I last tried it, it was slower, probably due to the RPi.GPIO and RPIO libraries needed. Integrating those with PyPy requires a lot of work which, up until now, simply hasn’t been necessary.

Crass assumption

The new code I posted yesterday is based on the assumption that the motion processing code takes less than 1ms, so that no samples are missed.  I was aware the assumption probably wasn’t true, but thought I really ought to check this morning; the net result is that it takes about 3.14ms, so 3 samples are missed.  That’s more than enough to skew the accelerometer integration, resulting in duff velocities and drift.
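Checking is straightforward; here’s a minimal sketch of the measurement involved – process_batch and samples are hypothetical names standing in for the motion processing, not taken from my code:

    import time

    def timed(fn, *args):
        # Report how long a call takes, in milliseconds.
        start = time.time()
        result = fn(*args)
        print("%s took %.2fms" % (fn.__name__, (time.time() - start) * 1000.0))
        return result

    # e.g. wrap the motion processing call: timed(process_batch, samples)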

I couldn’t spot any areas of the code that could be trimmed significantly, so it’s back to working out why PyPy runs so much slower than CPython – about 2.5 times, based on the same test.

I will be sticking with this latest code as I believe its timing is still better than the time.time() version; I just need to speed it up a little.  FYI, the PyPy performance data from their site suggests there should be more than a 6-fold performance improvement even for the PyPy version (2.2.1) shipped with the standard Raspbian distribution; that’s more than enough.

I am still tinkering with kitty++ in the background, and have the macro-block data, but need to work out the best / correct way to interpret it.  But it’s blocked because, for some reason, kitty’s Raspberry Pi can’t be seen by other computers on my home network other than by IP address.  Just some DHCP faff I can’t be bothered to deal with at the moment.

The last nail in FIFO’s coffin

Based upon a comment from yesterday’s post, I did some more rigorous testing to track whether FIFO could be made to work.

The main change was to drop the sampling rate to 250Hz to allow loads of time to read the FIFO.  With that in place, the FIFO never filled, and I could just read n batches of 12-byte sensor data successfully.
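For reference, here’s a minimal sketch of that style of read – the register addresses are from the MPU-6050 register map, but the code itself is illustrative rather than lifted from my tree:

    import smbus

    MPU6050_ADDRESS = 0x68
    FIFO_COUNT_H    = 0x72   # 16-bit FIFO byte count, big-endian
    FIFO_R_W        = 0x74   # each read of this register pops one FIFO byte
    BATCH_SIZE      = 12     # 3 axes * 2 bytes * (accelerometer + gyro)

    bus = smbus.SMBus(1)

    def read_fifo():
        # How many bytes are waiting?
        high, low = bus.read_i2c_block_data(MPU6050_ADDRESS, FIFO_COUNT_H, 2)
        count = (high << 8) | low

        # Pop complete 12-byte batches only, byte by byte.
        batches = []
        for _ in range(count // BATCH_SIZE):
            batches.append([bus.read_byte_data(MPU6050_ADDRESS, FIFO_R_W)
                            for _ in range(BATCH_SIZE)])
        return batches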

With that sorted, next I ran a sanity check – an unpowered flight with Phoebe sitting on the floor.  There were a number of I2C read errors.  Here’s what I got from the ‘accelerometers’ during an 8 second passive flight:

[Graph: FIFO stats]

The first 2 seconds are correct – 1g of gravity in the Z axis, and 0g in the X and Y axes.  And then all hell breaks loose – or it would have done, had the motors been powered up.  Suddenly, gyro data has crept into the accelerometer range, so Z axis gravity drops to near zero, and after that, who knows WTF went wrong.  The errors do seem to correspond to the count of I2C errors I track, which I think means that if I want to pursue the FIFO direction, I have to swap I2C for SPI.  In fact, swapping to SPI might be the best next approach for the data register reading as well.  The only downside is that it requires the PCBs to be reworked.

I just wish PyPy had delivered what it claims regarding performance, as then I wouldn’t have this new battle to fight.

No go FIFO

The FIFO solution, which should have addressed all the problems of corrupted and lost data while providing accurate timings, has failed.  There are multiple factors:

  • The MPU-6050 FIFO isn’t a real FIFO as I know it – the amount of data in the FIFO doesn’t decrement as you read from it.  That means the FIFO needs to be emptied and then reset, which in turn means the data read needs to be stored in another FIFO in my code.
  • The enforced FIFO reset means potential loss of data if new data arrives in the small gap between emptying the FIFO and resetting it – and in testing, I’ve seen this happen.  This is a real problem; each batch of sensor data is 12 bytes long (3 axes * 2 bytes * (accelerometer + gyro)).  Lose a byte or two of accelerometer data in the FIFO read / reset gap and, all of a sudden, data read from the FIFO as accelerometer readings now contains gyro readings – see the sketch below.
  • The FIFO is read byte by byte.  That means twelve 1-byte reads for a full batch of data, compared to one 14-byte read directly from the data registers.  The latter is a lot more efficient – in fact, the twelve 1-byte reads are so slow that the FIFO fills up faster than the data can be read, and starts overflowing.  Yes, I can reduce the sampling rate (and I did, to 500Hz), and that improved matters but, to cap it all…
  • There are still I2C bus errors, which means FIFO data can still get lost, again shifting the data so gyro readings slip into what ought to be accelerometer readings.

Put together, the FIFO doesn’t stand a feline in the fire’s chance of working.
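To make that misalignment concrete, here’s a minimal sketch – the values are made up for illustration, not captured from Phoebe:

    import struct

    # One FIFO batch: ax, ay, az, gx, gy, gz as big-endian signed 16-bit.
    batch = struct.pack('>6h', 0, 0, 4096, 10, -10, 5)   # 1g on Z at 4096 LSB/g

    ax, ay, az, gx, gy, gz = struct.unpack('>6h', batch)
    print(az)   # 4096 - gravity where it should be

    # Lose the first byte in the read / reset gap; every later byte slides
    # along one slot, and the whole frame is now misaligned:
    slipped = batch[1:] + batch[:1]
    ax, ay, az, gx, gy, gz = struct.unpack('>6h', slipped)
    print(az)   # 0 - the 1g on Z has vanished into the wrong fields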

Which means I need to drop back to plan A – PyPy.  I think the problem here is the PyPy I2C code, which is out of date in the Raspberry Pi Raspbian distribution.  I’m hoping someone reading this blog can encourage the RPF to update the distribution to include the latest copy of PyPy and its I2C / smbus library – please 🙂

Until / unless that happens, I’m stuck again with Phoebe and Chloe all dressed up and nowhere to go.

Phoebe, PyPy and Performance

I’ve just managed to get my Quadcopter code to run under PyPy rather than CPython – that means the hot paths of the code get compiled to machine code as it runs (Just-In-Time or JIT) rather than being interpreted line by line.  Sadly, this took the performance down to 58% rather than the 95% I’d achieved with CPython 🙁

However, the PyPy code in the standard Raspbian distribution is very out of date (version 2.2.1 compared to the current 2.6), so there’s more investigation to be done.

In passing, I also updated the Raspbian distribution (sudo apt-get dist-upgrade) installed on Phoebe and, amazingly, that has taken me to about 98%!

Time to go see if that’s made a real difference…

CPython, Cython or PyPy?

I’m pretty sure that my interpreted CPython code is as efficient as possible, so if I want to capture all samples (for accurate integration) rather than the 95% I currently get, I need to make the motion processing code take less than 1ms.  The most obvious currently-available solution (until the RPi A2 is launched) is to move from interpreted CPython to compiled Cython or PyPy.

I considered this a long time ago when trying to speed up the code, but in the end, I didn’t need to make the move as various tweaks to the code* improved the performance by a factor of 5 or 6.

But now is the time to make that leap of faith.  I’ll update you on progress.


*Primary performance enhancements:

  • customized GPIO library to optimize rising-edge performance for the data-ready hardware interrupt
  • run sampling at 1kHz but only run motion processing on each batch of 10 (averaged) samples – sketched below
  • minimized calls to time.time() to just time-stamping each batch of 10 samples – another irony, given that calling time.time() is the most time-consuming call in this code.
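Here’s a minimal sketch of the second and third points – the callables are hypothetical stand-ins, not my actual code:

    import time

    SAMPLES_PER_BATCH = 10   # sample at 1kHz, motion-process at 100Hz

    def flight_loop(read_sensors, process_motion):
        # read_sensors: blocks on the data-ready interrupt and returns one
        # (ax, ay, az, gx, gy, gz) tuple; process_motion: the motion code.
        batch = []
        while True:
            batch.append(read_sensors())
            if len(batch) == SAMPLES_PER_BATCH:
                batch_time = time.time()   # the only time.time() call per batch
                averaged = tuple(sum(axis) / SAMPLES_PER_BATCH
                                 for axis in zip(*batch))
                process_motion(averaged, batch_time)
                batch = []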