FIFO overflow

The IMU FIFO has a fixed size, so it can fill up if it isn’t read frequently enough; it’s then configurable whether the IMU stops adding to the FIFO or overwrites what’s already there.  Either way, this results in data corruption.  I’d heard, and a quick test confirmed, that the FIFO size is 512 bytes.  The same test also revealed it takes 0.064s to empty a full FIFO byte-by-byte, or 0.000125s (1/8000s) to read a single byte, or 1.5ms to read the 12 bytes that make up a full set of sensor data.
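As a rough check of those numbers, the sketch below works out how many complete 12-byte samples fit in the FIFO and how long the FIFO takes to fill at a given sampling rate.  The constant and function names are made up for illustration and are not taken from the flight code.

```python
FIFO_SIZE_BYTES = 512        # measured FIFO capacity
SAMPLE_SIZE_BYTES = 12       # ax, ay, az, gx, gy, gz at 2 bytes each
BYTE_READ_TIME = 1.0 / 8000  # measured time to read one byte, in seconds


def samples_per_fifo():
    """Number of complete sensor samples that fit in the FIFO."""
    return FIFO_SIZE_BYTES // SAMPLE_SIZE_BYTES


def time_to_overflow(sampling_rate_hz):
    """Seconds of unread sampling before the FIFO holds no more whole samples."""
    return samples_per_fifo() / float(sampling_rate_hz)


def sample_read_time():
    """Seconds needed to read one 12-byte sample byte-by-byte."""
    return SAMPLE_SIZE_BYTES * BYTE_READ_TIME
```

With these figures the FIFO holds 42 whole samples, so at 500Hz sampling it needs emptying at least every 84ms or so, and at 1kHz every 42ms.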

For the FIFO code to work, I need to read the FIFO and update the ESCs often enough in comparison to the sensor sampling rate to ensure the FIFO never overflows.  The rate for updating the ESCs is arbitrary; experience says about 100Hz is a good value.

The other factor is how long the motion processing takes; again experience with the hardware interrupt code suggests this is around 2ms.

A flight loop looks like this:  sleep, empty FIFO, motion_processing.

The question is how long to sleep in order to get the 100Hz ESC update frequency and not allow the FIFO to overflow?

esc_period = 1 / 100Hz = 0.01s
read_period = 0.0015s
motion_period = 0.002s
num_samples = sampling_rate x dt    (dt = time since the FIFO was last emptied)

esc_period ≈ sleep_period + num_samples x read_period + motion_period
0.01 ≈ sleep_period + sampling_rate x dt x 0.0015 + 0.002
sleep_period ≈ 0.01 - 0.002 - sampling_rate x dt x 0.0015
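The arithmetic above can be sketched in a few lines of Python.  This is only an illustration: it assumes dt is roughly one ESC period (0.01s), which the text doesn’t state explicitly, and the function name is hypothetical.

```python
ESC_PERIOD = 0.01      # 100Hz ESC update target, in seconds
READ_PERIOD = 0.0015   # time to read one 12-byte sample from the FIFO
MOTION_PERIOD = 0.002  # time taken by motion processing


def sleep_period(sampling_rate_hz, dt=ESC_PERIOD):
    """Sleep time per loop; a negative result means the loop cannot keep up.

    dt defaults to one ESC period, which is an assumption of this sketch.
    """
    num_samples = sampling_rate_hz * dt
    return ESC_PERIOD - MOTION_PERIOD - num_samples * READ_PERIOD


for rate in (250, 500, 1000):
    print("%4dHz: sleep %.5fs" % (rate, sleep_period(rate)))
```

Under that assumption there is about 4.25ms to sleep at 250Hz and only about 0.5ms at 500Hz, while at 1kHz the result is negative, meaning the FIFO reads alone take longer than the ESC period.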

I’ve updated the code, and then collapsed it to a simplified version which is now on GitHub.  With some basic PID tuning, flights are back to a quality equivalent to Phoebe’s, although I need to keep testing for finer tuning and confidence building.

 

Babbage takes to the air

It’s Charles Babbage’s 224th birthday today, so how better to celebrate than to take his namesake, the Raspberry Pi Babbage Bear, for a flight!

I’m so pleased with the QCIMUFIFO.py (Quadcopter Inertial Measurement Unit First In First Out) code that I’ve decided to make it the primary development source, renaming Quadcopter.py to QCDRI.py (Quadcopter Data Ready Interrupt).  They are both on GitHub along with a version of qc.py which makes it easier to select which to run.

FIFO first flight

In a 10 minute slot between 40mph winds and torrential rain, the sun came out and I quickly grabbed a couple of outdoor flights with Zoe running the IMU FIFO code.  She passed the primary test by not drifting significantly which means further flights can move indoors.  She needs some gyro PID tuning – she’s flying with Phoebe’s defaults – she was see-sawing around the X (roll) axis.  She was running at 500Hz sampling with ESC updates at a nominal 100Hz, and she just about kept up with herself.

My hope is that once the Christmas chaos has died down, I can get that tuning done, and show her off on video to the world.

FIFO food for thought

OK, so FIFO is good, and definitely better than using the hardware interrupt, but far from perfect.  It does capture every sample regardless of what else is going on, which is great, but due to two factors, it doesn’t actually create free time to read other inputs.  This doesn’t mean other inputs can’t be read, but reading those inputs delays the next ESC update, meaning that the flight might be jittery, perhaps to the extent of being unstable.

The two factors are that

  • reading the FIFO register is a byte-by-byte operation rather than the single 14 byte read used when reading the sensor registers directly – this is slower
  • To ensure the ESCs are updated at a reasonable frequency (100Hz is a good value), it’s now necessary to call time.time() a couple of times, which, as I’ve mentioned before, ironically wastes time.

There are a couple of plus sides too:

  • because it doesn’t use the hardware interrupt, it doesn’t need the custom GPIO performance tweaks I had to make – this satisfies my desire to use standard Python libraries if at all possible
  • Not using the GPIO library (mine or the standard one) partially opens up the possibility of using PyPy, although that still needs testing as the RPIO library is still required for the hardware PWM.

Anyway, the new props for Zoe arrived today, so the next step is to check both the interrupt and FIFO code to see how they perform in a real flight.

FIFO Phoenix

The Tangerine Dream train of thought puffed through the IMU FIFO station, and got me thinking I needn’t wait for the B2 – I can test it with Zoe.

Cutting to the chase, the results of my FIFO code test suggest the flight controller is a time traveller!  The HoG has got the Infinite Improbability Drive working, and there’s a lovely cuppa tea steaming away!  It can spend time doing slow things like reading lots of other non-time-critical inputs (altimeter, compass, GPS, remote control…) and then go back in time to pick up all the readings from the time-critical accelerometer and gyro, knowing exactly when they happened, and process them as though each had only just happened.

To be autonomous, my code has to catch every sample from the accelerometer and gyro.  Any missed readings result in drift.  The code until today did this by waiting for a hardware interrupt that is configured to happen every 4ms (250Hz sampling).  So every 4ms, it catches the interrupt, reads the sensors and processes them.  That doesn’t leave very much time to do anything else before the next interrupt.  This has been a psychological block for me: I have camera motion trackers, ultrasonic range finders, GPS, altimeter and compass sensors all waiting to be used.  But I never got very far with any of them as ultimately I knew I only had a fraction of a millisecond spare.  By using the IMU FIFO, the code could take control of time by reading the cache of sensor readings when it wanted to (within reason), and thus make the space to process input from other sensors.

I’ve known for a while that the IMU FIFO could take back control over time from the hardware interrupt, but my previous investigation hit a very hard brick wall: I2C errors were corrupting the data I was reading from the FIFO.  For example, here’s gravity from 6 months ago while Phoebe was sitting passively on the floor:

[Figure: FIFO stats]

But I’ve not seen I2C errors since the move to Jessie, so I gave it another go with Zoe:

[Figure: IMU FIFO Accelerometer]

Obviously, this is just a single test run, but if this proves to be reliable, it is truly liberating.  It opens up a whole new world of sensors, the hard part only being which to add first!

Tangerine Dream

A comment from Jeremy yesterday planted a seed, which has rapidly become a germinating coconut!

It started with using a RPi B2 to give me the extra CPU cores I wanted rather than wait for the fictional RPi A2.  I’d not considered a B2 as only an A+ fits between the top and bottom plates of the frame, but in the past I’ve used Bs, As and B+s sat on the top plate.  Immediately I thought of Chloe who’s currently just a Phoebe clone – but a RPi B2 could make her so much more – I started rebuilding Chloe to shuffle the bits around to make space for the RPi B2 on the top plate.

I’d need a case of course for protection in her new exposed position on the top plate, and the Tangerine PiBow appealed for some reason.  Perhaps it’s the seasonal stocking filler from Santa?

Then it dawned on me that since using Jessie for Zoe, I’d not seen any I2C errors which have plagued the accuracy of sensor readings for a year now, and that meant I could reconsider using some of the FIFO variants of the code I’d shelved.

By using a FIFO, control of time is no longer tied to the data ready interrupt: additional functionality does not have to be squeezed into the few milliseconds between sensor readings; the code can do what it wants (for example with additional sensors) and periodically empty the FIFO and process the contents, with accurate elapsed time based on the number of samples retrieved from the FIFO.
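To make the “elapsed time from the number of samples” idea concrete, here is a minimal single-axis sketch.  It assumes the samples have already been read from the FIFO and converted to accelerations; the function name and units are illustrative only, and the real flight code works on three axes and has gravity to deal with.

```python
def integrate_batch(samples, sampling_rate_hz, velocity=0.0):
    """Integrate acceleration samples using the IMU sample period as dt.

    samples: accelerations (m/s^2) drained from the FIFO, oldest first.
    Returns (new_velocity, elapsed_seconds); no wall-clock call is needed.
    """
    dt = 1.0 / sampling_rate_hz
    for accel in samples:
        velocity += accel * dt
    elapsed = len(samples) * dt
    return velocity, elapsed


# Five samples drained from the FIFO at 500Hz represent 10ms of flight,
# however long the code spent elsewhere before reading them.
velocity, elapsed = integrate_batch([1.0] * 5, 500)
```

The point of the design is that timing comes from counting samples against the IMU’s own clock, so slow work between FIFO reads doesn’t distort the integration as long as the FIFO hasn’t overflowed.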

For now, I’m going to focus on getting Zoe and her PiZero up to Phoebe’s standard, but I hope to then raise the Tangerine phoenix that’s Chloe from the flames.

 

Threads, ISRs and FIFOs

I have 4 strands of development on the go at the moment, all using different ways to collect data samples and run motion processing on them.  The aim in all cases is to capture every batch of samples available as efficiently as possible and run motion processing over them.

The Quadcopter.py code runs all the code serially, blocked waiting for each set of samples and when a batch of ten have been collected, motion processing runs using an average of that batch of ten.  This is the fastest version capturing more valid samples, despite being single threaded; some of this is due to the optimised GPIO library required for the blocking wait for the data ready interrupt.  However the serial operation means that motion processing needs to take less than one millisecond in order to ensure all samples are captured before the next batch of data is ready @ 1kHz.  Currently motion processing takes about 3ms and it’s hard to see how to trim it further.

The QCISR.py code runs sampling and motion in separate threads, and only requires the standard RPi.GPIO library.  The motion processing is the main thread; the data sampling thread is an OS thread rather than a Python thread, used by the GPIO code to call into the sampling code each time a data ready interrupt occurs.  The expectation here was that because the sampling thread is always waiting for the next batch of data, none will ever be missed.  However it seems that the threading causes sufficient delays that in fact this code runs 10% slower.  It currently uses the same batching / averaging of data model as above.  The advantage here (if the threading didn’t have such an impact) is that the motion processing has ten milliseconds to run its course while the next batch of ten samples is being sampled on a separate thread.

The QCOSFIFO.py code runs sampling and motion processing in separate Python threads, setting up an OS FIFO to pass data from sampling to motion processing.  The motion thread sits waiting for the next batch on a select.select() call.  Currently, although data is being written, the select.select() never unblocks – possibly because the FIFO is intended as an IPC mechanism, but there is only a single process here.  I’ll need to move sampling to a child process to proceed on this further.

The QCIMUFIFO.py code tries to use the IMU FIFO, and the motion processing code periodically empties the FIFO and processes the data.  This is single threaded, but no samples should be lost as they are all queued up in the IMU FIFO.  The data pushed into the FIFO are batches of (ax, ay, az, gx, gy, gz), each value taking 2 bytes, every sampling period.  The code reads these 12 bytes, and breaks them down into their individual components.  This could be the perfect solution, were it not for the fact that I2C errors cause the loss of data.  This results in the boundaries between (ax, ay, az, gx, gy, and gz) slipping, and from then on, none of the samples can be trusted.  This seems to happen at least once per second, and once it does, the result is a violent crash.
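The sketch below illustrates both the unpacking and the boundary slip.  It assumes each value is a big-endian signed 16-bit integer (high byte first), and the raw values 1 to 6 are made-up test data rather than real sensor readings; unpack_samples is a hypothetical helper, not a function from QCIMUFIFO.py.

```python
import struct

SAMPLE_SIZE = 12  # six 16-bit values: ax, ay, az, gx, gy, gz


def unpack_samples(fifo_bytes):
    """Split raw FIFO bytes into (ax, ay, az, gx, gy, gz) tuples.

    Assumes big-endian signed 16-bit values; a trailing partial sample is ignored.
    """
    samples = []
    whole = len(fifo_bytes) - len(fifo_bytes) % SAMPLE_SIZE
    for offset in range(0, whole, SAMPLE_SIZE):
        chunk = bytes(fifo_bytes[offset:offset + SAMPLE_SIZE])
        samples.append(struct.unpack(">6h", chunk))
    return samples


# Three identical samples with easily recognisable (made-up) raw values.
good = struct.pack(">6h", 1, 2, 3, 4, 5, 6) * 3

# Simulate an I2C error losing the first byte of the second sample.
slipped = good[:12] + good[13:]

print(unpack_samples(good))
print(unpack_samples(slipped))
```

In the slipped stream the first sample still decodes correctly, but every value after the lost byte is assembled from the low byte of one reading and the high byte of the next, which is why nothing after the first error can be trusted.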

For the moment, Quadcopter.py produces the best results; QCISR.py has the potential to be better on a multicore system using threads; QCOSFIFO.py would be a much better solution but requires splitting into two processes on a multi-core CPU; finally QCIMUFIFO.py is by far and away the best solution with single threaded operation with no possible data loss and reliable timing based on the IMU sampling rate, if it weren’t for the fact either the FIFO or the reads thereof are corrupted.

There’s one variant I haven’t tried yet, based upon QCISR.py – currently there’s just some shared data between the sampling and motion processing threads; if the motion processing takes longer than the batch of ten samples, then the data gets overwritten.  I presume this is what’s happening due to the overhead of using threads.  But if I use a python queue between the sampling and motion threads (effectively a thread safe FIFO), then data from sampling doesn’t get lost; the motion thread waits on the queue for the next batch(es) of data, empties and processes the averaged batches it has read.  This minimizes the processing done by the sampling thread (it doesn’t do the batching up), and IFF the sampling thread is outside of the python GIL, then I may be able to get all samples.  This is where I’m off to next.
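A minimal sketch of that queue-based variant might look like the following.  It uses the Python 3 module name queue (Queue in Python 2), and stands in a simple counter for the real sensor read; sampler, motion and run are hypothetical names, and in the real code the sampling side would be driven by the data ready interrupt rather than a loop.

```python
import queue
import threading

BATCH_SIZE = 10


def sampler(read_sample, out_queue, num_samples):
    """Sampling thread: push every raw sample onto the queue and nothing more."""
    for _ in range(num_samples):
        out_queue.put(read_sample())
    out_queue.put(None)  # sentinel: no more samples


def motion(in_queue):
    """Motion thread: block on the queue and average each batch of ten."""
    averages = []
    batch = []
    while True:
        sample = in_queue.get()  # blocks until the sampler provides data
        if sample is None:
            break
        batch.append(sample)
        if len(batch) == BATCH_SIZE:
            averages.append(sum(batch) / float(BATCH_SIZE))
            batch = []
    return averages


def run(num_samples=30):
    """Wire the two sides together using a fake sensor that counts upwards."""
    counter = iter(range(num_samples))
    q = queue.Queue()
    worker = threading.Thread(target=sampler,
                              args=(lambda: next(counter), q, num_samples))
    worker.start()
    averages = motion(q)
    worker.join()
    return averages
```

Because the queue is unbounded and thread safe, a slow motion pass just means more samples are waiting next time round, rather than shared data being overwritten.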

I’ve included links to zipped-up code for each method in case you’d like to compare and contrast them.

Despicable me!

After discussion on the PyPy dev e-mail alias, it seems that to get PyPy performance, I need to change GPIO and RPIO from using the CPython ‘C’ API to using CFFI to call the GPIO / RPIO ‘C’ code from PyPy (or CPython come to that).  It’s the CPython ‘C’ API that’s the performance hog for anything but CPython.  But it’s not entirely clear to me how to use CFFI on the GPIO / RPIO ‘C’ code.  I don’t think it’s tricky – but I’m ignorant so there’s a lot of learning to do.  I think some googling’s needed to find some examples.

There’s a plan B though: kitty uses an OS FIFO to read data from the picamera on a separate thread while the camera thread is still filming.  This works well.  I could do the same thing, moving motion processing to a separate thread (though it’ll probably need to be a process due to the GIL) and feed it sensor data over a FIFO.  The motion process just waits on a select() for the next batch of sensor data, and processes it (taking ~ 3ms) while the next batch of sensor data is collected and averaged (taking ~ 10ms).  No data is lost, and timing is wholly driven by the MPU sampling rate.  I’m pretty certain this could work.

But that led me to realize there’s a dirty, no, absolutely filthy and perhaps despicable hack I can do. The motion processing is consistently taking just over 3ms.  If I add that time to the 10ms taken to sample the sensors 10 times, then assuming the sensor readings in those 10 samples are pretty consistent (I’m assuming this already to some extent by averaging them), then including the 3ms for motion processing will make the velocity integration more accurate when there is acceleration.  Essentially it’s interpolating the 3 missing samples lost during motion processing based upon the average of the 10 that were collected successfully.  I could also do the same for missed data samples.
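As a single-axis illustration of the hack, the sketch below integrates the batch average over 13ms instead of 10ms.  The constants come from the figures above; the function name and the flag for switching the hack off are invented for this example.

```python
SAMPLE_PERIOD = 0.001  # 1kHz sampling
MOTION_TIME = 0.003    # measured motion processing time, during which samples are missed


def integrate_velocity(velocity, batch, include_motion_time=True):
    """Integrate the batch average over the sampling time, optionally
    stretched to cover the samples missed during motion processing."""
    average = sum(batch) / float(len(batch))
    dt = len(batch) * SAMPLE_PERIOD
    if include_motion_time:
        dt += MOTION_TIME  # the hack: treat the 3 missed samples as equal to the average
    return velocity + average * dt


# Constant 1 m/s^2 acceleration over a batch of ten samples.
with_hack = integrate_velocity(0.0, [1.0] * 10)
without_hack = integrate_velocity(0.0, [1.0] * 10, include_motion_time=False)
```

Under constant acceleration the stretched version accounts for the full 13ms of real time, whereas the unstretched version under-reports the velocity change by the 3ms spent in motion processing.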

So I tried it out, and after a few minor tweaks, it worked.  I wanted to get a video to back up all these boring words, but by then, her main LiPo was running low, and she got all wibbly wobbly at that point.  I’ll try later when it’s up to full charge again.

So for now, dirty will do nicely, though I do intend to try the FIFO method too, initially with threads to get the code working, and then I’ll move over to using processes which is a little bit trickier.

Wasted time.time()

I’ve been doing some fine tuning of the HoG code as a result of the FIFO trial I did.  The FIFO timing came from the IMU sampling rate – there’s a sample every 1ms and so there’s no need for using time.time() when the IMU hardware clock can do it.

I’ve now applied the same to the hardware interrupt driven code; motion processing now takes less time as it was the only call to time.time(), and as such that means there’s a smaller chance of missing samples too.  I’ve also added code so that when there are I2C / data corruption errors, that counts as another 1ms gap since the code then waits for the next hardware interrupt.
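The bookkeeping this implies can be sketched as a small counter class.  SampleClock is a hypothetical name for illustration only; the idea is simply that every data ready interrupt, whether or not the read succeeded, advances time by one sample period.

```python
class SampleClock(object):
    """Track elapsed time by counting IMU sample periods instead of calling time.time()."""

    def __init__(self, sampling_rate_hz):
        self.period = 1.0 / sampling_rate_hz
        self.elapsed = 0.0
        self.errors = 0

    def good_sample(self):
        """A sample was read successfully: one period has passed."""
        self.elapsed += self.period

    def failed_read(self):
        """An I2C / corruption error: the data is lost, but the period still passed."""
        self.errors += 1
        self.elapsed += self.period


clock = SampleClock(1000)
for _ in range(9):
    clock.good_sample()
clock.failed_read()
```

After nine good samples and one failed read at 1kHz, the clock reports about 10ms elapsed, which is what the integration needs even though only nine readings were usable.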

I’ve also removed the 0g calibration again, as I’m fairly convinced it’s pointless – I _think_ subtracting the (Butterworth filter extracted) gravity from acceleration makes the calibration redundant.  That also adds a very marginal performance increase to motion processing.

A few test flights showed no regression (nor noticeable improvement), so I’ve uploaded the code to GitHub.  You also need to grab the latest version of my GPIO library as it now imports as GPIO rather than as a clone of RPi.GPIO.

The last nail in FIFO’s coffin

Based upon a comment from yesterday’s post, I did some more rigorous testing to track whether FIFO could be made to work.

The main change was to drop the sampling rate to 250Hz to allow loads of time to read the FIFO.  With that in place, the FIFO never filled and I could just read n batches of 12 byte sensor data successfully.

With that sorted, next I ran a sanity check – an unpowered flight with Phoebe sitting on the floor.  There were a number of I2C read errors.  Here’s what I got from the ‘accelerometers’ during an 8 second passive flight:

[Figure: FIFO stats]

The first 2 seconds are correct – 1g of gravity in the Z axis, and 0g gravity in the X and Y axes.  And then all hell breaks loose – or it would if the motors were powered up.  Suddenly, gyro data has crept into the accelerometer range, so Z axis gravity drops to near zero and after that who knows WTF went wrong.  The errors do seem to correspond to the count of I2C errors I track, which I think means that if I want to pursue the FIFO direction, then I have to swap I2C for SPI.  And in fact, swapping to SPI might be the best next approach for the data register reading as well.  The only downside is that it requires the PCBs to be reworked.

I just wish PyPy had yielded what it claimed it could do regarding performance as then I wouldn’t have this new battle to fight.