It’s Autumn, and that means the MPU-9250 is running outside of its ideal temperature zone. Every year, what flies beautifully during the summer deteriorates once the temperature falls firmly into the teens. That’s what happened yesterday with ambient @ 15°C. What was a rock solid hover two days ago has become unstable. In previous years, I’ve been able to move indoors with one of my smaller models, but that’s not an option this year. Hermione is simply too big to fly in the house and she needs to be this big to incorporate all the sensors and RPi 3B. Additionally, GPS is virtually inaccessible indoors.

To make things worse, all summer Hermione has been running an IMU which is outside of the spec accuracy range: it reads gravity as 0.88g – her accuracy should be 3% not 12%.  She just about got away with it during the summer temperatures, but not now down in the teens.

Net? I’ve a new MPU-9250 on the way which will meet the spec; this should help somewhat, but at the same time, I think it’s fair to say that like the hedgehogs and squirrels round here, Hermione will be going into GPS hibernation, and only waking occasionally to test the sweep mapping of the indoors with very limited lateral flight movement.

As a result, I have uploaded the latest code to GitHub.

A look back at time

With the latest code, sampling the sensors at 250Hz, and doing motion processing at 50Hz, I’m capturing every sample: in an approximately 10.6 second flight measured by time.time() wrapped around the flight code, I’m seeing 530 motion processing runs and 2650 samples to 0.1% accuracy.
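As a rough cross-check of those numbers, here is a minimal sketch of the arithmetic. The function name and its defaults are mine for illustration and are not taken from the flight code:

```python
def expected_counts(flight_seconds, sample_hz=250, motion_hz=50):
    """Return (samples, motion runs) expected if nothing is dropped."""
    samples = round(flight_seconds * sample_hz)
    motion_runs = round(flight_seconds * motion_hz)
    return samples, motion_runs


if __name__ == "__main__":
    # A 10.6 second flight at 250Hz sampling and 50Hz motion processing.
    print(expected_counts(10.6))
```

If the counts reported by the flight code match these, no samples are being lost between the IMU and the motion processing.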

And yet she drifts a couple of meters in that 10 seconds.  For a while I thought that was it: an A2 couldn’t capture any more samples, and I had no choice but to break out more sensors, which would also be dependent on the A2.

But then, one of those thoughts occurred to me – yes, once again when I was dozing on the sofa to refresh my brain: the rotation rates from the gyro are a lot more critical now.  They are used to approximate the current angle of tilt based upon the previous one.  That then is used to rotate accelerometer readings to the earth frame, pass them through the gravity extraction filter, and then rotate the extracted gravity back to the quad frame to calculate better Euler angles.

And here’s the crux: previously I’d been using the MPU9250 digital low pass filters (dlpf) to filter out ‘noise’, but there was a price: the filtering causes a delay as well as a reduction in noise.  And that delay means the prediction of the current angle would always be wrong.

The gyros are not ‘noisy’ and they are factory calibrated. So I decided to turn the gyro dlpf to its minimum 188Hz with 1.9ms delay; and while I was at it, I thought it worth doing something similar with the accelerometer dlpf.

And sure enough, back to zero drift for the flight.

For anyone using the code I posted to GitHub yesterday, search for

cli_alpf = 3
cli_glpf = 2

and change them to

cli_alpf = 1
cli_glpf = 1

and see what happens!

Weird timing

Since I reconfigured the sampling rate to 500Hz from 1kHz, everything has been working much better. I’ve increased flight times to 11.5s (1.5s warm-up, 2s take-off, 6 seconds hover and 2 seconds landing), and also been able to move to within 2m of both Chloe and Phoebe without risk of them slashing one of my arteries. All great. And all because there’s now enough time between samples for motion processing to take place, and hence no samples are lost.

The only thing that’s odd though is timing: all of HoG’s timing is based on the sampling rate: elapsed time = number of samples / sampling rate.

In the current flights I get 5760 samples = 11.52 seconds elapsed time as expected.

But if I wrap the flight code with a couple of time.time()’s to measure the actual flight time, it comes out at 8.64 seconds, suggesting that the sampling rate isn’t 500Hz as configured, but more like (11.52 / 8.64) * 500Hz = 666.666666666666… to 32 digit accuracy! And that’s weird as it’s not possible to configure the IMU to sample at this rate – sampling rates are (1kHz / an integer).

The nearest thing I can find in the MPU-9250 data sheet section “4.4 Register 25 – Sample Rate Divider” is this:

Data should be sampled at or above sample rate; SMPLRT_DIV is only used for 1kHz internal sampling.

Perhaps by setting the ADC sample rate to 500Hz, the MPU-9250 also sets the data ready interrupt frequency to 666Hz to ensure the above rule is maintained.  There is no mention of 666Hz throughout the documentation; if my supposition is correct, then it’s a little poor to hide this fact to be inferred from a single sentence embedded in the depths of the very large register map document!

P.S. As well as affecting the length of flights, it also has an effect on velocities and angles which both use integration over time of the accelerometer and gyro respectively. The former isn’t significant in the grand scheme of things, but the latter probably is: merging angles from the accelerometer and integrated gyro will mean short term, the gyro angles are over-estimated. The ‘fix’ is simple, though again a bit of a hack as it needs to know through testing the difference in ADC sampling- and data ready interrupt frequencies.
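The arithmetic above, and the effect on integrated angles, can be sketched like this. It is only an illustration: the function names and the constant 10 degrees/s gyro rate are made up, and the real code integrates per-sample readings rather than a fixed list:

```python
def effective_sample_rate(sample_count, measured_seconds):
    """Samples actually delivered per second of wall-clock time."""
    return sample_count / measured_seconds


def integrate(rates, dt):
    """Rectangular integration of a list of rates with a fixed time step."""
    return sum(rate * dt for rate in rates)


CONFIGURED_HZ = 500.0
samples = 5760
measured_seconds = 8.64  # from time.time() wrapped around the flight code

actual_hz = effective_sample_rate(samples, measured_seconds)

# Hypothetical gyro data: a constant 10 degrees/s rotation for every sample.
gyro_rates = [10.0] * samples
angle_assumed = integrate(gyro_rates, 1.0 / CONFIGURED_HZ)
angle_corrected = integrate(gyro_rates, 1.0 / actual_hz)

if __name__ == "__main__":
    print("effective rate: %.2f Hz" % actual_hz)
    print("angle using configured dt: %.1f degrees" % angle_assumed)
    print("angle using measured dt: %.1f degrees" % angle_corrected)
```

With the configured time step the integrated angle comes out a third too large, which is the over-estimate described in the P.S.; scaling dt by the measured ratio is the hack that removes it.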

Phoebe’s HoG has no I2C errors

I’ve completed the build of Phoebe’s HoG. A couple of test flights (without motors powered up) showed no problems over I2C, so it’s a hardware problem with Chloe – one of her A+, PCB, or MPU-9250 breakout as currently Phoebe and Chloe are using the same disk image.

My best guess would be her MPU-9250 breakout, and I have replacements on the way, so we’ll see what happens after the swap of Chloe’s IMU.  Until they arrive, there’s probably no point flying Chloe again.

Yet more data sampling diagnostics

In this run, I’d swapped to using 50us interrupt pin pulses, driven by the MPU-9250.  I’ve also dropped the data sample rate to 500Hz and the motion processing down to 37Hz, so that would suggest about 13 – 14 samples batching into the motion processing code. Here’s the equivalent graph to yesterday.

500Hz 50us sample trigger

It didn’t do what I was expecting. Here’s a graph that makes it clearer what’s odd:

500Hz sampling, 37Hz motion

This is showing batches of 18-19 and 21-23 samples per 37Hz (27ms) processing cycle. So despite the sampling frequency being set to 500Hz, I’m still getting data at a 1kHz sample rate.
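To put numbers on those batch sizes, here is a small sketch converting samples per processing cycle into an implied data rate. It assumes the motion processing really does run at 37Hz; the function name is mine:

```python
MOTION_HZ = 37  # motion processing frequency used in this run


def implied_sample_rate(samples_per_batch, motion_hz=MOTION_HZ):
    """Data rate implied by the number of samples seen per motion cycle."""
    return samples_per_batch * motion_hz


if __name__ == "__main__":
    # 13.5 is what a true 500Hz data rate should deliver per cycle;
    # 12 is the periodic dip, 18 and 23 are the ends of the observed range.
    for batch in (12, 13.5, 18, 23):
        print(batch, implied_sample_rate(batch))
```

Under that assumption, the observed batches correspond to somewhere between roughly 670 and 850 samples per second, well above the configured 500Hz.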

The periodic spikes of only 12 samples per processing cycle are increasingly likely to be due to python underworld garbage collection. The original single threaded code didn’t generate garbage, but the single-threaded version of the multi-thread code might do – it’s the block of data that’s filled in the sampling code, and read by the motion processing code. Its lifetime is only milliseconds. Perhaps I’ll have a closer look at how to skip those data copies in the single threaded version. Easy in ‘C’ because I can pass pointers around for the sampling thread to copy data direct to the motion processing thread, but I’m not sure how / if that’s possible in python.

Finally, I still get an I2C miss or two in the warm-up code, and I still have no idea how this could happen. Perhaps garbage collection too?

I2C problems – progress report

I’ve been tinkering this morning with the hardware interrupt handling to try to track down how the I2C was being kicked to read data more than 500 times a second, when I’d reduced the sampling rate down to 500Hz.

So far, I’ve:

  • updated my custom GPIO code for yet higher performance on the hardware interrupt from the MPU-9250 by adding EPOLLONESHOT, which means epoll() only listens for the next rising edge event, and then disables the fd until it’s MODed – I’m assuming this should be faster than ADDing and DELeting each time and ensure we catch the next rising edge rather than any historic ones.
  • changed the hardware interrupt to open-drain and added a pull-up on the GPIO input pin used to detect the rising edge of the interrupt pulse
  • disabled the device tree and dropped back to the previous kernel device management
  • added the I2C baudrate=400000 back to /etc/modprobe.d/i2c_bcm2708.conf
  • added the I2C combined=1 to /etc/modprobe.d/i2c_bcm2708.conf but that caused garbage results and ultimately a kernel crash so I’ve backed that out.
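The EPOLLONESHOT behaviour in the first item can be sketched with Python’s select module. This is not my GPIO code: it uses a pipe as a stand-in for the GPIO value file so that it runs on any Linux box, and EPOLLIN where a sysfs GPIO edge would, I believe, be reported as EPOLLPRI:

```python
import os
import select


def ready_fds(ep, timeout):
    """Return the file descriptors epoll reports as ready within timeout."""
    return [fd for fd, _event in ep.poll(timeout)]


def oneshot_demo():
    """Show that a one-shot fd reports once, then stays quiet until MODed."""
    read_fd, write_fd = os.pipe()
    ep = select.epoll()
    flags = select.EPOLLIN | select.EPOLLONESHOT
    ep.register(read_fd, flags)

    os.write(write_fd, b"x")            # first 'interrupt'
    first = ready_fds(ep, 1.0)          # reported once

    os.write(write_fd, b"y")            # second 'interrupt'
    second = ready_fds(ep, 0.1)         # fd is disabled, nothing reported

    ep.modify(read_fd, flags)           # re-arm, the MOD step
    third = ready_fds(ep, 1.0)          # reported again

    result = (first == [read_fd], second, third == [read_fd])
    ep.close()
    os.close(read_fd)
    os.close(write_fd)
    return result


if __name__ == "__main__":
    print(oneshot_demo())
```

The point of the design is that one modify() call per sample re-arms the fd, instead of an unregister / register pair each time.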

I’m still getting I2C read errors, but at least the interrupt is now working in line with the baudrate: a reduced baudrate results in a reduced loop speed in the code as it’s taking longer to deliver data from the MPU-9250.

However, what I’d expect is that the interrupt would be in line with the sample rate, yet at 333Hz sample rate set for the MPU-9250, I’ve just got 799 sampling loops per second.  There’s something very odd about the interaction of the I2C baudrate, the data sample rate, and the data ready interrupt in the MPU-9250.

More digging anon.  Might be time to deploy the oscilloscope.

Diagnosing I2C problems (work in progress)

The symptoms of the I2C problems I’m seeing are python exceptions from the smbus library; when caught, as well as incrementing the ‘misses’ count, I have code that re-reads the sensors.  The trouble is, the I2C exceptions I’m catching are actually symptomatic of a bigger problem: duff data from reads which don’t trigger exceptions.  Just one set of duff data results in long lasting errors in angles, acceleration, gravity measurement etc etc etc.  Put simply, you can forget any chance of stability in a flight.
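The catch, count and re-read pattern described above looks roughly like this. It is a sketch rather than the actual flight code: read_sensors is a hypothetical callable standing in for the smbus block read, and max_attempts is an arbitrary limit I’ve added for the example:

```python
def read_with_retry(read_sensors, max_attempts=3):
    """Call read_sensors until it succeeds; return (data, misses).

    smbus reports a failed transfer as IOError, so each one caught here
    counts as a 'miss' and triggers a re-read.
    """
    misses = 0
    for _attempt in range(max_attempts):
        try:
            return read_sensors(), misses
        except IOError:
            misses += 1
    raise IOError("sensor read failed %d times in a row" % misses)


def make_flaky_reader(failures, data):
    """Build a fake reader that fails a set number of times, for testing."""
    state = {"remaining": failures}

    def reader():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise IOError("simulated I2C error")
        return data

    return reader


if __name__ == "__main__":
    fake = make_flaky_reader(2, [0, 0, 16384])
    print(read_with_retry(fake))
```

Note this only deals with reads that raise; it does nothing about duff data that arrives without an exception, which is the bigger problem.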

I’d solved the problem with Phoebe using the hardware interrupt to trigger the I2C read.  And it worked like a dream, and Chloë and Zoë inherited that solution.

But the problem came back with the birth of HoG (using the latest Raspbian distribution) and the swap to the MPU-9250.  So I’ve been trying things today to work out what’s gone wrong.

First step was to solder together a couple of pads on the MPU-9250 breakout to connect the pull-up resistor – this should be unnecessary as the Raspberry Pi I2C bus already has a pull-up and adding another could be detrimental. Anyway, there was no change in behaviour – still I2C misses.

Next step was to track down whether it’s the sensor, the code, the A+ or the kernel causing the problem. Zoë still lives in a half-alive state so I could swap SD cards around and see what happened:

  • HoG’s SD card @ kernel 3.18.7+ sees I2C read misses with MPU-9250 (HoG’s hardware)
  • Zoë’s SD card @ kernel 3.12.28+ sees I2C read misses with MPU-9250 (HoG’s hardware)
  • HoG’s SD card @ kernel 3.18.7+ sees zero I2C read misses with MPU-6050 (Zoë’s hardware)
  • Zoë’s SD card @ kernel 3.12.28+ sees zero I2C read misses with MPU-6050 (Zoë’s hardware)

That rules out the kernel version, the A+ hardware, and my software. It suggests a problem with the MPU-9250 breakout board, the PCB it’s connected to, the wiring or a long term I2C driver problem that only shows its ugly face with the MPU-9250.

My best guess is the I2C barometer on the same breakout; it uses the same bus, but as yet, I’ve not configured it in any way. Perhaps as a result, it’s being noisy and annoying all the other passengers on the bus?

Update: on a whim, I reduced the data rate from 1kHz to 500Hz to allow more time for the sensor data to be read.  What I saw was the data rate go up from just under 700Hz to over 700Hz.  That suggests the data ready interrupt isn’t working; data is only being made ready at 500Hz, so where are those other pulses coming from?

I had a look at my custom GPIO code, and a slightly controversial change I made; I backed this out and tried again. This time, no edges were detected at all!

Something in that area is very dodgy.  More tomorrow, no doubt.

News update

Sorry it’s been quiet here; thought I’d better update you on what’s going on. I’ve been doing test flights as the weather allows, trying to track down a problem. The symptoms have been inconsistency between flights, where a few are fantastic, but most end with a prop digging into the ground, and an alloy arm bent at the wrist.

For quite a long time since adding the hardware interrupt to announce to the code that new data was ready from the sensors, I’ve always received data I could trust from the MPU-6050.  With the build of HoG and the move to the MPU-9250, that’s no longer the case – I’m getting i2c read exceptions, yet the code in that area is untouched.

This might be related to the MPU-9250 registers, or it might be related to the latest distribution of Raspbian I installed on HoG.  Not clear yet.

I’ve also started looking at ‘Kitty’ – code using RaspiCam and picamera to identify the position of a laser pointer dot on the ground, so that the direction / distance of that dot could be turned into flight plan targets so that HoG would follow it.  Not a difficult piece of fun to add to HoG but while she’s not behaving, testing is restricted.

So it’s going to continue to be quiet here for a while until I’ve got I2C / GPIO hardware interrupts back to the standard they were.


Maiden voyage

HoG lost her flight virginity today, and she lost it with enthusiasm – a little too much to be honest. Three second flight plan – one second takeoff to 0.5m, one second hover and one second descent. All maiden flights are a huge gamble: in HoG’s case, she had

  • new arms
  • new props
  • new motors
  • new frame
  • new calibration method
  • new butterworth filter parameters.

Given that, I’d say her performance was surprisingly good!

She took off vertically from sloping ground. That alone is nearly an unqualified success. For it to have been a complete success though, she would have stopped at 0.5m off the ground and hovered. Instead, she whizzed up to 3m and then I hit the kill switch even before she had the chance to try to hover.

A few lessons learnt even from such a short flight though:

  • zero g calibration seems to work, but it needs doing for the X, Y and Z axes
  • having dlpf set to 180Hz rather than the normal 20Hz probably wasn’t a smart move regardless of how good the Butterworth might be
  • aluminium arms bend and don’t straighten when they hit the ground at over 7.5 m/s!

New arms are on the way and will arrive tomorrow, allowing me to do the zero g calibration of the Z axis also!

But what’s Zero-G calibration, and how do you do it without going into space?

Historically, I’ve been jumping through hoops trying to get sensor calibration stable, controlling the temperature to 40°C while rotating her in the calibration cube to measure ±g in all three axes to get gains and offsets. Yet despite all that effort, the sensors, and hence Zoë, still drifted; only modestly over time, but enough that she couldn’t fly in the back garden for more than a few seconds without hitting a wall.

The move to the MPU-9250 for HoG from Zoë’s MPU-6050 IMU initially seemed a retrograde step – it didn’t seem to be able to measure absolute temperature, only the difference from when the chip was powered up. And that meant the 40°C calibration could no longer work. Lots and lots of reading the specs yielded nothing initially.

But in passing I’d spotted some new registers for storing accelerometer offsets to allow them to be included in the IMU motion processing.  That suggested there was a way to get valid offsets.  Additionally, again in passing, I’d spotted a couple of Zero-G specifications: critically that the Zero-G level change against temperature was only ±1.5mg / ºC.  That means an offset measured in a Zero-G environment hardly drifts against temperature.   And a Zero-G environment doesn’t mean going up to space – it simply means reading the X and Y axis values when the Z-axis is aligned with gravity.  So with HoG sat on the floor, X and Y offsets are read, and then holding her against a wall gives the Z offset.  So calibration and updating the code takes only 5 minutes and requires no special equipment.
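A minimal sketch of that calibration, assuming the offsets are simply the average of a batch of readings taken in each pose. The function names and the sample values are made up for illustration; the real code reads the accelerometer over I2C:

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / float(len(values))


def zero_g_offsets(floor_samples, wall_samples):
    """Return (x, y, z) accelerometer offsets in g.

    floor_samples: (ax, ay, az) readings with her sat on the floor, so
                   gravity is along Z and X and Y should read zero.
    wall_samples:  (ax, ay, az) readings with her held against a wall,
                   so Z is horizontal and should read zero.
    """
    x_offset = mean([sample[0] for sample in floor_samples])
    y_offset = mean([sample[1] for sample in floor_samples])
    z_offset = mean([sample[2] for sample in wall_samples])
    return x_offset, y_offset, z_offset


if __name__ == "__main__":
    # Hypothetical readings in g, two per pose to keep the example short.
    floor = [(0.02, -0.01, 1.00), (0.04, -0.03, 1.01)]
    wall = [(1.00, 0.00, 0.05), (0.99, 0.01, 0.07)]
    print(zero_g_offsets(floor, wall))
```

The resulting offsets are then subtracted from every subsequent reading on the matching axis.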

Delight and despair at the same time: delight that I now had a way forwards with the MPU-9250 (and it would work with the MPU-6050 also), but despair at the time and money I’d spent trying to sort out calibration against temperature.

MPU-9250 – first impressions

A more careful glance at the MPU-9250 suggests actually most of the MPU-6050 code works fine; most of the new function is in new registers or unused bits in existing registers.

However, there are some changes for which I need to work out how to update the code:

  • my main processing loop frequency has dropped to 300Hz from the 800Hz I’ve achieved with the MPU-6050.  The spec does describe changes in this area, but it’s not at all clear. UPDATE: The problem actually lies with the new device tree mechanism added to the Raspberry Pi distribution in 2015/1/31.  To get the bus speed up to 400kbps, you need to add into /boot/config.txt:
  • the offset / gains provided for the MPU-6050 to map the 16 bit temperature sensor reading to ºC have changed, and the spec does not provide a revised version; as an example it thinks it’s 40ºC in my house instead of the 22ºC my thermometer says.  An initial analysis suggests (offset, gain) of ~(55, 200), whereas the MPU-6050 values are (36.53, 340):
     ºC = sensor / gain + offset
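Here is a sketch of the temperature mapping, dividing the raw reading by the gain as the MPU-6050 data sheet does for its (36.53, 340) values, plus one way the new offset and gain could be found from two readings taken at known temperatures. The two-point fit and the function names are my assumptions, not something from the spec:

```python
def raw_to_celsius(raw, offset=36.53, gain=340.0):
    """Convert a signed 16 bit temperature reading to degrees C.

    Defaults are the MPU-6050 values; the MPU-9250 needs different ones.
    """
    return raw / gain + offset


def fit_offset_gain(raw_a, celsius_a, raw_b, celsius_b):
    """Solve for (offset, gain) from two readings at known temperatures."""
    gain = (raw_b - raw_a) / (celsius_b - celsius_a)
    offset = celsius_a - raw_a / gain
    return offset, gain


if __name__ == "__main__":
    print(raw_to_celsius(0))
    # Round trip: two points generated with the MPU-6050 values should
    # give back approximately (36.53, 340).
    print(fit_offset_gain(0, 36.53, 3400, 46.53))
```

The further apart the two calibration temperatures are, the less a thermometer error affects the fitted gain.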

I can find the offset and gain values for the temperature mapping through experimentation; I’m a lot more worried about the processing speed – I think I’m going to have to read that section a lot more carefully. Luckily, the local shop had printer paper in stock!