Getting more than I’d bargained for.

Both Phoebe and Zoe now rocket up to the sky at 3ms⁻¹ or more when they should be climbing at 0.3ms⁻¹.

The main relevant change is that both were flying with alpf 0 – 460Hz.  Reducing this to 2 (92Hz) with Phoebe yesterday got the vertical climb rate under some level of control, but the horizontal drift was back.  I’m assuming Zoe will show the improved behaviour at alpf 2 also.

Clearly there’s something in the accelerometer readings that’s integrating (after gravity is removed) to lower velocities than expected.  Then it dawned on me: sampling at 1kHz with the minimal low pass filter set to 0 means that >2g spikes could get picked up; with the range of the sensor set to ±2g, the value could overflow, resulting in the <0g values I’d been seeing.
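To make the overflow idea concrete, here is a minimal sketch of what a spike just beyond the range would look like if the 16-bit output wrapped. It assumes the value wraps like two's complement rather than saturating; that is a guess for illustration, not something confirmed by the datasheet. The function names are made up for this example.

```python
COUNTS_PER_G = 16384  # ±2g spread over a signed 16-bit output


def wrap_int16(raw_counts):
    """Wrap an integer into the signed 16-bit range, as two's complement overflow would."""
    return ((raw_counts + 32768) % 65536) - 32768


def counts_to_g(counts):
    """Convert raw accelerometer counts to g at the ±2g range."""
    return counts / float(COUNTS_PER_G)


# Hypothetical 2.1g spike: too big for the ±2g range.
spike_counts = int(2.1 * COUNTS_PER_G)
wrapped = wrap_int16(spike_counts)
print("%d counts wraps to %d counts (%.2fg)" % (spike_counts, wrapped, counts_to_g(wrapped)))
```

In-range values pass through untouched; only values beyond ±32767 counts change sign.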

A very quick test with Phoebe in a howling wind proved me right.  I tried it with Zoe and the result was better but a long way from perfect so more work required there, including fixing her broken arm before tomorrow’s engineering conference.  Oops!

JIT Jamboree Jubilation

Zoe at 1kHz sampling, 8s flight from Andy Baker on Vimeo.

If you are interested in what triggered the transformation, have a look at these comments.

The net result is that I’m getting perfect data reads at 1kHz sampling and alpf set to 2.  And that’s as good as the sensor can possibly provide.

So Zoe is good enough for the Cotswold Jam on Saturday (sold out, sorry), and my employer’s Engineering Conference the following week, and hopefully the Raspberry Pi Birthday Party in early March.

There’s probably some PID tuning to be done that might stop the low-frequency wobbles due to the gyro PID I gain being a bit too enthusiastic.  That would then curtail the drift too.

I’ll post more videos if that turns out to be true.

The benefits of having a Brazilian…

comment on the blog: courtesy of Gustavo yesterday, I’m now able to empty the FIFO a batch at a time with a single I2C read (12 bytes) rather than 12 reads of 1 byte.  That’s made my code a lot faster, buying even more time to check other sensors etc.

It’s all to do with i2c.readList (i2c.smbus.read_i2c_block_data): I’d used readList when I was accessing the data registers directly; reading 14 bytes from register 59 actually gave me a single byte each from registers 59 to 72 – the full set of sensor data I needed.  As a result, I’d ruled out using readList at register 116 (FIFO) getting me 12 bytes per read from just that single register; but that’s what Gustavo does and sure enough it worked for me too.
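Once the 12 bytes arrive as one block, they still need turning into six signed readings. A minimal sketch of that step, assuming the FIFO holds only accelerometer and gyro data, high byte first, in the order ax, ay, az, gx, gy, gz (the order the log below prints them in). The function name and the sample bytes are made up for illustration; the I2C read itself is left out so the snippet runs without hardware.

```python
import struct


def unpack_fifo_batch(batch):
    """Convert one 12-byte FIFO batch into (ax, ay, az, gx, gy, gz) signed 16-bit values."""
    if len(batch) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(batch))
    # ">6h" = six big-endian signed shorts: high byte then low byte per reading.
    return struct.unpack(">6h", bytes(bytearray(batch)))


# Hypothetical batch as a list of ints, the shape a block read returns.
sample = [0x00, 0x00, 0x00, 0x00, 0x40, 0x00,
          0xFF, 0xFF, 0x00, 0x01, 0x80, 0x00]
ax, ay, az, gx, gy, gz = unpack_fifo_batch(sample)
print(ax, ay, az, gx, gy, gz)
```

Here az comes out as 16384, i.e. 1g at the ±2g range.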

Sadly a quick flight this morning still showed the same negative G problems – have a look at the az values – these should read roughly 16384 at hover; these aren’t all the samples, just the ones showing negative G during the 8s flight:

[WARNING] (MainThread) __init__ 1397, Zoe is flying.
[WARNING] (MainThread) __init__ 312, SRD:, 1
[WARNING] (MainThread) __init__ 411, IMU core temp: 19.888756
[WARNING] (MainThread) fly 1522, fly = False, flight plan = , calibrate_0g = 0, hover_target = 150, shoot_video = False, vvp_gain = 400.000000, vvi_gain = 200.000000, vvd_gain= 0.000000, hvp_gain = 1.500000, hvi_gain = 0.100000, hvd_gain = 0.000000, prp_gain = 110.000000, pri_gain = 11.000000, prd_gain = 0.000000, rrp_gain = 90.000000, rri_gain = 9.000000, rrd_gain = 0.000000, yrp_gain = 80.000000, yri_gain = 8.000000, yrd_gain = 0.000000, test_case = 1, rtf_period = 1.500000, tau = 5.000000, diagnostics = False
[WARNING] (MainThread) load0gCalibration 551, 0g Offsets:, 0.000000, 0.000000, 0.000000
[WARNING] (MainThread) fly 1522, fly = True, flight plan = fp.csv, calibrate_0g = 0, hover_target = 380, shoot_video = False, vvp_gain = 400.000000, vvi_gain = 200.000000, vvd_gain= 0.000000, hvp_gain = 1.500000, hvi_gain = 0.100000, hvd_gain = 0.000000, prp_gain = 110.000000, pri_gain = 11.000000, prd_gain = 0.000000, rrp_gain = 90.000000, rri_gain = 9.000000, rrd_gain = 0.000000, yrp_gain = 80.000000, yri_gain = 8.000000, yrd_gain = 0.000000, test_case = 0, rtf_period = 1.500000, tau = 5.000000, diagnostics = False
[WARNING] (MainThread) load0gCalibration 551, 0g Offsets:, 0.000000, 0.000000, 0.000000
[WARNING] (MainThread) fly 1661, pitch -1.718589, roll -2.401715
[WARNING] (MainThread) fly 1662, egx 0.000000, egy 0.000000, egz 0.976289
[WARNING] (MainThread) fly 1739, 0 data errors; 0 i2c errors; 0 2g hits
[CRITICAL] (MainThread) readFIFO 469, ax: -20304; ay: 8588; az: -556; gx: 1501; gy: 1042; gz: 1181!
[CRITICAL] (MainThread) readFIFO 469, ax: 6156; ay: -784; az: -552; gx: -379; gy: 454; gz: -75!
[CRITICAL] (MainThread) readFIFO 469, ax: 1580; ay: -2772; az: -564; gx: -2126; gy: 304; gz: -339!
[CRITICAL] (MainThread) readFIFO 469, ax: -8784; ay: -1812; az: -608; gx: -2906; gy: 239; gz: -864!
[CRITICAL] (MainThread) readFIFO 469, ax: 5016; ay: 1420; az: -636; gx: -1649; gy: 2227; gz: 1277!
[CRITICAL] (MainThread) readFIFO 469, ax: 9652; ay: -1816; az: -732; gx: -1665; gy: 1730; gz: 751!
[CRITICAL] (MainThread) readFIFO 469, ax: -84; ay: 2328; az: -2392; gx: -2358; gy: 1965; gz: 926!
[CRITICAL] (MainThread) readFIFO 469, ax: -5884; ay: -1184; az: -108; gx: -75; gy: 1438; gz: 351!
[CRITICAL] (MainThread) readFIFO 469, ax: -2180; ay: 7628; az: -2316; gx: 493; gy: 2743; gz: 384!
[CRITICAL] (MainThread) readFIFO 469, ax: -152; ay: 5336; az: -2208; gx: 407; gy: -860; gz: -703!
[CRITICAL] (MainThread) readFIFO 469, ax: 21212; ay: -2948; az: -392; gx: -175; gy: -1606; gz: -538!
[CRITICAL] (MainThread) readFIFO 469, ax: -7704; ay: -5088; az: -1184; gx: -899; gy: 2470; gz: 10!
[CRITICAL] (MainThread) readFIFO 469, ax: 4904; ay: -2688; az: -676; gx: -1409; gy: 2067; gz: 649!
[CRITICAL] (MainThread) readFIFO 469, ax: 11732; ay: 3232; az: -976; gx: -1023; gy: 1170; gz: 1019!
[CRITICAL] (MainThread) readFIFO 469, ax: 4920; ay: 2740; az: -3212; gx: -552; gy: 958; gz: 667!
[CRITICAL] (MainThread) readFIFO 469, ax: -10364; ay: 5020; az: -2744; gx: -553; gy: 839; gz: 310!
[CRITICAL] (MainThread) readFIFO 469, ax: 4104; ay: -1516; az: -2756; gx: -693; gy: 46; gz: 654!
[CRITICAL] (MainThread) readFIFO 469, ax: -10048; ay: 5772; az: -20; gx: -686; gy: 283; gz: 1207!
[CRITICAL] (MainThread) readFIFO 469, ax: -6844; ay: 5884; az: -1868; gx: -424; gy: 974; gz: 703!
[CRITICAL] (MainThread) readFIFO 469, ax: 6428; ay: 1472; az: -368; gx: -394; gy: 496; gz: 582!
[CRITICAL] (MainThread) readFIFO 469, ax: -3064; ay: 9312; az: -1852; gx: -575; gy: 2588; gz: 801!
[CRITICAL] (MainThread) readFIFO 469, ax: -840; ay: 6392; az: -1944; gx: -366; gy: 1737; gz: 149!
[CRITICAL] (MainThread) readFIFO 469, ax: -2468; ay: 10792; az: -1580; gx: -737; gy: -727; gz: -65!
[CRITICAL] (MainThread) readFIFO 469, ax: -4448; ay: 2652; az: -3624; gx: -974; gy: 639; gz: -111!
[CRITICAL] (MainThread) readFIFO 469, ax: 4604; ay: 7728; az: -4000; gx: -560; gy: -61; gz: -529!
[CRITICAL] (MainThread) readFIFO 469, ax: -23048; ay: 9136; az: -4992; gx: 1366; gy: -524; gz: -5!
[CRITICAL] (MainThread) readFIFO 469, ax: -3860; ay: -1384; az: -488; gx: 1241; gy: -1463; gz: 293!
[CRITICAL] (MainThread) readFIFO 469, ax: -12488; ay: 6820; az: -516; gx: 2382; gy: -3340; gz: -664!
[CRITICAL] (MainThread) readFIFO 469, ax: -2024; ay: -7380; az: -40; gx: -1119; gy: -831; gz: 76!
[CRITICAL] (MainThread) readFIFO 469, ax: -3676; ay: 1472; az: -1756; gx: -687; gy: 6099; gz: 2686!
[WARNING] (MainThread) fly 2068, IMU core temp: 18.121548
[WARNING] (MainThread) fly 2069, motion_loops 934
[WARNING] (MainThread) fly 2070, sampling_loops 9522
[WARNING] (MainThread) fly 2072, 30 data errors; 0 i2c errors; 0 2g hits

That’s 30 errors detected out of 9522 samples, so not terrible, but not perfect.  I was flying her at her new high speed sampling rate of 1kHz and with the accelerometer low pass filter set to 0 (460Hz).  Next step is to turn off the protective code that skips these negative G values and have another go with alpf 2 (92Hz) to see what happens. Certainly the other errors of -32768 in the gyro have now gone, so it’s worth a try.
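For anyone wanting to repeat the count from their own logs, a rough sketch that picks the az value out of each readFIFO line and reports how many were negative as a percentage of all samples. The regular expression assumes the "az: <number>;" layout shown above; the function name is made up.

```python
import re

AZ_PATTERN = re.compile(r"az:\s*(-?\d+)")


def negative_g_stats(log_lines, total_samples):
    """Return (count of lines with az < 0, that count as a percentage of total_samples)."""
    negative = 0
    for line in log_lines:
        match = AZ_PATTERN.search(line)
        if match and int(match.group(1)) < 0:
            negative += 1
    return negative, 100.0 * negative / total_samples


log = [
    "[CRITICAL] (MainThread) readFIFO 469, ax: -20304; ay: 8588; az: -556; gx: 1501; gy: 1042; gz: 1181!",
    "[WARNING] (MainThread) fly 2070, sampling_loops 9522",
]
count, percent = negative_g_stats(log, 9522)
print("%d negative G samples = %.3f%% of the flight" % (count, percent))
```

Fed all 30 lines from the flight above, the percentage works out at roughly 0.3% of the 9522 samples.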

Exploring negative G ‘errors’ from the IMU

I’ve been asking myself various questions and checking the answers through test flights:

  • Do the negative G values show up in passive flights?  If the motors aren’t powered up, negative G values are not seen.
  • What about alpf?  With alpf set to 0 or 1 (460 or 184 Hz accelerometer low pass filter), I see the negative G ‘error’, but set to 2 (92Hz) or greater it doesn’t happen.
  • What are the values of the other accelerometer and gyro readings when this happens with alpf set to 0?  These are all believable: ax = -4214, ay = -696, az = -270, gx = 2922, gy = 870, gz = 1531 – no magic hex numbers in there suggesting problems.
  • What about sampling frequency?  With alpf set to 0, but with the sampling frequency dropped from 500Hz to 200Hz (SMPLRT_DIV 1 => 4), there are still negative G ‘errors’.
  • What happens if the range is set to ±4g instead of ±2g with alpf set to 0?  The problem remains – there are negative G ‘errors’.

To me, that suggests that somehow, the negative G readings are not an ‘error’ but are real.  They are filtered out by the lower low pass filter frequencies.  The MPU-9250 is working perfectly.  I’ll continue running with sampling frequency of 500Hz, and alpf of 0, and with the diagnostic / protective code disabled.

This then points the blame for head-butting the ceiling towards the ESCs – it’s as though they have latched at maximum power.  The PWM signals I’m sending range from 1ms to 2ms inclusive, and I’m now wondering whether sometimes, the PWM output does hit 2ms and latches the ESC; I’ve turned down the maximum to 1.999ms to see what happens – it’s a hunch based on a long lost memory of something I’d seen or read.
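The cap itself is a one-liner. A sketch working in microseconds, with names invented for this example rather than taken from the flight code:

```python
MIN_PULSE_US = 1000  # 1ms: motor stopped
MAX_PULSE_US = 1999  # just short of 2ms, in case exactly 2ms latches the ESC


def clamp_pulse(pulse_us):
    """Limit a requested PWM pulse width to the range the ESC is allowed to see."""
    return max(MIN_PULSE_US, min(MAX_PULSE_US, pulse_us))


print(clamp_pulse(2000))  # request for full power gets trimmed
```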

The sound of silence

Sorry it’s quieter than ever – the weather is against me testing the GERMS filter code, so I’ve been twiddling my thumbs.  I have found some odd behaviour from the IMU that’s worth sharing.

The SMPLRT_DIV register sets how often the data registers are updated, and therefore how often the data ready interrupt goes high.  Normally the ADC sampling frequency is 1kHz, and the data registers are updated at a reduced frequency defined thus:

1kHz / (1 + SMPLRT_DIV)

Setting SMPLRT_DIV to 0 should give data ready interrupt (DRI) at 1kHz; 1 should produce 500Hz, 2 should give 333Hz etc etc.

But what I’m seeing is that with SMPLRT_DIV <= 3, the DRI happens at 250Hz.  SMPLRT_DIV >= 3 work fine producing 250Hz (3), 200Hz (4), 166Hz (5), 143Hz (6), 125Hz (7), 111Hz (8), 100Hz (9) etc etc etc.  Because I’m using the DRI as the clock for the code, that means 250Hz is the fastest I can use.  That’s no problem really – that’s plenty fast enough allowing 4ms per sample during which I can run the motion processing, but it’s not a limitation that’s been documented in the specs.
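For reference, the expected rates from the formula above can be tabulated with a few lines. This only reproduces the documented arithmetic, not the 250Hz ceiling I’m actually seeing; the function name is made up.

```python
ADC_RATE_HZ = 1000.0  # internal sampling rate when SMPLRT_DIV applies


def data_rate_hz(smplrt_div):
    """Expected data register update rate for an 8-bit SMPLRT_DIV value."""
    if not 0 <= smplrt_div <= 255:
        raise ValueError("SMPLRT_DIV is an 8-bit register")
    return ADC_RATE_HZ / (1 + smplrt_div)


for div in range(10):
    print("SMPLRT_DIV %d -> %.1fHz" % (div, data_rate_hz(div)))
```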

In passing I’ve fixed a rounding error bug in the pre-flight RTF period.  Not important really, but actually fixing it was what exposed the DRI frequency limit.

I’ll upload to GitHub in the next day or so once I’ve had a chance to check I’ve not broken anything.

One step beyond…

lies Madness.  With such good results from reducing the dlpf filters and the samples per motion, there was one more step: set the gyro dlpf to 0; but that has implications.  Setting the gyro dlpf to 0 sets the filter frequency to 250Hz – a good thing, and the delay to 0.97ms – an even better thing.  But it also changes the sampling rate of the ADC to 8kHz from the standard 1kHz according to the spec – still fair enough – just a few extra lines of code required…

So what should have resulted was still data samples at 250Hz, but with a lag < 1ms – even higher precision forecasting of Euler angles based on gyro results.

But what I saw was the same 14s ‘flight’ according to the number of sensor reads at 250Hz actually took 19.95s according to time.time().  That’s just plain odd – it suggests the sampling rate was actually ≈175Hz.  This corresponds to a sample rate divider of  about 48 instead of the 32 it should be (8000/166.66 vs. 8000/250).

I tried to reverse engineer what divider should be used to get 250Hz sampling, but regardless of the value (I tried 19, 31 and 47), I always got 20s elapsed time flights.  Most odd – I think I’ll just shelve gyro dlpf 0 for now.


Another step of fine tuning

With the IMU sample rate set to 250 Hz (a 4ms gap between samples), there should be enough time to run motion processing for each sample without missing any; based on experience, motion processing takes 2 to 3ms. I’m currently averaging 5 samples before invoking motion processing (50Hz updates to the props).  Today I did some testing setting this to 2 and 1 (i.e. 125Hz and 250Hz updates to the props).

Setting 5 samples per motion processing gives 3430 samples @ 250Hz = 13.720 seconds of flight. time.time() says it took 13.77s = 99.63% of samples were caught.

Setting 2 samples per motion processing gave 3503 samples @ 250Hz = 14.012 seconds of flight. time.time() says it took 14.08s = 99.5% of samples were caught.

Setting 1 sample per motion processing gave 3502 samples @ 250Hz = 14.008 seconds of flight time. time.time() says it took 14.22s = 98.5% of samples were caught.

I’m guessing the slight decline is due to Linux scheduling; I chose to opt for 2 samples per motion processing, which updates the props at 125Hz or every 8ms.
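The percentages above are just sample-count time divided by wall-clock time. A small sketch that reproduces them from the three runs, with an invented function name:

```python
def capture_fraction(samples, rate_hz, wall_seconds):
    """Fraction of the wall-clock flight time accounted for by the samples actually read."""
    return (samples / float(rate_hz)) / wall_seconds


# samples per motion processing -> (samples read, time.time() elapsed seconds)
runs = {5: (3430, 13.77), 2: (3503, 14.08), 1: (3502, 14.22)}

for per_motion in sorted(runs, reverse=True):
    samples, wall = runs[per_motion]
    print("%d per motion: %.2f%% caught" % (per_motion, 100 * capture_fraction(samples, 250, wall)))
```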

And boy was the flight much smoother by having the more frequent, smaller increments to the props.

And I reckoned with these faster, less-lag updates to the motors, I might be able to trust the gyro readings for longer, so I changed the complementary filter tau (incrementally) to 5s from its previous 0.5s.
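For anyone unfamiliar with what tau does, here is the textbook form of a complementary filter. This is a generic sketch and not necessarily how the flight code writes it; the function and variable names are made up. A larger tau means the integrated gyro angle is trusted for longer before the accelerometer angle pulls it back.

```python
def filter_alpha(tau, dt):
    """Weight given to the gyro path for time constant tau and sample period dt (seconds)."""
    return tau / (tau + dt)


def complementary_filter(prev_angle, gyro_rate, accel_angle, dt, tau):
    """Blend the integrated gyro angle (short term) with the accelerometer angle (long term)."""
    alpha = filter_alpha(tau, dt)
    return alpha * (prev_angle + gyro_rate * dt) + (1.0 - alpha) * accel_angle


dt = 0.008  # 8ms between motion processing runs at 125Hz
for tau in (0.5, 5.0):
    print("tau %.1fs -> gyro weight %.4f" % (tau, filter_alpha(tau, dt)))
```

At 8ms updates, moving tau from 0.5s to 5s shifts the gyro weighting from about 0.984 to about 0.998 per update.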

The sharp sighted of you may have already seen the results in the numbers: I’ve now breached my 10s flight time target by 2 seconds (the other couple of seconds is warm-up time), with the same level of drift I could only get in 6 second flights a few weeks ago. That 10s target was what I’d set myself to then look at feeding other motion processing sensors based upon the RaspiCam, either laser (Kitty) or MPEG macro-block (Kitty++) tracking.

Only down side – it’s rained nearly all day, so no chance of capturing one of these long flights on video. Perhaps tomorrow?

epoll and IMU interrupt interaction

epoll doesn’t differentiate between rising and falling edges – an edge is just an edge. The RPi.GPIO option to specify edge trigger is pointless given epoll doesn’t support it.  The RPi.GPIO code has code that calls epoll_wait() twice, thus reading the rising and falling edge when a button is pushed by a human.  Perfectly fine solution for “wait for button, then flash LED” type problems.

But for the IMU, the IMU interrupts and the epoll code need to be in sync about working together.  So I changed both the HoG python code and my GPIO ‘C’ code.

  • EPOLLONESHOT detects an edge and then stops watching meaning there’s no backlog of interrupts building up while the python code is processing the corresponding sensor data.
  • Don’t call epoll_wait() twice to capture both rising and falling edge of a button – it will block permanently second time round with EPOLLONESHOT
  • The MPU6050 is started prior to enabling epoll otherwise epoll blocks waiting for interrupts that IMU has not been configured to send yet
  • Probably better to ask the IMU to clear the interrupt only once the data registers have been read – this then means epoll will not be watching for interrupts at the point there is a falling edge.
  • Set pull down on the interrupt GPIO pin.
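The one-shot behaviour in the list above can be tried without any hardware. This sketch uses Python’s select.epoll on a pipe as a stand-in for the GPIO interrupt pin; the real change lives in the ‘C’ GPIO code and watches a GPIO file descriptor, so treat this purely as an illustration. It is Linux only, and the function name is made up.

```python
import os
import select


def oneshot_demo():
    """Show that EPOLLONESHOT reports one event, then stays quiet until re-armed."""
    read_fd, write_fd = os.pipe()
    ep = select.epoll()
    flags = select.EPOLLIN | select.EPOLLONESHOT
    ep.register(read_fd, flags)
    try:
        os.write(write_fd, b"1")        # first 'interrupt'
        first = len(ep.poll(0.1))       # reported; watching now stops
        os.write(write_fd, b"2")        # second 'interrupt' while still processing
        second = len(ep.poll(0.1))      # not reported: no backlog builds up
        os.read(read_fd, 2)             # 'read the sensor data'
        os.write(write_fd, b"3")        # next 'interrupt'
        ep.modify(read_fd, flags)       # re-arm the one-shot watch
        third = len(ep.poll(0.1))       # reported again
    finally:
        ep.close()
        os.close(read_fd)
        os.close(write_fd)
    return first, second, third


print(oneshot_demo())
```

The middle poll returning nothing is the whole point: while the data is being processed, nothing queues up.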

This is what the ‘scope showed as a result:

Latching ONESHOT

The rising edge is triggered by the IMU when new data is ready to be read.  The falling edge is when the python code reads that data over I2C causing the IMU to drop the interrupt pin.

The screen spans 20ms, with a pulse every 2ms.  Hence there should be 10 rising edges, but if you count, there are only 9.  The wide pulse in the middle took more than 2ms between raising the interrupt and the data being read: a sample was lost.  I didn’t have to take lots of screen shots to capture this; this was the first screen shot I took.  The code is set to do motion processing every 5 reads, and I presume that’s the cause of the longer pulse; capturing a sample and doing motion processing takes more than 2ms.  Any screen shot will contain at least one wider pulse like this.

Overall, that’s pretty good news: the IMU interrupt, and the GPIO and HoG code are working well together.  I clearly need to reduce the time motion processing takes – and it looks like the reduction is relatively small.  Also that explains the difference in flight times measured from the interrupt count and from time.time(): the HoG code reads only 5 out of 6 samples, so code relying on interrupt timing appears to take less time than it actually does (5 x 2ms < 12ms).

P.S. I’m assuming the mid-width pulses are due to Linux scheduling of my code.  That’s no problem as it’s not causing loss of samples – only the motion processing pulse is taking more than 2ms.

You’re ‘aving a laugh

I separated the data ready interrupt frequency (666Hz) from the IMU ADC sampling rate (500Hz) with the intention that this might provide better timing accuracy and therefore better angles and velocities.

In flight, timing was better, angles were probably better, but unexpectedly, vertical velocity was completely shot – both Phoebe and Chloe rose to about 4m, and their height increased during hover, and descended slowly during descent before dropping out of the sky from 3m on landing.  Phoebe partially snapped off her USB port, and after an hour trying to desolder and replace it, I gave up and ordered a new A+ arriving today.

In addition, it seems now like the 666Hz was just someone having a laugh* at my expense – the one flight I got yesterday was capturing data ready interrupts at 677Hz. This could be

  • 500Hz plus noise or
  • 1kHz with missed samples
  • a code bug

I checked the data ready interrupt frequency with my ‘scope when the new A+ arrived with the ADC sampling frequency set to 500Hz:

500Hz 50us data ready interrupt

Given the ‘scope is saying the hardware interrupt is running at 500Hz, why does the code think it’s running at > 660Hz?  Clearly something in my customized GPIO library is causing double counting – probably not flushing the epoll fd after a read.

*Unrelated to the Quadcopter, but related to someone having a laugh at my expense, I ordered 2 new monitors yesterday to replace my work and home computers – more screen space for the large number of open apps at work and better full screen viewing of files, and full sRGB colour space rendering for editing photos at home.  I ordered them as a huge motivation to stop smoking.  I can’t afford them unless I give up smoking which costs me £300 a month.  Shortly after ordering, I was sorting out my work PC in preparation for the new monitor, and while the existing monitor was off its vesa wall-mount it smashed to the floor and would only show a white screen.  So stopping smoking is now mandatory, not optional!

Weird timing

Since I reconfigured the sampling rate to 500Hz from 1kHz, everything has been working much better.  I’ve increased flight times to 11.5s (1.5s warm-up, 2s take-off, 6 seconds hover and 2 seconds landing), and also been able to move to within 2m of both Chloe and Phoebe without risk of them slashing one of my arteries.  All great.  And all because there’s now enough time between samples for motion processing to take place, and hence no samples are lost.

The only thing that’s odd though is timing: all of HoG’s timing is based on the sampling rate: elapsed time = number of samples / sampling rate.

In the current flights I get 5760 samples = 11.52 seconds elapsed time as expected.

But if I wrap the flight code with a couple of time.time()’s to measure the actual flight time, it comes out at 8.64 seconds, suggesting that the sampling rate isn’t 500Hz as configured, but more like (11.52 / 8.64) * 500Hz = 666.666666666666… to 32 digit accuracy!  And that’s weird as it’s not possible to configure the IMU to sample at this rate – sampling rates are (1kHz / an integer).
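The same figure drops out more directly as samples divided by wall-clock time. A tiny sketch of the check, with an invented function name:

```python
def implied_rate_hz(samples, wall_seconds):
    """Sampling rate implied by the number of samples read in a measured wall-clock time."""
    return samples / float(wall_seconds)


samples = 5760            # counted by the flight code
assumed_rate_hz = 500.0   # what the IMU was configured for
wall_seconds = 8.64       # measured with time.time()

print("sample-count time: %.2fs" % (samples / assumed_rate_hz))
print("implied rate: %.2fHz" % implied_rate_hz(samples, wall_seconds))
```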

The nearest thing I can find in the MPU-9250 data sheet section “4.4 Register 25 – Sample Rate Divider” is this:

Data should be sampled at or above sample rate; SMPLRT_DIV is only used for 1kHz internal sampling.

Perhaps by setting the ADC sample rate to 500Hz, the MPU-9250 also sets the data ready interrupt frequency to 666Hz to ensure the above rule is maintained.  There is no mention of 666Hz throughout the documentation; if my supposition is correct, then it’s a little poor to hide this fact to be inferred from a single sentence embedded in the depths of the very large register map document!

P.S. As well as affecting the length of flights, it also has an effect on velocities and angles which both use integration over time of the accelerometer and gyro respectively. The former isn’t significant in the grand scheme of things, but the latter probably is: merging angles from the accelerometer and integrated gyro will mean short term, the gyro angles are over-estimated. The ‘fix’ is simple, though again a bit of a hack as it needs to know through testing the difference in ADC sampling- and data ready interrupt frequencies.
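To put a number on the over-estimate: integrating with the time step for 500Hz when samples actually arrive at about 666.67Hz inflates the result by 666.67 / 500, roughly a third. A minimal sketch with made-up gyro data (a steady 10 degrees per second for 3 real seconds) and invented names:

```python
def integrate(rates, dt):
    """Sum rate * dt over every sample: the basic gyro-to-angle integration."""
    angle = 0.0
    for rate in rates:
        angle += rate * dt
    return angle


gyro_rates = [10.0] * 2000        # 2000 samples of a constant 10 degrees/s

assumed_dt = 1.0 / 500.0          # what the code believes
actual_dt = 3.0 / 2000            # 2000 samples really took 3 seconds

print("angle with assumed dt: %.1f degrees" % integrate(gyro_rates, assumed_dt))
print("angle with actual dt:  %.1f degrees" % integrate(gyro_rates, actual_dt))
```

The hack of a fix amounts to using the measured time step in place of the configured one.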