If you look at yesterday’s video full screen, from top left to right, you can see a muddy patch and two chicken poos, the second poo of which is close to Hermione’s front left prop on take-off. I was back out in the dark last night, tracking them down. Here’s why:
Although the graph of camera lateral tracking and the Mavic video are almost facsimiles in direction, the scale is out; the graph shows the distance from take-off to landing to be about 1.7m whereas a tape measure from chicken poo #2 to the cotoneaster shrubbery landing point measures about 4.2m. Given how accurate the direction is, I don’t think there’s any improvement needed for the macro-block processing – simply a scale factor change of ≈ 2.5. I wish I knew more about the video compression method for generating macro-blocks to understand what this 2.5 represents – I don’t like the idea of adding an arbitrary scale of 2.5.
One further point from yesterday’s video, you can see she yaws clockwise by a few degrees on takeoff – Hermione’s always done this, and I think the problem is with her yaw rate PID needing more P and less I gain. Something else for me to try next.
I had tried Zoe first as she’s more indestructible. However, her Pi0W can only cope with 400 x 400 pixels video, whereas Hermione’s Pi B 2+ can cope with 680 x 680 pixel videos (and perhaps higher with the 5 phase motion processing) which seem to work well with the chicken trashed lawn.
not perfect, but dramatically better. The flight plan was:
- take off to 1m
- move left over 6s at 0.25m/s while simultaneously rotating ACW 90° to point in the direction of travel
I’d spent yesterday’s wet weather rewriting the macro-block processing code, breaking it up into 5 phases:
- Upload the macro block vectors into a list
- For each macro-block vector in the list, undo yaw that had happened between this frame and the previous one
- Fill up a dictionary indexed with the un-yawed macro-block vectors
- Scan the directory, identifying clusters of vectors and assigned scores, building a list of highest scoring vector clusters
- Average the few, highest scoring clusters, redo the yaw of the result from step 2, and return the resultant vector
Although this is quite a lot more processing, splitting it into five phases compared to yesterday’s code’s two means that between each phase, the IMU FIFO can be checked, and processed if it’s filling up thus avoiding a FIFO overflow.
Two remaining more subtle problems remain:
- She should have stayed in frame
- She didn’t quite rotate the 90° to head left.
Nonetheless, I once more have a beaming success smile.
Normality has returned. I took Hermione out to test what video resolution she could process. It turns out 640 x 640 pixels (40 x 40 macro-blocks) was yesterday’s video frame size. 800 x 800 pixels (50 x 50 macro-blocks) this morning was a lot less stable, and I think the 960 x 960 pixels (60 x 60 macro-blocks) explains why.
At 960 x 960, Hermione leapt up into the sky; the height breach protection killed the props at 1.5m but by then she was accelerating so hard she probably climbed to 3m before dropping back down like a brick onto the hard stone drive upside-down. Luckily only 2 props got trashed as that’s an exact match for the spares I had left.
Rocketing into the sky appears to be Hermione’s symptom of a IMU FIFO overflow. For some reason, Hermione doesn’t catch the FIFO overflow interrupt so she just carries on, but now with gravity reading much less than zero because the FIFO has shifted so in fact she’s reading gyro readings, and so had to accelerate hard to compromise. The shift happens because the FIFO is 512 bytes and I’m filling it with 12 byte batches; 512 % 12 != 0.
How does this explain the 800 x 800 wobbles? My best guess is that these 2500 macro-blocks (800² / 16²) are processed just fast enough to avoid the FIFO overflow shown by 900², but does have a big impact on the scheduling such than instead of the desired 100Hz updates to the motors, it’s a lot nearer the limit of 23Hz imposed by the code. That means less frequent, larger changes.
So that mean I need to find the balance between video frame size and IMU sampling rate filling the FIFO to get the best of both. Luckily with indestructible Zoe imminently back in the running, I can test with her instead.
Both flights use identical code. There are two tweaks compared to the previous videos:
- I’ve reduced the gyro rate PID P gain from 25 to 20 which has hugely increased the stability
- Zoe is using my refined algorithm for picking out the peaks in the macro-block frames – I think this is working better but there’s one further refinement I can make which should make it better yet.
I’d have liked to show Hermione doing the same, but for some reason she’s getting FIFO overflows. My best guess is that her A+ overclocked to turbo (1GHz CPU) isn’t as fast as a Zero’s default setting of 1GHz – no idea why. My first attempt on this has been improved scheduling by splitting the macro-block vectors processing into two phases:
- build up the dictionary of the set of macro-blocks
- processing the dictionary to identify the peaks.
Zoe does this in one fell swoop; Hermione schedules each independently, checking in between that the FIFO hasn’t filled up to a significant level, and if it has, deal with that first. This isn’t quite working yet in passive test, even on Zoe, and I can’t find out why! More anon.
OK so here’s a graph of macro-block vectors, where red dots are those whose SAD is higher than the average and green is equal or lower than average. The colour of each dot is more intense for the number of macro-blocks with that vector value..
Here’s a couple separating the greens (currently used) from the reds (currently discarded).
It’s pretty clear that both low and high SAD vectors have a common concentration around “the right value” i.e. SAD’s irrelevant for finding the most concentrated area of vectors. If there’s a way to find that area in code, throw away all vectors outside that area’s boundary and then average out those that are left, the result should be much more accurate.
Adam Heinrich from the Czeck Republic has pointed me in the direction of the RANSAC (RANdom SAmple Consensus) algorithm; a simple, iterative way to identify the collection of macro-block vectors that are self-consistent.
I’ll leave it to Wikipedia to explain how it works.
The reason it’s so good in my context is that it can broken up into batches which can be processed when the code isn’t busy doing something else: i.e. when the FIFO is less than half full and we’re waiting for the next set of blocks from the video.
Once Christmas and New Year chaos are out of the way, I hope to dedicate more of my brain to how this works, and implementing it.
Thanks, Adam, for pointing me in the right direction.
Currently, getting lateral motion from a frame full of macro-blocks is very simplistic: find the average SAD value for a frame, and then only included those vectors whose SAD is lower.
I’m quite surprised this works as well as it does but I’m fairly sure it can be improved. There are four factors to the content of a frame of macro-blocks.
- yaw change: all macro-block vectors will circle around the centre of the frame
- height change: all macro-blocks vectors will point towards or away from the centre of the frame.
- lateral motion change: all macro-blocks vectors are pointing in the same direction in the frame.
- noise: the whole purpose of macro-blocks is simply to find the best matching blocks between two frame; doing this with a chess set (for example) could well have any block from the first frame matching any one of the 50% of the second frame.
Given a frame of macro-blocks, yaw increment between frames can found from the gyro, and thus be removed easily.
The same goes for height too derived from LiDAR.
That leaves either noise or a lateral vector. By then averaging these values out, we can pick the vectors that are similar to the distance / direction of the average vector. SAD doesn’t come into the matter.
This won’t be my first step however: that’s to work out why the height of the flight wasn’t anything like as stable as I’d been expecting.
A more contrasting rug in poorer lighting:
The gravel drive:
A concrete path to the back garden:
The rear lawn:
These were all the same run, with the rig held about 1m above the ground with me strolling at about 1m/s – both very much guesstimates.
None of the shots are in ideal conditions: they each suffer from one of more of low lighting conditions, high resolution, low contrast or low colour variation. Yet each shows a cluster in roughly the same place of between 10 and 20 vector shift in the positive X direction.
This bodes very well. Next step is a manual slog to correlate the SAD values with the clusters or see whether there’s a simple mathematical algorithm to average all the values biased by the proximity of the nearest macro-block neighbour?
I took my test rig for a walk over this rug in the lounge (red-point siamese cats for scale):
I captured a single frame of macroblocks as I walked forwards across the rug:
This cluster of the vectors corresponds to the spikes in this graph of the SADs:
Critically, the X axis cluster also corresponds to the walking direction.
I need (and should be able) to get two things out of this:
- a sense of scale: the cluster is around the (-12,-3) X.Y position. The test was carried out with me carrying the rig about 1m off the ground, and walking about 1m/s. Based on my previous speculative thoughts about the macro block units, this suggests the cluster X value of -12 should represent 0.43m i.e. (size of frame in meters) / (number of macro-blocks per frame) * macro-block vector:
(2 * tan (48.8 / 2) * height) / (400 / 16) * -12 = -0.43m
This is clearly wrong. My guess is I was walking at about 1m/s and the video is running at 20fps, so the movement portrayed by this set of vectors should be about 0.05m. Clearly there’s more digging required to understand difference between 0.43 and 0.05m.
- how to filter out the good vectors from the bad. A SAD biased average of all the vectors in this frame results in an overall vector of (-8.04,-2.78) compared to the visual on the cluster of about (-12,-3); this isn’t bad, but I’m sure it could be made better by discarding some of the vectors based on their SAD values. Again more digging required here.
Despite the further digging required in both cases, I’m pretty confident that things are going in the right direction.
Because I’m stuck waiting on the post, I’m very very bored. And when I’m bored, I get very frustrated until I have something to do. That something is looking at macro-blocks calibration versus calculation.
There’s 3 sides to this:
- What do the macro-block vectors actually mean? i.e. what are the units for the values? I’m currently assuming is the number of macro-blocks that a frame has moved base upon the fact that the values are only a single byte, so can only cover 255 pieces of movement; based on that assumption, the maximum frame size would be 255 x 16 (pixels per macro-block) = 4080 pixel maximum screen resolution which is plausible. Zoe is shooting to 400 x 400 resolution, so she should get ±25 as the output values. I need to test this theory..
- I need to work out how to use the SAD readings; the low the value, the more confidence there is that the shift in the macro-blocks is accurate. I need to look into this in more detail.
- Finally, I need to know the units of macro-blocks, and how to convert the values to meters. This is where I’ve made progress.
/ | \ |
/ | \ |
/ | \ h
/ | \ |
/ | \ |
/ | \ |
I’ve found out that the camera angle of view is 62.2 x 48.8 degrees for the V2 camera. As I’m videoing a square frame, the angle θ in the diagram is 48.8°.
d can be calculated
- in meters as 2 h tan(48.8 / 2) where h is also measured in meters
- in macro blocks as frame size (400) / macro block size (16) = 25
So 1 macro block is 2 h tan(48.8 / 2) / 25 = 0.03628960 x h meters. At one meter height, d works out as ≈0.907m which matches up with my testing.
Based upon the walk graphed in the previous post, the measured distanced in the garden, and a crude guesstimation that I was carrying the test rig about 1m off the ground as I walked the walk, the macro-block output is about 87,500 pixels per meter height per meter distance or
horizontal distance = height * macro-block output / 87500
For the moment, that’s more than accurate enough.
A few things left to do before this code can be used in Hermione
- Currently, the video and macro block processing run in separate threads, connected by a shared memory FIFO; ultimately this needs splitting into into two processes which could then run on separate CPUs should a multi-core A3 appear.
- There’s some work to integrate the select.select() waiting on the FIFO into the main quadcopter scheduling loop – it replaces the current time.sleep() with listening on the FIFO with a timeout of the time.sleep() value.
- At the gory details level, I need to make sure the video and IMU X and Y axes are aligned when the new PSBs arrive.
Shouldn’t take too long except for the minor detail I’m off to DisneyLand Paris next week 🙁