I mentioned a year ago using the Raspberry Pi Camera to provide motion data; this motion data is a by-product of the compression of H.264 video frames.

Each frame in a video is broken up into 16 x 16 pixel blocks (macro-blocks), which are compared with the blocks from the previous frame to find the nearest matches.  An X, Y vector is then produced for each macro-block, pointing to the best-matching block in the previous frame, along with a sum of absolute differences (SAD), essentially a measure of how trustworthy the vector is: the lower the SAD, the closer the match.  This happens at the video frame rate as part of the H.264 video compression algorithm, and is done entirely by the GPU.

Because of the fixed frame rate, these are effectively velocity vectors, and because the camera would be strapped under the quadcopter body, the velocities are in the quadcopter reference frame.  A crude way to process this data would be simply to average all the X, Y vectors, weighted by their SAD values, to come up with a best-guess estimate of the overall linear vector per frame.  Better processing would also need to accommodate yaw.
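A minimal sketch of that crude averaging, using plain Python and no camera.  The `1 / (1 + SAD)` weighting is an assumption chosen for illustration (lower SAD means a closer match, so it gets more weight); the function name and the `(x, y, sad)` tuple layout are also hypothetical.

```python
def weighted_mean_vector(blocks):
    """Average macro-block motion vectors, trusting low-SAD blocks more.

    blocks: iterable of (x, y, sad) tuples, one per macro-block.
    Returns (mean_x, mean_y) in pixels per frame.
    """
    total_w = sum_x = sum_y = 0.0
    for x, y, sad in blocks:
        # Assumed weighting: lower SAD = closer match = larger weight.
        # The +1 avoids dividing by zero for a perfect match (SAD of 0).
        w = 1.0 / (1.0 + sad)
        total_w += w
        sum_x += w * x
        sum_y += w * y
    if total_w == 0.0:
        return 0.0, 0.0  # no blocks supplied
    return sum_x / total_w, sum_y / total_w
```

Other weightings (for example discarding blocks above a SAD threshold) would fit the same shape; which works best is something only testing on real frames can settle.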

All this functionality is now available from the Python picamera interface, making it very easy to create a motion tracking process which feeds the macro-block vectors + SAD data to the HoG code, which can do the averaging and produce the quad-frame velocity PID inputs for the X and Y axes.  The velocity PID targets are set, as now, to be earth-frame speeds reorientated to the quad-frame.


Advantages:

  • updates at the frame rate, with the processing done on the GPU
  • no integration of (accelerometer – subtraction of gravity rotated to the quadframe) to get velocity hence no cumulative error from offsets

Disadvantages:

  • need some height information – possibly just best estimate as now
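The height requirement falls out of the geometry: a motion vector is in pixels per frame, and the ground distance covered by one pixel grows with height.  A rough sketch of the conversion, assuming a simple pinhole model; `half_angle_deg` and `frame_width_px` are hypothetical parameters that would have to be measured for whichever video mode is used, and sign conventions are ignored.

```python
import math

def ground_velocity(dx_px, dy_px, height_cm, fps, half_angle_deg, frame_width_px):
    """Convert a per-frame pixel displacement into cm/s over the ground.

    half_angle_deg: half of the camera's field of view (assumed square).
    frame_width_px: width of the video frame in pixels.
    """
    # Width of ground visible from this height.
    span_cm = 2.0 * height_cm * math.tan(math.radians(half_angle_deg))
    cm_per_px = span_cm / frame_width_px
    # Pixels per frame times frames per second gives pixels per second.
    return dx_px * cm_per_px * fps, dy_px * cm_per_px * fps
```

Doubling the height estimate doubles the computed velocity for the same vector, which is why a poor height estimate feeds straight through to the velocity PIDs.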

This opens up the possibility of indoor and outdoor motion processing over varying contrast surfaces, or laser tracking over surfaces without significant contrast difference.

First step is to adapt the existing Kitty code to churn out these motion vectors and stream them to a dummy module to produce the resultant velocity PID inputs.  More thoughts anon.

Kitty’s measuring up

Did some simple testing to determine the camera angle, resolution and suitable dual laser dot separation.  Kitty was sat on a stool, looking up at the ceiling, and I was waving the laser around the ceiling watching the curses display.  The picture below is upside down!

           ∧      ^      ^            
          /|\   21cm     |
         / | \____V      |
        /  |  \        205cm 
       / α |   \         |
      /    |    \        |
     /     |     \       |

U = camera position
Span of kitty laser dot detection = 170 x 170cm
Test height (camera to ceiling) = 205cm
Alpha = atan(170 / (2 x 205)) = 22.5°

In addition…

Camera height at takeoff = 21cm
∴Camera span at takeoff = 17.4 x 17.4cm
Image size = 32 x 32 pixel
∴Image resolution at takeoff = 0.5 x 0.5cm
Landing leg separation = 23cm

Based on the above, 2 dots spaced by 15cm would make a good laser leash

Image span at 1m hover ≅ 83 x 83cm
Double dot separation viewed from 1m = 15cm / 83cm x 32 pixels ≅ 6 pixels

So the minimum capture resolution of 32 x 32 pixels should be fine for a 1m hover, but by 2m, the risk of merged dots is too high as they’ll only be 3 pixels apart.
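The arithmetic above can be reproduced (to within rounding) with a few lines of Python.  The constants are the measurements quoted in this post; the function names are just for illustration.

```python
import math

SPAN_CM = 170.0         # measured laser dot detection span on the ceiling
TEST_HEIGHT_CM = 205.0  # camera-to-ceiling distance during the test
IMAGE_PX = 32           # capture is 32 x 32 pixels
DOT_SPACING_CM = 15.0   # proposed laser leash dot spacing

# Half-angle of the camera's view, from the test measurements.
alpha = math.atan(SPAN_CM / (2 * TEST_HEIGHT_CM))

def span_cm(height_cm):
    """Width of the area the camera sees from a given height."""
    return 2 * height_cm * math.tan(alpha)

def dot_separation_px(height_cm, spacing_cm=DOT_SPACING_CM):
    """Apparent separation of the two leash dots, in pixels."""
    return spacing_cm / span_cm(height_cm) * IMAGE_PX

for h in (21, 100, 200):
    print(f"{h:>3}cm: span {span_cm(h):6.1f}cm, "
          f"dots {dot_separation_px(h):4.1f} pixels apart")
```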

How will Kitty work?

Backgrounder on Kitty

Here are my initial thoughts about how to implement a reference point for Phoebe to hover over and track:

  • She does pre-flight checks and warms up the sensors
  • The motors remain unpowered until she spots the laser underneath her
  • Once the laser lock is acquired she takes off to 1m hover.
  • Moving the laser at hover causes her to follow.
  • If laser lock is lost, immediate controlled descent.

In more detail…

  • 2 laser pointers in parallel at fixed spacing – hereafter known as the laser leash
  • 1 would give her a point to hover over and this is the first step in the development
  • 2 gives her orientation and height
    • the alignment of the two dots compared to the camera alignment gives yaw
    • the apparent spacing of the two dots in the image shrinks as she gains height.  Ground level separation is measured when she’s locked on to the laser.
  • loss of lock on one dot changes behaviour to drift towards the remaining one with the aim of centering the single dot and thereby reacquiring the second dot at the risk of ensuing height / yaw errors.
  • loss of both dots leads to immediate horizontal descent to ground.  If a single lock is reacquired during descent, then second dot acquisition procedure as above; if both dots are reacquired, then normal flight resumes.
  • Various beep sequences to indicate no lock, single lock and double lock
  • double dot analysis produces quad frame velocity targets along the X, Y and Z axes plus yaw correction target around the Z axis.

Together that should mean she can be taken for a walk on a laser leash!
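A rough sketch of what the double dot analysis might look like, assuming the two dot positions have already been found in the image.  The function name and parameters are hypothetical, and the height estimate relies on a pinhole-camera assumption that apparent separation is inversely proportional to height.  Two identical dots also leave a 180° ambiguity in yaw, which this sketch does not resolve.

```python
import math

def analyse_dots(dot_a, dot_b, ground_sep_px, ground_height_cm=21.0, image_px=32):
    """Derive yaw, height and centre offset from two dot positions.

    dot_a, dot_b: (col, row) pixel positions of the two dots.
    ground_sep_px: dot separation in pixels measured at lock-on.
    ground_height_cm: camera height at take-off.
    """
    dx = dot_b[0] - dot_a[0]
    dy = dot_b[1] - dot_a[1]
    sep_px = math.hypot(dx, dy)
    if sep_px == 0:
        raise ValueError("dots have merged; cannot derive yaw or height")

    # Angle of the leash relative to the camera's column axis.
    yaw_deg = math.degrees(math.atan2(dy, dx))

    # Assumed pinhole model: separation in pixels scales as 1 / height.
    height_cm = ground_height_cm * ground_sep_px / sep_px

    # Offset of the leash midpoint from the image centre, in pixels.
    centre = (image_px - 1) / 2.0
    off_x = (dot_a[0] + dot_b[0]) / 2.0 - centre
    off_y = (dot_a[1] + dot_b[1]) / 2.0 - centre
    return yaw_deg, height_cm, (off_x, off_y)
```

The offsets and height error would then become the X, Y and Z velocity targets, and the yaw angle the yaw correction target, after whatever scaling the PIDs need.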

Primary concerns:

  • test area – garden is too bright, indoor at home has too many bits of furniture – there are redundant farm buildings within 5 minutes walk for testing – but only after a significant series of passive tests have been successful
  • feeding the periodic camera results through as updates to velocity / yaw targets – earlier experiments show 1 per second is about the max rate
  • Kitty will probably run on a separate thread or process, assuming PiCamera blocks while taking a shot – how much will this affect performance, and will the GIL block the quad code completely while Kitty takes a photo?  If so, it’ll need to be a separate process with an input queue that Phoebe checks periodically for target updates
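If it does come down to a separate process, the queue arrangement could look something like the sketch below.  All names here are hypothetical: `capture_target` stands in for whatever takes the photo and turns it into a target, and the queue and event are assumed to be a `multiprocessing.Queue` and `multiprocessing.Event`.

```python
import queue

def kitty_worker(targets, stop, capture_target):
    """Runs in its own process: push one target update per photo.

    targets: multiprocessing.Queue shared with the flight code.
    stop: multiprocessing.Event set by the flight code to end the loop.
    capture_target: hypothetical callable that blocks while taking a
        photo and returns a target, or None if there is no laser lock.
    """
    while not stop.is_set():
        targets.put(capture_target())

def latest_target(targets, current):
    """Drain the queue without blocking and return the newest target.

    Returns `current` unchanged if nothing new has arrived, so the
    flight loop never waits on the camera.
    """
    while True:
        try:
            current = targets.get_nowait()
        except queue.Empty:
            return current
```

The flight code would start the worker with `multiprocessing.Process(target=kitty_worker, args=(targets, stop, capture_target))` and call `latest_target()` once per loop; because the worker is a separate process, the GIL in the flight code is not held while the photo is taken.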

First steps

  • add the beeper onto Phoebe and Chloe
  • build the double parallel laser leash
  • measure the dot spacing as seen by the camera at take-off and at the 1m hover point


Have a break…

Have a Kit(ty)-Kat*.

I needed a break today from I2C problems**, so I’ve spent some time with Kitty.  She’s the laser pointer dot tracking code which eventually will make its way onto HoG so I can direct her flights with a laser pointer.

She uses the RaspiCam and the picamera Python library along with the (aptly named) curses library to show on screen the brightest point in a 32 x 32 pixel YUV image – the first 1024 bytes (32 x 32) are Y / luminance / brightness / contrast.  The other 512 bytes are the UV values which the code has no interest in.
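Finding the brightest point in that buffer is only a few lines.  This is a minimal sketch rather than the code on GitHub: it assumes the raw YUV capture is already in a `bytes` or `bytearray` object, and the function name is made up for illustration.

```python
WIDTH = HEIGHT = 32

def brightest_point(yuv):
    """Return (row, col, value) of the brightest pixel in the Y plane.

    yuv: raw 32 x 32 YUV capture; only the first 1024 bytes (Y) are used.
    """
    y_plane = yuv[:WIDTH * HEIGHT]
    if len(y_plane) < WIDTH * HEIGHT:
        raise ValueError("buffer too short for a 32 x 32 Y plane")
    # Index of the largest luminance byte (first one wins on a tie).
    index = max(range(len(y_plane)), key=y_plane.__getitem__)
    row, col = divmod(index, WIDTH)
    return row, col, y_plane[index]
```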

Kitty curses display


Currently she’s taking a photo every 0.6s or so, and there are no sleeps in the code, so there’s a bit of work to do to set exposure times to perhaps 0.1s or thereabouts so that the peak brightness can be polled periodically by HoG and used to set velocity targets for its PIDs.

Code’s up on GitHub

Normal service will resume tomorrow

*An ancient tag-line for Kit-Kat chocolate biscuit adverts for those of you too young to get the pun.

** I have done a little, checking the interrupt pin from the MPU-9250 with a ‘scope – it looks untidy, but that may be down to my usage of the ‘scope itself, so I need to dig further.