my blog

Author: TobiasWeis
Groundtruth data for Computer Vision with Blender

In the video below you can see the sequence of a car driving in a city scene and braking. The layers I rendered out for groundtruth data are the rendered image with the bounding box of the car (top left), the emission layer (shows the brake lights when they start to emit light, top right), the optical flow (lower left), and the depth of each pixel in the world scene (lower right).

Render time was about 10h on an Nvidia GeForce GTX 680 (tile size 256×256, total image size 960×720). In this article I will first demonstrate how to set up the depth rendering, and afterwards how to extract, save and recover the optical flow.
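To give an idea of the setup, here is a minimal sketch of enabling the passes from Blender's Python console (names follow the Blender 2.8+ API; the output path is a placeholder). OpenEXR is used because it keeps the float values needed to recover flow and depth later:

```python
import bpy

scene = bpy.context.scene
layer = bpy.context.view_layer

# Enable the passes used as groundtruth
layer.use_pass_z = True        # per-pixel depth
layer.use_pass_vector = True   # motion vectors (optical flow)
layer.use_pass_emit = True     # emission layer (brake lights)

# Route the passes through the compositor into an OpenEXR file output
scene.use_nodes = True
tree = scene.node_tree
rl = tree.nodes.new("CompositorNodeRLayers")
out = tree.nodes.new("CompositorNodeOutputFile")
out.base_path = "/tmp/groundtruth"   # placeholder path
out.format.file_format = "OPEN_EXR"

for pass_name in ("Depth", "Vector", "Emit"):
    out.file_slots.new(pass_name)
    tree.links.new(rl.outputs[pass_name], out.inputs[pass_name])
```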

LW12 Protocol and Python Package

For my new flat I wanted controllable RGB LED stripes. The problem is that most of the cheap controllable ones only have IR remotes, so the receiver must somehow be in the line of sight of the remote. That has several drawbacks: you cannot install it behind furniture without the receiver sticking out, and synchronizing across several rooms is hard.

My solution was to pick some of the RGB LED WiFi controllers (LW12). These come with a neat smartphone app to control them.

However, I wanted to control them from my own home automation system, or from my own smartphone app.
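The idea of the package is to speak the controller's TCP protocol directly. As a minimal sketch (port 5577 and the command bytes below are the ones commonly documented for LW12-style controllers; verify them against your firmware):

```python
import socket

LW12_PORT = 5577  # assumption: default port of LW12-style controllers

def send_command(host, payload):
    # The controller accepts raw command bytes over a plain TCP connection
    with socket.create_connection((host, LW12_PORT), timeout=2) as sock:
        sock.sendall(bytes(payload))

def power_on(host):
    send_command(host, [0xCC, 0x23, 0x33])

def power_off(host):
    send_command(host, [0xCC, 0x24, 0x33])

def set_color(host, r, g, b):
    send_command(host, [0x56, r, g, b, 0xAA])

# Example: turn the stripe red
# power_on("192.168.1.50")
# set_color("192.168.1.50", 255, 0, 0)
```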

SICK PLS 101-312, Python and Linux

After fiddling around with some ultrasonic sensors for S.A.R.A.H. (my home automation system), I was looking for other options. Thanks to eBay, industrial laser scanners are now an option 🙂
In this article I will describe how I connected the scanner to a regular PC and got the password, and I provide a Python class that is able to communicate with the scanner and produce nice cv-images (and a numpy array containing the measurements).

I paid 80 bucks for this used SICK laser scanner on eBay: the PLS 101-312.
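The full Python class is in the article; as a quick sketch of just the visualization step, here is how a numpy array of range readings can be turned into a top-down cv-image (the 180° sweep matches the PLS; image size and maximum range are arbitrary choices):

```python
import numpy as np
import cv2

def scan_to_image(ranges_mm, size=500, max_range_mm=10000):
    # Assumes ranges_mm covers 180 degrees in equal angular steps,
    # with the scanner sitting at the bottom center of the image
    img = np.zeros((size, size, 3), np.uint8)
    angles = np.linspace(0, np.pi, len(ranges_mm))
    scale = (size / 2) / max_range_mm
    xs = (size / 2 + np.cos(angles) * ranges_mm * scale).astype(int)
    ys = (size - 1 - np.sin(angles) * ranges_mm * scale).astype(int)
    for x, y in zip(xs, ys):
        if 0 <= x < size and 0 <= y < size:
            cv2.circle(img, (int(x), int(y)), 2, (0, 255, 0), -1)
    return img
```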



Displacement priors

What is the target of all this? Driving in an automotive scenario with a given speed and turn rate at any moment, we want to predict the displacement of a 2D projection (pixel) between two frames:
p(\vec{uv}_{x,y} \mid \text{speed}, \text{turnrate}, \text{camera matrix}, \text{world geometry})

By using the camera calibration, I can create artificial curves and walls as 3D point sets and project them back to 2D. Using discretized values for speed, turn rate, street width and wall height, I can then simulate the displacement of these 3D points when they are projected to 2D (our image).
(Note for me: this is the backprojection-code, main-file:
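As a minimal sketch of the simulation idea (not the actual backprojection code; the camera matrix K, the time step and the motion conventions below are placeholders for illustration): project a set of static 3D points once, apply the inverse ego-motion given by speed and turn rate, project again, and take the pixel difference:

```python
import numpy as np

# Hypothetical calibration; in practice K comes from the camera calibration
K = np.array([[700.0,   0.0, 480.0],
              [  0.0, 700.0, 360.0],
              [  0.0,   0.0,   1.0]])

def project(points_3d):
    # Pinhole projection of Nx3 camera-frame points to Nx2 pixel coordinates
    uv = (K @ points_3d.T).T
    return uv[:, :2] / uv[:, 2:3]

def displacement(points_3d, speed, turnrate, dt=0.1):
    # The ego-motion is applied inversely to the points: the camera drives
    # forward by speed*dt (looking along +Z) and yaws by turnrate*dt,
    # so the static points move the opposite way in the camera frame
    yaw = -turnrate * dt
    R = np.array([[ np.cos(yaw), 0.0, np.sin(yaw)],
                  [ 0.0,         1.0, 0.0        ],
                  [-np.sin(yaw), 0.0, np.cos(yaw)]])
    t = np.array([0.0, 0.0, -speed * dt])
    moved = points_3d @ R.T + t
    return project(moved) - project(points_3d)
```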


Symmetry detection

This will probably become one of our modalities in the future: symmetry!

Thanks to the guys at hs-niederrhein, there is symmetry detection code that can already be used for some first estimates:

This software implements the gradient product transform for symmetry
detection that is described in the paper

C. Dalitz, R. Pohle-Froehlich, F. Schmitt, M. Jeltsch:
"The gradient product transform for symmetry detection
and blood vessel extraction." International Conference on
Computer Vision Theory and Applications (VISAPP), pp. 177-184, 2015.
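The core of the method is easy to state: at a center of symmetry, the gradients at positions p+d and p−d point in opposite directions, so summing the negated dot products of such gradient pairs over a window yields a high score at symmetry centers. A naive sketch of that idea (horizontal displacements only; the paper accumulates over a full 2-D window and handles borders properly):

```python
import numpy as np
import cv2

def symmetry_score(gray, radius=10):
    # Image gradients
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    score = np.zeros_like(gx)
    for d in range(1, radius + 1):
        # Gradient pairs at x-d and x+d (np.roll wraps at the borders,
        # which a proper implementation would mask out)
        score += -(np.roll(gx, d, axis=1) * np.roll(gx, -d, axis=1) +
                   np.roll(gy, d, axis=1) * np.roll(gy, -d, axis=1))
    return score
```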

And the first results look quite promising:

Lane detection

Today I will try to detect some lanes. We can build on a few assumptions:

– We know the lane width (plus/minus)
– We are in the middle of a lane
– We know the camera geometry
– Based on the turn rate of the IMU we can estimate the curvature of the street
– A line in pixels can be detected by an upward flank and a downward flank (as sketched below)
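A sketch of that last idea on a single image row (the threshold and expected widths are rough assumptions and would depend on the row and the camera geometry):

```python
import numpy as np

def detect_markings(scanline, min_width=3, max_width=20, thresh=30):
    # Find bright lane markings in one image row as (up, down) flank pairs
    diff = np.diff(scanline.astype(np.int32))
    ups = np.where(diff > thresh)[0]     # upward flanks
    downs = np.where(diff < -thresh)[0]  # downward flanks
    markings = []
    for up in ups:
        later = downs[downs > up]
        # Pair each upward flank with the next downward flank,
        # keeping only pairs with a plausible marking width
        if later.size and min_width <= later[0] - up <= max_width:
            markings.append((up, later[0]))
    return markings
```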

Here are some exemplary results:

1) Of course, the best one first 😉

Training Cascades to detect cars

I spent some time on training several cascades to detect cars in ego-view automotive videos,
and will now document what I’ve learned.

I will use the existing OpenCV-tools.

Data preparation
-> pos/ 1000 images containing the desired object
-> an annotation text file (containing the filenames of the positive images, the number of objects in each frame and the bounding boxes in the format x,y,width,height)
-> neg/ 2000 images that do not contain cars at all
-> negs.txt text file containing the filenames of all negative images
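For reference, each line of the annotation file follows the format that OpenCV's opencv_createsamples expects: image path, number of objects, then x y width height per object. For example (hypothetical filenames):

```
pos/img0001.png 1 140 100 45 45
pos/img0002.png 2 30 80 50 50 210 95 48 48
```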

For the positive images I used tight bounding boxes. You do not actually need as many negative images as negative samples: the training script samples patches from the negative images you provide, so the number of images can be smaller than the number of negative samples used later on.
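The two OpenCV tools are then invoked roughly like this (wrapped in Python here; the 24×24 window, stage count and sample counts are choices, not fixed values, and "pos.txt" is a hypothetical name for the annotation file; numPos is set below the total because the trainer consumes extra positives when stages reject samples):

```python
import subprocess

# Pack the annotated positives into a .vec file
subprocess.run(["opencv_createsamples",
                "-info", "pos.txt",   # hypothetical annotation file name
                "-vec", "pos.vec",
                "-num", "1000",
                "-w", "24", "-h", "24"], check=True)

# Train the cascade (the output directory must already exist)
subprocess.run(["opencv_traincascade",
                "-data", "cascade",
                "-vec", "pos.vec",
                "-bg", "negs.txt",
                "-numPos", "900",
                "-numNeg", "2000",
                "-numStages", "20",
                "-w", "24", "-h", "24"], check=True)
```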

Some of the positive images; the bounding boxes have been annotated by hand (ground-truth data):





BCCN 2015 Poster

We presented our poster at the BCCN conference 2015 in Heidelberg. It describes our system platform and a first case study of brakelight detection.
