Dissertation - OpenCV - Recognising Letterboxes
So this is the first in a series where I'll be documenting the progress of my Major Project (Dissertation) for my BEng.


DISCLAIMER:


This is NOT a tutorial series. This is a document that explains the steps I went through in order to complete my final-year project.
It contains BUGS, ERRORS, INCORRECT FACTS, and often doubles back on itself.
I hope to write a proper tutorial later in the year, but for now, run the code here at your own peril :)
- Calum


A little background: I'm trying (hopefully) to recognise letterboxes in doors.
I'm using OpenCV as a framework because it has some nice shape-recognition stuff built in, and I should be able to use this to my advantage.
The final code will run on Linux, but I'm developing on OS X. I'll port it later.

Setup


So installing OpenCV can be a MASSIVE pain; luckily I found a brilliant tutorial for this here: Installing OpenCV on OSX

The only amendment I would make to the tutorial is to also import "libopencv_imgproc.dylib" when you import "libopencv_core.dylib" & "libopencv_highgui.dylib".
It will give you a better starting point.

I'm using Xcode for development, but it's really bad at C++, and for some godly unknown reason does not provide you with a way to duplicate projects.
Everything is fixed in place. Ridiculous. Anyway...

Getting Started


I've never used C++ or OpenCV before, but there's a first time for everything.
I started by walking through this great tutorial by Nashruddin Amin: Automatic perspective correction for quadrilateral objects

I've mirrored it here, just in case: Mirror

The tutorial was good, but I ran into a couple of issues.
The code works by finding the edges and contours of a black-and-white image, which is created using various OpenCV functions.
The trouble seemed to be that this method was incredibly unreliable if there were multiple rectangles in the image, and also (due to the greyscaling applied) it seemed to miss obvious colour changes where the brightness was the same but the hue differed.
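To see why greyscaling loses these edges, here's a stand-alone sketch (not part of the project code): the standard luma conversion used by BGR-to-grey weights the channels as 0.299R + 0.587G + 0.114B, so two visibly different colours can land on almost the same grey value, and the boundary between them disappears.

```cpp
#include <cassert>
#include <cmath>

// Standard luma conversion (the same weights OpenCV's BGR-to-grey uses).
double toGrey(double r, double g, double b) {
    return 0.299 * r + 0.587 * g + 0.114 * b;
}
```

For example, pure red (255, 0, 0) and a teal-ish (0, 100, 154) come out within a fraction of a grey level of each other, so an edge between those two colours effectively vanishes after greyscaling.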


Starting Over


So now the key was to start again...
I started by finding 5 or 6 different letterbox photos on Google.
And then I started to look at the default OpenCV examples: squares.cpp
Running my letterboxes through the squares.cpp script, I get:

I am not really interested in the photo with the red postbox; I've included it for fun :)
The one with the red door and gold "odd-shaped" letterbox is more of an issue... I'll come back to it later.


This is good, much better, I can see all kinds of rectangles! But I can't tell what's what, so let's colour each rectangle differently.
// the function draws all the squares in the image
// I've added a random colour generator to make each box different.
// NOTE: OpenCV uses BGR, NOT RGB...
void drawSquares( Mat& image, const vector<vector<Point> >& squares, std::string name)
{
    for( size_t i = 0; i < squares.size(); i++ )
    {
        const Point* p = &squares[i][0];
        int n = (int)squares[i].size();
        int col1 = rand() % 255 + 1;
        int col2 = rand() % 255 + 1;
        int col3 = rand() % 255 + 1;
        polylines(image, &p, &n, 1, true, Scalar(col1,col2,col3), 2, CV_AA);
    }
    
    imshow(wndname, image);
    imwrite(string("/temp/out/") + name, image);
}



Great! Now let's move things around and get rid of those outlying boxes.
As a rough heuristic, I assume that any square with a width or height greater than 90% of the image size is an outlier and can be deleted.
So, using some very rushed and unclean if-statements, we can remove any rectangle that has a point sitting within 5% of the left or right margin:
void drawSquares( Mat& image, const vector<vector<Point> >& squares, std::string name)
{
    for( size_t i = 0; i < squares.size(); i++ )
    {
        const Point* p = &squares[i][0];
        
        int margin = 5; //percentage
        int margin_px = (image.cols / 100)*margin;
        int margin_left = margin_px;
        int margin_right = image.cols - margin_px;
        
        if((squares[i][0].x < margin_left) ||
           (squares[i][1].x < margin_left) ||
           (squares[i][2].x < margin_left) ||
           (squares[i][3].x < margin_left) ||
           
           (squares[i][0].x > margin_right) ||
           (squares[i][1].x > margin_right) ||
           (squares[i][2].x > margin_right) ||
           (squares[i][3].x > margin_right)
           
        ){}else{
        int n = (int)squares[i].size();
        int col1 = rand() % 255 + 1;
        int col2 = rand() % 255 + 1;
        int col3 = rand() % 255 + 1;
        polylines(image, &p, &n, 1, true, Scalar(col1,col2,col3), 2, CV_AA);
        } 
    }
    imshow(wndname, image);
    imwrite(string("/temp/out/") + name, image);
}

After this we are left with a much cleaner, closer guess at our letterboxes.

I also added circles to the corner points. This code is not in the sample above, as I intend to deprecate it. Its purpose is essentially to "enumerate" the order of each rectangle's points: the circles are 1: white, 2: grey, 3: dark, 4: darker. This lets me easily visualise each rectangle's rotation in space.
You will note that they are not consistent.


Enumerating and re-ordering (rotating the vectors of) the boxes


After some thought, it occurs to me that getting the boxes into a consistent order is important, both for my own sanity and so that I can merge points later.
(One step forwards, two steps back.)
So I've written a function that orders each rectangle's points [TopLeftX,TopLeftY],[BottomLeftX,BottomLeftY],[BottomRightX,BottomRightY],[TopRightX,TopRightY]
This is 'anticlockwise' as drawn on screen (remembering that image y grows downwards).

You can see the function below:
// This orders the points in an anticlockwise direction
vector<vector<Point> > reorderPoints(const vector<vector<Point> >& squares)
{
    struct by_x {
        bool operator()(Point const &a, Point const &b) const {
            return a.x < b.x;
        }
    };
    struct by_yUp {
        bool operator()(Point const &a, Point const &b) const {
            return a.y < b.y;
        }
    };
    struct by_yDown {
        bool operator()(Point const &a, Point const &b) const {
            return a.y > b.y;
        }
    };
    struct by_topLeft_x{
        bool operator()(const vector<Point> &a, const vector<Point> &b) const {
            return a[0].x < b[0].x;
        }
    };
    
    std::vector<vector<Point> > sortedPoints = squares;
    for( size_t i = 0; i < squares.size(); i++ )
    {
        // first order by the x co-ordinate
        std::sort(sortedPoints[i].begin(), sortedPoints[i].end(), by_x());
        // order the first 2 elements in ascending order
        std::sort(sortedPoints[i].begin(), sortedPoints[i].begin()+2, by_yUp());
        // order the second 2 elements in descending order
        std::sort(sortedPoints[i].begin()+2, sortedPoints[i].end(), by_yDown());
    }
    // finally, order the squares in the array by the top-left X co-ordinate... this is for later
    std::sort(sortedPoints.begin(), sortedPoints.end(), by_topLeft_x());
    return sortedPoints;
}

This works surprisingly well considering how simple it is.
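As a sanity check on the ordering logic, here is the same sort-three-times trick on a stand-in Pt struct (so the sketch compiles without OpenCV; "top" means the smaller y, since image y grows downwards):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Stand-in for cv::Point so this sketch compiles without OpenCV.
struct Pt { int x, y; };

// Order one quad's corners: TL, BL, BR, TR (image coords, y grows down).
std::vector<Pt> orderCorners(std::vector<Pt> q) {
    // left pair first, right pair second
    std::sort(q.begin(), q.end(), [](Pt a, Pt b){ return a.x < b.x; });
    // left pair: top corner (smaller y) first
    std::sort(q.begin(), q.begin() + 2, [](Pt a, Pt b){ return a.y < b.y; });
    // right pair: bottom corner (larger y) first
    std::sort(q.begin() + 2, q.end(), [](Pt a, Pt b){ return a.y > b.y; });
    return q;
}
```

Feed it the four corners of an axis-aligned square in any order and it always comes back top-left, bottom-left, bottom-right, top-right.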

I could (should) go back and rewrite my earlier script for detecting rectangles that touch the edges now; it might be much neater...
In addition, because my rectangles are correctly oriented, I can now measure the width and height of each one, and merge points that are close.
To do this I'm going to merge any points that are close together (the code below uses a 20 px threshold as a first stab at "within 5%").

So first, I abstracted the outlier code into its own function. Because the points are now in order, I can remove some of the if-statements (I know it's still ugly):
vector<vector<Point> > removeOutliers( Mat& image, const vector<vector<Point> >& squares)
{
    int margin = 5; //percentage
    int margin_px = (image.cols / 100)*margin;
    int margin_left = margin_px;
    int margin_right = image.cols - margin_px;
    std::vector<vector<Point> > cleanedPoints;
    for( size_t i = 0; i < squares.size(); i++ )
    {
        if((squares[i][0].x < margin_left) ||
           (squares[i][1].x < margin_left) ||
           (squares[i][2].x > margin_right) ||
           (squares[i][3].x > margin_right)
           ){}else{
            cleanedPoints.push_back (squares[i]);
            }
    }
    return cleanedPoints;
}

This leaves the drawSquares function free to do what it's designed to do: DRAW SQUARES.

I also wrote the following code to merge points that are close. At the moment it does not give me the best solution, because it keeps the "first" guess in the region rather than a merge of the x/y points, but it will do for now.
double dist(Point a, Point b){
    return (sqrt((a.x-b.x)*(a.x-b.x) + (a.y-b.y)*(a.y-b.y)));
}

vector<vector<Point> > mergePoints(const vector<vector<Point> >& squares){
    std::vector<vector<Point> > mergedPoints;
    if(squares.size() > 0){
        mergedPoints.insert(mergedPoints.begin(),squares[0]);
    }
    if(squares.size() > 1){
        for( size_t i = 0; i < (squares.size()-1); i++ ){
            double distance = dist(squares[i][0],squares[i+1][0]);
            if(distance > 20){
                mergedPoints.push_back (squares[i+1]);
            }
        }
    }
    return mergedPoints;
}
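When I do improve it, the merge should average the near-duplicate corners instead of keeping the first guess. A rough sketch of that idea, again on a stand-in Pt struct rather than the real cv::Point, with a hypothetical mergeByAverage helper:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Stand-in for cv::Point so this sketch compiles without OpenCV.
struct Pt { int x, y; };

double ptDist(Pt a, Pt b) {
    return std::sqrt(double((a.x-b.x)*(a.x-b.x) + (a.y-b.y)*(a.y-b.y)));
}

// Merge quads whose first corners are within `threshold` px of each other,
// averaging corner-by-corner instead of keeping only the first guess.
std::vector<std::vector<Pt> > mergeByAverage(
        const std::vector<std::vector<Pt> >& squares, double threshold) {
    std::vector<std::vector<Pt> > merged;
    for (size_t i = 0; i < squares.size(); i++) {
        bool absorbed = false;
        for (size_t j = 0; j < merged.size(); j++) {
            if (ptDist(squares[i][0], merged[j][0]) < threshold) {
                // blend each corner into the existing quad
                for (size_t k = 0; k < 4; k++) {
                    merged[j][k].x = (merged[j][k].x + squares[i][k].x) / 2;
                    merged[j][k].y = (merged[j][k].y + squares[i][k].y) / 2;
                }
                absorbed = true;
                break;
            }
        }
        if (!absorbed) merged.push_back(squares[i]);
    }
    return merged;
}
```

Two overlapping detections a couple of pixels apart collapse into one quad whose corners sit between the two guesses, instead of one guess winning outright.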




More Photos, More Problems



So I have what I consider to be a "Fairly Good" basis for identifying the boxes.
The problem, of course, is that each of these photos assumes the box is basically in the center, and that there are no other boxes around.
I went and took some photos of boxes on my street (and from google) and I tried running them through:


My software never has bugs, it just develops random features


Interestingly, my code has actually stood up quite well. In all the images it did actually find the letterbox. Sure, there is a lot of other stuff too, but the letterbox IS there.
It's just a case of more filtering really! :)

On the design of doors...


So after looking at countless door photos, I have come to a couple of conclusions that I should be able to leverage into rules for my code.



1. Doors are symmetrical.
This is really, really useful. Because doors have an inherent symmetry to them, it should be possible to use this to our advantage.

2. Letterboxes are (NORMALLY) horizontally centered on the door.

3. Letterboxes are (NORMALLY) within the bottom half of the door.


So using those rules, I intend to do the following.

If there is more than one box found, I will measure the width of all the boxes.
Whichever box is the widest, I will assume is either A) the door or B) a feature on the door that is horizontally centered (theoretically this could be the letterbox).

I can then assume the horizontal center of this box is going to be inline with the horizontal center of the letterbox.
I'll remove ALL elements that are not in roughly the same center position... and then we will see where we are.

But first! I'm going to have to get the center point of each box. (Thank god I re-ordered the boxes earlier.)

vector<vector<Point> > centerPoints(const vector<vector<Point> >& squares)
{
    std::vector<vector<Point> > centeredPoints;
    if(squares.size() > 0){
        // first calculate the widest rectangle,
        int width = 0;
        int index = 0;
        for( size_t i = 0; i < (squares.size()); i++ ){
            int currentWidth = (dist(squares[i][0],squares[i][3]) + dist(squares[i][1],squares[i][2])) / 2; // average of top and bottom edges
            if(currentWidth > width){
                width = currentWidth;
                index = int(i);
            }
        }
        // find the centerpoint of it
        int centerX = squares[index][0].x + (width/2);
        //find all boxes that share a similar center
        int temp_centerX =0;
        for( size_t i = 0; i < (squares.size()); i++ ){
            int currentWidth = (dist(squares[i][0],squares[i][3]) + dist(squares[i][1],squares[i][2])) / 2; // average of top and bottom edges
            temp_centerX = squares[i][0].x + (currentWidth/2);
            int diff = std::abs(centerX-temp_centerX);
            if(diff < 10){
                centeredPoints.push_back (squares[i]);
            }
        }
    }
    return centeredPoints;
}



This has worked really well; I'm now basically left with either a letterbox, or a door and a letterbox, in each image.
(Importantly, this removed the bricks too! Bricks look a lot like letterboxes!)
So my final approach is to do ratio matching.

Basically, all letterboxes have a similar ratio; that is, the width is at least 2 times the height.
Eliminate anything that doesn't fall into that category, and we MIGHT get our letterboxes!
vector<vector<Point> > removeNonLetterbox(const vector<vector<Point> >& squares)
{
    std::vector<vector<Point> > letterBoxes;
    if(squares.size() > 0){
        // keep only boxes at least twice as wide as they are tall
        for( size_t i = 0; i < (squares.size()); i++ ){
            int currentWidth = (dist(squares[i][0],squares[i][3]) + dist(squares[i][1],squares[i][2])) / 2; // average of top and bottom edges
            int currentHeight = (dist(squares[i][0],squares[i][1]) + dist(squares[i][2],squares[i][3])) / 2;
            if(currentWidth > (currentHeight*2)){
                letterBoxes.push_back (squares[i]);
            }
        }
    }
    return letterBoxes;
}


Nearly perfect, except for those pesky ones left over that have two boxes on them.




My untrained eye tells me that the one with the bigger area is the one I want:
vector<vector<Point> > removeSmallerArea(const vector<vector<Point> >& squares)
{
    std::vector<vector<Point> > letterBox;
    if(squares.size() > 0){
        // find the rectangle with the biggest area
        int biggestAreaIndex = 0;
        int Area = 0;
        for( size_t i = 0; i < (squares.size()); i++ ){
            int currentWidth = (dist(squares[i][0],squares[i][3]) + dist(squares[i][1],squares[i][2])) / 2;
            int currentHeight = (dist(squares[i][0],squares[i][1]) + dist(squares[i][2],squares[i][3])) / 2;
            int currentArea = currentWidth*currentHeight;
            if(currentArea > Area){
                Area = currentArea;
                biggestAreaIndex = int(i);
            }
        }
        // push the winning box once, after the loop (it was inside the loop by mistake)
        letterBox.push_back(squares[biggestAreaIndex]);
    }
    return letterBox;
}





Now, I know that over the course of my 6 or so functions I'm calculating the height and width multiple times, so I should "probably" fix that... but for the moment it does what I want,
so I'll leave it as is.
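When I do tidy it up, the repeated maths could be pulled into two small helpers. A sketch of what that might look like (stand-in Pt struct, hypothetical boxWidth/boxHeight names), relying on the TL, BL, BR, TR corner order that reorderPoints guarantees:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Stand-in for cv::Point so this sketch compiles without OpenCV.
struct Pt { int x, y; };

double ptDist(Pt a, Pt b) {
    return std::sqrt(double((a.x-b.x)*(a.x-b.x) + (a.y-b.y)*(a.y-b.y)));
}

// Corners are ordered TL, BL, BR, TR.
// Width = average of the top edge (TL-TR) and the bottom edge (BL-BR).
double boxWidth(const std::vector<Pt>& q) {
    return (ptDist(q[0], q[3]) + ptDist(q[1], q[2])) / 2.0;
}

// Height = average of the left edge (TL-BL) and the right edge (TR-BR).
double boxHeight(const std::vector<Pt>& q) {
    return (ptDist(q[0], q[1]) + ptDist(q[3], q[2])) / 2.0;
}
```

Each filter function could then call these once per box instead of repeating the dist() arithmetic inline.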

I think if I adjust my merging code I should get even better results, but whether that is useful or not is another question.
Really, all I want is the "center point" of each rectangle, as a co-ordinate for the arm to move to before the 3D scanning process starts.
When the arm is closer, we can use the onboard camera to repeat the operation, with hopefully better results.


As I'm about to start making some bigger changes, I've decided to include the full code for my project so far here:

Main.cpp revision 1.0


Increasing the Data Set



So I went for a walk and photographed a range of houses nearby to increase my sample size.
I've run them through the script, and here's my outcome... There are a lot of matches, and there are a lot of misses. It might be good to go back and run over my code.


16 correct guesses, 3 incorrect guesses
Roughly 84% accuracy (16/19) so far

Worth noting that at least one of the incorrect guesses is a bad photo.


I modified the colour output of the code so that all the guesses were re-drawn in blue, and then the "correct" guesses were drawn in green.
Running this over the three images that "missed" from my sample set proved interesting.
All three images are "non-ideal": the angles are off, and gold shiny letterboxes on wood doors are proving problematic.





One of the images, weirdly, did locate the letterbox, but it was later discarded (probably in one of the final stages). I'm going to disregard that image, not because I don't think it could be used, but because the photo was zoomed in, taken from about 10m away, at an angle. Ideally our photos would be taken head-on, from a range of about 1-3m, so I believe it's not worth importing.


The mystery of the golden letterboxes



So, throughout this saga I've noticed that gold letterboxes cause "problems", i.e. they don't get detected.
There seem to be a couple of reasons for this:

1. If you're close to the letterbox and it's shiny, the gold reflects too much; the inconsistent colours cause the square-finding to fail.

2. If the gold letterbox is on dark wood, the colours are similar; again, there is no way of matching.

There are two approaches to solving this: either I can try to find the boxes BEFORE all of the above code (which could be slow), OR I could wait until afterwards.
The issue with doing it afterwards is false positives. If a "false" box, such as a brick, is found, then it would never find the letterbox.

As intensive as it might be, I think I'm going to have to add another layer of edge detection in part 2 of this tutorial.
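To sketch the idea behind that extra layer (this is just the concept, not what I'll necessarily ship): edge detectors like OpenCV's Sobel and Canny work on intensity gradients rather than colour matching, so a shiny plate against wood can still produce a usable jump in many shots. The core of a horizontal-gradient pass, shown on a plain greyscale array so it runs standalone:

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Horizontal gradient (central difference) across one row of greyscale
// pixels. A large |gradient| marks a vertical edge, e.g. the rim of a
// letterbox plate against the door around it.
std::vector<int> rowGradient(const std::vector<int>& row) {
    std::vector<int> g(row.size(), 0);
    for (size_t i = 1; i + 1 < row.size(); i++) {
        g[i] = std::abs(row[i + 1] - row[i - 1]);
    }
    return g;
}
```

Running this over every row (and its vertical twin over every column) is essentially what a Sobel pass does, with smoothing added; Canny then thins and thresholds those gradients into clean edge chains.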



Bonus! (It also works with live webcam feeds!)





Donate Time!

If you would like to support me, then please consider donating below. It only takes a second, and it helps me to keep doing what I love :D