I am having quite a bit of trouble understanding the workings of plane to plane homography. In particular I would like to know how the opencv method works.

Is it like ray tracing? How does a homogeneous coordinate differ from a scale*vector?

Everything I read talks like you already know what they're talking about, so it's hard to grasp!

## Best Solution

Googling

`homography estimation`

returns this as the first link (at least to me): http://cseweb.ucsd.edu/classes/wi07/cse252a/homography_estimation/homography_estimation.pdf. And definitely this is a poor description and a lot has been omitted. If you want to learn these concepts reading a good book like`Multiple View Geometry in Computer Vision`

would be far better than reading some short articles. Often these short articles have several serious mistakes, so be careful.In short, a cost function is defined and the parameters (the elements of the homography matrix) that minimize this cost function are the answer we are looking for. A meaningful cost function is geometric, that is, it has a geometric interpretation. For the homography case, we want to find H such that by transforming points from one image to the other the distance between all the points and their correspondences be minimum. This geometric function is nonlinear, that means: 1-an iterative method should be used to solve it, in general, 2-an initial starting point is required for the iterative method. Here, algebraic cost functions enter. These cost functions have no meaningful/geometric interpretation. Often designing them is more of an art, and for a problem usually you can find several algebraic cost functions with different properties. The benefit of algebraic costs is that they lead to linear optimization problems, hence a closed form solution for them exists (that is a one shot /non-iterative method). But the downside is that the found solution is not optimal. Therefore, the general approach is to first optimize an algebraic cost and then use the found solution as starting point for an iterative geometric optimization. Now if you google for these cost functions for homography you will find how usually these are defined.

In case you want to know what method is used in OpenCV simply need to have a look at the code: http://code.opencv.org/projects/opencv/repository/entry/trunk/opencv/modules/calib3d/src/fundam.cpp#L81 This is the algebraic function, DLT, defined in the mentioned book, if you google

`homography DLT`

should find some relevant documents. And then here: http://code.opencv.org/projects/opencv/repository/entry/trunk/opencv/modules/calib3d/src/fundam.cpp#L165 An iterative procedure minimizes the geometric cost function.It seems the Gauss-Newton method is implemented: http://en.wikipedia.org/wiki/Gauss%E2%80%93Newton_algorithmAll the above discussion assumes you have correspondences between two images. If some points are matched to incorrect points in the other image, then you have got outliers, and the results of the mentioned methods would be completely off. Robust (against outliers) methods enter here. OpenCV gives you two options: 1.RANSAC 2.LMeDS. Google is your friend here.

Hope that helps.