Note: This is not the complete version.
The complete version is at my work and I'll post it when I retrieve it.
The complete version has much more content and this version is totally
It is often desirable in various applications to detect known shapes in computer images. One common shape of interest is the circle. This paper aims to provide a high level description of how circles can be detected in images with the use of the Hough transform. The Hough transform converts the original spatial information in an image into a parameter space representation. By selecting points that correspond to overlapping parameter space shapes/elements, the equation of the detected shape is rapidly obtained. Practical application of the Hough transform generally involves breaking down the parameter space into discrete accumulator cells that contain a count. This count corresponds to the number of edge pixels in the original image that constitute the desired shape.
For the purposes of this paper, the author has made the assumption that the Hough transform is to be applied to a binary image. The binary image should only contain non-zero gray levels where edge information was detected in the original input image.
In order for the Hough transform to be of use in the detection of shapes in an image, the original grayscale image must first undergo some simple preprocessing. Generally, this can be accomplished by applying the gradient operator. This is often accomplished with simple masks known as the Sobel operators. A detailed discussion of this technique is beyond the scope of this paper.
In short, the Sobel operators will highlight edges in a grayscale input image. The end result should be a binary image that contains highlighted edges, including the edges around the shape we wish to detect.
In order to detect the presence of circles in an input image, we first assume that each non-zero pixel in the binary image contributes to a circle in some way.
The general equation of a circle in Cartesian coordinates can be expressed as
(x a)2 + (y b)2 = r2 ( Equation 1 )
If x and y are replaced with the pixel coordinates xi and yi respectively, this equation becomes
(xi a)2 + (yi b)2 = r2 ( Equation 2 )
For any single pixel (xi, yi), we can see that the equation becomes variable in three dimensions, with a, b, and r being the unknowns. By evaluating Equation 2 for a single pixel, we create an equation whose solution must be expressed in three dimensions. If we multiply Equation 2 by (-1)2, we can write it in a form that better illustrates the independent variables:
(-1)2 (xi a)2 + (-1)2 (yi b)2 = (-1)2r2 ( Equation 3 )
(a - xi)2 + (b - yi)2 = (-r)2 ( Equation 4 )
and since (-r)2 = r2, we can write
(a - xi)2 + (b - yi)2 = r2 ( Equation 5 )
It should be reemphasized that a, b, and r are now the free variables. Equation 5 above shows the expression of a circle whose center is the point (xi, yi) in the a-b plane and whose radius depends on the values of a and b. Geometry tells us that Equation 5 represents an infinite cone whose point is normal (or perpendicular) to the a-b plane.
This three-dimensional space is referred to as the "parameter
space", because its axis corresponds to the parameters of the original
equation. Figure 1 illustrates the parameter space representation of the
circle equation for a single pixel:
[INSERT IMAGE HERE]
Each edge pixel in the original image corresponds to a cone in the 3D parameter space. For a set of edge pixels, the Hough transform will yield a set of cones in the parameter space. If all the edge pixels in a set belong to the same circle in the spatial image, the corresponding parameter space cones will all share a single common intersection point. If the edge pixel set contains pixels that do not belong to the circle, the corresponding cones will intersect at other points.
The common intersection point (a, b, r) can be taken as the solution to the original problem of detecting a circle in the input image. The circle in the original image will be located at x = a, y = b, and will have a radius of r.
Because it would be impossible for a computer to compute all points that lie on the infinite cone, it becomes necessary to break the parameter space into discrete elements. First, we limit the maximum parameter values of a, b, and r to fit the desired application. Then, we divide each of the axis into segments to create discrete cubic regions in the parameter space.
The size of the cubic regions in the parameter space will be directly proportionally related to the error and inversely related to the computation time. Smaller cubes will result in less error, while larger cubes will result in greater error. Similarly, smaller cubes in the parameter space means that more iterations must be performed. Using larger cubes will reduce the number of computations.
Each of the cubic regions is called an accumulator because the computation process progressively accumulates values in these regions. A three-dimensional array can be used to represent the accumulators. The accumulator array should be initialized to contain only zero values.
For each edge pixel in the input image (xi, yi), we must compute which accelerators lie on the edge of the parameter space cone. This can be accomplished by iterating over all discrete a and b to compute the radius value r.
(aj - xi)2 + (bk - yi)2 = r2
For each solution point (aj, bk,
r), we increment the value in the accumulator cell in which the point lies.
In mathematical terms, we let A(a, b, r) := A(a, b, r) + 1, or simply A(a)
:= A(a) + 1.