Masking
To filter out the large black areas surrounding the image, an initial threshold for black was required. Sampling only the four corner pixels did not recognize enough of the dark image area. To improve the threshold for black, the highest pixel values (the lightest blacks) were found within rectangular regions at the corners. Each side of a rectangle was one tenth the length of the corresponding side of the image.
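As an illustration of the corner sampling described above, here is a minimal sketch. It assumes the image is a list of rows of grayscale values; the function name `corner_black_threshold` is hypothetical and is not taken from the original lab code.

```python
def corner_black_threshold(image):
    """Return the lightest pixel value found in the four corner rectangles.

    `image` is a list of rows of grayscale values (an assumed representation).
    Each rectangle spans one tenth of the image's height and width.
    """
    height, width = len(image), len(image[0])
    rect_h = max(1, height // 10)
    rect_w = max(1, width // 10)
    # Top and bottom row bands, left and right column bands.
    row_bands = (range(0, rect_h), range(height - rect_h, height))
    col_bands = (range(0, rect_w), range(width - rect_w, width))
    return max(
        image[r][c]
        for rows in row_bands
        for cols in col_bands
        for r in rows
        for c in cols
    )
```

Pixels at or below the returned value would then be treated as black border in this sketch.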
Once the temporary scratch images represented a decent circle, the center of each circular photograph was determined. This was found by taking note of the topmost, bottommost, leftmost, and rightmost black pixels outside of the circle. From these, the center was determined, along with a radius taken as the average of the vertical and horizontal radii. The next step involved scaling the radius down from this estimate to exclude the grey outer rings as well as any part of the image below the horizon (an extension of the lab).
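The center and radius estimate can be sketched roughly as follows. This version is an assumption in one respect: it works from the extents of the non-black pixels rather than from the neighboring black pixels, which shifts each edge by at most one pixel. The mask representation and the name `circle_from_mask` are hypothetical.

```python
def circle_from_mask(inside):
    """Estimate (center_row, center_col, radius) from a boolean mask.

    `inside[r][c]` is True where the pixel is lighter than the black threshold.
    """
    rows = [r for r, row in enumerate(inside) if any(row)]
    cols = [c for c in range(len(inside[0])) if any(row[c] for row in inside)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    center_row = (top + bottom) / 2
    center_col = (left + right) / 2
    vertical_radius = (bottom - top) / 2
    horizontal_radius = (right - left) / 2
    # Average the two radii to get a single estimate.
    radius = (vertical_radius + horizontal_radius) / 2
    return center_row, center_col, radius
```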
In order to accomplish this, Sobel matrices were implemented in the vertical and horizontal dimensions. Moreover, the Sobel matrices were expanded to 5x5 matrices so as to cover a larger area in the gradient calculation and decrease the effect of noise. These values were converted to radial gradients using polar coordinates. The local gradients at all 360 degrees of the circle were summed to give a total gradient for a particular radius.
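A rough sketch of the radial gradient sum follows. Two things here are assumptions rather than details from the report: the 5x5 kernels are built as the outer product of a binomial smoothing vector and a derivative vector (one common way to extend the Sobel operator), and the Cartesian gradient is converted to a radial one by projecting it onto the outward direction. All names are hypothetical.

```python
import math

SMOOTH = [1, 4, 6, 4, 1]      # binomial smoothing vector
DERIV = [-1, -2, 0, 2, 1]     # derivative vector

# KX responds to change along columns (x); KY responds to change along rows (y).
KX = [[s * d for d in DERIV] for s in SMOOTH]
KY = [[d * s for s in SMOOTH] for d in DERIV]

def gradient_at(image, r, c):
    """Return (gx, gy) at pixel (r, c) by correlating with the 5x5 kernels."""
    gx = gy = 0
    for i in range(5):
        for j in range(5):
            p = image[r + i - 2][c + j - 2]
            gx += KX[i][j] * p
            gy += KY[i][j] * p
    return gx, gy

def radial_gradient_sum(image, center_row, center_col, radius):
    """Sum the radial gradient at 360 one-degree steps around a circle."""
    total = 0.0
    for degree in range(360):
        theta = math.radians(degree)
        r = int(round(center_row + radius * math.sin(theta)))
        c = int(round(center_col + radius * math.cos(theta)))
        gx, gy = gradient_at(image, r, c)
        # Project the Cartesian gradient onto the outward radial direction.
        total += gx * math.cos(theta) + gy * math.sin(theta)
    return total
```

The sketch keeps the sign of the projection; summing magnitudes instead would be an equally plausible reading of the text. The circle must lie at least two pixels inside the image so the kernel stays in bounds.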
These gradient sums were then used to determine when the horizon line had been reached. Several approaches were attempted with little success. One involved finding a common gradient threshold, shared by all the images, at which the actual canopy of the forest appeared in the image. However, since the images were not uniform, this was not effective. Two similar approaches involved comparing the gradients of concentric circles. The first found the difference between two gradient sums and divided it by the current sum. This essentially calculated the derivative of the gradient, which is the second derivative of the pixel values along a radial line from the center to the outer edge of the circle. The second compared the sums to a rolling average.
The algorithm that finally succeeded works as follows. Since the gradient is high at the edges, the radius under consideration is first scaled down to 90% of the original estimate for all of the images. Next, the maximum gradient sum over all of the concentric circles (in 1 pixel increments) is determined. Knowing this maximum gradient, the radius is shrunk until the gradient sum of a particular concentric circle exceeds a percentage of the maximum gradient. By trial and error, this percentage was determined to be 63% for most images. However, for the Beech July image and the Maple April image, this percentage was too high because the images are brighter and slightly more washed out than the others. Hence, the percentage was scaled back to 58% for those two. Using this method, with a relatively consistent percentage threshold, the circular images were shrunk to the horizon line. Such an approach provides a more accurate percent-sky calculation, since only the part of the image that actually contains sky is included.
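The shrinking step can be sketched as below. The sketch assumes the gradient sums have already been computed for every radius in 1 pixel increments and stored in a list indexed by radius; the function name and parameter names are hypothetical.

```python
def horizon_radius(gradient_sums, estimated_radius, fraction=0.63, scale=0.90):
    """Shrink the radius until a circle's gradient sum exceeds fraction * max.

    `gradient_sums[r]` is the summed gradient of the circle with radius r
    pixels (index 0 is unused). The list must cover the scaled start radius.
    """
    start = int(estimated_radius * scale)
    candidates = range(start, 0, -1)      # outermost first, 1 pixel steps
    peak = max(gradient_sums[r] for r in candidates)
    for r in candidates:
        if gradient_sums[r] > fraction * peak:
            return r
    return start                          # nothing exceeded the threshold
```

For the two brighter images mentioned above, the call would pass `fraction=0.58` instead of the default.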
Thresholding
When done manually, thresholds were chosen by visually examining the histogram of an image (see below) and picking the point that most evenly splits the valley near the center.
This is the point at which an equal number of pixels should be misassigned to each of the two categories, sky and tree, meaning that on average we will get an accurate count.
Automatic thresholding was done using the ISODATA clustering algorithm (see http://palantir.swarthmore.edu/maxwell/classes/e27/F03/labs/lab01/).
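For readers unfamiliar with ISODATA thresholding, a minimal sketch of the usual two-class iteration is given here. It is the textbook form of the algorithm (place the threshold midway between the two class means and repeat) and is not necessarily the exact variant used in the lab; the starting guess, the `tolerance` parameter, and the function name are assumptions.

```python
def isodata_threshold(pixels, tolerance=0.5):
    """Iteratively place the threshold midway between the two class means."""
    threshold = sum(pixels) / len(pixels)     # start from the global mean
    while True:
        low = [p for p in pixels if p <= threshold]
        high = [p for p in pixels if p > threshold]
        if not low or not high:
            return threshold                  # only one cluster is present
        updated = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(updated - threshold) < tolerance:
            return updated
        threshold = updated
```

Here `pixels` would be a flat list of the unmasked grayscale values for the image or segment being thresholded.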
All three methods functioned, though the manual method was highly error-prone. The other two performed much better (see below).
Calculations
For the simple cases, where we are dealing with a single, unified image, calculating the percent of the image that is sky was straightforward.
Given a threshold T, we scan through both the image and the mask.
Ignoring the pixels that are masked, we count the number of pixels greater than T and the number less than T.
We use the following ratio to get the percent of the image which is sky:
(pixels > T) / ((pixels > T) + (pixels < T))
This gives us the ratio of sky pixels to the total number of counted pixels, leaving out those pixels excluded by the mask.
This is the percent sky for the total image.
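The single-image calculation can be sketched as follows, assuming the image and mask are lists of rows of equal shape and that a true mask entry means the pixel is kept. The function name is hypothetical. As in the ratio above, pixels exactly equal to T fall into neither count.

```python
def sky_fraction(image, mask, threshold):
    """Return sky pixels / (sky pixels + tree pixels), ignoring masked pixels."""
    sky = tree = 0
    for image_row, mask_row in zip(image, mask):
        for pixel, keep in zip(image_row, mask_row):
            if not keep:
                continue                  # pixel is excluded by the mask
            if pixel > threshold:
                sky += 1
            elif pixel < threshold:
                tree += 1
    total = sky + tree
    return sky / total if total else 0.0
```

Multiplying the result by 100 gives the percent sky.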
The same calculation was used for the segmented image, except that the counts were summed across the segments.
In this case, the segments were concentric rings, each of which had a different threshold.
The thresholds were calculated using the ISODATA clustering algorithm, and the threshold for a given segment was then used in concert with the mask to count the sky and tree pixels within that segment.
The threshold was re-calculated for the next segment, but the sky and tree pixel counters were not zeroed, so the end result was a pair of sums built from a different threshold in each segment.
The same ratio was taken using these sums to get the percent sky for the segmented image.
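The accumulation across segments can be sketched like this. It assumes each ring has already been reduced to a flat list of its unmasked pixel values, and it takes the thresholding routine as a parameter. The `midpoint_threshold` helper is only a hypothetical stand-in to keep the example self-contained; the report used ISODATA for this step.

```python
def segmented_sky_fraction(segments, threshold_for):
    """Count sky and tree pixels ring by ring, each ring with its own threshold."""
    sky = tree = 0                            # counters persist across rings
    for pixels in segments:
        threshold = threshold_for(pixels)     # recomputed for every ring
        sky += sum(1 for p in pixels if p > threshold)
        tree += sum(1 for p in pixels if p < threshold)
    total = sky + tree
    return sky / total if total else 0.0

def midpoint_threshold(pixels):
    """Hypothetical stand-in threshold: midway between darkest and lightest."""
    return (min(pixels) + max(pixels)) / 2
```

A call such as `segmented_sky_fraction(rings, midpoint_threshold)` then yields the overall sky fraction, where `rings` is the assumed list of per-ring pixel lists.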
Extensions
We used an advanced technique based on Sobel gradients to do our masking; see the Masking section. This allowed us to eliminate the part of the image below the horizon.
Images
Originals: