====== Automatic white balancing ======

Quite often end users don't have the patience or the expertise (or both) to properly set up the white balance of the captured scene and prefer to rely on the camera "just knowing what to do". [[https://en.wikipedia.org/wiki/Color_normalization|Many algorithms]] are available today to deal with color normalization: some are very sophisticated (and computationally expensive), while others are far cheaper but so simple that they are easily thrown off by things like a huge green-screen background or a bright, super-yellow T-shirt. Below is one of those "simplistic" approaches with a little twist that makes it an excellent option for real-time image processing on even the most basic of devices.

===== Overall idea =====

The general idea is based on the assumption that if one takes a look around and measures the color of every pixel, the grand total of all those values should be close enough to a pure grey color. This idea has a name: the [[https://en.wikipedia.org/wiki/Color_normalization#Grey_world|"grey world"]]. Unfortunately, the world around us is only grey in a more-or-less neutral environment. Such is not the case with large patches of high-contrast colors (like [[https://en.wikipedia.org/wiki/Chroma_key|"green screens"]]) or even relatively small patches of "highly toxic" colors, like bright yellow. These conditions shift the scene's average color enough to throw the algorithm off and produce horrible color artifacts instead of normalizing the colors to their perceived values.

To combat the problem of "toxic colors", a simple twist is introduced: ignore all the pixels that are too saturated. Now, what exactly is "too saturated" is up for debate and quite subjective. In our experiments, it looked like the threshold value should be chosen in such a way as to end up considering around 25-35% of all the pixels for the grey world calculation.
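As a point of reference, the classic grey world adjustment (before the saturation twist is applied) can be sketched in a few lines of C++. The names here are illustrative and not taken from any particular implementation:

```cpp
#include <cstdint>
#include <cstddef>

// Per-channel gain multipliers produced by the grey world analysis.
struct Gains { double r, g, b; };

// Plain grey world correction over an interleaved 8-bit RGB buffer:
// average each channel, then scale Red and Blue so that their averages
// match the Green average.
Gains greyWorldGains(const uint8_t* rgb, size_t pixelCount) {
    double sumR = 0, sumG = 0, sumB = 0;
    for (size_t i = 0; i < pixelCount; ++i) {
        sumR += rgb[i * 3 + 0];
        sumG += rgb[i * 3 + 1];
        sumB += rgb[i * 3 + 2];
    }
    // Guard against an all-black (or single-channel-black) frame.
    if (sumR == 0 || sumB == 0)
        return {1.0, 1.0, 1.0};
    // Green is the fixed reference; Red and Blue are pulled towards it.
    return {sumG / sumR, 1.0, sumG / sumB};
}
```

Keeping ''Green'' fixed and pulling ''Red'' and ''Blue'' towards it avoids chasing a moving target, which is the same design choice the approach below makes.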
For regular room environments, that means the threshold should be between ''0.20'' and ''0.30'', but it may need adjustment either way if the scene is poorly lit (adjust the threshold down) or has lots of highly saturated bright colors (adjust the threshold up). Once we throw away all the pixels that normally cause the normalization to go wrong, we proceed with the "usual" grey world color normalization.

===== Step-by-step algorithm =====

Below are a few blocks of steps that describe the process in a top-down manner (which is generally easier to comprehend than bottom-up):

  - [[#Collect statistical data set]]
  - [[#Decide the action to take]] and based on that either:
    - Increase the "tolerance level" and go back to the top
    - Decrease the "tolerance level" and go back to the top
    - Adjust the white balance - follow to the next step
  - [[#Calculate color properties]]
  - [[#Figure out the adjustment]] and exit, if processing a single frame and the result is "close enough"
  - Loop back to step one

==== Collect statistical data set ====

For each pixel:

  - Perform the [[https://en.wikipedia.org/wiki/YUV#HDTV_with_BT.709|RGB->Y'UV]] conversion
  - Calculate the pixel's "saturation" and compare it to the threshold: \(\frac{|U| + |V|}{Y} < threshold\), where \(U \in [-127..+127]\), \(V \in [-127..+127]\), \(Y \in [0..255]\), \(threshold \in [0..1]\)
    - Of course, one should be careful about dividing by ''0'', so first check that ''Y'' is not ''0'' ;-)
  - If the ''Y'' value is **not** ''0'' **and** the ''threshold'' comparison yielded ''true'' - add that pixel's ''U'' and ''V'' components to our running totals and increment the total number of "good" pixels by ''1''
  - Optionally collect the same data for the ''Y'' value

==== Decide the action to take ====

  - If the number of pixels that matched our filter criteria is under ''20%'' of the total pixels in the image - we need to increase the "tolerance level"
  - If the number of pixels in our data set is over ''40%'' - decrease the "tolerance level"
  - Otherwise we have a good data
set - let's analyze it further and make the corresponding color balance adjustment(s)

==== Calculate color properties ====

Now that we have our data set, we can figure out what our picture looks like from the color balance point of view. For that we just calculate the average values of \(U\) and \(V\). The simplest approach is to calculate the [[https://en.wikipedia.org/wiki/Mean#Arithmetic_mean_(AM)|arithmetic mean]], but other formulae may provide more accurate results. At the end of this step we have the following 3 sets of numbers:

  - The "main results" are the averages of \(U\) and \(V\) - these tell us how far off the picture is from being considered the "grey world"
  - Calculate the "average" \(R\), \(G\), and \(B\) values from \(U\) and \(V\):
    - Use the \(Y\) from the [[#Collect statistical data set|previous step]] (if it was collected) or just go with the value of \(100\), which is hopefully a good representation of a well-adjusted overall brightness
  - Calculate the two ratios that represent how far off the ''Red'' and ''Blue'' colors are from the ''Green''((we use a fixed value of ''Green'' and never adjust it for color normalization)). Those ratios are \(^R/_G\) and \(^B/_G\)

==== Figure out the adjustment ====

Now that we have all the information we need to make a decision, let's make one!

  - If the **absolute** value of the average \(U\) is above some arbitrary threshold (we use \(0.1\)) then we need to adjust the ''Blue'' channel.
In our case we just change the sensor's blue channel gain by multiplying it by the \(^B/_G\) ratio
    - Be careful not to overflow beyond the sensor gain's cap
    - If the gain is already at ''0'' and needs to be increased - make sure you use addition, as multiplication won't help you((or just make sure you never adjust the color channel's gain below, say, ''10'')) ;-)
  - Same for the ''Red'' channel, only we use the \(V\) and \(^R/_G\) values

The ''Blue'' channel gets a bit over-saturated on the sensor we use, so a tiny adjustment is introduced: the \(^B/_G\) ratio is offset by \(2\%\) right before it is applied to the new gain's value.

===== Sample implementation in C++ =====

[[Sample AWB implementation in C++]] shows one possible implementation of the approach described above. It does use (without much explanation) some external structures, but those should not be hard to deduce from their usage.
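To tie the steps together, here is a minimal single-frame sketch of the statistics collection and the decision logic described above. It assumes the frame has already been converted to Y'UV; the ''FrameStats'' structure and all other names are illustrative and do not match the linked sample:

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstddef>

// Outcome of analyzing one frame, per the "Decide the action to take" step.
enum class Action { RaiseTolerance, LowerTolerance, AdjustBalance };

struct FrameStats {
    double avgU = 0, avgV = 0;  // averages over the "good" (unsaturated) pixels
    Action action = Action::AdjustBalance;
};

// Y in [0..255], U and V in [-127..+127], threshold in [0..1].
FrameStats analyzeFrame(const uint8_t* y, const int8_t* u, const int8_t* v,
                        size_t pixelCount, double threshold) {
    double sumU = 0, sumV = 0;
    size_t good = 0;
    for (size_t i = 0; i < pixelCount; ++i) {
        if (y[i] == 0)  // avoid dividing by zero
            continue;
        double saturation = (std::abs(u[i]) + std::abs(v[i])) / double(y[i]);
        if (saturation < threshold) {  // pixel is "grey enough" to count
            sumU += u[i];
            sumV += v[i];
            ++good;
        }
    }
    FrameStats s;
    if (good < pixelCount / 5)           // under 20% matched: filter too strict
        s.action = Action::RaiseTolerance;
    else if (good > pixelCount * 2 / 5)  // over 40% matched: filter too loose
        s.action = Action::LowerTolerance;
    else {
        s.action = Action::AdjustBalance;
        s.avgU = sumU / good;
        s.avgV = sumV / good;
    }
    return s;
}
```

When the result is ''AdjustBalance'', the caller would compute the \(^R/_G\) and \(^B/_G\) ratios from ''avgU''/''avgV'' and nudge the sensor gains as described in the sections above; the linked sample page shows the full version.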