A Guided Tour of Color Space

2013/8/21 10:34:28

Charles Poynton


This article describes the theory of color reproduction in video, and some of the engineering compromises necessary to make practical cameras and practical coding systems.


Video processing is generally concerned with color represented in three components derived from the scene, usually red, green, and blue, or components computed from these. Once red, green, and blue components of a scene are obtained, color coding systems transform these components into other forms suitable for processing, recording, and transmission. Accurate color reproduction depends on knowledge of exactly how the physical spectra of the original scene are transformed into these components and exactly how the color components are transformed to physical spectra at the display.


I use the term luminance and the symbol Y to refer to true CIE luminance, a linear-light quantity proportional to physical intensity. I use the term luma and the symbol Y’ to refer to the signal that is used in video to approximate luminance. If you are unfamiliar with these terms or this usage, see page 24 of my book [1].



Color is the perceptual result of light having wavelength from 400 to 700 nm that is incident upon the retina.

The human retina has three types of color photoreceptor cone cells, which respond to incident radiation with different spectral response curves. Because there are exactly three types of color photoreceptors, three components are necessary and sufficient to describe a color, provided that appropriate spectral weighting functions are used: color vision is inherently trichromatic.


A fourth type of photoreceptor cell, the rod, is also present in the retina. Rods are effective only at extremely low light intensities. Because there is only one type of rod cell, “night vision” cannot perceive color. Image reproduction takes place at light levels sufficiently high that the rod receptors play no role.


Isaac Newton said, “Indeed rays, properly expressed, are not colored.” Spectral power distributions (SPDs) exist in the physical world, but color exists only in the eye and the brain.


Color specification

The Commission Internationale de L’Éclairage (CIE) has defined a system to compute a triple of numerical components that can be considered to be the mathematical coordinates of color space. Their function is analogous to coordinates on a map. Cartographers have different map projections for different functions: some map projections preserve areas, others show latitudes and longitudes as straight lines. No single map projection fills all the needs of map users. Similarly, no single color system fills all of the needs of color users.


Any system for color specification must be intimately related to the CIE specifications. The systems useful for color specification are all based on CIE XYZ. Numerical specification of hue and saturation has been standardized by the CIE, but is not useful for color specification.


A color specification system needs to be able to represent any color with high precision. Since few colors are handled at a time, a specification system can be computationally complex. An ink color can be specified by the proportions of standard or proprietary inks that can be mixed to make the color. Ink mixture is beyond the scope of this paper.

The science of colorimetry forms the basis for describing a color as three numbers. However, classical colorimetry is intended for the specification of color, not for color image coding.


Although an understanding of colorimetry is necessary to achieve good color performance in video, strict application of colorimetry is unsuitable.


Color image coding

A color image is represented as an array of pixels, where each pixel contains numerical components that define a color. Three components are necessary and sufficient for this purpose, although in printing it is convenient to use a fourth (black) component.


In theory, the three numerical values for image coding could be provided by a color specification system. But a practical image coding system needs to be computationally efficient; cannot afford unlimited precision; need not be intimately related to the CIE system; and generally needs to cover only a reasonably wide range of colors and not all of the colors. So image coding uses different systems than color specification.


The systems useful for image coding are linear RGB, nonlinear RGB, nonlinear CMY, nonlinear CMYK, and derivatives of nonlinear RGB such as Y’CBCR. Numerical values of hue and saturation are not useful in color image coding.

If you manufacture cars, you have to match the color of paint on the door with the color of paint on the fender. A color specification system will be necessary. However, to convey a picture of the car, you need image coding. You can afford to do quite a bit of computation in the first case because you have only two colored elements, the door and the fender. In the second case, the color coding must be quite efficient because you may have a million colored elements or more.


For a highly readable short introduction to color image coding, consult DeMarsh and Giorgianni [2]. For a terse, complete technical treatment, read Schreiber [3].



Intensity is a measure over some interval of the electromagnetic spectrum of the flow of power that is radiated from, or incident on, a surface. Intensity is what I call a linear-light measure, expressed in units such as watts per square meter.


Brightness is defined by the CIE as “the attribute of a visual sensation according to which an area appears to exhibit more or less light.” Because brightness perception is very complex, the CIE defined a more tractable quantity, luminance, which is radiant power weighted by a spectral sensitivity function that is characteristic of vision.


Hue is the attribute of a color perception denoted by blue, green, yellow, red, and so on.


Roughly speaking, if the dominant wavelength of an SPD shifts, the hue of the associated color will shift.


Saturation, or purity, is the degree of colorfulness, from neutral gray through pastel to saturated colors. Roughly speaking, the more concentrated an SPD is at one wavelength, the more saturated the associated color. You can desaturate a color by adding light that contains power at all wavelengths.


Spectral Power Distribution (SPD) and tristimulus

Physical power (or radiance) is expressed in a spectral power distribution (SPD), often in 31 components each representing power in a 10 nm band from 400 to 700 nm. The SPD of the CIE standard daylight illuminant CIE D65 is sketched at the top of Figure 2 above.


One way to describe a color is to directly reproduce its spectral power distribution. This is sketched in the middle row of Figure 2, where 31 components are transmitted. It is reasonable to use this method to describe a single color or a few colors, but using 31 components for each pixel is an impractical way to code an image.

Based on the trichromatic nature of vision, we can determine suitable spectral weighting functions to describe a color using just three components, indicated at the bottom of Figure 2.


The relationship between SPD and perceived color is the concern of the science of colorimetry. In 1931, the CIE adopted standard curves for a hypothetical Standard Observer. These curves specify how an SPD can be transformed into a triple that specifies a color.


The CIE system is immediately and almost universally applicable to self-luminous sources and displays. However, the colors produced by reflective systems such as photography, printing, or paint are a function not only of the colorants but also of the SPD of the ambient illumination. If your application has a strong dependence upon the spectrum of the illuminant, you may have to resort to spectral matching.


Scanner spectral constraints

The relationship between spectral distributions and the three components of a color value is conventionally explained starting from the famous color matching experiment. I will instead explain the relationship by illustrating the practical concerns of engineering the spectral filters required by a color scanner or camera, as illustrated in Figure 3 above.


The top row shows the spectral sensitivity of three wideband filters having uniform response across each of the red, green, and blue regions of the spectrum. In a typical filter, whether for electrical signals or for optical power, it is generally desirable to have a response that is as uniform as possible across the passband, to have transition zones as narrow as possible, and to have maximum possible attenuation in the stopbands.


Entire textbooks are devoted to filter design; they concentrate on optimizing filters in that way. The example on the top row of the illustration shows two monochromatic sources which are seen as saturated orange and red. Applying a “textbook” filter to that example causes these two different wavelength distributions to report the identical RGB triple, [1, 0, 0]. That is a serious problem, because these SPDs are seen as different colors!


At first sight it may seem that the problem with the wideband filters is insufficient wavelength discrimination. The middle row of the example addresses the apparent lack of spectral discrimination of the filters of the top row, by using three narrowband filters. This set solves one problem, but creates another: many monochromatic sources “fall between” the filters, and are seen by the scanner as black. In the example, the orange source reports an RGB triple of [0, 0, 0], identical to the result of scanning black.
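The two failure modes described above can be reproduced with idealized box filters. This is a minimal sketch, not a model of any real scanner: the band edges, the source wavelengths, and the names `box_filter` and `scan` are all assumptions chosen for illustration.

```python
def box_filter(lo_nm, hi_nm):
    """Idealized filter: response 1 inside [lo_nm, hi_nm), 0 outside."""
    return lambda wavelength_nm: 1.0 if lo_nm <= wavelength_nm < hi_nm else 0.0


def scan(source_nm, filters):
    """Response of each filter to a unit-power monochromatic source."""
    return [f(source_nm) for f in filters]


# Hypothetical band edges (nm), in R, G, B order; not taken from the article.
WIDEBAND = [box_filter(600, 700), box_filter(500, 600), box_filter(400, 500)]
NARROWBAND = [box_filter(645, 655), box_filter(545, 555), box_filter(445, 455)]

# Assumed wavelengths for the "orange" and "red" monochromatic sources.
ORANGE_NM, RED_NM = 610, 650

# Wideband: both sources land in the red passband and become indistinguishable.
print(scan(ORANGE_NM, WIDEBAND), scan(RED_NM, WIDEBAND))

# Narrowband: red is detected, but orange falls between filters and reads as black.
print(scan(ORANGE_NM, NARROWBAND), scan(RED_NM, NARROWBAND))
```

Neither filter set reports distinct triples exactly when the eye sees distinct colors, which is the criterion the following paragraphs develop.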


Although this example is contrived, the problem is not. Ultimately, the test of whether a camera or scanner is successful is whether it reports distinct RGB triples if and only if human vision sees different colors. To see color as the eye does, the three filter sensitivity curves must be closely related to the color response of the eye.


A famous experiment, the color matching experiment, was devised during the 1920s to characterize the relationship between SPD and color. The experiment measures mixtures of different spectral distributions that are required for human observers to match colors. Exploiting this indirect method, the CIE in 1931 standardized a set of spectral weighting functions that models the perception of color. These curves, defined numerically, are referred to as the x, y, and z color matching functions (CMFs) for the CIE Standard Observer [4]. They are illustrated in the bottom row of Figure 3 opposite, and are graphed at a larger scale in Figure 4 above.


CIE XYZ tristimulus

The CIE designed their system so that one of the three tristimulus values – the Y value – has a spectral sensitivity that corresponds to the lightness sensitivity of human vision. The luminance Y of a source is obtained as the integral of its SPD weighted by the y color matching function.


When luminance is augmented with two other components X and Z, computed using the x and z color matching functions, the resulting (X, Y, Z) components are known as XYZ tristimulus values (pronounced “big-X, big-Y, big-Z” or “cap-X, cap-Y, cap-Z”). These are linear-light values that embed the spectral properties of human color vision.


Tristimulus values are computed from continuous SPDs by integrating the SPD using the x, y, and z color matching functions. In discrete form, tristimulus values can be computed by a matrix multiplication, as shown in Figure 5 overleaf.


Figure 5 Calculation of tristimulus values by matrix multiplication. The column vector at the right is a discrete version of CIE Illuminant D65. The 31-by-3 matrix is a discrete version of the set of CIE x, y and z color matching functions, here represented in 31 wavelengths at 10 nm intervals across the spectrum. The result of performing the matrix multiplication is a set of XYZ tristimulus values.
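The matrix multiplication in Figure 5 amounts to three weighted sums over the wavelength samples. The sketch below shows only the shape of that computation: the function name `tristimulus` is hypothetical, and the four-sample tables are placeholders, not CIE data. A real calculation would use the 31 tabulated CMF samples and a measured SPD.

```python
def tristimulus(spd, cmfs):
    """Weight a sampled SPD by each sampled color matching function and sum.

    spd  -- list of N power samples
    cmfs -- list of N (x_bar, y_bar, z_bar) rows at the same wavelengths
    """
    if len(spd) != len(cmfs):
        raise ValueError("SPD and CMF tables must use the same wavelength samples")
    X = sum(p * row[0] for p, row in zip(spd, cmfs))
    Y = sum(p * row[1] for p, row in zip(spd, cmfs))
    Z = sum(p * row[2] for p, row in zip(spd, cmfs))
    return X, Y, Z


# Placeholder tables: NOT CIE data, chosen so the arithmetic is easy to follow.
toy_cmfs = [(0.25, 0.0, 1.0), (0.0, 0.5, 0.25), (0.5, 1.0, 0.0), (0.75, 0.25, 0.0)]
toy_spd = [1.0, 2.0, 2.0, 4.0]

print(tristimulus(toy_spd, toy_cmfs))  # (4.25, 4.0, 1.5)
```

With real data the only change is the table length: each of X, Y, and Z is still one dot product of the SPD with one column of the CMF matrix.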


Human color vision follows a principle of superposition, first elaborated by Grassmann and now known as Grassmann’s Law: the tristimulus values that result after summing a set of SPDs are identical to the sum of the tristimulus values of each of the SPDs. Due to this linearity of additive color mixture, any set of three components that is a nontrivial linear combination of X, Y, and Z is also a set of tristimulus values.
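In symbols, the superposition principle can be sketched as follows. The notation is an assumption introduced here for illustration: T(S) denotes the tristimulus vector of an SPD S, a and b are non-negative scale factors, and M is any nonsingular 3×3 matrix.

```latex
\[
T(a\,S_1 + b\,S_2) \;=\; a\,T(S_1) + b\,T(S_2)
\]
\[
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
\;=\;
M \begin{bmatrix} X \\ Y \\ Z \end{bmatrix},
\qquad \det M \neq 0
\]
```

The first line is the linearity itself. The second expresses its consequence: because M can be inverted, the components on the left carry exactly the same information as XYZ and are tristimulus values in their own right.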


The CIE system is based on the description of color as a luminance component Y, as described above, and two additional components X and Z. The spectral weighting curves of X and Z have been standardized by the CIE based on statistics from experiments involving human observers. XYZ tristimulus values can describe any color. (RGB tristimulus values will be described later.)


The magnitudes of the XYZ components are proportional to physical power, but their spectral composition corresponds to the color matching characteristics of human vision.


CIE x, y chromaticity

It is convenient, for both conceptual understanding and computation, to have a representation of “pure” color in the absence of luminance. The CIE standardized a procedure for normalizing XYZ tristimulus values to obtain two chromaticity values x and y (pronounced “little-x, little-y”). The relationships are computed by this projective transformation:

\[
x = \frac{X}{X + Y + Z}, \qquad y = \frac{Y}{X + Y + Z} \qquad \text{(Eq 1)}
\]


A color plots as a point in an (x, y) chromaticity diagram, illustrated in Figure 6 opposite. When a narrowband SPD comprising power at just one wavelength is swept across the range 400 to 700 nm, it traces a shark-fin shaped spectral locus in (x, y) coordinates. The sensation of purple cannot be produced by a single wavelength: to produce purple requires a mixture of shortwave and longwave light. The line of purples joins extreme blue to extreme red. The chromaticity coordinates of real (physical) SPDs are bounded by the line of purples and the spectral locus: all colors are contained in this region of the chromaticity diagram.


It is common to specify a color by its chromaticity and luminance, in the form of an xyY triple. (A third chromaticity value, z, can be computed similarly. However, z is redundant if x and y are known, due to the identity x + y + z = 1.)


To recover X and Z tristimulus values from chromaticities and luminance, use the inverse of Eq 1:

\[
X = \frac{x}{y}\,Y, \qquad Z = \frac{1 - x - y}{y}\,Y \qquad \text{(Eq 2)}
\]


The projective transformation used to compute x and y is such that any linear combination of two spectra, or two tristimulus values, plots on a straight line in the (x, y) plane. However, the transformation does not preserve distances, so chromaticity values do not combine linearly.
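The projection to chromaticity, its inverse, and the straight-line property can be checked numerically. This is a minimal sketch: the function names and the two tristimulus triples are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def xyz_to_xyY(X, Y, Z):
    """Project XYZ tristimulus values to chromaticity (x, y) plus luminance Y."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("chromaticity is undefined for black (X + Y + Z = 0)")
    return X / total, Y / total, Y


def xyY_to_xyz(x, y, Y):
    """Recover XYZ from chromaticity and luminance (requires y != 0)."""
    if y == 0:
        raise ValueError("cannot invert when y = 0")
    return x / y * Y, Y, (1.0 - x - y) / y * Y


# Two arbitrary tristimulus triples (illustrative values only).
a = (0.2, 0.1, 0.7)
b = (3.0, 6.0, 1.0)
mix = tuple(p + q for p, q in zip(a, b))  # additive mixture, summed in XYZ

xa, ya, _ = xyz_to_xyY(*a)
xb, yb, _ = xyz_to_xyY(*b)
xm, ym, _ = xyz_to_xyY(*mix)

# Collinearity: the 2-D cross product of (b - a) and (mix - a) vanishes.
cross = (xb - xa) * (ym - ya) - (yb - ya) * (xm - xa)

# Position of the mixture along the segment from a (t = 0) to b (t = 1).
t = (xm - xa) / (xb - xa)

print(abs(cross) < 1e-12)  # the mixture lies on the line through a and b
print(t)                   # but not at the midpoint: roughly 0.909 here
```

The mixture sits much closer to `b` because each color pulls in proportion to its own X + Y + Z sum, not in proportion to the nominal 50/50 mix. That is the sense in which chromaticities do not combine linearly.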


There is no unique physical or perceptual definition of white. An SPD that is considered white will have CIE (x, y) coordinates roughly in the central area of the chromaticity diagram, in the region of (1⁄3, 1⁄3).


Additive mixture (RGB)

The simplest way to reproduce a wide range of colors is to mix light from three lights of different colors, usually red, green, and blue. Figure 7 overleaf illustrates additive reproduction. In physical terms, the spectra from each of the lights add together wavelength by wavelength to form the spectrum of the mixture.


As a consequence of the principle of superposition, the color of an additive mixture is a strict function of the colors of the primaries and the fraction of each primary that is mixed.


Additive reproduction is employed directly in a video projector, where the spectra from a red beam, a green beam, and a blue beam are physically summed at the surface of the projection screen.


Additive reproduction is also employed in a direct-view color CRT, but through slightly indirect means. The screen of a CRT comprises small dots that produce red, green, and blue light. When the screen is viewed from a sufficient distance, the spectra of these dots add at the observer’s retina.



In additive image reproduction, the white point is the chromaticity of the color reproduced by equal red, green, and blue components. White point is a function of the ratio (or balance) of power among the primaries.


It is often convenient for purposes of calculation to define white as a uniform SPD. This white reference is known as the equal-power illuminant, or CIE Illuminant E.


A more realistic reference that approximates daylight has been specified numerically by the CIE as Illuminant D65. You should use this unless you have a good reason to use something else. The print industry commonly uses D50, and photography commonly uses D55. These represent compromises between the conditions of indoor (tungsten) and daylight viewing.


Human vision adapts to white in the viewing environment. An image viewed in isolation – such as a slide projected in a dark room – creates its own white reference, and a viewer will be quite tolerant of errors in the white point. However, if the same image is viewed in the presence of an external white reference or a second image, then differences in white point can be objectionable.


Planck determined that the SPD radiated from a hot object – a black body radiator – is a function of the temperature to which the object is heated. A typical source of illumination has a heated object at its core, so it is often useful to characterize an illuminant by specifying the temperature (in units of kelvin, K) of a black body radiator that appears to have the same color.
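The dependence of a black body's SPD on temperature is given by Planck's law, and can be sampled at the same 10 nm intervals used earlier for SPDs. The sketch below is illustrative: the function names are hypothetical, and normalizing to 1.0 at 560 nm is an arbitrary choice made here so that curves at different temperatures can be compared.

```python
import math

H = 6.62607015e-34   # Planck constant, J*s
C = 2.99792458e8     # speed of light in vacuum, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K


def planck_radiance(wavelength_nm, temperature_k):
    """Spectral radiance of a black body (W * sr^-1 * m^-3), by Planck's law."""
    lam = wavelength_nm * 1e-9  # convert nm to m
    return (2.0 * H * C ** 2 / lam ** 5) / math.expm1(
        H * C / (lam * K_B * temperature_k)
    )


def blackbody_spd(temperature_k, start_nm=400, stop_nm=700, step_nm=10):
    """Relative SPD sampled from start_nm to stop_nm, scaled to 1.0 at 560 nm."""
    ref = planck_radiance(560, temperature_k)
    return [
        planck_radiance(nm, temperature_k) / ref
        for nm in range(start_nm, stop_nm + 1, step_nm)
    ]


warm = blackbody_spd(3200)   # roughly tungsten studio lighting
cool = blackbody_spd(9300)   # a very blue white

print(len(warm))             # 31 samples, 400 to 700 nm
print(warm[0] < warm[-1])    # 3200 K: more power at 700 nm than at 400 nm
print(cool[0] > cool[-1])    # 9300 K: more power at 400 nm than at 700 nm
```

The low-temperature curve rises toward long wavelengths and the high-temperature curve falls, which is why low color temperatures look yellowish and high ones look bluish.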


Although an illuminant can be specified informally by its color temperature, a more complete specification is provided by the chromaticity coordinates of the SPD of the source. Figure 8 shows the SPDs of several standard illuminants.


Camera white reference

There is an implicit assumption in television that the camera operates as if the scene were illuminated by a source having the chromaticity of CIE D65. In practice, the scene illumination is often deficient in the shortwave (blue) region of the spectrum. Television studio lighting is often accomplished by tungsten lamps, which are very yellow. This situation is accommodated by white balancing, that is, by adjusting the gain of the red, green, and blue components of the scene so that a white object reports the values that would be reported if the scene were illuminated by D65.
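White balancing as described above reduces to three channel gains. The sketch below assumes linear-light components and a target of equal R, G, and B for a white object. The function names and the sample reading are hypothetical, and real cameras apply this adjustment in hardware or firmware with their own conventions.

```python
def white_balance_gains(white_rgb):
    """Per-channel gains that map a measured white patch to equal R, G, B.

    Gains are normalized so that the green channel is left unchanged.
    """
    r, g, b = white_rgb
    if min(r, g, b) <= 0:
        raise ValueError("white patch must have positive components")
    return g / r, 1.0, g / b


def apply_gains(rgb, gains):
    """Scale linear-light RGB components by the given gains."""
    return tuple(c * k for c, k in zip(rgb, gains))


# Hypothetical linear-light reading of a white card under tungsten light:
# strong in red, deficient in blue.
tungsten_white = (1.25, 1.0, 0.5)

gains = white_balance_gains(tungsten_white)
balanced = apply_gains(tungsten_white, gains)

print(gains)     # red is attenuated, blue is boosted
print(balanced)  # the white card now reports equal components
```

Leaving green fixed is only one possible normalization. Whatever the convention, the blue gain exceeds the red gain under tungsten light, so noise in the blue channel is amplified the most.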


Monitor white reference

In an additive mixture the illumination of the reproduced image is generated entirely by the display device. In particular, reproduced white is determined by the characteristics of the display, and is not dependent on the environment in which the display is viewed. In a completely dark viewing environment such as a cinema theater, this is desirable. However, in an environment where the viewer’s field of view encompasses objects other than the display, the viewer’s notion of “white” is likely to be influenced or even dominated by what he perceives as “white” in the ambient. To avoid subjective mismatches, the chromaticity of white reproduced by the display and the chromaticity of white in the ambient should be reasonably close.


Modern blue CRT phosphors are more efficient with respect to human vision than red or green. In a quest for brightness at the expense of color accuracy, it is common for a computer display to have excessive blue content, about twice as blue as daylight, with white at about 9300 K. This situation can be corrected by calibrating the monitor to a white reference with a lower color temperature.


Characterization of RGB primaries

Additive reproduction is based on physical devices that produce all-positive SPDs for each primary. Physically and mathematically, the spectra add. The largest range of colors will be produced with primaries that appear red, green, and blue. Human color vision obeys the principle of superposition, so the color produced by any additive mixture of three primary spectra can be predicted by adding the corresponding fractions of the XYZ components of the primaries: the colors that can be mixed from a particular set of RGB primaries are completely determined by the colors of the primaries by themselves. Subtractive reproduction is much more complicated: the colors of mixtures are determined by the primaries and by the colors of their combinations.
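Predicting a mixture by adding fractions of the primaries' XYZ components is a short computation. In this sketch the function name `mix_xyz` and the XYZ values of the three primaries are hypothetical; they do not describe any real display.

```python
def mix_xyz(primaries_xyz, fractions):
    """XYZ of an additive mixture: fraction-weighted sum of the primaries' XYZ."""
    return tuple(
        sum(f * p[i] for f, p in zip(fractions, primaries_xyz))
        for i in range(3)
    )


# Hypothetical XYZ of each primary at full drive (illustrative values only).
RED = (0.5, 0.25, 0.0)
GREEN = (0.25, 0.5, 0.125)
BLUE = (0.25, 0.125, 1.0)

print(mix_xyz((RED, GREEN, BLUE), (1.0, 1.0, 1.0)))  # (1.0, 0.875, 1.125)
print(mix_xyz((RED, GREEN, BLUE), (1.0, 0.5, 0.0)))  # (0.625, 0.5, 0.0625)
```

Nothing beyond the three primary triples is needed, which is the point of the paragraph above: no measurement of the combinations themselves is required, unlike in subtractive reproduction.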


An additive RGB system is specified by the chromaticities of its primaries and its white point. The extent (gamut) of the colors that can be mixed from a given set of RGB primaries is given in the (x, y) chromaticity diagram by a triangle whose vertices are the chromaticities of the primaries.
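Whether a chromaticity falls inside a gamut triangle can be tested with three signed-area checks. The sketch below uses hypothetical function names. The sample primaries are the chromaticities standardized in ITU-R BT.709 (the Rec. 709 system discussed later in this article), quoted here as an assumption about the standard and not derived from this text.

```python
def _side(p, a, b):
    """Signed area test: positive or negative depending on which side of a->b p lies."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def in_gamut(xy, red_xy, green_xy, blue_xy):
    """True if chromaticity xy lies inside (or on the edge of) the primaries' triangle."""
    d1 = _side(xy, red_xy, green_xy)
    d2 = _side(xy, green_xy, blue_xy)
    d3 = _side(xy, blue_xy, red_xy)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    # Inside means the point is on the same side of all three edges.
    return not (has_neg and has_pos)


# (x, y) of red, green, blue per ITU-R BT.709; D65 white point.
REC709 = ((0.640, 0.330), (0.300, 0.600), (0.150, 0.060))
D65 = (0.3127, 0.3290)

print(in_gamut(D65, *REC709))           # True: white lies inside the triangle
print(in_gamut((0.10, 0.80), *REC709))  # False: beyond the green vertex
```

The test covers chromaticity only. A color also needs a luminance that the display can reach, so a point inside the triangle is necessary but not sufficient for reproducibility.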


In computing, there are no standard primaries and there is no standard white point. If you have an RGB image but have no information about its chromaticities, you cannot accurately determine the colors represented by the image data.


The NTSC in 1953 specified a set of primaries that were representative of phosphors used in color CRTs of that era. However, phosphors changed over the years, primarily in response to market pressures for brighter receivers, and by the time of the first videotape recorder the primaries in use were quite different than those “on the books.” So although you may see the NTSC primary chromaticities documented, they are of no practical use today.


Contemporary studio monitors have slightly different standards in North America, Europe, and Japan. However, international agreement [5] has been obtained on primaries for high-definition television (HDTV), and these primaries are closely representative of contemporary monitors in studio video, computing, and computer graphics. The standard is formally denoted ITU-R Recommendation BT.709 (formerly CCIR Rec. 709). I’ll call it Rec. 709. The Rec. 709 primaries and D65 white point are these:

        R        G        B        white
x       0.640    0.300    0.150    0.3127
y       0.330    0.600    0.060    0.3290

For a discussion of nonlinear RGB in computer graphics, read Lindbloom’s SIGGRAPH paper [6]. For technical details on monitor calibration, consult Cowan [7].


Video standards specify abstract R’G’B’ systems that are closely matched to the characteristics of real monitors. Physical devices that produce additive color involve tolerances and uncertainties, but if you have a monitor that conforms to Rec. 709 within some tolerance, you can consider the monitor to be device-independent.
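Once the primaries and white point are fixed, the matrix that takes linear-light RGB to CIE XYZ follows from them. The sketch below scales each primary so that R = G = B = 1 reproduces the white point at Y = 1. The function names are hypothetical, and the chromaticities are those standardized in ITU-R BT.709, quoted as an assumption. The matrix applies to linear-light RGB, not to nonlinear R’G’B’.

```python
def xy_to_xyz(x, y):
    """XYZ of a color with chromaticity (x, y), scaled to luminance Y = 1."""
    return (x / y, 1.0, (1.0 - x - y) / y)


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def rgb_to_xyz_matrix(red_xy, green_xy, blue_xy, white_xy):
    """3x3 matrix from linear RGB to XYZ, with R = G = B = 1 giving white at Y = 1."""
    prim = [xy_to_xyz(*red_xy), xy_to_xyz(*green_xy), xy_to_xyz(*blue_xy)]
    white = xy_to_xyz(*white_xy)
    # Rows are X, Y, Z; columns are the unit-luminance primaries.
    p = [[prim[col][row] for col in range(3)] for row in range(3)]
    d = det3(p)
    # Cramer's rule: solve p * scale = white for each primary's luminance share.
    scale = []
    for col in range(3):
        q = [r[:] for r in p]
        for row in range(3):
            q[row][col] = white[row]
        scale.append(det3(q) / d)
    return [[p[row][col] * scale[col] for col in range(3)] for row in range(3)]


M = rgb_to_xyz_matrix(
    (0.640, 0.330), (0.300, 0.600), (0.150, 0.060), (0.3127, 0.3290)
)
for row in M:
    print([round(v, 4) for v in row])
```

The middle row of `M` holds the luminance contributions of red, green, and blue, which come out near 0.2126, 0.7152, and 0.0722 for these chromaticities and sum to 1. Cramer's rule is used only to keep the example free of dependencies; a linear algebra library would do the same job.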


The importance of Rec. 709 as an interchange standard in studio video, broadcast television, and high-definition television, and the perceptual basis of the standard, assure that its parameters will be used even by devices such as flat-panel displays that do not have the same physics as CRTs.