PNG (Portable Network Graphics) Specification, Version 0.91

Revision date: 8 November, 1995

9. Recommendations for Encoders

This chapter gives some recommendations for encoder behavior. The only absolute requirement on a PNG encoder is that it produce files which conform to the format specified in the preceding chapters. However, best results will usually be achieved by following these recommendations.

Bit depth scaling

When encoding input samples that have a bit depth that cannot be directly represented in PNG, the encoder must scale the samples up to a bit depth that is allowed by PNG. The most accurate scaling method is the linear equation

  output = ROUND(input * MAXOUTSAMPLE / MAXINSAMPLE)

where the input samples range from 0 to MAXINSAMPLE and the outputs range from 0 to MAXOUTSAMPLE (which is (2^bitdepth)-1).
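As a minimal sketch (not part of the specification), the linear equation can be evaluated in integer arithmetic. The function name scale_linear and its parameters are illustrative assumptions, and ROUND is taken here to mean rounding to the nearest integer with halves rounded up.

```python
def scale_linear(sample: int, max_in: int, max_out: int) -> int:
    """Return ROUND(sample * max_out / max_in) using integers only.

    max_in and max_out play the roles of MAXINSAMPLE and MAXOUTSAMPLE.
    """
    if not 0 <= sample <= max_in:
        raise ValueError("sample outside the input range")
    # Adding max_in before dividing by 2 * max_in rounds halves upward.
    return (sample * max_out * 2 + max_in) // (2 * max_in)


if __name__ == "__main__":
    # 5-bit sample 27 scaled to 8 bits.
    print(scale_linear(27, 31, 255))
```

Because the function takes the maximum values directly rather than bit depths, the same sketch also covers source ranges that are not a power of 2.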

A close approximation to the linear scaling method can be achieved by "left bit replication", which is shifting the valid bits to begin in the most significant bit and repeating the most significant bits into the open bits. This method is often faster to compute than linear scaling. As an example, assume that 5-bit samples are being scaled up to 8 bits. If the source sample value is 27 (in the range from 0-31), then the original bits are:

  4 3 2 1 0
  ---------
  1 1 0 1 1

Left bit replication gives a value of 222:

  7 6 5 4 3  2 1 0
  ----------------
  1 1 0 1 1  1 1 0
  |=======|  |===|
      |      Leftmost Bits Repeated to Fill Open Bits
      |
  Original Bits

which matches the value computed by the linear equation. Left bit replication usually gives the same value as linear scaling, and is never off by more than one.
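The replication procedure described above might be sketched as follows. The function name scale_replicate is a hypothetical helper, not something defined by this specification; it assumes the output depth is at least the input depth.

```python
def scale_replicate(sample: int, in_bits: int, out_bits: int) -> int:
    """Scale a sample up by left bit replication."""
    if out_bits < in_bits:
        raise ValueError("out_bits must be >= in_bits")
    out = 0
    shift = out_bits - in_bits
    # Place whole copies of the sample, starting at the most significant bit.
    while shift > 0:
        out |= sample << shift
        shift -= in_bits
    # The final copy either fits exactly (shift == 0) or is cut off so that
    # only its most significant bits fill the remaining open bits.
    out |= sample >> -shift
    return out


if __name__ == "__main__":
    # The 5-bit value 27 (binary 11011) becomes binary 11011110.
    print(scale_replicate(27, 5, 8))
```

For the example in the text, scale_replicate(27, 5, 8) yields 222.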

A distinctly less accurate approximation is obtained by simply left-shifting the input value and filling the low order bits with zeroes. This scheme cannot reproduce white exactly, since it does not generate an all-ones maximum value; the net effect is to darken the image slightly. This method is not recommended in general, but it does have the effect of improving compression, particularly when dealing with greater-than-eight-bit sample depths. Since the relative error introduced by zero-fill scaling is small at high bit depths, some encoders may choose to use it. Zero-fill should not be used for alpha channel data, however, since many decoders will special-case alpha values of all zeroes and all ones. It is important to represent both those values exactly in the scaled data.

When the encoder writes an sBIT chunk, it is required to do the scaling in such a way that the high-order bits of the stored samples match the original data. That is, if the sBIT chunk specifies a bit depth of S, the high-order S bits of the stored data must agree with the original S-bit data values. This allows decoders to recover the original data by shifting right. The low order bits are not constrained. Note that all the above scaling methods meet this restriction.
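To illustrate the sBIT constraint, the following sketch shows how a decoder could recover the original value by shifting right. The function name recover_original is an assumption made for this example.

```python
def recover_original(stored: int, stored_bits: int, sbit: int) -> int:
    """Recover the original sbit-bit sample from a stored sample.

    This works only if the encoder kept the original data in the
    high-order sbit bits, as required when an sBIT chunk is written.
    """
    if not 0 < sbit <= stored_bits:
        raise ValueError("sbit must be between 1 and stored_bits")
    return stored >> (stored_bits - sbit)


if __name__ == "__main__":
    # 222 is the 5-bit value 27 scaled to 8 bits by left bit replication;
    # 216 is the same value scaled by zero-fill. Both recover to 27.
    print(recover_original(222, 8, 5), recover_original(216, 8, 5))
```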

When scaling up source data, it is recommended that the low-order bits be filled consistently for all samples; that is, the same source value should generate the same sample value at any pixel position. This improves compression by reducing the number of distinct sample values. However, this is not a requirement, and some encoders may choose not to follow it. For example, an encoder might instead dither the low-order bits, improving displayed image quality at the price of increasing file size.

In some applications the original source data may have a range that is not a power of 2. The linear scaling equation still works for this case, although the shifting methods do not. It is recommended that an sBIT chunk not be written for such images, since sBIT suggests that the original data range was 0..2^S-1.

Encoder gamma handling

See the Gamma Tutorial appendix if you aren't already familiar with gamma issues.

If it is possible for the encoder to determine the image gamma, or to make a strong guess based on the hardware on which it runs, then the encoder is strongly encouraged to output the gAMA chunk.

If the encoder is compiled as a portion of a computer image renderer and has access to sample intensity values in floating-point (or high-precision integer) form, it is recommended that the encoder perform its own gamma encoding before quantizing the data to integer values for storage in the file. Applying gamma encoding at this stage results in images with fewer banding artifacts at the same sample bit depth, or allows smaller samples while retaining the same quality.

A linear intensity level, expressed as a floating-point value in the range 0 to 1, may be converted to a gamma-corrected sample value by

  sample = ROUND((intensity ^ encoder_gamma) * MAXSAMPLEVAL)
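A minimal sketch of this conversion is shown below. The function name gamma_encode, the default bit depth of 8, and the clamping of out-of-range intensities are assumptions for illustration; ROUND is implemented as rounding halves upward.

```python
def gamma_encode(intensity: float, encoder_gamma: float, bit_depth: int = 8) -> int:
    """Convert a linear intensity in [0, 1] to a gamma-corrected sample."""
    max_sample = (1 << bit_depth) - 1  # MAXSAMPLEVAL for this bit depth
    # Clamp so that slightly out-of-range renderer output stays valid
    # (an assumption; the equation itself expects 0 <= intensity <= 1).
    clamped = min(max(intensity, 0.0), 1.0)
    return int((clamped ** encoder_gamma) * max_sample + 0.5)


if __name__ == "__main__":
    # Mid-gray linear intensity with an encoder_gamma of 0.45.
    print(gamma_encode(0.5, 0.45))
```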

If the renderer is simultaneously displaying pixels on the screen and writing them to the file, it should calculate an encoder_gamma value that gives correct display using

  encoder_gamma = viewing_gamma / display_gamma

This will allow PNG viewers to reproduce what is being shown on screen.
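As a worked example of this ratio, the sketch below uses a viewing_gamma of 1.0 and a display_gamma of 2.2. Both numbers are assumed values chosen only for illustration, and the function name is hypothetical.

```python
def compute_encoder_gamma(viewing_gamma: float, display_gamma: float) -> float:
    """Return encoder_gamma = viewing_gamma / display_gamma."""
    if display_gamma <= 0:
        raise ValueError("display_gamma must be positive")
    return viewing_gamma / display_gamma


if __name__ == "__main__":
    # Assumed example values: viewing_gamma 1.0 on a display with gamma 2.2.
    print(round(compute_encoder_gamma(1.0, 2.2), 4))
```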

If the image is being written to a file only, the encoder_gamma value can be selected somewhat arbitrarily. A value of 0.45 is generally a good choice because of its use in video systems. Whichever encoder_gamma value is selected, the PNG gAMA chunk should be written with that value.

Computer graphics renderers often do not perform gamma encoding, instead making sample values directly proportional to scene brightness. This "linear" sample encoding is equivalent to gamma encoding with a gamma of 1.0, so graphics programs that produce linear samples should always put out a gAMA chunk specifying a gamma of 1.0.

It is not recommended that file format converters attempt to convert supplied images to a different gamma. Store the data in the PNG file without conversion, and record the source gamma if it is known. Gamma alteration at file conversion time is a bad idea because gamma adjustment of digitized data is inherently lossy, due to roundoff error. (Eight or so bits is not really enough accuracy.) Thus a conversion-time gamma change permanently degrades the image. Worse, if the eventual decoder wants the data with some other gamma, then two conversions occur, each introducing roundoff error. It is better to store the data losslessly and incur at most one conversion when the image is finally displayed.

If the encoder or file format converter does not have knowledge of how an image was originally created, but does know that the image has been displayed satisfactorily on a display having gamma display_gamma under lighting conditions for which a particular viewing_gamma is appropriate, then the image can be marked as having a file_gamma given by

  file_gamma = viewing_gamma / display_gamma

Gamma does not apply to alpha samples; alpha is always represented linearly.

See Recommendations for Decoders: Decoder gamma handling for more details.

Alpha channel creation

The alpha channel may be regarded either as a mask that temporarily hides transparent parts of the image, or as a means for constructing a non-rectangular image. In the first case, the color values of fully transparent pixels should be preserved for future use. In the second case, the transparent pixels carry no useful data and are simply there to fill out the rectangular image area required by PNG. In this case, fully transparent pixels should all be assigned the same color value for best compression.

Encoders should keep in mind the possibility that a viewer will ignore transparency control. Hence, the colors assigned to transparent pixels should be reasonable background colors whenever feasible.

For applications that do not require a full alpha channel, or cannot afford the price in compression efficiency, the tRNS transparency chunk is also available.

If the image has a known background color, this color should be written in the bKGD chunk. Even viewers that ignore transparency may use the bKGD color to fill unused screen area.

If the original image has premultiplied (also called "associated") alpha data, convert it to PNG's non-premultiplied format by dividing each sample value by the corresponding alpha value, then multiplying by the maximum value for the image bit depth, and rounding to the nearest integer. In valid premultiplied data, the sample values never exceed their corresponding alpha values, so the result of the division should always be in the range 0 to 1. If the alpha value is zero, output black (zeroes).
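The conversion just described might look like the following sketch. It assumes that the premultiplied sample and the alpha value are integers on the same 0 to maximum scale; the function name and the clamping of invalid input are illustrative assumptions.

```python
def unpremultiply(sample: int, alpha: int, bit_depth: int = 8) -> int:
    """Convert one premultiplied sample to non-premultiplied form."""
    max_val = (1 << bit_depth) - 1
    if alpha == 0:
        return 0  # fully transparent: output black
    # ROUND(sample / alpha * max_val) in integer arithmetic, halves upward.
    value = (sample * max_val * 2 + alpha) // (2 * alpha)
    # Valid premultiplied data never has sample > alpha; clamping guards
    # against invalid input (an assumption, not a requirement of the text).
    return min(value, max_val)


if __name__ == "__main__":
    # A half-intensity sample under half-coverage alpha.
    print(unpremultiply(64, 128))
```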

Suggested palettes

A PLTE chunk may appear in truecolor PNG files. In such files, the chunk is not an essential part of the image data, but simply represents a suggested palette that viewers may use to present the image on indexed-color display hardware. A suggested palette is of no interest to viewers running on truecolor hardware.

If an encoder chooses to provide a suggested palette, it is recommended that a hIST chunk also be written to indicate the relative importance of the palette entries. The histogram values are most easily computed as "nearest neighbor" counts, that is, the approximate usage of each palette entry if no dithering is applied. (These counts will often be available for free as a consequence of developing the suggested palette.)

For images of color type 2 (truecolor without alpha channel), it is recommended that the palette and histogram be computed with reference to the RGB data only, ignoring any transparent-color specification. If the file uses transparency (has a tRNS chunk), viewers can easily adapt the resulting palette for use with their intended background color. They need only replace the palette entry closest to the tRNS color with their background color (which may or may not match the file's bKGD color, if any).
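On the viewer side, the palette adaptation described here could be sketched as below. The use of squared Euclidean distance in RGB to decide which entry is "closest" is an assumption, as is the function name.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def adapt_palette(palette: List[RGB], trns_color: RGB, background: RGB) -> List[RGB]:
    """Replace the palette entry closest to the tRNS color with the background."""
    if not palette:
        raise ValueError("palette must not be empty")

    def dist2(entry: RGB) -> int:
        # Squared Euclidean distance in RGB (assumed closeness measure).
        return sum((e - t) ** 2 for e, t in zip(entry, trns_color))

    closest = min(range(len(palette)), key=lambda i: dist2(palette[i]))
    adapted = list(palette)  # leave the caller's palette untouched
    adapted[closest] = background
    return adapted


if __name__ == "__main__":
    suggested = [(0, 0, 0), (255, 0, 255), (255, 255, 255)]
    print(adapt_palette(suggested, (250, 0, 250), (10, 20, 30)))
```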

For images of color type 6 (truecolor with alpha channel), it is recommended that the palette and histogram be computed with reference to the image as it would appear after compositing against the background color specified by bKGD, or against a black background if bKGD is not present. This definition is necessary to ensure that useful palette entries are generated for pixels having fractional alpha values. The resulting palette will probably only be of use to viewers that present the image against the same background. Viewers should ignore the palette if they intend to use a different background. It is recommended that PNG editors delete or recompute the palette if they alter the bKGD chunk in an image of color type 6.

If a viewer intends to present a transparent image against a background that is more complex than a single color, it is unlikely that the suggested palette will be of any use. In this case it is best to perform the compositing step on the truecolor PNG image and background image, then quantize the resulting image.

Filter selection

For images of color type 3 (indexed color), filter type 0 (none) is usually the most effective.

Filter type 0 is also recommended for images of bit depths less than 8. For low-bit-depth grayscale images, it may be a net win to expand the image to 8-bit representation and apply filtering, but this is rare.

For truecolor and grayscale images, any of the five filters may prove the most effective. If an encoder wishes to use a fixed filter choice, the Paeth filter is most likely to be the best.

For best compression of truecolor and grayscale images, we recommend an adaptive filtering approach in which a filter is chosen for each scanline. The following simple heuristic has performed well in early tests: compute the output scanline using all five filters, and select the filter which gives the smallest sum of absolute values of outputs. (Consider the output bytes as signed differences for this test.) This method usually outperforms any single fixed filter choice. However, it is likely that much better heuristics will be found as more experience is gained with PNG.
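The heuristic above might be sketched as follows. The predictors for filter types 0 through 4 (None, Sub, Up, Average, Paeth) follow the filter definitions given elsewhere in this specification; the function names, the bpp parameter (bytes per complete pixel), and the tie-breaking rule (the lowest filter type wins) are assumptions of this sketch.

```python
def paeth(a: int, b: int, c: int) -> int:
    """Paeth predictor: a = left, b = above, c = upper left."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    return b if pb <= pc else c


def filter_scanline(ftype: int, cur: bytes, prev: bytes, bpp: int) -> bytes:
    """Apply filter type 0..4 to one scanline; prev is the scanline above."""
    out = bytearray(len(cur))
    for i, x in enumerate(cur):
        a = cur[i - bpp] if i >= bpp else 0   # byte to the left
        b = prev[i]                           # byte above
        c = prev[i - bpp] if i >= bpp else 0  # byte above and to the left
        pred = (0, a, b, (a + b) // 2, paeth(a, b, c))[ftype]
        out[i] = (x - pred) & 0xFF
    return bytes(out)


def choose_filter(cur: bytes, prev: bytes, bpp: int) -> int:
    """Pick the filter with the smallest sum of absolute signed outputs."""
    best_type, best_cost = 0, None
    for ftype in range(5):
        filtered = filter_scanline(ftype, cur, prev, bpp)
        # Treat each output byte as a signed difference and take its magnitude.
        cost = sum(v if v < 128 else 256 - v for v in filtered)
        if best_cost is None or cost < best_cost:
            best_type, best_cost = ftype, cost
    return best_type


if __name__ == "__main__":
    # A smooth horizontal ramp on the first scanline (previous line all zero).
    print(choose_filter(bytes([10, 20, 30, 40]), bytes(4), 1))
```

For the first scanline of an image, prev is taken to be all zeroes, as in the usage example.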

Filtering according to these recommendations is effective on interlaced as well as noninterlaced images.

Text chunk processing

Note that a nonempty keyword must be provided for each text chunk. The generic keyword "Comment" may be used if no better description of the text is available.

Encoders should discourage the creation of single lines of text longer than 79 characters, in order to facilitate easy reading.

If an encoder chooses to support output of zTXt compressed text chunks, it is recommended that text less than 1K (1024 bytes) in size be output using uncompressed tEXt chunks. In particular, it is recommended that the basic title and author keywords always be output using uncompressed tEXt chunks. Lengthy disclaimers, on the other hand, are an ideal candidate for zTXt.
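One way an encoder might apply this recommendation is sketched below. The function name and the supports_ztxt flag are hypothetical; "Title" and "Author" are assumed here to be the exact keyword spellings meant by "the basic title and author keywords".

```python
def text_chunk_type(keyword: str, text: bytes, supports_ztxt: bool = True) -> str:
    """Choose between tEXt and zTXt for one keyword/text pair."""
    if not keyword:
        raise ValueError("a nonempty keyword is required")
    always_plain = keyword in ("Title", "Author")
    # Text under 1K (1024 bytes) is written uncompressed.
    if not supports_ztxt or always_plain or len(text) < 1024:
        return "tEXt"
    return "zTXt"


if __name__ == "__main__":
    print(text_chunk_type("Comment", b"short note"))
    print(text_chunk_type("Disclaimer", b"x" * 5000))
```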

Placing large tEXt and zTXt chunks after the image data (after IDAT) may speed up image display in some situations, since the decoder won't have to read over the text to get to the image data. But it is recommended that small text chunks, such as the image title, appear before IDAT.

Use of private chunks

Applications may use PNG private chunks to carry information that need not be understood by other applications. Such chunks must be given names with lowercase second letters, to ensure that they can never conflict with any future public chunk definition. Note, however, that there is no guarantee that some other application will not use the same private chunk name. If you use a private chunk type, it is prudent to store additional identifying information at the beginning of the chunk data.

Please note that if you use a private chunk for information that is not essential to view the image, and have any desire whatsoever that others not using your own viewer software be able to view the image, you should use an ancillary chunk type (first character is lowercase) rather than a critical chunk type (first character uppercase).
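The naming rules mentioned in the last two paragraphs can be checked mechanically. The sketch below inspects only the first two letters of a chunk name; the function name and the example name "prVt" are hypothetical.

```python
def chunk_properties(name: str) -> dict:
    """Report the ancillary and private properties of a chunk name."""
    if len(name) != 4 or not (name.isascii() and name.isalpha()):
        raise ValueError("chunk names are four ASCII letters")
    return {
        "ancillary": name[0].islower(),  # lowercase first letter: not critical
        "private": name[1].islower(),    # lowercase second letter: private
    }


if __name__ == "__main__":
    for chunk in ("IDAT", "tEXt", "prVt"):
        print(chunk, chunk_properties(chunk))
```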

If you want others outside your organization to understand a chunk type that you invent, contact the maintainers of the PNG specification to submit a proposed chunk name and definition for addition to the list of special-purpose public chunks (see Additional Chunk Types). Note that a proposed public chunk name (with uppercase second letter) must not be used in publicly available software or files until registration has been approved.

If an ancillary chunk is to contain textual information that might be of interest to a human user, it is recommended that a special chunk type not be used. Instead use a tEXt chunk and define a suitable keyword. In this way, the information will be available to users not using your software.

Keywords should be chosen to be reasonably self-explanatory, since the idea is to let other users figure out what the chunk contains. If of general usefulness, new keywords for tEXt chunks may be registered with the maintainers of the PNG specification.
