PBF (Portable Bitmap Format) Specification, Third Draft
By Thomas Boutell, boutell@netcom.com, 1/15/1995
Permission granted to reproduce this specification in complete
and unaltered form. Excerpts may be printed with the
following notice: "excerpted from the PBF
(Portable Bitmap Format) specification by
Thomas Boutell." No notice is required in software
that follows this specification; notice is only required
when reproducing or excerpting from the specification itself.
The author wishes to acknowledge the contributions of the New
Graphics Format mailing list and the readers of comp.graphics.
(Mr. Boutell is solely responsible for errors of fact or design
in the PBF specification, however.)
This is the third draft of the PBF specification discussion
document, replacing the second draft. There are many
significant changes from the previous drafts.
This draft is intended solely to generate comments and
does not represent the final standard.
1. Rationale
The PBF format is intended to provide a portable,
legally unencumbered, simple, lossless, streaming-capable,
well-specified standard for bitmapped image files which gives
new features to the end user at minimal cost to the developer.
It has been asked why the PBF format is not simply an
extension of the GIF format. The short answer is that the GIF
format is embroiled in legal disputes, does not support
24-bit images and lacks an alpha channel mechanism.
It has been asked why the PBF format is not TIFF, or
a subset of TIFF. The answer is that TIFF does not support
a compression scheme that is not legally encumbered,
and that a subset of TIFF would simply frustrate
users making the reasonable assumption that a file
saved as TIFF from Software XYZ will load into a
program supporting our flavor of TIFF. Implementing
full TIFF would violate the simplicity constraint.
It has been asked why the PBF format is not IFF,
or a sub- or superset of IFF. The same concern applies
as with TIFF: users with software that purports to
generate IFF files will not be pleased when those
files do not load in programs supporting the new
specification. In addition, the IFF specification
has rarely been accurately implemented and there
is considerable disagreement among implementations.
It has been asked why PBF does not include
lossy compression. The answer is that JPEG already does
an excellent job of lossy compression, and there is
no reason to repeat that effort. Different tools,
different jobs.
2. Design Differences from Other Formats
PBF has been expressly designed not to be completely
dependent on a single compression technique. Although
inflate/deflate compression is mentioned in this
document, PBF would still exist without it.
PBF supports an alpha channel instead of the
transparency-index approach used in GIF. An alpha
channel is much more flexible than a transparency
index, but can be just as simple in palette-color
images; conversion from one format to the other
will not be difficult to accomplish without loss
of transparency.
3. Data Representation Note
Byte Order
All integers which are not 1 byte integers will be in
network byte order, which is to say the most significant
byte comes first, and the less significant bytes in
descending order of significance (simply MSB LSB
for two-byte integers, B3 B2 B1 B0 for 4-byte
integers). References to bit 7 refer to the
highest bit (128) of a byte; references to
bit 0 refer to the lowest bit (1) of a byte.
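As an illustration of the byte order rule, the following is a
minimal Python sketch using the standard struct module. The
function names are hypothetical and are not part of this
specification.

```python
import struct

def pack_u32(value):
    """Encode a 4-byte unsigned integer in network byte order (B3 B2 B1 B0)."""
    return struct.pack(">I", value)

def unpack_u16(data):
    """Decode a 2-byte unsigned integer stored as MSB LSB."""
    return struct.unpack(">H", data)[0]
```

For example, the value 1 is written as the four bytes 00 00 00 01,
and the two bytes 01 02 decode to 258.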
Color Values and Gamma Correction
All color values range from zero (black) to the
maximum value (most intense). Color values
are based on a flat gamma response of 1.0, and
display hardware with other gamma values should
compensate accordingly. A display with a gamma
response of 2.0 will render midlevel grays
too dark if this is not compensated for.
(This is not at all uncommon.)
Thus, if your display hardware has a gamma
value of 2.0, color values should be converted
to values between 0 and 1 and raised to
the (1/2.0) power for use on the actual display.
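The compensation described above might be sketched as follows.
The function name and its parameters are illustrative
assumptions, as is rounding to the nearest integer, which this
draft does not specify.

```python
def gamma_correct(value, max_value=255, display_gamma=2.0):
    """Convert a stored (gamma 1.0) sample to a value for the display."""
    normalized = value / max_value                   # map to the range 0..1
    corrected = normalized ** (1.0 / display_gamma)  # raise to 1/gamma
    return round(corrected * max_value)              # back to the integer range
```

With a display gamma of 2.0, a stored 8-bit value of 64 becomes
128, while 0 and 255 are unchanged.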
4. The Format
The Identification Header
The first four bytes always contain the following
ASCII characters:
.PBF
(The dot is included to avoid confusion with files
such as this one which discuss PBF as opposed to
being PBF files themselves.)
The Main Section
The remainder of the file consists of a series of
chunks. Each chunk consists of a 4-byte
chunk type made up of UPPERCASE
ASCII letters and spaces (ASCII 32), a 4-byte UNSIGNED
length (not including itself or the chunk type), and the
data bytes appropriate to that chunk, if any. Note that the
length allows a chunk to be skipped even if the implementation
does not recognize that particular chunk type. The last
chunk should always be an EOF chunk.
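The chunk layout lends itself to a simple reading loop. The
following Python sketch is one possible reading of the rules
above, not a reference implementation; the name read_chunks is
hypothetical. Because this draft does not say how the
three-letter name EOF fills the 4-byte chunk type field, the
sketch simply reads until the stream is exhausted rather than
testing for the EOF chunk.

```python
import struct

SIGNATURE = b".PBF"

def read_chunks(stream):
    """Yield (chunk_type, data) pairs from a binary stream at offset 0."""
    if stream.read(4) != SIGNATURE:
        raise ValueError("missing .PBF identification header")
    while True:
        header = stream.read(8)          # 4-byte type + 4-byte length
        if not header:
            return                       # clean end of stream
        if len(header) != 8:
            raise ValueError("truncated chunk header")
        chunk_type = header[:4].decode("ascii")
        (length,) = struct.unpack(">I", header[4:])  # network byte order
        data = stream.read(length)
        if len(data) != length:
            raise ValueError("truncated chunk data")
        yield chunk_type, data
```

A caller that does not recognize a chunk type can discard the
data and continue with the next chunk.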
Note also that the same chunk type can appear more than
once, if so specified in the description
of the chunk. This is sometimes necessary in order
to implement streaming encoders.
The chunk-ordering mechanism present in the first two
drafts has been dropped. Instead, rules regarding chunk
order are stated in the description of each chunk.
Ancillary and Critical Chunks
Chunks which are not strictly necessary in order to
meaningfully display the contents of the file are known as
"ancillary" chunks, and their names must begin with
a capital "A" character.
Chunks which are critical to the successful display of the
file's contents begin with any other uppercase letter.
Critical chunks are necessary in order to properly
display the contents of the file. If an implementation
encounters a critical chunk type it does not know
how to handle, it must indicate this to the user and
not display the contents of the file. The image header chunk
(HEAD) is an example of a critical chunk.
A hypothetical vector-graphics chunk would also be a critical
chunk, since without rendering it the image would appear
to be blank, or would contain a background bitmap
with no other information.
Ancillary chunks carry information that enhances
the image in some fashion, but without which the image
can still be successfully displayed. Examples are the
comment and copyright chunks.
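The handling of unrecognized chunks can be sketched as below.
The function name, the return values and the KNOWN_CHUNKS set
are assumptions made for illustration. The EOF chunk is left out
of the set because this draft does not spell out its 4-byte
form.

```python
# Chunk types this hypothetical implementation knows how to handle.
KNOWN_CHUNKS = frozenset({"HEAD", "PLTE", "IDAT", "ACPY", "ACMT"})

def classify_chunk(chunk_type, known=KNOWN_CHUNKS):
    """Decide what to do with a chunk based on its 4-character type."""
    if chunk_type in known:
        return "process"
    if chunk_type.startswith("A"):
        return "skip"    # unknown ancillary chunk: safe to ignore
    return "reject"      # unknown critical chunk: do not display the file
```

A viewer would report an error to the user on "reject" and
silently continue on "skip".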
Proprietary Chunks
If you want others outside your organization to understand
a chunk type that you invent, CONTACT THE AUTHOR
OF THE PBF SPECIFICATION (boutell@netcom.com) and
specify the format of the chunk's data and your
preferred chunk type. The author will assign a permanent,
unique chunk type. The chunk type will be publicly listed
in an appendix of extended chunk types which can be
optionally implemented. In the event that Mr. Boutell
is unable to maintain the specification, the task will
be passed on to a qualified volunteer.
If you do not require or desire that others outside your
organization understand the chunk type, you may
use a chunk name beginning with Q (for critical
chunks) or with AQ (for ancillary chunks).
Chunk types with these prefixes
will never be assigned in the public specification.
Please note that if you want to use these chunks for
information that is not essential to view the image,
and have any desire whatsoever that others not using your
internal viewer software be able to view the image,
you should use AQ rather than Q. Also note that
others may use the same proprietary prefixes,
so it would be advantageous to keep additional
identifying information at the beginning of
the chunk.
Standard Chunks
All PBF implementations must accept the following
chunk types in order to be considered
PBF-compliant. All implementations must understand
and successfully render the critical chunks below.
Standalone image viewers should also be capable of
displaying the ancillary chunks below, such as the copyright
notice, but this is not necessary for applications in which
many images may be displayed at once (e.g.,
WWW browsers).
Chunk Type Description
HEAD Bitmapped image header
Width: 4 bytes
Height: 4 bytes
Bit depth: 1 byte
Color type: 1 byte
Compression type: 1 byte
Interlace type: 1 byte
Width and height are 4-byte integers.
Bit depth is a single-byte integer. Valid values
that software must support are 1, 2, 4, and 8.
A value of 16 is also valid, but support for
this is optional. Software that does not support
a bit depth of 16 should acknowledge this if
possible rather than indicating that the
image is at fault. Bit depths of 16 should
in any case never appear with color type 1.
Color type is a single-byte integer. Valid values
are 1, 2, 3 and 4. Color type determines the
interpretation of the image data.
Color Type  Valid Bit Depths  Interpretation
1           1,2,4,8           Each pixel value is a palette
                              index; a palette chunk will appear
2           1,2,4,8,16        Each pixel value is a grayscale
                              level, where the largest value is
                              white, and zero is black
3           8,16              Each pixel value is a three-value
                              series: red (0 = black, max = red),
                              green (0 = black, max = green),
                              blue (0 = black, max = blue)
4           8,16              Each pixel value is a four-value
                              series: red (0 = black, max = red),
                              green (0 = black, max = green),
                              blue (0 = black, max = blue),
                              alpha (0 = transparent,
                              max = opaque)
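As a rough illustration, the 12 data bytes of the HEAD chunk
could be decoded and checked against the table above as
follows. The Head type and the parse_head function are
hypothetical names, and rejecting undefined compression and
interlace values is a choice of this sketch.

```python
import struct
from typing import NamedTuple

# Valid bit depths for each color type, taken from the table above.
VALID_DEPTHS = {1: {1, 2, 4, 8}, 2: {1, 2, 4, 8, 16}, 3: {8, 16}, 4: {8, 16}}

class Head(NamedTuple):
    width: int
    height: int
    bit_depth: int
    color_type: int
    compression: int
    interlace: int

def parse_head(data):
    """Decode the HEAD chunk data: two 4-byte and four 1-byte integers."""
    if len(data) != 12:
        raise ValueError("HEAD chunk data must be 12 bytes")
    head = Head(*struct.unpack(">IIBBBB", data))
    if head.bit_depth not in VALID_DEPTHS.get(head.color_type, ()):
        raise ValueError("invalid color type / bit depth combination")
    if head.compression != 0 or head.interlace not in (0, 1):
        raise ValueError("undefined compression or interlace type")
    return head
```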
Compression type indicates the compression scheme
which will be used to compress the image data.
This draft proposes use of the inflate/deflate compression
scheme, an LZ77 derivative which is used in zip, gzip, pkzip
and related programs, because extensive research has been done
supporting its legality. Inflate and deflate code
is available in the zip/unzip packages with a very
permissive license (yes, permissive enough for
commercial purposes, see those packages for details).
At present, only compression type 0 (inflate/deflate
compression) is defined, so all standard PBF
images will be compressed using this scheme.
Interlace Type
At present, there are two legal values for
interlace type: 0 (no interlace) or 1
(line-wise interlace).
With interlace type 0, rows are laid out
continuously from top to bottom.
With interlace type 1, rows are stored in the
following order:
Every eighth row, starting from row 0
Every eighth row, starting from row 4
Every fourth row, starting from row 2
Every second row, starting from row 1
The purpose of this feature is to allow images
to "fade in" in a simple fashion that does
minimal damage to compression efficiency,
although the file size is slightly expanded
on average.
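The row order for interlace type 1 can be computed with a few
lines of code. This is a sketch only; the function name is
hypothetical.

```python
def interlaced_row_order(height):
    """Return the order in which rows are stored for interlace type 1."""
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]  # (starting row, step)
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order
```

For an image 8 rows high, the rows are stored in the order
0, 4, 2, 6, 1, 3, 5, 7. Every row appears exactly once.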
Other interlace types have been proposed, and will
replace this scheme in the final proposal if the gain
in visual quality is sufficient to outweigh any compression
penalties.
PLTE Palette
This chunk must appear for color type 1, and
may appear for color types 3 and 4. In the latter
two cases, the palette chunk is optional, and
provides a recommended set of 2 to 256 colors to
which the truecolor image should be quantized if the
display hardware cannot display truecolor
directly.
The number of palette entries varies from 2 to 256.
For color type 1, the number of entries should not
exceed the range that can be represented by the
bit depth (for example, 2^4 = 16 for a bit depth of 4).
Note that this does NOT mean that there have to
be a full 16 entries. The length of the chunk is used
to determine the number of entries.
For color type 1, each palette entry consists of a
four-byte series:
red (0 = black, 255 = red),
green (0 = black, 255 = green),
blue (0 = black, 255 = blue),
alpha (0 = transparent, 255 = opaque)
Image creation programs are strongly encouraged
to place colors which the artist or algorithm
regards as important first in the palette, when
such information is available, in order
to allow display hardware with a limited supply of
colors to make intelligent compromises.
For color types 3 and 4, in which the palette is
optional and only a suggested quantization,
the fourth byte (alpha) is NOT present; each
palette entry consists of a three-byte series:
red (0 = black, 255 = red),
green (0 = black, 255 = green),
blue (0 = black, 255 = blue)
(Note that the palette chunk uses single-byte values
for each channel even if the palette is a
suggested quantization of a 16-bit image.)
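One way to decode the palette chunk, deriving the number of
entries from the chunk length, is sketched below. The function
name and its parameters are illustrative assumptions, and
treating an out-of-range entry count as an error is a choice of
this sketch rather than a requirement of this draft.

```python
def parse_palette(data, color_type, bit_depth=8):
    """Split PLTE chunk data into a list of per-entry tuples."""
    if color_type == 1:
        entry_size = 4          # red, green, blue, alpha
    elif color_type in (3, 4):
        entry_size = 3          # red, green, blue (suggested quantization)
    else:
        raise ValueError("no palette is defined for this color type")
    if len(data) % entry_size != 0:
        raise ValueError("chunk length is not a whole number of entries")
    count = len(data) // entry_size
    if not 2 <= count <= 256:
        raise ValueError("palette must have between 2 and 256 entries")
    if color_type == 1 and count > (1 << bit_depth):
        raise ValueError("more entries than the bit depth can index")
    return [tuple(data[i:i + entry_size])
            for i in range(0, len(data), entry_size)]
```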
ACPY Copyright notice. The notice will consist of
ISO LATIN-1 text and will not be null-terminated.
New lines should be denoted by a single
line feed (10 decimal).
ACMT Comment. The comment will consist of
ISO LATIN-1 text and will not be null-terminated.
New lines should be denoted by a single
line feed (10 decimal).
IDAT Image data.
The image data will be compressed using the
compression scheme indicated by the compression
type field of the HEAD chunk.
IMPORTANT: multiple image chunks can appear in
sequence for the SAME image. Viewers must be able
to interpret such chunks. (Simply speaking, the
viewer knows it is not finished until it has read
as many pixels as are indicated by the
image dimensions in the HEAD chunk.) This rule
exists to permit encoders to work in a fixed
amount of memory by outputting multiple chunks.
The following text describes the uncompressed
data stream which will be fed to the compressor
or received from the decompressor.
Pixels are always laid out left to right in
each row, and rows are arranged from
top to bottom, except as modified by
the interlace type field of the HEAD chunk.
Color types 1 and 2
For color type 1, each pixel value is an index into the
palette indicating which color in the palette should be
displayed at that location. For color type 2 (grayscale),
each pixel value is a grayscale level, where the maximum
value representable by the bit depth is white.
For 1-bit images, each horizontal line of pixels is represented
by a stream of bits, in which bit 7 (128) is the
leftmost pixel in the byte and bit 0 (1) is the
rightmost. Consecutive lines may share a byte if the
pixels in the line do not fit evenly into bytes.
That is, if the last pixel of the line falls
in bit 4 of a byte, the first pixel of the next
line is stored in bit 3 of the same byte.
For 2-bit images, the same scheme is followed, except that
each pixel is represented by a 2-bit portion
of a byte, with the leftmost bit being most
significant. For instance, the first pixel
of the line is represented by bits 7 (128) and
6 (64) of the byte. Consecutive lines may share bytes.
For 4-bit images, the same scheme is followed, except
that each pixel is represented by a 4-bit portion
of a byte, with the leftmost bit being most
significant. For instance, the first pixel
of the line is represented by bits 7 (128),
6 (64), 5 (32) and 4 (16) of the byte.
Consecutive lines may share bytes.
For 8-bit images, each pixel is represented by a single
byte. For 16-bit grayscale images (color type 2),
each pixel is represented by a two-byte unsigned integer.
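The packing of 1-, 2- and 4-bit pixels into bytes can be
sketched as follows. Pixels are supplied as one flat sequence,
row after row, so that consecutive lines share bytes as
described above. The function name is hypothetical, and padding
the final byte with zero bits is an assumption of this sketch,
since this draft does not say how a partial last byte is filled.

```python
def pack_pixels(pixels, bit_depth):
    """Pack pixel values of 1, 2 or 4 bits into bytes, leftmost pixel first."""
    if bit_depth not in (1, 2, 4):
        raise ValueError("this sketch handles bit depths 1, 2 and 4 only")
    out = bytearray()
    acc = 0      # bits collected for the current byte
    nbits = 0    # number of bits currently held in acc
    for value in pixels:
        if not 0 <= value < (1 << bit_depth):
            raise ValueError("pixel value out of range for bit depth")
        acc = (acc << bit_depth) | value   # earlier pixels end up in higher bits
        nbits += bit_depth
        if nbits == 8:
            out.append(acc)
            acc = 0
            nbits = 0
    if nbits:
        out.append(acc << (8 - nbits))     # assumed zero padding
    return bytes(out)
```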
IMPORTANT:
For 8- and 16-bit grayscale images (color type 2, bit depth
of 8 or 16), the values are next input to the CROSS filter
(for non-interlaced images; see below) or to the SUB filter
(for interlaced images; see below) in order to improve
compression before being input to the compressor itself.
This step is NOT employed for palette color images
(color type 1).
Color types 3 and 4
For color type 3, each pixel is represented by
a red value, a green value, and a blue value,
each 8 or 16 bits wide depending
on the bit depth. For color type 4,
an additional alpha (opacity) value of the
same depth is added for each pixel.
IMPORTANT:
The values are next input to the CROSS filter
(for non-interlaced images; see below) or to the SUB filter
(for interlaced images; see below) in order to improve
compression before being input to
the compressor itself.
EOF End of File
The EOF chunk appears at the end of the PBF file.
The chunk contains a four-byte checksum, calculated
by adding together ALL preceding bytes in the file, not
including the checksum itself. Bytes are added
modulo 2^32 as unsigned integers to the 4-byte unsigned
integer checksum (this is the natural outcome when
unsigned bytes are added to a four-byte integer
without regard to overflow). If the EOF checksum
does not match the sum of the preceding bytes in
the file, viewers may elect to attempt to display
the contents of the file, but must warn the user
that the checksum is incorrect.
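The checksum calculation can be sketched as below. The function
names are hypothetical. The sketch takes "ALL preceding bytes"
literally, so the caller is expected to pass everything in the
file before the checksum, including the identification header
and the EOF chunk's own type and length fields.

```python
import struct

def pbf_checksum(preceding_bytes):
    """Sum all bytes as unsigned values, keeping the low 32 bits."""
    return sum(preceding_bytes) & 0xFFFFFFFF

def checksum_field(preceding_bytes):
    """Return the checksum as it would be stored: 4 bytes, MSB first."""
    return struct.pack(">I", pbf_checksum(preceding_bytes))
```

A viewer would compare the stored value against its own sum and
warn the user on a mismatch.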
Details of Specific Algorithms
Inflate and Deflate
See the zip/unzip package, which includes source code for
both purposes in the files inflate.c and deflate.c, with a
very permissive license. Documentation of the compression
scheme is also available; see the zip/unzip package for
references. (zip/unzip and pkzip are compatible but
not identical. pkzip is commercial software.)
The Cross Filter
The cross filter is used to improve compression on non-interlaced
truecolor images (color types 3 and 4) and 8- and 16-bit
grayscale images (color type 2).
Output the following value, using unsigned
modulo arithmetic and integers of the size
appropriate to the bit depth (8 or 16):
Pixel[x][y] - Pixel[x-1][y] - Pixel[x][y-1] + Pixel[x-1][y-1]
for each channel (red, green, blue, and sometimes alpha) of each pixel.
At the beginning of the image, the previous pixel and previous row
are considered to have had a value of zero for each channel.
To reverse the effect of the cross filter after decompression,
output the following value:
CrossedValue + Pixel[x-1][y] + Pixel[x][y-1] - Pixel[x-1][y-1]
storing the result as the value of the previous pixel for
use in uncrossing subsequent pixels.
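A minimal sketch of the cross filter and its inverse for a
single channel follows. The function names are hypothetical.
The image is given as a list of rows, so rows[y][x] in the code
corresponds to Pixel[x][y] in the text. The sketch assumes that
every neighbor falling outside the image (to the left of the
first column or above the first row) has the value zero.

```python
def cross_filter(rows, bit_depth=8):
    """Apply the cross filter to one channel of an image."""
    mask = (1 << bit_depth) - 1
    out = []
    for y, row in enumerate(rows):
        out_row = []
        for x, value in enumerate(row):
            left = row[x - 1] if x > 0 else 0
            up = rows[y - 1][x] if y > 0 else 0
            up_left = rows[y - 1][x - 1] if x > 0 and y > 0 else 0
            out_row.append((value - left - up + up_left) & mask)
        out.append(out_row)
    return out

def uncross_filter(crossed, bit_depth=8):
    """Reverse cross_filter, rebuilding pixels in storage order."""
    mask = (1 << bit_depth) - 1
    pixels = []
    for y, row in enumerate(crossed):
        out_row = []
        for x, value in enumerate(row):
            left = out_row[x - 1] if x > 0 else 0
            up = pixels[y - 1][x] if y > 0 else 0
            up_left = pixels[y - 1][x - 1] if x > 0 and y > 0 else 0
            out_row.append((value + left + up - up_left) & mask)
        pixels.append(out_row)
    return pixels
```

For truecolor images the same procedure would be applied to each
channel separately.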
The Sub Filter
The sub filter is used to improve compression on interlaced
truecolor images (color types 3 and 4) and 8- and 16-bit
grayscale images (color type 2).
For each pixel, output the difference between that pixel
and the previous pixel, modulo the range possible in
that bit depth. For instance, for a bit depth of 8,
if the previous pixel was 16 and the current pixel
is 64, store 48. If the previous pixel was 255 and
the current pixel is 20, store 21 (20 - 255 = -235,
which is 21 modulo 256). Note that unsigned modulo
arithmetic is used. IMPORTANT: At the start of each line,
consider the previous pixel value to be zero.
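The sub filter and its inverse can be sketched for a single
channel of a single line as follows. The function names are
hypothetical.

```python
def sub_filter(line, bit_depth=8):
    """Replace each value with its difference from the previous value."""
    mask = (1 << bit_depth) - 1
    out = []
    previous = 0                 # previous pixel is zero at the start of a line
    for value in line:
        out.append((value - previous) & mask)
        previous = value
    return out

def unsub_filter(filtered, bit_depth=8):
    """Reverse sub_filter by accumulating the stored differences."""
    mask = (1 << bit_depth) - 1
    out = []
    previous = 0
    for delta in filtered:
        previous = (previous + delta) & mask
        out.append(previous)
    return out
```

For truecolor images the same procedure would be applied to each
channel separately.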
The Alpha Channel
Standalone image viewers can ignore the alpha channel,
provided that they properly skip over it in order to
be in the right position to read the next pixel.
World Wide Web browsers and the like should regard any pixel
with an alpha channel value of zero as transparent (the pixel
should be given the background color of the browser), and
any pixel with the maximum alpha channel value for that
bit depth as opaque (not blending with the background at all).
Intermediate values should blend according to the percentage
of maximum specified.
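Blending of intermediate alpha values might be sketched as
follows for one channel of one pixel. The function name is
hypothetical, and the use of integer arithmetic with rounding to
the nearest value is an assumption; this draft only requires
that the blend follow the percentage of maximum.

```python
def blend(foreground, background, alpha, bit_depth=8):
    """Mix one channel of a pixel with the background according to alpha."""
    max_value = (1 << bit_depth) - 1
    weighted = foreground * alpha + background * (max_value - alpha)
    return (weighted + max_value // 2) // max_value   # rounded division
```

An alpha of zero yields the background value unchanged, and the
maximum alpha yields the foreground value unchanged.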