Title: Coding Theorems for a Discrete Source With a Fidelity Criterion
Authors: Claude E. Shannon
Published: 1959-01-01
Link: https://mast.queensu.ca/~math474/shannon59.pdf

Abstract

Consider a discrete source producing a sequence of message letters from a finite alphabet. A single-letter distortion measure is given by a non-negative matrix (dij ). The entry dij measures the ?cost? or ?distortion? if letter i is reproduced at the receiver as letter j. The average distortion of a communications system (source-coder-noisy channel-decoder) is taken to be d = ? i.j Pijdij where Pij is the probability of i being reproduced as j. It is shown that there is a function R(d) that measures the ?equivalent rate? of the source for a given level of distortion. For coding purposes where a level d of distortion can be tolerated, the source acts like one with information rate R(d). Methods are given for calculating R(d), and various properties discussed. Finally, generalizations to ergodic sources, to continuous sources, and to distortion measures involving blocks of letters are developed.

In this paper a study is made of the problem of coding a discrete source of information, given a fidelity criterion or a measure of the distortion of the final recovered message at the receiving point relative to the actual transmitted message. In a particular case there might be a certain tolerable level of distortion as determined by this measure. It is desired to so encode the information that the maximum possible signaling rate is obtained without exceeding the tolerable distortion level. This work is an expansion and detailed elaboration of ideas presented earlier [1], with particular reference to the discrete case.

We shall show that for a wide class of distortion measures and discrete sources of information there exists a function R(d) (depending on the particular distortion measure and source) which measures, in a sense, the equivalent rate R of the source (in bits per letter produced) when d is the allowed distortion level. Methods will be given for evaluating R(d) explicitly in certain simple cases and for evaluating R(d) by a limiting process in more complex cases. The basic results are roughly that it is impossible to signal at a rate faster than C / R(d) (source letters per second) over a memoryless channel of capacity C (bits per second) with a distortion measure less than or equal to d. On the other hand, by sufficiently long block codes it is possible to approach as closely as desired the rate C / R(d) with distortion level d.

Finally, some particular examples, using error probability per letter of message and other simple distortion measures, are worked out in detail.