
The Discrete Cosine Transform: History, Theory, and Its Impact on Image and Video Compression

Developed in the early 1970s by three Indian-born engineers who remain largely unknown to the public, the Discrete Cosine Transform (DCT) became the cornerstone of modern image and video compression standards such as JPEG and MPEG. It enables efficient lossy compression by converting spatial data into frequency components and exploiting the limits of human visual perception.

IT Services Circle

Why Compression Matters

In everyday life we constantly rely on image and video compression to transmit media quickly; without it, large files would be impossible to send over networks or would cause long waiting times.

Technologies such as JPEG for images and the H.26x series for video are built on a fundamental algorithm called the Discrete Cosine Transform (DCT).

What Is DCT?

DCT (Discrete Cosine Transform) is a type of Fourier‑related transform that converts spatial pixel data into frequency‑domain coefficients. It separates an image block into one DC coefficient, which represents the block's overall brightness, and a set of AC coefficients, which run from low to high spatial frequency and capture progressively finer detail.

Because most visual energy concentrates in low frequencies, the DC coefficient tends to be large while many AC coefficients are small or zero after quantisation, enabling strong compression.
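The energy‑compaction claim can be checked numerically. The following is a minimal sketch in plain Python, assuming an orthonormal 1‑D DCT‑II and a made‑up row of eight pixel values; `dct_1d` and `row` are illustrative names, not part of any library.

```python
import math

def dct_1d(signal):
    """Orthonormal 1-D DCT-II of a list of numbers."""
    n = len(signal)
    coeffs = []
    for u in range(n):
        # The DC basis vector (u == 0) uses a smaller scale so the transform is orthonormal.
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        total = sum(
            signal[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
            for x in range(n)
        )
        coeffs.append(scale * total)
    return coeffs

# Hypothetical row of 8 pixel values with a gentle brightness ramp.
row = [100, 102, 104, 107, 110, 112, 115, 117]
coeffs = dct_1d(row)

# With an orthonormal transform, the sum of squared coefficients equals
# the sum of squared pixel values, so energy shares are easy to compare.
total_energy = sum(c * c for c in coeffs)
dc_share = coeffs[0] ** 2 / total_energy
low_share = (coeffs[0] ** 2 + coeffs[1] ** 2) / total_energy

print([round(c, 2) for c in coeffs])
print(f"energy in DC: {dc_share:.4f}, in first two coefficients: {low_share:.6f}")
```

For a smooth input like this one, nearly all of the energy lands in the first one or two coefficients; a noisy or sharply edged row would spread it across more of them.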

How DCT Enables JPEG Compression

In JPEG, an image is divided into 8×8 blocks, each block is transformed by DCT, then the coefficients are quantised according to a table that reflects human visual sensitivity. Low‑frequency coefficients are kept with higher precision, while many high‑frequency coefficients are rounded to zero.

The quantised coefficients are then entropy‑coded (e.g., Huffman coding) to produce the final compressed bitstream. Decoding performs the inverse DCT (IDCT) to reconstruct an approximation of the original image.
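The transform, quantise, and reconstruct steps described above can be sketched as follows. This is an illustration under stated assumptions, not the JPEG reference procedure: the block contents and the quantisation table `qtable` are made up (the table is not the one from the JPEG standard), the helper names are hypothetical, and entropy coding is omitted.

```python
import math

N = 8  # JPEG block size

def dct_matrix(n):
    """Rows are the orthonormal DCT-II basis vectors."""
    rows = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                     for x in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

C = dct_matrix(N)
Ct = transpose(C)

def dct_2d(block):
    # Separable 2-D DCT: F = C * f * C^T
    return matmul(matmul(C, block), Ct)

def idct_2d(coeffs):
    # Inverse of an orthonormal transform: f = C^T * F * C
    return matmul(matmul(Ct, coeffs), C)

# Hypothetical block: a smooth diagonal gradient, shifted by -128 as JPEG does
# for 8-bit samples before the transform.
block = [[(100 + 3 * x + 2 * y) - 128 for y in range(N)] for x in range(N)]

# Hypothetical quantisation table: the step size grows with frequency,
# so high-frequency coefficients are stored more coarsely.
qtable = [[8 + 4 * (u + v) for v in range(N)] for u in range(N)]

coeffs = dct_2d(block)
quantised = [[round(coeffs[u][v] / qtable[u][v]) for v in range(N)]
             for u in range(N)]

# Decoder side: rescale the integers and apply the inverse DCT.
dequantised = [[quantised[u][v] * qtable[u][v] for v in range(N)]
               for u in range(N)]
restored = idct_2d(dequantised)

zero_count = sum(q == 0 for row in quantised for q in row)
max_error = max(abs(restored[x][y] - block[x][y])
                for x in range(N) for y in range(N))

print(f"{zero_count} of {N * N} quantised coefficients are zero")
print(f"largest pixel error after reconstruction: {max_error:.2f}")
```

The long runs of zeros in `quantised` are what the entropy coder exploits, while the nonzero reconstruction error is the "lossy" part of the scheme. Larger steps in `qtable` would give more zeros and a larger error.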

Mathematical Details (Illustrative Example)

Consider a 3×3 pixel block. Applying DCT moves most of the signal energy into the top‑left coefficient (the DC term) and distributes the remaining information into the eight AC terms. The DC term represents the block’s average intensity (low‑frequency information), while the AC terms capture edges and textures (high‑frequency information).

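The 2‑D DCT of an N×N block f(x,y) is typically written as:

\[F(u,v)=\frac{2}{N}\alpha(u)\alpha(v)\sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y)\cos\frac{(2x+1)u\pi}{2N}\cos\frac{(2y+1)v\pi}{2N}\]

where \(\alpha(0)=\frac{1}{\sqrt{2}}\) and \(\alpha(k)=1\) for \(k>0\). For the 8×8 blocks used in JPEG, the leading factor \(\frac{2}{N}\) equals \(\frac{1}{4}\).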
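To make the formula concrete, here is a direct and deliberately unoptimised Python transcription for a small block. It is a sketch that assumes the general scale factor 2/N (which reduces to 1/4 when N = 8); the 3×3 pixel values and the function names are invented for illustration.

```python
import math

def alpha(k):
    """Normalisation term: 1/sqrt(2) for the zero-frequency index, 1 otherwise."""
    return 1.0 / math.sqrt(2) if k == 0 else 1.0

def dct2_direct(f):
    """2-D DCT of a square block, computed term by term from the formula."""
    n = len(f)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (f[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = (2.0 / n) * alpha(u) * alpha(v) * total
    return out

# Hypothetical 3x3 block of pixel intensities.
block = [[52, 55, 61],
         [59, 60, 65],
         [63, 66, 70]]

F = dct2_direct(block)
mean = sum(sum(row) for row in block) / 9

# For u = v = 0 both cosines are 1, so F(0,0) = (2/N) * (1/2) * sum(f) = N * mean.
print(f"DC term: {F[0][0]:.3f}, N * mean: {3 * mean:.3f}")
for row in F:
    print([round(c, 2) for c in row])
```

With this normalisation the DC term is N times the block mean rather than the mean itself, so "average intensity" holds up to a constant factor. The four nested loops cost O(N⁴) per block; practical codecs use separable or fast factorised forms instead.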

Historical Background

The DCT was invented in 1971‑1973 by three relatively unknown Indian‑born engineers: Nasir Ahmed, then a professor at Kansas State University; T. Natarajan, his Ph.D. student; and K. R. Rao of the University of Texas at Arlington. Ahmed's funding proposal was initially rejected by the NSF as “too simple,” but he persisted and carried out the work as a self‑funded summer project.

In January 1974 the paper “Discrete Cosine Transform” was published in *IEEE Transactions on Computers* and has since been cited over 5,800 times. The algorithm became the core of JPEG (1992), MPEG video standards, WebP, HEIF, AV1, and many other modern codecs.

Biographies of the Inventors

Nasir Ahmed – Born in 1940 in Bangalore, he earned his Ph.D. from the University of New Mexico in 1966 and later became a distinguished professor there, serving as department chair and graduate school dean. He is now in his 80s.

K. R. Rao – An Indian‑American scholar who earned a master's degree in nuclear engineering (1960) and a Ph.D. in electrical engineering (1966). He spent most of his career at the University of Texas at Arlington and was an IEEE Fellow. He passed away in January 2021 at age 89.

T. Natarajan – Originally a Ph.D. student of Ahmed; little public information is available about his later career.

Impact and Legacy

DCT’s ability to efficiently compress visual data has transformed how we communicate, enabling real‑time video calls, streaming, and the massive sharing of images on platforms like WeChat. Its influence is often highlighted in popular media as a “world‑changing algorithm.”

