🪴 Anil's Garden

Compression, Encoding, Codecs, Text Encodings and Communication

23 Nov 2025 · 2 min read

Resources 📚

✨ The Hitchhiker’s Guide to Compression

Need-to-know

Brotli - Wikipedia
Lempel–Ziv–Welch - Wikipedia
LZMA - Wikipedia
LZ77 and LZ78 - Wikipedia
zstd is incredible, but just in case the thought hasn’t occurred to someone here… Hacker News
Base64 Encoding
uuencoding - Wikipedia
yEnc - Wikipedia
Huffman Coding - W3Schools.com
Overview of Algorithms - The Hitchhiker’s Guide to Compression
- ✨ Arithmetic Coding - The Hitchhiker’s Guide to Compression
Arithmetic coding - Wikipedia
8-bit clean - Wikipedia
In-band signaling - Wikipedia

JPEG

JPEG - Wikipedia
How are Images Compressed? 46MB ↘↘ 4.07MB JPEG In Depth

Text Encoding, UTF, ASCII and more

See Compression, Encoding, Codecs, Text Encodings and Communication and Writing Systems

UTF-8 Everywhere
UTF-8 - Wikipedia - Unicode Transformation Format – 8-bit
Unicode - Wikipedia
Unicode Basic Multilingual Plane (BMP) - Unicode is divided into 17 planes of 65,536 code points each (16 bits per plane); only about 10 percent of these are currently in use. The first and most important is the Basic Multilingual Plane (Plane 0, BMP), which contains nearly all commonly used writing systems and symbols. It covers code points U+0000 to U+FFFF.
ASCII Table - ASCII Character Codes, HTML, Octal, Hex, Decimal
Character encoding - Wikipedia
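
The plane layout described in the BMP entry above can be sketched in a few lines of Python. This is only an illustration: the names `PLANE_SIZE`, `PLANE_COUNT`, `plane_of` and `in_bmp` are made up for this note and are not part of any library. It assumes that a code point's plane is simply its value divided by 65,536.

```python
PLANE_SIZE = 0x10000   # 65,536 code points per plane
PLANE_COUNT = 17       # planes 0 through 16


def plane_of(ch: str) -> int:
    """Return the Unicode plane number (0-16) of a single character."""
    # Dropping the low 16 bits leaves the plane index.
    return ord(ch) >> 16


def in_bmp(ch: str) -> bool:
    """True if the character lives in Plane 0 (U+0000 to U+FFFF)."""
    return plane_of(ch) == 0


if __name__ == "__main__":
    for ch in ["A", "€", "😀"]:
        print(f"U+{ord(ch):04X} plane={plane_of(ch)} in_bmp={in_bmp(ch)}")
```

Characters outside the BMP, such as most emoji, are the ones that need two 16-bit units (a surrogate pair) in UTF-16.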

UTF-8, UTF-16, UTF-32: A Comprehensive Analysis by Grok 3
- UTF-8:
  - Fully compatible with ASCII: every ASCII file is also a valid UTF-8 file. Legacy programs such as C’s printf can therefore handle UTF-8-encoded files containing only ASCII characters, because they only look for ASCII-specific characters like '%'.
- UTF-16 and UTF-32:
  - Not compatible with ASCII files, requiring Unicode-aware programs to display, print, or manipulate them, even if the file contains only ASCII characters.
  - Contain many zero bytes (every ASCII-range character carries at least one), which can break common null-terminated string handling logic (e.g., in C/C++), making them incompatible with legacy systems that rely on such logic.
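
A rough way to see both points is to encode the same ASCII-only string three ways and inspect the bytes. This is a sketch, not a definitive test: the sample string and the helper `c_strlen` are made up for illustration, and `c_strlen` only imitates how C's `strlen` stops at the first zero byte.

```python
text = "50% done"  # ASCII-only sample, chosen arbitrarily

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")  # little-endian, no byte order mark
utf32_bytes = text.encode("utf-32-le")  # little-endian, no byte order mark


def c_strlen(data: bytes) -> int:
    """Imitate C's strlen: count bytes before the first zero byte."""
    idx = data.find(b"\x00")
    return len(data) if idx == -1 else idx


if __name__ == "__main__":
    # UTF-8 output is byte-for-byte identical to ASCII for ASCII-only text.
    print("utf-8 == ascii:", utf8_bytes == ascii_bytes)

    # UTF-16 and UTF-32 pad every ASCII character with zero bytes.
    print("zero bytes:", utf16_bytes.count(0), utf32_bytes.count(0))

    # A null-terminated reader sees the whole UTF-8 string,
    # but stops after the first character of the UTF-16 one.
    print("strlen:", c_strlen(utf8_bytes), c_strlen(utf16_bytes))
```

The byte order mark is left out here on purpose (the `-le` codec variants) so that the byte counts reflect only the characters themselves.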

Related / See Also

Compression, Encoding, Codecs, Text Encodings and Communication
Databases and Data Interchange
Algorithms and Data Structures
Networking and Computer Networks
