FSST - Fast Static Symbol Table compression

A fast string compression algorithm optimized for random access and decompression speed. FSST learns a compact symbol table from training data to achieve high compression ratios on structured and repetitive text.

Implementation

This implementation is based on the FSST algorithm described in:

"FSST: Fast Random Access String Compression" by Peter Boncz, Thomas Neumann, and Viktor Leis (VLDB 2020).

Key features:

Learned symbol tables with up to 255 symbols (1-8 bytes each)
Fast decompression via direct table lookup (~1-2 GB/s)
Compact representation with 2-8 KB symbol tables
Zero-allocation decoding for high-throughput scenarios
Efficient encoding using hash tables and optimized lookup paths
Binary serialization for persistent storage and reuse
Order-independent training for deterministic results

This implementation provides a balance between compression ratio, decompression speed, and memory efficiency, making it ideal for structured data workloads.

Usage

// Train on representative data
inputs := [][]byte{
    []byte(`{"id":123,"name":"Alice"}`),
    []byte(`{"id":456,"name":"Bob"}`),
}
tbl := fsst.Train(inputs)

// Compress and decompress
compressed := tbl.Encode([]byte(`{"id":789,"name":"Charlie"}`))
original := tbl.DecodeAll(compressed)

// Or decode into a fixed buffer for zero-allocation
dst := make([]byte, 1024)
n := tbl.Decode(dst, compressed)
_ = dst[:n] // decompressed data

// Serialize table for reuse
data, _ := tbl.MarshalBinary()
var tbl2 fsst.Table
tbl2.UnmarshalBinary(data)

Performance Characteristics

FSST is optimized for workloads requiring fast decompression:

Training: O(n × k) where n is input size, k is number of training rounds (5)
Encoding: O(n) where n is uncompressed input size
Decoding: O(m) where m is compressed input size (table lookup)
Compression ratio: 1.5x to 3x on structured/repetitive text
Table size: 2-8 KB

Best suited for:

Structured text: JSON, CSV, logs, XML
Repetitive strings: database dumps, API responses
Text with patterns: URLs, timestamps, common fields

Installation

go get github.com/axiomhq/fsst

License

Distributed under MIT License (The MIT License).

See LICENSE for more information.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
testdata		testdata
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
FSST.md		FSST.md
LICENSE		LICENSE
README.md		README.md
counters.go		counters.go
counters_test.go		counters_test.go
doc.go		doc.go
examples_test.go		examples_test.go
go.mod		go.mod
symbol.go		symbol.go
symbol_test.go		symbol_test.go
table.go		table.go
table_test.go		table_test.go
train.go		train.go
train_test.go		train_test.go

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

FSST - Fast Static Symbol Table compression

Implementation

Usage

Performance Characteristics

Installation

License

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

axiomhq/fsst

Folders and files

Latest commit

History

Repository files navigation

FSST - Fast Static Symbol Table compression

Implementation

Usage

Performance Characteristics

Installation

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages