Probabilistic Data Structures

When dealing with massive datasets, perfect precision often becomes a bottleneck. This series explores probabilistic data structures, which are elegant algorithms that trade a tiny margin of error for massive gains in memory efficiency and processing speed.

Articles

Data Structures for Large-Scale Systems

Praakhya Avasthi and Shubh Kanodia

Jan 19

Imagine you’ve just deployed a new microservice, a simple API gateway, and initially, traffic is pretty manageable, maybe a few requests per second. And you’re logging every IP address to track unique visitors and block malicious spammers.

Read full story

Probabilistic Data Structures: The Bloom Filter Deep Dive

Praakhya Avasthi and Shubh Kanodia

Feb 3

Probabilistic Data Structures: The Bloom Filter Deep Dive

In the last post, we hit the Memory Wall. We realised that storing millions of exact keys in RAM isn’t just expensive, it’s a recipe for crashing your pods and burning money. We introduced the concept of Probabilistic Data Structures that trade a tiny bit of accuracy for a massive gain in efficiency.

Read full story

The Counting Tax: Trading Exact Numbers for Infinite Scale

Praakhya Avasthi and Shubh Kanodia

Feb 17

The Counting Tax: Trading Exact Numbers for Infinite Scale

You are managing a core router processing 10 million packets per second. To keep the network healthy, you must defend against traffic floods designed to slow processing. These threats generally arrive in two distinct flavours:

Read full story

The Count-Min Sketch Deep Dive

Praakhya Avasthi and Shubh Kanodia

Feb 26

In our previous post, we explored the reality of scale: tracking massive data streams using traditional HashMaps quickly exhausts main memory and introduces crippling disk I/O latency.

Read full story

Code

The above posts are accompanied with implementations of the concepts discussed. Click below if you are interested to know more!

Eventually Consistent Writes GitHub

More Coming Soon

Eventual Consistency

Probabilistic Data Structures

Articles

Data Structures for Large-Scale Systems

Probabilistic Data Structures: The Bloom Filter Deep Dive

The Counting Tax: Trading Exact Numbers for Infinite Scale

The Count-Min Sketch Deep Dive

Code