Tools·Jul 4, 2026

PMAD allocator trades flexibility for predictable tail latency

A new memory allocator, PMAD, aims for deterministic performance by eliminating the slow path common in general-purpose allocators. This requires developers to specify memory patterns upfront.

The Answer Up Front

PMAD is for developers building latency-critical systems where the worst-case performance matters more than the average case. Think high-frequency trading, game engines, or real-time audio processing. Teams that can accurately profile and declare their application's memory allocation patterns upfront will find its design compelling. You should skip it for general-purpose applications with unpredictable memory needs; the lack of a dynamic slow path means you'll get a NULL pointer instead of a slower allocation when a pre-sized pool is exhausted. The bottom line: PMAD is a specialized, opinionated tool for taming tail latency, but its performance claims require verification against your specific workload.

Methodology

This v0 review is based on the author's technical blog post and linked GitHub repository for the PMAD allocator, observed on June 18, 2026. The analysis covers the stated design philosophy, the core API, and the author's self-reported benchmark data comparing PMAD against Google's tcmalloc and Microsoft's mimalloc. We have not performed independent benchmarks, tested long-term memory fragmentation, or evaluated performance on workloads with mixed allocation sizes. The performance numbers cited here are claims made by the author and are not yet independently verified. This review will be updated if we conduct our own benchmarks or if new, verifiable data becomes available.

What It Does

PMAD's design is a direct reaction to the high tail latency found in general-purpose allocators. Instead of optimizing for the average case with complex caching tiers and fallback mechanisms, it removes those paths entirely.

Pre-allocated, fixed-size pools

Before any allocation can be made, the user must initialize PMAD with a list of fixed block sizes and the percentage of the total memory pool each size class should occupy. A single mmap call at startup reserves the entire memory region. This confines the allocator's only syscall to initialization, so no later allocation has to ask the kernel for more memory. If your application needs 20% of its memory for 16-byte objects and 50% for 64-byte objects, you declare that explicitly.
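
To make that declaration concrete, here is a minimal C sketch of the two steps described: reserving one region with a single mmap call, and turning a percentage share into a block count. The names sketch_class, sketch_region, sketch_region_reserve, and sketch_class_capacity are hypothetical and are not PMAD's API; the source does not show the real initialization call. The sketch assumes a POSIX system and ignores any per-block header overhead.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS under strict -std= modes on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical size-class declaration: a block size plus its share of the region. */
typedef struct {
    size_t   block_size;  /* bytes per block in this class */
    unsigned percent;     /* share of the total region, 0..100 */
} sketch_class;

/* Hypothetical handle for the single region reserved at startup. */
typedef struct {
    void  *base;          /* start of the reserved region */
    size_t total_bytes;   /* size of the reserved region */
} sketch_region;

/* Reserve the whole region with one mmap call. Returns 0 on success, -1 on failure. */
static int sketch_region_reserve(sketch_region *r, size_t total_bytes) {
    void *p = mmap(NULL, total_bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    r->base = p;
    r->total_bytes = total_bytes;
    return 0;
}

/* Number of blocks a class receives from its percentage share (rounded down). */
static size_t sketch_class_capacity(const sketch_class *c, size_t total_bytes) {
    assert(c->block_size > 0 && c->percent <= 100);
    return (total_bytes / 100 * c->percent) / c->block_size;
}
```

Under this rounding, a 1 MiB region with the 20% and 50% shares from the example would hold 13,106 blocks of 16 bytes and 8,191 blocks of 64 bytes. Whatever the real numbers are, they become hard capacity limits that the application has to size correctly in advance.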

Constant-time operations

The author claims pmad_alloc() and pmad_free() are constant-time operations. An allocation looks up the free list for the requested size and pops a pointer off it. A free reads a small header in the memory block to identify its size class and pushes the block back onto the matching free list. As described, there are no locks, no tree traversals, and no coalescing logic, so the cost of each call should be deterministic.
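
The mechanism is easier to judge in code. The following is a minimal single-threaded sketch of a header-tagged free-list allocator of the kind described, not PMAD's implementation. Every name (sketch_pool, sketch_init, sketch_alloc, sketch_free) is hypothetical, and several details are our assumptions: a 16-byte header, a lookup that picks the smallest class that fits, and malloc standing in for the mmap'd region. It does not attempt to show how PMAD avoids locks across threads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_NCLASSES 2
#define SKETCH_HEADER   16  /* assumed header size; keeps payloads 16-byte aligned */

typedef struct sketch_node { struct sketch_node *next; } sketch_node;

typedef struct {
    size_t       block_size;  /* payload bytes served by this class */
    sketch_node *free_head;   /* intrusive singly linked free list */
} sketch_pool;

static sketch_pool sketch_pools[SKETCH_NCLASSES] = { {16, NULL}, {64, NULL} };

/* Push a payload pointer onto its class's free list. */
static void sketch_push(uint32_t idx, void *payload) {
    sketch_node *n = payload;
    n->next = sketch_pools[idx].free_head;
    sketch_pools[idx].free_head = n;
}

/* Carve `per_class` blocks for every class out of one backing region.
   malloc stands in for the single mmap'd region; it lives for the whole process. */
static int sketch_init(size_t per_class) {
    size_t total = 0;
    for (uint32_t i = 0; i < SKETCH_NCLASSES; i++)
        total += per_class * (SKETCH_HEADER + sketch_pools[i].block_size);
    unsigned char *mem = malloc(total);
    if (mem == NULL) return -1;
    for (uint32_t i = 0; i < SKETCH_NCLASSES; i++) {
        size_t stride = SKETCH_HEADER + sketch_pools[i].block_size;
        for (size_t b = 0; b < per_class; b++, mem += stride) {
            *(uint32_t *)mem = i;                /* header records the size class */
            sketch_push(i, mem + SKETCH_HEADER);
        }
    }
    return 0;
}

/* Allocation: pick the smallest class that fits, pop its free list. */
static void *sketch_alloc(size_t size) {
    for (uint32_t i = 0; i < SKETCH_NCLASSES; i++) {  /* bounded by a constant */
        if (size <= sketch_pools[i].block_size) {
            sketch_node *n = sketch_pools[i].free_head;
            if (n == NULL) return NULL;          /* class exhausted: no fallback */
            sketch_pools[i].free_head = n->next;
            return n;
        }
    }
    return NULL;                                 /* no class is large enough */
}

/* Free: read the header to find the class, push the block back. */
static void sketch_free(void *p) {
    if (p == NULL) return;
    uint32_t idx = *(uint32_t *)((unsigned char *)p - SKETCH_HEADER);
    assert(idx < SKETCH_NCLASSES);
    sketch_push(idx, p);
}
```

Both operations touch a fixed number of words regardless of how many blocks are live, which is what makes a constant-time claim plausible. The header is the price: every block carries extra bytes so that free does not need to be told the size.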

No fallback mechanism

The defining feature is the absence of a slow path. If the free list for a requested block size is empty, pmad_alloc() returns NULL immediately. It will not try to find memory from another pool, split a larger block, or request more memory from the operating system. This is the fundamental trade-off: the application developer, not the allocator, is responsible for managing memory capacity.
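
In practice this pushes a capacity policy into every call site. As a rough illustration, the C sketch below uses a tiny stand-in pool (demo_alloc and demo_free are hypothetical and are not PMAD functions) and a hypothetical handle_request that sheds work when the pool is empty instead of waiting for a slower allocation. Load shedding is only one possible policy; the source does not prescribe one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_SLOTS     2
#define DEMO_SLOT_SIZE 64

/* Stand-in fixed pool: a stack of free slot indexes over a static buffer. */
static unsigned char demo_slots[DEMO_SLOTS][DEMO_SLOT_SIZE];
static int demo_free_stack[DEMO_SLOTS] = {0, 1};
static int demo_free_count = DEMO_SLOTS;

/* Returns a slot, or NULL immediately when none are left. */
static void *demo_alloc(void) {
    if (demo_free_count == 0) return NULL;   /* exhausted: fail fast, no fallback */
    return demo_slots[demo_free_stack[--demo_free_count]];
}

/* Returns a slot to the pool. */
static void demo_free(void *p) {
    ptrdiff_t idx = ((unsigned char *)p - &demo_slots[0][0]) / DEMO_SLOT_SIZE;
    assert(idx >= 0 && idx < DEMO_SLOTS && demo_free_count < DEMO_SLOTS);
    demo_free_stack[demo_free_count++] = (int)idx;
}

typedef enum { REQ_OK, REQ_SHED } req_status;

/* Copy a message into a pool buffer; shed the request if the pool is empty. */
static req_status handle_request(const char *msg, void **out) {
    void *buf = demo_alloc();
    if (buf == NULL) {                        /* caller-side capacity policy */
        *out = NULL;
        return REQ_SHED;
    }
    size_t len = strlen(msg);
    if (len >= DEMO_SLOT_SIZE) len = DEMO_SLOT_SIZE - 1;  /* truncate to fit the slot */
    memcpy(buf, msg, len);
    ((char *)buf)[len] = '\0';
    *out = buf;
    return REQ_OK;
}
```

With two slots, the third concurrent request is rejected until an earlier buffer is freed. A general-purpose allocator would have hidden that moment behind a slow call; here it becomes an explicit branch the application has to own.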

What's Interesting / What's Not

The most interesting aspect of PMAD is its explicit philosophical bet: tail latency is the enemy, and the only way to defeat it is to eliminate the code paths that cause it. This is a refreshing, opinionated take in a world of general-purpose tools that try to be good at everything. For a specific class of performance engineering, this is the correct framing.

The author provides benchmark data to support this, focusing on the ratio of P99.9 latency to P50 latency as the key measure of predictability. However, the author's own data for a 64-byte allocation workload presents a confusing picture. While PMAD's P99.9/P50 ratio of 2.5x is better than mimalloc's 3.5x, it is worse than tcmalloc's 2.0x. The post's narrative is that PMAD was built to be "flatter" and more deterministic, but the provided data shows the established tcmalloc is even flatter on that specific test. This discrepancy between the project's primary goal and its own initial benchmark data is the most significant weakness in the pitch. It doesn't invalidate the design, but it suggests that achieving better tail latency than heavily optimized incumbents is non-trivial, even with a specialized architecture.
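
For readers who want to compute the same predictability metric on their own latency samples, the ratio is simply P99.9 divided by P50. The C sketch below is one way to do it, assuming nearest-rank percentiles; the source does not say which percentile method the author used, and the function names are ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Ascending comparison for qsort over uint64_t latency samples. */
static int cmp_u64(const void *a, const void *b) {
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Nearest-rank percentile over an ascending array.
   `per_mille` is the percentile in tenths of a percent (500 = P50, 999 = P99.9). */
static uint64_t percentile_per_mille(const uint64_t *sorted, size_t n, unsigned per_mille) {
    assert(n > 0 && per_mille >= 1 && per_mille <= 1000);
    size_t rank = (n * per_mille + 999) / 1000;  /* ceil(n * p / 1000), 1-based */
    return sorted[rank - 1];
}

/* Tail-to-median ratio, P99.9 / P50. Sorts the samples in place. */
static double tail_ratio(uint64_t *samples, size_t n) {
    qsort(samples, n, sizeof *samples, cmp_u64);
    uint64_t p50  = percentile_per_mille(samples, n, 500);
    uint64_t p999 = percentile_per_mille(samples, n, 999);
    assert(p50 > 0);                             /* avoid dividing by a zero median */
    return (double)p999 / (double)p50;
}
```

Note that the ratio says nothing about absolute speed. An allocator can post a flatter ratio while being slower at every percentile, so the ratio should be read alongside the raw P50 and P99.9 values.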

Pricing

PMAD is an open-source C library available on GitHub. It is free to use. The source article does not specify a license; a permissive one seems likely, but check the repository before adopting it. (Pricing snapshot: June 18, 2026.)

Verdict

PMAD is a well-articulated, special-purpose tool for engineers who must control worst-case allocation latency. Its design of pre-allocating all memory upfront in fixed-size pools is a classic technique for achieving deterministic performance. It is a good fit for embedded systems or services with stringent SLOs and predictable object sizes. However, developers should be cautious. The author's own benchmark figures show Google's tcmalloc achieving a better tail-to-median latency ratio in the showcased example. Before adopting PMAD, benchmark it against your specific workload to verify that its architectural trade-offs actually deliver a better result for your use case than mature, general-purpose alternatives.

What We'd Test Next

A v2 review would require independent benchmarks. First, we would reproduce the author's 64-byte workload to verify the reported numbers for PMAD, tcmalloc, and mimalloc. Next, we would design a test harness with a mixed-size allocation pattern to see how PMAD performs under more realistic conditions. We would also measure performance degradation as pools approach exhaustion and analyze long-term memory fragmentation. Finally, a comparison against other dedicated pool allocators, not just general-purpose ones like tcmalloc, would provide a more complete picture of its place in the ecosystem.
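
As a starting point for the mixed-size test, a harness could look roughly like the C sketch below. It is an assumption about how we might structure the measurement, not the author's benchmark: bench_mixed and now_ns are hypothetical names, the allocator under test is passed in as a pair of function pointers, and each sample includes the overhead of two clock_gettime calls. It assumes a POSIX system.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes clock_gettime under strict -std= modes on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>         /* malloc/free, usable as the baseline allocator */
#include <time.h>

typedef void *(*alloc_fn)(size_t);
typedef void  (*free_fn)(void *);

/* Monotonic timestamp in nanoseconds. */
static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Time `n` allocations, cycling through `sizes`; one latency sample per call.
   Each block is touched and freed right away. Returns the count of NULL results. */
static size_t bench_mixed(alloc_fn a, free_fn f,
                          const size_t *sizes, size_t nsizes,
                          uint64_t *out_ns, size_t n) {
    assert(nsizes > 0);
    size_t failures = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t0 = now_ns();
        void *p = a(sizes[i % nsizes]);
        uint64_t t1 = now_ns();
        out_ns[i] = t1 - t0;                      /* allocation latency only */
        if (p == NULL) {
            failures++;                           /* e.g. an exhausted pool */
        } else {
            *(volatile unsigned char *)p = 0;     /* touch it so the call is not elided */
            f(p);
        }
    }
    return failures;
}
```

Passing malloc and free gives a baseline run; a pool allocator's functions could be swapped in wherever they match the same signatures. Holding blocks live instead of freeing them immediately would be the natural variation for studying behavior as pools approach exhaustion.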

The investor read

PMAD itself is an open-source library, not a venture-backed company. However, its existence signals a maturing market for performance-engineering tools that prioritize tail latency (P99/P99.9) over median performance. As infrastructure costs rise and service-level objectives become more stringent, developers will increasingly adopt specialized components that offer predictability. This creates opportunities for companies building observability tools that can pinpoint tail-latency sources, or for infrastructure providers (e.g., databases, serverless platforms) to offer 'deterministic performance' tiers by integrating allocators like PMAD. The investment play isn't in PMAD, but in the broader trend of specialized, high-performance systems software it represents.

Sources · how we verified
  1. A single malloc took 7 milliseconds. So I deleted the slow path.

Reported by the Riley desk on Founderr Pulse’s Tools beat. Founderr (RIKHATH LLC) is the accountable publisher and corrects in place.