Tactics · Jul 2, 2026

A 16-minute bug: How a founder traced 928 missing data rows

By Maya · Tactics desk · Human-reviewed · ✓ Verified Jul 2, 2026 · 4 min read · 1 source

A customer report of missing data led to a four-step debugging process. The playbook: diff files, test hypotheses with queries, and follow the timestamps to find the race condition.

A client support ticket reported that a system was losing data. An automated morning email report contained 423 rows. The same report, generated manually from the web portal, showed 1,351. The 928-row discrepancy (1,351 minus 423) suggested a critical failure. The actual bug was not data loss but a 16-minute timing gap between two automated jobs. The author's post-mortem provides a repeatable playbook for debugging data integrity issues.

Isolate the failure mode: missing vs. wrong

Before inspecting code, the developer compared the two report files. A diff confirmed that every line from the smaller email file was present in the larger portal file. This was a crucial distinction. The data was not incorrect or corrupted; it was absent. This narrowed the problem from a potentially complex data transformation bug to a simpler issue of data availability or filtering.
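The "missing vs. wrong" check can be sketched in a few lines of Python. This is an illustration, not the author's tooling: the semicolon-separated row format and the invoice IDs are hypothetical, and a `Counter` is used so that duplicate lines are compared correctly.

```python
from collections import Counter


def missing_and_unexpected(small_lines, large_lines):
    """Compare two reports line by line, respecting duplicate lines.

    Returns (missing, unexpected):
      missing    - lines in the large report that the small one lacks
      unexpected - lines in the small report that the large one lacks
    """
    small = Counter(line.rstrip("\n") for line in small_lines)
    large = Counter(line.rstrip("\n") for line in large_lines)
    missing = large - small      # Counter subtraction keeps only positive counts
    unexpected = small - large
    return missing, unexpected


# Hypothetical miniature reports standing in for the email and portal files.
email_report = ["INV-001;100.00", "INV-002;250.00"]
portal_report = ["INV-001;100.00", "INV-002;250.00", "INV-003;75.50"]

missing, unexpected = missing_and_unexpected(email_report, portal_report)
print(sum(missing.values()), "rows only in the portal report")
print(sum(unexpected.values()), "rows only in the email report")
```

An empty `unexpected` result is the signal described above: nothing in the smaller file is wrong or altered, so the remaining question is why rows are absent.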

Test hypotheses with read-only queries

The developer articulated three plausible causes and tested each against the production database using read-only queries. Keeping the queries read-only meant the investigation could not introduce new problems of its own.

Split data: A query confirmed all 1,351 rows belonged to the same bank code, ruling out a split report.
Overzealous deduplication: The system was designed to remove invoices seen the previous day. The developer replicated this logic against production data. The PHP array_diff function removed zero rows, clearing the deduplication logic.
Inconsistent filters: Both reports were confirmed to be filtering on the same transaction_created_at column for the same date.
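The first two checks can be approximated in Python on in-memory rows. This is a sketch under assumptions: the `invoice_id` and `bank_code` field names and all values are hypothetical, and the helper is only a rough analogue of PHP's `array_diff` for flat lists of strings.

```python
def array_diff(values, exclude):
    """Rough analogue of PHP's array_diff for flat lists:
    keep every value from `values` that does not appear in `exclude`."""
    excluded = set(exclude)
    return [value for value in values if value not in excluded]


# Hypothetical rows for the report date and invoice IDs seen the day before.
rows_today = [
    {"invoice_id": "INV-101", "bank_code": "001"},
    {"invoice_id": "INV-102", "bank_code": "001"},
    {"invoice_id": "INV-103", "bank_code": "001"},
]
seen_yesterday = ["INV-090", "INV-091"]

# Hypothesis 1: the rows are split across several bank codes.
bank_codes = {row["bank_code"] for row in rows_today}
print("distinct bank codes:", len(bank_codes))

# Hypothesis 2: deduplication against yesterday removes too much.
ids_today = [row["invoice_id"] for row in rows_today]
kept = array_diff(ids_today, seen_yesterday)
removed = len(ids_today) - len(kept)
print("rows removed by deduplication:", removed)
```

One bank code and zero removed rows would clear both hypotheses, matching the outcome the post reports.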

Re-run the original query

With the most likely application-logic hypotheses ruled out, the next step was to run the email job's exact SQL query manually. The query, according to the post, returned 1,351 rows, the correct and complete number. The same query that produced 423 rows hours earlier now produced the full dataset. This isolated the variable: time. Nothing in the query or the code had changed, but the state of the database had.
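The effect is easy to reproduce with an in-memory SQLite database. This is a simulation, not the author's schema: only the `transaction_created_at` column name comes from the post, while the table name, the dates, and the query text are assumptions.

```python
import sqlite3

# Assumed shape of the report query: filter on the transaction date only.
REPORT_SQL = (
    "SELECT COUNT(*) FROM transactions "
    "WHERE date(transaction_created_at) = ?"
)


def count_rows(conn, report_date):
    return conn.execute(REPORT_SQL, (report_date,)).fetchone()[0]


def insert_rows(conn, how_many, created_at):
    conn.executemany(
        "INSERT INTO transactions (transaction_created_at) VALUES (?)",
        [(created_at,)] * how_many,
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, transaction_created_at TEXT)"
)

report_date = "2026-07-01"  # hypothetical business date

insert_rows(conn, 423, "2026-07-01 10:00:00")   # rows already loaded
at_report_time = count_rows(conn, report_date)  # what the early job saw

insert_rows(conn, 928, "2026-07-01 14:00:00")   # rows the later import adds
after_import = count_rows(conn, report_date)    # same query, later state

print(at_report_time, after_import)  # 423 1351
```

The rows added later carry transaction timestamps on the report date, so the filter matches them as soon as they exist. The query text is identical in both calls; only the table contents differ.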

Find the race condition in the timestamps

An investigation of the system's job logs revealed the root cause. The author reports the following sequence:

07:15:45: The automated report job runs, queries the database, and sends the email.
07:31:02: A separate process imports the daily bank return file.

The report was generated 16 minutes before the data existed in the database (15 minutes 17 seconds by the logged timestamps). The 423 rows in the email were transactions already in the database when the job ran at 7:15 AM. The remaining 928 rows arrived with the 7:31 AM import. The bug was a race condition, a classic architectural flaw in systems with scheduled, interdependent tasks.
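The gap can be checked directly from the two logged timestamps. The sketch below assumes both entries are from the same day and the same time zone, which the post implies but does not state.

```python
from datetime import datetime

TIME_FORMAT = "%H:%M:%S"

report_ran = datetime.strptime("07:15:45", TIME_FORMAT)  # report job
import_ran = datetime.strptime("07:31:02", TIME_FORMAT)  # bank file import

gap = import_ran - report_ran
print(gap)                      # 0:15:17
print(import_ran > report_ran)  # True: the report ran before its data arrived
```

The exact difference is 15 minutes 17 seconds. The headline figure of 16 minutes matches the difference between the 07:15 and 07:31 minute marks.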

What we'd change

The debugging process itself is sound. It is a disciplined, evidence-based approach that correctly isolated the fault. The issue lies in the system architecture that allowed the bug to occur. A reactive post-mortem is valuable, but proactive design prevents the ticket from being filed.

The core problem is that the reporting job and the data import job were not causally linked. They were two independent cron jobs running on a time-based schedule. The fix is to make their dependency explicit. Modern data orchestration tools like Airflow or Dagster are built for this. A simpler solution involves an event-driven pattern: the data import job, upon successful completion, emits an event that triggers the reporting job. This ensures the report is only generated after its source data is complete.
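A minimal sketch of the event-driven pattern, assuming a single process and a hand-rolled event bus. The `JobEvents` class, the `bank_file_imported` event name, and the job functions are all hypothetical; a real system would more likely use a message queue or an orchestrator's dependency graph.

```python
from collections import defaultdict


class JobEvents:
    """Minimal in-process event bus: handlers run when an event is emitted."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_name, handler):
        self._handlers[event_name].append(handler)

    def emit(self, event_name, payload):
        for handler in self._handlers[event_name]:
            handler(payload)


run_log = []  # records the order in which the jobs actually ran


def send_report(business_date):
    run_log.append(f"report:{business_date}")


def import_bank_file(events, business_date):
    # Loading the bank return file would happen here; an exception raised
    # during the load would skip the emit below, so no report would be sent.
    run_log.append(f"import:{business_date}")
    events.emit("bank_file_imported", business_date)  # only reached on success


events = JobEvents()
events.on("bank_file_imported", send_report)

import_bank_file(events, "2026-07-01")  # hypothetical business date
print(run_log)  # ['import:2026-07-01', 'report:2026-07-01']
```

The report job no longer has a schedule of its own. Its only trigger is the import's success, so the ordering holds whether the bank file arrives at 7:31 AM or at noon.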

Without such orchestration, the system is brittle. If the bank file arrives late one day, the report will be incomplete or empty. If the reporting job is delayed, it might work correctly by chance. Relying on fixed-time schedules for dependent processes creates a category of errors that are difficult to reproduce because they depend on system timing, not just code logic.
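Where the cron schedule has to stay, a precondition check in the report job is a smaller step in the same direction. The sketch below is an assumption-laden illustration: `completed_imports` stands in for whatever marker the import job could write (a status row, a file, a flag), and `fetch_rows` stands in for the real report query.

```python
class ImportNotReady(Exception):
    """Raised when the report is requested before its source data is loaded."""


def build_report(business_date, completed_imports, fetch_rows):
    # Refuse to report on a date whose import has not been marked complete.
    if business_date not in completed_imports:
        raise ImportNotReady(f"bank file for {business_date} not imported yet")
    return fetch_rows(business_date)


def fetch_rows(business_date):
    # Stand-in for the real report query.
    return [f"{business_date};row-{i}" for i in range(3)]


completed_imports = set()

try:
    build_report("2026-07-01", completed_imports, fetch_rows)
    outcome = "sent"
except ImportNotReady:
    outcome = "deferred"  # a scheduler could retry later or alert someone

completed_imports.add("2026-07-01")  # the import job marks the date as done
rows = build_report("2026-07-01", completed_imports, fetch_rows)
print(outcome, len(rows))  # deferred 3
```

With the guard in place, a late bank file produces a visible deferral instead of a silently incomplete email.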

Landing

The 928 missing rows were not a bug in the report's code but a flaw in the system's workflow. The developer's methodical process of elimination correctly diagnosed a race condition that an investigation limited to the code would have missed. For founders building systems with asynchronous or batch processes, this incident is a clear signal: ensure that jobs with data dependencies are triggered by process completion, not by the clock.

The investor read

This post-mortem signals a technically competent but architecturally immature operation. The debugging methodology is systematic and impressive for an early-stage team. However, the existence of a simple cron-based race condition in a core financial reporting process suggests a lack of robust data orchestration. For an investor, this is a yellow flag. It points to potential brittleness and scalability issues in the tech stack. An investable company in this space would demonstrate a move from reactive debugging to proactive system design, likely using event-driven architecture or established workflow orchestrators to manage dependent jobs. Publishing the post-mortem shows a team learning in public, which is positive, but the underlying issue is a common symptom of a bootstrapped or non-scaled product.

Sources · how we verified

O bug de 16 minutos: quando 'dado faltando' é uma corrida contra o relógio (in English: "The 16-minute bug: when 'missing data' is a race against the clock")

Reported by the Maya desk on Founderr Pulse’s Tactics beat. Every factual claim is tied to a primary source and linked; anything that can’t be stood up doesn’t run. Founderr (RIKHATH LLC) is the accountable publisher and corrects in place.