Building a Local LLM Agent for Automated Work List Generation
A management team automated work list generation by deploying a local LLM agent. This on-premise, CPU-only solution leverages Ollama and Gemma 4 E2B to process internal reports, ensuring data privacy and consistency.
Management teams often dedicate hours to manually extracting work items from developer reports, a process prone to errors and inconsistency. One team faced this challenge, spending significant time sifting through dozens of monthly reports for tasks like 'bug fix' or 'released version 1'. Beyond inefficiency, using cloud-based AI for this task posed a security risk, exposing internal project activity to external servers. This led them to develop a local LLM agent to automate work list generation, prioritizing data privacy.
The team's solution is a console-based application running on an internal server. The agent processes raw reports, applies transformations, and outputs a polished list of work items. The entire pipeline runs on a CPU-only machine inside the company network, so no report data is transmitted to external cloud services and all internal project activity stays within the company's infrastructure.
Normalizing Inconsistent Report Data
Developer reports frequently exhibit varied formats, ranging from detailed entries with Jira ticket IDs to cryptic one-liners like 'fixed issue'. This inconsistency creates a significant data quality problem, making it difficult for managers not directly involved in a project to understand the context of reported work. The local LLM agent was designed to process this unstructured data automatically, normalizing chaotic report data and filtering out extraneous information. This step aims to clarify ambiguous entries, providing a standardized output that is immediately actionable.
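The normalization step might look roughly like the sketch below. It is an illustration under stated assumptions, not the team's code: the prompt wording, the JSON reply shape, the function names, and the model tag are all hypothetical. The only external call is Ollama's `/api/generate` endpoint on its default local port, and the model call is injectable so the parsing logic can be exercised without a running server.

```python
import json
import urllib.request
from typing import Callable, Optional

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local port
MODEL_TAG = "gemma-e2b-placeholder"  # hypothetical tag; use the name `ollama list` reports

# Hypothetical prompt: asks for one cleaned work item plus a noise flag, as JSON.
PROMPT_TEMPLATE = (
    "Rewrite the developer report line below as one clear work item.\n"
    'Reply with JSON only: {{"work_item": "<text>", "is_noise": <true|false>}}\n'
    "Report line: {line}"
)


def ollama_generate(prompt: str) -> str:
    """Send one non-streaming generation request to the local Ollama server."""
    payload = json.dumps({
        "model": MODEL_TAG,
        "prompt": prompt,
        "stream": False,   # return a single JSON object instead of a stream
        "format": "json",  # ask Ollama to constrain the reply to valid JSON
    }).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read())["response"]


def normalize_line(
    line: str, generate: Callable[[str], str] = ollama_generate
) -> Optional[str]:
    """Return a cleaned work item, or None if the line is noise or the reply is unusable."""
    reply = generate(PROMPT_TEMPLATE.format(line=line.strip()))
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or data.get("is_noise"):
        return None
    item = str(data.get("work_item", "")).strip()
    return item or None
```

Treating an unparseable reply as "no item" keeps a single bad model response from corrupting the list; a real pipeline would probably log such lines for manual review instead of dropping them silently.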
Detecting and Eliminating Duplicate Work
Manual review often lets duplicate tasks slip in, either carried over from previous months or logged repeatedly over several days. The result is overlap and work lists inflated with near-identical entries. To counter this, the team integrated a duplicate detection mechanism. For embedding generation, which is what makes it possible to identify similar items, the agent uses the nomic-embed-text model. The source describes this model as only a few megabytes in size; the version in the Ollama library is closer to a few hundred megabytes, which is still small enough to run comfortably on a CPU. With it, the system compares new report entries against historical data and keeps redundant tasks out of the final work list.
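One common way to implement this kind of check is cosine similarity between embedding vectors with a cutoff. The sketch below assumes that approach; the source does not say which similarity measure or threshold the team uses, so the `0.9` threshold and the function names are illustrative. The embedding call targets Ollama's `/api/embed` endpoint and is injectable for testing.

```python
import json
import math
import urllib.request
from typing import Callable, List, Sequence

Vector = Sequence[float]
EMBED_URL = "http://localhost:11434/api/embed"  # Ollama's default local port


def ollama_embed(text: str) -> List[float]:
    """Request one embedding from a local Ollama server running nomic-embed-text."""
    payload = json.dumps({"model": "nomic-embed-text", "input": text}).encode("utf-8")
    request = urllib.request.Request(
        EMBED_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["embeddings"][0]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_duplicates(
    new_items: Sequence[str],
    history: Sequence[str],
    embed: Callable[[str], Vector] = ollama_embed,
    threshold: float = 0.9,  # assumed cutoff; would need tuning on real reports
) -> List[str]:
    """Keep new items that are not near-duplicates of history or of each other."""
    seen = [embed(text) for text in history]
    kept = []
    for text in new_items:
        vector = embed(text)
        if any(cosine_similarity(vector, old) >= threshold for old in seen):
            continue  # too close to something already recorded
        seen.append(vector)
        kept.append(text)
    return kept
```

Adding each accepted item to `seen` is what catches the "logged repeatedly over several days" case within a single month, not only repeats from earlier months. For a large history, the stored vectors would normally be cached rather than re-embedded on every run.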
Ensuring On-Premise Data Privacy
Initial attempts to use cloud-based LLMs like ChatGPT for report cleanup exposed a serious security risk. Handing a full month of internal project activity to an external cloud service is unacceptable for many enterprises, particularly those in regulated sectors like finance or healthcare. The implemented solution avoids this risk by running the entire AI pipeline on an internal, CPU-only server. It uses Ollama to serve a local instance of the Gemma 4 E2B model, so all processing occurs within the company's internal network and the data never leaves the company's control.
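A privacy guarantee like this is easier to keep if the code refuses to talk to anything outside the network. The source does not describe such a safeguard, so the following is purely a hypothetical guard: a small check that a configured model endpoint points at `localhost` or at a literal loopback or private IP address before any report text is sent.

```python
import ipaddress
from urllib.parse import urlparse


def is_internal_endpoint(url: str) -> bool:
    """Return True only for localhost or a literal loopback/private IP address."""
    host = urlparse(url).hostname
    if host is None:
        return False  # no scheme or no host, e.g. "localhost:11434"
    if host == "localhost":
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Any other hostname is rejected: this sketch performs no DNS lookup,
        # so it cannot tell where the name would resolve.
        return False
    return address.is_loopback or address.is_private


def require_internal(url: str) -> str:
    """Raise before any data is sent if the endpoint is not clearly internal."""
    if not is_internal_endpoint(url):
        raise ValueError(f"Refusing to send report data to non-internal endpoint: {url}")
    return url
```

Rejecting unknown hostnames is deliberately strict. A deployment that addresses its server by an internal DNS name would need an explicit allowlist instead.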
Integrating with Existing Jira Workflows
Extracted work items become more useful when they are tied to existing project management tools. The local LLM agent was built to enrich work items with descriptions from Jira, adding context to the normalized entries. The source does not detail the mechanics of this enrichment, but the capability means the generated work list is not just clean and consistent but also connected to the project's official record. Managers get a fuller view of accomplishments, each linked to its source in Jira.
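Because the enrichment mechanics are not described, the sketch below is only one plausible shape for them: find Jira-style ticket keys in a work item with a regular expression and append whatever summary a lookup function returns. The function names and the output format are hypothetical, and `lookup_summary` stands in for however the agent actually reads from Jira.

```python
import re
from typing import Callable, Optional

# Typical Jira key shape: uppercase project key, hyphen, issue number (e.g. PROJ-12).
JIRA_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def enrich_with_jira(
    item: str, lookup_summary: Callable[[str], Optional[str]]
) -> str:
    """Append the summaries of any Jira tickets mentioned in a work item."""
    notes = []
    # dict.fromkeys removes repeated keys while preserving first-seen order.
    for key in dict.fromkeys(JIRA_KEY.findall(item)):
        summary = lookup_summary(key)
        if summary:  # skip tickets the lookup does not know about
            notes.append(f"{key}: {summary}")
    return f"{item} ({'; '.join(notes)})" if notes else item
```

For example, with a dictionary standing in for Jira:

```python
summaries = {"PROJ-12": "Login page times out"}
print(enrich_with_jira("Fixed PROJ-12", summaries.get))
# Fixed PROJ-12 (PROJ-12: Login page times out)
```

Entries without a ticket ID, such as 'fixed issue', pass through unchanged, which is why the normalization step still matters for them.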
What We'd Change
The on-premise, CPU-only approach prioritizes data privacy and cost efficiency for specific use cases. However, this architecture presents limitations for broader application or increased complexity. The reliance on a single, smaller LLM like Gemma 4 E2B, while suitable for basic extraction and normalization, might struggle with highly nuanced or exceptionally 'chaotic' reports that require more sophisticated reasoning. Larger, more capable models, often requiring GPU acceleration, could offer superior performance for intricate linguistic patterns or diverse reporting styles.
The console-based application, triggered by a cron job, provides a robust backend solution but lacks a user-friendly interface. For wider adoption across an organization, a web-based GUI could improve accessibility for non-technical managers, reducing friction in review and approval workflows. Furthermore, while duplicate detection is implemented, the system's ability to provide deeper analytical insights or generate high-level summaries beyond a clean list of accomplishments is not described. Expanding the agent's capabilities to include trend analysis or predictive insights would require additional architectural components and potentially more powerful LLMs.
Automating the extraction of work items from unstructured reports addresses a common operational bottleneck and reduces significant data privacy risks. By deploying a local LLM agent, organizations can keep control over sensitive internal data while improving efficiency. The approach is a practical application of on-premise AI, showing that critical internal processes can be optimized without compromising security or relying on external cloud infrastructure. The core value lies in the choice to keep the entire AI pipeline within the enterprise's control.