Enerlytica is a leader in data analysis for the energy market. With over a decade of experience, they help energy companies unlock the potential of their data. However, as data volumes and complexity grew, Enerlytica encountered critical challenges in scaling their data infrastructure. To continue offering cutting-edge analytics and insights, they needed a robust, scalable, and efficient system. That’s where we came in.
Kauz.ai specializes in creating intelligent chatbot solutions that enhance customer engagement for businesses worldwide. The company developed a chatbot that uses Large Language Models (LLMs) to provide real-time responses by drawing on knowledge bases (KBs) built from various documentation sources. Their platform needed to handle document processing at scale, ensure prompt responses, and deliver a seamless user experience. However, issues with the underlying Kubernetes (k8s) setup, along with a monolithic service structure, led to recurring stability challenges.
Challenges
1. Single-Container Monolith
• The chatbot’s entire functionality (including Celery workers, queues, and application logic) ran in a single container.
• This setup hindered independent scaling and caused resource contention.
2. Improper Kubernetes Configuration
• Services were running without properly assigned resource requests and limits.
• Frequent crashes and restarts occurred unpredictably, destabilizing the platform.
3. Lack of Monitoring and APM
• No Application Performance Monitoring (APM) or structured logging was in place.
• Diagnosing performance bottlenecks or system failures was challenging.
4. Rising Costs and Limited Visibility
• Because resources were untracked and unbounded, unplanned costs began to pile up.
• There was no clear approach for scaling to meet workload demands efficiently.
Our Solution
1. Service Separation and Kubernetes Best Practices
• We split the monolithic container into multiple deployments, isolating key services (application servers, Celery workers, queue managers).
• Implemented proper Kubernetes resource requests and limits to ensure predictable performance (see the sketch after this list).
2. Introduction of Monitoring and APM
• Deployed monitoring tools and Application Performance Monitoring solutions to track system health in real time.
• Set up alerting to quickly identify and address issues before they affected end users.
3. Autoscaling and Colocation
• Enabled task/request-based autoscaling to scale each service independently.
• Where beneficial, used colocation of certain services to balance resource usage and reduce overhead.
4. Multi-Tenancy for Cost Efficiency
• Converted parts of the platform to be multi-tenant, allowing multiple customers to share underlying infrastructure.
• This approach helped distribute costs more effectively while preserving data isolation and performance.
5. Focus on LLM-Driven Processes
• Ensured that the LLM-based chatbot could seamlessly integrate newly ingested knowledge bases.
• Prepared the system for additional advanced features, such as voice conversation capabilities (an area we have experience with for other clients).
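To make the service-separation and resource-limits work concrete, the sketch below shows roughly what one of the separated Deployments could look like when defined with the official Kubernetes Python client. The image name, labels, replica count, and resource figures are illustrative assumptions, not the client’s actual configuration.

```python
# Minimal sketch of a separated Celery-worker Deployment with explicit
# resource requests and limits (all names and numbers are illustrative).
from kubernetes import client


def celery_worker_deployment() -> client.V1Deployment:
    container = client.V1Container(
        name="celery-worker",
        image="registry.example.com/chatbot-worker:latest",  # hypothetical image
        command=["celery", "-A", "app", "worker", "--loglevel=info"],
        resources=client.V1ResourceRequirements(
            # Requests reserve capacity so the scheduler places pods predictably;
            # limits cap usage so one workload cannot starve its neighbours.
            requests={"cpu": "500m", "memory": "512Mi"},
            limits={"cpu": "1", "memory": "1Gi"},
        ),
    )
    template = client.V1PodTemplateSpec(
        metadata=client.V1ObjectMeta(labels={"app": "celery-worker"}),
        spec=client.V1PodSpec(containers=[container]),
    )
    return client.V1Deployment(
        api_version="apps/v1",
        kind="Deployment",
        metadata=client.V1ObjectMeta(name="celery-worker"),
        spec=client.V1DeploymentSpec(
            replicas=2,
            selector=client.V1LabelSelector(match_labels={"app": "celery-worker"}),
            template=template,
        ),
    )


if __name__ == "__main__":
    # Render the manifest as a plain dict to inspect it without a cluster.
    print(client.ApiClient().sanitize_for_serialization(celery_worker_deployment()))
```

With the workers in their own Deployment, the application servers and background workers can be resourced and scaled independently, which is what removes the resource contention described in the challenges above.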
Results
• Improved Stability & Reliability: Separating services and configuring Kubernetes resources eliminated random crashes and restarts.
• Scalable Architecture: Independent autoscaling ensures the chatbot and background workers can respond to changing workloads without over-allocating resources.
• Enhanced Observability: With real-time monitoring and APM, performance issues can be quickly detected and addressed, leading to higher uptime and better user satisfaction.
• Transparent Resource Usage & Reduced Costs: By defining resource requests and limits, the platform’s costs are more predictable. Multi-tenancy and targeted autoscaling have further optimized infrastructure usage.
Next Steps
• Performance Testing: We plan to conduct more rigorous load and stress tests to validate the platform’s capabilities under peak demand.
• Further Job Splitting: Additional segmentation of the knowledge base processing tasks will enable even finer control over resource allocation, further improving resilience.
• Ongoing Collaboration: Our team remains committed to supporting the client’s success, ensuring that the platform adapts to the evolving requirements of LLM-based use cases.
As data workloads increased, Enerlytica faced performance bottlenecks, leading to delays and inefficiencies in processing large datasets.
Without autoscaling, the infrastructure couldn’t adapt to fluctuating demand, resulting in under-utilized resources during low demand and overwhelmed systems during peak loads. As Enerlytica onboarded more clients and processed larger datasets, these limitations increasingly constrained their ability to scale operations.
The lack of monitoring made it difficult to track the health and performance of data pipelines. This led to challenges in identifying and resolving issues, prolonging debugging times and reducing overall reliability.
Enerlytica’s migration to AWS, coupled with the implementation of Kubernetes and Prefect, dramatically improved their ability to scale and manage complex data workflows.
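As a rough illustration of the orchestration pattern (not Enerlytica’s actual pipeline), a Prefect flow for a simple extract-transform-load job might look like the sketch below; the task names, retry settings, and placeholder data are assumptions.

```python
# Illustrative Prefect 2.x flow: retries on the extraction step and
# automatic run tracking for each task. Names and data are placeholders.
from prefect import flow, task


@task(retries=2, retry_delay_seconds=30)
def extract_market_data(source: str) -> list[dict]:
    # In a real pipeline this would pull raw readings from an upstream API
    # or object store; here we return a placeholder payload.
    return [{"source": source, "value": 42.0}]


@task
def transform(records: list[dict]) -> list[dict]:
    # Normalize and enrich records before loading.
    return [{**r, "value_mwh": r["value"] / 1000} for r in records]


@task
def load(records: list[dict]) -> int:
    # In production this would write to the analytics store.
    return len(records)


@flow(name="energy-market-etl", log_prints=True)
def energy_market_etl(source: str = "spot-prices") -> int:
    loaded = load(transform(extract_market_data(source)))
    print(f"Loaded {loaded} records from {source}")
    return loaded


if __name__ == "__main__":
    energy_market_etl()
```

Flows like this can be scheduled and run on the same Kubernetes infrastructure, giving per-task retries, run history, and visibility into each step of the pipeline.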
Autoscaling capabilities allowed their infrastructure to dynamically adjust based on real-time demand, eliminating bottlenecks and ensuring optimal resource utilization.
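One common way to implement this kind of demand-driven scaling is a Kubernetes HorizontalPodAutoscaler. The sketch below expresses an illustrative autoscaling/v2 policy as a Python dict; the target Deployment name, replica bounds, and CPU threshold are assumptions, not production values.

```python
# Illustrative HorizontalPodAutoscaler (autoscaling/v2) expressed as a dict.
# The Deployment name, replica bounds, and 70% CPU target are assumptions.
import yaml  # pip install pyyaml

hpa_manifest = {
    "apiVersion": "autoscaling/v2",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "data-worker-hpa"},
    "spec": {
        "scaleTargetRef": {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "name": "data-worker",  # hypothetical Deployment
        },
        "minReplicas": 2,
        "maxReplicas": 20,
        "metrics": [
            {
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    # Add pods when average CPU utilization across the
                    # Deployment exceeds 70% of the requested CPU.
                    "target": {"type": "Utilization", "averageUtilization": 70},
                },
            }
        ],
    },
}

if __name__ == "__main__":
    # Render to YAML so it can be piped to `kubectl apply -f -`.
    print(yaml.safe_dump(hpa_manifest, sort_keys=False))
```

An equivalent policy per service lets each component grow and shrink on its own signal instead of the whole platform scaling as one unit.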
With the integration of ClickHouse, data querying and analytics performance significantly increased, enabling Enerlytica to process large datasets faster and deliver insights more efficiently.
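For a sense of the querying side, the minimal example below uses the clickhouse-connect Python driver to run the kind of large-scan aggregation ClickHouse is built for; the host, credentials, table, and column names are assumptions, not Enerlytica’s actual schema.

```python
# Minimal analytical query against ClickHouse via clickhouse-connect.
# Host, credentials, table, and column names are illustrative assumptions.
import clickhouse_connect

client = clickhouse_connect.get_client(
    host="clickhouse.example.internal",  # hypothetical host
    username="analytics",
    password="********",
)

# Hourly consumption per meter over the last day: an aggregation over a
# large fact table that a columnar store answers quickly.
result = client.query(
    """
    SELECT
        meter_id,
        toStartOfHour(reading_time) AS hour,
        sum(consumption_kwh)        AS total_kwh
    FROM energy_readings
    WHERE reading_time >= now() - INTERVAL 1 DAY
    GROUP BY meter_id, hour
    ORDER BY meter_id, hour
    """
)

for meter_id, hour, total_kwh in result.result_rows:
    print(meter_id, hour, total_kwh)
```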
Why This Matters for Companies Building LLM-Based Solutions
Future-Ready Solutions
Scalable Architecture: As AI-driven products grow, we prioritize scalability and reliability.
Customer-Centric Approach: We align with business goals, addressing challenges and celebrating successes.
Cross-Industry Impact: Principles like modularity, monitoring, and optimization apply across various platforms.
By rethinking a B2B SaaS chatbot's architecture, we improved stability and reduced costs, ensuring every change supports our customers’ objectives through empathy and collaboration.
Open Source is a Philosophy
We understand that by collaborating with others in the community, we can create better technologies that have a positive impact on society.
growth@dasmeta.com
+49 30 16637857
Rheinsberger Str. 76/77 10115 Berlin Germany