April 2024 Highlights: EKS Upgrades, Enhanced Monitoring, Advanced Security, and Deployment Improvements

Aram Karapetyan • May 17, 2024

In April, our infrastructure management team worked through 89 tasks covering infrastructure setup, monitoring improvements, deployments, database management, and more. This article walks through the most significant work and highlights the team's proactive approach to managing different infrastructures for our partners.

Infrastructure and Environment Setup

Last month, we made significant improvements in setting up and optimizing infrastructure environments to enhance performance and reliability for our clients.


Related Tasks


  1. (DMVP-3648, DMVP-3783) Deployed stage environments and set up external health checks.
  2. (DMVP-3679) Implemented EFS Lifecycle Management.
  3. (DMVP-3778) Improved pipeline speed.
  4. (DMVP-3871, DMVP-3862, DMVP-3920) Set up new URLs, adjusted rollout settings, and established VPNs for new database stages.
  5. (DMVP-4054) Upgraded the client's EKS production cluster to 1.27.

Monitoring and Metrics Improvements

The team established monitoring frameworks to ensure performance tracking and timely alerts for system anomalies.


Related Tasks


  1. (DMVP-1819, DMVP-3493) Set up alerts for blocked requests and job queue lengths on Grafana dashboards.
  2. (DMVP-3279) Added maximum metrics to all widgets that previously displayed averages.
  3. (DMVP-3907) Introduced log-based metrics and anomalies monitoring.
  4. (DMVP-4033) Adjusted CloudWatch dashboards and set new limits for CPU/memory monitoring.
  5. (DMVP-4026) Set up alarms to be sent to Opsgenie.
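
To illustrate why a maximum series is worth showing next to an average (item 2 above), here is a minimal sketch in plain Python. The sample values, the 80% threshold, and the function name `summarize` are hypothetical and not taken from the actual dashboards. The sketch only shows that a short spike can disappear in an average while staying visible in the maximum.

```python
from statistics import mean


def summarize(samples, threshold):
    """Return the average, the maximum, and whether each one exceeds the threshold."""
    avg = mean(samples)
    peak = max(samples)
    return {
        "average": avg,
        "maximum": peak,
        "average_breaches": avg > threshold,
        "maximum_breaches": peak > threshold,
    }


# Hypothetical CPU utilization samples (percent) for one reporting period.
cpu_samples = [20, 22, 19, 95, 21, 23]

# Hypothetical alert threshold of 80 percent.
summary = summarize(cpu_samples, threshold=80)
print(summary)
```

With these made-up samples, the average stays well below the threshold while the single spike exceeds it, so a widget that shows only the average would hide the event.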

Support and Troubleshooting

Ongoing support and rapid troubleshooting were key aspects of our service, ensuring seamless operations for our clients.


Related Tasks


  1. (DMVP-3394, DMVP-4040, DMVP-4016) Supported developers with tracing setup and addressed various production issues including database migrations, job failures, and service restarts.



Security and Compliance Enhancements

By monitoring security closely and running regular checks, we made sure our clients' systems stayed secure and met compliance standards.


Related Tasks


  1. (DMVP-3754, DMVP-3755, DMVP-3878) Moved passwords and external secrets to secure repositories and created modules for managing secrets in GCP.
  2. (DMVP-3955, DMVP-3952) Increased WAF alarm thresholds and reviewed security group resources.


Development and Deployment Processes

We improved the efficiency and reliability of development and deployment processes through automation and enhancements in our pipelines.


Related Tasks


  1. (DMVP-2832) Provided support for specific branch deployments to test instances.
  2. (DMVP-3842, DMVP-3938) Developed new pipelines for components and integrated static analysis tools like deptrac into existing pipelines.
  3. (DMVP-3899, DMVP-3900) Reviewed and fixed naming for dashboard widgets and alarms to improve clarity and consistency.


Application and Database Management

Our focus on application and database management ensured optimized operations and scalability for our clients.


Related Tasks


  1. (DMVP-3919) Migrated data from development databases to new staging databases.
  2. (DMVP-3921, DMVP-3903) Handled application-specific issues, such as fixing UI kit npm download problems and adjusting application settings in Keycloak containers.


Those were our key achievements in April across different clients.


Stay tuned for the May updates, and let us know if you would like to learn more about our cloud infrastructure management best practices.
