
Cloud Infrastructure Cost Optimization


Customer Story

Our customer’s cloud infrastructure, hosted on AWS, was experiencing escalating costs due to inefficient resource utilization.

With a dynamic workload that varied significantly between peak and off-peak hours, they were seeking solutions to scale their resources effectively, ensuring optimal performance without incurring unnecessary expenses.

DevOps

Published: Feb 22, 2024

Client: Undisclosed (NDA)

Industry: Enterprise

Solution

Strategic Resource Scaling with kube-downscaler and Jenkins

Cost optimization is a crucial aspect of managing any cloud infrastructure, especially in environments like Amazon Web Services (AWS) where resource allocation can directly impact expenses.

One effective strategy is to improve resource utilization within services such as Amazon Elastic Kubernetes Service (EKS). Our team proposed a two-pronged approach to this challenge.

Firstly, we focused on optimizing resource utilization within their EKS cluster. Here, the kube-downscaler tool played a pivotal role in managing pod scaling, enabling automated scale-down during off-peak hours to reduce costs.

Secondly, we established a Jenkins pipeline that allowed developers to manually scale resources, ensuring that scaling decisions could be made swiftly and efficiently in response to changing demands.

Kube-downscaler

kube-downscaler is a tool designed to automatically scale down Kubernetes pods during specified time ranges, typically during off-peak hours, to optimize resource utilization and reduce costs. It achieves this by leveraging Kubernetes’ API to identify and scale down pods within selected namespaces based on predefined schedules.

How does kube-downscaler work?

Configuration: Administrators define the time ranges during which pod scaling should occur using kube-downscaler’s configuration options. These time ranges dictate when pods will be scaled down (e.g., during non-business hours) and when they should be scaled back up.
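As a sketch, a cluster-wide default window can be set through kube-downscaler's DEFAULT_UPTIME environment variable (the deployment name and namespace here are assumptions; adjust them to match your installation):

```shell
# Sketch: set a default uptime window via the DEFAULT_UPTIME environment
# variable. The deployment name and namespace are assumptions; adjust them
# to your setup.
kubectl -n kube-downscaler set env deployment/kube-downscaler \
  DEFAULT_UPTIME="Mon-Fri 09:00-18:00 UTC"
```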

Pod Selection: kube-downscaler identifies pods eligible for scaling based on the namespaces specified in its configuration. Administrators can choose to include or exclude specific namespaces from scaling activities. For example, namespaces essential for Kubernetes operation, such as kube-system, are typically excluded to prevent disruption to critical components.

Scaling Actions: During the defined time ranges, kube-downscaler queries the Kubernetes API to retrieve pods within the specified namespaces. It then initiates scaling actions, typically by adjusting the number of replicas for relevant deployments or stateful sets to zero, effectively terminating the associated pods.

Annotation-based Exclusion: Administrators can further customize kube-downscaler’s behavior by annotating namespaces with specific metadata. For instance, by annotating a namespace with downscaler/exclude: "true", administrators can instruct kube-downscaler to skip scaling actions within that namespace, ensuring critical workloads remain unaffected.
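For example, the exclusion annotation can be applied with kubectl (the namespace name here is a placeholder):

```shell
# Exclude a (placeholder) "monitoring" namespace from all
# kube-downscaler scaling actions:
kubectl annotate namespace monitoring downscaler/exclude=true --overwrite
```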

Logging and Monitoring: kube-downscaler provides logging and monitoring capabilities to track scaling activities and ensure proper functioning. Administrators can review logs to verify scaling events and diagnose any issues that may arise.
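The logs can be inspected with standard kubectl commands, for instance (assuming kube-downscaler runs as a deployment in the kube-downscaler namespace):

```shell
# Follow kube-downscaler's log output to observe its scaling decisions:
kubectl -n kube-downscaler logs deployment/kube-downscaler --tail=50 -f
```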

Decision-making Process for Scaling:

Namespace Annotations: kube-downscaler first checks namespace annotations to determine whether a namespace should be excluded from scaling activities. Namespaces annotated to exclude scaling are skipped.

Time Range Evaluation: During active time ranges specified in the configuration, kube-downscaler evaluates each eligible namespace to determine whether scaling actions should be performed. If the current time falls within a configured scaling window for a namespace, kube-downscaler proceeds with scaling activities; otherwise, it moves on to the next eligible namespace.
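The window check can be sketched in plain shell as a simplified illustration of this evaluation (this is not kube-downscaler's actual implementation, and it relies on GNU date):

```shell
# Simplified sketch of an uptime-window check for "Mon-Fri 09:00-18:00".
# Uses GNU date's -d flag (Linux); this illustrates the idea only.
in_uptime_window() {
  local ts="$1" dow hm
  dow=$(date -d "$ts" +%u)   # day of week: 1=Mon ... 7=Sun
  hm=$(date -d "$ts" +%H%M)  # e.g. "1030"
  [ "$dow" -le 5 ] && [ "$hm" -ge 0900 ] && [ "$hm" -lt 1800 ]
}

in_uptime_window "2024-02-22 10:30" && echo "keep up" || echo "scale down"
# → keep up (2024-02-22 is a Thursday and 10:30 is inside the window)
```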

Pod Selection: Once kube-downscaler identifies namespaces within active time ranges, it retrieves relevant pods within those namespaces using Kubernetes API queries.

Scaling Actions: Based on the retrieved pods, kube-downscaler scales the relevant deployments or stateful sets down to zero replicas, terminating the associated pods.
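In manual kubectl terms, this scaling action is equivalent to the following (the namespace and deployment names are placeholders):

```shell
# Equivalent manual action: scale a deployment down to zero replicas.
# "team-a" and "web-app" are placeholder names.
kubectl -n team-a scale deployment web-app --replicas=0
```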

Deployment of kube-downscaler to a Kubernetes cluster:

1. Clone the kube-downscaler Repository

				
### Clone the kube-downscaler repository to your local machine:
git clone https://codeberg.org/hjacobs/kube-downscaler.git

2. Navigate to the Deployment Manifests

				
### Enter the kube-downscaler directory:
cd kube-downscaler

3. Apply the Deployment Manifests

				
### Apply the Kubernetes deployment manifests to your cluster:
kubectl apply -f deploy/

4. Verify Deployment

				
### Verify that kube-downscaler is deployed and running correctly:
kubectl get pods -n kube-downscaler

In alignment with typical developer work hours, kube-downscaler can be configured to scale up pods during standard working hours, such as Monday to Friday from 9:00 to 18:00.

During this time range, when developers are actively working on projects, it’s essential to ensure that sufficient resources are available to support their activities. Therefore, kube-downscaler will scale up pods to meet the demand and maintain optimal performance levels.

Conversely, outside of the designated working hours, which encompass the rest of the week and times beyond 9:00 to 18:00 on weekdays, kube-downscaler shifts its focus to cost optimization. During these off-peak periods, when developer activity is minimal or nonexistent, kube-downscaler scales down pods to zero replicas within specified namespaces.

By scaling down pods during these times, kube-downscaler helps organizations minimize resource wastage and reduce cloud infrastructure costs without sacrificing performance or availability.

This approach ensures that resources are dynamically allocated based on workload demand, optimizing efficiency and cost-effectiveness across the Kubernetes cluster.
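Per namespace, such a working-hours window can also be expressed as an annotation, for instance (the namespace name is a placeholder):

```shell
# Keep workloads in a (placeholder) "dev" namespace up only during
# working hours; outside this window they are scaled down.
kubectl annotate namespace dev 'downscaler/uptime=Mon-Fri 09:00-18:00 UTC' --overwrite
```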

Create Jenkins Pipeline:

Parameters

– ACTION: Dropdown list with options “Scale Up” and “Scale Down”.
– excludedNS: String parameter containing a comma-separated list of namespaces to exclude, such as “kube-system,kube-public”.

Pipeline Script:

				
pipeline {
    agent any

    parameters {
        choice(name: 'ACTION', choices: ['Scale Up', 'Scale Down'], description: 'Select action to perform')
        string(name: 'excludedNS', defaultValue: 'kube-system,kube-public', description: 'Namespaces to exclude')
    }

    stages {
        stage('Initialize') {
            steps {
                script {
                    // Collect all namespace names as a single whitespace-separated
                    // string (environment variables can only hold strings)
                    env.filteredNS = sh(script: "kubectl get namespaces -o=jsonpath='{.items[*].metadata.name}'", returnStdout: true).trim()
                }
            }
        }

        stage('Calculate Exclusion Time') {
            steps {
                script {
                    // Calculate exclusion time (+4 hours) as an ISO 8601 timestamp
                    def excludeUntilDate = new Date(System.currentTimeMillis() + 4 * 60 * 60 * 1000)
                    env.excludeUntil = excludeUntilDate.format("yyyy-MM-dd'T'HH:mm")
                }
            }
        }

        stage('Annotate Namespaces') {
            steps {
                script {
                    def excludedList = params.excludedNS.split(',').collect { it.trim() }
                    // Iterate over namespaces and annotate based on the selected action
                    for (String nsName : env.filteredNS.split('\\s+')) {
                        if (!excludedList.contains(nsName)) {
                            if (params.ACTION == 'Scale Up') {
                                // Remove the downtime annotation (the trailing "-" deletes it)
                                sh "kubectl annotate ns ${nsName} downscaler/downtime- || true"
                            } else {
                                sh "kubectl annotate ns ${nsName} 'downscaler/exclude-until=${env.excludeUntil}' --overwrite"
                                sh "kubectl annotate ns ${nsName} 'downscaler/downtime=Mon-Fri 18:01-08:59 UTC' --overwrite"
                            }
                        }
                    }
                }
            }
        }
    }
}
				
			

Jenkins Script Purpose:

1. Initializes the job by splitting excluded namespaces into a list.
2. Calculates the exclusion time (+4 hours).
3. Iterates over namespaces and annotates them based on the selected action (“Scale Up” or “Scale Down”).
4. Annotations are applied accordingly:

If “Scale Up” is selected, the downtime annotation is removed to scale up pods.
If “Scale Down” is selected, the exclude-until annotation is set to the calculated time, and a downtime annotation is applied to scale down pods during non-working hours.
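After a pipeline run, the applied annotations can be inspected, for example (the namespace name is a placeholder):

```shell
# Inspect the annotations the pipeline set on a (placeholder) namespace:
kubectl get namespace dev -o jsonpath='{.metadata.annotations}'
```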

The implementation of kube-downscaler and the Jenkins pipeline for manual scaling significantly optimized our customer’s AWS resource utilization. By dynamically scaling resources in response to workload demands, we helped them achieve a more cost-effective cloud infrastructure. This strategic approach not only reduced their AWS expenses but also ensured that resources were readily available during peak business hours, maintaining high performance and reliability. 

[Charts: AWS costs before kube-downscaler vs. after kube-downscaler]

This achievement strengthened our partnership and showcases the impact of effective cloud resource management and strategic scaling on cost optimization in AWS environments.

We are proud of our DevOps team, and our customer looks forward to continued optimization and enhancement of their cloud infrastructure.

Be our next success story!

Drop us a line! We care about your business and are here to answer your questions 24/7.