Our journey to GitOps
Originally published on an external platform.
In this blog, I will share our journey from being legitimate script kiddies to running a GitOps-enabled infrastructure. Like every other DevOps team, we had our fair share of chaos: we ran a patchwork of tools that varied depending on who was working and what problem they were trying to solve.
We thought we had a well-defined process for provisioning infrastructure:
Python/Ruby/Bash for spinning up cloud components, Chef for configuration management, Jenkins to run ad-hoc jobs, and GitHub for storing changes (if we remembered to do so 🤪).
However, what we ended up with was a pile of disparate scripts, no real version control, Chef cookbooks with no idempotency and conflicting versions, and code that often never made it into GitHub.
If I am being honest, we were in a very chaotic situation with no confidence in our infrastructure. Outages were happening left, right, and center, and often we didn't know why.
So we thought: What should we do to fix this? How can we reach a state where we have confidence in our infrastructure and start following standards? 🤔
The Plan
We decided on a roadmap to guide us toward a stable, automated future:
- Become 100% GitOps compliant.
- Become cloud agnostic.
- Maintain infrastructure with a well-defined state and a single source of truth.
- Make application configuration part of provisioning, so changes and updates flow through the same process.
- Make monitoring part of provisioning.
- Implement version certification.
- Automate away most manual tasks.
The Framework
We developed a framework with the following tools to achieve our goals:

Let me explain the role of each tool:
- Terraform: We used Terraform for cloud provisioning on both Amazon and Azure. We adopted Terraform's module mechanism, where each component calls multiple modules pinned to specific versions (a sketch follows this list).
- HashiCorp Vault: Used to store all the credentials required for provisioning, such as cloud credentials and Git credentials.
- Jenkins: Used to run the provisioning, modification, update, and destruction pipelines.
- Golang: Used to write a custom tool that knits all these technologies together.
- GitHub: Used as our absolute source of truth.
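To make that module mechanism concrete, here is a minimal sketch of what one such component could look like. The module sources, versions, Vault paths, and the Vault-provider wiring are all illustrative, not our actual setup:

```hcl
# Illustrative component; module sources, versions, and secret paths are hypothetical.

# Credentials come from Vault rather than being hardcoded in the repo.
provider "vault" {
  address = "https://vault.example.internal"
}

data "vault_generic_secret" "aws_creds" {
  path = "secret/provisioning/aws"
}

provider "aws" {
  region     = "us-east-1"
  access_key = data.vault_generic_secret.aws_creds.data["access_key"]
  secret_key = data.vault_generic_secret.aws_creds.data["secret_key"]
}

# The component composes modules, each pinned to a certified version.
module "network" {
  source     = "git::https://github.com/example-org/tf-module-network.git?ref=v1.4.0"
  cidr_block = "10.0.0.0/16"
}

module "compute" {
  source     = "git::https://github.com/example-org/tf-module-compute.git?ref=v2.1.3"
  subnet_ids = module.network.private_subnet_ids
}
```

The key point is that every module source is pinned to an explicit version tag, and no credential ever lives in the repository itself.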
The Architecture
Here is how the whole flow looks, from start to end:

With this automation, we were able to accomplish the following:
- GitHub is our one source of truth.
- A well-defined state for our infrastructure.
- Proper lifecycle management of resources.
- Security is a first-class citizen.
- Little to no manual work related to provisioning or updates.
- Proper version control and standardization.
Version Certification
We also spent a good amount of time on Version Certification, where we certify Terraform module versions against one another and maintain version-controlled component packages. A component package is simply a combination of specific Terraform modules tested together.
Figure: Component Packaging Structure
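In spirit, a component package is just a named, versioned manifest of certified module versions. A rough sketch, with made-up names and versions:

```hcl
# Hypothetical component package: a named, versioned set of Terraform
# modules that have been tested and certified together.
locals {
  component_package = {
    name    = "web-service"
    version = "3.2.0"
    modules = {
      network = "v1.4.0"
      compute = "v2.1.3"
      elb     = "v0.9.1"
    }
  }
}
```

Bumping any single module version yields a new package version, which then has to be certified again before use.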
The Power of Profiles
To streamline our infrastructure delivery, we introduced the concept of Infrastructure Profiles. Instead of requiring developers to understand every nuance of VPC CIDRs or subnet IDs, we abstracted these into pre-defined archetypes.
- Standardized Blueprints: We created Profiles based on specific use cases (e.g., "Internal Tooling", "High-Traffic API", "Database Cluster").
- Automated Injection: Most Terraform variables are now populated automatically simply by selecting a profile type. This includes complex networking configuration such as VPC type, subnets, and load balancer (ELB/ALB) attributes; a sketch follows this list.
- Reduced Friction: This abstraction allows our engineering teams to focus on their application logic rather than the plumbing of the cloud provider.
- Consistency: By using profiles, we ensure that every environment, from Dev to Prod, follows the same structural standards, eliminating the "it worked in Dev" surprises during production rollouts.
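Here is a minimal sketch of how profile-driven injection can look in Terraform. The profile names, attributes, and values are illustrative rather than our production definitions:

```hcl
# A developer selects one of the pre-defined profiles by name.
variable "profile" {
  type        = string
  description = "Infrastructure profile selected by the developer."
}

locals {
  # Each profile bundles the networking and sizing decisions that
  # developers would otherwise have to make by hand.
  profiles = {
    internal_tooling = {
      vpc_type      = "private"
      subnet_tier   = "internal"
      lb_type       = "none"
      instance_size = "t3.medium"
    }
    high_traffic_api = {
      vpc_type      = "public"
      subnet_tier   = "public-private"
      lb_type       = "alb"
      instance_size = "c5.xlarge"
    }
    database_cluster = {
      vpc_type      = "private"
      subnet_tier   = "data"
      lb_type       = "nlb"
      instance_size = "r5.large"
    }
  }

  # The selected profile feeds the underlying modules.
  selected = local.profiles[var.profile]
}
```

A developer only chooses `var.profile`; everything under `local.selected` is then injected into the underlying modules.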
The Certification Workflow
This is how the whole Certification Process looks:
Conclusion
Overall, we were able to achieve the goals we set out to reach. Now, all our deployments are initiated by a single git commit, and further modifications, updates, and destruction follow the same controlled flow. We are far more confident in our infrastructure: every change we introduce is thoroughly tested, and we no longer wrangle countless manual configuration files and scripts.
This process also made cost optimization and resource lifecycle management much easier.
Rollbacks became straightforward: each resource carries a build tag and a git commit tag, allowing us to roll back whenever necessary.
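As a rough illustration, assuming the pipeline passes build metadata in as Terraform variables (the variable and tag names here are hypothetical):

```hcl
variable "ami_id"     { type = string }
variable "build_tag"  { type = string } # e.g. the Jenkins build number
variable "git_commit" { type = string } # the commit that triggered the pipeline

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = "t3.medium"

  # Every resource carries the build and commit that produced it,
  # so a rollback can re-apply the exact prior state.
  tags = {
    BuildTag  = var.build_tag
    GitCommit = var.git_commit
  }
}
```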

Maybe what we did isn't extraordinary compared to what the industry giants or the community are already doing. However, when I look back at where we started, we have come a long way, and that gives us a great sense of accomplishment. 🤩
Hope this motivates you to push automation further and reduce engineering toil. We can truly achieve great things with simplicity and the tools already available to us.