Satyajit Roy
Engineering Executive | Platform, SRE & AI Infrastructure
I design, scale, and lead hyperscale platforms delivering reliability, efficiency, and clarity at internet scale.
Strategic Expertise
What I bring to engineering organizations
- Leadership of global engineering organizations
- Architecting hyperscale systems
- Turning reactive engineering into proactive, high‑velocity excellence
- Delivering 20-65% cloud cost reductions via FinOps
- Scaling AI/ML and high-performance infrastructure
- Bridging technical risk with strategic business value
Featured Case Studies
Hyperscale ML/Search Platform at Adobe
Architect & Technical Leader
Adobe's Core Search and Sensei platform serves as the intelligence layer behind flagship products, processing 30B+ daily requests.
AI/ML workloads were outgrowing the existing infrastructure, creating scaling, latency, and cost challenges.
- Multi-billion requests, GPU utilization +38%
- Supported 30B+ daily API requests with >99.98% availability.
- Increased GPU utilization by 38% through smarter scheduling.
Global SRE Operating Model at F5
Sr. Director of Product Engineering & Head of SRE
F5’s Distributed Cloud platform powers global multi‑cloud networking and security for enterprise customers.
Silos, inconsistent incident response, and burnout were slowing down a platform facing explosive traffic growth.
- 55+ engineers, MTTR −73%
- Reduced MTTR by 73% and improved incident consistency.
- Lowered attrition by 10% by eliminating hero culture.
Platform Modernization & FinOps at Arkose Labs
Director of Engineering & SRE
Arkose Labs fights fraud at internet scale, requiring real‑time decisioning under unpredictable attack traffic.
Cloud spend was rising faster than revenue, and technical debt was slowing delivery.
- 22% cloud spend reduction
- Reduced cloud spend by 22% while supporting 7x transaction growth.
- Maintained 99.9% SLA even during attack spikes.
Enterprise CI/CD & Platform Modernization at Macys.com
Architect & Technical Leader
Macy’s needed a modern deployment platform to support rapid retail innovation and peak‑season reliability.
Deployments were slow, manual, and risky — causing downtime during revenue‑critical periods.
- Near-zero downtime releases
- Achieved near‑zero downtime releases across the e‑commerce stack.
- Cut deployment time from days to under an hour.
How I Work
Leadership & Philosophy
Systems Thinking
I approach engineering organizations as distributed systems—optimizing for flow, feedback loops, and resilience at scale.
Empowered Teams
I build high-trust cultures where engineers own their outcomes, with clear paths for growth and autonomy.
Operational Excellence
Reliability is a feature. I champion SRE principles to shift from reactive firefighting to proactive stability.
Open Source
Writing
The Matryoshka Dolls of Modern Networking: A Technical Evolution
A layered exploration of modern networking — from packets to policy — and how it shapes cloud‑native systems.
And I thought I knew about DNS
A deep dive into DNS resolution, propagation, and common pitfalls in distributed environments.
Git Selective Ignore - Because Sometimes You Need to Keep Secrets from Git (But Not From Yourself)
How to use local ignores to keep secrets out of git without losing your sanity.