Satyajit Roy
Engineering Executive | Platform, SRE & AI Infrastructure
I design, scale, and lead hyperscale platforms delivering reliability, efficiency, and clarity at internet scale.
Strategic Expertise
What I bring to engineering organizations
- Leadership of global engineering organizations
- Architecting hyperscale systems
- Turning reactive engineering into proactive, high‑velocity excellence
- Delivering 20-65% cloud cost reductions via FinOps
- Scaling AI/ML and High-Performance infrastructure
- Bridging technical risk with strategic business value
Featured Case Studies
Zero to Production-Grade: Rebuilding Mandolin's Entire Cloud Platform from Scratch
Engineering Leader, Mandolin
Mandolin is a healthcare AI startup ($40M Series A, Greylock), operating in hyper-growth with HIPAA/GDPR compliance requirements and enterprise healthcare tenants.
Four deployment systems with no source of truth. Every service sat on a public endpoint with no segmentation, no mTLS, and static credentials scattered across codebases. No DR, no defined RTO. High operating cost, builds taking 45 minutes to 1 hour, and zero developer self-service.
- 53% infra cost reduction, 67% MTTR reduction, 53 days, solo execution (while building a team from scratch)
- 53% reduction in infrastructure costs while simultaneously growing the resource footprint.
- 67% reduction in MTTR via Resolve.ai-automated incident triage.
Hyperscale ML/Search Platform at Adobe
Architect & Technical Leader
Adobe's Core Search and Sensei platform serves as the intelligence layer behind flagship products, processing 30B+ daily requests.
AI/ML workloads were outgrowing the existing infrastructure, creating scaling, latency, and cost challenges.
- Multi-billion daily requests, GPU utilization +38%
- Supported 30B+ daily API requests with >99.98% availability.
- Increased GPU utilization by 38% through smarter scheduling.
Enterprise Elasticsearch Consolidation at Adobe
Architect & Technical Leader
Adobe’s search infrastructure was fragmented across 18+ managed clusters with varying versions, driving high licensing costs and operational complexity.
Managed service lock-in and version fragmentation were creating a multi-million dollar licensing burden without the necessary operational control.
- Millions in annual savings, 30% cost reduction
- Reduced annual Elasticsearch licensing costs by millions of dollars (30% net savings).
- Achieved full operational control over search performance and security posture.
Global SRE Operating Model at F5
Sr. Director of Product Engineering & Head of SRE
F5’s Distributed Cloud platform powers global multi‑cloud networking and security for enterprise customers.
Silos, inconsistent incident response, and burnout were slowing down a platform facing explosive traffic growth.
- 55+ engineers, MTTR −73%
- Reduced MTTR by 73% and improved incident consistency.
- Lowered attrition by 10% by eliminating hero culture.
FedRAMP High & Zero-Trust Architecture at F5
Sr. Director of Product Engineering & Head of SRE
F5's Distributed Cloud platform required the highest levels of security to serve federal and highly regulated enterprise customers.
The challenge was achieving and sustaining high-bar compliance (FedRAMP High) while maintaining rapid feature velocity in a multi-cloud environment.
- FedRAMP High, PCI-DSS, SOC 2
- Successfully achieved FedRAMP High, PCI-DSS, and SOC 2 certifications.
- Accelerated feature velocity by 40% by shifting security and compliance left.
Platform Modernization & FinOps at Arkose Labs
Director of Engineering & SRE
Arkose Labs fights fraud at internet scale, requiring real‑time decisioning under unpredictable attack traffic.
Cloud spend was rising faster than revenue, and technical debt was slowing delivery.
- 22% cloud spend reduction
- Reduced cloud spend by 22% while supporting 7x transaction growth.
- Maintained 99.9% SLA even during attack spikes.
How I Work
Leadership & Philosophy
Systems Thinking
I approach engineering organizations as distributed systems—optimizing for flow, feedback loops, and resilience at scale.
Empowered Teams
I build high-trust cultures where engineers own their outcomes, with clear paths for growth and autonomy.
Operational Excellence
Reliability is a feature. I champion SRE principles to shift from reactive firefighting to proactive stability.
Open Source
Writing
The Matryoshka Dolls of Modern Networking: A Technical Evolution
A layered exploration of modern networking — from packets to policy — and how it shapes cloud‑native systems.
And I thought I knew about DNS
A deep dive into DNS resolution, propagation, and common pitfalls in distributed environments.
Git Selective Ignore - Because Sometimes You Need to Keep Secrets from Git (But Not From Yourself)
How to use local ignores to keep secrets out of git without losing your sanity.