A back end that seamlessly serves AI content to 200 million devices in 190 countries, in a dozen or so languages, and to countless third parties requires large infrastructure: hundreds of clusters, thousands of compute instances, buckets for object storage, stateful data to manage, and secrets to protect.
I wrote a lot of Terraform. Call it 300,000 lines. This purely IaC-based approach went a long way toward repeatable, reliable deployments, but it also gave me unforgettable experiences with HCL syntax changes, outdated providers, design patterns and anti-patterns, and all the ways one might work around Terraform's lack of DRY syntax (from workspaces to generators).
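To make the DRY problem concrete, here is a minimal sketch of one common workaround: looping a module over a map of regions with `for_each` instead of copy-pasting near-identical blocks. The module path, variable shape, and values are hypothetical, not the actual layout I used.

```hcl
# Hypothetical example: stamp out one cluster module per region with for_each
# instead of duplicating near-identical resource blocks.
variable "regions" {
  type = map(object({
    instance_type = string
    min_size      = number
  }))
  default = {
    "us-east-1" = { instance_type = "m5.xlarge", min_size = 3 }
    "eu-west-1" = { instance_type = "m5.xlarge", min_size = 2 }
  }
}

module "cluster" {
  source   = "./modules/cluster" # hypothetical module path
  for_each = var.regions

  region        = each.key
  instance_type = each.value.instance_type
  min_size      = each.value.min_size
}
```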
To balance scalability, cost control, and operational complexity, we started with homegrown service discovery and ultimately graduated to a global service mesh.
I implemented this mesh using HashiCorp Consul, Vault (for mTLS), and Envoy.
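As a rough sketch of how a service plugs into that kind of mesh, the Consul service definition below registers a hypothetical `conversation-api` with an Envoy sidecar and an mTLS-secured upstream; the service names and ports are illustrative, not the real topology.

```hcl
# Illustrative Consul service definition: registers "conversation-api" into the
# mesh with an Envoy sidecar that binds an mTLS-authenticated upstream locally.
service {
  name = "conversation-api"
  port = 8080

  connect {
    sidecar_service {
      proxy {
        upstreams = [
          {
            destination_name = "nlu-engine" # hypothetical upstream service
            local_bind_port  = 9191
          }
        ]
      }
    }
  }

  check {
    name     = "http health"
    http     = "http://localhost:8080/health"
    interval = "10s"
  }
}
```

In a setup like this, the sidecar is launched with `consul connect envoy -sidecar-for conversation-api`, and Vault's PKI engine can serve as the Connect certificate authority so that mesh certificates are issued and rotated from Vault.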
I played a key role in keeping conversations seamless even when the back end was in a degraded state.
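I won't reproduce that work here, but to give a flavor of one building block for this kind of degradation handling, a Consul service-resolver config entry can fail traffic over to a secondary datacenter when no healthy local instances remain. The service and datacenter names below are hypothetical.

```hcl
# Hypothetical Consul service-resolver: if "conversation-api" has no healthy
# local instances, fail requests over to instances in a secondary datacenter.
Kind           = "service-resolver"
Name           = "conversation-api"
ConnectTimeout = "5s"

Failover = {
  "*" = {
    Datacenters = ["dr-west"] # illustrative failover datacenter
  }
}
```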
Viv's third-party integrations and capsule architecture meant that we needed to make provisions for running untrusted code.
I worked with the dev team and red team leads to develop a threat model and sandbox system that used a defense-in-depth approach from layer 7 all the way down the stack.
I implemented layers of isolation at the container, cluster, and network levels. I built systems to automate static code analysis and other scanning used to evaluate code prior to deployment.
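As one hedged illustration of the network layer of that isolation, a Terraform-managed Kubernetes NetworkPolicy can pin untrusted capsule pods so that only a gateway may reach them and no egress is allowed; every name and label below is hypothetical.

```hcl
# Illustrative network-layer isolation for a sandbox namespace: only the
# gateway pods may reach untrusted capsule pods, and all egress is denied.
resource "kubernetes_network_policy" "capsule_sandbox" {
  metadata {
    name      = "capsule-sandbox-isolation" # hypothetical name
    namespace = "capsule-sandbox"
  }

  spec {
    pod_selector {
      match_labels = {
        role = "untrusted-capsule"
      }
    }

    ingress {
      from {
        pod_selector {
          match_labels = {
            role = "capsule-gateway" # only the gateway may reach the sandbox
          }
        }
      }
    }

    # No egress rules: with "Egress" listed in policy_types, all outbound
    # traffic from matching pods is denied by default.
    policy_types = ["Ingress", "Egress"]
  }
}
```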