Platform at EarnIn strives to operate as a platform-as-product organization. We build the foundation that empowers product teams to ship quickly and safely: golden-path CI/CD, Kubernetes runtimes, observability, self-service workflows, and paved paths exposed through our developer control plane. We hold a high bar for operational excellence and measure our impact through developer velocity, reliability, and cost efficiency.
Job Responsibilities:
Design and evolve GitOps-based continuous delivery with Argo CD and Argo Rollouts (progressive delivery, automated rollbacks), integrating with our CI pipelines and standardized Helm/Kustomize workflows
Advance our Kubernetes platform on AWS EKS with strong multi-env hygiene, security (Pod Identity/RBAC), and rollout strategies; partner on cluster-level upgrades and “cluster vending” patterns for safer blue/green upgrades over time
Leverage our developer control plane (Cortex) to expose paved paths, scorecards, and self-service actions (bootstrap, deploy, SLOs, operations) so teams can move from idea to production smoothly
Strengthen observability and operational excellence: SLOs/error budgets, Datadog metrics/traces/logs, RUM (where applicable), and blameless postmortems that lead to preventive actions and automation
Partner with SRE to embed reliability gates into pipelines (pre-merge and pre-deploy validation, canary policies), and improve MTTD/MTTR through better telemetry and predictable rollback strategies
Contribute to and maintain service scaffolds, templates, and shared frameworks that encode standards (testing, security, telemetry), and keep supported language/framework versions aligned to platform baselines
Apply AI responsibly across diagnostics, validation, and CI/CD/ops workflows (e.g., anomaly detection, test generation, performance triage), measuring outcomes and iterating for impact
Requirements:
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field, or related experience
4+ years in platform, infrastructure, or backend engineering with deep hands-on experience in Kubernetes (preferably EKS) and cloud-native architectures on AWS
Expertise in GitOps and CD (Argo CD/Argo Rollouts) and CI (GitHub Actions: reusable workflows, shared actions) for multi-service systems at scale
Strong coding skills in Go and/or Python (Java/Kotlin is a plus), treating infrastructure as software; fluency with Helm/Kustomize and Terraform (IaC) for platform automation
Solid observability skills (Datadog APM/metrics/tracing/logs) with a track record of improving reliability and driving SLO/error-budget culture
Experience with service mesh (e.g., Linkerd) and traffic management patterns for progressive delivery and resilience
Demonstrated AI fluency applied to the SDLC (validation, diagnostics, automation) with a bias for measurement and iteration
Excellent communication and collaboration skills; mentoring mindset and ability to influence teams toward consistent standards and safer delivery at speed
Nice to have:
Experience evolving multi-cluster strategies (blue/green upgrades, cluster vending), automated validation (e.g., Testkube), and DR/chaos practices
Hands-on contributions to developer productivity insights (lead time, change failure rate) and FinOps observability for cost-aware engineering decisions
What we offer:
Healthcare
Internet/cell phone reimbursement
A learning and development stipend
Opportunities to collaborate with and travel to our Palo Alto HQ and Bangkok site