Senior Service Reliability Engineer Job at Thoughtworks (Singapore)

Senior Service Reliability Engineer

Thoughtworks

Location:
Singapore, Singapore

Category:
IT - Software Development

Contract Type:
Not provided

Salary:
Not provided


Job Description:

As a Service Reliability Engineer (SRE) in the DAMO service line, you will take a multifaceted approach to ensure technical excellence and operational efficiency within the infrastructure domain. Specializing in reliability, resilience and system performance, you take a lead role in championing the principles of Site Reliability Engineering. By strategically integrating automation, monitoring and incident response, you facilitate the evolution from traditional operations to a more customer-focused and agile approach. Emphasizing shared responsibility and a commitment to continuous improvement, you cultivate a collaborative culture, enabling organizations to meet and exceed their reliability and business objectives.

Job Responsibility:

You will conduct SRE and Disaster Recovery (DR) maturity assessments
You will engineer automation solutions using Ansible to replace manual workflows
You will own and manage the current manual Disaster Recovery process/pipeline
You will improve site reliability through mechanisms and architectures that enhance fault tolerance and reduce MTTR/MTTD
You will drive the integration of observability automation into the CI/CD pipeline
You will handle production incidents, lead client communication, and create root cause analysis documentation
You will monitor performance of production systems and improve scaling to meet SLA and SLO targets
You will work closely with application development teams to advise and implement reliability improvements
You will improve system observability across logging, metrics and alerting, reducing false alarms to eliminate unnecessary toil and improve overall process efficiency
You will implement chaos engineering practices to regularly validate system reliability
You have a clear understanding of client goals and business needs, setting direction for site reliability in alignment with business expectations, including high availability targets such as 99.999% with minimal or no disruption where required

Requirements:

You have expertise in Ansible orchestration including advanced strategies, failure logic handling, and Jinja2 templating
You have the ability to integrate Terraform with Ansible for seamless provisioning-to-configuration workflows
You have hands-on experience with Python, Go, Bash or PowerShell scripting
You have working knowledge of at least one public cloud (AWS/Azure/GCP)
You have experience with observability tools (Grafana, Datadog, New Relic, ELK, Dynatrace, etc.) and can use data for RCA
You have familiarity with DevOps, SRE and GitOps concepts and practices
You have knowledge of container technologies and orchestration (Kubernetes, EKS, Docker Swarm, Nomad, etc.)
You have an understanding of modern architecture (microservices, serverless, NoSQL, REST APIs) and experience debugging and building metrics/dashboards
You have experience designing infrastructure aligned with Cloud Well-Architected principles (reliability, security, cost, performance, operations)
You are able to mentor team members through workshops and knowledge enablement
You are able to create comprehensive documentation and runbooks
You have strong communication and articulation skills in English
You have strong collaboration and negotiation skills with client and cross-functional teams
You have a resilient problem-solving mindset and don’t give up easily when debugging issues
You can remain calm and composed during high-pressure production incidents
You can recommend improvements backed by strong technical reasoning
You can understand both business and technical requirements and break them down into deliverables
You have strong ownership and willingness to take responsibility beyond strict role boundaries
You are willing to participate in rotation-based or need-based 24x7 availability support
Candidates must be Singaporean citizens or already hold Singaporean Permanent Residency (PR) at the time of application.

What we offer:

Learning & Development: There is no one-size-fits-all career path at Thoughtworks: however you want to develop your career is entirely up to you. But we also balance autonomy with the strength of our cultivation culture. This means your career is supported by interactive tools, numerous development programs and teammates who want to help you grow. We see value in helping each other be our best and that extends to empowering our employees in their career journeys.

Additional Information:

Job Posted:
January 12, 2026

Employment Type:
Full-time
