Lead Application Reliability Engineer

Citi

Location:
United States, Irving

Category:
IT - Software Development

Contract Type:
Employment contract

Salary:

125760.00 - 188640.00 USD / Year

Job Description:

The selected candidate will become the key engineer supporting and advancing the platform used for the threat-modeling process at Citi. Responsibilities include, among others, maintaining and supporting the threat-modeling application and developing tools used throughout the threat-modeling process. The application consists of web servers and backend databases; supporting it requires an understanding of middleware, databases, containers, and the AWS cloud environment, as well as change-control and compliance processes.

Job Responsibility:

  • Ensure high availability and optimal performance of the threat-modeling application through proactive monitoring, incident management, and efficient troubleshooting
  • Perform routine and emergency application and infrastructure maintenance, including patching, upgrades, and configuration management, adhering strictly to change control procedures
  • Conduct root cause analysis (RCA) for production incidents and implement preventative measures to minimize future occurrences
  • Develop and maintain automation scripts and tools (e.g., using Python, Bash) to streamline operational tasks, improve monitoring, and facilitate efficient deployments
  • Proactively identify, recommend, and implement enhancements to existing application maintenance practices, operational workflows, and system reliability
  • Serve as a technology subject matter expert for internal and external stakeholders, contributing to technology domain roadmaps and firm-mandated controls and compliance initiatives
  • Appropriately assess and mitigate risk in all technical decisions, ensuring compliance with applicable laws, rules, regulations, and internal policies, while escalating and reporting control issues with transparency
  • Present technical work to senior stakeholders, the team, and other technical teams
  • Mentor and train junior team members, fostering a culture of knowledge sharing and continuous improvement

Requirements:

  • 6+ years of relevant experience in an Engineering role, preferably in Financial Services or a large, complex, and/or global environment
  • Experience managing and troubleshooting Linux Operating Systems (e.g., Red Hat Enterprise Linux (RHEL), CentOS, Ubuntu), including System Administration Tasks like User Management, Service Restarts, and File System Checks
  • Proficiency in Scripting for Automation (e.g., Bash, Python) and with Configuration Management Tools (e.g., Ansible, Puppet, Chef) for system administration and infrastructure automation
  • Experience with container orchestration using Helm and Kubernetes on platforms like AWS EKS, GCP GKE, or OpenShift
  • Working knowledge of Relational Databases (e.g., PostgreSQL), including basic querying
  • Proven track record of maintaining applications and their technology stacks compliant with security and configuration requirements, including successfully passing internal and external security audits by demonstrating secure configuration of applications and infrastructure and ensuring continuous compliance with regulatory standards (e.g., SOX, GDPR) through automated checks and reporting
  • Demonstrated adherence to strict change control procedures, executing all changes through a formalized change management process (e.g., ITSM, ServiceNow) with proper documentation and approvals
  • Experience with Ticketing Systems (e.g., Jira, ServiceNow)
  • Working understanding of Middleware Components (e.g., Nginx, Tomcat or equivalents)
  • Familiarity with Development Concepts (e.g., Git, CI/CD, Pipelines, SDLC)
  • Strong communication skills, both written and verbal, for technical and non-technical audiences
  • Demonstrated analytical and diagnostic skills, with an ability to identify process improvements and best practices
  • Ability to work independently, manage multiple tasks, take ownership of initiatives, and operate effectively in a matrixed environment under pressure and tight deadlines
  • Associate-level certification required (at least one of the following): Kubernetes and Cloud Native Associate (KCNA), Certified Kubernetes Application Developer (CKAD), Certified Kubernetes Administrator (CKA), Kubernetes and Cloud Native Security Associate (KCSA), Red Hat Certified System Administrator or similar certification, AWS Certified Developer, AWS Certified SysOps Administrator, CompTIA Cloud+, Google Associate Cloud Engineer or other GCP certification, HashiCorp Certified: Terraform Associate
  • Bachelor's degree/University degree or equivalent experience

Nice to have:

  • Associate Cybersecurity Certification: GIAC Security Essentials (GSEC), ISC2 Systems Security Certified Practitioner (SSCP), CompTIA CySA+, Microsoft Certified: Security Operations Analyst Associate, Information Protection Administrator Associate

What we offer:

  • medical, dental & vision coverage
  • 401(k)
  • life, accident, and disability insurance
  • wellness programs
  • paid time off packages, including planned time off (vacation), unplanned time off (sick leave), and paid holidays
  • discretionary and formulaic incentive and retention awards

Additional Information:

Job Posted:
October 18, 2025

Expiration:
November 03, 2025

Employment Type:
Full-time

Work Type:
Hybrid work