Sr. AI Site Reliability Engineer

Charles Schwab

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

190000.00 - 270000.00 USD / Year
Job offer has expired

Job Description:

At Schwab, you will build a rewarding career while making a difference in the lives of our millions of clients. Here, innovative thinking meets creative problem solving as we work together to challenge the status quo. We believe in the power of collaboration and value being together in the office, which is why this role is based on-site in our San Francisco office. Joining Schwab means joining a company committed to transforming the financial industry and putting clients at the center of everything we do.

Schwab’s AI Strategy & Transformation team, known as AI.x, is the central hub for Artificial Intelligence at Schwab. We are an integrated product, engineering, strategy and risk team, all based in San Francisco. We help set the enterprise vision for AI, invest in the most promising opportunities, and accelerate delivery across the company. We also build the core platform that powers AI at scale and explore next-generation GenAI efforts that will redefine how we serve our clients.

As a Senior AI Site Reliability Engineer on AI.x, you will play a key role in ensuring our AI solutions are reliable, scalable, and resilient, enabling us to deliver innovative experiences to millions of clients. This role is more than a reliability engineering position. It is an opportunity to join a high-profile team shaping Schwab’s future with AI, to build and maintain solutions that matter to millions of clients, and to grow your career in one of the most exciting areas of technology today.

Job Responsibility:

  • Design, implement, and manage the reliability and operational excellence of GenAI applications and platforms
  • Work closely with architects, engineers, and business leaders to align reliability practices with Schwab’s enterprise strategy
  • Mentor and coach junior engineers
  • Help to build strong operational practices and foster a culture of continuous improvement
  • Lead by example in solving complex reliability challenges
  • Advance SRE standards
  • Drive rapid iteration from concept to production

Requirements:

  • 8+ years of software development or reliability engineering experience
  • 4+ years as a hands-on senior engineer in startups and/or large organizations
  • Bachelor’s degree in Computer Science or related field
  • 5+ years of experience building and operating complex products from scratch and running them in production
  • 3+ years of experience supporting applications that use Artificial Intelligence (AI) models to deliver real business impact
  • 3+ years of experience building and maintaining data pipelines and infrastructure for large datasets
  • 3+ years of experience with containers and cloud-native applications
  • Ability to operationalize containers and cloud-native applications in the public cloud with infrastructure as code
  • Experience implementing monitoring, alerting, and incident response for large-scale distributed systems
  • Proven track record in driving reliability, scalability, and performance improvements for production AI systems

Nice to have:

  • Strong computer science fundamentals and experience working across different parts of the tech stack
  • Experience working with proprietary or open-source LLMs (Gemini, Claude, OpenAI or other models) and supporting LLM-powered applications in production
  • Focus on quality and reliability in everything you do
  • Experience writing and running evaluations to ensure quality and monitor consistency in LLM-generated responses and actions
  • Strong communication skills
  • Experience mentoring junior engineers
  • Demonstrated mindset of continuous learning and improvement
  • Ability to solve complex problems with ambiguous or incomplete data in highly distributed systems
  • Demonstrated business domain knowledge related to all products you have worked on
  • Curiosity about new technologies and processes
  • Experience with Python and front-end development preferred but not required
  • Master’s or advanced degrees in Computer Science or related fields

What we offer:

  • 401(k) with company match and employee stock purchase plan
  • Paid time for vacation and volunteering, and a 28-day sabbatical after every 5 years of service for eligible positions
  • Paid parental leave and family building benefits
  • Tuition reimbursement
  • Health, dental, and vision insurance
  • Bonus or incentive opportunities

Additional Information:

Job Posted:
February 17, 2026

Expiration:
February 24, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Sr. AI Site Reliability Engineer

Sr. Embedded Software Engineer

Location:
Canada, Toronto or Ottawa
Salary:
Not provided
Advanced Technology Search Group
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s in Electrical Engineering, Computer Engineering, or Computer Science
  • Experience with C/C++
  • Experience writing Python scripts
  • Ability to read and understand board schematics and device datasheets
  • Ability to debug embedded software using oscilloscopes and logic analysers
  • Experience with SCM tools (Git or SVN)
  • Strong analytical and problem-solving abilities
  • Strong communication skills
  • Ability to work in a multi-site team environment
Job Responsibility:
  • Design, develop, and optimize embedded software for silicon-based systems throughout the entire lifecycle, from conceptualization to deployment, ensuring seamless integration and optimal performance
  • Collaborate with cross-functional teams including hardware engineers, software developers, and machine learning experts to integrate ML models into embedded systems
  • Architect and implement software frameworks for efficient data processing, device control, and communication protocols
  • Conduct performance analysis, debugging, and optimization of embedded systems for reliability and efficiency
  • Develop software and firmware applications to interact with hardware and third-party interfaces
  • Contribute to the architecture and design of the overall AI solution
  • Develop debug and performance analysis tools for AI solution development
  • Play a role in all the phases of embedded AI software development, from requirement gathering, analysis, design, development, testing and final release to customers
  • Provide clear and timely communication related to status and other key aspects of the project to the leadership team
  • Develop and maintain software documentation, including specifications, design documents, and test plans
Employment Type:
Full-time

Sr. Software Engineer

The Sr. Software Engineer (Site Reliability Engineer) ensures the reliability, s...
Location:
United States
Salary:
Not provided
Bamboo Health
Expiration Date:
Until further notice
Requirements:
  • 4+ years of experience in Site Reliability Engineering, Production Support, or a similar role focused on system reliability and operations
  • Strong experience supporting and troubleshooting production systems, including ownership of support tickets and incident response
  • Proficiency in Ruby and the ability to read, debug, and contribute to application code when needed
  • Experience with monitoring, alerting, and observability tools (metrics, logs, traces, dashboards)
  • Solid understanding of SQL and database fundamentals, including performance and troubleshooting
  • Familiarity with cloud platforms (AWS preferred), including serverless architectures and distributed systems
  • Experience using automation, scripting, or tooling (e.g., Python) to reduce operational effort
  • Comfort using or learning AI-supported tools (e.g., ChatGPT, Copilot, or role-specific tools) to improve daily workflows
  • A forward-thinking, curious mindset with an openness to experimenting with new technologies
  • Strong analytical and problem-solving skills, with sound judgment and creativity in designing solutions
Job Responsibility:
  • Own the end-to-end lifecycle of production issues, including triage, investigation, incident response, postmortems, and follow-up actions
  • Troubleshoot complex, cross-system issues, identify root causes, and implement long-term fixes
  • Design, implement, and maintain monitoring, alerting, and dashboards to proactively detect reliability and performance issues
  • Use AI-assisted tools responsibly to accelerate debugging, log analysis, incident response, and knowledge sharing
  • Partner with Product, Engineering, and Customer Success to resolve customer-impacting issues efficiently and transparently
  • Reduce recurring operational issues through automation, improved tooling, and process improvements
  • Contribute code to improve reliability, observability, scalability, and operational safety
  • Document incidents and standard operating procedures to improve response consistency and team effectiveness
What we offer:
  • Receive competitive compensation including health, dental, vision and other benefits
Employment Type:
Full-time

Sr Platformization/Cloud Automation Engineer

Palo Alto Networks CDSS group is looking for a seasoned platformization and clou...
Location:
United States, Santa Clara
Salary:
104600.00 - 169225.00 USD / Year
Palo Alto Networks Italia
Expiration Date:
Until further notice
Requirements:
  • Bachelor's/Master's degree in Computer Science or a related field
  • 5+ years of industry experience in engineering
  • Fluent scripting skills (preferably Python or Bash) with deep experience in Unix/Linux systems from kernel to shell and beyond
  • 4+ years of working with Microservices architectures on Kubernetes
  • Hands-on experience with container-native tools such as Docker and Helm for managing workloads running in Kubernetes
  • Experience managing AWS and GCP at scale, with knowledge of cloud-neutral connectivity between platforms
  • Experience designing and maintaining API specifications using Swagger/OpenAPI, and working with API frameworks such as Apigee to enable secure, scalable integrations
  • Hands-on experience with infrastructure-as-code and automation tools such as Terraform and Ansible
  • Proficient in CI/CD platforms such as GitLab CI, Jenkins, Argo CD, and CircleCI
  • In-depth knowledge of operating systems (processes, threads, concurrency, etc.)
Job Responsibility:
  • Work with development teams to ensure that applications have scalability and reliability built-in from day one
  • Design, review and enhance software architecture to improve scalability, service reliability, cost, and performance
  • Drive platformization by building standardized, self-service infrastructure platforms that improve developer productivity, scalability, and operational efficiency
  • Deploy automation for provisioning and operating infrastructure at large scale
  • Partner with teams to improve CI/CD processes and technology
  • Mentor members of the staff on large scale cloud deployments
  • Drive the adoption of observability practices and a data-driven mindset
  • Set up processes such as on-call rotations, postmortems, and runbooks to continue supporting the infrastructure owned by the SRE team while finding ways to reduce the time to resolution and improve the reliability of services
  • Support, optimize and deploy mission critical, front-end and back-end production
  • Improve site performance, monitoring, and overall stability of our infrastructure
Employment Type:
Full-time

Sr. Product Manager - Web Personalization (Agentic Web)

Microsoft’s mission is to empower every person and every organization on the pla...
Location:
United States, Redmond
Salary:
106400.00 - 203600.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Business, Marketing, Communications, Finance, Engineering, or related field AND 5+ years of marketing operations, program management, technical product management or related experience OR equivalent experience
  • 3+ years technology/process improvement experience
  • 8+ years in web personalization, SEO/AEO, or digital experience roles, with demonstrated ownership of roadmaps and measurable outcomes
  • Understanding of information architecture, modular content systems, funnels, attribution constraints, and Core Web Vitals
  • 2+ years experience taking a product, feature, or experience to market (e.g., design, addressing product market fit, and launch, internal tool/framework)
  • Experience with content modeling, metadata/taxonomy design, and modular design systems that support dynamic assembly at scale
  • Understanding of privacy, consent, and data minimization principles for personalization, measurement, and AI-assisted experiences
Job Responsibility:
  • Own the Agentic Web strategy and operating model: Define the annual vision and roadmap across AI discovery, personalization, and on-site agent journeys, with clear intake, prioritization, and operating rhythms to sustain optimization at scale
  • Run AI discovery (AEO) end to end: Set standards for answer-ready content and machine interpretability, and continuously monitor, troubleshoot, and evolve performance with SEO, content, and engineering as answer engines change
  • Launch and manage web personalization as a product: Own the full lifecycle from requirements and launch through ongoing roadmap delivery, tuning, and continuous improvement based on business priorities and customer signals
  • Ensure personalization platform reliability: Define SLAs, instrumentation, monitoring, and quality gates (signal freshness, latency, delivery health), and drive incident response and post-incident improvements with engineering and analytics
  • Productize and operate interactive agent experiences: Define agent vision and requirements, then own post-launch iteration of intent capture, journey handoffs, escalation paths, and recommendation quality through measurable learning loops
  • Build and maintain the intent and content model: Steward the unified intent layer, taxonomy, metadata, and module mapping that connects discovery signals, in-session behavior, and agent context so the system stays scalable and governable
  • Lead measurement and experimentation for long-term lift: Own dashboards and performance reviews, run an ongoing test-and-learn cadence to prove incrementality, scale winners, retire underperformers, and align cross-functional teams on actions and outcomes
  • Champion trust, safety, and governance: Embed privacy, accessibility, accuracy, provenance, and brand-voice guardrails into both launches and ongoing operations, ensuring the system remains compliant and trustworthy as content and models evolve
Employment Type:
Full-time

Technical Author-Technical Publications

Position Title: Technical Author-Technical Publications. Reports to: Principal E...
Location:
India, Bangalore
Salary:
Not provided
Randstad
Expiration Date:
April 25, 2026
Requirements:
  • Sound technical expertise in the field of technical publications, preferably heavy engineering or automotive
  • Ability to understand and interpret engineering drawings
  • Good working knowledge of creating and updating Operator & Maintenance Manuals (OMM), Service Manuals, and Instruction & Training Manuals
  • Proficiency in authoring tools such as Adobe InDesign, Adobe FrameMaker, and Arbortext Editor
  • Knowledge of illustration tools such as Arbortext IsoDraw, CorelDRAW, Adobe Illustrator, and Adobe Photoshop
  • Working knowledge of CAD tools such as SolidWorks and NX would be an added advantage
  • Good communication and interpersonal skills
  • Graduate in Mechanical or Automobile Engineering, or equivalent, from a reputed college
  • 3 to 5 years of relevant experience
  • Good at teamwork and coordination with other teams
Job Responsibility:
  • Create and update Service and Repair Manuals (SRM) and Operator & Maintenance Manuals (OMM) using various authoring tools

PCB Technician

Job Description: Expertise in all type of components soldering Ex. Chip Compone...
Location:
India, Pune
Salary:
Not provided
Randstad
Expiration Date:
April 06, 2026
Requirements:
  • Expertise in soldering all types of components, e.g. chip components, ICs, and through-hole components
  • Expertise in component value measurement and component body marking
  • Knowledge of PCB assembly, inspection, and testing
  • Knowledge of pick-and-place, stencil printer, and reflow machines
  • Knowledge of electronic circuits and harness diagrams
  • Knowledge of bring-up testing for assembled PCBs
  • Maintain 5S
  • Knowledge of electronic components and connectors
  • experience 5

TMT Technician

We are seeking a qualified TMT Technician with a B.Sc. degree and 2-6 years of e...
Location:
India, Jamshedpur
Salary:
Not provided
Randstad
Expiration Date:
April 10, 2026
Requirements:
  • B.Sc. degree
  • 2-6 years of experience in healthcare
  • Treadmill Test (TMT)
  • Echocardiography (Echo)
  • Strong communication and interpersonal skills
  • Attention to detail
Job Responsibility:
  • Perform TMT (Treadmill Test) procedures
  • Conduct Echocardiograms (Echo)
  • Ensure patient safety and comfort during tests
  • Maintain and calibrate equipment
  • Record and document test results accurately
  • Collaborate with healthcare professionals
What we offer:
  • Competitive contract compensation
  • Opportunity to work in a reputable healthcare setting
  • Gain valuable experience in cardiac diagnostics

International CE Driver - ADR

We are looking for an international CE driver with ADR certification for the tra...
Location:
Belgium, Fleurus
Salary:
Not provided
Randstad
Expiration Date:
May 13, 2026
Requirements:
  • Valid Category CE driving license with Code 95
  • Medical certificate
  • Driver card
  • Valid ADR certificate
  • Good command of English and/or Dutch, French, or German
Job Responsibility:
  • Transport of radioactive and nuclear materials across Europe
  • Management of logistical documents such as route paperwork and checklists