AI/HPC System Performance Engineer Job at Meta (Austin)

Sr AI/HPC Applications and Performance Engineer

Sr AI/HPC Applications and Performance Engineer role at Hewlett Packard Enterpri...

Location

United States

Salary:

161500.00 - 370500.00 USD / Year

Hewlett Packard Enterprise

Expiration Date

Until further notice

Requirements

15+ years' experience
Deep expertise in AI and HPC applications and performance engineering including simulation, modeling and emulation capabilities
Expertise in large-scale AI and HPC systems
Experience architecting, designing, and developing innovative software system design tools and languages
Excellent analytical and problem-solving skills
Experience in leading overall architecture of software systems for products and solutions
Designing and integrating efficient and scalable software systems running on multiple platform types into overall architecture
Evaluating and selecting forms and processes for software systems testing and methodology
History of innovation with multiple patents or deployed solutions in the field of software design
Excellent written and verbal communication skills

Job Responsibility

Develops organization-wide architectures, strategies, and methodologies for software systems design and development across multiple platforms and organizations
Identifies and makes informed recommendations regarding new technologies, innovations, and outsourced development partner relationships
Reviews, evaluates, and influences designs and project activities for compliance with development guidelines and standards
Provides tangible solutions that improve product quality and mitigate failure risk
Contributes domain expertise, business acumen, and experience to influence decisions of executive business leadership
Brings creativity and innovation to the organization
Provides guidance and mentoring to less-experienced team members
Acts as an internal authority on software systems design
Contributes to the external technical community through whitepapers, patents, or other significant innovations

What we offer

Health & Wellbeing benefits
Personal & Professional Development programs
Unconditional Inclusion environment
Comprehensive benefits suite supporting physical, financial and emotional wellbeing

Fulltime

Software Engineer - AI/HPC Specialist

We are looking for software engineers to help scale and improve the efficiency o...

Location

Norway, Oslo

Salary:

Not provided

Senior Software Engineer

Microsoft Azure High Performance Computing & AI Engineering (HPC & AI Eng) team ...

Location

United States, Multiple Locations

Salary:

119800.00 - 234700.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience
3+ years of experience in operating AI/HPC systems, developing and running AI/HPC applications on clusters, or operating Cloud Infrastructure
2+ years of specialized experience with one of AI/HPC system management OR High-Speed Networks OR HPC Storage OR managing Cloud Infrastructure
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter

Job Responsibility

Collaborates with appropriate stakeholders to determine user requirements for a scenario
Drives identification of dependencies and the development of design documents for a product, application, service, or platform
Creates, implements, optimizes, debugs, refactors, and reuses code to establish and improve performance and maintainability, effectiveness, and return on investment (ROI)
Leverages subject-matter expertise of product features and partners with appropriate stakeholders (e.g., project managers) to drive a workgroup's project plans, release plans, and work items
Acts as a Designated Responsible Individual (DRI) and guides other engineers by developing and following the playbook, working on call to monitor system/product/service for degradation, downtime, or interruptions, alerting stakeholders about status, and initiating actions to restore system/product/service for simple and complex problems when appropriate
Proactively seeks new knowledge and adapts to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale

Fulltime

Principal Software Engineer

Microsoft Azure High Performance Computing & AI Engineering (HPC & AI Eng) team ...

Location

United States, Multiple Locations

Salary:

139900.00 - 274800.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience
5+ years hands-on experience designing and developing high-volume, low-latency pipelines using products such as AzPubSub, Event Hubs, Azure Stream Analytics, Kafka, Grafana, Prometheus or equivalent products
3+ years of experience with one of AI/HPC system management OR High-Speed Networks OR HPC Storage OR managing Cloud Infrastructure
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter

Job Responsibility

Architect, design and develop high-volume, low-latency end-to-end event pipelines that can provide first-to-know insights on events causing job interrupts and affecting job reliability
Conduct analysis of existing event pipelines to evaluate fidelity, granularity and latency of critical events
Contribute to improving key metrics such as Job Mean Time to Interrupt, Nodes in Service, and Mean Time to Resolve on flagship supercomputers by enabling data scientists and domain experts to use the telemetry to identify events and issues at the intersection of datacenter and hardware, develop hypotheses, conduct A/B tests and synthesize results
Partner with cross-organizational teams to evaluate available telemetry and latency, and drive architecture, design, development and deployment of end-to-end solutions to manage core infrastructure including current and next-generation datacenter, IT hardware, power and cooling technologies
Drive engineering and operational excellence based on issues and learnings from strategic customers on their usage scenarios to improve product features and capabilities
Partner with teams on continuous learning and continuous improvement programs by leading the resolution of complex incidents, driving root cause analyses and championing initiatives to minimize future customer impact

Fulltime

AI Research Lab Research Associate

We are currently seeking highly qualified interns to accelerate research towards...

Location

United States, Milpitas

Salary:

43.27 - 93.15 USD / Hour

Hewlett Packard Enterprise

Expiration Date

May 26, 2026

Requirements

Pursuing a PhD degree (or other degree with significant research and innovation experience) in a relevant discipline (e.g. machine learning, computer science, electrical engineering, math, statistics, etc.)
Track record of world-class innovative contributions and ideas in machine learning
Experience with innovative solution development, such as developing proofs-of-concept, first-of-a-kind solutions, and/or technology transfer
Experience in deep learning research
Experience in developing deep learning software with high proficiency in data structures and algorithms
Strong programming skills and experience with Python, C/C++, and preferably Java
Software development experience in Deep Learning, GPU acceleration, and Model Optimization
Experience in Deep Learning and Machine Learning frameworks and models like TensorFlow, PyTorch
Experience in Transformer Neural Network architectures for Generative AI and natural language processing
Experience with Agentic AI and Generative AI workflows - desired

Job Responsibility

Conduct research and come up with solutions with a fast turnaround time
Build the software and applications for Neural Networks and Machine Learning
Work with system programming, Deep Learning frameworks and models, GPU acceleration, Model optimization, real-time streaming data, distributed computing, and deployment
Provide thought leadership and technical influence both internally and externally to HPE
Collaborate with HPE Labs research teams as well as external partners
Work in alignment with HPE's broader innovation community

What we offer

Health & Wellbeing benefits including physical, financial and emotional wellbeing support
Personal and professional development programs
Unconditional inclusion and flexibility to manage work and personal needs

Fulltime

Senior Solutions Architect - Data Infrastructure

NetApp is the intelligent data infrastructure company, turning a world of disrup...

Location

United States

Salary:

205700.00 - 266200.00 USD / Year

NetApp

Expiration Date

Until further notice

Requirements

8+ years in solution architecture, systems engineering, or enterprise pre-sales for storage or data infrastructure platforms, with a strong track record of driving technical wins and customer outcomes
Executive Presence & Communication. Exceptional presentation, storytelling, and whiteboarding skills, with the ability to lead technical workshops and executive briefings
Technical Depth. Expertise across NFS, SMB, iSCSI, FC, NVMe, and S3; experience with virtualization and container platforms (e.g., VMware, Kubernetes); and strong understanding of security, cyber resilience, and AI-adjacent technologies
Hybrid Cloud Knowledge. Practical experience with hyperscaler file and object services, data mobility, and replication strategies
Solution Design Skills. Comfortable producing reference architectures and integration plans spanning compute, networking, and storage

Job Responsibility

Own Technical Win Plans. Partner with enterprise sales and field leadership on priority opportunities. Lead discovery, shape solution strategy, differentiate competitively, and drive the technical win for large, complex deals
Design End-to-End Architectures. Create scalable, resilient, and future-ready architectures across on-prem, cloud-adjacent, and public cloud environments, aligned to customer requirements for performance, availability, security, and total cost of ownership
Act as a Portfolio Evangelist. Represent NetApp’s full data infrastructure vision to customers, partners, and internal stakeholders, connecting portfolio capabilities to real-world customer outcomes
Build Trusted Executive Relationships. Develop and sustain deep relationships with customer technical and business leaders, partners, and alliances. Drive engagement across executive, architecture, and engineering communities
Generate Pipeline with Marketing. Lead webinars, workshops, and Executive Briefing Center sessions; contribute to blogs and video content; present at NetApp INSIGHT; and support regional demand-generation events to open new workloads and buying centers
Mentor and Upskill the Field. Coach Solutions Engineers and partner technical teams on solution domains, reference architectures, and repeatable best practices
Stay Ahead of the Market. Track industry trends, competitive dynamics, and portfolio evolution to provide timely guidance to customers, sales leadership, and field teams

What we offer

Volunteer time off: 40 hours of paid volunteer time each year
Well-being: Employee Assistance Program, fitness, and mental health resources to help employees be their best
Time away: Paid time off for vacation and to recharge
Health Insurance
Life Insurance
Retirement or Pension Plans
Paid Time Off

Fulltime

Third Chef

Do you have a love of all things food and want to share that passion with others...

Location

United Kingdom, Inverness

Salary:

12.98 GBP / Hour

360 Resourcing Solutions

Expiration Date

Until further notice

Requirements

Experience in food preparation
Comfortable working within a budget
Ability to carry out ordering and stock control

Job Responsibility

Support the Head Chef and Sous Chef
Lay down the foundations of how the hospitality team operates
Prepare seasonal meals for residents, including those with specialist dietary requirements
Carry out ordering and stock control
Deliver a balanced, nutritious diet

What we offer

Sociable hours & every second weekend off
Safe working environment with latest PPE and cleaning products
E-learning
12-week induction programme
Ongoing mentorship scheme
Job security during potential future lock down situations
Opportunity to gain national qualifications

Fulltime

Senior Systems Architect

Are you ready to take your career to the next level? Join us and work on state-o...

Location

United States, Huntsville

Salary:

Not provided

Arcfield

Expiration Date

Until further notice

Requirements

BS (10-12 years), MS (8-10 years), or PhD (5-7 years) in Systems Engineering, Engineering, or related technical field
Deep expertise in developing and managing system architectures
Experience applying Product Line Engineering (PLE) principles to system or software development
Demonstrated proficiency with pure::variants for feature modeling, variant management, and configuration
Proficiency in one or more MBSE tools and languages, such as: NoMagic / MagicDraw / Cameo Systems Modeler, SysML, UPDM, IBM Rhapsody
Experience working within Agile/SAFe frameworks
Demonstrated technical writing and presentation skills, with the ability to communicate complex concepts clearly to senior leadership and stakeholders
Domain knowledge is strongly preferred in one or more of the following areas: Military weapon systems, Missile Defense Systems, Battle Command Systems, System of Systems Architecture, Modular Open Systems Approach (MOSA) principles and standards
Must possess and be able to maintain an active Secret Clearance

Job Responsibility

Lead the development and documentation of comprehensive system architectures using MBSE tools and methods
Apply Product Line Engineering (PLE) approaches to manage commonality and variability across a family of systems
Utilize pure::variants to define and manage feature models, variation points, and valid product configurations
Support federated model management, including project usage, model integration, and maintaining a single source of truth across the enterprise
Define and enforce modeling quality through validation rules, metrics, and adherence to style guide standards
Prepare and deliver technical presentations to senior leadership and external stakeholders
Identify, assess, and manage technical risks; develop and execute mitigation strategies
Conduct trade studies to evaluate design alternatives and support informed decision-making
Oversee configuration management to ensure system baselines are controlled and properly documented

What we offer

Collaborate with world-class MBSE and digital engineering experts
Work on groundbreaking projects with real impact
Contribute to the development of state-of-the-art military systems
Enjoy the freedom to innovate and exercise considerable latitude in determining objectives and approaches
Be part of the vibrant Huntsville defense community, with proximity to Redstone Arsenal and critical defense programs

Fulltime

AI/HPC System Performance Engineer

Meta

Location:
United States, Austin

Category:
IT - Software Development

Contract Type:
Not provided

Salary:

Job Description:

Job Responsibility:

Requirements:

Nice to have:

Additional Information:

Job Posted:
January 23, 2026

Similar Jobs for AI/HPC System Performance Engineer