Site Reliability Engineer II Job at PagerDuty (Toronto)

Site Reliability Engineer II

PagerDuty

Location:
Toronto, Canada

Category:
IT - Software Development

Contract Type:
Not provided

Salary:

115,000 - 165,000 CAD / year

Job Description:

As an intermediate Site Reliability Engineer on the Core Infrastructure team in our Toronto office, you'll help build and operate the foundational infrastructure that powers PagerDuty's real-time digital operations platform. Our systems support millions of events and alerts daily, enabling customers to detect, respond to, and resolve incidents quickly and reliably. You'll work at the intersection of platform evolution and operational excellence, building and evolving foundational network, compute, and ingress infrastructure while scaling and hardening existing systems. Your work will directly impact the reliability, scalability, and security of the services our customers rely on to keep their businesses running as PagerDuty continues to grow across products, regions, and customer use cases.

Job Responsibilities:

Support and improve foundational infrastructure, including networking, compute platforms, Kubernetes clusters, and ingress/traffic management systems
Contribute to the reliability and scalability of PagerDuty's core platform by hardening existing systems and supporting the rollout of new infrastructure capabilities
Participate in agile rituals (standups, planning, retros) and communicate progress/risks early
Stay current on technical trends to suggest innovative tools and approaches to interesting problems
Monitor system health using metrics, logs, and alerts, and participate in 24/7 on-call rotations to help detect, respond to, and resolve incidents

Requirements:

3+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering roles
Hands-on experience operating Linux-based systems in production environments
Working knowledge of networking fundamentals, such as load balancing, DNS, TLS, and ingress traffic flow
Experience with container orchestration (e.g., EKS, Kubernetes)
Experience working on cloud-native infrastructure (e.g., AWS, GCP, Azure), including networking and compute concepts
Proficiency in at least one programming language (e.g., Python, Ruby, Go)
Experience with Infrastructure as Code (e.g., Terraform, CloudFormation)

Nice to have:

Experience with AWS cloud networking concepts such as VPCs, subnets, routing, security groups, and load balancers
Experience operating or contributing to production Kubernetes platforms (e.g., EKS), including cluster upgrades, networking, or ingress configuration
Experience with monitoring, observability, and logging platforms (e.g., DataDog, New Relic, SumoLogic, Splunk, Prometheus, Grafana)
Familiarity with service meshes, ingress controllers, or API gateways (e.g., Envoy, Istio, NGINX)

What we offer:

Competitive salary
Comprehensive benefits package
Flexible work arrangements
Company equity
ESPP (Employee Stock Purchase Program)
Retirement or pension plan
Generous paid vacation time
Paid holidays and sick leave
Dutonian Wellness Days & HibernationDuty - company-wide paid days off in addition to PTO
Paid parental leave: 22 weeks for the pregnant parent, 12 weeks for the non-pregnant parent
Paid volunteer time off: 20 hours per year
Company-wide hack weeks
Mental wellness programs

Additional Information:

Job Posted:
February 14, 2026

Employment Type:

Full-time

Work Type:

Hybrid work
