Remote Site Reliability Engineer Jobs

Key requirements: 5 years of experience, Network architecture, Observability strategy, Terraform, Datadog, SNMPv2 and SNMPv3, GitHub for CI/CD, Cloud platforms (AWS, Azure, GCP), Scripting/automation (Python, PowerShell)

EverOps

EverOps is a San Francisco-based B2B consulting firm specializing in DevOps and IT services, focusing on cloud operations for innovative companies across various industries.

Remote policy: EverOps is a fully remote company, having operated remotely since its inception, and hires from various locations globally.

Observability Implementation Consultant

7 days ago

Contract

Key requirements: Datadog, Legacy monitoring migration, Observability concepts, Cloud platforms (AWS, Azure, GCP), Infrastructure as Code (Terraform), AI-driven alerting (Watchdog), Serverless observability, Strong troubleshooting, Excellent communication

Arctiq

Arctiq is a Toronto-based B2B DevOps and cloud solution integrator specializing in professional IT services and managed services for enterprise organizations across North America.

Site Reliability Specialist

8 days ago

Key requirements: Terraform, Kubernetes, AWS, GitLab CI/CD, Python, Bash, Go, Prometheus, Grafana, Ansible, Chef, AI-assisted engineering tools

Canada

Ubisoft

Ubisoft is a French video game developer and publisher headquartered in Saint-Mandé, specializing in AAA titles for global consumers in the B2C gaming industry.

Customer Reliability Engineer

8 days ago

Key requirements: 7 years of experience, AWS Cloud (EC2, RDS, S3, VPC, IAM), Network and Security troubleshooting, Scripting (Ruby, Python, Bash, PowerShell), Infrastructure as Code (Terraform, CDK, CloudFormation), Production On-call experience, Windows/Linux server administration, Monitoring platforms (CloudWatch, Grafana, Datadog), Experience with AI-driven development environments

OpsGuru, a Carbon60 Company

OpsGuru is a Vancouver-based B2B cloud consulting firm specializing in AWS solutions, data modernization, and generative AI, serving SMBs and enterprises globally.

Remote policy: OpsGuru embraces a remote-first work environment, offering flexibility in work hours and location. While specific hiring regions are not detailed, the company supports a global team, welcoming applicants from various locations.

Senior Cloud Infrastructure Engineer

9 days ago

$150,000 to $175,000 per year

Key requirements: 5 years of experience, GCP, Kubernetes, Terraform, Networking, Production-scale infrastructure, Go, Python, Multi-region infrastructure design, Infrastructure security

ujet.cx

UJET is a B2B SaaS provider specializing in AI-powered Cloud Contact Center solutions that enhance customer experience for enterprises globally.

Senior Site Reliability Engineer (Arlington, VA) - Relocation Provided

9 days ago

Key requirements: 5 years of experience, Active Top Secret clearance, Terraform, Ansible, Kubernetes, CI/CD pipelines, Python, AWS, Grafana, Incident response expertise, SLIs/SLOs design, DoD compliance frameworks

Onebrief

Onebrief is a Honolulu-based B2G SaaS platform specializing in AI-powered workflow software for military planning and command operations, targeting defense organizations globally.

Remote policy: Onebrief operates as an all-remote company, hiring from various regions, with team members collaborating globally, including at military commands.

Build and DevOps Engineer for Compilers

9 days ago

$140,000 to $270,250 per year

Key requirements: 3 years of experience, Docker, Jenkins, GitLab CI/CD, Ansible, Kubernetes, Python, C/C++, Compiler Domain Expertise, Static and dynamic code analysis, Distributed build systems

NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Senior Platform Engineer

Key requirements: 5 years of experience, Building Backstage, Implementing Argo Workflows, GitOps with Argo CD, Kubernetes design and operation, Building tests with k6, Policy enforcement with Kyverno, AWS, Google Cloud, or Azure experience, Infrastructure as Code (Terraform), Helm and/or Kustomize, Linux expertise

EverOps

EverOps is a San Francisco-based B2B consulting firm specializing in DevOps and IT services, focusing on cloud operations for innovative companies across various industries.

Remote policy: EverOps is a fully remote company, having operated remotely since its inception, and hires from various locations globally.

Observability Engineer

$100,000 to $150,000 per year

Key requirements: 5 years of experience, Prometheus, Grafana, Datadog, OpenTelemetry, Distributed tracing, High-cardinality metrics, SLOs, CI/CD integration, Linux internals, Container platforms

Bright Vision Technologies

Bright Vision Technologies is a New Jersey-based IT staffing firm specializing in placing technical professionals in software development roles across the U.S. government and enterprise sectors.

OpenShift Platform Engineer

$100,000 to $150,000 per year

Key requirements: 5 years of experience, OpenShift, Kubernetes internals, Linux administration, Infrastructure-as-code (Ansible, Terraform, Helm), CI/CD pipelines (Tekton, Jenkins, Argo CD), Scripting (Bash, Python, Go), Monitoring and logging (Prometheus, Grafana, EFK, Tempo), Container image security, Disaster recovery strategies, Multi-tenant OpenShift platforms

Bright Vision Technologies

Bright Vision Technologies is a New Jersey-based IT staffing firm specializing in placing technical professionals in software development roles across the U.S. government and enterprise sectors.

Senior Infrastructure Engineer

$220,000 to $300,000 per year

Key requirements: 8 years of experience, Cloud infrastructure design, AWS expertise, Terraform, Docker, Kubernetes, Observability with Datadog, Security hardening, Blockchain experience, Node.js performance optimization

MLabs

MLabs is a remote fintech company specializing in DeFi solutions, providing a unified API for financial institutions to access on-chain liquidity across major blockchains.

Remote policy: MLabs supports remote work for positions located within the EMEA region, allowing for flexible hours and a remote-first environment.

NOC Engineer – Tier III

Key requirements: 8 years of experience, IP networking (BGP, OSPF, MPLS), Fiber transport systems (DWDM, CWDM), Juniper (Junos), Cisco, MikroTik, Calix XGS-PON, Fixed wireless and LTE platforms, Advanced diagnostic and analytical skills, Network monitoring tools, Lead technical response during incidents

Vero Networks

Vero Fiber Networks is a Boulder, Colorado-based telecommunications company specializing in fiber-to-the-premise (FTTP) broadband services for underserved communities across the US, operating in both B2B and B2C markets.

Remote policy: Vero Networks offers remote work opportunities for certain roles, including positions like the Director of Fiber Engineering. However, specific hiring locations and broader remote work policies are not clearly defined.

AI Infrastructure & Platform Operations Engineer (remote in the EU)

12 days ago

$60,000 to $67,000 per year

Key requirements: 3 years of experience, NVIDIA GPU infrastructure, Kubernetes platform operations, AI infrastructure or HPC environments, InfiniBand networking, Infrastructure automation technologies, Observability platforms, Large-scale distributed systems, Site Reliability Engineering (SRE)

Mirantis

Mirantis is a Campbell, California-based B2B company specializing in open-source cloud computing and Kubernetes-native AI infrastructure, serving diverse sectors including automotive, healthcare, and financial services.

Remote policy: Mirantis supports flexible remote work, primarily within the EU, and offers the option to work from their Helsinki hub, fostering a remote-first culture.

Senior AI Infrastructure & Platform Operations Engineer (remote in the EU)

12 days ago

$90,000 to $105,000 per year

Key requirements: 7 years of experience, NVIDIA GPU infrastructure, Kubernetes in production, Technical leadership, Root cause analysis, Infrastructure automation, Observability platforms, AI infrastructure environments, High-performance networking

Mirantis

Remote policy: Mirantis supports flexible remote work, primarily within the EU, and offers the option to work from their Helsinki hub, fostering a remote-first culture.

Sr. Site Reliability Engineer

18 days ago

Key requirements: 5 years of experience, Dynatrace, AI-assisted development tools, PowerShell, Python, SQL/T-SQL, AIOps, Automation of incident response, Self-healing workflows, Observability strategies, Container orchestration, 24/7 operational flexibility

FreedomPay

DevOps Generalist - 100% Remote (m/f/d)

24 days ago

€60,000 to €80,000 per year

Key requirements: 3 years of experience, Kubernetes, CI/CD, Cloud Services, Automation, Problem-Solving, Communication Mastery, Collaboration Wizardry, Self-organization

Digistore24

Digistore24 is a Germany-based e-commerce platform specializing in digital product sales and affiliate marketing, operating primarily in the B2B and B2C sectors.

Remote policy: Digistore24 embraces a flexible remote work culture, allowing team members to work from home or coworking spaces, as long as they can ensure reliable internet access. The company operates globally, welcoming talent from various regions.

Systems Engineer (Esports)

24 days ago

Key requirements: 5 years of experience, Ansible, Terraform, CI/CD pipelines, Virtualization technologies, Containerized workloads, Linux, Windows, Automation, Cloud platforms

Riot Games

Riot Games is a Los Angeles-based video game developer and publisher specializing in competitive multiplayer esports titles, operating globally with a B2C business model.

Senior Site Reliability Engineer

25 days ago

Key requirements: 5 years of experience, Terraform, Pulumi, Kubernetes (EKS, GKE), Prometheus, Grafana, Datadog, Go, Python, TypeScript, Incident Management, Service mesh (Istio, Cilium)

2K

2K is a Novato, California-based video game publisher specializing in a diverse range of genres including sports, action, and role-playing, primarily operating in the B2C market.

Senior Reliability Engineer, DGX Cloud

26 days ago

$168,000 to $333,500 per year

Key requirements: 10 years of experience, Large-scale production systems, SLO program establishment, Chaos engineering, Failure injection, Go, Python, GPU infrastructure, HPC infrastructure, Observability tools

NVIDIA

Site Reliability Engineer – AI Applications (M/W/X)

Key requirements: 5 years of experience, SRE experience, AI infrastructure operation, Python, Rust, Cloud platforms, Docker, Kubernetes, CI/CD with GitLab, Distributed systems, Observability practices, Incident management, Capacity planning

France

Ubisoft

Ubisoft is a French video game developer and publisher headquartered in Saint-Mandé, specializing in AAA titles for global consumers in the B2C gaming industry.

Senior AI Infrastructure & Platform Operations Engineer (remote in the EU)

$90,000 to $130,000 per year

Key requirements: 7 years of experience, NVIDIA GPU infrastructure, Kubernetes in production, Technical leadership, Infrastructure automation, Observability platforms, AI infrastructure environments, Root cause analysis, Performance analysis, High-performance networking, Complex incident management

Mirantis

Remote policy: Mirantis supports flexible remote work, primarily within the EU, and offers the option to work from their Helsinki hub, fostering a remote-first culture.

AI Infrastructure & Platform Operations Engineer (remote in the EU)

Remote policy: Mirantis supports flexible remote work, primarily within the EU, and offers the option to work from their Helsinki hub, fostering a remote-first culture.

Principal Site Reliability Engineer

$160,000 to $185,000 per year

Key requirements: 8 years of experience, Kubernetes, Microsoft Azure, Python, Automation tooling, Distributed systems, Observability platforms, Incident response, SaaS platform compliance, Cloud-native services, Root Cause Analysis

Accela

Accela is a San Ramon-based govtech B2G SaaS provider specializing in cloud-based Civic Platform software for state and local governments, enhancing efficiency and citizen engagement.

Remote policy: Accela supports remote work opportunities, with roles such as the Manager of Technical Support being available remotely. However, specific hiring locations are not defined, and candidates are encouraged to inquire about remote work options during the application process.

Senior SRE Engineer (Observability Focus)