Remote Site Reliability Engineer Jobs

Explore 69 fresh remote Site Reliability Engineer jobs. Whether you're working from home or from anywhere in the world, our curated listings deliver clear insights for your next move.

Latest Site Reliability Engineer Jobs (69)

Senior Site Reliability Engineer, EU or UK

about 9 hours ago
Full-time
Europe
Key requirements: Linux systems management, AWS or Azure or Google Cloud, Docker, CI/CD pipeline, Prometheus or OpenTelemetry or eBPF, Cloud security and IAM policies, Python, Automation and API coding
Auros

Auros is a Hong Kong-based B2B cryptocurrency market making firm specializing in high-frequency trading and liquidity provision services for the global digital asset market.

Remote policy: Auros Global embraces a hybrid work model, allowing remote and flexible work arrangements while hiring from various regions globally, including the UK and EU.

Senior Platform Engineer

about 9 hours ago
Full-time
Worldwide
$100,000 to $150,000 per year
Key requirements: AWS, Kubernetes, Terraform, CI/CD pipelines, Go, SRE principles, AI-assisted engineering tools, Cloud security, Production observability technologies
Trust Wallet

Trust Wallet is a leading B2C multi-chain, non-custodial cryptocurrency wallet that lets users manage over 10 million digital assets. The company is fully remote and serves a global user base.

Remote policy: Trust Wallet operates as a fully remote company, hiring globally with team members working from various countries. Candidates must have the right to work in their respective locations.

Senior Site Reliability Engineer

about 10 hours ago
Full-time
Worldwide
$113,082 to $175,725 per year
Key requirements: 6 years of experience, Puppet, Kubernetes, Python, Linux troubleshooting, Distributed caching systems, TCP/IP, HTTP, TLS, DNS, Incident response, Automation of tasks, Monitoring tools (Prometheus, Grafana)
Wikimedia Foundation

The Wikimedia Foundation is a San Francisco-based nonprofit organization providing free, multilingual educational content through its wiki-based projects, including Wikipedia, targeting a global audience.

Remote policy: The Wikimedia Foundation is a remote-first organization, hiring globally from various countries including the United States, Canada, and many others across different continents. Team members collaborate across time zones, supporting a diverse and inclusive workforce.

Senior Software Engineer, Infrastructure Automation and Distributed Systems

about 16 hours ago
Full-time
North America
$224,000 to $431,250 per year
Key requirements: 12 years of experience, Infrastructure automation, Distributed systems design, Python, Go, Perl, Ruby, Linux, Networking, Storage, Containers, Multi-cloud infrastructure, Kubernetes, OpenStack, Docker, Slurm, NVIDIA Collective Communication Library (NCCL)
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Staff Software Engineer - SRE (Remote)

1 day ago
Full-time
United States
Key requirements: 8 years of experience, Site Reliability Engineering, DevOps, Kubernetes, AWS, On-call experience, Monitoring and alerting, Collaboration across teams
Rula

Rula Health is a remote-first B2C telehealth SaaS platform based in the U.S., specializing in online therapy and psychiatry services for individuals aged 5 and older, addressing over 90 mental health conditions.

Remote policy: Rula Health is a 100% remote-first company, hiring primarily in the United States, with the exception of Hawaii.

Senior Site Reliability Engineer

1 day ago
Full-time
India
Key requirements: 5 years of experience, AWS, Kubernetes, Terraform, Docker, CI/CD, Bash, Python, Authentication technologies, Monitoring tools, Data pipelines with Databricks, Infrastructure cost management
Teikametrics

Teikametrics is a US-based B2B SaaS platform specializing in AI-driven marketplace optimization for e-commerce brands, helping them maximize profitability on platforms like Amazon and Walmart.

Remote policy: Teikametrics embraces a remote-first culture, hiring talented individuals across 25 states in the USA, as well as in China and India, allowing flexibility for employees to work when they are most productive.

Senior Site Reliability Engineer

1 day ago
Full-time
India
Key requirements: 3 years of experience, AWS, Kubernetes, Terraform, Docker, CI/CD, Bash, Python, Authentication technologies, Monitoring tools, DevOps best practices, On-call support
Teikametrics

Teikametrics is a US-based B2B SaaS platform specializing in AI-driven marketplace optimization for e-commerce brands, helping them maximize profitability on platforms like Amazon and Walmart.

Remote policy: Teikametrics embraces a remote-first culture, hiring talented individuals across 25 states in the USA, as well as in China and India, allowing flexibility for employees to work when they are most productive.

Senior Production Engineer

2 days ago
Full-time
United States
$165,000 to $195,000 per year
Key requirements: 5 years of experience, AWS, Kubernetes (EKS), Terraform, Go, Python, CI/CD systems, Observability tools, SLIs/SLOs implementation, GenAI tools
Legion

Legion is a remote B2B SaaS provider specializing in intelligent automation workforce management solutions for labor-intensive industries, headquartered in the United States.

Sr Cloud Engineer (Contract-to-Hire)

2 days ago
Contract
United States
$140,000 to $165,000 per year
Key requirements: 5 years of experience, Cloud-native solutions, HITRUST and SOC 2 compliance, Infrastructure automation, Containerization (Docker, Kubernetes), DevSecOps principles, CI/CD pipelines (Azure DevOps), Observability platforms (Datadog), Multi-cloud deployment, Stateful database infrastructure, Microservices architecture
Lirio

Lirio is a U.S.-based healthtech B2B SaaS company specializing in behavioral health interventions and personalized care navigation through its AI-driven platform.

Remote policy: Lirio supports remote work with opportunities for hybrid arrangements for candidates located in Tennessee. Currently, hiring is focused on candidates authorized to work in the US.

Site Reliability Engineer

3 days ago
Full-time
United States
$110,000 to $175,000 per year
Key requirements: 8 years of experience, Linux administration, Python, Cloud platforms (OCI, AWS, GCP), Configuration management (Ansible, Puppet), Database administration (MySQL, MongoDB, PostgreSQL), Production support for large-scale environments, Advanced scripting (Perl, Bash), DevOps tools (Docker, Kubernetes, GitLab CI/CD, Jenkins, Terraform), Monitoring best practices (ELK stack, Prometheus, Nagios, Grafana), Technical project leadership
Ooma, Inc.

Ooma, Inc. is a Sunnyvale-based telecommunications company offering cloud-based VoIP and unified communications services as a SaaS provider, targeting both B2B and B2C markets across the US and Canada.

Site Reliability Engineer - India

3 days ago
Full-time
India
Key requirements: Kubernetes, Docker, Java, Python, Continuous Delivery tools, Unix, Infrastructure components, Datadog monitoring, Automation of operational work, Self-healing patterns, Resiliency patterns
Zimperium

Zimperium, Inc. is a Dallas-based B2B cybersecurity company specializing in mobile security solutions for enterprises, offering real-time protection against mobile threats on iOS and Android devices.

Remote policy: Zimperium supports remote work and is currently hiring for various roles, including remote positions in regions such as India. For specific hiring locations and remote work details, please refer to the company's official careers page.

Software Reliability Engineer - LPU Hardware DataFlow

4 days ago
Full-time
Europe
Key requirements: 8 years of experience, Reliability engineering, Hardware testing, Driver testing, Functional programming (Haskell, Nix), System programming (C++, Rust, Java), Linux scripting (Python, Shell), Automated test pipelines, CI/CD experience, GPU reliability testing, Hardware durability testing, Driver development, Kernel debugging, Reliability standards knowledge
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Senior Network Site Reliability Engineer

4 days ago
Full-time
Asia, Israel
Key requirements: 8 years of experience, Network automation, Prometheus, Grafana, Python, Go, TCP/UDP, BGP, VPN, L2 switching, Firewalls, Load Balancers, SNMP, Syslog, Streaming Telemetry, Mellanox/Cumulus Linux, Palo Alto firewalls, Netscalers, F5 load balancers
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Senior DevOps / SRE Engineer

4 days ago
Full-time
United States
$120,000 to $150,000 per year
Key requirements: Kubernetes (EKS), Blockchain reliability, Zero-downtime operations, CI/CD pipeline development, Observability tooling, Real-time systems, Infrastructure as Code (IaC), Incident leadership, Security focus
MLabs

MLabs is a remote fintech company specializing in DeFi solutions, providing a unified API for financial institutions to access on-chain liquidity across major blockchains.

Remote policy: MLabs supports remote work for positions located within the EMEA region, allowing for flexible hours and a remote-first environment.

Senior DevOps/SRE Engineer

5 days ago
Full-time
Worldwide
$100,000 to $150,000 per year
Key requirements: 6 years of experience, AWS services, Kubernetes, Terraform, GitLab CI, VictoriaMetrics, Prometheus, Grafana, Apache Kafka, Bash, Python, Go
Capital.com

Capital.com is a Cyprus-based fintech B2C online trading platform specializing in CFDs and spread betting across over 3,000 global financial markets.

Remote policy: Capital.com offers remote work opportunities, including the flexibility to work from various locations, with team members enjoying benefits such as 30 extra days to work remotely from anywhere in the world.

Senior Site Reliability Engineer - Remote

6 days ago
Full-time
United States
Key requirements: 8 years of experience, Go, Python, AWS, Azure, Google Cloud, Kubernetes, Ansible, Terraform, Distributed databases, ClickHouse, Incident management, Post-mortem analysis, Problem solving, Accountability
ClickHouse

ClickHouse is a San Francisco-based B2B open-source column-oriented database system specializing in real-time analytics and SQL querying for enterprises globally.

Remote policy: ClickHouse is a globally distributed and remote-friendly company, operating in 20 countries, allowing for flexible work arrangements across various regions.

Senior Site Reliability Engineer - Remote

6 days ago
Full-time
Worldwide
$141,000 to $208,000 per year
Key requirements: 8 years of experience, Go, Python, AWS, Azure, Google Cloud Platform, Kubernetes, Docker Swarm, Ansible, Terraform, Puppet, Distributed databases, SQL, ClickHouse
ClickHouse

ClickHouse is a San Francisco-based B2B open-source column-oriented database system specializing in real-time analytics and SQL querying for enterprises globally.

Remote policy: ClickHouse is a globally distributed and remote-friendly company, operating in 20 countries, allowing for flexible work arrangements across various regions.

Senior Cloud Network Engineer (US Remote)

7 days ago
Full-time
United States
$110,000 to $140,000 per year
Key requirements: 7 years of experience, AWS network security, Palo Alto NGFW, Multi-cloud network management, CloudFormation, Terraform, IP overlap and static routing, Public Cloud architecture (Azure/AWS), Network performance testing methodologies, Centralized cloud connectivity design, Automation methodologies
First Advantage

First Advantage is an Atlanta-based HR Tech B2B SaaS provider specializing in global background screening and compliance solutions for various industries.

Remote policy: First Advantage offers flexibility with the possibility to work remotely, supporting a global workforce across various regions. Team members are located in 17 countries, allowing for collaboration across time zones.

Site Reliability Engineer

7 days ago
Full-time
Worldwide
Key requirements: 3 years of experience, SLIs/SLOs definition, Multi-tenant SaaS platforms, Datadog, Grafana, Elastic Stack, Kubernetes, High-availability architectures, Incident response leadership, Automation and process improvement, Cloud experience (Azure preferred), Capacity planning and load testing
HostPapa

HostPapa is a Canada-based web hosting company offering B2B and B2C solutions, including shared, reseller, and VPS hosting services, with a focus on small businesses and a global presence.

Remote policy: HostPapa offers remote work opportunities and hires from various locations, with team members and customers in 39 countries around the globe.

Senior HPC Site Reliability Engineer

8 days ago
Full-time
Asia, Israel
Key requirements: 8 years of experience, HPC infrastructure design, Large scale compute architecture, Job schedulers (LSF, SGE, SLURM), Cluster configuration management (Ansible, Puppet), Public cloud services (AWS, Azure, Google Cloud), Script-writing (Python, Bash, Perl), PaaS microservices (Docker, Kubernetes), Distributed storage solutions, Linux performance optimization, Kubernetes deployment management
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Systems Engineer

8 days ago
Full-time
United States
$78,900 to $116,760 per year
Key requirements: 5 years of experience, Windows server administration, Linux server administration, SRE methodologies, DevOps methodology, VMware/Nutanix administration, Incident management, Automation focus, Flexible working hours willingness
Rockstar Games

Rockstar Games is a New York City-based video game publisher specializing in action-adventure and racing games, operating primarily as a B2C company with a global reach.

Senior Infrastructure Automation Engineer - SCM and HPC AI

8 days ago
Full-time
India
Key requirements: 4 years of experience, Bare-metal provisioning automation, Distributed systems architecture, CI/CD systems configuration, Go, Python, Ansible, Linux system administration
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Senior Site Reliability Engineer

8 days ago
Full-time
Worldwide
$120,000 to $180,000 per year
Key requirements: 8 years of experience, Kubernetes, Multi-cloud experience, Terraform, CI/CD processes, Stateful software on Kubernetes, Kubernetes best practices, Debugging Kubernetes clusters, Scripting (bash, Python, Go)
Diagrid

Diagrid is a technology company providing a B2B SaaS platform for workflow orchestration and AI agent development, serving industries including financial services and healthcare.

Remote policy: Diagrid operates with a fully remote and flexible work environment, supporting collaboration across various regions, including the United States and Europe.

HPC Operations Engineer

8 days ago
Full-time
United States
$124,000 to $241,500 per year
Key requirements: 2 years of experience, Linux systems administration, Workload schedulers (LSF, Slurm), HPC support, Scripting (Bash, Python), Network computing (NFS, LDAP), Technical support experience
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Senior HPC and LSF Operations Engineer

8 days ago
Full-time
United States
$152,000 to $287,500 per year
Key requirements: 5 years of experience, HPC scheduling systems, LSF, Slurm, Linux systems administration, Reliability engineering practices, Observability systems, Container technologies
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Cloud Operations Engineer

8 days ago
Full-time
United States
$110,000 to $125,000 per year
Key requirements: 3 years of experience, CloudWatch, AWS, CI/CD pipelines, Incident triage, SSL/TLS management, Automation/orchestration tools, Cross-functional collaboration, Exceptional communication skills
Lumin Digital

Lumin Digital is a San Ramon, California-based B2B cloud-native digital banking platform provider, specializing in innovative solutions for financial institutions across the United States.

Remote policy: Lumin Digital operates a remote-first work environment with a hybrid workspace model, supporting remote work from various locations, including the United States. Team members gather twice a year for in-person collaboration.

Site Reliability Engineer - AI & ML Infrastructure (Kubernetes, AWS & Terraform)

9 days ago
Full-time
United States
Key requirements: 5 years of experience, Kubernetes, Terraform, Slurm, High-performance computing, Bare metal infrastructure management, Python, AWS
Deepgram

Deepgram is a San Francisco-based B2B SaaS provider specializing in Voice AI solutions, offering speech-to-text and text-to-speech APIs for developers and enterprises in various sectors.

Site Reliability Engineer - AI & ML Infrastructure (Kubernetes, AWS & Terraform)

9 days ago
Full-time
United States
Key requirements: 5 years of experience, Kubernetes, Terraform, AWS, Slurm, Bare metal infrastructure, High-performance computing, Scripting (Python, Go, Bash)
Deepgram

Deepgram is a San Francisco-based B2B AI company specializing in speech-to-text (STT) and text-to-speech (TTS) technologies, providing real-time APIs for developers in the Voice AI industry.

Sr. Site Reliability Engineer

9 days ago
Full-time
United States
$140,000 to $180,000 per year
Key requirements: 10 years of experience, Infrastructure assessments, Hybrid hosting environments, Infrastructure stabilization, Performance engineering, System resilience strategies, Citrix dependency analysis, Operational continuity strategies, Technical documentation skills, Stakeholder communication
Element Solutions

Element Solutions Inc is a US-based specialty chemicals company specializing in manufacturing chemical products for electronics and industrial applications, operating primarily in a B2B model with a global presence.

Remote policy: Element Solutions Inc is a remote-first company, primarily hiring candidates who reside in the Continental US, with team members collaborating across various time zones.

IT Operations Engineer I, Remote

9 days ago
Full-time
United States
Key requirements: 6 years of experience, PowerShell, Python, Azure, ITGC, SOX, SOC 2 Type II, Identity migrations, VDI management, Incident management, Security monitoring
Aledade

Aledade is an Ashburn, VA-based B2B healthcare company specializing in helping independent primary care practices and health centers build and manage Accountable Care Organizations (ACOs) to enhance value-based care.

Remote policy: Aledade supports flexible work schedules and remote work for many roles, operating across various states in the United States, with team members collaborating nationwide.