Remote Site Reliability Engineer Jobs

Explore 69 fresh remote Site Reliability Engineer jobs. Whether you're working from home or from anywhere in the world, our curated listings deliver clear insights for your next move.

Latest Site Reliability Engineer Jobs (69)

Sr. Reliability Operations Engineer (Mexico)

about 9 hours ago
Full-time
Mexico
Key requirements: 5 years of experience, Incident response leadership, Grafana/Prometheus, GCP Monitoring, Automation scripting, Runbook development, Distributed systems support, IoT device operations, Operational documentation improvement, Incident management tools, Networking fundamentals
Serve Robotics

Serve Robotics is a B2B hardware robotics company based in the U.S. specializing in self-driving delivery robots for last-mile food delivery, targeting urban markets.

Remote policy: Serve Robotics is open to hiring qualified talent working remotely, with a preference for candidates located in the United States. Team members may be based in various locations, allowing for flexible collaboration.

Reliability Operations Engineer (Mexico)

about 9 hours ago
Full-time
Mexico
Key requirements: 2 years of experience, Grafana, Prometheus, GCP Monitoring, OpenTelemetry, Linux, Incident response, Cloud platforms, Runbooks, Distributed systems, IoT systems, Scripting, Networking fundamentals, Jira
Serve Robotics

Serve Robotics is a B2B hardware robotics company based in the U.S. specializing in self-driving delivery robots for last-mile food delivery, targeting urban markets.

Remote policy: Serve Robotics is open to hiring qualified talent working remotely, with a preference for candidates located in the United States. Team members may be based in various locations, allowing for flexible collaboration.

Senior HPC Cluster Administrator - Deep Learning Frameworks Infrastructure

about 20 hours ago
Full-time
Poland
221,250 PLN to 383,500 PLN per year
Key requirements: 5 years of experience, HPC cluster administration, Deep learning frameworks, Linux systems administration, Slurm, Ansible, High-speed networking, Distributed filesystems, MLOps tooling, NVIDIA GPU infrastructure tools
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Kubernetes Engineer (REMOTE)

1 day ago
Full-time
United States
Key requirements: 5 years of experience, Kubernetes, AWS (EKS), CI/CD pipelines, Terraform, Ansible, Scripting (Bash, Python, PowerShell), Containerization, Hybrid environments
Xcellent Technology Solutions (XTS)

Xcellent Technology Solutions (XTS) is a Warrenton, Virginia-based B2G company specializing in geospatial technology and GEOINT services for the defense and federal government sectors.

Systems Software Engineer, AI Infrastructure

2 days ago
Full-time
India
Key requirements: 5 years of experience, Python, C/C++, SRE principles, AI training, deep learning frameworks, cloud platforms, observability platforms, distributed systems
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Senior Linux System Administrator

4 days ago
Full-time
Asia, Israel
Key requirements: 5 years of experience, NVIDIA hardware experience, Bash or Python coding, Zabbix, Prometheus, or Nagios, Infoblox, Jenkins and Git familiarity, Strong system-level understanding
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Staff Cloud Infrastructure Engineer

4 days ago
Full-time
United States
Key requirements: 7 years of experience, AWS, GCP, Azure, Docker, Kubernetes, Terraform, Pulumi, Distributed systems, High availability, Fault tolerance, Cost efficiency, Technical leadership, Infrastructure strategy, Regulated industries
Valon Tech

Valon is a New York City-based fintech company specializing in AI-native mortgage servicing solutions, operating a proprietary platform for both B2B and B2C markets across the US.

Remote policy: Valon supports remote work and hires primarily within the United States, with team members located across various regions. Candidates are encouraged to apply regardless of their specific location within the US.

Senior Cloud Infrastructure Engineer

4 days ago
Full-time
United States
Key requirements: 3 years of experience, AWS, GCP, Azure, Vitess, ClickHouse, Redis, Docker, Kubernetes, Terraform, Pulumi, Distributed systems, Infrastructure-as-code, Product mindset, High-growth environments, Regulated industries
Valon Tech

Valon is a New York City-based fintech company specializing in AI-native mortgage servicing solutions, operating a proprietary platform for both B2B and B2C markets across the US.

Remote policy: Valon supports remote work and hires primarily within the United States, with team members located across various regions. Candidates are encouraged to apply regardless of their specific location within the US.

Staff Site Reliability Engineer

5 days ago
Full-time
Worldwide
$220,000 to $250,000 per year
Key requirements: 15 years of experience, Kubernetes, Terraform, PostgreSQL, Incident Response, Distributed systems, Cloud infrastructure, Automation scripting, Linux operations, CI/CD systems
YugabyteDB

Yugabyte is a Sunnyvale, CA-based B2B SaaS provider of YugabyteDB, a PostgreSQL-compatible distributed SQL database designed for cloud-native applications across industries like cybersecurity and financial services.

Remote policy: YugabyteDB supports remote work and is likely to hire from various regions, with team members collaborating globally across time zones (UTC).

Staff Software Engineer, System Reliability

5 days ago
Full-time
United States, India
$212,000 to $286,200 per year
Key requirements: Distributed systems, Reliability improvements, Gamedays, Chaos testing, Load testing, Reliability scorecards, Observability standards, Post-incident learning, Mentoring engineers
Temporal Technologies

Temporal Technologies is a Bellevue, WA-based B2B software company specializing in an open-source durable execution system, serving diverse industries including fintech and e-commerce.

Remote policy: Temporal Technologies supports remote work for specific roles, including opportunities in the United States and India, with team members collaborating across various time zones.

Site Reliability Engineer II - Government Cloud

5 days ago
Full-time
United States
$96,000 to $120,000 per year
Key requirements: Google Cloud Platform (GCP), FedRAMP compliance, Kubernetes, CI/CD pipelines, NIST 800-53, Infrastructure as Code, Go (Golang)
Ping Identity

Ping Identity is a Denver-based B2B software company specializing in identity and access management (IAM) solutions for enterprise customers across technology, finance, healthcare, and government sectors.

Senior Site Reliability Engineer - Government Cloud

5 days ago
Full-time
United States
$105,000 to $130,000 per year
Key requirements: GCP, FedRAMP compliance, Kubernetes, Go, CI/CD pipelines, Infrastructure as Code, NIST 800-53 controls, Distributed systems architecture
Ping Identity

Ping Identity is a Denver-based B2B software company specializing in identity and access management (IAM) solutions for enterprise customers across technology, finance, healthcare, and government sectors.

Site Reliability Engineer - South Korea

5 days ago
Full-time
South Korea
Key requirements: 5 years of experience, Go, Distributed systems, Cloud-native technologies, DevOps, High-performance computing, Storage systems, Customer collaboration, Automation frameworks
MinIO

MinIO is a Silicon Valley-based B2B open-source object storage provider specializing in high-performance, S3-compatible solutions for cloud-native workloads, targeting enterprises and developers globally.

Sr. Site Reliability Engineer (Database focused)

5 days ago
Full-time
United States
$136,100 to $174,210 per year
Key requirements: 7 years of experience, MySQL, Snowflake, Amazon Redshift, Database migration, Database performance tuning, High availability, AWS (S3, Lambda, IAM), Data pipeline tools (Apache Airflow, AWS Glue), Monitoring tools (CloudWatch, Prometheus), Scripting (Python, Bash)
iSpot.tv

iSpot.tv is a Bellevue, WA-based B2B company specializing in cross-platform TV and video ad measurement solutions, serving the advertising industry with real-time analytics and insights.

Remote policy: iSpot.tv supports a hybrid and flexible workplace, allowing employees to work remotely or in the office based on their location and role. Team members are located in various regions, including Bellevue, WA; El Segundo, CA; and New York, NY, with remote work options available for those outside these areas.

Database Reliability Engineer - Core Team

5 days ago
Full-time
United Kingdom, Germany, Netherlands
Key requirements: 5 years of experience, ClickHouse, SQL databases, Distributed database internals, Shell scripting, Python, C++ reading, AWS, Azure, Google Cloud Platform, Production debugging, Incident response processes
ClickHouse

ClickHouse is a San Francisco-based B2B open-source column-oriented database system specializing in real-time analytics and SQL querying for enterprises globally.

Remote policy: ClickHouse is a globally distributed and remote-friendly company, operating in 20 countries, allowing for flexible work arrangements across various regions.

Senior Site Reliability Engineer

5 days ago
Full-time
North America, South America
$130,000 to $140,000 per year
Key requirements: 3 years of experience, AWS, Kubernetes, Database optimization, Incident response management, Observability tooling, North or South America based
Circle.so

Circle.so is a global SaaS platform for community building and online education, enabling creators and businesses to engage audiences through discussions, courses, and events.

Remote policy: Circle.so is a fully remote company with team members from over 30 countries, hiring globally and supporting candidates in various regions, including preferences for European and North/South American time zones.

Senior Systems Engineer

6 days ago
Full-time
Germany
Key requirements: 5 years of experience, Ansible, Terraform, CI/CD pipelines, Virtualization technologies, Containerized workloads, Linux systems, Windows systems, Automation expertise
Riot Games

Riot Games is a Los Angeles-based video game developer and publisher specializing in competitive multiplayer esports titles, operating globally with a B2C business model.

NOC Engineer – Tier III

6 days ago
Full-time
United States
Key requirements: 8 years of experience, IP networking (BGP, OSPF, MPLS, VLANs, VPNs, QoS), Fiber transport systems (DWDM, CWDM, Ethernet), Juniper (Junos), Cisco, MikroTik, Calix XGS-PON, Fixed wireless and LTE platforms, Advanced diagnostic and analytical skills, Network monitoring and telemetry tools, Lead technical response during major incidents
Vero Networks

Vero Fiber Networks is a Boulder, Colorado-based telecommunications company specializing in fiber-to-the-premise (FTTP) broadband services for underserved communities across the US, operating in both B2B and B2C markets.

Remote policy: Vero Networks offers remote work opportunities for certain roles, including positions like the Director of Fiber Engineering. However, specific hiring locations and broader remote work policies are not clearly defined.

Senior Systems Performance Engineer

7 days ago
Full-time
United States
$168,000 to $258,750 per year
Key requirements: 5 years of experience, Dynamo, TensorRT, Slurm, BCM, vLLM, SGLang, CUDA, cuBLAS, CUTLASS, Python, x86/Arm server architectures, GPU computing
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Distinguished Engineer, Cloud Site Reliability Engineering

7 days ago
Full-time
United States
$320,000 to $488,750 per year
Key requirements: 18 years of experience, Cloud infrastructure maintenance, AI development, Java, Python, Shell scripting, Distributed systems, REST APIs, SQL/NoSQL databases, Docker, Kubernetes, OpenStack, Machine Learning, Deep Learning, High-performance software design, Scalable software systems
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Senior Forward Deploy Engineer

7 days ago
Full-time
Sweden, Norway, Finland, Denmark
Key requirements: 8 years of experience, Linux, Kubernetes, Networking, GPU-accelerated compute, Edge compute, Modular data centers, AI workloads, Nordic market experience, Technical account management, Customer relationship management
Armada

Armada is a global edge computing startup specializing in IoT and AI solutions for remote areas, primarily serving B2B clients across various industries.

Senior Software Engineer - Site Reliability Engineering

7 days ago
Full-time
United States
$130,000 to $165,000 per year
Key requirements: 5 years of experience, Ruby on Rails, TypeScript, AWS, CI/CD frameworks, Infrastructure automation, System monitoring, Observability platform, Performance optimization
Snapsheet Inc

Snapsheet is a Chicago-based B2B SaaS provider specializing in claims management technology for the property and casualty insurance industry, serving clients across the US, Canada, and Europe.

Remote policy: Snapsheet is a remote-first company, hiring from various regions including the United States, Canada, and Europe, with no office attendance required.

Senior Site Reliability Engineer

8 days ago
Full-time
Singapore
Key requirements: 5 years of experience, Cloud-scale production management, AI model APIs, Fault-tolerant cloud architectures, Automated self-recovery systems, IaC (Terraform/CloudFormation), GPU-based workloads, Vector databases, Bash scripting, Python, Web Technologies (HTTP, REST, SSL)
Razer

Razer Inc. is a dual-headquartered gaming hardware and consumer electronics company specializing in high-performance gaming peripherals and laptops, targeting gamers globally.

Helpdesk and Cloud Operations Engineer

9 days ago
Full-time
United States
$70,000 to $100,000 per year
Key requirements: 5 years of experience, Microsoft 365 administration, Cloud infrastructure management (Azure, AWS), PowerShell automation, Incident and problem management, Cybersecurity practices (NIST, CIS), CI/CD pipelines, Remote work independence
Gorilla Commerce

Gorilla Commerce is a Westport, CT-based B2C e-commerce company specializing in high-quality, affordable home and pet products, leveraging a data-driven approach to product development.

Remote policy: Gorilla Commerce adopts a remote-first approach, allowing flexible work arrangements for team members. While specific hiring locations are not detailed, the company supports collaboration across various regions.

SRE - Infra

11 days ago
Full-time
Worldwide
Key requirements: Kubernetes (EKS), AWS multi-account management, Terraform/Terragrunt automation, Linux systems, Stateful systems support, Performance debugging, End-to-end system ownership
PostHog

PostHog is a San Francisco-based B2B SaaS platform offering an integrated suite of tools for product engineers to build, test, and analyze software products, with a focus on the global market.

Remote policy: PostHog is a fully remote company with a globally distributed team, currently hiring in time zones between GMT-8 and GMT+2.

Incident Manager

12 days ago
Full-time
United States
Key requirements: 5 years of experience, Incident management, SaaS experience, Healthcare experience, ITIL frameworks, PagerDuty, Jira, Datadog, Root Cause Analysis, Cross-functional leadership, Strategic documentation, Process improvement
SmithRx

SmithRx is a health-tech B2B Pharmacy Benefit Manager (PBM) focused on providing transparent pharmacy benefits and cost-saving solutions to employers and health plans across the U.S.

Site Reliability Engineer, Tech Lead

12 days ago
Full-time
Brazil
Key requirements: 5 years of experience, AWS, Kubernetes, Docker, Cloud Computing, SRE/DevOps, UNIX/Linux, Terraform, Ansible, Python, Project management, Stakeholder collaboration
Loadsmart

Loadsmart is a Chicago-based logistics technology company specializing in innovative freight management solutions for the B2B market.

Remote policy: Loadsmart supports remote work and has a globally distributed team, currently hiring for remote positions in Brazil.

Platform Engineer

13 days ago
Full-time
United Kingdom
Key requirements: Kubernetes (EKS), AWS core services, GitOps (ArgoCD), CI/CD tools, Airflow, DevOps/SRE practices, IaC Tooling (Terraform), Linux
Sony Interactive Entertainment

Sony Interactive Entertainment is a San Mateo-based global video game and digital entertainment company, primarily B2C, known for the PlayStation brand and its innovative gaming hardware, software, and network services.

Remote policy: Sony Interactive Entertainment supports flexible remote work arrangements, hiring from various regions globally, including locations such as the USA, UK, and Japan.

Senior Software Engineer, SRE

13 days ago
Full-time
Worldwide
Key requirements: AWS expertise, Terraform, Kubernetes fundamentals, Amazon EKS, Go, Python, CI/CD pipelines, GitHub Actions, ArgoCD, Datadog, SLIs/SLOs
Socure

Socure is a U.S.-based B2B SaaS provider specializing in AI-driven identity verification and fraud prevention solutions for enterprises across financial services, e-commerce, and government sectors.

Remote policy: Socure is a fully remote organization, supporting team members across various locations, with some roles requiring in-person engagement in specific regions such as Washington, D.C.

Site Reliability Engineer - Canada Wide - Remote

13 days ago
Full-time
Canada
Key requirements: AWS, chaos engineering, incident management, SLIs/SLOs/SLAs, observability, Linux Shell, Python, JavaScript, Java, self-starter
Newton

Newton is a Canadian fintech company specializing in cryptocurrency trading, providing tools for financial freedom in the crypto market.

Remote policy: Newton operates with a remote team across Canada, welcoming applicants from this region to join their innovative and collaborative environment.