Remote Site Reliability Engineer Jobs

Explore 67 fresh remote Site Reliability Engineer jobs. Whether you're working from home or from anywhere in the world, our curated listings deliver clear insights for your next move.

Latest Site Reliability Engineer Jobs (67)

Senior Site Reliability Engineer

about 9 hours ago
Full-time
Worldwide
$113,082 to $175,725 per year
Key requirements: 6 years of experience, Puppet, Kubernetes, Python, Linux system-level troubleshooting, Distributed caching systems, Incident response, Automation of tasks, Open source contribution, Linux kernel tuning, Monitoring infrastructure (Prometheus, Grafana)
Wikimedia Foundation

The Wikimedia Foundation is a San Francisco-based nonprofit organization providing free, multilingual educational content through its wiki-based projects, including Wikipedia, targeting a global audience.

Remote policy: The Wikimedia Foundation is a remote-first organization, hiring globally from various countries including the United States, Canada, and many others across different continents. Team members collaborate across time zones, supporting a diverse and inclusive workforce.

Senior Site Reliability Engineer, Wikimedia Enterprise

about 9 hours ago
Full-time
Worldwide
$116,633 to $181,243 per year
Key requirements: SRE best practices, Kubernetes, CI/CD pipelines, GitOps workflows, Infrastructure as Code, Cloud platforms (AWS, GCP), Observability (Prometheus, OpenTelemetry), Incident response, Capacity planning, Automation tools (Terraform, Ansible)
Wikimedia Foundation

The Wikimedia Foundation is a San Francisco-based nonprofit organization providing free, multilingual educational content through its wiki-based projects, including Wikipedia, targeting a global audience.

Remote policy: The Wikimedia Foundation is a remote-first organization, hiring globally from various countries including the United States, Canada, and many others across different continents. Team members collaborate across time zones, supporting a diverse and inclusive workforce.

Senior Site Reliability Engineer, GeForce NOW

2 days ago
Full-time
United States
$168,000 to $270,250 per year
Key requirements: 8 years of experience, Kubernetes, Automation, Multi-region cloud deployments, Datadog, Prometheus, Deployment pipelines, Go, Python, Bash scripting, Anomaly detection tools, AI usage in SRE
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Kafka Platform Engineer

2 days ago
Full-time
United States
$100,000 to $150,000 per year
Key requirements: 5 years of experience, Kafka internals, Kafka security, Kafka Connect, Schema Registry, Kafka Streams, HA/DR strategies, Python, Terraform, Observability tooling
Bright Vision Technologies

Bright Vision Technologies is a New Jersey-based IT staffing firm specializing in placing technical professionals in software development roles across the U.S. government and enterprise sectors.

Observability Engineer (Prometheus / Grafana / Datadog)

2 days ago
Full-time
United States
$100,000 to $150,000 per year
Key requirements: 5 years of experience, Prometheus, Grafana, Datadog, OpenTelemetry, SRE principles, High-cardinality metrics, Distributed tracing, CI/CD integration, Linux internals, Container platforms
Bright Vision Technologies

Bright Vision Technologies is a New Jersey-based IT staffing firm specializing in placing technical professionals in software development roles across the U.S. government and enterprise sectors.

Site Reliability Engineer (SRE)

2 days ago
Full-time
United States
$100,000 to $150,000 per year
Key requirements: 5 years of experience, Python, Go, Kubernetes, Prometheus, Grafana, CI/CD pipelines, Distributed systems, Incident response, Chaos engineering, Cloud platforms
Bright Vision Technologies

Bright Vision Technologies is a New Jersey-based IT staffing firm specializing in placing technical professionals in software development roles across the U.S. government and enterprise sectors.

DevOps / Infrastructure Engineer

2 days ago
Full-time
North America
$100,000 to $130,000 per year
Key requirements: Tailscale, AWS, Container orchestration, Infrastructure-as-Code, CI/CD, Blockchain familiarity, Financial SRE background
MLabs

MLabs is a remote fintech company specializing in DeFi solutions, providing a unified API for financial institutions to access on-chain liquidity across major blockchains.

Remote policy: MLabs supports remote work for positions located within the EMEA region, allowing for flexible hours and a remote-first environment.

Nutanix Engineer

2 days ago
Full-time
United States
Key requirements: 3 years of experience, Nutanix AOS, Nutanix AHV, Prism management tools, HCI troubleshooting, Client-facing experience, Infrastructure project support, Backup and DR configurations, Managed services experience
MetroSys

MetroSys is a San Diego-based B2B technology solutions and staffing company specializing in IT consulting, backup and recovery solutions, and cloud services for enterprise clients.

Remote policy: MetroSys offers remote work opportunities primarily for candidates located in the United States, with a focus on hiring across the Americas.

Senior Release Engineer

3 days ago
Full-time
Worldwide
Key requirements: 4 years of experience, Kubernetes, Helm, CI/CD pipelines, Infrastructure as Code, GitOps tooling, Security tooling integration, Linux systems, Networking fundamentals, Distributed systems
Onebrief

Onebrief is a Honolulu-based B2G SaaS platform specializing in AI-powered workflow software for military planning and command operations, targeting defense organizations globally.

Remote policy: Onebrief operates as an all-remote company, hiring from various regions, with team members collaborating globally, including at military commands.

Agora - Senior Infrastructure Engineer

4 days ago
Full-time
North America, South America
Key requirements: 5 years of experience, TypeScript, Kubernetes, AWS architecture, Infrastructure-as-code, Distributed systems fundamentals, Reusable systems and abstractions, SLIs/SLOs and incident tooling, Observability practices, GitOps patterns, Internal Development Platforms
Silver.dev

Silver.dev is a Buenos Aires-based B2B talent recruitment platform specializing in connecting venture-backed US startups with vetted software engineers from Latin America.

Remote policy: Silver.dev supports remote work for most engineering roles, primarily hiring from Latin America, with a focus on Argentina and Uruguay. Team members collaborate across Americas time zones.

Agora - Senior Infrastructure Engineer

4 days ago
Full-time
North America, South America
Key requirements: 5 years of experience, TypeScript, AWS architecture, Kubernetes, Infrastructure-as-code, Distributed systems, Reusable systems, SLIs/SLOs, Observability practices, GitOps, Developer Productivity, Internal Development Platforms
Silver.dev

Silver.dev is a Buenos Aires-based B2B talent recruitment platform specializing in connecting venture-backed US startups with vetted software engineers from Latin America.

Remote policy: Silver.dev supports remote work for most engineering roles, primarily hiring from Latin America, with a focus on Argentina and Uruguay. Team members collaborate across Americas time zones.

Site Reliability Engineer

5 days ago
Full-time
Worldwide
Key requirements: Linux/Unix, Cloud providers (AWS, Google Cloud, Azure), Infrastructure provisioning (Terraform, CloudFormation, Ansible), Containerization (Docker), Orchestration (Kubernetes), Monitoring tools (Prometheus, Grafana, Datadog), CI/CD pipelines (Jenkins, GitLab CI, CircleCI), Incident management practices, Scripting (Bash, Perl)
OXIO

OXIO is a North America-based B2B Telecom-as-a-Service (TaaS) platform enabling businesses to build and manage customizable mobile networks through a cloud-based solution.

Remote policy: OXIO supports flexible work arrangements and hires from various regions, with team members located in cities such as New York, Mexico City, and Montreal. Candidates are encouraged to apply regardless of their location.

Kafka Platform Engineer

5 days ago
Full-time
United States
$100,000 to $150,000 per year
Key requirements: 5 years of experience, Kafka internals, Kafka security, Kafka Connect, Schema Registry, Kafka Streams, HA/DR strategies, Python, Terraform, Observability tooling, Confluent platform
Bright Vision Technologies

Bright Vision Technologies is a New Jersey-based IT staffing firm specializing in placing technical professionals in software development roles across the U.S. government and enterprise sectors.

Observability Engineer (Prometheus / Grafana / Datadog)

5 days ago
Full-time
United States
$100,000 to $150,000 per year
Key requirements: 5 years of experience, Prometheus, Grafana, Datadog, OpenTelemetry, SRE principles, High-cardinality metrics, Distributed tracing, CI/CD integration, Linux internals, Container platforms
Bright Vision Technologies

Bright Vision Technologies is a New Jersey-based IT staffing firm specializing in placing technical professionals in software development roles across the U.S. government and enterprise sectors.

Site Reliability Engineer

5 days ago
Full-time
Malaysia
$80,000 to $120,000 per year
Key requirements: 3 years of experience, SLIs/SLOs definition, Multi-tenant SaaS platforms, Datadog, Grafana, Kubernetes, High-availability architectures, Incident response leadership, Automation and process improvement, Cloud experience (Azure preferred), Capacity planning, Resilience testing
HostPapa

HostPapa is a Canada-based web hosting company offering B2B and B2C solutions, including shared, reseller, and VPS hosting services, with a focus on small businesses and a global presence.

Remote policy: HostPapa offers remote work opportunities and hires from various locations, with team members and customers in 39 countries around the globe.

Senior AI Tools Engineer, SRE Operations - GeForce NOW

7 days ago
Full-time
North America
$144,000 to $230,000 per year
Key requirements: 5 years of experience, Python, AI tools development, LLM-based systems, Kubernetes, AWS, Data pipeline management, Automation, SRE principles, Monitoring tools (Grafana)
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Migration Support Engineer (Romania)

7 days ago
Contract
Romania
$30,000 to $40,000 per year
Key requirements: OpenStack, Kubernetes, VM migration, Customer-facing support, Problem-solving, Automation, Scripting (Python, Bash, Go), Incident management (Sev1/Sev2)
Platform9

Platform9 is a B2B SaaS provider specializing in enterprise private cloud management (headquarters location not disclosed), targeting global enterprises with its Private Cloud Director solution.

Site Reliability Engineer

8 days ago
Full-time
China
Key requirements: 3 years of experience, AWS, Terraform, CloudFormation, Python, Incident management, Monitoring systems, Zero Downtime Deployments, Multi-region AWS architecture, AI-driven analytics
Razer

Razer Inc. is a dual-headquartered gaming hardware and consumer electronics company specializing in high-performance gaming peripherals and laptops, targeting gamers globally.

Senior Site Reliability Engineering - Storage

9 days ago
Full-time
India
Key requirements: 12 years of experience, NAS, SAN, Object Storage, SRE concepts, Infrastructure as Code, Terraform, Ansible, Docker, Kubernetes, Python, Go, Shell, AI/ML storage, data-driven reliability improvements, technical leadership
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Site Reliability Engineer

10 days ago
Full-time
United States
$150,000 to $200,000 per year
Key requirements: 5 years of experience, SRE, Linux, Container management, Distributed systems, SLIs/SLOs, Incident response leadership, Scripting, Monitoring systems, GPU infrastructure, High-growth reliability improvement
RunPod

RunPod is a Mt. Laurel, New Jersey-based B2B cloud computing platform specializing in GPU infrastructure for AI and machine learning applications, serving a global market of developers and enterprises.

Remote policy: RunPod operates as a remote-first organization, welcoming candidates from various locations, primarily focusing on those eligible to work in the United States.

Staff Site Reliability Engineer (AI Enablement)

11 days ago
Full-time
United States
$150,000 to $230,000 per year
Key requirements: 8 years of experience, AI-assisted development tools, Building AI/LLM-powered developer tools, Driving org-wide tooling adoption, Prompt engineering techniques, Go or Python proficiency, Operating production environments in AWS, Strong experience with Terraform, Container orchestration (ECS/Kubernetes), Observability practices, AI Enablement Strategy
Coalition, Inc.

Staff Site Reliability Engineer (AI Enablement)

11 days ago
Full-time
Canada
$153,400 to $230,400 per year
Key requirements: 8 years of experience, AI-assisted development tools, Building AI/LLM-powered developer tools, AI Enablement Strategy, Proficiency in prompt engineering, Go or Python proficiency, AWS production environment experience, Terraform expertise, Container orchestration (ECS/Kubernetes), Observability practices, Driving org-wide tooling adoption
Coalition, Inc.

Site Reliability Engineer

11 days ago
Full-time
United States, Canada, United Kingdom, Brazil, Japan, Nigeria
$100,000 to $150,000 per year
Key requirements: 4 years of experience, PostgreSQL, Kubernetes, GitOps, Cloud networking, Incident response, Go, Python, Observability stack
Alpaca

Alpaca is a US-based fintech company providing self-clearing brokerage infrastructure and APIs for stocks, ETFs, options, and crypto, serving financial institutions globally.

Remote policy: Alpaca embraces a remote-first culture, hiring globally from various regions including the USA, Canada, Japan, Hungary, Nigeria, Brazil, and the UK, allowing team members to work from their preferred locations.

Senior Infrastructure Engineer

11 days ago
Full-time
United States
$105,000 to $135,000 per year
Key requirements: 4 years of experience, PowerShell, Python, Cloud management tools, ITIL 4 Foundation, AI platforms, SharePoint Online, Vulnerability remediation, Disaster Recovery (DR), Business Continuity Plans (BCP), Automated deployment frameworks, Global infrastructure scaling
Omnidian

Omnidian is a Seattle-based B2B tech-enabled service company specializing in solar energy monitoring and maintenance, serving residential and commercial markets globally.

Remote policy: Omnidian supports remote work for most roles, allowing flexibility for employees to work from various locations, including regions such as Seattle, WA, and Australia.

Senior AIOps Engineer

12 days ago
Full-time
Malaysia
Key requirements: 5 years of experience, AIOps automation, AI/ML integration, Observability frameworks, Predictive modeling, Payment Gateway experience, Cloud platforms (AWS, GCP, Azure), Container orchestration, Operational data analysis, Anomaly detection tooling
Razer

Razer Inc. is a dual-headquartered gaming hardware and consumer electronics company specializing in high-performance gaming peripherals and laptops, targeting gamers globally.

System Administrator

12 days ago
Full-time
Malaysia
Key requirements: 3 years of experience, Cloud infrastructure management, AWS, Google Cloud, Microsoft Azure, High-availability applications, Linux administration, Scripting (Python, Bash), Infrastructure as Code (Terraform), Payment Gateway experience, Container platforms (Docker, Kubernetes), PCI DSS compliance, AI-assisted operations
Razer

Razer Inc. is a dual-headquartered gaming hardware and consumer electronics company specializing in high-performance gaming peripherals and laptops, targeting gamers globally.

AIOps Engineer

12 days ago
Full-time
Malaysia
Key requirements: 4 years of experience, AI/ML tooling for anomaly detection, Payment Gateway experience, Cloud infrastructure (AWS, GCP, Azure), Distributed systems knowledge, Operational analytics, Scripting (Python, Go, Bash), Monitoring and observability platforms, Incident response leadership
Razer

Razer Inc. is a dual-headquartered gaming hardware and consumer electronics company specializing in high-performance gaming peripherals and laptops, targeting gamers globally.

Managed Services Engineer I (Raleigh/Durham/Chapel Hill, NC area)

12 days ago
Full-time
United States
$40,000 to $60,000 per year
Key requirements: 1 year of experience, Microsoft Exchange, SQL, Windows Server, ConnectWise, LAN/WAN troubleshooting, Microsoft Active Directory, Azure AD, Imaging Solutions, Basic PowerShell, Strong organization skills, Strong interpersonal skills
Logically

Logically is a Brighouse, England-based B2B Managed Security Services Provider (MSSP) specializing in cybersecurity solutions and IT services for organizations across various industries.

Remote policy: Logically supports remote work and hires from various regions, including the Raleigh–Durham–Chapel Hill area in North Carolina, while encouraging a collaborative team environment.

Senior Systems Administrator (temp to hire) - Marcus Hook, PA - HYBRID

12 days ago
Contract
United States
Key requirements: 12 years of experience, Microsoft Active Directory, VMware vSphere 7.x, Citrix XenApp 2507+, Cisco routers and switches, IT SOX Controls, endpoint security tools, server management best practices, scripting, capacity planning, cybersecurity tools
Arctiq

Arctiq is a Toronto-based B2B DevOps and cloud solution integrator specializing in professional IT services and managed services for enterprise organizations across North America.

Technology Operations Manager

13 days ago
Full-time
Worldwide
$200,000 to $225,000 per year
Key requirements: 5 years of experience, AWS, Hybrid cloud infrastructure, Site Reliability Engineering, Incident investigation, Observability practices, Service reliability, Root cause management, Data center technologies, Virtualization, Infrastructure as Code (IaC), Operational KPIs
Business Wire

Business Wire is a San Francisco-based B2B service provider specializing in global news release distribution and regulatory disclosure for various industries, including finance and healthcare.

Remote policy: Business Wire supports remote work and hires from various locations, with team members collaborating across different time zones.