Remote Site Reliability Engineer Jobs

Explore 66 fresh remote Site Reliability Engineer jobs. Whether you're working from home or from anywhere in the world, our curated listings deliver clear insights for your next move.

Latest Site Reliability Engineer Jobs (66)

Senior Site Reliability Engineering - Storage

1 day ago
Full-time
India
Key requirements: 12 years of experience, NAS, SAN, Object Storage, SRE concepts, Infrastructure as Code, Terraform, Ansible, Docker, Kubernetes, Python, Go, Shell, AI/ML storage, data-driven reliability improvements, technical leadership
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Site Reliability Engineer

2 days ago
Full-time
United States
$150,000 to $200,000 per year
Key requirements: 5 years of experience, SRE, Linux, Container management, Distributed systems, SLIs/SLOs, Incident response leadership, Scripting, Monitoring systems, GPU infrastructure, High-growth reliability improvement
RunPod

RunPod is a Mt. Laurel, New Jersey-based B2B cloud computing platform specializing in GPU infrastructure for AI and machine learning applications, serving a global market of developers and enterprises.

Remote policy: RunPod operates as a remote-first organization, welcoming candidates from various locations, primarily focusing on those eligible to work in the United States.

Staff Site Reliability Engineer (AI Enablement)

3 days ago
Full-time
United States
$150,000 to $230,000 per year
Key requirements: 8 years of experience, AI-assisted development tools, Building AI/LLM-powered developer tools, Driving org-wide tooling adoption, Prompt engineering techniques, Go or Python proficiency, Operating production environments in AWS, Strong experience with Terraform, Container orchestration (ECS/Kubernetes), Observability practices, AI Enablement Strategy
Coalition, Inc.

Staff Site Reliability Engineer (AI Enablement)

3 days ago
Full-time
Canada
$153,400 to $230,400 per year
Key requirements: 8 years of experience, AI-assisted development tools, Building AI/LLM-powered developer tools, AI Enablement Strategy, Proficiency in prompt engineering, Go or Python proficiency, AWS production environment experience, Terraform expertise, Container orchestration (ECS/Kubernetes), Observability practices, Driving org-wide tooling adoption
Coalition, Inc.

Site Reliability Engineer

3 days ago
Full-time
United States, Canada, United Kingdom, Brazil, Japan, Nigeria
$100,000 to $150,000 per year
Key requirements: 4 years of experience, PostgreSQL, Kubernetes, GitOps, Cloud networking, Incident response, Go, Python, Observability stack
Alpaca

Alpaca is a US-based fintech company providing self-clearing brokerage infrastructure and APIs for stocks, ETFs, options, and crypto, serving financial institutions globally.

Remote policy: Alpaca embraces a remote-first culture, hiring globally from various regions including the USA, Canada, Japan, Hungary, Nigeria, Brazil, and the UK, allowing team members to work from their preferred locations.

Senior Infrastructure Engineer

3 days ago
Full-time
United States
$105,000 to $135,000 per year
Key requirements: 4 years of experience, PowerShell, Python, Cloud management tools, ITIL 4 Foundation, AI platforms, SharePoint Online, Vulnerability remediation, Disaster Recovery (DR), Business Continuity Plans (BCP), Automated deployment frameworks, Global infrastructure scaling
Omnidian

Omnidian is a Seattle-based B2B tech-enabled service company specializing in solar energy monitoring and maintenance, serving residential and commercial markets globally.

Remote policy: Omnidian supports remote work for most roles, allowing flexibility for employees to work from various locations, including regions such as Seattle, WA, and Australia.

Senior AIOps Engineer

4 days ago
Full-time
Malaysia
Key requirements: 5 years of experience, AIOps automation, AI/ML integration, Observability frameworks, Predictive modeling, Payment Gateway experience, Cloud platforms (AWS, GCP, Azure), Container orchestration, Operational data analysis, Anomaly detection tooling
Razer

Razer Inc. is a dual-headquartered gaming hardware and consumer electronics company specializing in high-performance gaming peripherals and laptops, targeting gamers globally.

System Administrator

4 days ago
Full-time
Malaysia
Key requirements: 3 years of experience, Cloud infrastructure management, AWS, Google Cloud, Microsoft Azure, High-availability applications, Linux administration, Scripting (Python, Bash), Infrastructure as Code (Terraform), Payment Gateway experience, Container platforms (Docker, Kubernetes), PCI DSS compliance, AI-assisted operations
Razer

Razer Inc. is a dual-headquartered gaming hardware and consumer electronics company specializing in high-performance gaming peripherals and laptops, targeting gamers globally.

AIOps Engineer

4 days ago
Full-time
Malaysia
Key requirements: 4 years of experience, AI/ML tooling for anomaly detection, Payment Gateway experience, Cloud infrastructure (AWS, GCP, Azure), Distributed systems knowledge, Operational analytics, Scripting (Python, Go, Bash), Monitoring and observability platforms, Incident response leadership
Razer

Razer Inc. is a dual-headquartered gaming hardware and consumer electronics company specializing in high-performance gaming peripherals and laptops, targeting gamers globally.

Site Reliability Engineer (SRE)

4 days ago
Full-time
United States
$100,000 to $150,000 per year
Key requirements: 5 years of experience, Kubernetes, Python, Go, Prometheus, Grafana, CI/CD pipelines, Chaos engineering, SLOs and error budgets, Linux at scale, Observability tooling
Bright Vision Technologies

Bright Vision Technologies is a New Jersey-based IT staffing firm specializing in placing technical professionals in software development roles across the U.S. government and enterprise sectors.

Managed Services Engineer I (Raleigh/Durham/Chapel Hill, NC area)

4 days ago
Full-time
United States
$40,000 to $60,000 per year
Key requirements: 1 year of experience, Microsoft Exchange, SQL, Windows Server, ConnectWise, LAN/WAN troubleshooting, Microsoft Active Directory, Azure AD, Imaging Solutions, Basic PowerShell, Strong organization skills, Strong interpersonal skills
Logically

Logically is a Brighouse, England-based B2B Managed Security Services Provider (MSSP) specializing in cybersecurity solutions and IT services for organizations across various industries.

Remote policy: Logically supports remote work and hires from various regions, including the Raleigh–Durham–Chapel Hill area in North Carolina, while encouraging a collaborative team environment.

Kafka Platform Engineer

4 days ago
Full-time
United States
$100,000 to $150,000 per year
Key requirements: 5 years of experience, Kafka internals, Kafka security, Kafka Connect, Schema Registry, Kafka Streams, HA/DR strategies, Python, Terraform, Observability tooling, Confluent platform
Bright Vision Technologies

Bright Vision Technologies is a New Jersey-based IT staffing firm specializing in placing technical professionals in software development roles across the U.S. government and enterprise sectors.

Observability Engineer (Prometheus / Grafana / Datadog)

4 days ago
Full-time
United States
$100,000 to $150,000 per year
Key requirements: 5 years of experience, Prometheus, Grafana, Datadog, OpenTelemetry, SRE principles, High-cardinality metrics, CI/CD integration, Linux internals, Distributed tracing, Observability cost optimization
Bright Vision Technologies

Bright Vision Technologies is a New Jersey-based IT staffing firm specializing in placing technical professionals in software development roles across the U.S. government and enterprise sectors.

Senior Systems Administrator (temp to hire) - Marcus Hook, PA - HYBRID

4 days ago
Contract
United States
Key requirements: 12 years of experience, Microsoft Active Directory, VMware vSphere 7.x, Citrix XenApp 2507+, Cisco routers and switches, IT SOX Controls, endpoint security tools, server management best practices, scripting, capacity planning, cybersecurity tools
Arctiq

Arctiq is a Toronto-based B2B DevOps and cloud solution integrator specializing in professional IT services and managed services for enterprise organizations across North America.

Observability Engineer (Prometheus / Grafana / Datadog)

5 days ago
Full-time
United States
$100,000 to $150,000 per year
Key requirements: 5 years of experience, Prometheus, Grafana, Datadog, OpenTelemetry, SLOs, error budgets, high-cardinality metrics, distributed tracing, CI/CD integration, Linux internals
Bright Vision Technologies

Bright Vision Technologies is a New Jersey-based IT staffing firm specializing in placing technical professionals in software development roles across the U.S. government and enterprise sectors.

Technology Operations Manager

5 days ago
Full-time
Worldwide
$200,000 to $225,000 per year
Key requirements: 5 years of experience, AWS, Hybrid cloud infrastructure, Site Reliability Engineering, Incident investigation, Observability practices, Service reliability, Root cause management, Data center technologies, Virtualization, Infrastructure as Code (IaC), Operational KPIs
Business Wire

Business Wire is a San Francisco-based B2B service provider specializing in global news release distribution and regulatory disclosure for various industries, including finance and healthcare.

Remote policy: Business Wire supports remote work and hires from various locations, with team members collaborating across different time zones.

Site Reliability Engineer (SRE)

5 days ago
Full-time
Worldwide
$90,000 to $130,000 per year
Key requirements: 3 years of experience, Kubernetes, Ansible, Terraform, Bash scripting, CI/CD (GitLab CI), Observability tools (Prometheus, Grafana), Linux administration, Networking fundamentals, Containerization (Docker/Podman), DNS management
Social Discovery Group

Social Discovery Group is a global tech conglomerate specializing in social discovery and dating platforms, headquartered remotely with teams across multiple countries, operating in the B2C and B2B sectors.

Remote policy: Social Discovery Group supports fully remote work for its international team and hires globally from various locations, including the USA, Cyprus, Malta, and many others, fostering a digital nomad culture.

Senior Debug System Engineer, Datacenter

6 days ago
Full-time
United States
$200,000 to $322,000 per year
Key requirements: 12 years of experience, Failure analysis on datacenter products, Debugging GPU baseboards and servers, Enabling DFx requirements, Hardware, Software, Component, Process, Test, Validation expertise, Familiarity with oscilloscopes and analyzers, Strong negotiation and organization skills, Problem-solving mentality, Ability to travel to factory sites
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Senior Production Engineer - DGX Cloud

7 days ago
Full-time
North America
$168,000 to $333,500 per year
Key requirements: 8 years of experience, Production Engineering, DevOps, SRE, Kubernetes, Slurm, Go, Python, Large-scale distributed systems, Incident management, Monitoring and alerting, Automated deployments
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Senior Site Reliability Engineer

7 days ago
Full-time
South Africa, Nigeria, Kenya, Ghana
Key requirements: 5 years of experience, Python, Terraform, AWS, Deep Observability Knowledge, Systems Thinking, Reliability Standards, Disaster Recovery, CI/CD, Kubernetes, Remote work adaptability
Deimos

Deimos is a fully remote, Africa-based technology services company specializing in cloud-native development and security operations for clients of all sizes.

Remote policy: Deimos is a fully remote company with a team based in Africa, hiring from countries such as Kenya, Ghana, Nigeria, South Africa, and Senegal.

Senior Site Reliability Engineer - Security

8 days ago
Full-time
India
Key requirements: 5 years of experience, Python, AWS, Infrastructure-as-code, Observability, Automation, Incident handling, AI systems interest
Scopely

Scopely is a Culver City-based mobile game developer and publisher specializing in free-to-play games, targeting a global audience with popular entertainment IPs.

Remote policy: Scopely embraces a flexible work environment and hires remotely from various regions, including the United States, EMEA, and Asia, with team members collaborating across time zones.

Staff Site Reliability Engineer

9 days ago
Full-time
United States, Canada, United Kingdom, Singapore, India, Ireland, Finland
$120,000 to $180,000 per year
Key requirements: 8 years of experience, Production SaaS systems, Python, AWS, Kubernetes, Networking fundamentals, Monitoring & alerting, Advanced observability, Incident management, Troubleshooting skills, AIOps strategy
AlphaSense

AlphaSense is a New York City-based B2B fintech platform specializing in AI-driven market intelligence and search solutions for financial institutions and top companies globally.

Remote policy: AlphaSense supports remote work and hires from various regions, with team members located in countries such as the United States, U.K., Finland, India, Singapore, Canada, and Ireland.

Cloud Reliability & Recovery Engineer

9 days ago
Full-time
United States, Canada, United Kingdom, Singapore, India, Ireland, Finland
$100,000 to $150,000 per year
Key requirements: 5 years of experience, AWS expertise, Disaster Recovery architecture, Multi-region failover, Terraform, Kubernetes, CI/CD pipelines, Python scripting, AWS Backup administration, Chaos engineering, Business Continuity Planning
AlphaSense

AlphaSense is a New York City-based B2B fintech platform specializing in AI-driven market intelligence and search solutions for financial institutions and top companies globally.

Remote policy: AlphaSense supports remote work and hires from various regions, with team members located in countries such as the United States, U.K., Finland, India, Singapore, Canada, and Ireland.

Observability Engineer (Prometheus / Grafana / Datadog)

9 days ago
Full-time
United States
Key requirements: 5 years of experience, Prometheus, Grafana, Datadog, OpenTelemetry, SRE principles, High-cardinality metrics, Distributed tracing, CI/CD integration, Linux internals, Container platforms
Bright Vision Technologies

Bright Vision Technologies is a New Jersey-based IT staffing firm specializing in placing technical professionals in software development roles across the U.S. government and enterprise sectors.

Staff Database Reliability Engineer

10 days ago
Full-time
United States
$200,000 to $250,000 per year
Key requirements: PostgreSQL, Django ORM, AWS DMS, pganalyze, CloudWatch, Honeycomb, AI coding tools, OpenSearch, Redis, SQS, RabbitMQ, Python, Terraform, Cross-team leadership, Automation
Scribe

Scribe is a San Francisco-based B2B SaaS platform specializing in workflow documentation and optimization, serving over 5 million users across 600,000 businesses globally.

Senior Infrastructure Engineer, Government Systems

10 days ago
Full-time
North America, Middle East
Key requirements: Kubernetes, Terraform, AWS, CI/CD, GitOps, Linux administration, Operational mindset, Security compliance
Chainalysis

Chainalysis is a New York City-based B2B blockchain analysis firm specializing in compliance and investigation software for the cryptocurrency and financial sectors, serving clients globally.

Remote policy: Chainalysis supports remote work and is open to hiring from various regions, including North America and the Middle East, with team members located across multiple countries.

OpenShift Engineer

10 days ago
Full-time
United States
Key requirements: 5 years of experience, OpenShift, Kubernetes internals, Linux administration, Infrastructure-as-code (Ansible, Terraform, Helm), CI/CD pipelines (Tekton, Jenkins, Argo CD), Scripting (Bash, Python, Go), Monitoring tools (Prometheus, Grafana, EFK, Tempo), Container image security, Multi-tenant OpenShift platforms, Disaster recovery strategies
Bright Vision Technologies

Bright Vision Technologies is a New Jersey-based IT staffing firm specializing in placing technical professionals in software development roles across the U.S. government and enterprise sectors.

Site Reliability Engineer (SRE)

10 days ago
Full-time
United States
$100,000 to $150,000 per year
Key requirements: 5 years of experience, Python, Go, Kubernetes, Prometheus, Grafana, CI/CD pipelines, Chaos engineering, Distributed systems design, Incident response, Observability tooling
Bright Vision Technologies

Bright Vision Technologies is a New Jersey-based IT staffing firm specializing in placing technical professionals in software development roles across the U.S. government and enterprise sectors.

Senior Software Engineer, AV Mapping Infrastructure

11 days ago
Full-time
United States
$152,000 to $287,500 per year
Key requirements: 5 years of experience, AWS, Kubernetes, Cloud services management, Application containers, Monitoring systems (Prometheus, Datadog), Middleware systems (Redis, MongoDB, Kafka, HBase, Postgres, Elasticsearch), CI/CD deployment strategies, Networking fundamentals, Linux proficiency
NVIDIA

NVIDIA Corporation is a Santa Clara-based technology company specializing in designing GPUs and AI solutions for gaming, professional visualization, and cloud services, operating in both B2B and B2C markets globally.

Remote policy: NVIDIA supports flexible remote work arrangements and hires from various regions globally, including the Americas, Europe, Asia, and the Middle East, with roles that may require collaboration across time zones.

Site Reliability Engineer II

11 days ago
Full-time
Mexico
Key requirements: 3 years of experience, Python, Go, Distributed Systems Expertise, Reliability Engineering Mindset, Observability & Incident Response, Cross-functional Communication, Operational Tooling & AI Fluency, Leadership & Mentorship
EarnIn

EarnIn is a fintech company headquartered in the US, specializing in earned wage access (EWA) through a mobile app that provides financial tools for hourly workers.