Posted at: 18 February
Senior DevOps Engineer - Highload, Cloud & Data-Intensive Systems (EU / Remote)
Company
Alex Staff Agency
Alex Staff Agency is an international B2B IT recruitment agency that connects top tech talent with companies in the IT and creative sectors. It operates remotely, without a fixed headquarters.
Remote Hiring Policy:
Alex Staff Agency embraces remote work and offers flexible collaboration options, including fully remote roles and hybrid models in locations such as London. Team members are supported across various regions.
Job Type
Full-time
Allowed Applicant Locations
Europe
Salary
€5,000 to €8,000 per month
Job Description
About the project
The team develops and maintains distributed services for analytics, APIs, and transaction monitoring. The systems process very large volumes of data: terabytes of storage, trillions of records, and a continuously growing load.
Infrastructure:
~100 servers (bare metal + VPS)
active use of IaC
Kubernetes clusters in production
focus on stability, observability, and automation
The project is long-term: not a hype startup, but a mature product with real users.
What the work looks like
This is a hands-on role with a clear time allocation:
60% — operations and incidents (including helping teams)
20% — infrastructure automation
20% — prototyping, improvements, technical initiatives
The role includes on-call duty, but after-hours incidents typically occur only 2–3 times a year, not every week.
Responsibilities
Operation of production services and infrastructure (server provisioning/decommissioning, updates, replacements, performance troubleshooting)
Support and development of Infrastructure as Code (Terraform / Ansible: modules, roles, standards, reviews)
Monitoring, alerting, backups, and regular recovery checks
Development of service and infrastructure automation
Development of CI/CD and release procedures
Incident diagnosis and resolution, support for product teams
Traffic analytics and tools for protection against bots and attacks
Responsibility for 24/7 platform stability
What’s important
4+ years of experience operating Linux/Ubuntu infrastructure and production services
Strong understanding of networking and troubleshooting
Kubernetes (cluster operations), Rancher, Docker / containerd
Hands-on experience with Ansible and Terraform
Monitoring: Prometheus / Thanos / Telegraf / Grafana / Sentry
CI/CD: Jenkins
Automation: Bash, Python
Experience working with LVM
Nice to have
Experience working with blockchain nodes
Diagnosis and tuning of ClickHouse and MongoDB in high-load clusters
Providers: Hetzner / OVHcloud
Cloudflare (edge, DDoS protection); experience with AWS
Handling abuse tickets with hosting providers
Technology stack
VPN: WireGuard, OpenVPN
Databases: ClickHouse, MongoDB, Redis, PostgreSQL
Applications: Node.js (pm2), php-fpm, Lua, Tarantool
Supporting services: Go (Operator SDK), Ruby, Node.js, PHP
Conditions
Salary: €5,000 to €8,000 net per month
Format: office / hybrid / remote
Location: Spain (Barcelona and suburbs) or remote (CET ±2)
Full-time
Opportunity to genuinely influence architecture and processes
Mature engineering team and reasonable expectations