Posted at: 15 May

DevOps (AWS / Kubernetes / Terraform)

Company

Mutt Data

Mutt Data is an Argentina-based B2B technology startup specializing in custom AI and machine learning solutions for e-commerce and retail media.

Remote Hiring Policy:

Mutt Data supports fully remote work for employees based in Argentina, hiring exclusively from this region.

Job Type

Full-time

Allowed Applicant Locations

Argentina

Job Description

🚀 Join Our Data Products and Machine Learning Development Remote Startup! 🚀
 
Mutt Data is a dynamic startup committed to crafting innovative systems using cutting-edge Big Data and Machine Learning technologies.
 

We are looking for a hands-on DevOps engineer to join a strategic initiative focused on deploying and operating Data & AI platforms.

This role is centered around infrastructure automation, platform reliability, Kubernetes operations, and production-grade cloud infrastructure on AWS. The person will work closely with Infrastructure and Enterprise Architecture teams, operating within established corporate security, networking, and compliance standards. 🐶🚀

🚀 What We Do

  • Leveraging our expertise, we build modern Machine Learning systems for demand planning and budget forecasting.
  • Developing scalable data infrastructures, we enhance high-level decision-making, tailored to each client.
  • Offering comprehensive Data Engineering and custom AI solutions, we optimize cloud-based systems.
  • Using Generative AI, we help e-commerce platforms and retailers create higher-quality ads, faster.
  • Building deep learning models, we enhance visual recognition and automation for various industries, improving product categorization, quality control, and information retrieval.
  • Developing recommendation models, we personalize user experiences in e-commerce, streaming, and digital platforms, driving engagement and conversions.

🌟 Our Partnerships

  • Amazon Web Services
  • Astronomer
  • Databricks

🌟 Our Values

  • 📊 We are Data Nerds
  • 🤗 We are Open Team Players
  • 🚀 We Take Ownership
  • 🌟 We Have a Positive Mindset
 
🔍 Curious about what we're up to? Check out our case studies and dive into our blog to learn more about our culture and the exciting projects we're working on! 🚀

Responsibilities 🤓

Infrastructure as Code on AWS

  • Deploy and maintain infrastructure using Terraform on AWS.
  • Work within the organization's corporate golden path leveraging HCP (HashiCorp Cloud Platform) and HashiCorp Vault.
  • Ensure infrastructure complies with enterprise security, networking, and governance standards.
  • Collaborate with Infrastructure and Enterprise Architecture teams on platform requirements and integrations.

Administration & Operations of Data & AI Platforms

  • Operate and govern production-grade platforms running on Kubernetes / EKS.
  • Manage platforms such as Langfuse, LiteLLM, ClickHouse, Redis, and future Data & AI tooling.
  • Design and maintain:
    • Backup & restore strategies
    • High Availability (HA) configurations
    • Shared cache and distributed rate limiting mechanisms
    • Horizontal and vertical scaling strategies
    • Platform upgrades and dependency management
  • Integrate secrets and credentials management through Vault.
  • Troubleshoot production incidents and improve operational recovery processes.
  • Read and understand platform documentation to ensure optimal deployment and operation on Kubernetes environments.

DevOps / SRE Automation

  • Build and maintain CI/CD pipelines using GitHub Actions.
  • Automate operational workflows and deployment processes.
  • Improve observability, monitoring, and operational reliability.
  • Create operational runbooks and reduce manual toil through automation.
  • Continuously improve platform stability, scalability, and developer experience.

Required Skills 💻

Must-have

  • Strong Terraform experience (production-level, not tutorial-based)
  • Solid AWS infrastructure experience
  • Kubernetes / EKS administration and operations
  • Containers and cloud-native infrastructure
  • SRE mindset and operational judgment
  • Ability to understand systems under the hood, not only operate tooling

Nice-to-have

  • Python for automation
  • Database administration experience
  • ClickHouse
  • Redis
  • LiteLLM
  • Langfuse
  • Observability tools such as Prometheus, Grafana, or equivalent

🎁 Perks

  • 🚀 In-Company English Lessons.
  • 💪 Wellhub or sports club stipend to stay active
  • 🚀 AWS & Databricks certifications fully covered
  • 🍕 Food credits via Pedidos Ya – because great work deserves great food.
  • 🎂 Birthday off + an extra vacation week (Mutt Week! 🏖️)
  • 🤝 Referral bonuses – help us grow the team & get rewarded!
  • ✈️🏝️ Annual Mutters' Trip – an unforgettable getaway with the team!

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.