Posted at: 14 April
1062 | MLOps Engineer
Company
Intetics
Intetics is a Brooklyn-based B2B custom software development company specializing in distributed development teams, AI/ML solutions, and geospatial services for various industries, including automotive and government.
Remote Hiring Policy:
Intetics embraces a distributed team model, hiring remotely from various regions, including the United States and several European countries, fostering collaboration across time zones.
Job Type
Full-time
Allowed Applicant Locations
Europe, United States
Job Description
Intetics Inc., a global technology company providing custom software application development, distributed professional teams, software product quality assessment, and “all-things-digital” solutions, is seeking a highly skilled and experienced MLOps Engineer to join our dynamic team on a full-time basis.
Responsibilities:
- Design and implement scalable, secure, and cost‑efficient MLOps solutions leveraging AWS and Databricks.
- Automate ML deployment pipelines, reducing manual intervention and operational overhead.
- Collaborate closely with data scientists to ensure solutions align with established MLOps architecture, best practices, and platform standards.
- Integrate security controls and compliance requirements throughout the entire machine learning lifecycle.
- Own and manage incidents end‑to‑end, from root cause analysis to prevention of future occurrences.
- Contribute to software system architecture and the design of platform‑level components.
- Build and optimize ML training, retraining, and inference pipelines, ensuring reliability and scalability.
- Enhance observability with metrics, logging, tracing, and dashboards to ensure system visibility and performance.
- Drive best practices in infrastructure automation, CI/CD, and cloud resource management across ML teams.
Requirements:
- Strong hands‑on experience with AWS architecture, including security best practices, IAM, networking, and cost optimization.
- Proficiency with Databricks (essential): MLflow, Workflows, Feature Store, cluster management, Unity Catalog.
- Experience with cloud‑managed ML platforms such as AWS SageMaker or Google Vertex AI.
- Expert knowledge of Terraform / Terragrunt for multi‑cloud infrastructure provisioning and automation.
- Deep expertise in Kubernetes, including autoscaling, GPU workloads, network policies, and cluster optimization.
- Practical experience with observability stacks such as Prometheus, Grafana, Loki, ELK.
- Strong understanding of GitOps workflows and CI/CD tools (e.g., ArgoCD, FluxCD).
- Solid knowledge of Docker security, container hardening, and secure container orchestration.
- Advanced experience in MLOps practices for continuous training (CT), CI/CD for ML models, and automated deployment.
- Familiarity with ML pipeline orchestration tools such as Kubeflow or Argo Workflows.
- Experience with LLMOps, including tools such as Langfuse, Ollama, and vLLM, and with supporting large‑scale inference.
- Ability to contribute to architecture design, set platform standards, and mentor MLOps or ML engineers.