Description:
As Manager of Analytics Operations for AI Applications, you will be at the heart of BMS's Global Product Development and Supply AI/ML ecosystem. This is not a passive oversight role. You will be hands-on, owning the reliability, performance, and scalability of production AI systems that directly support life-changing work. From managing containerized workloads to partnering with data scientists and ML engineers, you will be the operational backbone that transforms AI innovation into real-world impact.
What You Will Do
- Ensure production AI and LLM applications run with the availability, stability, and performance the business depends on
- Troubleshoot complex issues across data pipelines, microservices, model deployments, and integration layers
- Build and maintain monitoring, alerting, and observability solutions tailored to AI workload behavior
- Deploy and manage containerized infrastructure using Kubernetes and infrastructure-as-code (IaC) tools such as Terraform and Ansible
- Enhance CI/CD and AI orchestration pipelines, automating everything that does not need a human touch
- Drive continuous improvement through rigorous root cause analysis and incident management
- Champion security, compliance, and model governance in collaboration with cybersecurity and cross-functional teams
- Act as a key technical voice during design reviews, architecture planning, and operational readiness assessments
What You Bring
- 5 or more years of experience in IT operations, site reliability engineering, cloud engineering, or DevOps
- Strong hands-on expertise in AWS, Kubernetes, Docker, and infrastructure-as-code frameworks
- Proficiency in scripting languages such as Python, Bash, or PowerShell
- Solid troubleshooting skills across distributed systems, APIs, databases, and data pipelines
- Familiarity with AI frameworks and platforms such as LangChain, Amazon Bedrock, or Databricks
- Experience supporting generative AI or LLM workloads is a strong advantage
- Knowledge of observability platforms such as Datadog, Splunk, or Grafana
- A background in highly regulated industries such as healthcare, finance, or government is preferred
- Cloud, DevOps, or AI certifications are a welcome addition