The client offers comprehensive IT services throughout Europe
Basic information:
Location: 100% remote or hybrid (Warsaw/Poznań/Lublin)
Rate: up to approx. PLN 135/hour net + VAT (additional payment for on-call time per week/month)
Type of employment: B2B contract
Duration: 12 months + extensions
Recruitment process: 2 stages
English: B2/C1
Benefits: Luxmed, MultiSport, equipment provided by the client
Client Description
- Our client is a leading German company at the forefront of telecommunications and IT services, specializing in advanced solutions such as web hosting, cloud computing, and internet services.
- To expand its capabilities, the client is establishing a European Technology Center in Poland.
Project Description
- We are looking for an experienced ML Ops Engineer to design and implement infrastructure for hosting, orchestrating, and managing up to 1,500 ML scoring processes within a new Databricks environment.
- The role focuses on operationalizing ML scoring pipelines by creating a scalable, secure, and well-monitored platform for data science teams to deploy models efficiently.
- You will build the operational backbone that enables data scientists to run, monitor, and manage high-volume ML scoring in production.
- Working closely with DevOps Engineers, you will ensure Databricks supports both AI workloads and traditional BI/data processing use cases, including secure access, seamless integration with downstream tools, and optimized data pipelines.
Impact of the Role
- This position is critical for scaling AI capabilities across the organization, enabling thousands of predictive scores to be calculated and monitored daily in a production-grade environment.
- Your work will accelerate AI adoption while ensuring operational excellence.
Working Model & Collaboration
- Distributed delivery model aligned with the central AI/BI team in Germany
- Daily use of remote collaboration tools (MS Teams, Jira, Confluence)
- Agile methodologies (Scrum/Kanban) in a cross-functional squad
- Clear documentation and reproducibility for seamless handovers
Key Responsibilities
- Environment Setup & Configuration:
> Configure Databricks clusters, jobs, and workflows for large-scale ML scoring
> Implement Infrastructure as Code (e.g., Terraform) for reproducibility and governance
> Optimize job scheduling, parallel execution, and resource allocation
> Integrate monitoring and alerting using cloud-native tools
> Ensure security, compliance, and cost-efficiency
- ML Ops Pipeline Integration:
> Develop deployment processes for ML models using Databricks MLflow or equivalent
> Implement version control for models, scoring code, and configurations
- Execution Management:
> Build frameworks to orchestrate scoring for 1,500+ ML models/jobs
> Ensure resilience, fault tolerance, and restart capabilities
- Monitoring & Observability:
> Integrate logging, alerting, and dashboards for throughput, latency, and failures
> Establish model performance monitoring hooks
- Automation:
> Collaborate with DevOps Engineers on shared infrastructure (e.g., Delta Lake tables)
> Automate resource provisioning and deployments via CI/CD pipelines
> Utilize IaC for reproducibility
- Collaboration & Governance:
> Work closely with data scientists, architects, and platform engineers
> Define operational SLAs for scoring workloads
> Implement RBAC, credential management, and audit logging
> Ensure secure handling of model artifacts and scoring data
Technology Stack:
> Databricks (administration, ideally on Azure)
> Terraform
> Python (automation)
Key Skills:
> Azure
> Databricks
> MLOps
> Python
> Terraform
Must-Have Requirements:
> Proven experience in ML Ops within production environments
> Hands-on expertise with Databricks (MLflow, Jobs, Workflows, Delta Lake)
> Large-scale batch job orchestration and distributed computing experience
> Strong Python skills for workflow scripting and pipeline integration
> CI/CD pipeline experience (Azure DevOps, GitHub Actions, etc.)
> Proficiency with monitoring tools (Datadog, Prometheus, Grafana, or cloud-native)
> Knowledge of IaC and cloud automation
> Understanding of model lifecycle management and reproducibility
Nice-to-Have:
> Experience with high-volume ML scoring in Databricks
> Familiarity with ML operationalization best practices in regulated environments
> Knowledge of job queueing systems and parallel execution patterns
> Exposure to Azure Databricks and the Azure ecosystem
> Performance tuning for large concurrent workloads
> Cost optimization strategies for ML infrastructure
Hays Poland sp. z o.o. is an employment agency registered in the registry kept by the Marshal of the Mazowieckie Voivodeship under number 361.