Role Overview
We are seeking an experienced ML Ops Engineer to lead the operationalisation of machine learning models across the SMPnet platform. In this role, you will design, deploy, and maintain robust ML pipelines, ensuring reliability, reproducibility, and high performance in real-world grid environments.
You will collaborate closely with data scientists; backend, cloud, and smart grid engineers; and domain experts to automate model training, validation, deployment, monitoring, and lifecycle management. This is an opportunity to shape SMPnet’s ML infrastructure and contribute directly to delivering mission-critical solutions to energy providers.
Key Responsibilities
Model Deployment & Operationalisation
- Design, implement, and manage production-grade ML pipelines for model training, validation, deployment, and monitoring.
- Develop scalable, automated workflows using tools such as SageMaker, MLflow, Airflow, or Kubeflow.
- Convert and optimise trained models for efficient inference using TensorRT, TorchScript, ONNX, or similar deployment frameworks.
- Deploy models across cloud and hybrid environments, ensuring high availability and efficient resource utilisation.
Data Pipelines & Infrastructure
- Develop and maintain robust data ingestion and transformation pipelines for time-series and real-time data.
- Optimise data workflows to ensure reproducibility, consistency, and low-latency model inference.
- Collaborate with cloud engineers to integrate ML workflows with AWS services and internal APIs.
Monitoring, Observability & Model Governance
- Implement monitoring systems to track model drift, data quality, and performance degradation.
- Contribute to the establishment of model governance practices, versioning strategies, and auditability mechanisms.
- Build dashboards and tools for operational insights, enabling transparent and reliable model lifecycle management.
Collaboration & Technical Guidance
- Work closely with data scientists to transition research models into production-ready solutions.
- Support backend and DevOps teams in integrating ML components into broader system architectures.
- Contribute to technical design discussions and long-term planning for SMPnet’s data and ML platforms.
Continuous Improvement
- Research and adopt state-of-the-art techniques, tools, and technologies to enhance ML reliability and automation.
- Introduce best practices for CI/CD in ML environments, including automated testing, validation, and retraining.
- Champion efficiency, documentation, security, and quality within ML-related development processes.
Requirements
- Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field.
- 2–4 years of hands-on experience building and deploying machine learning systems into production.
- Strong proficiency in Python, including ML and data engineering libraries.
- Experience with ML frameworks such as TensorFlow, PyTorch, and scikit-learn.
- Hands-on experience with AWS services relevant to ML (e.g., S3, Lambda, SageMaker, ECS).
- Solid understanding of data pre-processing, feature engineering, and evaluation techniques for time-series forecasting.
- Experience building automated pipelines and workflows (e.g., Airflow, Dagster, Prefect).
- Familiarity with Docker, containerisation, and Git-based workflows.
- Strong analytical skills and comfort working with real-time or high-volume time-series datasets.
- Excellent communication and collaboration skills in English.
What We Offer
- A full-time position with a competitive salary.
- Benefits including stock options, 28 days of holiday (excluding statutory holidays), private health insurance, and a one-off €1,000 training budget for professional development and wellness.
- Flexible working hours with a focus on achieving a balanced work environment.
- A collaborative and innovative atmosphere with opportunities to impact the company’s direction and growth.