Posted at: 16 February

Senior Data Scientist

Company

Alex Staff Agency

Alex Staff Agency is an international B2B IT recruitment agency that connects top tech talent with companies in the IT and creative sectors. It operates remotely, without a fixed headquarters.

Remote Hiring Policy:

Alex Staff Agency embraces remote work and offers flexible collaboration options, including fully remote roles and hybrid models in locations such as London. Team members are supported across various regions.

Job Type

Full-time

Allowed Applicant Locations

United Kingdom

Job Description

We need someone who can build high-quality forecasting models for UK energy balancing markets — not a generalist who's touched a bit of everything, but a specialist who genuinely understands time series, knows how to extract signal from massive feature sets, and can produce reliable probabilistic forecasts.

You'll spend significant time on tasks like: engineering features from raw market data, selecting the most predictive subset from hundreds of thousands of candidates, building gradient boosting models that output well-calibrated prediction intervals, and rigorously validating everything to avoid the subtle leakage problems that plague time series work.

You won't be responsible for deployment — we have experienced DevOps for that. But you'll need to hand off models that are well-documented, reproducible, and actually work in production. If you find satisfaction in the craft of building models that hold up under scrutiny — rather than just hitting a metric on a test set — this role is for you.

Feature Engineering and Selection

• Engineer predictive features from energy market data (prices, volumes, grid conditions, weather, calendar effects)

• Work with feature sets in the hundreds of thousands — you'll need systematic approaches, not manual inspection

• Apply and evaluate feature selection methods (mRMR, importance-based selection, recursive elimination) to build parsimonious models

• Analyse feature importance and stability across time periods and market conditions

• Understand the domain well enough to create features that reflect how the balancing market actually works


Model Development

• Build gradient boosting models (XGBoost, LightGBM, CatBoost) for multi-horizon forecasting

• Produce probabilistic forecasts — prediction intervals, quantile regression, or distribution outputs — not just point estimates

• Handle class imbalances appropriately when the problem requires classification

• Design proper time series cross-validation schemes that respect temporal ordering

• Diagnose and fix target leakage — you should be able to explain why a 'too good' result is suspicious


Validation and Testing

• Test pipeline components using synthetic/artificial data where ground truth is known

• Validate that preprocessing steps (missing value imputation, outlier handling) don't introduce leakage

• Build confidence that models will generalise, not just interpolate


Experiment Tracking and Reproducibility

• Track experiments systematically (MLflow or similar)

• Maintain reproducible training pipelines with proper configuration management

• Document model decisions, hyperparameter choices, and validation results clearly


Domain Understanding

• Invest time learning UK energy balancing markets — BM units, settlement periods, system prices, imbalance dynamics

• Translate domain knowledge into model improvements (better features, appropriate loss functions, sensible constraints)

• Collaborate with colleagues who understand the data infrastructure and market context

Must Have

• Deep time series experience: you understand why random CV splits fail for forecasting, how to handle multiple horizons, and the pitfalls of lookahead bias

• Strong feature engineering and selection skills: you've worked with high-dimensional feature sets and know multiple approaches to reduce them systematically

• Gradient boosting expertise: XGBoost, LightGBM, or CatBoost are your core tools; you understand their hyperparameters and when each matters

• Probabilistic forecasting ability: you can produce calibrated prediction intervals or quantile forecasts, not just point predictions

• Rigorous validation mindset: you're paranoid about leakage, you test your assumptions, and you don't trust results that seem too good

• Python fluency: clean, testable code; comfortable with pandas/Polars, scikit-learn, and the GBM libraries

• SQL competence: you can pull and reshape data from PostgreSQL without friction

• Clear communication: you document your work and can explain model behaviour to non-ML colleagues


Nice to Have

• Experience with MLflow, Hydra, Metaflow, or similar tooling for experiment tracking and pipeline management

• Polars experience (we're migrating some workloads from pandas)

• Background in energy, utilities, trading, or other domains with similar forecasting challenges

• Familiarity with UK energy markets, Elexon data, or grid balancing

• Experience with conformal prediction or other modern uncertainty quantification methods


Highly Desirable — Agentic AI Coding Experience

We value candidates who can build software using agentic AI coding systems. This is fundamentally different from using code completion tools or chat-based assistants.

What we're NOT looking for:

- GitHub Copilot (code completion/autocomplete)

- ChatGPT or similar chat interfaces for generating isolated code snippets

- Any tool that only provides single-turn question/answer interactions

What we ARE looking for: Hands-on experience with agentic coding systems such as Claude Code, Codex (OpenAI's agentic coding tool), Open Code, or Cursor.

Ideal candidates will demonstrate:

- Breadth of experience — proficiency with at least 2 agentic systems (experience with only one is insufficient)

- End-to-end development — ability to design and build software from the ground up using these tools, not just generating isolated snippets

- Multi-agent orchestration — demonstrated experience orchestrating multiple agents using skills, tools, and agent coordination, not just one-shot problem solving

- Deep system knowledge — familiarity with hooks, permission systems, MCP (Model Context Protocol) servers, custom skills and tool definitions, and context management

• Plenty of opportunities for learning and professional growth

• B2B contract with paid vacation