Posted: 27 April
Data Scientist
Company
Angel Studios is a Provo, Utah-based independent media company specializing in values-based entertainment through film distribution and an OTT streaming platform, primarily targeting a global audience.
Remote Hiring Policy
Angel Studios supports remote work and is open to hiring from various regions, though specific hiring locations are not explicitly defined.
Job Type
Full-time
Allowed Applicant Locations
United States
Salary
$110,000 to $130,000 per year
Job Description
Why Join Angel
Angel Studios is growing fast. Our content library has expanded 10x in under two years, and over two million Guild members now decide what gets produced, funded, and watched. As that library grows, the gap between what members would love and what they actually find is the most important problem on the platform.
You’ll be the first dedicated data scientist on Discovery. You’ll own the analytical foundation that makes our recommendation system measurable, improvable, and eventually intelligent. Today, our recommendations run on AWS Personalize. Your work will determine how far that takes us and when we’ve outgrown it.
This is a data science role, not an ML engineering role. Day one is about analytical rigor: metrics, experimentation, causal inference, and making the team smarter about our members. But we’re building toward a future where Angel owns its recommendation models end to end. If you’re a strong data scientist who wants to grow into owning models in production, this is the role where that trajectory is real and supported.
What You'll Own
- Metrics and measurement. Define, instrument, and maintain the Discovery metrics framework across web, mobile, and TV. Model metrics (precision, recall, coverage, diversity), customer metrics (CTR, playthrough, completion, session depth, cold-start ramp time), and business metrics (retention segmented by recommendation engagement). You decide what we measure, how we measure it, and when a metric is lying to us.
- Experimentation. Own the A/B testing and experimentation pipeline for Discovery surfaces. Design experiments with statistical rigor: sample sizing, duration, segmentation, guard-rail metrics. Build the institutional muscle so the team ships with evidence, not opinions. We use GrowthBook.
- User behavior analysis. Decode how members discover, browse, and engage with content across three very different platforms. Identify patterns in Guild voting, theatrical-to-streaming conversion, content affinity, and churn risk. Surface the insights that change how the product team thinks about the problem.
- Causal inference. Distinguish correlation from causation in engagement data, where selection bias is everywhere. When recommendation engagement correlates with retention, determine whether the system is driving retention or whether high-intent users are simply more likely to click. Design quasi-experiments when randomization isn’t feasible.
- Data foundations for analytics. Build and maintain the dbt models, data pipelines, and analytical infrastructure that make data accessible and trustworthy for the Discovery team and the broader organization. If the data is wrong, nothing else matters.
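To make the experimentation responsibility concrete, here is a minimal sketch of the kind of sample-size calculation it implies, using a standard two-proportion z-test formula. The baseline CTR, lift, and significance settings below are invented for illustration, not taken from the posting or from GrowthBook.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_arm(p_baseline, p_variant, alpha=0.05, power=0.80):
    """Users needed per arm to detect p_baseline -> p_variant
    with a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-sided
    z_beta = NormalDist().inv_cdf(power)           # power quantile
    p_bar = (p_baseline + p_variant) / 2           # pooled rate under H0
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_baseline * (1 - p_baseline)
                        + p_variant * (1 - p_variant))
    ) ** 2
    return ceil(numerator / (p_variant - p_baseline) ** 2)

# Illustrative numbers: detect a lift in row CTR from 10% to 11%
# at alpha = 0.05 with 80% power.
n = sample_size_per_arm(0.10, 0.11)
print(n)  # roughly 15,000 users per arm
```

The point of the sketch: a one-percentage-point effect at these rates needs on the order of fifteen thousand users per arm, which is exactly the kind of duration-and-traffic math the role owns before an experiment ships.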
Where This Role Grows
The trajectory from data scientist to ML engineer on this team is explicit, not aspirational. As the analytical foundation matures, the work shifts:
- Feature engineering for recommendations. Evaluate which new signals (voting history, explicit ratings, content metadata, theatrical engagement) improve recipe performance in AWS Personalize. Graduate from analyzing features to building them.
- Model prototyping and evaluation. Prototype recommendation approaches (content-based filtering, hybrid models, embeddings) and evaluate them against the golden eval set you built in your first months.
- Owning a model from experimentation to deployment. When the team outgrows Personalize, you’ll take a model from notebook to production: writing testable Python, managing data lifecycles (pipelines, feature stores, monitoring, retraining), and thinking about systems design (latency, failure modes, observability).
The timing of this transition depends on the work, not a calendar. You won’t be pushed into model building before the foundation is solid, and you won’t be held back once it is.
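As a flavor of the prototyping work described above, here is a toy content-based filtering sketch: rank unwatched titles by cosine similarity between their metadata tags and a member's watch history. Every title, tag, and score below is invented for illustration and has no connection to the actual Angel Studios catalog or models.

```python
from collections import Counter
from math import sqrt

# Invented catalog: title -> metadata tags (purely illustrative)
CATALOG = {
    "title_a": ["drama", "family", "period"],
    "title_b": ["drama", "faith", "family"],
    "title_c": ["comedy", "animated", "family"],
    "title_d": ["documentary", "history"],
}

def cosine(u, v):
    """Cosine similarity between two tag-count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def recommend(watched, k=2):
    """Rank unwatched titles by similarity to the member's history."""
    profile = Counter(tag for t in watched for tag in CATALOG[t])
    scores = {
        t: cosine(profile, Counter(tags))
        for t, tags in CATALOG.items()
        if t not in watched
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend(["title_a"]))  # ['title_b', 'title_c']
```

A real prototype would swap tag counts for learned embeddings and be scored offline against a golden eval set, but the shape of the problem is the same.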
What You Bring
- Statistical rigor. You design experiments correctly: power analysis, multiple comparisons, confidence intervals, Bayesian methods where appropriate. You can explain to a non-technical stakeholder why a result is or isn’t significant.
- Causal inference chops. You’ve worked with observational data where naive correlations are misleading. You’re familiar with propensity score matching, difference-in-differences, instrumental variables, or regression discontinuity, and you know when to reach for them.
- SQL and Python fluency. SQL is your first language for data exploration. Python for analysis, modeling, and automation. Your code is clean enough that someone else can read it six months later.
- Experimentation design and analysis. You’ve designed, run, and analyzed A/B tests in production. You understand interaction effects, novelty effects, and Simpson’s paradox.
- Communication. You translate complex analysis into clear narratives. Stakeholders trust your conclusions because you show your reasoning, name your assumptions, and flag what you don’t know.
- Data modeling. Experience with dbt or equivalent transformation frameworks. You’ve built analytical data models that other teams actually use.
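The causal-inference bullet above can be made concrete with the simplest of the listed designs, difference-in-differences: net out the trend both groups share to isolate the effect of a change. All numbers are invented for illustration.

```python
# Toy difference-in-differences: did a recommendation-row change lift
# weekly watch hours, beyond the trend both groups share?
# Mean weekly watch hours (pre-change, post-change); invented data.
exposed = {"pre": 4.0, "post": 5.1}   # members who saw the new row
control = {"pre": 3.8, "post": 4.3}   # comparable members who did not

naive_lift = exposed["post"] - exposed["pre"]        # 1.1 hours
background_trend = control["post"] - control["pre"]  # 0.5 hours

# The naive lift overstates the effect; subtract the shared trend.
did_estimate = naive_lift - background_trend         # 0.6 hours

print(round(did_estimate, 2))  # 0.6
```

A real analysis would also check the parallel-trends assumption and compute standard errors; the sketch only shows why the naive before/after comparison overstates the effect.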
Signals you’re on the MLE trajectory
Not requirements for day one, but what tells us you’ll grow into model ownership:
- You write Python like a software engineer, not just a notebook user: tests, packaging, code reviews.
- You’ve thought about what happens after an analysis becomes a model: data pipelines, feature generation, monitoring, retraining.
- You’re curious about systems design for ML features: latency, throughput, failure modes, observability.
- You’ve touched some part of the lifecycle around a deployed model, even if it wasn’t your primary job.
Experience
- 6+ years as a data scientist or in a senior analytical role.
- Experience with large-scale user engagement and behavior data. Streaming, entertainment, marketplace, or consumer subscription domains preferred.
- Track record of defining metrics frameworks that stakeholders actually adopted.
- Familiarity with modern data tools: dbt, data warehousing (Snowflake, BigQuery, Redshift), experimentation platforms (GrowthBook, Optimizely), BI tools (Rill, Looker).
- Experience with recommendation systems or personalization is a strong plus, not a prerequisite.
The Problem Space
- A catalog of roughly 1,100 titles that has grown 10x in two years, with heavy top-title concentration. The top 10% of titles drive the majority of watch hours. Discovery needs to surface the long tail.
- Three platforms (TV, mobile, web) with starkly different engagement patterns. TV drives the highest engagement but is mostly single-title sessions. Mobile has more browsing behavior. Web is underserved.
- A recommendation system (AWS Personalize) that shows strong retention signal for engaged users but has significant precision and coverage gaps to close.
- A unique data asset in Guild voting behavior. Members vote on what gets produced and funded before they ever watch it. That signal may be the most differentiated input the recommendation system has.
- A content model unlike general streaming: faith-friendly, owned IP, theatrical-to-streaming pipeline. What “good discovery” means here is genuinely different from Netflix or Spotify.
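One common way to quantify the "coverage gap" named above is catalog coverage: the fraction of the catalog that ever appears in anyone's recommendation slate. A toy sketch, with catalog size and slates invented for illustration (the real catalog is ~1,100 titles):

```python
# Toy catalog-coverage metric: what share of the catalog ever appears
# in members' recommendation slates? Low coverage means the long tail
# is invisible to Discovery. All data below is invented.
catalog = {f"title_{i}" for i in range(20)}

# Recommendation slates served to three members (invented)
slates = [
    ["title_0", "title_1", "title_2"],
    ["title_0", "title_1", "title_3"],
    ["title_0", "title_2", "title_4"],
]

recommended = {t for slate in slates for t in slate}
coverage = len(recommended & catalog) / len(catalog)
print(f"catalog coverage: {coverage:.0%}")  # 25%: 5 of 20 titles
```

Pairing a coverage metric like this with per-title watch-hour concentration is one way to track whether the long tail is actually being surfaced.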