NFL data ELT tool.

NFL-DATA

NFL data extraction, load, and transformation from the ESPN NFL API using Python and dbt.

Project setup

Set up and activate a virtual environment with the following commands:

python -m venv venv

# Windows
./venv/Scripts/activate

# Linux/macOS
source venv/bin/activate

Install all the necessary packages using the following command:

python -m pip install -r requirements.txt

Environment variables

Secrets are managed with Infisical. Install the Infisical CLI to inject secrets into commands at runtime.

Otherwise, create a .env file that defines the required variables. See .env.example for a template.

Command reference

NFL-DATA CLI

Infisical and secrets

Using Infisical for secrets management is recommended. Commands can be wrapped in an Infisical call:

infisical run --command "python app.py ..."
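If the wrapping needs to happen from a script rather than by hand, the same call can be assembled in Python. The sketch below is only an illustration: the helper name `infisical_command` is hypothetical and not part of this project, and it assumes the `infisical` CLI is installed and on the PATH.

```python
import shlex


def infisical_command(cli_args):
    """Build the argv list for running app.py under `infisical run`.

    Hypothetical helper, not part of NFL-DATA.
    """
    # shlex.join applies POSIX shell quoting so the inner command
    # survives being passed as a single --command string.
    inner = shlex.join(["python", "app.py", *cli_args])
    return ["infisical", "run", "--command", inner]
```

The returned list can be handed to `subprocess.run`, for example `subprocess.run(infisical_command(["load-season", "2024"]), check=True)`. Note that `shlex.join` uses POSIX quoting rules, so arguments containing spaces may need different handling on Windows shells.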

Load strategy

All commands accept the --load-strategy (or -ls) option, which specifies how NFL-DATA loads new JSON data into the warehouse:

  • replace clears out all previously loaded JSON data before inserting.
  • add loads JSON data without any additional checks. This can introduce multiple entries for the same object.
  • skip loads JSON data only if no other data exists for a given object.
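The three strategies can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the project's loader: the `load` function and the dictionary standing in for the warehouse are hypothetical, and `replace` is assumed here to clear only the data previously loaded for the same object.

```python
def load(store, object_id, payload, strategy):
    """Apply a load strategy to a hypothetical in-memory store.

    `store` maps an object id to the list of JSON payloads loaded for it.
    Returns the number of payloads held for the object afterwards.
    """
    existing = store.setdefault(object_id, [])
    if strategy == "replace":
        # Assumption: only this object's earlier payloads are cleared.
        existing.clear()
        existing.append(payload)
    elif strategy == "add":
        # No checks at all, so duplicates accumulate.
        existing.append(payload)
    elif strategy == "skip":
        # Load only when nothing has been stored for this object yet.
        if not existing:
            existing.append(payload)
    else:
        raise ValueError(f"unknown load strategy: {strategy!r}")
    return len(existing)
```

Loading the same object twice with add leaves two entries, a following skip leaves them untouched, and a following replace leaves exactly one.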

Commands

python app.py game-day # Game day run: pull today's schedule and load data for completed games.
python app.py load-season [year] # Load season data
python app.py load-event [event-id] # Load event, drives, and roster data
python app.py load-venue [venue-id|all] # Load venue data; will load all venues when venue-id is unspecified
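For readers curious how a command layout like this could be parsed, here is a minimal sketch using the standard library's `argparse`. It is an assumption that the real app.py works this way; it may use a different CLI library. The sketch also assumes that `year` and `event-id` are required, and it leaves the load strategy default unset because the default is not documented here.

```python
import argparse

STRATEGIES = ("replace", "add", "skip")


def build_parser():
    """Hypothetical parser mirroring the commands listed above."""
    # Shared options live on a parent parser so every subcommand accepts them.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument(
        "-ls", "--load-strategy",
        choices=STRATEGIES,
        default=None,  # the project's actual default is not documented here
    )

    parser = argparse.ArgumentParser(prog="app.py")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("game-day", parents=[common])

    season = sub.add_parser("load-season", parents=[common])
    season.add_argument("year", type=int)  # assumed required

    event = sub.add_parser("load-event", parents=[common])
    event.add_argument("event_id")  # assumed required

    venue = sub.add_parser("load-venue", parents=[common])
    # An omitted venue id falls back to "all", matching the comment above.
    venue.add_argument("venue_id", nargs="?", default="all")

    return parser
```

With this sketch, `build_parser().parse_args(["load-venue"])` yields a `venue_id` of `"all"`, and `-ls skip` can follow any subcommand.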

dbt

To use dbt commands, first change directory into the dbt subdirectory.

cd dbt/

From here, all dbt commands can be accessed. Take a look at the dbt Command Reference page for details.

Typically, a full build is the command of choice when using this project. The game-day CLI command also runs a full build after all API extract and load tasks have finished:

dbt build
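A script that needs to trigger the same build after its own load step could shell out to dbt from the dbt subdirectory. The following is a sketch only: `run_dbt_build` is a hypothetical helper, not the project's code, and it assumes the `dbt` executable is available in the active virtual environment.

```python
import subprocess


def run_dbt_build(project_dir="dbt", runner=subprocess.run):
    """Run `dbt build` inside the dbt project directory.

    `runner` is injectable so the call can be exercised without dbt installed.
    """
    # cwd plays the role of `cd dbt/`; check=True raises if the build fails.
    return runner(["dbt", "build"], cwd=project_dir, check=True)
```

Passing `cwd` avoids changing the working directory of the calling process, so any load code that runs before the build keeps its own relative paths.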

API References

About raw NFL data

Event states

Events can be in one of the following known states:

  1. STATUS_SCHEDULED for events that are upcoming.
  2. STATUS_FINAL for completed events.
  3. STATUS_IN_PROGRESS for events in progress.
  4. STATUS_HALFTIME for events in halftime.
  5. STATUS_POSTPONED for events that have been cancelled or rescheduled.
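
Code that consumes these states can model them as an enumeration. The sketch below is illustrative: the `EventStatus` and `is_completed` names are hypothetical, and treating any unrecognized state as not completed is an assumption made for safety, since only the states listed above are known.

```python
from enum import Enum


class EventStatus(str, Enum):
    """Known event states, as listed above."""
    SCHEDULED = "STATUS_SCHEDULED"
    FINAL = "STATUS_FINAL"
    IN_PROGRESS = "STATUS_IN_PROGRESS"
    HALFTIME = "STATUS_HALFTIME"
    POSTPONED = "STATUS_POSTPONED"


def is_completed(status_name):
    """Return True only for events whose state is STATUS_FINAL."""
    try:
        return EventStatus(status_name) is EventStatus.FINAL
    except ValueError:
        # Assumption: an unknown state is never treated as completed.
        return False
```

A check like this is the kind of filter a game day run would need, since that command loads data only for completed games.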