NFL-DATA
Extract, load, and transform NFL data from the ESPN NFL API using Python and dbt.
Project setup
Set up and activate a virtual environment with the following commands:
python -m venv venv
# Windows
./venv/Scripts/activate
# Linux
source venv/bin/activate
Install all the necessary packages using the following command:
python -m pip install -r requirements.txt
Environment variables
Secrets are managed with Infisical. Install the Infisical CLI to inject secrets into commands.
Otherwise, use a .env file to specify the required variables. See .env.example.
Command reference
NFL-DATA CLI
Infisical and secrets
Using Infisical for secrets management is recommended. Commands can be wrapped in an Infisical call:
infisical run --command "python app.py ..."
Load strategy
All commands accept the --load-strategy (or -ls) option, which specifies how NFL-DATA loads new JSON data into the warehouse:
replace clears out all previously loaded JSON data before inserting.
add loads JSON data without any additional checks. This will introduce multiple entries for the same object.
skip loads JSON data only if no other data exists for a given object.
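The three strategies can be illustrated with a minimal sketch. It is not the project's actual loader: the `LoadStrategy` enum, the `load_json` function, and the in-memory `store` dictionary standing in for the warehouse are all hypothetical, and the sketch assumes that replace clears data per object rather than for the whole table.

```python
from enum import Enum


class LoadStrategy(Enum):
    """Hypothetical names mirroring the --load-strategy values."""
    REPLACE = "replace"
    ADD = "add"
    SKIP = "skip"


def load_json(store, object_id, payload, strategy):
    """Apply a load strategy to an in-memory stand-in for the warehouse.

    `store` maps an object ID to the list of JSON payloads loaded for it.
    Returns True if the payload was inserted, False if it was skipped.
    """
    existing = store.setdefault(object_id, [])
    if strategy is LoadStrategy.REPLACE:
        # Assumption: only this object's previously loaded data is cleared.
        existing.clear()
    elif strategy is LoadStrategy.SKIP and existing:
        # Data already exists for this object, so nothing is loaded.
        return False
    # ADD falls through with no checks, so duplicates are possible.
    existing.append(payload)
    return True
```

With this sketch, loading the same object twice with `LoadStrategy.ADD` leaves two entries, `LoadStrategy.SKIP` then leaves them untouched, and `LoadStrategy.REPLACE` leaves only the newest payload.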
Commands
python app.py game-day # Game day run: pull today's schedule and load data for completed games.
python app.py load-season [year] # Load season data
python app.py load-event [event-id] # Load event, drives, and roster data
python app.py load-venue [venue-id|all] # Load venue data; will load all venues when venue-id is unspecified
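When these commands are scripted, for example from a scheduler, the argument list can be assembled in Python. The sketch below is one possible helper, not part of NFL-DATA: `build_command` is a hypothetical name, and placing `--load-strategy` after the positional arguments is an assumption about how the CLI parses its options.

```python
import shlex


def build_command(subcommand, *args, load_strategy=None, use_infisical=False):
    """Return the argv list for one NFL-DATA CLI call (hypothetical helper)."""
    argv = ["python", "app.py", subcommand, *map(str, args)]
    if load_strategy is not None:
        # Assumption: the option is accepted after the positional arguments.
        argv += ["--load-strategy", load_strategy]
    if use_infisical:
        # `infisical run --command` takes the wrapped command as one string.
        return ["infisical", "run", "--command", shlex.join(argv)]
    return argv
```

The returned list can be passed to `subprocess.run(argv, check=True)` from the repository root.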
dbt
To use dbt commands, first change directory into the dbt subdirectory.
cd dbt/
From here, all dbt commands can be accessed. See the dbt Command Reference page for details.
Typically, a full build is the command of choice when using this project. The daily run CLI command runs a full build after all API extract and load tasks:
dbt build
API References
About raw NFL data
Event states
Events can be in one of the following known states:
STATUS_SCHEDULED for events that are upcoming.
STATUS_FINAL for completed events.
STATUS_IN_PROGRESS for events in progress.
STATUS_HALFTIME for events in halftime.
STATUS_POSTPONED for events that have been cancelled or rescheduled.
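Code that consumes the raw data may find it convenient to model these states explicitly. The following is a minimal sketch: `EventState`, `parse_event_state`, and `is_completed` are hypothetical names, not part of the project. Because the list above covers only the known states, unrecognized values map to `None` instead of raising.

```python
from enum import Enum


class EventState(Enum):
    """Known event states, as listed above."""
    SCHEDULED = "STATUS_SCHEDULED"
    FINAL = "STATUS_FINAL"
    IN_PROGRESS = "STATUS_IN_PROGRESS"
    HALFTIME = "STATUS_HALFTIME"
    POSTPONED = "STATUS_POSTPONED"


def parse_event_state(raw):
    """Return the matching EventState, or None for an unrecognized value."""
    try:
        return EventState(raw)
    except ValueError:
        return None


def is_completed(raw):
    """True only for events whose state is STATUS_FINAL."""
    return parse_event_state(raw) is EventState.FINAL
```

A check like `is_completed` is the kind of filter a game-day run would need in order to load data only for completed games.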