This dbt project implements a modern ELT pipeline to transform raw job posting data into actionable insights. It focuses on scoring job listings based on title relevance and required technical skills (SQL, Python, dbt).
**jobsjumble_sliced**: Raw job postings data including titles, companies, locations, and full descriptions.
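As a minimal sketch, the seed can be declared in `dbt_project.yml` so dbt loads the CSV with explicit column types (the project name `job_scoring` and the column types shown are assumptions; adjust to your project):

```yaml
# dbt_project.yml (fragment) — project name and column types are assumptions
seeds:
  job_scoring:
    jobsjumble_sliced:
      +column_types:
        job_title: varchar
        company_name: varchar
        job_description: varchar
```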
**stg_job_postings**
- Description: Standardizes raw seed data.
- Operations: Renames columns for clarity, cleans whitespace.
- Tests: `not_null` checks on `job_title` and `company_name`.
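The `not_null` tests above can be declared in a schema file next to the model; a minimal sketch (the file path and `version: 2` layout follow dbt conventions, the model/column names come from this README):

```yaml
# models/staging/schema.yml — minimal sketch
version: 2

models:
  - name: stg_job_postings
    columns:
      - name: job_title
        tests:
          - not_null
      - name: company_name
        tests:
          - not_null
```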
**fct_job_scoring**
- Description: The core fact table that calculates a `relevance_score` for each posting.
- Logic:
  - Title Scoring: Assigns points for specific keywords like 'Analyst', 'Engineer', and 'Senior'.
  - Skill Scoring: Parses the `job_description` for technical keywords like 'SQL', 'Python', and 'dbt'.
- Materialization: Table.
- Tests: `not_null` check on `relevance_score`.
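The scoring logic could look roughly like the following (an illustrative sketch, not the project's exact model: the point values, `ilike` operator — Postgres/Snowflake syntax — and any column names beyond `job_title`, `job_description`, and `relevance_score` are assumptions):

```sql
-- models/marts/fct_job_scoring.sql — illustrative sketch; point values are assumptions
{{ config(materialized='table') }}

with staged as (
    select * from {{ ref('stg_job_postings') }}
)

select
    *,
    -- Title scoring: points for role keywords
      (case when job_title ilike '%analyst%'  then 10 else 0 end)
    + (case when job_title ilike '%engineer%' then 10 else 0 end)
    + (case when job_title ilike '%senior%'   then 5  else 0 end)
    -- Skill scoring: points for technical keywords in the description
    + (case when job_description ilike '%sql%'    then 5 else 0 end)
    + (case when job_description ilike '%python%' then 5 else 0 end)
    + (case when job_description ilike '%dbt%'    then 5 else 0 end)
    as relevance_score
from staged
```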
**job_postings_snapshot**
- Strategy: `check` on `relevance_score`.
- Purpose: Tracks how job scoring changes over time as data is updated or refined.
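A snapshot with the `check` strategy on `relevance_score` can be sketched as follows (the `unique_key`, `target_schema`, and source column `job_id` are assumptions):

```sql
-- snapshots/job_postings_snapshot.sql — sketch; unique_key and target_schema are assumptions
{% snapshot job_postings_snapshot %}

{{
    config(
        target_schema='snapshots',
        unique_key='job_id',
        strategy='check',
        check_cols=['relevance_score']
    )
}}

select * from {{ ref('fct_job_scoring') }}

{% endsnapshot %}
```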
**select_state**: A utility macro for filtering data by state (included for legacy/utility demonstration).
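A macro like this might look as follows (a sketch only: the argument names and the `state` column are assumptions, since the README does not show the macro body):

```sql
-- macros/select_state.sql — sketch; signature and column name are assumptions
{% macro select_state(relation, state_value) %}
    select *
    from {{ relation }}
    where state = '{{ state_value }}'
{% endmacro %}
```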
- Automated Testing: Implemented schema tests to ensure data integrity.
- SCD Type 2 Modeling: Used snapshots to capture history of transformed data.
- Complex Transformations: Logic-driven scoring system using SQL `CASE` statements and string parsing.
- Layered Architecture: Separation of concerns between staging (cleaning) and marts (business logic).
Run the full pipeline using Docker:
```shell
# Seed, Run, Test, Snapshot
dbt seed && dbt run && dbt test && dbt snapshot
```