Inknyto/dbt_project

dbt Project: Job Market Intelligence Pipeline

Overview

This dbt project implements an ELT pipeline that transforms raw job posting data into a scored fact table. Each listing is scored on title relevance and on the technical skills it requires (SQL, Python, dbt).

Project Architecture

1. Data Ingestion (Seeds)

  • jobsjumble_sliced: Raw job postings data including titles, companies, locations, and full descriptions.

2. Staging Layer (models/staging/)

  • stg_job_postings:
    • Description: Standardizes raw seed data.
    • Operations: Renames columns for clarity, cleans whitespace.
    • Tests: not_null checks on job_title and company_name.
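The staging step can be sketched in a few lines. This is an illustration only: it uses SQLite from Python in place of the project's warehouse, and the raw column headers ("Job Title", "Company", and so on) are hypothetical, since the seed's actual headers are not listed here.

```python
import sqlite3

# Assumption: the raw seed uses these column headers; the real ones may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE jobsjumble_sliced '
    '("Job Title" TEXT, "Company" TEXT, "Location" TEXT, "Description" TEXT)'
)
conn.executemany(
    "INSERT INTO jobsjumble_sliced VALUES (?, ?, ?, ?)",
    [
        ("  Senior Data Analyst ", " Acme Corp", "Austin, TX ", "SQL and Python required."),
        ("Data Engineer", "Globex  ", " Remote", "  Experience with dbt. "),
    ],
)

# Staging query: rename columns to snake_case and trim surrounding spaces.
STAGING_SQL = """
SELECT
    TRIM("Job Title")   AS job_title,
    TRIM("Company")     AS company_name,
    TRIM("Location")    AS location,
    TRIM("Description") AS job_description
FROM jobsjumble_sliced
"""
staged = conn.execute(STAGING_SQL).fetchall()

# Stand-in for the not_null tests: count rows with a missing title or company.
null_failures = sum(
    1 for job_title, company_name, _, _ in staged
    if job_title is None or company_name is None
)
```

In the dbt model itself the FROM clause would typically reference the seed through `ref()`, and the not_null checks would be declared as schema tests rather than computed by hand.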

3. Marts Layer (models/marts/)

  • fct_job_scoring:
    • Description: The core fact table that calculates a relevance_score for each posting.
    • Logic:
      • Title Scoring: Assigns points for specific keywords like 'Analyst', 'Engineer', and 'Senior'.
      • Skill Scoring: Parses the job_description for technical keywords like 'SQL', 'Python', and 'dbt'.
    • Materialization: Table.
    • Tests: not_null check on relevance_score.
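As a rough sketch of the scoring logic described above, the query below sums one CASE expression per keyword. The point values are assumptions chosen for illustration (the weights used by fct_job_scoring are not stated here), and SQLite stands in for the project's warehouse.

```python
import sqlite3

# Assumption: hypothetical weights (2 for role keywords, 1 for 'Senior' and each skill).
SCORING_SQL = """
SELECT
    job_title,
    (CASE WHEN LOWER(job_title) LIKE '%analyst%'  THEN 2 ELSE 0 END
   + CASE WHEN LOWER(job_title) LIKE '%engineer%' THEN 2 ELSE 0 END
   + CASE WHEN LOWER(job_title) LIKE '%senior%'   THEN 1 ELSE 0 END
   + CASE WHEN LOWER(job_description) LIKE '%sql%'    THEN 1 ELSE 0 END
   + CASE WHEN LOWER(job_description) LIKE '%python%' THEN 1 ELSE 0 END
   + CASE WHEN LOWER(job_description) LIKE '%dbt%'    THEN 1 ELSE 0 END
    ) AS relevance_score
FROM stg_job_postings
ORDER BY relevance_score DESC, job_title
"""


def score_postings(rows):
    """Load (job_title, job_description) rows and return (job_title, relevance_score)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stg_job_postings (job_title TEXT, job_description TEXT)")
    conn.executemany("INSERT INTO stg_job_postings VALUES (?, ?)", rows)
    return conn.execute(SCORING_SQL).fetchall()


scores = score_postings([
    ("Senior Data Analyst", "Strong SQL and Python skills."),
    ("Data Engineer", "Build dbt models on top of SQL."),
    ("Office Manager", "Scheduling and vendor management."),
])
# scores == [("Senior Data Analyst", 5), ("Data Engineer", 4), ("Office Manager", 0)]
```

Substring matching of this kind is simple but loose: a pattern such as '%sql%' also matches 'NoSQL' or 'MySQL', so a stricter variant might match on word boundaries instead.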

4. Data Reliability (Snapshots)

  • job_postings_snapshot:
    • Strategy: check, comparing relevance_score between snapshot runs.
    • Purpose: Tracks how job scoring changes over time as data is updated or refined.
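To make the check strategy concrete, here is a simplified pure-Python model of what one snapshot run does: when a tracked column changes, the open record is closed and a new version is appended. The `job_id` key is a hypothetical unique key (the snapshot's actual key is not stated here); `dbt_valid_from` and `dbt_valid_to` are the validity columns dbt adds to snapshot tables. Deleted rows are not handled in this sketch.

```python
from datetime import datetime


def apply_check_snapshot(history, current, run_at, key="job_id", check_col="relevance_score"):
    """Return the snapshot history after one run of a check-style comparison.

    history: rows with key, check_col, dbt_valid_from, dbt_valid_to
    current: rows with key and check_col (the table as it looks now)
    """
    new_history = [dict(row) for row in history]
    # The open (current) version of each key has no end date yet.
    open_rows = {row[key]: row for row in new_history if row["dbt_valid_to"] is None}
    for rec in current:
        existing = open_rows.get(rec[key])
        if existing is not None and existing[check_col] == rec[check_col]:
            continue  # tracked column unchanged: keep the open version
        if existing is not None:
            existing["dbt_valid_to"] = run_at  # close the superseded version
        new_history.append({
            key: rec[key],
            check_col: rec[check_col],
            "dbt_valid_from": run_at,
            "dbt_valid_to": None,
        })
    return new_history


t1, t2 = datetime(2024, 1, 1), datetime(2024, 2, 1)
history = apply_check_snapshot(
    [], [{"job_id": 1, "relevance_score": 3}, {"job_id": 2, "relevance_score": 5}], t1
)
history = apply_check_snapshot(
    history, [{"job_id": 1, "relevance_score": 4}, {"job_id": 2, "relevance_score": 5}], t2
)
```

After the second run, posting 1 has two versions (the old score closed at `t2`, the new score open), while posting 2 still has a single open row because its score did not change.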

Macros

  • select_state: A utility macro for filtering data by state, kept as a legacy example of macro usage.

Key Features for Resume

  • Automated Testing: Implemented schema tests to ensure data integrity.
  • SCD Type 2 Modeling: Used snapshots to capture history of transformed data.
  • Complex Transformations: Rule-based scoring system built with SQL CASE statements and string parsing.
  • Layered Architecture: Separation of concerns between staging (cleaning) and marts (business logic).

Execution

Run the full pipeline using Docker:

# Load seeds, build models, run tests, then take the snapshot
dbt seed && dbt run && dbt test && dbt snapshot
