Learning Progress

Track your learning journey here. Link to portfolio tasks as you complete them.


Module 1: Search Fundamentals

1.1 TF-IDF from Scratch

  • Status: Not Started
  • Started:
  • Completed:
  • Key Learnings:

  • Can I explain:
    • Why stopwords like "the" get low scores?
    • How IDF is calculated and why log is used?
    • How cosine similarity works for sparse vectors?
  • Notes:
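
As a starting point for the questions above, here is a minimal sketch of IDF and sparse cosine similarity. The toy corpus and the function names `idf` and `cosine` are assumptions made for illustration, and it uses the plain `log(N / df)` form of IDF; real libraries often add smoothing.

```python
import math
from collections import Counter

def idf(docs):
    """Inverse document frequency, log(N / df), for every term in the corpus."""
    n = len(docs)
    # Count each term once per document it appears in.
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) for term, count in df.items()}

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {term: weight} dicts."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical toy corpus, already tokenized.
docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "barked"],
    ["the", "cat", "purred"],
]
weights = idf(docs)
# TF-IDF vectors with raw term counts as TF.
vectors = [{t: c * weights[t] for t, c in Counter(d).items()} for d in docs]
```

Because "the" appears in every document, its IDF is `log(3 / 3) = 0`, so it contributes nothing to any vector. The log also dampens the gap between rare and very rare terms.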

1.2 BM25 Implementation

  • Status: Not Started
  • Started:
  • Completed:
  • Key Learnings:

  • Can I explain:
    • How BM25 improves on TF-IDF?
    • What k1 controls and when to increase it?
    • What b controls and when to decrease it?
    • Why document length normalization matters?
  • Notes:
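
To make the roles of `k1` and `b` concrete, here is a hedged sketch of a BM25 scorer. The function name `bm25_score`, the defaults `k1=1.5` and `b=0.75`, and the toy corpus are assumptions; the IDF uses the non-negative `log(1 + ...)` variant, which is one of several in common use.

```python
import math
from collections import Counter

def bm25_score(query, doc, docs, k1=1.5, b=0.75):
    """Score one tokenized document against a tokenized query with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    tf = Counter(doc)
    # Length normalization: b=0 ignores length, b=1 normalizes fully.
    norm = 1 - b + b * len(doc) / avgdl
    score = 0.0
    for term in query:
        df = sum(1 for d in docs if term in d)
        if df == 0:
            continue  # term never seen in the corpus
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
        # Term frequency saturates: larger k1 lets repeated terms keep adding score.
        score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * norm)
    return score

# Hypothetical corpus: one short and one long document that both mention "cat" once.
docs = [["cat", "sat"], ["cat"] + ["filler"] * 8, ["dog", "ran"]]
short_score = bm25_score(["cat"], docs[0], docs)
long_score = bm25_score(["cat"], docs[1], docs)
```

With the default `b`, the short document scores higher than the long one for the same term count. Setting `b=0` removes that difference, which can be worth trying when document length carries no signal about relevance.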

1.3 Embeddings & Semantic Search

  • Status: Not Started
  • Started:
  • Completed:
  • Key Learnings:

  • Can I explain:
    • How text becomes a vector (conceptually)?
    • The difference between cosine similarity, dot product, and Euclidean distance?
    • When semantic search beats keyword search?
    • When keyword search beats semantic search?
    • How to combine them (hybrid)?
  • Notes:

1.4 Ranking Metrics

  • Status: Not Started
  • Started:
  • Completed:
  • Key Learnings:

  • Can I explain:
    • What Precision@k measures and its limits?
    • What Recall@k measures?
    • What MRR (Mean Reciprocal Rank) captures?
    • How DCG works and why position matters?
    • Why we normalize to get NDCG?
    • How to create relevance judgments?
  • Notes:
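
The metrics in this list can be sketched in a few lines. This is an illustrative version under stated assumptions: `rels` is a hypothetical list of graded relevance judgments in ranked order, DCG uses linear gain (`rel / log2(rank + 1)`) rather than the exponential `2^rel - 1` variant, and the ideal ranking for NDCG is built only from the judgments in `rels`.

```python
import math

def precision_at_k(rels, k):
    """Fraction of the top k results that are relevant (rel > 0)."""
    return sum(1 for r in rels[:k] if r > 0) / k

def reciprocal_rank(rels):
    """1 / rank of the first relevant result; average over queries to get MRR."""
    for rank, r in enumerate(rels, start=1):
        if r > 0:
            return 1 / rank
    return 0.0

def dcg_at_k(rels, k):
    """Discounted cumulative gain: relevance is discounted by log2 of position."""
    return sum(r / math.log2(rank + 1) for rank, r in enumerate(rels[:k], start=1))

def ndcg_at_k(rels, k):
    """DCG divided by the DCG of the best possible ordering of the same judgments."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal else 0.0

# Hypothetical judgments for one query: 0 = not relevant, 2 = highly relevant.
rels = [0, 2, 1, 0]
```

Precision@k ignores both ordering within the top k and graded relevance, which is the gap DCG fills. Normalizing by the ideal DCG makes scores comparable across queries with different numbers of relevant documents.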

1.5 Re-ranking Pipeline

  • Status: Not Started
  • Started:
  • Completed:
  • Key Learnings:

  • Can I explain:
    • Why two-stage retrieval (recall → precision)?
    • What signals matter beyond text relevance?
    • How to combine signals (linear weighting vs learning to rank)?
    • Trade-offs of different approaches?
  • Portfolio Link: → FasterShops/framework search re-ranking
  • Notes:

Module 2: Experimentation

2.1 Hypothesis Testing

  • Status: Not Started
  • Can I explain:
    • What null hypothesis and alternative hypothesis mean?
    • Type I vs Type II errors?
    • What p-value actually means (not "probability we're wrong")?
    • When to use Z-test vs t-test?
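
For conversion-rate experiments, the questions above often come down to a two-proportion z-test. The sketch below is one way to write it with only the standard library; the function name `two_proportion_z_test` and the example counts are assumptions, and it relies on samples large enough for the normal approximation to hold.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Under the null hypothesis both arms share one rate, so pool them.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Probability of a result at least this extreme if the null were true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 100/1000 conversions in control, 150/1000 in treatment.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
```

The p-value is computed assuming the null hypothesis is true, which is why it cannot be read as the probability that the null is true or that the conclusion is wrong.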

2.2 Sample Size Calculation

  • Status: Not Started
  • Can I explain:
    • What statistical power is and why 80% is the conventional target?
    • How the minimum detectable effect (MDE) affects sample size?
    • How baseline rate affects sample size?
    • How to answer "how long to run this test?"
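
A rough sketch of the sample size calculation for comparing two conversion rates is shown below. It uses the standard normal-approximation formula for a two-sided test; the function names `sample_size_per_arm` and `days_to_run`, the absolute MDE parameter, and the traffic figure are assumptions for illustration.

```python
import math
from statistics import NormalDist

def sample_size_per_arm(baseline, mde_abs, alpha=0.05, power=0.80):
    """Users needed per arm to detect an absolute lift of mde_abs over baseline."""
    p1, p2 = baseline, baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)

def days_to_run(n_per_arm, daily_users, arms=2):
    """Days needed if daily_users are split evenly across all arms."""
    return math.ceil(n_per_arm * arms / daily_users)

# Hypothetical experiment: 10% baseline, detect a 1 percentage point lift.
n = sample_size_per_arm(baseline=0.10, mde_abs=0.01)
duration = days_to_run(n, daily_users=2000)
```

The MDE appears squared in the denominator, so halving it roughly quadruples the required sample. In practice the duration is usually rounded up to whole weeks to cover weekly seasonality.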

2.3 Experiment Analysis

  • Status: Not Started
  • Can I explain:
    • What "peeking" is and why it's a problem?
    • Simpson's paradox and segmentation risks?
    • What guardrail metrics are?
  • Portfolio Link: → FasterShops/framework analytics platform

Module 3: Agent Patterns

3.1 RAG from Scratch

  • Status: Not Started
  • Can I explain:
    • Different chunking strategies and tradeoffs?
    • Context window management?
    • When RAG fails?

3.2 Tool Calling Deep Dive

  • Status: Not Started
  • Can I explain:
    • How to design tool schemas?
    • Error handling in tool chains?
    • Parallel vs sequential execution?

3.3 Evaluation & Evals

  • Status: Not Started
  • Can I explain:
    • Deterministic vs LLM-as-judge evals?
    • How to create test sets?
    • How to measure if a prompt change helped?

3.4 Guardrails & Safety

  • Status: Not Started
  • Can I explain:
    • Common prompt injection patterns?
    • Content filtering approaches?
    • Rate limiting strategies?
  • Portfolio Link: → FasterShops/framework AI merchant assistant

Interview Prep Checklist

Before interviews, review this checklist:

  • Can explain TF-IDF vs BM25 vs semantic search tradeoffs
  • Can whiteboard a re-ranking pipeline
  • Can calculate sample size for an A/B test
  • Can explain p-values correctly
  • Can design tool schemas for new use cases
  • Can discuss RAG chunking tradeoffs
  • Can explain guardrails without oversimplifying