Track your learning journey here. Link to portfolio tasks as you complete them.
- Status: Not Started
- Started:
- Completed:
- Can I explain:
- Why stopwords like "the" get low scores?
- How IDF is calculated and why log is used?
- How cosine similarity works for sparse vectors?
- Notes:
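A minimal sketch for checking the three questions above, assuming unsmoothed IDF (`log(N / df)`) and length-normalized term frequency; real libraries usually add smoothing, so exact numbers will differ. The toy corpus and function names are illustrative only.

```python
import math
from collections import Counter

def idf(term, docs):
    """Unsmoothed IDF: log(N / df). A term found in every document scores 0."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df) if df else 0.0

def tfidf_vector(doc, docs):
    """Sparse TF-IDF vector as a dict: only terms present in the doc are stored."""
    counts = Counter(doc)
    return {t: (c / len(doc)) * idf(t, docs) for t, c in counts.items()}

def cosine(a, b):
    """Cosine similarity; only terms shared by both sparse vectors add to the dot product."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Toy corpus (assumed for illustration): "the" appears in every document.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]
vectors = [tfidf_vector(d, docs) for d in docs]
```

Here `idf("the", docs)` is zero because the term appears in all three documents, while the log keeps rare terms such as "bird" from dominating linearly.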
- Status: Not Started
- Started:
- Completed:
- Can I explain:
- How BM25 improves on TF-IDF?
- What k1 controls and when to increase it?
- What b controls and when to decrease it?
- Why document length normalization matters?
- Notes:
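A small sketch of BM25 scoring to experiment with `k1` and `b`, assuming the commonly used `log(1 + (N - df + 0.5) / (df + 0.5))` form of IDF and tokenized documents as lists of strings. Defaults of `k1=1.5` and `b=0.75` are typical starting values, not requirements.

```python
import math

def bm25_score(query, doc, docs, k1=1.5, b=0.75):
    """Score one tokenized doc against a tokenized query over a small corpus."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Length normalization: b=0 ignores document length, b=1 normalizes fully.
    norm = 1 - b + b * len(doc) / avgdl
    score = 0.0
    for term in query:
        df = sum(1 for d in docs if term in d)
        term_idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
        tf = doc.count(term)
        # Term-frequency saturation: higher k1 lets repeated terms keep adding score.
        score += term_idf * tf * (k1 + 1) / (tf + k1 * norm)
    return score
```

Comparing scores for the same query with `b=0` and `b=0.75` on a short and a long document is a quick way to see what length normalization does.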
- Status: Not Started
- Started:
- Completed:
- Can I explain:
- How text becomes a vector (conceptually)?
- Difference between cosine similarity, dot product, and Euclidean distance?
- When semantic search beats keyword search?
- When keyword search beats semantic?
- How to combine them (hybrid)?
- Notes:
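A sketch comparing the three vector measures, plus one possible way to combine keyword and semantic rankings. Reciprocal rank fusion (RRF) is used here as an assumed hybrid strategy, with `k=60` as a conventional constant; other fusion methods exist.

```python
import math
from collections import defaultdict

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Angle only: scaling either vector leaves the result unchanged."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    """Straight-line distance: sensitive to vector magnitude."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of doc ids; each list contributes 1 / (k + rank) per doc."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

For example, `[1, 2]` and `[2, 4]` point in the same direction, so their cosine similarity is 1 even though their Euclidean distance is not 0.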
- Status: Not Started
- Started:
- Completed:
- Can I explain:
- What Precision@k measures and its limits?
- What Recall@k measures?
- What MRR (Mean Reciprocal Rank) captures?
- How DCG works and why position matters?
- Why we normalize to get NDCG?
- How to create relevance judgments?
- Notes:
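The metrics above can be sketched in a few lines. This assumes binary relevance (a set of relevant ids) for Precision@k, Recall@k, and reciprocal rank, and graded gains for DCG. The ideal ordering for NDCG is taken from the gains of the returned list only, which is a simplification when some judged documents were never retrieved.

```python
import math

def precision_at_k(ranked, relevant, k):
    """Fraction of the top k results that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents found in the top k."""
    return sum(1 for d in ranked[:k] if d in relevant) / len(relevant)

def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant result; MRR is the mean over queries."""
    for position, d in enumerate(ranked, start=1):
        if d in relevant:
            return 1 / position
    return 0.0

def dcg(gains, k):
    """Graded gains discounted by log2(position + 1), so later positions count less."""
    return sum(g / math.log2(i + 1) for i, g in enumerate(gains[:k], start=1))

def ndcg(gains, k):
    """DCG divided by the DCG of the best possible ordering of the same gains."""
    ideal = dcg(sorted(gains, reverse=True), k)
    return dcg(gains, k) / ideal if ideal else 0.0
```

Normalizing by the ideal DCG puts queries with different numbers of relevant documents on the same 0 to 1 scale.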
- Status: Not Started
- Started:
- Completed:
- Can I explain:
- Why two-stage retrieval (recall → precision)?
- What signals matter beyond text relevance?
- How to combine signals (linear weighting vs learning to rank, LTR)?
- Tradeoffs of different approaches?
- Portfolio Link: → FasterShops/framework search re-ranking
- Notes:
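A toy sketch of two-stage ranking with a linear signal combination. The field names (`text_score`, `popularity`, `in_stock`) and the weights are hypothetical, chosen only to illustrate the idea; a learning-to-rank model would replace the hand-set weights with learned ones.

```python
def two_stage_rank(candidates, recall_size=3, weights=None):
    """Stage 1 keeps the top candidates by text score; stage 2 re-ranks them."""
    # Hypothetical hand-tuned weights; all signals are assumed scaled to [0, 1].
    weights = weights or {"text_score": 0.6, "popularity": 0.3, "in_stock": 0.1}

    # Stage 1 (recall): cheap filter on text relevance alone.
    shortlist = sorted(candidates, key=lambda c: c["text_score"], reverse=True)
    shortlist = shortlist[:recall_size]

    # Stage 2 (precision): weighted sum of all signals on the small shortlist.
    def combined(c):
        return sum(w * c[name] for name, w in weights.items())

    return sorted(shortlist, key=combined, reverse=True)

# Illustrative data, not taken from any real catalog.
products = [
    {"id": "A", "text_score": 0.9, "popularity": 0.1, "in_stock": 0.0},
    {"id": "B", "text_score": 0.8, "popularity": 0.9, "in_stock": 1.0},
    {"id": "C", "text_score": 0.7, "popularity": 0.5, "in_stock": 1.0},
    {"id": "D", "text_score": 0.2, "popularity": 1.0, "in_stock": 1.0},
]
ranked_ids = [p["id"] for p in two_stage_rank(products)]
```

Product D never reaches stage 2 despite its popularity, which shows why the recall stage has to be generous enough not to drop good candidates.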
- Status: Not Started
- Can I explain:
- What null hypothesis and alternative hypothesis mean?
- Type I vs Type II errors?
- What p-value actually means (not "probability we're wrong")?
- When to use Z-test vs t-test?
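A sketch of a two-sided, two-proportion Z-test for conversion rates, assuming samples large enough for the normal approximation. The function name and argument names are illustrative.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x_a, n_a, x_b, n_b):
    """Return (z, p_value) for H0: both arms share the same conversion rate."""
    p_a, p_b = x_a / n_a, x_b / n_b
    # Under H0 the best estimate of the shared rate pools both arms.
    pooled = (x_a + x_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: probability, if H0 is true, of a result at least this extreme.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value
```

The p-value is computed assuming the null hypothesis is true, which is why it cannot be read as the probability that the null hypothesis is true.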
- Status: Not Started
- Can I explain:
- What statistical power is and why 80%?
- How MDE (minimum detectable effect) affects sample size?
- How baseline rate affects sample size?
- How to answer "how long to run this test?"
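A sketch of the standard normal-approximation formula for sample size per arm when comparing two proportions, assuming a two-sided test and an absolute MDE. `days_to_run` assumes traffic is split evenly across arms and ignores weekly seasonality.

```python
import math
from statistics import NormalDist

def sample_size_per_arm(baseline, mde, alpha=0.05, power=0.80):
    """Users needed per arm to detect an absolute lift of `mde` over `baseline`."""
    p1, p2 = baseline, baseline + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / mde ** 2)

def days_to_run(n_per_arm, daily_visitors, arms=2):
    """Rough test duration given total daily traffic across all arms."""
    return math.ceil(n_per_arm * arms / daily_visitors)
```

Because the MDE is squared in the denominator, halving it roughly quadruples the required sample, and baselines closer to 50% need more users because their variance is higher.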
- Status: Not Started
- Can I explain:
- What "peeking" is and why it's a problem?
- Simpson's paradox and segmentation risks?
- What guardrail metrics are?
- Portfolio Link: → FasterShops/framework analytics platform
- Status: Not Started
- Can I explain:
- Different chunking strategies and tradeoffs?
- Context window management?
- When RAG fails?
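One of the simplest chunking strategies, fixed-size windows with overlap, can be sketched as follows. It assumes the text is already tokenized into a list; sentence-aware or structure-aware chunking would need different logic.

```python
def chunk_tokens(tokens, size, overlap):
    """Split tokens into windows of `size` that share `overlap` tokens with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        # Stop once a window reaches the end, to avoid a tiny trailing chunk.
        if start + size >= len(tokens):
            break
    return chunks
```

Larger overlap lowers the chance of splitting an answer across two chunks, at the cost of more chunks to embed and more duplicated text in the context window.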
- Status: Not Started
- Can I explain:
- How to design tool schemas?
- Error handling in tool chains?
- Parallel vs sequential execution?
- Status: Not Started
- Can I explain:
- Deterministic vs LLM-as-judge evals?
- How to create test sets?
- How to measure if a prompt change helped?
- Status: Not Started
- Can I explain:
- Common prompt injection patterns?
- Content filtering approaches?
- Rate limiting strategies?
- Portfolio Link: → FasterShops/framework AI merchant assistant
Before interviews, review this checklist:
- Can explain TF-IDF vs BM25 vs semantic search tradeoffs
- Can whiteboard a re-ranking pipeline
- Can calculate sample size for an A/B test
- Can explain p-values correctly
- Can design tool schemas for new use cases
- Can discuss RAG chunking tradeoffs
- Can explain guardrails without oversimplifying