|
1 | 1 | --- |
2 | 2 | title: "AIP-C01: Cheatsheet" |
3 | | -description: "Quick reference and decision rules for the AWS Certified Generative AI Developer – Professional exam" |
| 3 | +description: "One-page exam day reference for AIP-C01 AWS Certified Generative AI Developer – Professional" |
4 | 4 | head: |
5 | 5 | - - meta |
6 | 6 | - name: keywords |
7 | | - content: aip-c01, aws, bedrock, cheatsheet, quick reference, invokemodel, guardrails, rag, provisioned throughput, opensearch, vector store |
| 7 | + content: aip-c01, aws, bedrock, cheatsheet, quick reference, invokemodel, guardrails, rag, provisioned throughput, opensearch, vector store, exam day |
8 | 8 | --- |
9 | 9 |
|
10 | | -# AIP-C01 Cheatsheet |
| 10 | +# AIP-C01: Cheatsheet |
11 | 11 |
|
12 | | -[← Back to Overview](./index.md) |
| 12 | +[← Overview](./index.md) · [← Exam Guide](./exam-guide.md) |
13 | 13 |
|
14 | | -::: tip Quick Reference |
15 | | -Decision rules and service comparisons for last-minute review before the AIP-C01 exam. |
| 14 | +::: danger Exam Day Reference |
| 15 | +Review this page 5 minutes before the exam. |
16 | 16 | ::: |
17 | 17 |
|
18 | | -*Content coming soon as I progress through my studies.* |
| 18 | +--- |
| 19 | + |
| 20 | +## Foundation Model Quick Reference |
| 21 | + |
| 22 | +| FM | Vendor | Key Strength | Best For | |
| 23 | +|---|---|---|---| |
| 24 | +| **Claude** | Anthropic | 200k token context, reasoning | Long docs, complex reasoning | |
| 25 | +| **Llama** | Meta | Open-source, fine-tunable | Custom fine-tuning | |
| 26 | +| **Mistral** | Mistral AI | Efficient, fast | Cost-efficient inference | |
| 27 | +| **Titan** | AWS | AWS-native, embeddings | RAG embeddings, summarization | |
| 28 | +| **Cohere Embed** | Cohere | Multilingual embeddings | Multilingual RAG | |
| 29 | + |
| 30 | +--- |
| 31 | + |
| 32 | +## Bedrock API Comparison |
| 33 | + |
| 34 | +| API | Delivery | Use Case | |
| 35 | +|---|---|---| |
| 36 | +| `InvokeModel` | Synchronous (full response) | Simple query-response | |
| 37 | +| `InvokeModelWithResponseStream` | Streaming (token by token) | Low-latency UX / chat | |
| 38 | +| `InvokeAgent` | Streaming + trace | Multi-step agentic workflows | |
| 39 | + |
| 40 | +--- |
| 41 | + |
| 42 | +## Agents vs. Knowledge Bases |
| 43 | + |
| 44 | +| | Knowledge Base (RAG only) | Bedrock Agents | |
| 45 | +|---|---|---| |
| 46 | +| **External API calls** | No | Yes (via Action Groups + Lambda) | |
| 47 | +| **Multi-step reasoning** | No | Yes | |
| 48 | +| **Document retrieval** | Yes | Yes (optional) | |
| 49 | +| **Best for** | Static knowledge Q&A | Dynamic workflows, tool use | |
| 50 | + |
| 51 | +--- |
| 52 | + |
| 53 | +## Guardrails — Four Filter Types |
| 54 | + |
| 55 | +| Filter Type | What It Controls | |
| 56 | +|---|---| |
| 57 | +| **Content Filters** | Harmful categories: hate, violence, sexual content, insults | |
| 58 | +| **Denied Topics** | Topics the model must refuse to discuss | |
| 59 | +| **Word Filters** | Exact word/phrase blocklists | |
| 60 | +| **PII Redaction** | Names, emails, SSNs, credit cards — Redact or Block | |
| 61 | + |
| 62 | +**Key rules:** |
| 63 | +- Guardrails apply to **both inputs AND outputs** |
| 64 | +- Must be explicitly applied per API call via `guardrailIdentifier` + `guardrailVersion` |
| 65 | +- PII modes: **Redact** (mask with placeholder) vs. **Block** (reject request/response) |
| 66 | + |
| 67 | +--- |
| 68 | + |
| 69 | +## Vector Store Options |
| 70 | + |
| 71 | +| Store | Type | Use When | |
| 72 | +|---|---|---| |
| 73 | +| **Amazon OpenSearch Serverless** | Managed, serverless | Bedrock Knowledge Bases (default) | |
| 74 | +| **Aurora PostgreSQL (pgvector)** | RDS extension | Existing PostgreSQL infrastructure | |
| 75 | +| **Amazon Kendra** | Enterprise search | NLP-powered enterprise retrieval | |
| 76 | + |
| 77 | +--- |
| 78 | + |
| 79 | +## Chunking Strategies |
| 80 | + |
| 81 | +| Strategy | Best For | Trade-off | |
| 82 | +|---|---|---| |
| 83 | +| **Fixed-size** | Uniform documents | May break context at boundaries | |
| 84 | +| **Fixed-size + overlap** | Preserving cross-boundary context | Higher storage cost | |
| 85 | +| **Semantic** | Varied, long-form content | Higher processing complexity | |
| 86 | +| **Hierarchical** | Complex docs: broad + fine retrieval | More complex retrieval logic | |
| 87 | + |
| 88 | +--- |
| 89 | + |
| 90 | +## Provisioned Throughput vs. On-Demand |
| 91 | + |
| 92 | +| | Provisioned Throughput (PTU) | On-Demand | |
| 93 | +|---|---|---| |
| 94 | +| **Traffic** | Predictable, 24/7 | Sporadic, variable | |
| 95 | +| **Pricing** | Fixed (per MU/hour) | Per token | |
| 96 | +| **Commitment** | 1 month or 6 months | None | |
| 97 | +| **Best for** | Production steady-state | Dev/test, bursts | |
| 98 | + |
| 99 | +Batch Inference = ~50% cheaper than on-demand for non-real-time high-volume jobs. |
19 | 100 |
|
20 | 101 | --- |
21 | 102 |
|
22 | | -## Amazon Bedrock APIs |
23 | | -- `InvokeModel`: Synchronous call to get a complete response. |
24 | | -- `InvokeModelWithResponseStream`: Asynchronous/streaming response for low-latency feel. |
| 103 | +## Model Evaluation Metrics |
| 104 | + |
| 105 | +| Metric | What It Measures | |
| 106 | +|---|---| |
| 107 | +| **Groundedness** | Response supported by retrieved context? (detects hallucinations) | |
| 108 | +| **Relevance** | Response answers the user's question? | |
| 109 | +| **Accuracy** | Factually correct? | |
| 110 | +| **Fluency** | Well-written and natural? | |
| 111 | + |
| 112 | +--- |
| 113 | + |
| 114 | +## Quick Decision Rules |
| 115 | + |
| 116 | +**Vector store for Knowledge Base?** |
| 117 | +→ Amazon OpenSearch Serverless (default) · pgvector on Aurora (alternative) |
| 118 | + |
| 119 | +**Multi-step reasoning + tool use?** |
| 120 | +→ Bedrock Agents + Action Groups (Lambda) |
| 121 | + |
| 122 | +**Content moderation / PII?** |
| 123 | +→ Guardrails for Amazon Bedrock |
| 124 | + |
| 125 | +**Audit trail for compliance?** |
| 126 | +→ AWS CloudTrail (not CloudWatch) |
| 127 | + |
| 128 | +**Operational monitoring (latency, errors, token counts)?** |
| 129 | +→ Amazon CloudWatch |
| 130 | + |
| 131 | +**Predictable 24/7 throughput?** |
| 132 | +→ Provisioned Throughput (PTU) — 1 or 6 month commitment |
| 133 | + |
| 134 | +**Bulk, non-real-time inference?** |
| 135 | +→ Batch Inference |
| 136 | + |
| 137 | +**Private connectivity to Bedrock API?** |
| 138 | +→ VPC Endpoint — `bedrock-runtime` for inference calls |
| 139 | + |
| 140 | +**Knowledge changes frequently / need traceability?** |
| 141 | +→ RAG (Knowledge Bases), not fine-tuning |
| 142 | + |
| 143 | +**Detect hallucinations in RAG?** |
| 144 | +→ Groundedness metric in Model Evaluation |
| 145 | + |
| 146 | +**Log all prompts and responses for AI governance?** |
| 147 | +→ Model Invocation Logging (to S3 or CloudWatch Logs) |
| 148 | + |
| 149 | +--- |
| 150 | + |
| 151 | +## Key Terminology |
| 152 | + |
| 153 | +- **FM**: Foundation Model — pre-trained large AI model (Claude, Llama, Titan, etc.) |
| 154 | +- **RAG**: Retrieval-Augmented Generation — FM inference + vector store retrieval |
| 155 | +- **PTU**: Provisioned Throughput Unit — reserved Bedrock model capacity |
| 156 | +- **MU**: Model Unit — unit of PTU capacity purchased |
| 157 | +- **PII**: Personally Identifiable Information — data that identifies an individual |
| 158 | +- **Groundedness**: Metric measuring how well a response is grounded in retrieved context |
| 159 | +- **Hallucination**: FM generating information not present in the provided context |
| 160 | +- **Action Group**: Lambda function exposed to a Bedrock Agent for tool use |
| 161 | +- **OpenSearch Serverless**: Managed, serverless vector store used by Bedrock Knowledge Bases |
| 162 | +- **Batch Inference**: Asynchronous bulk FM inference via S3 JSONL input/output |
| 163 | +- **Model Invocation Logging**: Bedrock feature that logs all prompts + responses to S3/CloudWatch Logs |
| 164 | + |
| 165 | +--- |
25 | 166 |
|
26 | | -## Decision Rules |
27 | | -| Requirement | Service/Feature | |
28 | | -|-------------|-----------------| |
29 | | -| Predictable Throughput | Provisioned Throughput | |
30 | | -| Data Residency | Private Endpoints + Region Selection | |
31 | | -| Content Moderation | Guardrails for Amazon Bedrock | |
32 | | -| Knowledge Base | Amazon OpenSearch Serverless | |
| 167 | +[← Overview](./index.md) · [← Exam Guide](./exam-guide.md) |
0 commit comments