
Commit ce241b9

committed
pca
1 parent be70bd4 commit ce241b9

4 files changed

Lines changed: 957 additions & 0 deletions


docs/gcp/cheatsheets/ai-ml.md

Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
1+
# AI/ML Essentials
2+
3+
## 🎯 Heavy Hitters (High Frequency)
4+
5+
### 1. **Vertex AI Agent Builder**
6+
7+
- **What**: Low-code platform to build GenAI chatbots/conversational AI
8+
- **When**: Customer support bots, RAG applications, enterprise chatbots
9+
- **Key Feature**: Connects to data sources (websites, docs) automatically
10+
- **Exam Clue**: "Build a chatbot quickly without coding" → Agent Builder
11+
12+
### 2. **Vector Search**
13+
14+
- **What**: Find semantically similar content (not just keyword matching)
15+
- **Use Cases**:
16+
  - Semantic search ("find similar products")
17+
  - Image similarity ("find visually similar images")
18+
  - Recommendation engines
19+
- **Two Options**:
20+
  - **Vertex AI Vector Search**: Standalone, managed service
21+
  - **AlloyDB + pgvector**: If data already in AlloyDB/PostgreSQL
22+
- **Exam Clue**: "semantic", "similar", "embeddings" → Vector Search
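
To make "semantically similar" concrete, here is a minimal sketch of the idea behind vector search: rank items by cosine similarity between embedding vectors. The three-dimensional vectors and the `CATALOG` names are made up for illustration; real embeddings come from an embedding model and have hundreds of dimensions, and a managed service would use an approximate nearest-neighbor index rather than this brute-force scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def nearest(query, catalog, k=2):
    """Return the names of the k catalog items most similar to the query."""
    ranked = sorted(
        catalog.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical toy embeddings (assumption: not produced by any real model).
CATALOG = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.2, 0.1],
    "coffee maker": [0.0, 0.1, 0.9],
}

print(nearest([1.0, 0.0, 0.0], CATALOG))
```

Note that the two shoe items rank together even though they share no keyword, which is the property the exam clue points at.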
23+
24+
### 3. **Securing AI**
25+
26+
- **Model Armor**: Screens prompts and model responses for harmful content, prompt injection/jailbreak attempts, and sensitive data before they reach the model or the user
27+
- **VPC Service Controls**: Creates a security perimeter around Vertex AI resources to prevent data exfiltration
28+
- **Exam Clue**: "Prevent training data leakage" → VPC-SC
29+
30+
---
31+
32+
## 🧠 Core Vertex AI Concepts
33+
34+
### **Model Garden**
35+
36+
- **What**: Marketplace/"App Store" for AI models
37+
- **Options**: Google's Gemini, open models (Llama, Gemma), third-party partner models (Anthropic's Claude)
38+
- **When**: Client needs to compare/choose between different model providers
39+
- **Exam Clue**: "Evaluate multiple models" → Model Garden
40+
41+
### **Gemini Cloud Assist**
42+
43+
- **What**: AI-powered operations assistant
44+
- **Use Cases**:
45+
  - GKE cost optimization recommendations
46+
  - Network troubleshooting
47+
  - Quick infrastructure insights
48+
- **Exam Clue**: "Quickly optimize/troubleshoot infrastructure" → Cloud Assist
49+
50+
---
51+
52+
## 📊 Data-to-AI Workflow
53+
54+
### **BigQuery ML**
55+
56+
- **When**: Data already in BigQuery + simple ML (regression/classification)
57+
- **Benefit**: No data movement, SQL-based ML
58+
- **Exam Clue**: "Data in BQ, simple prediction" → BQML
59+
- **Not For**: Complex deep learning, image/video models
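
As a rough illustration of "SQL-based ML", the sketch below builds a BigQuery ML `CREATE MODEL` statement as a string. The `model_type` and `input_label_cols` options are standard BigQuery ML options; the dataset, table, and column names are hypothetical, and the helper `build_create_model_sql` is only an illustration, not part of any Google library. It does no identifier validation, so it should not be fed untrusted input.

```python
def build_create_model_sql(model, source_table, label_col, model_type="logistic_reg"):
    """Build a BigQuery ML CREATE MODEL statement (illustrative helper)."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', "
        f"input_label_cols = ['{label_col}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


# Hypothetical names: my_dataset.churn_model, my_dataset.customers, churned.
sql = build_create_model_sql(
    model="my_dataset.churn_model",
    source_table="my_dataset.customers",
    label_col="churned",
)
print(sql)
```

The resulting statement would be run in BigQuery itself, which is the point of the "no data movement" benefit: training happens where the table already lives.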
60+
61+
### **Vertex AI Pipelines**
62+
63+
- **What**: MLOps orchestration (automated training/retraining workflows)
64+
- **When**: Need repeatable, production ML pipelines with CI/CD
65+
- **Components**: Pipelines defined with the Kubeflow Pipelines (KFP) SDK or TFX
66+
- **Exam Clue**: "Automate model retraining", "MLOps" → Pipelines
67+
68+
---
69+
70+
## 🔒 AI Security (Critical for PCA)
71+
72+
### **VPC Service Controls (AI Context)**
73+
74+
- **What**: Security perimeter preventing data from leaving your environment
75+
- **Use With**: Vertex AI, BigQuery, Cloud Storage
76+
- **Exam Clue**: "Prevent data exfiltration during training" → VPC-SC
77+
- **Setup**: Create perimeter → Add projects → Restrict services (e.g., Vertex AI, BigQuery) → Define ingress/egress rules
78+
79+
### **Sensitive Data Protection (DLP)**
80+
81+
- **What**: Identify and redact PII/sensitive data
82+
- **Use Cases**:
83+
  - Redact names/SSNs before model training
84+
  - De-identify healthcare data (HIPAA)
85+
  - Scan datasets for PII
86+
- **Methods**: Masking, tokenization, redaction
87+
- **Exam Clue**: "Remove PII before training" → DLP API
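
To show the difference between masking and redaction, here is a small stand-in that uses a plain regular expression for US Social Security numbers. This is not the DLP API: the real service detects many infoTypes with far better accuracy, and the pattern and function names below are assumptions made for illustration only.

```python
import re

# Simplistic SSN pattern (assumption: NNN-NN-NNNN format only).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_ssn(text, mask_char="#"):
    """Masking: replace each digit but keep the shape of the value."""
    return SSN_PATTERN.sub(lambda m: re.sub(r"\d", mask_char, m.group()), text)


def redact_ssn(text, placeholder="[US_SOCIAL_SECURITY_NUMBER]"):
    """Redaction: replace the whole value with a placeholder label."""
    return SSN_PATTERN.sub(placeholder, text)


record = "Patient SSN 123-45-6789 on file"
print(mask_ssn(record))
print(redact_ssn(record))
```

Either transformation would be applied to the dataset before it is handed to training, so the model never sees the raw identifier.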
88+
89+
---
90+
91+
## 🎓 Exam Decision Tree
92+
93+
```
94+
Question mentions "chatbot" → Agent Builder
95+
Question mentions "semantic/similar" → Vector Search
96+
Question mentions "data leakage prevention" → VPC Service Controls
97+
Question mentions "PII removal" → DLP
98+
Data in BigQuery + simple ML → BigQuery ML
99+
Need automated retraining → Vertex AI Pipelines
100+
Compare multiple models → Model Garden
101+
Quick infra optimization → Gemini Cloud Assist
102+
```
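
The mapping above can also be written as a tiny keyword lookup, which may be handy for self-quizzing. This is only a sketch: the keyword lists are assumptions drawn from the exam clues in this sheet, matching is crude substring search, and the first matching rule wins.

```python
# (keywords, service) pairs in the same order as the decision tree above.
RULES = [
    (("chatbot",), "Agent Builder"),
    (("semantic", "similar", "embeddings"), "Vector Search"),
    (("data leakage", "exfiltration"), "VPC Service Controls"),
    (("pii",), "DLP"),
    (("bigquery",), "BigQuery ML"),
    (("retraining", "mlops"), "Vertex AI Pipelines"),
    (("compare", "evaluate multiple models"), "Model Garden"),
    (("optimize", "troubleshoot"), "Gemini Cloud Assist"),
]


def suggest_service(question):
    """Return the first service whose keywords appear in the question, else None."""
    text = question.lower()
    for keywords, service in RULES:
        if any(keyword in text for keyword in keywords):
            return service
    return None


print(suggest_service("Remove PII before training"))
```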
103+
104+
---
105+
106+
## ⚡ Quick Reminders
107+
108+
- **Vertex AI** = Unified ML platform (training, deployment, monitoring)
109+
- **Embeddings** = Vector representations → Use Vector Search
110+
- **RAG** = Retrieval Augmented Generation → Agent Builder + Vector Search
111+
- **MLOps** = Pipelines + Monitoring + Continuous training
112+
- **Security Layers**: VPC-SC (service perimeter) + DLP (data) + Model Armor (prompts and responses)
