diff --git a/README.md b/README.md
index 54f9e56..c6122e6 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,74 @@
-# agentic-devops-sample
-agentic devops sample
+# Agentic DevOps Sample
+
+Welcome to the Agentic DevOps Sample repository! This repository contains resources for learning and presenting about agentic DevOps practices.
+
+## 📦 What's Included
+
+### Event in a Box
+
+A complete package for delivering presentations and workshops on agentic DevOps. Perfect for speakers, trainers, and technical evangelists.
+
+**[Get Started with Event in a Box →](./event-in-a-box/README.md)**
+
+What you'll find:
+- 📊 **Presentation Materials**: Ready-to-use slide decks and speaker notes
+- 🎬 **Demo Scripts**: Step-by-step demonstration guides
+- 🎓 **Workshop Content**: Hands-on labs and exercises
+- 🛠️ **Setup Guides**: Technical setup instructions
+- 📚 **Resources**: FAQs, troubleshooting, and additional materials
+
+## 🎯 Use Cases
+
+- Deliver conference talks on agentic DevOps
+- Run internal training sessions
+- Host community workshops
+- Create educational content
+- Learn about intelligent automation in DevOps
+
+## 🚀 Quick Start
+
+### For Presenters
+
+1. Browse the [Event in a Box](./event-in-a-box/) materials
+2. Review the [Speaker Guide](./event-in-a-box/speaker-guide.md)
+3. Set up the [Demo Environment](./event-in-a-box/setup/demo-setup.md)
+4. Customize the [Presentation](./event-in-a-box/presentation/slides.md)
+5. Practice with the [Demo Scripts](./event-in-a-box/demos/README.md)
+
+### For Learners
+
+1. Explore the [Event in a Box](./event-in-a-box/) to understand concepts
+2. Try the [Workshop Labs](./event-in-a-box/workshop/README.md)
+3. Read the [FAQ](./event-in-a-box/resources/faq.md)
+4. Check out example implementations
+
+## 📖 Learn More
+
+- **What is Agentic DevOps?** - Intelligent, autonomous systems that make decisions and take actions within DevOps workflows
+- **Why Agentic DevOps?** - Faster response times, reduced toil, improved reliability, enhanced scalability
+- **How to Get Started?** - See the [Event in a Box README](./event-in-a-box/README.md)
+
+## 🤝 Contributing
+
+We welcome contributions! Whether you want to:
+- Improve documentation
+- Add new demos
+- Create workshop content
+- Share your experiences
+- Fix bugs or typos
+
+Please see [CONTRIBUTING.md](./event-in-a-box/CONTRIBUTING.md) for guidelines.
+
+## 📄 License
+
+This project is licensed under the MIT License - see the LICENSE file for details.
+
+## 💬 Community
+
+- **Discussions**: Share ideas and ask questions
+- **Issues**: Report bugs or request features
+- **Pull Requests**: Contribute improvements
+
+## 🙏 Acknowledgments
+
+Thanks to all contributors who have helped build and improve these materials!
diff --git a/event-in-a-box/CONTRIBUTING.md b/event-in-a-box/CONTRIBUTING.md
new file mode 100644
index 0000000..966bd67
--- /dev/null
+++ b/event-in-a-box/CONTRIBUTING.md
@@ -0,0 +1,266 @@
+# Contributing to Event in a Box
+
+Thank you for your interest in contributing to the Agentic DevOps Event in a Box! This guide will help you get started.
+
+## How to Contribute
+
+There are many ways to contribute:
+
+### 📝 Content Contributions
+- Improve documentation
+- Add new demo scripts
+- Create workshop exercises
+- Write blog posts or tutorials
+- Translate content
+
+### 🐛 Bug Reports
+- Report issues with setup instructions
+- Document problems with demos
+- Identify outdated content
+
+### 💡 Feature Requests
+- Suggest new demos
+- Propose new workshop modules
+- Request additional resources
+
+### 🎨 Design Improvements
+- Improve slide templates
+- Create diagrams and visualizations
+- Design promotional materials
+
+### 🧪 Testing
+- Test demos in different environments
+- Validate workshop materials
+- Provide feedback on content
+
+## Getting Started
+
+### 1. Fork the Repository
+
+```bash
+# Fork on GitHub, then clone your fork
+git clone https://github.com/YOUR-USERNAME/agentic-devops-sample
+cd agentic-devops-sample
+```
+
+### 2. Create a Branch
+
+```bash
+# Create a descriptive branch name
+git checkout -b feature/add-kubernetes-demo
+# or
+git checkout -b fix/update-setup-instructions
+# or
+git checkout -b docs/improve-speaker-guide
+```
+
+### 3. Make Your Changes
+
+Follow these guidelines:
+
+#### Documentation
+- Use clear, concise language
+- Include examples where helpful
+- Test all commands and code snippets
+- Check spelling and grammar
+- Follow existing formatting style
+
+#### Demo Scripts
+- Include prerequisites
+- Provide step-by-step instructions
+- Add troubleshooting tips
+- Test in clean environment
+- Include expected output
+
+#### Code
+- Follow existing code style
+- Add comments for complex logic
+- Include error handling
+- Write tests if applicable
+- Update documentation
+
+### 4. Test Your Changes
+
+```bash
+# Test documentation links
+markdown-link-check **/*.md
+
+# Test demo scripts
+cd event-in-a-box/demos
+./test-all-demos.sh
+
+# Spell check
+aspell check file.md
+```
+
+### 5. Commit Your Changes
+
+Write clear, descriptive commit messages:
+
+```bash
+# Good commit messages
+git commit -m "Add Kubernetes demo script with troubleshooting section"
+git commit -m "Fix broken link in setup guide"
+git commit -m "Update Python requirements for security patches"
+
+# Less helpful commit messages (avoid these)
+git commit -m "Update docs"
+git commit -m "Fix stuff"
+git commit -m "WIP"
+```
+
+### 6. Push and Create Pull Request
+
+```bash
+# Push to your fork
+git push origin feature/add-kubernetes-demo
+
+# Create pull request on GitHub
+# Provide clear description of changes
+```
+
+## Pull Request Guidelines
+
+### PR Description Template
+
+```markdown
+## Description
+Brief description of what this PR does.
+
+## Type of Change
+- [ ] Bug fix
+- [ ] New feature
+- [ ] Documentation update
+- [ ] Content improvement
+- [ ] Other (please describe)
+
+## Changes Made
+- Bullet point list of specific changes
+
+## Testing Done
+- How you tested these changes
+- Environments tested in
+
+## Screenshots (if applicable)
+Add screenshots to help explain your changes
+
+## Checklist
+- [ ] I have tested these changes
+- [ ] Documentation is updated
+- [ ] All links work
+- [ ] Code follows style guidelines
+- [ ] Commit messages are clear
+```
+
+### Review Process
+
+1. **Automated Checks**: CI will run automatically
+2. **Maintainer Review**: A maintainer will review your PR
+3. **Feedback**: Address any requested changes
+4. **Approval**: Once approved, your PR will be merged
+
+## Content Guidelines
+
+### Writing Style
+
+- **Be Clear**: Use simple, straightforward language
+- **Be Concise**: Get to the point quickly
+- **Be Practical**: Include real examples
+- **Be Inclusive**: Use inclusive language
+- **Be Accurate**: Test everything you document
+
+### Formatting Standards
+
+#### Markdown
+```markdown
+# Main Title (H1)
+
+## Section (H2)
+
+### Subsection (H3)
+
+**Bold** for emphasis
+*Italic* for slight emphasis
+`code` for inline code
+```
+
+#### Code Blocks
+````markdown
+```bash
+# Always specify language
+# Include comments
+command --with-flags
+```
+````
+
+#### Links
+```markdown
+[Descriptive Text](./relative/path/to/file.md)
+[External Link](https://example.com)
+```
+
+### Demo Script Standards
+
+Every demo should include:
+
+1. **Overview**: What it demonstrates
+2. **Duration**: How long it takes
+3. **Difficulty**: Beginner/Intermediate/Advanced
+4. **Prerequisites**: What's needed
+5. **Setup**: Preparation steps
+6. **Script**: Step-by-step instructions
+7. **Common Issues**: Troubleshooting
+8. **Cleanup**: How to reset
+
+### Workshop Material Standards
+
+Every lab should include:
+
+1. **Learning Objectives**: What participants will learn
+2. **Prerequisites**: Required knowledge
+3. **Instructions**: Clear steps
+4. **Starter Code**: Template to begin
+5. **Solution**: Reference implementation
+6. **Tests**: Validation
+7. **Extensions**: Optional challenges
+
+## Code of Conduct
+
+### Our Standards
+
+- **Be Respectful**: Treat everyone with respect
+- **Be Collaborative**: Work together constructively
+- **Be Inclusive**: Welcome diverse perspectives
+- **Be Patient**: Everyone is learning
+- **Be Professional**: Keep discussions on-topic
+
+### Unacceptable Behavior
+
+- Harassment or discrimination
+- Trolling or insulting comments
+- Personal attacks
+- Publishing private information
+- Other unprofessional conduct
+
+## Recognition
+
+Contributors will be recognized in:
+- README contributors section
+- Release notes
+- Community highlights
+- Speaker acknowledgments (if you present using these materials)
+
+## Questions?
+
+- 💬 **Discussions**: Use GitHub Discussions for questions
+- 📧 **Email**: [maintainer-email]
+- 💡 **Issues**: Open an issue for bugs or features
+- 📣 **Slack/Discord**: Join the community channel
+
+## License
+
+By contributing, you agree that your contributions will be licensed under the same license as the project.
+
+## Thank You!
+
+Your contributions help make this resource better for everyone. We appreciate your time and effort! 🎉
diff --git a/event-in-a-box/README.md b/event-in-a-box/README.md
new file mode 100644
index 0000000..4fd8ce5
--- /dev/null
+++ b/event-in-a-box/README.md
@@ -0,0 +1,70 @@
+# Event in a Box: Agentic DevOps
+
+Welcome to the Agentic DevOps Event in a Box! This package contains everything you need to deliver a successful presentation or workshop on agentic DevOps practices.
+
+## 📦 What's Included
+
+- **Presentation Materials**: Slide decks and speaker notes
+- **Demo Scripts**: Step-by-step demo walkthroughs
+- **Workshop Materials**: Hands-on exercises and labs
+- **Setup Guide**: Technical setup instructions
+- **Resources**: Additional reading and reference materials
+
+## 🎯 Target Audience
+
+- DevOps Engineers
+- Software Developers
+- Technical Managers
+- Solution Architects
+
+## ⏱️ Session Formats
+
+This content can be adapted for:
+- 30-minute presentation
+- 60-minute deep dive
+- 2-hour workshop
+- Half-day hands-on lab
+
+## 🚀 Getting Started
+
+1. Review the [Speaker Guide](./speaker-guide.md)
+2. Set up the [Demo Environment](./setup/demo-setup.md)
+3. Customize the [Presentation](./presentation/slides.md)
+4. Practice with the [Demo Scripts](./demos/README.md)
+
+## 📋 Prerequisites
+
+For presenters:
+- Basic understanding of DevOps principles
+- Familiarity with CI/CD concepts
+- GitHub account (for demos)
+
+For attendees:
+- Basic programming knowledge
+- Understanding of version control
+- GitHub account (for hands-on exercises)
+
+## 🛠️ Technical Requirements
+
+- GitHub repository access
+- Azure DevOps or GitHub Actions environment
+- Docker (for containerized demos)
+- Python 3.8+ (for sample applications)
+
+## 📚 Additional Resources
+
+- [Official Documentation](./resources/documentation.md)
+- [FAQ](./resources/faq.md)
+- [Troubleshooting Guide](./resources/troubleshooting.md)
+
+## 🤝 Contributing
+
+We welcome contributions! Please see [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines.
+
+## 📄 License
+
+See the main repository LICENSE file for details.
+
+## 💬 Feedback
+
+Have questions or suggestions? Please open an issue in the main repository.
diff --git a/event-in-a-box/demos/README.md b/event-in-a-box/demos/README.md
new file mode 100644
index 0000000..6ed0443
--- /dev/null
+++ b/event-in-a-box/demos/README.md
@@ -0,0 +1,161 @@
+# Demo Scripts
+
+This directory contains step-by-step scripts for demonstrating agentic DevOps concepts.
+
+## Available Demos
+
+### 1. Self-Healing Pipeline Demo
+**Duration:** 5-10 minutes
+**Difficulty:** Beginner
+**File:** [self-healing-pipeline.md](./self-healing-pipeline.md)
+
+Demonstrates how an agent can detect and automatically fix common CI/CD pipeline failures.
+
+**Key Concepts:**
+- Automated error detection
+- Pattern matching
+- Self-remediation
+- Feedback loops
+
+### 2. Intelligent Resource Scaling Demo
+**Duration:** 10-15 minutes
+**Difficulty:** Intermediate
+**File:** [intelligent-scaling.md](./intelligent-scaling.md)
+
+Shows how agents can make real-time decisions about resource allocation based on load patterns.
+
+**Key Concepts:**
+- Predictive scaling
+- Cost optimization
+- Performance monitoring
+- Dynamic configuration
+
+### 3. Automated Incident Response Demo
+**Duration:** 10-15 minutes
+**Difficulty:** Intermediate
+**File:** [incident-response.md](./incident-response.md)
+
+Illustrates an agent handling a production incident from detection to resolution.
+
+**Key Concepts:**
+- Anomaly detection
+- Automated diagnostics
+- Remediation strategies
+- Escalation logic
+
+## Demo Prerequisites
+
+### General Requirements
+- GitHub account with repository access
+- Basic familiarity with CI/CD concepts
+- Terminal/command line access
+- Internet connection
+
+### Specific Tools (depending on demo)
+- Docker Desktop
+- Python 3.8+
+- Azure CLI or AWS CLI (for cloud demos)
+- kubectl (for Kubernetes demos)
+
+## Setup Instructions
+
+1. Clone the demo repository
+2. Install required dependencies
+3. Configure credentials
+4. Test the demo environment
+5. Review the demo script
+
+Detailed setup for each demo is in the respective demo file.
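For the "Test the demo environment" step, a small pre-flight check can confirm that the tools listed under "Specific Tools" are reachable before a session. The following is a minimal sketch, not part of the repository: the function names `check_tools` and `python_ok` are hypothetical, and the executable names (`docker`, `az`, `aws`, `kubectl`) are assumptions about how each tool is installed on your `PATH`.

```python
import shutil
import sys

# Assumed executable names for the tools listed under "Specific Tools".
# Adjust these if your installation exposes different command names.
REQUIRED_TOOLS = {
    "Docker": "docker",
    "Azure CLI": "az",
    "AWS CLI": "aws",
    "kubectl": "kubectl",
}


def check_tools(tools, which=shutil.which):
    """Map each tool label to True if its executable is found on PATH."""
    return {label: which(exe) is not None for label, exe in tools.items()}


def python_ok(version=sys.version_info, minimum=(3, 8)):
    """Return True if the interpreter meets the Python 3.8+ requirement."""
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    print(f"Python 3.8+: {'ok' if python_ok() else 'too old'}")
    for label, found in check_tools(REQUIRED_TOOLS).items():
        print(f"{label}: {'found' if found else 'missing'}")
```

Not every demo needs every tool, so a "missing" result is only a problem for the demos that depend on that tool.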
+
+## Tips for Successful Demos
+
+### Before the Demo
+- [ ] Test the entire demo at least once
+- [ ] Have a backup video recording
+- [ ] Prepare fallback slides if live demo fails
+- [ ] Clear your terminal history
+- [ ] Close unnecessary applications
+- [ ] Check your internet connection
+
+### During the Demo
+- Narrate what you're doing
+- Keep demos short and focused
+- Have checkpoints where you pause for questions
+- Use large fonts in terminal (18pt or larger)
+- Zoom in on important output
+- Explain what should happen before it happens
+
+### If Something Goes Wrong
+- Stay calm and use your backup video
+- Use pre-prepared screenshots
+- Skip to the next checkpoint
+- Explain what would have happened
+- Continue with the rest of the demo
+
+## Demo Variations
+
+### 30-Minute Session
+- Choose 1 demo
+- Keep it simple
+- Focus on concepts over implementation
+
+### 60-Minute Session
+- Include 2 demos
+- Add more technical depth
+- Show configuration details
+
+### Workshop Format
+- All 3 demos
+- Hands-on exercises
+- Troubleshooting practice
+- Group discussions
+
+## Recording Demos
+
+If you want to record demos for backup or asynchronous viewing:
+
+1. Use high-quality screen recording software
+2. Record audio separately for better quality
+3. Keep recordings under 10 minutes each
+4. Add captions or annotations
+5. Test playback before the event
+
+## Troubleshooting
+
+### Common Issues
+
+**Demo environment not working:**
+- Check all prerequisites are installed
+- Verify credentials are configured
+- Review firewall/network settings
+- Try the quick-start script
+
+**Agents not responding:**
+- Check agent logs
+- Verify webhook configurations
+- Test API connectivity
+- Confirm permissions
+
+**Performance issues:**
+- Close resource-intensive applications
+- Use a local environment instead of remote
+- Reduce demo complexity
+- Pre-warm services before demo
+
+## Contributing
+
+Have a great demo idea? We'd love to include it!
+
+1. Create a new markdown file with the demo script
+2. Include setup instructions
+3. Add troubleshooting tips
+4. Test thoroughly
+5. Submit a pull request
+
+## Feedback
+
+After running a demo, please share:
+- What worked well
+- What didn't work as expected
+- Audience questions and reactions
+- Suggestions for improvement
diff --git a/event-in-a-box/demos/self-healing-pipeline.md b/event-in-a-box/demos/self-healing-pipeline.md
new file mode 100644
index 0000000..82a5275
--- /dev/null
+++ b/event-in-a-box/demos/self-healing-pipeline.md
@@ -0,0 +1,254 @@
+# Demo: Self-Healing CI/CD Pipeline
+
+## Overview
+
+This demo showcases an intelligent agent that monitors a CI/CD pipeline and automatically fixes common failures without human intervention.
+
+**Duration:** 5-10 minutes
+**Difficulty:** Beginner
+**Best for:** Introduction to agentic DevOps concepts
+
+## What You'll Demonstrate
+
+1. A CI/CD pipeline that encounters a common failure
+2. An agent detecting the failure automatically
+3. The agent analyzing the failure pattern
+4. Automatic remediation being applied
+5. Pipeline recovery and success
+
+## Prerequisites
+
+- GitHub repository with Actions enabled
+- Python 3.8+
+- GitHub CLI (`gh`) installed and authenticated
+- Docker (optional, for local testing)
+
+## Setup (15 minutes before demo)
+
+### 1. Prepare the Repository
+
+```bash
+# Clone the demo repository
+git clone https://github.com/[your-org]/agentic-pipeline-demo
+cd agentic-pipeline-demo
+
+# Install dependencies
+pip install -r requirements.txt
+
+# Configure the agent
+cp .env.example .env
+# Edit .env with your GitHub token
+```
+
+### 2. Configure the Agent
+
+The agent monitors for these failure patterns:
+- Dependency installation failures
+- Test environment setup issues
+- Flaky test failures
+- Resource timeout errors
+
+Configuration file: `agent-config.yaml`
+
+```yaml
+agent:
+  name: pipeline-healer
+  triggers:
+    - event: workflow_run
+      status: failure
+  patterns:
+    - name: dependency_failure
+      match: "Could not find package"
+      action: retry_with_cache_clear
+    - name: flaky_test
+      match: "Test failed intermittently"
+      action: retry_test_suite
+      max_retries: 3
+  notification:
+    on_success: true
+    on_failure: true
+```
+
+### 3. Test the Environment
+
+```bash
+# Test agent connectivity
+python agent.py --test
+
+# Verify GitHub Actions access
+gh workflow list
+
+# Run a test cycle
+python agent.py --dry-run
+```
+
+## Demo Script
+
+### Part 1: Introduction (1 minute)
+
+**What to say:**
+> "Today we're going to see how an intelligent agent can automatically fix common CI/CD pipeline failures. This is a real GitHub Actions workflow, and we're going to trigger a failure that the agent will detect and fix without any human intervention."
+
+**What to show:**
+- Open the GitHub repository in your browser
+- Show the Actions tab
+- Highlight recent successful runs
+
+### Part 2: Trigger a Failure (2 minutes)
+
+**What to do:**
+```bash
+# Create a branch with a known issue
+git checkout -b demo-failure
+echo "import nonexistent_package" >> src/app.py
+git add .
+git commit -m "Introduce dependency issue"
+git push origin demo-failure
+```
+
+**What to say:**
+> "I'm introducing a common issue - a missing package import. This happens all the time when dependencies aren't properly specified. Let's watch what happens."
+
+**What to show:**
+- Show the file change in VS Code or GitHub
+- Navigate to Actions tab
+- Show the workflow starting
+
+### Part 3: Agent Detection (2 minutes)
+
+**What to show:**
+- Switch to terminal showing agent logs
+- Point out the detection message
+
+```
+[Agent] Detected workflow_run event: run_12345
+[Agent] Status: failure
+[Agent] Analyzing logs...
+[Agent] Pattern matched: dependency_failure
+[Agent] Confidence: 95%
+[Agent] Initiating remediation...
+```
+
+**What to say:**
+> "The agent detected the failure immediately. It analyzed the error logs, matched it against known patterns, and identified this as a dependency issue with high confidence. Now it's going to take action."
+
+### Part 4: Automatic Remediation (2 minutes)
+
+**What to show:**
+- Agent logs showing remediation steps
+
+```
+[Agent] Action: retry_with_cache_clear
+[Agent] Clearing dependency cache...
+[Agent] Updating requirements.txt...
+[Agent] Triggering retry...
+[Agent] New run: run_12346
+```
+
+**What to say:**
+> "The agent cleared the cache, updated the requirements file, and triggered a new run. This all happened in seconds, without anyone having to investigate the logs or manually retry the workflow."
+
+### Part 5: Success and Learning (1 minute)
+
+**What to show:**
+- GitHub Actions showing the new run succeeding
+- Agent logs showing success
+
+```
+[Agent] Run run_12346: success
+[Agent] Resolution time: 45 seconds
+[Agent] Updating knowledge base...
+[Agent] Similar issues in history: 3
+[Agent] Success rate: 100%
+```
+
+**What to say:**
+> "Success! The pipeline is now passing. The agent also updated its knowledge base, so it will handle similar issues even faster next time. Notice the resolution time - 45 seconds from failure to fix. Without automation, this could have taken minutes or hours depending on when someone noticed."
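The detection and remediation flow in Parts 3 and 4 can be sketched in a few lines of Python. This is a hedged illustration only, not the demo's actual `agent.py`: the `Pattern` class, the `match_failure` and `decide` functions, the default of one retry for `dependency_failure`, and the `escalate_to_human` fallback are all assumptions. The pattern names, match strings, and actions are taken from `agent-config.yaml`.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Pattern:
    """One failure pattern, mirroring an entry in agent-config.yaml."""
    name: str
    match: str            # substring to look for in the workflow log
    action: str           # remediation action to take on a match
    max_retries: int = 1  # assumed default; the config only sets it for flaky_test


PATTERNS = (
    Pattern("dependency_failure", "Could not find package", "retry_with_cache_clear"),
    Pattern("flaky_test", "Test failed intermittently", "retry_test_suite", max_retries=3),
)


def match_failure(log_text: str, patterns: Sequence[Pattern] = PATTERNS) -> Optional[Pattern]:
    """Return the first pattern whose match string appears in the log, if any."""
    for pattern in patterns:
        if pattern.match in log_text:
            return pattern
    return None


def decide(log_text: str, attempts_so_far: int = 0,
           patterns: Sequence[Pattern] = PATTERNS) -> str:
    """Pick a remediation action, or escalate when nothing matches or retries run out."""
    pattern = match_failure(log_text, patterns)
    if pattern is None or attempts_so_far >= pattern.max_retries:
        return "escalate_to_human"  # hypothetical fallback action name
    return pattern.action


if __name__ == "__main__":
    print(decide("ERROR: Could not find package foo"))
    print(decide("Test failed intermittently", attempts_so_far=3))
```

Plain substring matching keeps the sketch easy to explain on stage; a real agent would more likely use regular expressions or structured log fields, and would cap retries to avoid loops.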
+
+### Part 6: Show the Agent Dashboard (Optional, 1 minute)
+
+**What to show:**
+- Agent dashboard or metrics page
+- Historical success rate
+- Common failure patterns
+- Time savings metrics
+
+**What to say:**
+> "Here's the agent's dashboard showing all the incidents it's handled. You can see patterns, success rates, and how much time it's saved the team."
+
+## Key Points to Emphasize
+
+1. **Speed:** From failure to resolution in under a minute
+2. **Autonomy:** No human intervention required
+3. **Learning:** Agent improves over time
+4. **Consistency:** Same issue handled the same way every time
+5. **Transparency:** Complete audit trail of all actions
+
+## Common Questions
+
+**Q: What if the agent makes the wrong decision?**
+A: The agent operates within defined guardrails. For high-risk actions, it can require human approval. All actions are logged and can be reviewed or reversed.
+
+**Q: How does it learn?**
+A: It builds a knowledge base of patterns and outcomes. Over time, it recognizes similar issues and improves its confidence and speed.
+
+**Q: Can it handle any failure?**
+A: It handles common, well-understood failures automatically. Novel or critical failures are escalated to humans with all the diagnostic data already collected.
+
+**Q: Is this production-ready?**
+A: Yes! Many teams use similar approaches. Start with low-risk scenarios and expand as you build confidence.
+
+## Backup Plan
+
+If the live demo fails:
+
+1. **Have a video recording** of the successful demo
+2. **Use screenshots** at key stages
+3. **Show historical runs** in GitHub Actions
+4. **Walk through the code** and explain the logic
+
+## Variations
+
+### Shorter Version (3 minutes)
+- Skip the setup explanation
+- Use a pre-recorded failure
+- Focus on agent response only
+
+### Longer Version (15 minutes)
+- Show the agent code
+- Explain the pattern matching logic
+- Configure a new pattern live
+- Demonstrate notification integration
+
+### Workshop Version
+- Have participants trigger their own failures
+- Let them modify agent configuration
+- Group exercise: identify patterns to automate
+
+## Cleanup
+
+After the demo:
+
+```bash
+# Delete the demo branch
+git push origin --delete demo-failure
+git branch -D demo-failure
+
+# Reset agent state (if needed)
+python agent.py --reset
+```
+
+## Files Needed
+
+- `agent.py` - The agent implementation
+- `agent-config.yaml` - Agent configuration
+- `.github/workflows/ci.yml` - The CI workflow
+- `requirements.txt` - Python dependencies
+- `src/app.py` - Sample application
+
+## Additional Resources
+
+- Agent architecture diagram
+- Configuration reference guide
+- API documentation
+- Troubleshooting guide
diff --git a/event-in-a-box/presentation/slides.md b/event-in-a-box/presentation/slides.md
new file mode 100644
index 0000000..973a347
--- /dev/null
+++ b/event-in-a-box/presentation/slides.md
@@ -0,0 +1,408 @@
+# Agentic DevOps: Intelligent Automation for Modern Software Delivery
+
+## Presentation Outline
+
+This slide deck template can be converted to PowerPoint, Google Slides, or presented using tools like Marp or reveal.js.
+
+---
+
+## Slide 1: Title Slide
+
+# Agentic DevOps
+## Intelligent Automation for Modern Software Delivery
+
+**[Your Name]**
+**[Your Title/Organization]**
+**[Date]**
+
+---
+
+## Slide 2: Agenda
+
+## Today's Journey
+
+1. 🎯 What is Agentic DevOps?
+2. 🔄 From Automation to Autonomy
+3. 🛠️ Core Components & Patterns
+4. 🚀 Implementation Strategies
+5. 💡 Live Demo
+6. 🤔 Q&A
+
+---
+
+## Slide 3: The Challenge
+
+## Modern DevOps Challenges
+
+- ⏰ **24/7 Operations** - Systems don't sleep, but teams do
+- 📊 **Scale & Complexity** - More services, more dependencies
+- 🔥 **Incident Response** - Speed matters for recovery
+- 🔁 **Repetitive Tasks** - Manual toil slows innovation
+- 🎯 **Context Switching** - Too many alerts, too little focus
+
+> *How do we manage increasing complexity while maintaining velocity?*
+
+---
+
+## Slide 4: What is Agentic DevOps?
+
+## Agentic DevOps Defined
+
+**Agentic DevOps** = Intelligent, autonomous systems that can make decisions and take actions within DevOps workflows with minimal human intervention.
+
+### Key Characteristics:
+- 🤖 **Self-directed** - Acts independently within defined boundaries
+- 🧠 **Context-aware** - Understands the situation before acting
+- 📈 **Learning** - Improves from patterns and outcomes
+- ⚡ **Proactive** - Prevents problems before they escalate
+
+---
+
+## Slide 5: Evolution of DevOps Automation
+
+## The Automation Journey
+
+```
+Manual Operations → Scripted Automation → Intelligent Agents
+```
+
+| Manual | Scripted | Agentic |
+|--------|----------|---------|
+| Human runs commands | Script runs commands | Agent decides & acts |
+| Reactive | Predetermined | Adaptive |
+| Error-prone | Consistent | Context-aware |
+| Slow | Fast | Fast + Smart |
+
+---
+
+## Slide 6: Real-World Use Cases
+
+## Where Agentic DevOps Shines
+
+### 🎯 Incident Response
+- Auto-detect anomalies
+- Gather diagnostic data
+- Apply known fixes
+- Escalate when needed
+
+### 🔄 CI/CD Pipeline Management
+- Optimize build resources
+- Smart test selection
+- Automatic rollback decisions
+- Deployment scheduling
+
+### 📊 Resource Optimization
+- Auto-scaling based on patterns
+- Cost optimization
+- Performance tuning
+- Capacity planning
+
+### 🔒 Security & Compliance
+- Automated vulnerability remediation
+- Policy enforcement
+- Audit trail maintenance
+- Compliance verification
+
+---
+
+## Slide 7: Core Components
+
+## Building Blocks of Agentic Systems
+
+### 1. **Intelligent Agents**
+   - Decision-making logic
+   - Action executors
+   - State management
+
+### 2. **Knowledge Base**
+   - Historical data
+   - Best practices
+   - Runbooks & procedures
+
+### 3. **Integration Layer**
+   - APIs and webhooks
+   - Tool connectors
+   - Event streams
+
+### 4. **Feedback Loop**
+   - Monitoring & observability
+   - Learning mechanisms
+   - Continuous improvement
+
+---
+
+## Slide 8: Agent Architecture
+
+## Anatomy of an Agent
+
+```
+┌─────────────────────────────────────┐
+│         Intelligent Agent           │
+├─────────────────────────────────────┤
+│  Sensors (Monitoring & Events)      │
+│           ↓                         │
+│  Decision Engine (Rules/ML)         │
+│           ↓                         │
+│  Actuators (Actions & APIs)         │
+│           ↓                         │
+│  Feedback (Results & Learning)      │
+└─────────────────────────────────────┘
+```
+
+---
+
+## Slide 9: Implementation Patterns
+
+## Common Agentic Patterns
+
+### Pattern 1: Observer-Reactor
+- Monitor for specific conditions
+- React with predetermined actions
+- *Example: Auto-restart failed services*
+
+### Pattern 2: Optimizer
+- Continuously analyze performance
+- Make incremental improvements
+- *Example: Dynamic resource allocation*
+
+### Pattern 3: Orchestrator
+- Coordinate multiple systems
+- Manage complex workflows
+- *Example: Multi-stage deployments*
+
+### Pattern 4: Healer
+- Detect degradation
+- Apply remediation
+- Verify recovery
+- *Example: Self-healing infrastructure*
+
+---
+
+## Slide 10: Getting Started
+
+## Implementation Roadmap
+
+### Phase 1: Foundation (Weeks 1-2)
+- ✅ Identify repetitive tasks
+- ✅ Document decision trees
+- ✅ Set up monitoring
+
+### Phase 2: First Agent (Weeks 3-4)
+- ✅ Choose low-risk use case
+- ✅ Build simple agent
+- ✅ Test in non-production
+
+### Phase 3: Expand (Months 2-3)
+- ✅ Add more agents
+- ✅ Implement learning
+- ✅ Measure outcomes
+
+### Phase 4: Optimize (Ongoing)
+- ✅ Refine decision logic
+- ✅ Expand autonomy
+- ✅ Scale across org
+
+---
+
+## Slide 11: Security & Governance
+
+## Safety First
+
+### Guardrails for Autonomous Systems
+
+🔐 **Authentication & Authorization**
+- Least privilege access
+- Service accounts with limited scope
+- Regular credential rotation
+
+📝 **Audit & Compliance**
+- Complete action logging
+- Decision traceability
+- Compliance validation
+
+🎯 **Scope Limitations**
+- Define clear boundaries
+- Implement circuit breakers
+- Human approval for high-risk actions
+
+🔍 **Monitoring & Oversight**
+- Agent performance metrics
+- Anomaly detection
+- Regular reviews
+
+---
+
+## Slide 12: Common Pitfalls
+
+## What to Avoid
+
+### ❌ Anti-Patterns
+
+1. **Over-automation too soon**
+   - Start simple, expand gradually
+
+2. **Insufficient error handling**
+   - Agents must handle edge cases
+
+3. **No human override**
+   - Always maintain manual control
+
+4. **Lack of observability**
+   - You can't improve what you can't measure
+
+5. **Ignoring security**
+   - Agents have power; secure them appropriately
+
+---
+
+## Slide 13: Measuring Success
+
+## Key Metrics
+
+### Operational Efficiency
+- ⏱️ Mean time to recovery (MTTR)
+- 🎯 Incident resolution rate
+- 📉 Manual intervention frequency
+
+### Business Impact
+- 💰 Cost savings
+- 🚀 Deployment frequency
+- ✅ Service reliability
+
+### Team Satisfaction
+- 😊 Reduced toil
+- 🌙 Fewer on-call alerts
+- 🎯 More innovation time
+
+---
+
+## Slide 14: Demo Preview
+
+## What You'll See
+
+### Live Demo: Self-Healing Pipeline
+
+1. 🔍 **Detection**: Agent monitors pipeline health
+2. 🧠 **Analysis**: Identifies failure pattern
+3. ⚡ **Action**: Automatically applies fix
+4. ✅ **Validation**: Verifies resolution
+5. 📊 **Learning**: Updates knowledge base
+
+---
+
+## Slide 15: Best Practices
+
+## Keys to Success
+
+### 🎯 Start Small
+- Choose one pain point
+- Build a simple agent
+- Iterate based on results
+
+### 📚 Document Everything
+- Decision logic
+- Action outcomes
+- Lessons learned
+
+### 🤝 Collaborate
+- DevOps + Development + Security
+- Share knowledge
+- Build together
+
+### 📊 Measure & Iterate
+- Track metrics
+- Gather feedback
+- Continuous improvement
+
+---
+
+## Slide 16: The Future
+
+## What's Next in Agentic DevOps?
+
+### Emerging Trends
+
+🤖 **AI/ML Integration**
+- Advanced pattern recognition
+- Predictive analytics
+- Natural language interfaces
+
+🌐 **Multi-Agent Systems**
+- Collaborative agents
+- Specialized roles
+- Emergent behaviors
+
+🔗 **Universal Integration**
+- Cross-platform orchestration
+- Unified control plane
+- Seamless tool integration
+
+---
+
+## Slide 17: Resources
+
+## Continue Learning
+
+### 📚 Documentation & Guides
+- [Link to official docs]
+- [Link to GitHub repo]
+- [Link to community]
+
+### 🎓 Training & Tutorials
+- [Link to courses]
+- [Link to workshops]
+- [Link to certification]
+
+### 💬 Community
+- [Link to forums]
+- [Link to Slack/Discord]
+- [Link to meetups]
+
+---
+
+## Slide 18: Q&A
+
+## Questions?
+
+**Contact Information:**
+- 📧 Email: [your-email]
+- 🐦 Twitter: [your-handle]
+- 💼 LinkedIn: [your-profile]
+- 🌐 Website: [your-website]
+
+---
+
+## Slide 19: Thank You
+
+# Thank You!
+
+## Let's Build the Future of DevOps Together
+
+**Feedback & Follow-up:**
+- Survey: [link]
+- GitHub: [repo-link]
+- Next Steps: [resources]
+
+---
+
+## Notes for Presenters
+
+### Slide Timing (60-minute session)
+- Slides 1-6: 15 minutes (Introduction & Concepts)
+- Slides 7-11: 15 minutes (Technical Details)
+- Slides 12-13: 10 minutes (Practical Considerations)
+- Slide 14: 10 minutes (Demo)
+- Slides 15-17: 5 minutes (Next Steps)
+- Slides 18-19: 5 minutes (Q&A & Closing)
+
+### Customization Tips
+- Add your organization's logo to title slide
+- Include relevant case studies for your industry
+- Update links to point to your resources
+- Add backup slides with technical details
+- Include local success stories
+
+### Interactive Elements
+- Poll: "How many use automation in their pipelines?"
+- Activity: "Identify one task that could be automated"
+- Discussion: "What concerns do you have about autonomous systems?"
diff --git a/event-in-a-box/resources/faq.md b/event-in-a-box/resources/faq.md
new file mode 100644
index 0000000..0c7e6c9
--- /dev/null
+++ b/event-in-a-box/resources/faq.md
@@ -0,0 +1,405 @@
+# Frequently Asked Questions (FAQ)
+
+## General Questions
+
+### What is Agentic DevOps?
+
+Agentic DevOps refers to the practice of using intelligent, autonomous agents to manage DevOps processes. These agents can make decisions, take actions, and adapt their behavior based on context without requiring constant human intervention.
+
+Think of it as the evolution from:
+- **Manual operations** → Scripts and automation → **Intelligent agents**
+
+### How is this different from traditional DevOps automation?
+ +| Traditional Automation | Agentic DevOps | +|----------------------|----------------| +| Follows fixed scripts | Makes context-aware decisions | +| Requires explicit programming for each scenario | Learns patterns and adapts | +| Reactive (responds when triggered) | Can be proactive (anticipates issues) | +| Limited to predefined actions | Can reason about best course of action | + +### Do I need AI/ML expertise to implement agentic DevOps? + +Not necessarily! While AI/ML can enhance agentic systems, many useful agents can be built using: +- Rule-based systems +- Decision trees +- Pattern matching +- State machines +- Traditional programming logic + +You can start simple and add AI/ML capabilities as you grow. + +### Is this the same as AIOps? + +There's overlap, but they're distinct: +- **AIOps** focuses on using AI for IT operations monitoring and insights +- **Agentic DevOps** focuses on autonomous agents that take action in DevOps workflows + +Agentic DevOps can use AIOps insights as input for decision-making. + +## Getting Started + +### Where should I start? + +1. **Identify a pain point**: Find a repetitive task or common failure pattern +2. **Start small**: Build a simple agent for that specific use case +3. **Learn from it**: Observe how it performs and refine +4. **Expand gradually**: Add more capabilities and agents over time + +### What's a good first use case? + +Ideal first projects: +- Auto-restart failed services +- Retry flaky tests +- Clear caches on build failures +- Auto-assign reviewers for PRs +- Send notifications based on patterns +- Scale resources based on load + +Characteristics of good first projects: +- Low risk if it fails +- Clear success criteria +- Frequent occurrence +- Well-understood solution +- Easy to monitor + +### What tools do I need? + +Minimum requirements: +- Version control system (Git) +- CI/CD platform (GitHub Actions, Azure DevOps, Jenkins, etc.) 
+- Webhook or event system +- Programming language (Python, JavaScript, Go, etc.) +- Monitoring/logging system + +Optional but helpful: +- Container platform (Docker, Kubernetes) +- Cloud platform (Azure, AWS, GCP) +- Agent framework or library +- Dashboard/visualization tool + +### How long does it take to see value? + +Timeline varies, but typically: +- **Week 1-2**: Setup and first simple agent +- **Week 3-4**: Agent handling real scenarios +- **Month 2-3**: Multiple agents, measurable impact +- **Month 6+**: Significant reduction in toil and incidents + +## Implementation + +### How do I ensure security? + +Key security practices: +1. **Least privilege**: Agents should have minimal necessary permissions +2. **Authentication**: Use proper service accounts and tokens +3. **Authorization**: Validate all actions before execution +4. **Audit logging**: Log every decision and action +5. **Secrets management**: Never hardcode credentials +6. **Rate limiting**: Prevent runaway automation +7. **Human approval**: Require approval for high-risk actions +8. **Regular reviews**: Audit agent behavior and permissions + +### What about compliance and governance? + +Agentic systems can actually improve compliance: +- **Consistency**: Same rules applied every time +- **Audit trail**: Complete record of all actions +- **Policy enforcement**: Automated compliance checks +- **Documentation**: Self-documenting through logs +- **Accountability**: Clear record of what changed and why + +Tips: +- Document agent capabilities and boundaries +- Include compliance checks in agent logic +- Maintain approval workflows for sensitive operations +- Regular compliance audits of agent activities + +### How do I test agents before production? + +Testing strategies: +1. **Dry-run mode**: Log what agent would do without doing it +2. **Sandbox environment**: Test in isolated non-prod environment +3. **Shadow mode**: Run alongside manual process, don't take action +4. 
**Gradual rollout**: Start with limited scope, expand slowly +5. **Circuit breakers**: Auto-disable if error rate exceeds threshold +6. **Unit tests**: Test decision logic thoroughly +7. **Integration tests**: Test with actual systems (in test env) + +### What if an agent makes a mistake? + +Build in safeguards: +- **Logging**: Complete audit trail for debugging +- **Rollback capability**: Can undo actions when possible +- **Circuit breakers**: Auto-disable after N failures +- **Monitoring**: Alert on unexpected behavior +- **Manual override**: Always have human override option +- **Gradual deployment**: Limit blast radius +- **Testing**: Thorough testing before production + +### How do I monitor agent performance? + +Key metrics to track: +- **Success rate**: % of actions that succeed +- **Resolution time**: Time from detection to fix +- **False positive rate**: Actions taken unnecessarily +- **Intervention rate**: How often humans need to step in +- **Cost savings**: Time/money saved +- **Incident reduction**: Fewer problems reaching production + +Tools: +- Application logs +- Metrics dashboards (Grafana, Datadog, etc.) +- Custom agent dashboard +- APM tools +- Alerting systems + +## Technical Questions + +### What programming languages work best? + +Any language works, but popular choices: +- **Python**: Rich ecosystem, easy to learn, great for scripting +- **JavaScript/TypeScript**: Good for web hooks, Node.js ecosystem +- **Go**: Fast, concurrent, good for distributed systems +- **Java/C#**: Enterprise environments, robust frameworks + +Choose based on: +- Team expertise +- Existing infrastructure +- Performance requirements +- Available libraries + +### Can agents work across multiple tools? + +Yes! 
Agents can integrate with: +- Version control (GitHub, GitLab, Bitbucket) +- CI/CD (Jenkins, CircleCI, Azure DevOps, GitHub Actions) +- Cloud platforms (Azure, AWS, GCP) +- Monitoring (Datadog, New Relic, Prometheus) +- Communication (Slack, Teams, PagerDuty) +- Issue tracking (Jira, Azure Boards) + +Integration methods: +- REST APIs +- Webhooks +- Message queues +- Database connections +- Command-line tools + +### How do agents scale? + +Scaling strategies: +- **Horizontal**: Run multiple agent instances +- **Vertical**: Increase resources for single agent +- **Distributed**: Different agents for different responsibilities +- **Queue-based**: Use message queues for load balancing +- **Serverless**: Use cloud functions for event-driven scaling + +### Can multiple agents work together? + +Yes! Multi-agent patterns: +- **Hierarchical**: Coordinator agent delegates to specialized agents +- **Peer-to-peer**: Agents communicate directly +- **Pipeline**: Output of one agent feeds into another +- **Consensus**: Multiple agents vote on best action +- **Specialized**: Each agent handles specific domain + +Coordination methods: +- Shared database/state +- Message passing +- Event bus +- Service mesh + +## Advanced Topics + +### How do agents learn and improve? + +Learning mechanisms: +1. **Pattern recognition**: Identify recurring scenarios +2. **Success tracking**: Record what actions work +3. **Failure analysis**: Learn from mistakes +4. **A/B testing**: Try different approaches +5. **Feedback loops**: Incorporate human feedback +6. **Model training**: Use historical data to train ML models + +### Can I use LLMs/GPT in agents? + +Yes! 
LLMs can enhance agents: +- Natural language analysis of errors +- Generating fix suggestions +- Documentation generation +- Anomaly description +- User interaction + +Considerations: +- Cost per API call +- Response latency +- Reliability/consistency +- Data privacy +- Hallucination risk + +Best practices: +- Use for analysis, not blind execution +- Validate LLM outputs +- Have fallback logic +- Cache common responses +- Monitor costs + +### What about edge cases and rare failures? + +Strategies: +- **Graceful degradation**: Do something useful even if not perfect +- **Escalation**: Route unknown scenarios to humans +- **Learning**: Add new patterns as they're discovered +- **Conservative approach**: When uncertain, ask for help +- **Fallback logic**: Have safe default behavior + +### How do I handle agent failures? + +Failure handling: +1. **Detection**: Monitor agent health +2. **Alerting**: Notify on-call when agent fails +3. **Automatic restart**: Self-healing agents +4. **Fallback**: Revert to manual process +5. **Graceful degradation**: Reduced functionality vs. complete failure +6. **Post-mortem**: Learn from failures + +## Organizational Questions + +### How do I get buy-in from my team? + +Strategies: +1. **Start small**: Prove value with quick win +2. **Show data**: Measure and demonstrate impact +3. **Address concerns**: Listen to objections, mitigate risks +4. **Involve team**: Collaborative design and implementation +5. **Share success**: Celebrate and publicize wins +6. **Education**: Provide training and resources + +### What's the ROI? + +Measurable benefits: +- **Time savings**: Hours/week not spent on toil +- **Faster recovery**: Reduced MTTR +- **Fewer incidents**: Problems caught earlier +- **Better work/life balance**: Fewer on-call pages +- **Increased velocity**: More time for feature work +- **Cost reduction**: Optimized resource usage + +Typical ROI timeline: 3-6 months to breakeven, then ongoing value. + +### How do I maintain agent code? 
+ +Maintenance best practices: +- **Version control**: Track all agent code +- **Documentation**: Keep docs up to date +- **Testing**: Automated test suite +- **Code review**: Review agent changes +- **Monitoring**: Watch for degradation +- **Regular updates**: Dependencies, patterns, logic +- **Ownership**: Clear responsibility +- **Knowledge sharing**: Cross-train team members + +### What skills does my team need? + +Helpful skills: +- Programming fundamentals +- API integration +- DevOps practices +- Debugging and troubleshooting +- System design +- Testing strategies + +Nice to have: +- Machine learning basics +- Data analysis +- Distributed systems +- Security practices + +## Troubleshooting + +### Agent isn't triggering + +Check: +- [ ] Webhook configured correctly +- [ ] Event types selected +- [ ] Network connectivity +- [ ] Firewall rules +- [ ] Authentication/tokens +- [ ] Agent service running +- [ ] Logs for error messages + +### Agent is too slow + +Optimization approaches: +- Profile code to find bottlenecks +- Cache frequently accessed data +- Use async operations +- Optimize API calls +- Scale horizontally +- Improve algorithms + +### Agent is making wrong decisions + +Debug steps: +1. Review logs for decision reasoning +2. Test with known scenarios +3. Check pattern matching logic +4. Verify input data quality +5. Review decision weights/thresholds +6. Add more logging +7. Implement dry-run mode for testing + +### How do I debug agent issues? + +Debugging tools: +- Structured logging +- Distributed tracing +- Metrics and dashboards +- Local development environment +- Dry-run/simulation mode +- Step-through debugging +- Log analysis tools + +## Resources + +### Where can I learn more? 
+ +- [Official Documentation](./documentation.md) +- [Troubleshooting Guide](./troubleshooting.md) +- [Demo Scripts](../demos/README.md) +- [Workshop Materials](../workshop/README.md) +- Community forums and discussions +- Conference talks and presentations +- Blog posts and articles + +### Are there examples I can reference? + +Yes! Check out: +- [Demo repository](https://github.com/[org]/agentic-demos) +- Sample agent implementations +- Case studies from organizations +- Open source agent frameworks +- Community-contributed patterns + +### How do I contribute? + +We welcome contributions! +- Report issues +- Suggest improvements +- Share your patterns +- Submit pull requests +- Help answer questions +- Write blog posts +- Give presentations + +See [CONTRIBUTING.md](../CONTRIBUTING.md) for guidelines. + +## Still Have Questions? + +- 💬 Join the community: [Link to forum/Slack] +- 📧 Email us: [support email] +- 🐛 Report issues: [GitHub Issues] +- 📚 Read the docs: [Documentation link] +- 🎓 Take the course: [Training link] diff --git a/event-in-a-box/resources/troubleshooting.md b/event-in-a-box/resources/troubleshooting.md new file mode 100644 index 0000000..86fc350 --- /dev/null +++ b/event-in-a-box/resources/troubleshooting.md @@ -0,0 +1,667 @@ +# Troubleshooting Guide + +This guide helps you diagnose and fix common issues when setting up and running agentic DevOps demos and workshops. + +## Setup Issues + +### Python Environment Problems + +#### Issue: "Python version not supported" + +``` +Error: Python 3.7 is not supported. Please use Python 3.8 or later.
+``` + +**Solution:** +```bash +# Check your Python version +python --version +python3 --version + +# Install Python 3.8+ using package manager +# On macOS: +brew install python@3.11 + +# On Ubuntu: +sudo apt update +sudo apt install python3.11 + +# On Windows: Download from python.org +``` + +#### Issue: "Module not found" errors + +``` +ModuleNotFoundError: No module named 'requests' +``` + +**Solution:** +```bash +# Ensure you're in virtual environment +source venv/bin/activate # macOS/Linux +venv\Scripts\activate # Windows + +# Reinstall requirements +pip install -r requirements.txt + +# If issues persist, upgrade pip +pip install --upgrade pip +pip install -r requirements.txt --force-reinstall +``` + +#### Issue: "Permission denied" when installing packages + +**Solution:** +```bash +# Use virtual environment (recommended) +python -m venv venv +source venv/bin/activate +pip install -r requirements.txt + +# Or install with --user flag +pip install --user -r requirements.txt +``` + +### GitHub Authentication Issues + +#### Issue: "Bad credentials" or "401 Unauthorized" + +**Solution:** +1. Verify your token has correct permissions: + - `repo` (all) + - `workflow` + - `read:org` (if using org repos) + +2. Check token is properly set: +```bash +# View current token (first few characters) +echo $GITHUB_TOKEN | cut -c1-10 + +# Re-set token +export GITHUB_TOKEN=your_token_here + +# Or update .env file +echo "GITHUB_TOKEN=your_token_here" > .env +source .env +``` + +3. 
Regenerate token if expired: + - Go to GitHub Settings → Developer settings → Personal access tokens + - Delete old token + - Generate new token with required scopes + +#### Issue: "rate limit exceeded" + +``` +Error: API rate limit exceeded for user +``` + +**Solution:** +- Authenticated requests have higher limits +- Wait for rate limit reset (check `X-RateLimit-Reset` header) +- Use conditional requests with ETags +- Cache API responses when possible + +```bash +# Check rate limit status +curl -H "Authorization: token $GITHUB_TOKEN" \ + https://api.github.com/rate_limit +``` + +### Docker Issues + +#### Issue: "Cannot connect to Docker daemon" + +``` +Error: Cannot connect to the Docker daemon at unix:///var/run/docker.sock +``` + +**Solution:** +```bash +# Ensure Docker is running +docker ps + +# On macOS: Start Docker Desktop +open -a Docker + +# On Linux: Start Docker service +sudo systemctl start docker + +# Check Docker status +sudo systemctl status docker +``` + +#### Issue: "Port already in use" + +``` +Error: bind: address already in use +``` + +**Solution:** +```bash +# Find process using the port +lsof -i :8080 # macOS/Linux +netstat -ano | findstr :8080 # Windows + +# Kill the process (replace <PID> with the process ID reported above) +kill -9 <PID> # macOS/Linux +taskkill /PID <PID> /F # Windows + +# Or use a different port +docker run -p 8081:8080 agentic-agent:latest +``` + +#### Issue: "Image not found" + +``` +Error: Unable to find image 'agentic-agent:latest' locally +``` + +**Solution:** +```bash +# Build the image first +docker build -t agentic-agent:latest .
+ +# Or pull from registry +docker pull your-registry/agentic-agent:latest + +# List available images +docker images +``` + +### Webhook Issues + +#### Issue: Webhook not receiving events + +**Symptoms:** +- Agent logs show no incoming requests +- GitHub webhook shows "Last delivery" but agent doesn't react + +**Diagnosis:** +```bash +# Check webhook status in GitHub +gh api repos/{owner}/{repo}/hooks + +# Check agent is listening +curl http://localhost:8080/health + +# Check recent webhook deliveries +# Go to: Settings → Webhooks → Edit → Recent Deliveries +``` + +**Solutions:** + +1. **Local development - use ngrok:** +```bash +# Install ngrok +brew install ngrok # macOS +# Or download from ngrok.com + +# Start ngrok tunnel +ngrok http 8080 + +# Update webhook URL in GitHub to ngrok URL +# Example: https://abc123.ngrok.io/webhook +``` + +2. **Firewall blocking:** +```bash +# Allow port through firewall (Linux) +sudo ufw allow 8080 + +# Check if port is accessible externally +curl http://your-external-ip:8080/health +``` + +3.
**Wrong webhook secret:** +```bash +# Verify secret in .env matches GitHub webhook +# Update GitHub webhook secret to match .env +gh api repos/{owner}/{repo}/hooks/{hook_id} \ + -X PATCH \ + -f secret="your-secret" +``` + +#### Issue: Webhook receives events but agent doesn't process them + +**Diagnosis:** +```bash +# Check agent logs +tail -f logs/agent.log + +# Test webhook manually +curl -X POST http://localhost:8080/webhook \ + -H "Content-Type: application/json" \ + -H "X-GitHub-Event: workflow_run" \ + -d @test-payload.json +``` + +**Common causes:** +- Event type filtering too restrictive +- Error in event parsing +- Exception in event handler +- Missing event handlers + +**Solution:** +```bash +# Enable debug logging +export LOG_LEVEL=DEBUG +python agent.py --mode service + +# Add more event types to configuration +# Edit agent-config.yaml to include needed events +``` + +## Runtime Issues + +### Agent Issues + +#### Issue: Agent crashes immediately + +**Diagnosis:** +```bash +# Run with verbose logging +python agent.py --mode service --log-level DEBUG + +# Check for syntax errors +python -m py_compile agent.py + +# Check dependencies +pip check +``` + +**Common causes:** +- Missing dependencies +- Configuration file errors +- Port conflicts +- Invalid credentials + +#### Issue: Agent running but not responding to events + +**Diagnosis:** +```bash +# Check if agent is actually running +ps aux | grep agent + +# Check listening ports +netstat -an | grep 8080 +lsof -i :8080 + +# Send test event +curl -X POST http://localhost:8080/webhook \ + -H "Content-Type: application/json" \ + -d '{"test": true}' + +# Check response +curl http://localhost:8080/health +``` + +**Solutions:** +- Verify webhook configuration +- Check event filtering rules +- Review agent logs for errors +- Ensure event handlers are registered + +#### Issue: Agent making wrong decisions + +**Debugging steps:** + +1. 
**Enable dry-run mode:** +```python +# In agent code +agent = Agent(dry_run=True) # Logs actions without executing +``` + +2. **Review decision logic:** +```python +# Add detailed logging +import logging +logging.basicConfig(level=logging.DEBUG) +``` + +3. **Test with known scenarios:** +```python +# Create test cases +def test_pattern_matching(): + agent = Agent() + result = agent.match_pattern("error: module not found") + assert result.pattern == "dependency_failure" +``` + +4. **Check pattern priorities:** +```yaml +# In agent-config.yaml, ensure patterns are ordered correctly +patterns: + - name: specific_error # More specific first + priority: 10 + - name: general_error # More general last + priority: 1 +``` + +### GitHub Actions Issues + +#### Issue: Workflow not triggering + +**Diagnosis:** +```bash +# Check workflow status +gh workflow list + +# View workflow runs +gh run list --workflow=ci.yml + +# Check workflow file syntax +yamllint .github/workflows/ci.yml +``` + +**Solutions:** + +1. **Check trigger configuration:** +```yaml +# .github/workflows/ci.yml +on: + push: + branches: [main] # Ensure correct branch + workflow_dispatch: # Allow manual trigger +``` + +2. **Trigger manually:** +```bash +gh workflow run ci.yml +``` + +3. **Check Actions are enabled:** +- Go to Settings → Actions → General +- Ensure "Allow all actions and reusable workflows" is selected + +#### Issue: Workflow fails immediately + +**Common causes:** +- Syntax errors in workflow file +- Missing secrets +- Invalid GitHub Actions expressions +- Wrong runner (e.g., using Windows commands on Linux) + +**Solutions:** +```bash +# Validate workflow locally +act -l # Lists workflows +act push # Simulates push event + +# Check secrets are set +gh secret list + +# View workflow logs +gh run view --log +``` + +### Demo Issues + +#### Issue: Demo fails during presentation + +**Immediate recovery:** +1. **Switch to backup video/screenshots** +2. **Use pre-recorded demo** +3.
**Walk through with slides/diagrams** +4. **Continue with next segment** + +**Prevention:** +- Test entire demo flow before presentation +- Have backup video ready +- Use stable, known-good commits +- Have screenshots at key points +- Test in presentation environment + +#### Issue: Audience can't follow along + +**Solutions:** +- Increase terminal font size (minimum 18pt) +- Use high contrast color scheme +- Slow down and narrate actions +- Show overview slides between steps +- Use drawing tools to highlight +- Post command history in chat + +#### Issue: Demo running slow + +**Quick fixes:** +- Skip non-essential steps +- Use pre-built images instead of building +- Run on faster hardware +- Reduce data set size +- Cache dependencies +- Use local resources instead of remote + +## Workshop Issues + +### Participant Setup Issues + +#### Issue: Participants can't install prerequisites + +**Solutions:** +- Provide pre-configured cloud environments (GitHub Codespaces, Gitpod) +- Use Docker containers for consistency +- Have installation help desk during setup time +- Pair participants (working setup helps those without) +- Provide offline installers on USB drives + +#### Issue: Varied experience levels + +**Strategies:** +- Provide multiple difficulty tracks +- Use pair programming +- Create challenge extensions for fast learners +- Offer simplified alternatives for strugglers +- Have TAs available for individual help + +#### Issue: Time running over + +**Adjustments:** +- Skip optional sections +- Reduce Q&A time (save for after) +- Streamline demonstrations +- Provide pre-built solutions +- Assign remaining work as homework + +### Technical Issues During Workshop + +#### Issue: WiFi/Internet problems + +**Mitigations:** +- Have offline documentation +- Pre-download all dependencies +- Use local resources +- Provide USB with materials +- Use mobile hotspot as backup + +#### Issue: GitHub rate limiting affecting multiple participants + +**Solutions:** +```bash +# Use 
GitHub Enterprise or team account +# Rotate through multiple tokens +# Cache responses locally +# Use local GitLab/Gitea instance + +# Check rate limit +gh api rate_limit +``` + +#### Issue: Everyone's agents interfering with each other + +**Solutions:** +- Use separate repositories per participant +- Use branches instead of separate repos +- Add participant ID to agent names +- Use different webhooks per participant +- Implement agent isolation in config + +## Performance Issues + +### Agent Performance + +#### Issue: Agent response time too slow + +**Profiling:** +```python +# Add timing decorator +import time +from functools import wraps + +def timing(f): + @wraps(f) + def wrapper(*args, **kwargs): + start = time.time() + result = f(*args, **kwargs) + end = time.time() + print(f'{f.__name__} took {end-start:.2f}s') + return result + return wrapper + +@timing +def process_event(event): + # ... agent logic ... +``` + +**Optimizations:** +- Cache frequently accessed data +- Use async operations +- Optimize API calls (batch requests) +- Improve pattern matching algorithms +- Add request timeout limits +- Scale horizontally + +#### Issue: High memory usage + +**Diagnosis:** +```python +# Memory profiling +from memory_profiler import profile + +@profile +def memory_intensive_function(): + # ... code ... 
+``` + +**Solutions:** +- Stream large datasets instead of loading into memory +- Clear caches periodically +- Use generators instead of lists +- Profile and fix memory leaks +- Increase available memory + +### GitHub Actions Performance + +#### Issue: Workflows taking too long + +**Optimizations:** +- Cache dependencies +- Use matrix strategy for parallel execution +- Optimize test suite +- Use faster runners +- Split into multiple workflows + +```yaml +# Use caching +- uses: actions/cache@v3 + with: + path: ~/.npm + key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }} + +# Use matrix for parallel runs +strategy: + matrix: + os: [ubuntu-latest, windows-latest] + node: [16, 18, 20] +``` + +## Getting Help + +### Before Asking for Help + +Gather this information: +- [ ] Exact error message +- [ ] Steps to reproduce +- [ ] Environment details (OS, versions) +- [ ] Relevant logs +- [ ] What you've tried already + +### Where to Get Help + +1. **Documentation:** + - [README](../README.md) + - [FAQ](./faq.md) + - [Setup Guide](../setup/demo-setup.md) + +2. **Community:** + - GitHub Discussions + - Slack/Discord channel + - Stack Overflow (tag: agentic-devops) + +3. **Support:** + - Open GitHub Issue + - Email support + - Schedule office hours + +### Reporting Bugs + +Include: +```markdown +**Description:** Brief description of issue + +**Steps to Reproduce:** +1. Step one +2. Step two +3. ... 
+ +**Expected Behavior:** What should happen + +**Actual Behavior:** What actually happens + +**Environment:** +- OS: [e.g., macOS 12.0] +- Python version: [e.g., 3.11.0] +- Agent version: [e.g., 1.2.3] +- Docker version: [if applicable] + +**Logs:** +```bash +# Include relevant logs +``` + +**Screenshots:** [if applicable] +``` + +## Emergency Contacts + +During live events: +- **Technical Lead:** [Contact info] +- **Backup Presenter:** [Contact info] +- **IT Support:** [Contact info] +- **Venue Contact:** [Contact info] + +## Appendix: Common Error Messages + +### "Connection refused" +- Service not running +- Wrong host/port +- Firewall blocking +- Network issue + +### "Permission denied" +- Insufficient permissions +- Need sudo/admin rights +- File ownership issue +- Token lacks required scopes + +### "Timeout" +- Network issue +- Service overloaded +- Request taking too long +- Firewall/proxy blocking + +### "Invalid configuration" +- Syntax error in config file +- Missing required field +- Invalid value +- Version mismatch diff --git a/event-in-a-box/setup/demo-setup.md b/event-in-a-box/setup/demo-setup.md new file mode 100644 index 0000000..b3aba84 --- /dev/null +++ b/event-in-a-box/setup/demo-setup.md @@ -0,0 +1,405 @@ +# Demo Environment Setup Guide + +This guide will help you set up everything you need to deliver the agentic DevOps demos. 
+ +## Overview + +The demo environment includes: +- GitHub repository with sample workflows +- Agent monitoring service +- Demo applications +- Dashboard for visualizing agent activity + +## Time Required + +- Initial setup: 30-45 minutes +- Pre-demo verification: 10-15 minutes + +## Prerequisites + +### Required +- GitHub account with organization access +- Admin permissions on a GitHub repository +- Terminal/command line access +- Git installed +- Internet connection + +### Optional (depending on demos) +- Docker Desktop +- Python 3.8 or later +- Node.js 16 or later +- Azure or AWS account (for cloud demos) +- kubectl (for Kubernetes demos) + +## Step-by-Step Setup + +### 1. Create Demo Repository + +```bash +# Create the repository from the template and clone it locally +gh repo create agentic-devops-demo --template microsoft/agentic-devops-sample --public --clone + +# Navigate to the repository +cd agentic-devops-demo +``` + +### 2. Configure GitHub Actions + +Enable GitHub Actions in your repository: +1. Go to Settings → Actions → General +2. Set "Actions permissions" to "Allow all actions and reusable workflows" +3. Enable "Read and write permissions" for workflows + +### 3.
Install Required Tools + +#### Python Environment + +```bash +# Create virtual environment +python3 -m venv venv +source venv/bin/activate # On Windows: venv\Scripts\activate + +# Install dependencies +pip install -r requirements.txt + +# Verify installation +python agent.py --version +``` + +#### GitHub CLI + +```bash +# Install gh (if not already installed) +# On macOS: +brew install gh + +# On Windows: +winget install GitHub.cli + +# On Linux (Debian/Ubuntu): +curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | sudo dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg +echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null +sudo apt update +sudo apt install gh + +# Authenticate +gh auth login +``` + +#### Docker (Optional) + +```bash +# Install Docker Desktop from https://www.docker.com/products/docker-desktop + +# Verify installation +docker --version +docker compose --version +``` + +### 4. Configure Secrets and Tokens + +#### Create GitHub Personal Access Token + +1. Go to GitHub Settings → Developer settings → Personal access tokens → Tokens (classic) +2. Click "Generate new token (classic)" +3. Select scopes: + - `repo` (all) + - `workflow` + - `read:org` +4. Copy the token + +#### Set Environment Variables + +```bash +# Create .env file +cp .env.example .env + +# Edit .env with your values +cat > .env << EOF +GITHUB_TOKEN=your_token_here +GITHUB_REPO=your-org/agentic-devops-demo +AGENT_NAME=demo-agent +LOG_LEVEL=INFO +EOF + +# Source the environment variables +source .env +``` + +#### Configure Repository Secrets + +```bash +# Using GitHub CLI +# Secret names cannot start with GITHUB_, so store the token under another name. +# AGENT_GITHUB_TOKEN is an example name; use whichever name your workflows reference. +gh secret set AGENT_GITHUB_TOKEN --body "$GITHUB_TOKEN" + +# Or manually in GitHub: +# Settings → Secrets and variables → Actions → New repository secret +``` + +### 5.
Deploy the Agent + +#### Local Deployment + +```bash +# Start the agent service +python agent.py --mode service + +# In a new terminal, verify it's running +curl http://localhost:8080/health +``` + +#### Docker Deployment + +```bash +# Build the agent container +docker build -t agentic-agent:latest . + +# Run the agent +docker run -d \ + --name agentic-agent \ + -p 8080:8080 \ + --env-file .env \ + agentic-agent:latest + +# Check logs +docker logs -f agentic-agent +``` + +#### Cloud Deployment (Optional) + +For Azure Container Instances: + +```bash +# Login to Azure +az login + +# Create resource group +az group create --name agentic-demo-rg --location eastus + +# Deploy container +az container create \ + --resource-group agentic-demo-rg \ + --name agentic-agent \ + --image agentic-agent:latest \ + --environment-variables \ + GITHUB_TOKEN=$GITHUB_TOKEN \ + GITHUB_REPO=$GITHUB_REPO \ + --ports 8080 +``` + +### 6. Configure Webhooks + +#### Automated Setup + +```bash +# Create webhook using script +python scripts/setup-webhook.py + +# Verify webhook +gh api repos/{owner}/{repo}/hooks +``` + +#### Manual Setup + +1. Go to repository Settings → Webhooks → Add webhook +2. Payload URL: `http://your-agent-url:8080/webhook` +3. Content type: `application/json` +4. Select events: + - Workflow runs + - Pull requests + - Issues +5. Click "Add webhook" + +### 7. Verify Setup + +Run the verification script: + +```bash +# Run full verification +python scripts/verify-setup.py + +# Expected output: +# ✓ GitHub authentication successful +# ✓ Repository accessible +# ✓ GitHub Actions enabled +# ✓ Agent service running +# ✓ Webhook configured +# ✓ Demo workflows present +# ✓ All checks passed! +``` + +### 8.
Run a Test Workflow + +```bash +# Trigger a test workflow +gh workflow run test-agent.yml + +# Watch the run +gh run watch + +# Check agent logs +python agent.py --logs --tail 50 +``` + +## Pre-Demo Checklist + +Run this checklist 10-15 minutes before your presentation: + +```bash +# Run the pre-demo check script +python scripts/pre-demo-check.py +``` + +Manual checks: +- [ ] Agent service is running +- [ ] Webhook is receiving events +- [ ] GitHub Actions has sufficient minutes +- [ ] All secrets are configured +- [ ] Demo branches are clean +- [ ] Terminal font size is readable (18pt+) +- [ ] Browser tabs are prepared +- [ ] Backup video/screenshots ready +- [ ] Internet connection stable + +## Troubleshooting + +### Agent Not Starting + +```bash +# Check Python version +python --version # Should be 3.8+ + +# Check dependencies +pip list + +# Reinstall dependencies +pip install --upgrade -r requirements.txt + +# Check for port conflicts +lsof -i :8080 # On macOS/Linux +netstat -ano | findstr :8080 # On Windows +``` + +### Webhook Not Receiving Events + +```bash +# Check webhook deliveries in GitHub +# Settings → Webhooks → Edit → Recent Deliveries + +# Check agent logs +tail -f logs/agent.log + +# Test webhook manually +curl -X POST http://localhost:8080/webhook \ + -H "Content-Type: application/json" \ + -d '{"test": true}' +``` + +### GitHub Actions Not Triggering + +```bash +# Check workflow files +gh workflow list + +# View workflow runs +gh run list + +# Check repository permissions +gh api repos/{owner}/{repo} --jq .permissions +``` + +### Docker Issues + +```bash +# Check Docker is running +docker ps + +# View container logs +docker logs agentic-agent + +# Restart container +docker restart agentic-agent + +# Rebuild if needed +docker build --no-cache -t agentic-agent:latest .
+``` + +## Quick Reset + +If something goes wrong during setup: + +```bash +# Stop all services +python scripts/stop-all.py + +# Clean up +rm -rf logs/ +rm -rf __pycache__/ +docker stop agentic-agent +docker rm agentic-agent + +# Start fresh +python scripts/setup-demo.py --fresh +``` + +## Environment Configurations + +### Minimal (for basic demos) +- GitHub repository +- Local Python agent +- No Docker required + +### Standard (recommended) +- GitHub repository +- Dockerized agent +- Local dashboard + +### Full (for workshops) +- GitHub repository +- Cloud-hosted agent +- Dashboard with metrics +- Multiple demo apps + +## Resource Requirements + +### Local Development +- CPU: 2+ cores +- RAM: 4GB+ +- Disk: 10GB free space +- Network: Stable internet connection + +### Cloud Deployment +- Azure: B1s or larger +- AWS: t3.small or larger +- GCP: e2-small or larger + +## Costs + +- GitHub: Free for public repositories +- Docker: Free for personal use +- Cloud hosting: ~$5-10/month (if running 24/7) + +**Demo tip:** Use cloud free tiers or shut down resources between demos to minimize costs. + +## Additional Resources + +- [Agent Configuration Reference](../resources/agent-config.md) +- [Troubleshooting Guide](../resources/troubleshooting.md) +- [Demo Scripts](../demos/README.md) +- [FAQ](../resources/faq.md) + +## Getting Help + +If you encounter issues during setup: +1. Check the [Troubleshooting Guide](../resources/troubleshooting.md) +2. Review [GitHub Discussions](https://github.com/microsoft/agentic-devops-sample/discussions) +3. Open an issue with setup logs +4. Ask in the community Slack/Discord + +## Next Steps + +Once setup is complete: +1. Review the [Speaker Guide](../speaker-guide.md) +2. Practice with the [Demo Scripts](../demos/README.md) +3. Customize the [Presentation](../presentation/slides.md) +4. 
Run through at least one full demo diff --git a/event-in-a-box/speaker-guide.md b/event-in-a-box/speaker-guide.md new file mode 100644 index 0000000..8d6962a --- /dev/null +++ b/event-in-a-box/speaker-guide.md @@ -0,0 +1,150 @@ +# Speaker Guide: Agentic DevOps + +## Overview + +This guide will help you deliver an engaging and informative presentation on agentic DevOps principles and practices. + +## Session Objectives + +By the end of this session, attendees should be able to: +1. Understand what agentic DevOps means and its benefits +2. Identify key components of an agentic DevOps pipeline +3. Recognize opportunities to apply agentic approaches in their workflows +4. Implement basic agentic automation patterns + +## Key Messages + +### What is Agentic DevOps? + +Agentic DevOps refers to intelligent, autonomous systems that can make decisions and take actions within DevOps workflows with minimal human intervention. + +**Key characteristics:** +- Self-directed automation +- Context-aware decision making +- Learning from patterns and outcomes +- Proactive problem resolution + +### Why Agentic DevOps? + +1. **Faster Response Times**: Automated agents respond to incidents immediately +2. **Reduced Toil**: Repetitive tasks are handled autonomously +3. **Improved Reliability**: Consistent execution reduces human error +4. **Enhanced Scalability**: Systems adapt to changing demands automatically + +## Session Flow (60 minutes) + +### 1. Introduction (5 minutes) +- Introduce yourself and the topic +- Set expectations for the session +- Poll the audience about their current DevOps maturity + +### 2. Agentic DevOps Fundamentals (15 minutes) +- Define agentic DevOps +- Contrast with traditional DevOps +- Discuss the evolution from automation to autonomous systems +- Share real-world use cases + +**Speaker Notes:** +- Use the comparison slide to highlight differences +- Encourage questions throughout +- Share personal experiences if applicable + +### 3. 
Core Components (15 minutes) +- Intelligent agents and their capabilities +- Integration patterns +- Decision-making frameworks +- Monitoring and feedback loops + +**Demo Checkpoint:** Show a simple agent in action + +### 4. Implementation Strategies (15 minutes) +- Getting started with agentic approaches +- Common patterns and anti-patterns +- Security and governance considerations +- Measuring success + +**Interactive Element:** Ask audience to share their challenges + +### 5. Hands-on Demo (5 minutes) +- Walk through the sample application +- Show agent configuration and deployment +- Demonstrate autonomous decision-making + +### 6. Q&A and Wrap-up (5 minutes) +- Answer questions +- Share additional resources +- Provide next steps + +## Tips for Success + +### Before the Session +- [ ] Test all demos in the presentation environment +- [ ] Verify all links and resources are accessible +- [ ] Have backup plans for technical issues +- [ ] Review the FAQ for common questions +- [ ] Prepare extra examples for different experience levels + +### During the Session +- Start with an engaging hook or question +- Use real-world examples that resonate with your audience +- Encourage interaction and questions +- Keep demos short and focused +- Monitor time carefully +- Be prepared to adjust pace based on audience engagement + +### After the Session +- Share resources and slides with attendees +- Collect feedback for future improvements +- Follow up on unanswered questions +- Document any issues encountered + +## Common Questions + +**Q: How is this different from regular automation?** +A: Traditional automation follows predefined scripts. Agentic systems can interpret context, make decisions, and adapt their behavior based on the situation. + +**Q: Do we need AI/ML expertise?** +A: Not necessarily. Many agentic patterns can be implemented using rule-based systems and existing tooling. AI/ML enhances capabilities but isn't always required. 
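If the audience asks what a rule-based agent might look like in practice, a minimal sketch such as the following can help. It is illustrative only: the `Rule` class, the `decide` function, the log patterns, and the action names are all hypothetical and are not part of this repository's agent.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One rule: if the log text matches `pattern`, propose `action`."""
    pattern: str
    action: str


# Hypothetical rules for illustration; real patterns depend on your pipeline.
RULES = [
    Rule(r"connection (timed out|reset)", "retry_job"),
    Rule(r"no space left on device", "clean_runner_cache"),
    Rule(r"ModuleNotFoundError", "reinstall_dependencies"),
]


def decide(log_text: str) -> Optional[str]:
    """Return the action of the first matching rule, or None to escalate to a human."""
    for rule in RULES:
        if re.search(rule.pattern, log_text, flags=re.IGNORECASE):
            return rule.action
    return None


if __name__ == "__main__":
    print(decide("Error: Connection timed out after 30s"))
    print(decide("Segmentation fault"))
```

Returning `None` for unmatched logs keeps the agent inside explicit guardrails: anything the rules do not recognize is handed to a person rather than guessed at.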
+ +**Q: What about security and compliance?** +A: Agentic systems should operate within defined guardrails. Implement proper authentication, authorization, and audit logging. Start with low-risk scenarios. + +**Q: How do we get started?** +A: Begin with a specific pain point in your pipeline. Implement a simple agent to address it. Learn from the experience, then expand gradually. + +## Customization Tips + +- Adjust technical depth based on audience expertise +- Include industry-specific examples when possible +- Extend or shorten sections based on time available +- Add local success stories or case studies +- Incorporate your organization's tools and practices + +## Presenter Checklist + +Before you present: +- [ ] Environment setup complete +- [ ] Demos tested and working +- [ ] Slides customized for audience +- [ ] Backup materials ready +- [ ] Contact information added to slides +- [ ] Feedback mechanism prepared +- [ ] Recording setup (if applicable) +- [ ] Q&A strategy planned + +## Resources for Presenters + +- [Demo Scripts](./demos/README.md) +- [Slide Deck](./presentation/slides.md) +- [Technical Setup](./setup/demo-setup.md) +- [Troubleshooting Guide](./resources/troubleshooting.md) + +## Feedback and Improvement + +After your presentation, please contribute back: +- What worked well? +- What questions came up? +- What demos resonated most? +- What could be improved? + +Share your insights to help improve this event in a box for future speakers. diff --git a/event-in-a-box/workshop/README.md b/event-in-a-box/workshop/README.md new file mode 100644 index 0000000..7a3c99b --- /dev/null +++ b/event-in-a-box/workshop/README.md @@ -0,0 +1,355 @@ +# Workshop Materials + +## Overview + +This directory contains hands-on exercises and lab materials for the Agentic DevOps workshop. 
+ +## Workshop Structure + +### Duration Options + +- **Half-day (4 hours)**: Core concepts + 2 labs +- **Full-day (8 hours)**: All concepts + 4 labs + project +- **Multi-day**: Deep dive with advanced topics + +## Lab Exercises + +### Lab 1: Build Your First Agent (60 minutes) +**Difficulty:** Beginner +**File:** [lab1-first-agent.md](./lab1-first-agent.md) + +Create a simple monitoring agent that responds to GitHub webhook events. + +**Learning Objectives:** +- Understand agent architecture +- Handle webhook events +- Implement basic decision logic +- Execute automated actions + +### Lab 2: Pattern Matching and Response (60 minutes) +**Difficulty:** Beginner-Intermediate +**File:** [lab2-pattern-matching.md](./lab2-pattern-matching.md) + +Build an agent that recognizes error patterns and applies appropriate fixes. + +**Learning Objectives:** +- Pattern recognition techniques +- Regular expressions for log analysis +- Decision trees +- Action mapping + +### Lab 3: Self-Healing Pipeline (90 minutes) +**Difficulty:** Intermediate +**File:** [lab3-self-healing.md](./lab3-self-healing.md) + +Implement a complete self-healing CI/CD pipeline with multiple recovery strategies. + +**Learning Objectives:** +- Failure detection +- Root cause analysis +- Remediation strategies +- Rollback mechanisms + +### Lab 4: Multi-Agent System (90 minutes) +**Difficulty:** Advanced +**File:** [lab4-multi-agent.md](./lab4-multi-agent.md) + +Create a system with multiple specialized agents working together. 
+ +**Learning Objectives:** +- Agent coordination +- Message passing +- Shared state management +- Conflict resolution + +## Prerequisites + +### For Participants + +**Required Knowledge:** +- Basic programming (Python or JavaScript) +- Git fundamentals +- Command line basics +- Understanding of CI/CD concepts + +**Required Setup:** +- Laptop with development environment +- GitHub account +- Git installed +- Python 3.8+ or Node.js 16+ +- Text editor or IDE +- Docker Desktop (for advanced labs) + +### For Instructors + +**Preparation Time:** 2-3 hours before workshop + +**Required:** +- Complete all labs yourself +- Test all code samples +- Prepare demo repository +- Set up participant accounts/access +- Verify all links and resources +- Prepare troubleshooting guide +- Have backup materials ready + +## Workshop Schedule + +### Half-Day Format (4 hours) + +``` +09:00 - 09:15 | Welcome & Introduction +09:15 - 10:00 | Presentation: Agentic DevOps Fundamentals +10:00 - 10:15 | Break +10:15 - 11:15 | Lab 1: Build Your First Agent +11:15 - 11:30 | Break +11:30 - 12:30 | Lab 2: Pattern Matching +12:30 - 13:00 | Q&A and Wrap-up +``` + +### Full-Day Format (8 hours) + +``` +09:00 - 09:30 | Welcome & Introduction +09:30 - 10:30 | Presentation: Agentic DevOps +10:30 - 10:45 | Break +10:45 - 11:45 | Lab 1: Build Your First Agent +11:45 - 13:00 | Lunch +13:00 - 14:00 | Lab 2: Pattern Matching +14:00 - 14:15 | Break +14:15 - 15:45 | Lab 3: Self-Healing Pipeline +15:45 - 16:00 | Break +16:00 - 17:00 | Lab 4: Multi-Agent System +17:00 - 17:30 | Showcase & Wrap-up +``` + +## Instructor Guide + +### Before the Workshop + +1. **Setup Checklist** (1 week before) + - [ ] Reserve venue/virtual space + - [ ] Send calendar invites + - [ ] Share prerequisite installation guide + - [ ] Prepare GitHub organization/repositories + - [ ] Test all lab materials + - [ ] Print/share handouts + +2. 
**Technical Preparation** (1 day before) + - [ ] Test WiFi/connectivity + - [ ] Verify projection/screen sharing + - [ ] Clone repositories for participants + - [ ] Prepare backup USB drives + - [ ] Test all demo scripts + - [ ] Have offline documentation ready + +3. **Day of Workshop** + - [ ] Arrive early for setup + - [ ] Test AV equipment + - [ ] Have contact info displayed + - [ ] Prepare ice breaker activity + - [ ] Review emergency contacts + +### During the Workshop + +**Pacing Tips:** +- Start with a quick poll of experience levels +- Adjust depth based on audience +- Use timers for labs +- Walk around to help participants +- Identify common issues early +- Pair up struggling participants with others +- Keep energy high with breaks and movement + +**Engagement Strategies:** +- Ask questions frequently +- Encourage pair programming +- Create friendly competition +- Share real-world stories +- Use humor appropriately +- Celebrate small wins + +**Handling Questions:** +- Answer briefly, offer to discuss details in breaks +- Use "parking lot" for off-topic questions +- Turn questions into learning opportunities +- Don't fake knowledge - it's OK to say "I don't know" + +### After the Workshop + +1. **Immediate Follow-up** + - Share slides and materials + - Send feedback survey + - Answer outstanding questions + - Share photos (with permission) + +2. **Within a Week** + - Review feedback + - Update materials based on feedback + - Send additional resources + - Offer office hours for questions + +3. 
**Long-term** + - Stay connected via community channels + - Share new resources + - Track participant success stories + - Iterate on materials + +## Lab Materials + +### What's Provided + +Each lab includes: +- **Instructions**: Step-by-step guide +- **Starter Code**: Template to begin with +- **Solution Code**: Reference implementation +- **Test Cases**: Validation scripts +- **Troubleshooting**: Common issues and fixes + +### Repository Structure + +``` +workshop/ +├── lab1-first-agent/ +│ ├── README.md +│ ├── starter/ +│ ├── solution/ +│ └── tests/ +├── lab2-pattern-matching/ +│ ├── README.md +│ ├── starter/ +│ ├── solution/ +│ └── tests/ +├── lab3-self-healing/ +│ ├── README.md +│ ├── starter/ +│ ├── solution/ +│ └── tests/ +└── lab4-multi-agent/ + ├── README.md + ├── starter/ + ├── solution/ + └── tests/ +``` + +## Assessment and Certification + +### Lab Completion Criteria + +Each lab must demonstrate: +- [ ] Functional code that meets requirements +- [ ] Proper error handling +- [ ] Tests passing +- [ ] Documentation complete + +### Optional Certification + +Participants who complete all labs and pass the final assessment can receive: +- Certificate of completion +- Digital badge +- LinkedIn endorsement +- Access to advanced materials + +## Troubleshooting Guide + +### Common Participant Issues + +**"My agent isn't receiving webhooks"** +- Check webhook configuration +- Verify URL is accessible +- Check firewall/network settings +- Try ngrok or similar tunneling tool + +**"The code doesn't work on my machine"** +- Verify all prerequisites installed +- Check Python/Node version +- Review environment variables +- Try the Docker alternative + +**"I'm stuck on step X"** +- Pair with another participant +- Review solution hints +- Ask instructor for help +- Skip to next checkpoint + +**"This is too fast/slow"** +- Provide 
optional challenges for fast learners +- Offer simplified paths for struggling participants +- Use flexible pacing +- Have extension activities ready + +## Adaptations + +### For Different Audiences + +**Executives/Non-Technical:** +- Focus on concepts over coding +- Use no-code tools +- Emphasize business value +- Include more demos, fewer labs + +**Experienced Engineers:** +- Move faster through basics +- Add advanced challenges +- Discuss architecture tradeoffs +- Encourage experimentation + +**Mixed Experience Levels:** +- Pair experienced with beginners +- Offer multiple difficulty paths +- Provide extension challenges +- Use differentiated instruction + +### For Virtual Delivery + +**Adjustments:** +- Shorten sessions (max 90 min blocks) +- More frequent breaks +- Use breakout rooms for labs +- Increase interaction +- Record sessions for later viewing +- Provide better written instructions + +**Tools:** +- Zoom/Teams for video +- Slack/Discord for chat +- GitHub Codespaces for uniform environments +- Miro/Mural for collaboration +- Polling tools for engagement + +## Success Metrics + +### For Participants + +- % completing all labs +- Average assessment scores +- Post-workshop survey results +- Implementation rate (did they use it?) + +### For Organization + +- ROI in automation time saved +- Incidents prevented/resolved +- Team satisfaction improvement +- Adoption rate + +## Additional Resources + +- [Pre-Workshop Setup Guide](./pre-workshop-setup.md) +- [Instructor Tips & Tricks](./instructor-guide.md) +- [Participant Handbook](./participant-handbook.md) +- [Troubleshooting Database](./troubleshooting.md) + +## Feedback and Improvement + +After each workshop, please document: +- What went well +- What needs improvement +- Common stumbling blocks +- New questions that arose +- Ideas for new content + +Share feedback via: +- GitHub Issues +- Community Forums +- Direct contact with maintainers