| name | optimize-performance | |||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| description | Identify and fix performance issues using pattern recognition and language-idiomatic best practices | |||||||||||||||||||
| agent | agent | |||||||||||||||||||
| argument-hint | Optional: specify performance target (latency, throughput, memory) or constraints | |||||||||||||||||||
| tools |
|
Systematically analyze code for performance issues using pattern recognition and static analysis. Identify known anti-patterns, algorithmic inefficiencies, and opportunities for language-idiomatic optimizations. Suggests improvements following best practices for the detected language/framework, and implements changes only after user approval. Explicitly identifies breaking vs. non-breaking changes and validates that optimizations improve performance through testing.
- Accept target code from
${selection},${file}, or${input:target:Describe performance goals or specify target code (optional)}. - Optional profiling data: If user provides profiling output, benchmark results, or performance metrics as context, prioritize optimizations based on that data.
- Proactively gather project context:
- Language/framework detection: file extensions, manifests (package.json, Cargo.toml, go.mod, requirements.txt, etc.)
- Performance constraints:
README.md,CONTRIBUTING.md,.github/**/*,docs/**/*, SLA documentation - Existing benchmarks/tests: performance test files, benchmark configurations, CI/CD performance gates
- Runtime environment: deployment target (server, browser, mobile, embedded), resource constraints
- Dependencies: framework versions, available performance libraries, optimization tools
- Architecture patterns: caching strategies, database query patterns, async/concurrency usage
- Determine optimization scope:
- If user specifies targets (e.g., "reduce latency," "optimize memory") → focus on those metrics
- If profiling data provided → prioritize issues identified in profiling results
- If no target specified → perform comprehensive static analysis for known anti-patterns
- If unclear → ask about performance goals, acceptable tradeoffs, and critical constraints
Use sequential thinking (MCP tool: sequentialthinking) for complex performance analysis:
- Understand the code's purpose: What does this code do? What's its expected usage pattern?
- Identify performance context: Is this hot path or cold path? What's the likely execution frequency?
- Detect language/framework: What language is this? What are the idiomatic performance patterns?
- Check for profiling data: Did user provide profiling output, benchmarks, or metrics?
- Analyze code structure: What algorithms, data structures, and patterns are used?
Critical decision point: Does this code have a performance problem?
✅ **No optimization needed**
Code follows performance best practices:
- [Observation 1: e.g., using efficient algorithms (O(n log n))]
- [Observation 2: e.g., proper data structures for the use case]
- [Observation 3: e.g., idiomatic patterns for the language/framework]
**Why optimization would be premature or counterproductive**:
- [Reason 1: e.g., not a hot path based on code structure]
- [Reason 2: e.g., algorithmic complexity already optimal]
- [Reason 3: e.g., further optimization would sacrifice readability without clear benefit]
**Recommendation**: [If performance becomes a concern, suggest profiling to identify actual bottlenecks]
Stop here. Don't optimize code that doesn't need it.
Proceed to step 2.
Detect language and identify common performance anti-patterns and optimization opportunities:
- Common anti-patterns: Event loop blocking, inefficient object creation, closure overhead, DOM manipulation in loops, excessive re-renders, using
deleteoperator, unnecessaryforEachin hot paths - Optimization patterns: Memoization, debouncing/throttling, virtual scrolling, Web Workers, lazy loading, tree shaking,
Object.freezefor immutability,WeakMapfor metadata,for-ofoverforEachin hot loops - Static checks: Look for synchronous I/O, missing pagination, unbounded loops, inefficient string concatenation
- Common anti-patterns: String concatenation with
+in loops, list comprehensions when generators suffice, global variable lookups, unnecessary object copies, missing__slots__, quadratic nested loops - Optimization patterns: Use built-ins (
map,filter,any,all),''.join()for strings, generators for large datasets, NumPy/Pandas for numerical data,functools.lru_cache,__slots__for classes, list/dict comprehensions - Static checks: Look for N+1 database queries, missing indexes on iterations, inefficient data structures (list search instead of set/dict)
- Common anti-patterns: String concatenation with
+, not preallocating slices,deferin tight loops, unnecessary interface{} usage, missing buffer reuse, goroutine leaks (no context cancellation) - Optimization patterns:
strings.Builderfor string building,make([]T, 0, capacity)for preallocation,sync.Poolfor object reuse, value receivers for small types, avoidinterface{}in hot paths - Static checks: Look for slice growth in loops, missing capacity hints, unnecessary heap allocations, inefficient map operations
- Common anti-patterns: Unnecessary
.clone()calls, missingVec::with_capacity, usingBox/Rcwhen stack allocation works, inefficient iterator chains, dynamic dispatch (dyn Trait) in hot paths - Optimization patterns: Borrowing over cloning,
Vec::with_capacityfor known sizes,Cowfor conditional ownership, iterator adapters (filter,map,fold),#[inline]for small functions, zero-copy parsing - Static checks: Look for
.clone()that could be borrows, missing capacity hints, unnecessary allocations, suboptimal iterator usage
- Common anti-patterns: Autoboxing in loops, using
LinkedListinstead ofArrayList, inefficient string concatenation, excessive reflection, unnecessary synchronization, creating streams for small collections - Optimization patterns: Primitive arrays for numerical data,
StringBuilderfor string building,ArrayListby default,EnumSet/EnumMapfor enums, stream optimization, parallel streams for large datasets - Static checks: Look for boxing/unboxing, inefficient collection usage, string concatenation with
+, unnecessary object creation
- Common anti-patterns: Unnecessary heap allocations, missing move semantics, copying large objects, inefficient memory layout, virtual calls in tight loops, missing
constandconstexpr - Optimization patterns: Move semantics (
std::move),reserve()for vectors,emplace_backoverpush_back, RAII,constexprfor compile-time computation, value semantics for small types - Static checks: Look for unnecessary copies, missing moves, heap allocations that could be stack, inefficient container operations
- Common anti-patterns: Boxing value types, excessive LINQ on hot paths, string concatenation with
+, misusing async/await (async void, unnecessary Task), creating arrays for temporary data - Optimization patterns:
Span<T>for memory slicing,ArrayPoolfor buffer reuse,ValueTaskfor hot async paths,structfor small value types,stackallocfor small buffers,StringBuilder - Static checks: Look for boxing, LINQ in loops, string allocations, unnecessary async, array allocations that could use pooling
- Common anti-patterns: String allocations (not freezing strings), inefficient Hash lookups, using
#mapfor side effects only, excessive object creation, missing memoization - Optimization patterns: Frozen string literals, symbol keys for hashes,
#fetchwith default block,#eachvs#mapappropriately, precompute constants, memoization with||= - Static checks: Look for unfrozen strings in loops,
#mapwithout using results, inefficient enumerable methods
If language not listed: Research language-specific performance patterns and common anti-patterns before proceeding.
Analyze code structure and identify optimization opportunities:
- Use profiling data if available: Prioritize issues identified in user-provided profiling output or benchmarks
- Analyze algorithmic complexity: What's the Big O? Can we use better algorithms/data structures? Look for O(n²) or worse in loops
- Identify anti-patterns: Match code against language-specific anti-patterns from section 2
- Check I/O patterns: Unnecessary network/disk I/O? Missing batching/caching? N+1 query problems?
- Inspect memory patterns: Excessive allocations visible in code? Unnecessary object creation? Missing object reuse?
- Review concurrency usage: Blocking operations? Missing async/parallelization opportunities? Obvious race conditions?
Common performance anti-patterns:
- Premature pessimization: Using slow approach when fast one is equally simple
- N+1 queries: Database/API calls in loops
- Unbounded operations: No pagination, limit, or timeout
- Inefficient data structures: Array search when hash map is better
- Repeated computation: Same calculation multiple times without caching
- Synchronous blocking: Not using async/concurrency when beneficial
- Large payloads: Transferring/parsing more data than needed
- Poor cache locality: Random memory access patterns
Prioritize optimizations by impact × likelihood × safety:
- Algorithm/data structure improvements (O(n²) → O(n log n))
- Removing redundant work (duplicate calculations, unnecessary allocations)
- Adding caching for expensive pure functions
- Database query optimization (indexes, query consolidation, connection pooling)
- Using language built-ins over manual implementations
- Preallocation of collections with known size
- Concurrency/parallelization (verify thread safety, avoid race conditions)
- Lazy loading/eager loading tradeoffs
- Memory layout optimization (struct packing, cache alignment)
- Micro-optimizations in hot loops (loop unrolling, reduced branching)
- Breaking API changes (signature changes, return type changes)
- Architecture changes (sync → async, blocking → non-blocking)
- Unsafe code (manual memory management, unchecked operations)
- Platform-specific optimizations (SIMD, assembly, compiler intrinsics)
Tradeoff considerations:
- Performance vs. Readability: Prefer clear code unless profiling justifies complexity
- Performance vs. Maintainability: Avoid clever optimizations that obscure intent
- Performance vs. Portability: Document platform-specific assumptions
- Performance vs. Safety: Never sacrifice correctness for speed
- Time complexity vs. Space complexity: Consider memory constraints
- Optimization effort vs. Actual gain: Focus on bottlenecks, not marginal improvements
Before implementing optimizations, classify changes:
- Internal algorithm changes maintaining same output
- Adding optional parameters with defaults
- Performance improvements without API changes
- Adding caching/memoization transparently
- Optimizing internal data structures
- Signature changes: Parameter types, return types, parameter order
- Behavior changes: Different output format, error handling, side effects
- Semantic changes: Async → sync or vice versa, blocking behavior
- Contract changes: Precondition/postcondition modifications
- Dependency changes: Minimum version bumps, removed dependencies
For breaking changes, document:
- What breaks: Specific API contracts that change
- Migration path: How users update their code
- Alternatives: Can we achieve performance without breaking changes?
- Justification: Is the performance gain worth the disruption?
Before writing code:
- Document optimization rationale: "By doing X, we address Y anti-pattern and expect improvement in Z"
- Suggest benchmarking: Recommend creating/running benchmarks to validate the improvement
- Present plan to user: Show proposed changes, breaking/non-breaking classification, expected impact
- Get explicit approval: Especially for breaking changes or complex optimizations
Implementation principles:
- One optimization at a time: Easier to measure impact and rollback if needed
- Preserve correctness: All tests must pass (or be updated appropriately)
- Maintain code quality: Optimized code should still be readable and maintainable
- Add comments: Explain non-obvious optimizations (why, not just what)
- Follow language idioms: Use the idiomatic "fast path" for the language
- Keep escape hatches: Allow users to opt into slower but safer behavior if needed
After implementing optimizations:
- Run existing tests: Verify correctness - all tests pass, behavior unchanged (or intentionally changed)
- Run/create benchmarks: If benchmarks exist, compare before/after. If not, suggest creating them to validate improvement
- Check edge cases: Test with various input sizes (small, medium, large), error conditions
- Verify algorithmic improvement: Confirm the complexity change (e.g., O(n²) → O(n log n)) is reflected in practice
- Suggest performance validation: Recommend how to measure the improvement in the target environment
Success criteria:
- All tests passing (correctness preserved)
- Code quality maintained or improved
- Anti-patterns eliminated
- Breaking changes documented and approved
- Benchmarks show improvement (if run), or clear recommendation for benchmarking provided
## Performance Analysis
**Language/Framework**: [Detected language and version]
**Analysis Method**: [Static code analysis / Pattern recognition / Based on provided profiling data]
**Profiling Data** (if provided by user):
- [Metric from profiling: e.g., Function X takes 450ms (65% of total time)]
- [Or: "No profiling data provided - analysis based on code patterns"]
**Identified Issues**:
1. **[Issue 1]** (`file.ext:L42-L58`) - [Severity: Critical/High/Medium/Low]
- **Anti-pattern**: [Specific anti-pattern, e.g., "O(n²) nested loop for deduplication"]
- **Evidence**: [Code structure analysis or profiling data if available]
- **Impact**: [Expected performance impact based on algorithmic analysis]
2. **[Issue 2]** (`file.ext:L89`) - [Severity]
- **Anti-pattern**: [Specific issue]
- **Evidence**: [Code analysis or profiling data]
- **Impact**: [Expected impact]
## Proposed Optimizations
### Optimization 1: [Name, e.g., "Replace nested loop with HashSet"]
**Location**: `file.ext:L42-L58`
**Change Type**: ✅ Non-Breaking | ⚠️ Breaking
**Approach**: [Detailed explanation of the optimization]
**Expected Improvement**: [Estimated improvement based on algorithmic analysis, e.g., "Reduces complexity from O(n²) to O(n), significant improvement for large inputs (n > 1000)"]
**Tradeoffs**:
- ✅ Pro: [Benefit]
- ✅ Pro: [Benefit]
- ⚠️ Con: [Drawback, e.g., "Increased memory usage by O(n)"]
**Code Preview**:
```[language]
// Before (simplified)
[relevant original code snippet]
// After (proposed)
[optimized code snippet][Repeat structure]
[If any breaking changes exist, list them clearly:]
| Change | Location | Impact | Migration |
|---|---|---|---|
| [Change description] | file:line |
[What breaks] | [How to fix] |
OR
✅ All proposed optimizations are non-breaking changes.
Proceed with optimizations?
- Low-risk changes (non-breaking): [List them]
- High-risk changes (breaking or complex): [List them]
- Suggested approach: [Recommend which to implement first, or if all should be applied]
Awaiting your approval to implement these changes.
### Phase 2: Implementation Results (After Approval & Implementation)
```markdown
## Performance Optimization Results
### Changes Implemented
**[Optimization 1 Name]** (`file.ext:L42-L58`)
- ✅ Non-Breaking | ⚠️ Breaking
- [Brief description of what was changed]
**[Optimization 2 Name]** (`file.ext:L89-L95`)
- ✅ Non-Breaking | ⚠️ Breaking
- [Brief description of what was changed]
### Performance Improvements
**Benchmarks** (if run):
| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| [Metric 1] | [Value] | [Value] | [% improvement] ✅ |
| [Metric 2] | [Value] | [Value] | [% improvement] ✅ |
**OR if benchmarks not run:**
**Expected Improvements**:
- [Algorithmic improvement 1: e.g., Reduced complexity from O(n²) to O(n)]
- [Anti-pattern fixed 1: e.g., Eliminated N+1 query problem]
- [Language-specific improvement: e.g., Replaced string concatenation in loop with StringBuilder]
**Validation Method**: [Describe how improvement was verified: existing tests, new benchmarks created, or recommended benchmarking approach]
### Code Changes Explained
#### [Optimization 1: Detailed Explanation]
**Before**:
```[language]
[Original code with context]
After:
[Optimized code with context]
Why This Improves Performance:
- [Reason 1: Algorithmic improvement, reduced complexity from O(n²) to O(n)]
- [Reason 2: Eliminated known anti-pattern (e.g., unnecessary allocations)]
- [Reason 3: Uses language-idiomatic pattern (e.g., "Leverages Go's map implementation which is optimized for this use case")]
- [Reason 4: Expected impact based on code analysis]
[Repeat structure]
- ✅ All existing tests pass
- ✅ Edge cases verified (empty input, large input, error conditions)
- ✅ Performance benchmarks [run and show improvement / created for future validation / recommended]
- [Additional test notes if relevant]
[If breaking changes were approved and implemented:]
Change: [What changed]
Before (old API):
[Old usage example]
After (new API):
[New usage example]
Why: [Performance justification]
OR if no breaking changes:
✅ No breaking changes. All optimizations maintain backward compatibility.
- [Optional: Next optimization opportunities, e.g., "Consider adding caching layer for database queries"]
- [Optional: Monitoring suggestions, e.g., "Add metrics for endpoint latency to track performance in production"]
- [Optional: Further profiling, e.g., "Profile under production load to identify additional bottlenecks"]
### If no optimization is needed:
```markdown
✅ **Code follows performance best practices**
**Analysis Results**:
- [Finding 1: e.g., Using optimal algorithm for the use case (O(n log n) for sorting)]
- [Finding 2: e.g., Proper data structures (HashSet for lookups)]
- [Finding 3: e.g., Language-idiomatic patterns applied]
**What was checked**:
- [Check 1: Algorithmic complexity analysis]
- [Check 2: Language-specific anti-patterns]
- [Check 3: Memory allocation patterns]
**Why optimization would be premature or counterproductive**:
- [Reason 1: e.g., "Already using optimal algorithm"]
- [Reason 2: e.g., "No obvious anti-patterns detected"]
- [Reason 3: e.g., "Further optimization would sacrifice readability without clear benefit"]
**Next Steps** (if performance becomes a concern):
- [Recommendation: e.g., "Profile with realistic workload to identify actual bottlenecks"]
- [Recommendation: e.g., "Add benchmarks to track performance over time"]
✅ DO optimize when:
- Code uses known anti-patterns (e.g., N+1 queries, O(n²) algorithms on unbounded input, language-specific anti-patterns)
- Profiling data (if provided) shows clear bottlenecks
- Performance fails to meet documented SLAs or user expectations
- User explicitly requests optimization with clear targets
- Algorithmic complexity is clearly suboptimal for the use case
- Language-idiomatic patterns are not being used
❌ DON'T optimize when:
- No obvious anti-patterns or inefficiencies detected
- Algorithmic complexity is already optimal
- Code follows language-idiomatic best practices
- Optimization would sacrifice correctness or security
- Effort outweighs benefit (micro-optimizations for cold paths without data)
- The likely bottleneck is external (network, database, third-party API) based on code structure
- Pattern-based: Identify and eliminate known anti-patterns and inefficiencies
- Idiomatic: Use language-specific best practices and optimization patterns
- Algorithmic: Prioritize algorithmic improvements (better Big O) over micro-optimizations
- Safe: Preserve correctness, maintain test coverage
- Pragmatic: Focus on issues with clear impact, avoid premature optimization
- Documented: Explain why optimizations work and what anti-patterns were addressed
- Testable: Ensure changes can be validated through tests and benchmarks
- Micro-benchmarks unreliable: Test with realistic workloads and data
- Different performance on different platforms: Profile on target environment
- Performance regression in edge cases: Test with various input sizes and types
- Concurrency issues: Verify thread safety, avoid race conditions
- Resource constraints: Consider memory-limited, CPU-limited environments separately
- Uncertain about impact?: Recommend profiling or benchmarking to validate the optimization
- Multiple optimization approaches?: Present tradeoffs with pros/cons
- Breaking changes necessary?: Explain why and provide migration path
- User's performance goals unclear?: Ask about SLAs, acceptable latency, throughput targets, or typical workload
- No obvious anti-patterns?: State findings clearly and suggest profiling if performance is still a concern
- Optimization implemented?: Recommend benchmarking to measure actual impact
- Premature optimization: Optimizing code that doesn't have clear issues or anti-patterns
- Micro-optimization: Focusing on insignificant gains when algorithmic improvements are available
- Clever code: Sacrificing readability for marginal performance without data to support it
- Ignoring algorithmic complexity: Focusing on low-level optimizations when algorithm is suboptimal
- Assuming impact: Claiming specific performance gains without benchmarking data
- Breaking correctness: Trading bugs for speed
- Over-engineering: Adding complexity without clear benefit
When optimizing, leverage these idiomatic patterns:
- JavaScript/TypeScript:
Object.freezefor immutability,WeakMapfor metadata, avoiddeleteoperator, usefor-ofoverforEachin hot loops - Python: List/dict comprehensions,
itertools,functools.lru_cache,__slots__, NumPy for numerical ops - Go: Preallocate slices with
make([]T, 0, capacity), usestrings.Builder, prefer value receivers for small types, avoidinterface{}in hot paths - Rust: Iterator adapters (
filter,map,fold),Vec::with_capacity, avoidBox/Rcwhen stack allocation works, use#[inline]judiciously - Java:
ArrayListoverLinkedListby default,EnumSet/EnumMapfor enums,StringBuilder, primitive streams - C++: Move semantics,
reserve()for vectors,emplace_backoverpush_back, RAII,constexpr
When recommending benchmarks to validate optimizations:
- Realistic workloads: Test with production-like data and scenarios
- Warm up period: Account for JIT compilation, cache warming (especially for JVM, .NET)
- Multiple iterations: Run benchmarks multiple times for statistical significance
- Isolate variables: Change one optimization at a time to measure impact
- Target environment: Benchmark in production-like environment when possible
- Benchmark tools: Suggest language-appropriate tools (e.g., Criterion for Rust, JMH for Java, pytest-benchmark for Python)
- Reduce allocations: Preallocate, reuse objects, object pooling
- Optimize data layout: Struct packing, cache-friendly access patterns
- Use value types: Avoid heap allocations for small types (stack allocation)
- Lazy loading: Defer expensive initialization until needed
- Streaming: Process data incrementally instead of loading all at once
- Query optimization: Indexes, query consolidation, pagination
- Connection pooling: Reuse database connections
- Batching: Combine multiple operations
- Caching: Memoize expensive queries
- Async I/O: Non-blocking operations for I/O-bound code
Before delivering optimization results:
- Identified issues are based on recognized anti-patterns or profiling data (if provided)
- Language-idiomatic patterns and best practices were applied
- Algorithmic improvements are sound (Big O analysis is correct)
- Tests pass (correctness preserved)
- Benchmarks run (if available) or benchmarking approach recommended
- Breaking changes are clearly documented
- User approval obtained before implementing breaking changes
- Code quality maintained (readable, maintainable optimizations)
- Edge cases tested (various input sizes, error conditions)
- Optimization aligns with user's stated or implied performance goals
Philosophy: Focus on eliminating known anti-patterns and algorithmic inefficiencies first. The best optimizations are based on sound analysis, respect language idioms, maintain code clarity, and can be validated through testing. When profiling data is available, use it to prioritize; otherwise, rely on pattern recognition and complexity analysis. Never sacrifice correctness for performance.