Merged
23 commits
0e8b08c
Initial plan
Copilot Aug 7, 2025
ff81f99
Initial plan
Copilot Aug 7, 2025
f5011db
Implement improved dependency management for AzureDataLakeManagement …
Copilot Aug 7, 2025
2acf7e6
Add comprehensive GitHub Copilot instructions for Azure Data Lake Man…
Copilot Aug 7, 2025
9ab562c
Merge pull request #33 from SteveCInVA/copilot/fix-32
SteveCInVA Nov 4, 2025
96a7b61
Merge pull request #31 from SteveCInVA/copilot/fix-30
SteveCInVA Nov 4, 2025
8139773
Initial plan
Copilot Nov 4, 2025
19b73ac
Add dev container support with PowerShell and required VS Code extens…
Copilot Nov 4, 2025
4b4209a
Add comprehensive tests for dev container configuration
Copilot Nov 4, 2025
11a7119
Clean up test file - remove empty AfterAll block
Copilot Nov 4, 2025
6cb9e14
README with module usage and dependency management details, improve t…
SteveCInVA Nov 4, 2025
71f4fd0
Update module version and PowerShell version in module manifest
SteveCInVA Nov 4, 2025
2009432
Merge pull request #35 from SteveCInVA/copilot/add-dev-container-support
SteveCInVA Nov 4, 2025
ea7d2b2
Initial plan
Copilot Nov 4, 2025
692dead
Migrate from AzureAD to Microsoft.Graph for PowerShell 7+ compatibility
Copilot Nov 4, 2025
950cc0c
Add comprehensive Microsoft.Graph migration tests
Copilot Nov 4, 2025
15eaf15
Address code review feedback: Improve property access and error handling
Copilot Nov 4, 2025
71ac503
Add comprehensive migration guide for AzureAD to Microsoft.Graph
Copilot Nov 4, 2025
89319c2
Update module dependencies to replace AzureAD with Microsoft.Graph mo…
SteveCInVA Nov 4, 2025
9338b62
Fix PowerShell default version setting in devcontainer configuration
SteveCInVA Nov 4, 2025
247a997
Merge pull request #37 from SteveCInVA/copilot/fix-azuread-module-com…
SteveCInVA Nov 4, 2025
693be52
Fixed spelling error
SteveCInVA Nov 4, 2025
ff08462
Add support for -whatif and -confirm + remove warnings.
SteveCInVA Nov 7, 2025
46 changes: 46 additions & 0 deletions .devcontainer/devcontainer.json
@@ -0,0 +1,46 @@
{
  "name": "Azure Data Lake Management PowerShell",
  "image": "mcr.microsoft.com/devcontainers/base:ubuntu",

  // Features to install - PowerShell is a predefined feature
  "features": {
    "ghcr.io/devcontainers/features/powershell:1": {
      "version": "latest"
    }
  },

  // Configure tool-specific properties
  "customizations": {
    "vscode": {
      // Set *default* container specific settings.json values on container create
      "settings": {
        "terminal.integrated.defaultProfile.linux": "pwsh",
        "powershell.powerShellDefaultVersion": "PowerShell",
        "githubPullRequests.remotes": [
          "https://github.com/SteveCInVA/AzureDataLakeManagement.git"
        ]
      },

      // Add the IDs of extensions you want installed when the container is created
      "extensions": [
        "ms-vscode.powershell",
        "pspester.pester-test",
        "github.copilot",
        "github.vscode-github-actions",
        "jgclark.vscode-todo-highlight"
      ]
    }
  },

  // Use 'postCreateCommand' to run commands after the container is created
  "postCreateCommand": "pwsh -Command 'Install-Module -Name PSScriptAnalyzer, Pester -Force -Scope CurrentUser -SkipPublisherCheck'"

  // Use 'forwardPorts' to make a list of ports inside the container available locally
  // "forwardPorts": [],

  // Use 'postStartCommand' to run commands each time the container starts
  // "postStartCommand": ""

  // Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root
  // "remoteUser": "root"
}
324 changes: 324 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,324 @@
# Azure Data Lake Management PowerShell Module

Always follow these instructions first and only search or use bash commands when you encounter information that contradicts what is documented here or when these instructions are incomplete.

This repository contains a PowerShell module for managing Azure Data Lake Storage Gen 2 folders and Access Control Lists (ACLs). The module simplifies ACL management by using object names rather than IDs and provides functions to create, delete, move folders and manage permissions recursively.

## Working Effectively

### Prerequisites and Environment Setup
- Install PowerShell 7+ (PowerShell Core): Download from https://github.com/PowerShell/PowerShell/releases
- Install required Azure PowerShell modules:
```powershell
Install-Module -Name Az.Storage -Scope CurrentUser -Force
Install-Module -Name Microsoft.Graph.Applications -Scope CurrentUser -Force
Install-Module -Name Microsoft.Graph.Users -Scope CurrentUser -Force
Install-Module -Name Microsoft.Graph.Groups -Scope CurrentUser -Force
Install-Module -Name Microsoft.Graph.DirectoryObjects -Scope CurrentUser -Force

```
- Authenticate to Azure before testing:
```powershell
Connect-AzAccount
Connect-MgGraph -Scopes "User.Read.All", "Group.Read.All", "Application.Read.All"
```
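- Optionally confirm that the modules listed above are installed before importing. This is a minimal sketch; `Get-MissingModule` is a hypothetical helper name used only for illustration and is not part of this repository:

```powershell
# Hypothetical helper (not part of the module): returns the names in
# -Required that do not appear in -Installed (comparison is case-insensitive).
function Get-MissingModule {
    param(
        [Parameter(Mandatory)][string[]]$Required,
        [string[]]$Installed = @()
    )
    $Required | Where-Object { $_ -notin $Installed }
}

$required = @(
    'Az.Storage',
    'Microsoft.Graph.Applications',
    'Microsoft.Graph.Users',
    'Microsoft.Graph.Groups',
    'Microsoft.Graph.DirectoryObjects'
)

# Names of every module discoverable on this machine
$installed = @(Get-Module -ListAvailable | Select-Object -ExpandProperty Name -Unique)

$missing = @(Get-MissingModule -Required $required -Installed $installed)
if ($missing.Count -gt 0) {
    Write-Warning "Missing modules: $($missing -join ', ')"
}
```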

### Code Quality and Validation
- Run PSScriptAnalyzer for code quality checks (takes ~3 seconds):
```powershell
Invoke-ScriptAnalyzer -Path ./AzureDataLakeManagement/AzureDataLakeManagement.psm1
```
- Test module manifest (takes ~1 second):
```powershell
Test-ModuleManifest ./AzureDataLakeManagement/AzureDataLakeManagement.psd1
```
**Note**: Will show warnings about missing Az.Storage, Microsoft.Graph.Applications, Microsoft.Graph.Users, Microsoft.Graph.Groups, and Microsoft.Graph.DirectoryObjects modules if they're not installed. This is expected in offline environments.
- ALWAYS run PSScriptAnalyzer before committing changes or the code quality will deteriorate.

### Offline Development and Testing
When the Azure modules or Azure connectivity are not available:
- Module import will work but functions will fail at runtime
- PSScriptAnalyzer and manifest testing work completely offline
- Function syntax and help documentation can be validated offline
- Use these commands for offline validation:
```powershell
# These work without Azure connectivity
Import-Module -Force './AzureDataLakeManagement/AzureDataLakeManagement.psm1'
Get-Command -Module AzureDataLakeManagement
Get-Help Add-DataLakeFolder -Examples
Invoke-ScriptAnalyzer -Path ./AzureDataLakeManagement/AzureDataLakeManagement.psm1
Test-ModuleManifest ./AzureDataLakeManagement/AzureDataLakeManagement.psd1
```

### Module Development and Testing
- Import the module for testing (~1 second):
```powershell
Import-Module -Force './AzureDataLakeManagement/AzureDataLakeManagement.psm1'
```
- Get available functions:
```powershell
Get-Command -Module AzureDataLakeManagement
```
- Access function help and examples (~1 second per function):
```powershell
Get-Help Add-DataLakeFolder -Examples
Get-Help Set-DataLakeFolderACL -Full
```
- **CRITICAL**: Test functions only with test/development Azure resources. Never test against production data.

### Publishing Process
- **Prerequisites for Publishing**:
  - PowerShell Gallery API Key (set as environment variable `PSGalleryKey`)
  - Module version updated in `.psd1` file
  - All PSScriptAnalyzer warnings addressed
  - Manual validation completed

- **Manual publish to PowerShell Gallery** (~30 seconds):
```powershell
# Set your API key first
$env:PSGalleryKey = "your-api-key-here"
.\publish.ps1
```

- **GitHub Actions publish**:
  - Workflow: `.github/workflows/manual_publish.yml`
  - Requires `PSGalleryKey` secret configured in repository
  - Manually triggered via GitHub Actions UI
  - Uses `workflow_dispatch` trigger (not automatic)

- **Pre-publish validation checklist**:
```powershell
# 1. Code quality check
Invoke-ScriptAnalyzer -Path ./AzureDataLakeManagement/AzureDataLakeManagement.psm1

# 2. Module manifest validation
Test-ModuleManifest ./AzureDataLakeManagement/AzureDataLakeManagement.psd1

# 3. Module import test
Import-Module -Force './AzureDataLakeManagement/AzureDataLakeManagement.psm1'
Get-Command -Module AzureDataLakeManagement

# 4. Verify version number is updated in .psd1
# 5. Complete manual validation scenarios with test Azure resources
```
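
The version check in step 4 can be partly scripted. The sketch below assumes the manifest path used elsewhere in this document. `Test-VersionBumped` is a hypothetical helper, and `$lastPublished` is a placeholder value rather than a real release number. Casting to `[version]` compares numerically, so `1.10.0` sorts after `1.9.0`.

```powershell
# Hypothetical helper (not part of the module): true when -Current is newer than -Previous
function Test-VersionBumped {
    param(
        [Parameter(Mandatory)][version]$Current,
        [Parameter(Mandatory)][version]$Previous
    )
    $Current -gt $Previous
}

$manifestPath  = './AzureDataLakeManagement/AzureDataLakeManagement.psd1'
$lastPublished = '0.0.0'   # placeholder: replace with the version currently published

if (Test-Path $manifestPath) {
    # Reads the manifest as data without loading the module
    $manifest = Import-PowerShellDataFile -Path $manifestPath
    if (-not (Test-VersionBumped -Current $manifest.ModuleVersion -Previous $lastPublished)) {
        Write-Warning "ModuleVersion $($manifest.ModuleVersion) is not newer than $lastPublished"
    }
}
```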

## Key Module Components

### Primary Functions (8 total):
1. **Get-AADObjectId** - Retrieve Azure AD object details by name/UPN
2. **Get-AzureSubscriptionInfo** - Get subscription information
3. **Add-DataLakeFolder** - Create folder structures in Data Lake Storage
4. **Remove-DataLakeFolder** - Delete folders from Data Lake Storage
5. **Set-DataLakeFolderACL** - Apply ACL permissions to folders (recursively)
6. **Get-DataLakeFolderACL** - Retrieve current ACL permissions
7. **Move-DataLakeFolder** - Move/rename folders between containers
8. **Remove-DataLakeFolderACL** - Remove ACL permissions from folders

### Core Files:
- `AzureDataLakeManagement/AzureDataLakeManagement.psm1` - Main module (1026 lines, 8 functions)
- `AzureDataLakeManagement/AzureDataLakeManagement.psd1` - Module manifest and metadata
- `example.ps1` - Complete usage examples showing folder creation and ACL management
- `publish.ps1` - PowerShell Gallery publishing script

## Validation and Testing

### Code Quality Validation
Run these before every commit:
```powershell
# Static analysis (3 seconds) - NEVER CANCEL
Invoke-ScriptAnalyzer -Path ./AzureDataLakeManagement/AzureDataLakeManagement.psm1

# Module manifest validation (1 second)
Test-ModuleManifest ./AzureDataLakeManagement/AzureDataLakeManagement.psd1
```

### Manual Validation Scenarios
**CRITICAL**: Always test against development/test Azure resources only. Complete these scenarios after making changes:

1. **Authentication and Module Import Test** (1-2 minutes):
```powershell
Connect-AzAccount
Connect-MgGraph -Scopes "User.Read.All", "Group.Read.All", "Application.Read.All"
Import-Module -Force './AzureDataLakeManagement/AzureDataLakeManagement.psm1'
Get-Command -Module AzureDataLakeManagement
# Should show all 8 functions
```

2. **Basic Folder Operations Test** (5-10 minutes):
```powershell
# Use test subscription and storage account
$subName = '<subscriptionName>'
$rgName = 'resourceGroup01'
$storageAccountName = 'storage01'
$containerName = 'bronze'

# Create test folder structure
Add-DataLakeFolder -SubscriptionName $subName -resourceGroup $rgName -storageAccountName $storageAccountName -containerName $containerName -folderPath 'test-dataset\sample-folder'

# Verify folder exists in Azure Storage Explorer or portal

# Test folder move operation
Move-DataLakeFolder -SubscriptionName $subName -resourceGroup $rgName -storageAccountName $storageAccountName -SourceContainerName $containerName -sourceFolderPath 'test-dataset\sample-folder' -DestinationContainerName $containerName -destinationFolderPath 'test-dataset\moved-folder'

# Clean up
Remove-DataLakeFolder -SubscriptionName $subName -resourceGroup $rgName -storageAccountName $storageAccountName -containerName $containerName -folderPath 'test-dataset'
```

3. **ACL Management Test** (5-10 minutes):
```powershell
# Create test folder
Add-DataLakeFolder -SubscriptionName $subName -resourceGroup $rgName -storageAccountName $storageAccountName -containerName $containerName -folderPath 'acl-test'

# Apply test ACL (use test user/group)
Set-DataLakeFolderACL -SubscriptionName $subName -ResourceGroupName $rgName -StorageAccountName $storageAccountName -ContainerName $containerName -folderPath 'acl-test' -Identity 'stecarr@MngEnvMCAP254199.onmicrosoft.com' -accessControlType Read

# Verify ACL was applied
Get-DataLakeFolderACL -SubscriptionName $subName -ResourceGroupName $rgName -StorageAccountName $storageAccountName -ContainerName $containerName -folderPath 'acl-test'

# Test ACL removal
Remove-DataLakeFolderACL -SubscriptionName $subName -ResourceGroupName $rgName -StorageAccountName $storageAccountName -ContainerName $containerName -folderPath 'acl-test' -Identity 'stecarr@MngEnvMCAP254199.onmicrosoft.com'

# Clean up
Remove-DataLakeFolder -SubscriptionName $subName -resourceGroup $rgName -storageAccountName $storageAccountName -containerName $containerName -folderPath 'acl-test'
```

4. **Azure AD Object Resolution Test** (2-3 minutes):
```powershell
# Test user lookup
Get-AADObjectId -Identity 'stecarr@MngEnvMCAP254199.onmicrosoft.com'

# Test group lookup
Get-AADObjectId -Identity 'allcompany'

# Test service principal lookup
Get-AADObjectId -Identity 'CompliancePolicy'

# Should return ObjectId, ObjectType, and DisplayName for each
```

5. **Complete example.ps1 Workflow Test** (10-15 minutes):
```powershell
# Modify variables in example.ps1 first, then run:
.\example.ps1

# Verify in Azure portal:
# - Multiple folder structures created
# - ACL permissions applied correctly
# - Test folders cleaned up properly
```

### Development Workflow Timing
- PSScriptAnalyzer execution: ~3 seconds - NEVER CANCEL
- Module manifest testing: ~1 second
- Module import: ~1 second (without Azure dependencies)
- Module import with Azure modules: ~2-5 seconds (depends on Azure module loading)
- Single folder operation: ~3-10 seconds (depends on Azure latency)
- ACL operations: ~5-15 seconds (depends on Azure AD latency and hierarchy depth)
- Full validation scenario: ~10-20 minutes total
- Complete example.ps1 workflow: ~5-15 minutes (creates multiple folders and ACLs)

### Using example.ps1 for Learning
The `example.ps1` file demonstrates a complete workflow:
1. Authentication to Azure and Azure AD
2. Creating hierarchical folder structures
3. Setting various ACL types (user, group, service principal)
4. Error handling scenarios
5. Cleanup operations

**CRITICAL**: Always modify the variables in example.ps1 before running:
```powershell
$subName = '<subscriptionName>'
$rgName = 'resourceGroup01'
$storageAccountName = 'storage01'
$containerName = 'bronze'
```

## Common Development Tasks

### Adding New Functions
1. Add function to `AzureDataLakeManagement.psm1`
2. Update `FunctionsToExport` in `AzureDataLakeManagement.psd1`
3. Add usage example to `example.ps1`
4. Run PSScriptAnalyzer validation
5. Test with manual validation scenarios
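
Steps 1 and 2 are easy to get out of sync. One way to catch a function that was added to the `.psm1` but not to `FunctionsToExport` is to parse the script and compare names. This is a sketch only; `Get-DefinedFunctionName` is a hypothetical helper, and the comparison is not meaningful if the manifest exports `'*'`.

```powershell
# Hypothetical helper (not part of the module): returns the names of the
# top-level functions defined in a block of PowerShell source text.
function Get-DefinedFunctionName {
    param([Parameter(Mandatory)][string]$ScriptText)
    $tokens = $null
    $errors = $null
    $ast = [System.Management.Automation.Language.Parser]::ParseInput($ScriptText, [ref]$tokens, [ref]$errors)
    # $false: do not descend into nested script blocks (function bodies)
    $ast.FindAll({ param($node) $node -is [System.Management.Automation.Language.FunctionDefinitionAst] }, $false) |
        ForEach-Object { $_.Name }
}

$psm1 = './AzureDataLakeManagement/AzureDataLakeManagement.psm1'
$psd1 = './AzureDataLakeManagement/AzureDataLakeManagement.psd1'

if ((Test-Path $psm1) -and (Test-Path $psd1)) {
    $defined  = @(Get-DefinedFunctionName -ScriptText (Get-Content -Raw -Path $psm1))
    $exported = @((Import-PowerShellDataFile -Path $psd1).FunctionsToExport)
    # Functions present in the .psm1 but absent from the manifest export list
    $defined | Where-Object { $_ -notin $exported }
}
```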

### Debugging Issues
- Use VS Code with PowerShell extension for debugging
- Import module with `-Force` to reload changes
- Use `-Verbose` parameter on functions for detailed output
- Check Azure portal/Storage Explorer to verify actual changes

### Common Error Scenarios
1. **Missing Azure Authentication**: Functions fail with authentication errors
   - Solution: Run `Connect-AzAccount` and `Connect-MgGraph` (with the scopes shown under Prerequisites)

2. **Module Dependencies Missing**: Import fails with module not found errors
   - Solution: Install Az.Accounts, Az.Storage, and the Microsoft.Graph modules (Applications, Users, Groups, DirectoryObjects)

3. **Path Format Issues**: Functions expect backslash separators in folderPath
   - Correct: `'dataset1\folder1\subfolder'`
   - Incorrect: `'dataset1/folder1/subfolder'`

4. **Permissions Issues**: ACL operations fail due to insufficient permissions
   - Ensure Azure AD permissions and Storage Account permissions are configured

5. **Storage Account Access**: Operations fail if storage account keys are not accessible
   - Module requires either storage account key access OR proper Azure AD permissions
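
For the path format issue in item 3, a caller holding forward-slash paths can convert them before calling the module. This is a minimal sketch; `ConvertTo-DataLakeFolderPath` is a hypothetical helper and is not provided by the module.

```powershell
# Hypothetical helper (not part of the module): replaces forward slashes
# with the backslash separators the module functions expect.
function ConvertTo-DataLakeFolderPath {
    param([Parameter(Mandatory)][string]$Path)
    $Path.Replace('/', '\')
}

ConvertTo-DataLakeFolderPath -Path 'dataset1/folder1/subfolder'
# → dataset1\folder1\subfolder
```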

### Known Code Quality Issues
PSScriptAnalyzer currently identifies 17 warnings that should be addressed in new code:
- 8 instances of `Write-Host` usage (use `Write-Output`, `Write-Verbose`, or `Write-Information`)
- 6 unused parameter warnings (remove unused parameters)
- 3 `PSUseSingularNouns` warnings on function names (rename the functions to use singular nouns)

Run `Invoke-ScriptAnalyzer` to see the complete list with line numbers and detailed guidance.
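
To reproduce a per-rule tally like the one above, the analyzer results can be grouped by rule name. This is a sketch, assuming PSScriptAnalyzer is installed; `Get-RuleSummary` is a hypothetical helper name, and the counts it prints may differ from the figures listed here.

```powershell
# Hypothetical helper (not part of the module): counts diagnostic records per rule
function Get-RuleSummary {
    param([Parameter(Mandatory)][object[]]$Record)
    $Record |
        Group-Object -Property RuleName |
        Sort-Object -Property Count -Descending |
        Select-Object -Property Count, Name
}

$modulePath = './AzureDataLakeManagement/AzureDataLakeManagement.psm1'

# Skipped when PSScriptAnalyzer or the module file is not available
if ((Get-Command Invoke-ScriptAnalyzer -ErrorAction SilentlyContinue) -and (Test-Path $modulePath)) {
    $records = @(Invoke-ScriptAnalyzer -Path $modulePath)
    if ($records.Count -gt 0) { Get-RuleSummary -Record $records }
}
```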

## Repository Structure Reference
```
.
├── .github/
│   └── workflows/
│       └── manual_publish.yml           # GitHub Actions publishing workflow
├── .vscode/
│   ├── launch.json                      # VS Code debugging configuration
│   └── settings.json                    # VS Code settings
├── AzureDataLakeManagement/
│   ├── AzureDataLakeManagement.psd1     # Module manifest
│   └── AzureDataLakeManagement.psm1     # Main module (8 functions)
├── .gitignore
├── LICENSE
├── README.md                            # Project overview and version history
├── example.ps1                          # Complete usage examples
└── publish.ps1                          # PowerShell Gallery publishing script
```

## Quick Reference Commands

### Daily Development
```powershell
# Load and test module
Import-Module -Force './AzureDataLakeManagement/AzureDataLakeManagement.psm1'

# Code quality check (run before commit)
Invoke-ScriptAnalyzer -Path ./AzureDataLakeManagement/AzureDataLakeManagement.psm1

# Test manifest
Test-ModuleManifest ./AzureDataLakeManagement/AzureDataLakeManagement.psd1
```

### Azure Authentication
```powershell
Connect-AzAccount
Connect-MgGraph -Scopes "User.Read.All", "Group.Read.All", "Application.Read.All"
Get-AzSubscription # Verify connection
```

### Function Usage Pattern
```powershell
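# Most folder and ACL functions share this parameter pattern
# (Get-AADObjectId takes -Identity; Move-DataLakeFolder takes source/destination variants):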
-SubscriptionName # Azure subscription name
-ResourceGroupName # Resource group containing storage account
-StorageAccountName # Storage account name
-ContainerName # Container/filesystem name
-folderPath # Path within container (use backslash separators)
```
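
Because these parameters repeat across calls, splatting can keep scripts shorter. The sketch below uses the placeholder values from this document and the `Get-DataLakeFolderACL` call shape from the ACL validation scenario. It assumes the module is imported and an Azure session is authenticated, and it does nothing otherwise.

```powershell
# Shared parameters (placeholder values; replace with test resources)
$common = @{
    SubscriptionName   = '<subscriptionName>'
    ResourceGroupName  = 'resourceGroup01'
    StorageAccountName = 'storage01'
    ContainerName      = 'bronze'
}

# Runs only when the module has been imported into the session
if (Get-Command Get-DataLakeFolderACL -ErrorAction SilentlyContinue) {
    Get-DataLakeFolderACL @common -folderPath 'acl-test'
}
```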