Open
55 commits
98e0de3
CY002: Identified TEAVS and ADCRS components
Jessica-Kakshapati Mar 30, 2026
aa1c304
CY002: Added key assets and components identification
Jessica-Kakshapati Mar 30, 2026
ef5536c
CY006: Defined end points
Jessica-Kakshapati Mar 30, 2026
45b22f7
CY007: Added request and response design for TEAVS API
Jessica-Kakshapati Mar 30, 2026
bc412e3
CY008: Added authentication and role-based access design
Jessica-Kakshapati Mar 30, 2026
be0010e
CY008: Added authentication and role-based access design
Jessica-Kakshapati Mar 30, 2026
b4f0e7d
CY009: Added injection prevention and input validation rules
Jessica-Kakshapati Mar 31, 2026
5ca5601
CY010: Added rate limiting and abuse prevention design
Jessica-Kakshapati Mar 31, 2026
4be9e01
CY010: Added rate limiting and abuse prevention designCompleted secur…
Jessica-Kakshapati Mar 31, 2026
ea08d02
CY008: Strengthened input validation by introducing request size limi…
elviacorrea Apr 2, 2026
38883b5
CY009: Enhanced rate limiting with abuse detection, brute-force prote…
elviacorrea Apr 2, 2026
4e23ac3
Created detection-response folder for CY016 and CY017 work
naren38 Apr 5, 2026
e77aef5
Add files via upload
naren38 Apr 5, 2026
5b7e9e3
Added implementation rules for detection, monitoring and response cov…
naren38 Apr 5, 2026
94ebf51
Added prototype detection rule functions for key cyber threats to sup…
naren38 Apr 5, 2026
891ae05
AI001 - Adding sample JSON records for hazard events, cyber threats, …
dannyz54321 Apr 5, 2026
85cc120
AI001: move sample JSON files into ai-ml folder
dannyz54321 Apr 5, 2026
5983b50
Move Sample Records to dedicated folder
SunainM Apr 6, 2026
a73212b
implementation.md
naren38 Apr 6, 2026
ef8bace
Rename implementation.md to cyber/detection-response/implementation.md
naren38 Apr 6, 2026
fdf6c9b
Added Docs
SunainM Apr 7, 2026
87c19a7
delete duplicate folder
SunainM Mar 31, 2026
7636b4e
feat: add documentation and requirements for data cleaning pipeline; …
SunainM Mar 31, 2026
9442275
Unified Pipeline (#11)
SunainM Mar 31, 2026
5abd11a
AI003 Task 3: updated docs and README (#18)
FAISAL1227 Apr 5, 2026
ef8be7a
cleanup
SunainM Apr 8, 2026
de1ec37
Revert "cleanup"
SunainM Apr 8, 2026
8013d5c
feat: implement dynamic data cleaning pipeline using JSON configs
aarnavanoop Apr 7, 2026
a2befff
fix: restore pipeline.py and extend config for PHOENIX schema datasets
aarnavanoop Apr 10, 2026
6443f26
feat: enhance data cleaning pipeline with improved validation and con…
SunainM Apr 10, 2026
d9b6122
AI033 Documentation
SunainM Apr 10, 2026
6c893ae
Create Datasets folder
SunainM Apr 12, 2026
925b7f9
Merge pull request #49 from Hardhat-Enterprises/ai-ml/dev/Add-data-se…
SunainM Apr 12, 2026
3eb5704
Merge pull request #31 from Hardhat-Enterprises/dev-ai-ml
s222530306 Apr 14, 2026
0ac4e00
Create PHOENIX_CY004_EntryPoints_Final.docx
cyber-shehan Apr 15, 2026
dc90bae
Delete cyber/CY004-entry-points/PHOENIX_CY004_EntryPoints_Final.docx
cyber-shehan Apr 15, 2026
e3e38fd
Create Rasanjana
cyber-shehan Apr 15, 2026
4574c60
Delete cyber/Rasanjana
cyber-shehan Apr 15, 2026
2bd6f9d
feat: add CY004 entry points analysis - Rasanjana
cyber-shehan Apr 15, 2026
7660f95
feat: add CY004 Attack_Scenarios - Rasanjana
cyber-shehan Apr 15, 2026
7e5a638
Rename Rasanjana CY004 Attack_Scenarios.docx.pdf to Rasanjana CY004 A…
cyber-shehan Apr 15, 2026
d234f19
Rename Rasanjana CY004 entry-points.docx.pdf to Rasanjana CY004 entry…
cyber-shehan Apr 15, 2026
429ab26
feat: add CY005-Apply STRIDE - Rasanjana
cyber-shehan Apr 15, 2026
5ca404c
feat: add CY005-Map Mitigations - Rasanjana
cyber-shehan Apr 15, 2026
aeeeebb
Rename Rasanjana CY005-Apply STRIDE.pdf.pdf to Rasanjana CY005-Apply …
cyber-shehan Apr 15, 2026
3955417
Rename Rasanjana CY005-Map Mitigations.pdf.pdf to Rasanjana CY005-Map…
cyber-shehan Apr 15, 2026
e409c57
Updated Create Schema SQL to reflect latest changes to User and DAta …
tobydwc Apr 16, 2026
bdba61f
CY010: Digital signing workflow documentation
elviacorrea Apr 17, 2026
ab41092
CY010: Signature verification process documentation
elviacorrea Apr 17, 2026
3effc13
Merge pull request #66 from Hardhat-Enterprises/260416-Toby-Database-…
s222530306 Apr 20, 2026
bdac0bb
Removed unrelated files from branch to ensure clean PR
elviacorrea Apr 21, 2026
abc3c1f
Merge pull request #23 from Hardhat-Enterprises/cyber-detection-respo…
s222530306 Apr 27, 2026
63ac881
Merge pull request #59 from Hardhat-Enterprises/cyber/cy004-cy005-Ras…
s222530306 Apr 27, 2026
4187088
Merge pull request #25 from Hardhat-Enterprises/cyber/secure-design-e…
s222530306 May 9, 2026
6f0dcea
CY011: Defined secure storage, data integrity, and secret management
Vipul0390 May 11, 2026
439 changes: 0 additions & 439 deletions ai-ml/AI001/sample_records.json

This file was deleted.

60 changes: 60 additions & 0 deletions ai-ml/Sample Records/cyber_threat_samples.json
@@ -0,0 +1,60 @@

{
"threat_id": "b2e4d5f6-1111-4c3d-c111-000000000001",
"threat_type": "phishing",
"title": "Fake disaster relief payment email",
"description": "Email campaign impersonating government agencies offering emergency relief payments.",
"risk_level": "critical",
"status": "active",
"category": "cyber",
"confidence_score": 0.92,
"detected_at": "2026-03-21T12:10:00Z",
"source_id": "33333333-cccc-4ccc-dddd-000000000010",
"created_at": "2026-03-21T12:20:00Z",
"updated_at": "2026-03-21T12:20:00Z"
}

{
"threat_id": "b2e4d5f6-2222-4c3d-c222-000000000002",
"threat_type": "malware",
"title": "Malicious weather alert application",
"description": "Fake mobile application distributing malware under the guise of real-time weather alerts.",
"risk_level": "high",
"status": "active",
"category": "cyber",
"confidence_score": 0.87,
"detected_at": "2026-02-15T08:45:00Z",
"source_id": "33333333-cccc-4ccc-dddd-000000000011",
"created_at": "2026-02-15T09:00:00Z",
"updated_at": "2026-02-15T09:00:00Z"
}

{
"threat_id": "b2e4d5f6-3333-4c3d-c333-000000000003",
"threat_type": "misinformation",
"title": "False evacuation alert circulating online",
"description": "Social media posts spreading false evacuation notices during severe storm events.",
"risk_level": "medium",
"status": "monitoring",
"category": "cyber",
"confidence_score": 0.78,
"detected_at": "2026-01-30T16:20:00Z",
"source_id": "33333333-cccc-4ccc-dddd-000000000012",
"created_at": "2026-01-30T16:30:00Z",
"updated_at": "2026-01-30T16:30:00Z"
}

{
"threat_id": "b2e4d5f6-4444-4c3d-c444-000000000004",
"threat_type": "data_breach",
"title": "Unauthorized access to emergency response system",
"description": "Multiple unauthorized login attempts detected targeting emergency service infrastructure.",
"risk_level": "high",
"status": "active",
"category": "cyber",
"confidence_score": 0.85,
"detected_at": "2026-03-10T22:15:00Z",
"source_id": "33333333-cccc-4ccc-dddd-000000000013",
"created_at": "2026-03-10T22:30:00Z",
"updated_at": "2026-03-10T22:30:00Z"
}
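
As displayed, this sample file holds several top-level JSON objects one after another rather than a single array, so a plain `json.loads` call on the whole file would likely fail. A minimal sketch of one way to read such a file, assuming the objects are separated only by whitespace; `iter_json_objects` is a hypothetical helper, not part of the repository, and the inline sample is a trimmed copy of the first two records:

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value found in `text`, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Trimmed copy of the first two records shown above (illustration only).
sample = """
{"threat_id": "b2e4d5f6-1111-4c3d-c111-000000000001", "threat_type": "phishing", "confidence_score": 0.92}

{"threat_id": "b2e4d5f6-2222-4c3d-c222-000000000002", "threat_type": "malware", "confidence_score": 0.87}
"""

records = list(iter_json_objects(sample))
print([r["threat_type"] for r in records])
```

If the files are meant to be consumed by standard JSON tooling, wrapping the records in a single array would avoid the need for a custom reader.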
80 changes: 80 additions & 0 deletions ai-ml/Sample Records/hazard_event_samples.json
@@ -0,0 +1,80 @@
{
"hazard_event_id": "a1f3c2d4-1111-4a2b-b111-000000000001",
"hazard_type": "heatwave",
"description": "Extreme heatwave conditions affecting metropolitan communities.",
"severity_level": "high",
"event_status": "active",
"start_time": "2026-01-10T09:00:00Z",
"end_time": null,
"source_id": "33333333-cccc-4ccc-dddd-000000000001",
"source_ref_event": "BOM-HEAT-001",
"geo_location_id": "22222222-bbbb-4bbb-cccc-000000000001",
"created_at": "2026-01-10T09:15:00Z",
"updated_at": "2026-01-10T09:15:00Z"
}

{
"hazard_event_id": "a1f3c2d4-2222-4a2b-b222-000000000002",
"hazard_type": "storm",
"description": "Severe thunderstorm with heavy rainfall and strong winds.",
"severity_level": "medium",
"event_status": "active",
"start_time": "2026-02-05T14:30:00Z",
"end_time": null,
"source_id": "33333333-cccc-4ccc-dddd-000000000002",
"source_ref_event": "BOM-STORM-002",
"geo_location_id": "22222222-bbbb-4bbb-cccc-000000000002",
"created_at": "2026-02-05T14:45:00Z",
"updated_at": "2026-02-05T14:45:00Z"
}

{
"hazard_event_id": "a1f3c2d4-3333-4a2b-b333-000000000003",
"hazard_type": "bushfire",
"description": "Rapidly spreading bushfire threatening rural communities.",
"severity_level": "critical",
"event_status": "active",
"start_time": "2026-03-21T10:30:00Z",
"end_time": null,
"source_id": "33333333-cccc-4ccc-dddd-000000000003",
"source_ref_event": "BOM-FIRE-003",
"geo_location_id": "22222222-bbbb-4bbb-cccc-000000000003",
"created_at": "2026-03-21T11:05:00Z",
"updated_at": "2026-03-21T11:05:00Z"
}

{
"hazard_event_id": "a1f3c2d4-4444-4a2b-b444-000000000004",
"hazard_type": "flood",
"description": "River flooding caused by prolonged rainfall across low-lying areas.",
"severity_level": "high",
"event_status": "monitoring",
"start_time": "2026-04-01T06:00:00Z",
"end_time": null,
"source_id": "33333333-cccc-4ccc-dddd-000000000004",
"source_ref_event": "BOM-FLOOD-004",
"geo_location_id": "22222222-bbbb-4bbb-cccc-000000000004",
"created_at": "2026-04-01T06:15:00Z",
"updated_at": "2026-04-01T06:15:00Z"
}

{
"hazard_event_id": "a1f3c2d4-5555-4a2b-b555-000000000005",
"hazard_type": "cyclone",
"description": "Category 4 cyclone approaching coastal regions with destructive winds.",
"severity_level": "critical",
"event_status": "active",
"start_time": "2026-02-20T03:00:00Z",
"end_time": null,
"source_id": "33333333-cccc-4ccc-dddd-000000000005",
"source_ref_event": "BOM-CYC-005",
"geo_location_id": "22222222-bbbb-4bbb-cccc-000000000005",
"created_at": "2026-02-20T03:10:00Z",
"updated_at": "2026-02-20T03:10:00Z"
}






64 changes: 64 additions & 0 deletions ai-ml/Sample Records/risk_assessment_integration.json
@@ -0,0 +1,64 @@

{
"integration_event_id": "c3f5e6a7-1111-4d4e-d111-000000000001",
"related_hazard_event_id": "a1f3c2d4-3333-4a2b-b333-000000000003",
"related_threat_id": "b2e4d5f6-1111-4c3d-c111-000000000001",
"correlation_score": 0.88,
"linkage_reason": "Phishing campaign detected during active bushfire emergency targeting affected communities.",
"integration_confidence": 0.91,
"linked_event_type": 3,
"event_status": 1,
"event_time": "2026-03-21T12:00:00Z",
"detected_at": "2026-03-21T12:10:00Z",
"reported_at": "2026-03-21T12:20:00Z",
"created_at": "2026-03-21T12:25:00Z",
"updated_at": "2026-03-21T12:25:00Z"
}

{
"integration_event_id": "c3f5e6a7-2222-4d4e-d222-000000000002",
"related_hazard_event_id": "a1f3c2d4-2222-4a2b-b222-000000000002",
"related_threat_id": "b2e4d5f6-3333-4c3d-c333-000000000003",
"correlation_score": 0.72,
"linkage_reason": "False evacuation alerts spread during storm warnings causing public confusion.",
"integration_confidence": 0.80,
"linked_event_type": 3,
"event_status": 2,
"event_time": "2026-02-05T15:00:00Z",
"detected_at": "2026-02-05T15:10:00Z",
"reported_at": "2026-02-05T15:25:00Z",
"created_at": "2026-02-05T15:30:00Z",
"updated_at": "2026-02-05T15:30:00Z"
}

{
"integration_event_id": "c3f5e6a7-3333-4d4e-d333-000000000003",
"related_hazard_event_id": "a1f3c2d4-4444-4a2b-b444-000000000004",
"related_threat_id": null,
"correlation_score": 0.60,
"linkage_reason": "Flood event monitored without associated cyber threat.",
"integration_confidence": 0.70,
"linked_event_type": 1,
"event_status": 2,
"event_time": "2026-04-01T07:00:00Z",
"detected_at": "2026-04-01T07:10:00Z",
"reported_at": "2026-04-01T07:20:00Z",
"created_at": "2026-04-01T07:25:00Z",
"updated_at": "2026-04-01T07:25:00Z"
}

{
"integration_event_id": "c3f5e6a7-4444-4d4e-d444-000000000004",
"related_hazard_event_id": null,
"related_threat_id": "b2e4d5f6-2222-4c3d-c222-000000000002",
"correlation_score": 0.65,
"linkage_reason": "Malware campaign detected independently of hazard events.",
"integration_confidence": 0.75,
"linked_event_type": 2,
"event_status": 1,
"event_time": "2026-02-15T09:00:00Z",
"detected_at": "2026-02-15T09:10:00Z",
"reported_at": "2026-02-15T09:20:00Z",
"created_at": "2026-02-15T09:25:00Z",
"updated_at": "2026-02-15T09:25:00Z"
}
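
In the four records above, `linked_event_type` appears to track which related IDs are set: 3 when both a hazard and a threat are linked, 1 for hazard only, 2 for threat only. That mapping is inferred from the samples and is not confirmed by any schema in this diff. Under that assumption, a small consistency check could look like the sketch below; `expected_link_type` is a hypothetical helper and the records are trimmed copies of the samples:

```python
def expected_link_type(record):
    """Infer the link code from which related IDs are set.

    Assumed mapping (inferred from the samples, not from a schema):
    1 = hazard only, 2 = threat only, 3 = both.
    """
    has_hazard = record.get("related_hazard_event_id") is not None
    has_threat = record.get("related_threat_id") is not None
    if has_hazard and has_threat:
        return 3
    if has_hazard:
        return 1
    if has_threat:
        return 2
    return None  # neither ID set; no sample covers this case


# Trimmed copies of the four sample records (illustration only).
records = [
    {"integration_event_id": "c3f5e6a7-1111-4d4e-d111-000000000001",
     "related_hazard_event_id": "a1f3c2d4-3333-4a2b-b333-000000000003",
     "related_threat_id": "b2e4d5f6-1111-4c3d-c111-000000000001",
     "linked_event_type": 3},
    {"integration_event_id": "c3f5e6a7-2222-4d4e-d222-000000000002",
     "related_hazard_event_id": "a1f3c2d4-2222-4a2b-b222-000000000002",
     "related_threat_id": "b2e4d5f6-3333-4c3d-c333-000000000003",
     "linked_event_type": 3},
    {"integration_event_id": "c3f5e6a7-3333-4d4e-d333-000000000003",
     "related_hazard_event_id": "a1f3c2d4-4444-4a2b-b444-000000000004",
     "related_threat_id": None,
     "linked_event_type": 1},
    {"integration_event_id": "c3f5e6a7-4444-4d4e-d444-000000000004",
     "related_hazard_event_id": None,
     "related_threat_id": "b2e4d5f6-2222-4c3d-c222-000000000002",
     "linked_event_type": 2},
]

# IDs of records whose stored code disagrees with the inferred one.
mismatches = [
    r["integration_event_id"]
    for r in records
    if expected_link_type(r) != r["linked_event_type"]
]
print(mismatches)
```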
Empty file removed ai-ml/cleaning/.gitkeep
Empty file.
134 changes: 134 additions & 0 deletions ai-ml/cleaning/README.md
@@ -0,0 +1,134 @@
# Data Cleaning Pipeline (`ai-ml/cleaning`)

This module provides a configurable CSV cleaning + validation pipeline.

It is designed to:

- clean raw tabular data (missing values, duplicates, type conversion, string normalization)
- validate cleaned data against rules (required fields, allowed values, ranges, types, date formats)
- generate output artifacts (cleaned CSV, validation report, comparison report, pipeline log)

## Folder Structure

- `config/pipeline_config.json` - all pipeline configuration
- `data/input/` - raw CSV input files
- `data/output/` - cleaned CSV output
- `data/reports/` - JSON reports (validation + before/after comparison)
- `data/logs/` - pipeline event log
- `src/` - pipeline implementation

## What The Pipeline Does

For each run, the pipeline:

1. reads input CSV from `paths.input_csv`
2. selects cleaning/validation rules:
   - dataset-specific rules under `datasets.<name>` when columns match
   - otherwise falls back to top-level `cleaning` + `validation` (`generic`)
3. applies cleaning steps
4. runs validation checks
5. writes outputs to configured paths
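
The rule-selection step can be sketched as follows. This is a minimal illustration, not the code in `src/`: it assumes "columns match" means every entry in a dataset's `validation.required_columns` is present in the CSV header, and `select_rules` is a hypothetical name.

```python
def select_rules(config, csv_columns):
    """Return (name, rules) for the first dataset whose required columns
    all appear in the CSV header, else the top-level generic rules."""
    columns = set(csv_columns)
    for name, rules in config.get("datasets", {}).items():
        required = rules.get("validation", {}).get("required_columns", [])
        # Assumption: a dataset matches when all required columns are present.
        if required and set(required) <= columns:
            return name, rules
    generic = {
        "cleaning": config.get("cleaning", {}),
        "validation": config.get("validation", {}),
    }
    return "generic", generic


# Hypothetical config, shaped like the minimal pattern in this README.
example_config = {
    "cleaning": {},
    "validation": {"required_columns": []},
    "datasets": {
        "my_dataset": {
            "cleaning": {},
            "validation": {"required_columns": ["id"], "column_rules": {}},
        }
    },
}

print(select_rules(example_config, ["id", "name"])[0])  # my_dataset
print(select_rules(example_config, ["name"])[0])        # generic
```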

## Run The Pipeline

From repo root:

```powershell
python ai-ml/cleaning/src/main.py
```

With a custom config:

```powershell
python ai-ml/cleaning/src/main.py --config ai-ml/cleaning/config/pipeline_config.json
```

## Use In Other Python Scripts

Example integration:

```python
from pathlib import Path
import sys

# Make the cleaning package importable when running from the repo root.
cleaning_root = Path("ai-ml/cleaning").resolve()
if str(cleaning_root) not in sys.path:
    sys.path.insert(0, str(cleaning_root))

from src.pipeline import run_pipeline

summary = run_pipeline(cleaning_root / "config" / "pipeline_config.json")
print(summary)
```

## How To Modify `pipeline_config.json`

### 1. Set input/output paths

Update `paths`:

- `input_csv`
- `cleaned_csv`
- `validation_report`
- `comparison_report`
- `pipeline_log`

### 2. Configure generic fallback rules

Top-level `cleaning` and `validation` are used when no dataset-specific schema matches.

### 3. Add or edit dataset-specific rules

Under `datasets`, each dataset entry should contain:

- `cleaning`
- `validation.required_columns`
- `validation.column_rules`

Minimal pattern:

```json
"datasets": {
  "my_dataset": {
    "cleaning": {
      "missing_values": { "drop": [], "fill": {} },
      "duplicates": { "subset": [] },
      "type_conversion": { "int": [], "float": [], "datetime": [] },
      "string_standardisation": ["col_a", "col_b"]
    },
    "validation": {
      "required_columns": ["id"],
      "column_rules": {
        "id": { "required": true, "type": "int", "unique": true }
      }
    }
  }
}
```

## Validation Rules Supported

Per column (`validation.column_rules.<column>`):

- `required: true|false`
- `type: "int" | "float" | "str" | "date"`
- `unique: true|false`
- `allowed_values: [...]`
- `min` / `max` (numeric)
- `format` (for `type: "date"`)
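
To make the rule semantics concrete, here is a minimal sketch of how one column might be checked against such a rule. It is not the pipeline's implementation: `validate_column` is a hypothetical function, values are treated as CSV strings, empty strings count as missing, and `format` is assumed to be a `datetime.strptime` pattern.

```python
from datetime import datetime

_CASTS = {"int": int, "float": float, "str": str}


def validate_column(values, rule):
    """Return a list of human-readable violations for one column."""
    errors = []
    present = [v for v in values if v not in (None, "")]
    if rule.get("required") and len(present) < len(values):
        errors.append("missing required values")
    if rule.get("unique") and len(set(present)) < len(present):
        errors.append("duplicate values")
    kind = rule.get("type")
    for v in present:
        try:
            if kind == "date":
                # Assumption: `format` uses strptime directives.
                parsed = datetime.strptime(str(v), rule.get("format", "%Y-%m-%d"))
            elif kind in _CASTS:
                parsed = _CASTS[kind](str(v))
            else:
                parsed = v
        except ValueError:
            errors.append(f"{v!r} is not a valid {kind}")
            continue
        if "allowed_values" in rule and v not in rule["allowed_values"]:
            errors.append(f"{v!r} is not an allowed value")
        # Range checks apply only to numeric types in this sketch.
        if kind in ("int", "float"):
            if "min" in rule and parsed < rule["min"]:
                errors.append(f"{v!r} is below min {rule['min']}")
            if "max" in rule and parsed > rule["max"]:
                errors.append(f"{v!r} is above max {rule['max']}")
    return errors


id_rule = {"required": True, "type": "int", "unique": True, "min": 1}
print(validate_column(["1", "2", "3"], id_rule))
print(validate_column(["1", "1", ""], id_rule))
```

Returning a list of messages rather than raising fits the behaviour described in the Notes, where a `FAIL` result is reported without stopping the run.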

## Cleaning Rules Supported

- `missing_values.drop`: drop rows where these columns are null
- `missing_values.fill`: fill nulls with provided values
- `duplicates.subset`: drop duplicate rows by subset
- `type_conversion.int|float|datetime`: coercive conversion
- `string_standardisation`: trim + normalize configured text columns
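
The cleaning rules above can be illustrated with a small dependency-free sketch over a list of row dictionaries. Several details here are assumptions rather than documented behaviour: the order of the steps, that "normalize" means collapsing whitespace and lower-casing, and that a value that cannot be converted becomes `None`. `clean_rows` is a hypothetical name, and `datetime` conversion is left out for brevity.

```python
def clean_rows(rows, cleaning):
    """Apply the configured cleaning steps to dict rows and return new rows."""
    missing = cleaning.get("missing_values", {})
    drop_cols = missing.get("drop", [])
    fill = missing.get("fill", {})
    subset = cleaning.get("duplicates", {}).get("subset", [])
    conversions = cleaning.get("type_conversion", {})
    text_cols = cleaning.get("string_standardisation", [])
    casts = {"int": int, "float": float}

    def is_null(value):
        return value is None or value == ""

    cleaned, seen = [], set()
    for row in rows:
        row = dict(row)  # work on a copy so the input is not mutated
        # missing_values.drop: skip rows with nulls in these columns
        if any(is_null(row.get(col)) for col in drop_cols):
            continue
        # missing_values.fill: replace nulls with the configured value
        for col, value in fill.items():
            if is_null(row.get(col)):
                row[col] = value
        # type_conversion: coerce, turning unparsable values into None (assumed)
        for kind, cast in casts.items():
            for col in conversions.get(kind, []):
                try:
                    row[col] = cast(row[col])
                except (KeyError, TypeError, ValueError):
                    row[col] = None
        # string_standardisation: trim, collapse whitespace, lower-case (assumed)
        for col in text_cols:
            if isinstance(row.get(col), str):
                row[col] = " ".join(row[col].split()).lower()
        # duplicates.subset: keep the first row per key; all columns if empty
        key = tuple(row.get(col) for col in (subset or sorted(row)))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


rows = [
    {"id": "1", "name": "  Alice ", "score": "3.5"},
    {"id": "1", "name": "alice", "score": "3.5"},
    {"id": "", "name": "Bob", "score": "x"},
    {"id": "2", "name": "", "score": "x"},
]
cleaning = {
    "missing_values": {"drop": ["id"], "fill": {"name": "unknown"}},
    "duplicates": {"subset": ["id"]},
    "type_conversion": {"int": ["id"], "float": ["score"]},
    "string_standardisation": ["name"],
}
result = clean_rows(rows, cleaning)
print(result)
```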

## Notes

- Input is expected to be CSV.
- Validation `FAIL` means rule violations were found; the pipeline still completes and writes reports.
- If a new dataset is not being picked up, verify that its `required_columns` match the CSV column names exactly.
21 changes: 0 additions & 21 deletions ai-ml/cleaning/ai003/README.md

This file was deleted.

Binary file not shown.
Binary file not shown.
6 changes: 0 additions & 6 deletions ai-ml/cleaning/ai003/cleaned_output.csv

This file was deleted.
