Commit 2c052eb

committed
add blog and detailed flight log extraction process
1 parent c001a6d commit 2c052eb

5 files changed

Lines changed: 191 additions & 4 deletions

File tree

docs/blog/index.md

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
1+
# Project Blog
2+
3+
Welcome to the "NLP for Drone Flight Log Analysis" project blog! Here, we share updates, detailed how-to guides, insights, and behind-the-scenes looks at our research.
4+
5+
---
6+
7+
## Latest Posts:
8+
9+
### 1. Extracting Flight Logs from VTO Labs Forensic Images
10+
11+
**Date:** July 23, 2025
12+
**Author:** S. Silalahi
13+
14+
This post is a step-by-step guide to how we extracted encrypted flight log files from the forensic images of drone controller devices provided by VTO Labs and decrypted them into human-readable form. It covers the tools and procedures used to turn the raw data into analyzable messages.
15+
16+
[:octicons-arrow-right-24: Read More](vto-labs-extraction.md)
17+
18+
---
19+
<!--
20+
### [Future Post Title Example]
21+
22+
**Date:** YYYY-MM-DD
23+
**Author:** [Author Name]
24+
25+
[A brief summary of your next blog post idea. You can copy-paste and modify the structure above for new posts.]
26+
27+
[:octicons-arrow-right-24: Read More](future-post-filename.md) -->

docs/blog/vto-labs-extraction.md

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
1+
# Extracting Flight Logs from VTO Labs Forensic Images
2+
3+
**Date:** July 23, 2025
4+
**Author:** S. Silalahi
5+
6+
This post walks step by step through how we extracted flight log files from the forensic images of drone controller devices, sourced from the VTO Labs Drone Forensics Program, and decrypted them into human-readable messages. This process was a crucial part of building our "NLP for Drone Flight Log Analysis" dataset.
7+
8+
---
9+
10+
## 1. Understanding the Source Data
11+
12+
The VTO Labs dataset (accessible via [https://drive.google.com/drive/folders/1-UrxFGpCo54bVujwFmmqNbsZEV28dSNz](https://drive.google.com/drive/folders/1-UrxFGpCo54bVujwFmmqNbsZEV28dSNz)) consists of forensic images from various drone models and components. Our initial analysis revealed that relevant, human-readable flight log messages were predominantly found within data acquired from **controller devices**. Other components often contained encrypted, proprietary, or purely telemetry data not suitable for direct NLP analysis.
13+
14+
## 2. Locating and Extracting Flight Log Files from Controller Artifacts
15+
16+
The VTO Labs collection of drone images includes data acquired from various controller devices, specifically **Android phones**, **Android tablets**, and **iOS phones**. The extraction methodology varies slightly depending on the operating system and artifact file type. Our goal was to identify and extract files containing human-readable log messages.
17+
18+
### 2.1. Android-Based Controllers (Phones & Tablets)
19+
20+
For Android-based controller artifacts, VTO Labs provides two main types of files: `.zip` archives and `.001` forensic images.
21+
22+
#### 2.1.1. For `.zip` Archives:
23+
24+
These archives can often be directly unzipped using standard archival tools. Once unzipped, the flight logs are typically found in the following directories:
25+
26+
* **Encrypted `.txt` Flight Log Files (`DJIFlightRecord_YYYY-MM-DD_[HH-MM-SS].txt`):**
27+
* `/dji/dji.go.v4/FlightRecord/`
28+
* `/dji/dji.pilot/FlightRecord/`
29+
* **Unencrypted Human-Readable Error Logs:**
30+
* `/dji/dji.go.v4/LOG/ERROR_POP_LOG/`
31+
* `/dji/dji.pilot/LOG/ERROR_POP_LOG/`
32+
These folders contain simpler, often plain-text, human-readable error messages that did not require decryption.
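
The directory lookup above can be sketched in Python. This is a minimal illustration rather than the procedure we actually ran: the function name `find_log_members` is hypothetical, and matching the listed directories as substrings of each archive member path is an assumption, since the prefix in front of `/dji/` may differ between artifacts.

```python
import zipfile
from pathlib import PurePosixPath

# Directory fragments taken from the list above. Matching them as substrings
# of the member path is an assumption of this sketch.
FLIGHT_RECORD_DIRS = ("dji/dji.go.v4/FlightRecord/", "dji/dji.pilot/FlightRecord/")
ERROR_LOG_DIRS = ("dji/dji.go.v4/LOG/ERROR_POP_LOG/", "dji/dji.pilot/LOG/ERROR_POP_LOG/")


def find_log_members(archive_path):
    """Return (flight_records, error_logs) member names found in a .zip archive."""
    flight_records, error_logs = [], []
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            if any(fragment in name for fragment in FLIGHT_RECORD_DIRS):
                # Keep only files that follow the DJIFlightRecord_ naming scheme.
                if PurePosixPath(name).name.startswith("DJIFlightRecord_"):
                    flight_records.append(name)
            elif any(fragment in name for fragment in ERROR_LOG_DIRS):
                error_logs.append(name)
    return sorted(flight_records), sorted(error_logs)
```

The returned member names can then be passed to `ZipFile.extract` to copy only the relevant files out of the archive.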
33+
34+
#### 2.1.2. For `.001` Forensic Images:
35+
36+
For artifacts provided as `.001` forensic images (which require specialized forensic tools to mount and access), we used [**Autopsy**](https://www.autopsy.com/), an open-source digital forensics platform, to navigate and extract relevant files.
37+
38+
* **Tool Used:** Autopsy (open-source digital forensics platform)
39+
* **Extraction Path:** Within Autopsy, we gained access to the file system and extracted the encrypted `.txt` flight log files from paths similar to:
40+
* `/dji/dji.go.v4/FlightRecord/`
41+
* `/dji/dji.pilot/FlightRecord/`
42+
43+
### 2.2. iOS-Based Controllers (iPhones)
44+
45+
For iOS-based controller artifacts, all data was provided in `.zip` archives.
46+
47+
* **Extraction Method:**
48+
* Most `.zip` files from iOS controllers could be **directly unzipped** to reveal their contents.
49+
* For any `.zip` files that could not be unzipped directly or appeared corrupted, we fell back to **Autopsy** to access their internal structure and extract the files containing human-readable log messages.
50+
51+
### 2.3. Post-Extraction Processing for All Controller Logs
52+
53+
After collecting both the encrypted `.txt` flight log files and the unencrypted `ERROR_POP_LOG` files (where applicable) from both Android and iOS sources:
54+
55+
* The encrypted `.txt` log files (e.g., `DJIFlightRecord_YYYY-MM-DD_[HH-MM-SS].txt`) were then decrypted using the DJI Phantom Help Log Viewer, as detailed in the next section.
56+
* The unencrypted `ERROR_POP_LOG` files were immediately ready for inclusion in our raw message collection without further decryption.
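
The split described above can be sketched as follows. This is an illustrative sketch, not the script we used: the helper names `flight_record_timestamp` and `partition_logs` are hypothetical, and recognising unencrypted logs by the `ERROR_POP_LOG` folder name in their path is an assumption.

```python
import re
from datetime import datetime

# Matches names such as DJIFlightRecord_2017-08-29_[14-30-27].txt
FLIGHT_RECORD_RE = re.compile(
    r"DJIFlightRecord_(\d{4}-\d{2}-\d{2})_\[(\d{2}-\d{2}-\d{2})\]\.txt$"
)


def flight_record_timestamp(path):
    """Return the datetime encoded in a flight record name, or None if the name does not match."""
    match = FLIGHT_RECORD_RE.search(path)
    if match is None:
        return None
    return datetime.strptime(" ".join(match.groups()), "%Y-%m-%d %H-%M-%S")


def partition_logs(paths):
    """Split paths into (needs_decryption, ready_as_is); other files are ignored."""
    needs_decryption, ready_as_is = [], []
    for path in paths:
        if flight_record_timestamp(path) is not None:
            needs_decryption.append(path)
        elif "ERROR_POP_LOG" in path:  # assumption: the folder name identifies these logs
            ready_as_is.append(path)
    return needs_decryption, ready_as_is
```

Parsing the timestamp out of the file name also makes it easy to sort the encrypted logs chronologically before uploading them one by one.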
57+
58+
<!-- ## 2. Locating Encrypted Flight Log Files
59+
60+
Within the controller device artifacts, we identified files typically named in the format `DJIFlightRecord_YYYY-MM-DD_[HH-MM-SS].txt`. These files are encrypted and contain the raw flight data. After going through all the available artifacts, we compiled the flight log files we found, named the collection DroNER, and made it publicly accessible on [Mendeley Data](https://data.mendeley.com/datasets/fwcjyc754h/1).
61+
62+
**File Path Structure (Conceptual):**
63+
```
64+
Drone_Model/
65+
└── DatasetID/
66+
└── YYYY_Month/
67+
└── controller_device/
68+
└── DJIFlightRecord_2017-08-29_[14-30-27].txt
69+
```
70+
In the `raw` folder of the `droner` collection, the extracted log files are arranged in the structure above. For instance, a log file from the DJI Matrice 600 model is stored in the `DJI_Matrice_600\df034\2018_June\mobile_android_logical` directory, with a sample flight log file name of `DJIFlightRecord_2018-06-20_[10-11-44]`. -->
71+
72+
## 3. Decrypting the Logs with DJI Phantom Help
73+
74+
DJI flight logs are encrypted, so they must be decrypted before their messages can be read. We used the online **DJI Phantom Help Log Viewer** for this purpose.
75+
76+
* **Tool Used:** [https://www.phantomhelp.com/logviewer/upload/](https://www.phantomhelp.com/logviewer/upload/)
77+
78+
* **Step-by-Step Decryption:**
79+
1. **Access the Tool:** Navigate to the DJI Phantom Help Log Viewer website.
80+
2. **Upload File:** Click the "Upload" button and select an encrypted `DJIFlightRecord_*.txt` file from your local `controller_device` directory.
81+
3. **Process:** The tool automatically processes and decrypts the file.
82+
4. **Download CSV:** Once decrypted, download the resulting human-readable data as a `.csv` file.
83+
5. **Repeat:** This manual process was performed for each individual encrypted flight log file.
84+
85+
!!! warning "Manual Process Note"
86+
Because the online tool accepts one upload at a time, decryption was a manual, file-by-file process. For very large collections, this step can be quite time-consuming.
87+
88+
## 4. Extracting Relevant Messages for NLP
89+
90+
The downloaded CSV files contain numerous columns, encompassing various types of flight data. For our NLP analysis, we focused on columns specifically containing human-readable messages.
91+
92+
* **Key Columns Extracted:**
93+
* `APP.message`: Contains general operational messages.
94+
* `APP.tip`: Provides advisory or tip messages.
95+
* `APP.warning`: Contains warning or error messages.
96+
97+
* **Constructing the Forensic Timeline:**
98+
For each extracted log, we then paired these messages with their precise timestamps. This allowed us to reconstruct a chronological "forensic timeline" of events and communications from the drone system during the flight. This timeline became the raw input for our data cleansing procedures.
99+
100+
* **Timeline Elements:** `Timestamp (Date and Time)` + `Message Content`
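
The pairing of messages with timestamps can be sketched with Python's standard `csv` module. This is a minimal sketch under stated assumptions, not our actual pipeline: the function name `build_timeline` is hypothetical, the name of the timestamp column is passed in by the caller because it is not specified above, and the code assumes that cells are empty when a row carries no message and that rows are already in chronological order.

```python
import csv

# The three message columns named in the list above.
MESSAGE_COLUMNS = ("APP.message", "APP.tip", "APP.warning")


def build_timeline(csv_path, timestamp_column):
    """Return (timestamp, source_column, message) tuples in file order.

    timestamp_column is supplied by the caller; its real name in the
    decrypted CSV is not stated here, so it is left as a parameter.
    """
    timeline = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            for column in MESSAGE_COLUMNS:
                message = (row.get(column) or "").strip()
                if message:  # skip rows where this column is empty
                    timeline.append((row[timestamp_column], column, message))
    return timeline
```

Keeping the source column alongside each message preserves whether an entry was a general message, a tip, or a warning, which can be useful during later cleansing.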
101+
102+
---
103+
104+
## Conclusion
105+
106+
This meticulous extraction and decryption process was fundamental in transforming raw, often inaccessible, drone flight data into a rich source of textual information for our NLP research. It highlights the practical challenges involved in acquiring and preparing data from real-world forensic artifacts.

docs/methodology/data-collection.md

Lines changed: 7 additions & 2 deletions
@@ -33,19 +33,24 @@ In addition to AirData, a significant portion of our raw data was derived from f
3333
* **Message Source Identification:**
3434
Upon initial examination of the VTO Labs data, it was observed that human-readable flight log messages were consistently found **only within log files acquired from controller devices**. Logs from other drone components (e.g., the aircraft itself) were often in proprietary binary formats or contained only telemetry data without explicit textual messages relevant to our NLP tasks.
3535

36+
!!! info "Detailed Extraction Guide"
37+
For a comprehensive, step-by-step guide on how we extracted and decrypted these flight log files from the VTO Labs forensic images, please refer to our dedicated blog post:
38+
[:octicons-arrow-right-24: Extracting Flight Logs from VTO Labs Forensic Images](../blog/vto-labs-extraction.md)
39+
40+
3641
* **Extraction and Decryption Process:**
3742
The extraction process for VTO Labs data involved the following steps:
3843

3944
1. **Controller Log Identification:** For each drone model and dataset ID, all flight log files originating from the `controller_device` artifact were identified.
40-
2. **Local Storage Structure:** These raw, encrypted flight logs were organized and stored locally using the following hierarchical folder structure to maintain provenance:
45+
<!-- 2. **Local Storage Structure:** These raw, encrypted flight logs were organized and stored locally using the following hierarchical folder structure to maintain provenance:
4146
```
4247
Drone_Model/
4348
├── DatasetID/
4449
│ └── YYYY_Month/
4550
│ └── controller_device/
4651
│ └── DJIFlightRecord_YYYY-MM-DD_[HH-MM-SS].txt
4752
```
48-
*Example Filename:* `DJIFlightRecord_2017-08-29_[14-30-27].txt`
53+
*Example Filename:* `DJIFlightRecord_2017-08-29_[14-30-27].txt` -->
4954
3. **Decryption:** The raw `.txt` flight log files from DJI drones are typically encrypted. To obtain human-readable messages, each file was individually decrypted using the **DJI Phantom Help Log Viewer** online tool.
5055
* **Tool Used:** [https://www.phantomhelp.com/logviewer/upload/](https://www.phantomhelp.com/logviewer/upload/)
5156
* **Procedure:** Each encrypted `.txt` log file was uploaded to the Phantom Help Log Viewer, decrypted, and then the resulting human-readable data was downloaded as a `.csv` file. This was a manual, file-by-file process.

docs/methodology/index.md

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
1+
# Our Methodology: A Transparent Approach
2+
3+
The success and reliability of any data-driven research, especially in critical domains like aviation safety, hinge upon a meticulously documented and rigorously applied methodology. This section provides a comprehensive overview of the processes involved in creating the "NLP for Drone Flight Log Analysis" dataset and the associated analytical frameworks.
4+
5+
Our commitment to open science and reproducibility means that every step, from initial data collection and cleansing to detailed annotation procedures, is thoroughly explained and justified by relevant industry standards and academic best practices.
6+
7+
---
8+
9+
## Key Phases of Our Methodology:
10+
11+
Our approach is structured around the following interconnected phases:
12+
13+
<div class="grid cards" markdown>
14+
15+
- :material-cloud-download:{ .lg .middle } __Data Collection__
16+
17+
Details on how raw drone flight log messages were acquired from diverse sources, including AirData UAV and forensic artifacts from VTO Labs.
18+
19+
[:octicons-arrow-right-24: Learn More](data-collection.md)
20+
21+
- :material-broom:{ .lg .middle } __Data Cleansing__
22+
23+
An in-depth explanation of the procedures used to transform noisy, inconsistent raw messages into a clean, standardized, and machine-readable format.
24+
25+
[:octicons-arrow-right-24: Learn More](cleansing.md)
26+
27+
- :material-clipboard-edit-outline:{ .lg .middle } __Annotation Procedures__
28+
29+
Detailed guidelines for how flight log messages were annotated for various NLP tasks, ensuring consistency and high inter-annotator agreement.
30+
31+
[:octicons-arrow-right-24: Learn More](general-annotation.md)
32+
33+
- :material-scale-balance:{ .lg .middle } __Justification & Standards__
34+
35+
How our annotation guidelines and cleansing decisions are grounded in existing aviation regulations and standard documents (e.g., FAA, NTSB, INTERPOL).
36+
37+
[:octicons-arrow-right-24: Learn More](annotation-justification.md)
38+
39+
</div>
40+
41+
---
42+
43+
## Our Commitment to Reproducibility
44+
45+
By detailing our methodology here, we aim to provide researchers, practitioners, and enthusiasts with the necessary context and understanding to replicate our work, build upon our findings, and contribute to the advancement of NLP in drone aviation. We encourage thorough review and welcome feedback on our methods.
46+
47+
!!! tip "Dive Deeper"
48+
Explore the sub-sections in the sidebar for granular details on each phase of our methodology.

mkdocs.yml

Lines changed: 3 additions & 2 deletions
@@ -1,5 +1,5 @@
11
site_name: NLP for Drone Flight Log Analysis
2-
site_url: https://swardiantara.github.io/DroneNLP/ # IMPORTANT: Update this with your GitHub username and repository name
2+
site_url: https://dronenlp.github.io/documentation/
33
site_description: Unlocking Insights from Drone Flight Logs with NLP.
44
site_author: Swardiantara Silalahi
55

@@ -35,6 +35,7 @@ nav:
3535
- Home: index.md
3636
- About the Project: about.md
3737
- Methodology:
38+
- methodology/index.md
3839
- Data Collection & Initial Scope: methodology/data-collection.md
3940
- Data Cleansing Procedure: methodology/cleansing.md
4041
- Annotation Justification & Standards: methodology/annotation-justification.md
@@ -46,7 +47,7 @@ nav:
4647
- Challenge 002: Autel EVO II - Flight Y - dataset/challenge-002.md
4748
- Dataset Download: download.md
4849
- Publications: publications.md
49-
- Resources: resources.md
50+
- Blog: blog/index.md
5051
- Contact: contact.md
5152

5253
markdown_extensions:
