MiniAWS - Distributed Object Storage System

A C++ RAID 0 Implementation modeled after AWS S3

View the Web Dashboard & Interface Here

Overview

MiniAWS is a high-performance distributed file system that simulates the core functionality of S3, AWS's cloud object storage, using external hard drive(s) as storage nodes. Instead of writing a large file sequentially to a single drive, it applies RAID 0 striping: the file is split into chunks, and a round-robin algorithm distributes those chunks across the storage nodes.
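The striping idea can be illustrated with a minimal sketch. It is not taken from the MiniAWS source: the `Chunk` struct and the `stripe` function are hypothetical names, the data is held in memory rather than read from a file, and chunk i is simply assigned to drive i % driveCount.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: one chunk and the drive it was assigned to.
struct Chunk {
    std::size_t index;  // position of the chunk within the original file
    std::size_t drive;  // drive index chosen by round-robin
    std::string bytes;  // chunk payload
};

// Split `data` into chunks of at most `chunkSize` bytes and assign
// chunk i to drive i % driveCount (round-robin).
std::vector<Chunk> stripe(const std::string& data,
                          std::size_t chunkSize,
                          std::size_t driveCount) {
    assert(chunkSize > 0 && driveCount > 0);
    std::vector<Chunk> chunks;
    for (std::size_t offset = 0, i = 0; offset < data.size();
         offset += chunkSize, ++i) {
        // substr clamps at the end of the string, so the last chunk
        // may be shorter than chunkSize.
        chunks.push_back({i, i % driveCount, data.substr(offset, chunkSize)});
    }
    return chunks;
}
```

With four drives, a file that splits into five chunks would place chunks 0 to 3 on drives 0 to 3 and wrap chunk 4 back to drive 0.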

Benefits

Writing across multiple drives is faster than a standard single-drive write because each drive can write its own chunks simultaneously. In the ideal case, file write time drops by a factor equal to the number of drives.
The system uses parallelism to cut write times: a DMA thread performs the writes, so the main program keeps running and can do other tasks without being interrupted.
MiniAWS is also scalable. Instead of storing a large file on a single storage unit (such as the computer's C or D drive), the file is spread across storage nodes; adding another node is simple, keeps the file accessible, and lowers the storage cost per drive.
Optimized and encrypted storage: as of a recent update, the original file is no longer physically stored. The system stores file information as metadata and reassembles the file on a READ operation. Because the original file is not kept, no storage is spent on an unnecessary duplicate.

(ex: an 8 GB file is chunked and the original is preserved => 16 GB of total storage used)

vs.

(ex: an 8 GB file is chunked, the original is deleted, and its info is stored as metadata => only 8 GB of total storage used)

You can read more about the metadata storage system below:

Metadata-Based Storage Architecture

Upload: The file is chunked, its metadata (original name, size) is saved to a JSON store, and the local copy is securely deleted.
Download: The system queries the metadata, locates the chunks across the drives by referencing the Stripe Map, and reassembles the file using its original name and size.
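As a rough illustration of the metadata record, the sketch below serializes a file's name and size to a JSON string. The `FileMetadata` struct, its field names, the `chunks` field, and the helper functions are assumptions made for this example; the actual layout of the JSON store is not documented here, and a real implementation would likely use a JSON library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical metadata record; the field names are illustrative.
struct FileMetadata {
    std::string originalName;
    std::size_t sizeBytes;
    std::size_t chunkCount;
};

// Number of chunks needed for `sizeBytes` bytes (ceiling division).
std::size_t chunkCountFor(std::size_t sizeBytes, std::size_t chunkSize) {
    assert(chunkSize > 0);
    return (sizeBytes + chunkSize - 1) / chunkSize;
}

// Escape quotes and backslashes so the name is a valid JSON string body.
// This sketch does not handle control characters.
std::string escapeJson(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// Serialize the record as a single JSON object.
std::string toJson(const FileMetadata& m) {
    return "{\"name\":\"" + escapeJson(m.originalName) +
           "\",\"size\":" + std::to_string(m.sizeBytes) +
           ",\"chunks\":" + std::to_string(m.chunkCount) + "}";
}
```

Keeping the original size in the record lets a download know exactly how many bytes to expect, including a final chunk that is shorter than the rest.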

⚙️ How It Works

Command: A file is received by --read <fileName>, which can be executed via the CLI or the Web Dashboard (linked above).
Segmentation: The RAID Controller splits the file into N chunks of a predetermined chunk size (TODO: make chunk sizes customizable).
Mapping: The Stripe Map assigns each chunk to a specific drive index (0-3).
Writing: The RAID Controller enqueues each chunk and hands it off to the DMA Controller, which asynchronously writes the chunks to their assigned "Disk" folders.
Reassembling: During retrieval, the system reverses the process, reading chunks according to the Stripe Map to reconstruct the original binary.
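The enqueue-and-hand-off step can be sketched as a queue drained by a worker thread. This is an illustration only, not the project's DMAController: the `BackgroundWriter` class is a hypothetical name, and it appends chunks to in-memory vectors where a real writer would write files into each "Disk" folder. It relies on standard C++11 threading, which on some toolchains needs a thread library linked in (for example `-pthread`).

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DMA-style controller: a worker thread drains
// a queue of (drive, chunk) jobs so the caller does not block on each write.
class BackgroundWriter {
public:
    explicit BackgroundWriter(std::size_t driveCount)
        : disks_(driveCount), worker_([this] { run(); }) {}

    ~BackgroundWriter() { finish(); }

    // Queue a chunk for the given drive and return immediately.
    void enqueue(std::size_t drive, std::string chunk) {
        assert(drive < disks_.size());
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.emplace(drive, std::move(chunk));
        }
        cv_.notify_one();
    }

    // Signal that no more jobs are coming, wait for the queue to drain,
    // and return a copy of the chunks stored per drive.
    std::vector<std::vector<std::string>> finish() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        if (worker_.joinable()) worker_.join();
        return disks_;
    }

private:
    void run() {
        for (;;) {
            std::unique_lock<std::mutex> lock(mutex_);
            cv_.wait(lock, [this] { return done_ || !jobs_.empty(); });
            if (jobs_.empty()) return;  // done_ is set and nothing is left
            auto job = std::move(jobs_.front());
            jobs_.pop();
            lock.unlock();
            // A real writer would write to the drive's folder here.
            disks_[job.first].push_back(std::move(job.second));
        }
    }

    std::vector<std::vector<std::string>> disks_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::pair<std::size_t, std::string>> jobs_;
    bool done_ = false;
    std::thread worker_;  // declared last so the other members exist first
};
```

A single worker keeps the caller unblocked but still writes one chunk at a time. Running one worker per drive would be one way to let the drives write concurrently.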

A demonstration of the output can be viewed in this repo under the MiniAWS directory.

NOTE: temp_dashboard.jpg is the reassembled version of the image after `--read <dashboard.jpg>`. In a production setting, this temp file would be used as a preview/download before being wiped from the physical drive.

