concept: noSQL
David Liu edited this page Jan 8, 2025 · 11 revisions
RDBMSs struggle to provide the following functions efficiently in a distributed system:
- Transactions
- Referential Integrity: Joins
Polymorphism
- a.k.a. schemaless
- This eliminates the need for maintaining a central system (data) catalog or updating an object-relational mapper (ORM)
- Downstream techniques, e.g. slowly changing dimension (SCD) types and design, are not available, because every dimension is already joined to the fact data in an extremely denormalized form.
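As a rough illustration of the points above, the sketch below keeps records of different shapes in one collection and embeds the dimension directly in each fact record. The `orders` collection, its field names, and both helper functions are hypothetical and not tied to any particular product.

```python
# Hypothetical "orders" collection: no catalog or ORM mapping declares these fields.
orders = [
    # The customer dimension is copied into the fact record (denormalized).
    {"order_id": 1, "amount": 30.0, "customer": {"name": "Ann", "city": "Oslo"}},
    # A record with a different shape can sit in the same collection.
    {"order_id": 2, "amount": 12.5, "customer": {"name": "Ann", "city": "Bergen"},
     "coupon": "X1"},
]


def total_by_city(records):
    """Aggregate without a join: the city is already inside each record."""
    totals = {}
    for rec in records:
        city = rec["customer"]["city"]
        totals[city] = totals.get(city, 0.0) + rec["amount"]
    return totals


def fields_seen(records):
    """Collect the union of top-level fields, since no schema lists them."""
    names = set()
    for rec in records:
        names.update(rec)
    return sorted(names)
```

In this toy data the customer's city differs between the two orders, but only as two embedded copies. There is no separate dimension row to version, which is one way to read why SCD-style designs do not carry over.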
Based on the Dynamo paper, published by Amazon at the ACM Symposium on Operating Systems Principles (SOSP) in 2007
Riak
- Riak implements the principles from Amazon's Dynamo paper
- Written in Erlang
- Pluggable storage backends for its core storage
- Bitcask is the default
- LevelDB
- open source in 2017
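The pluggable-backend idea can be sketched in a few lines. This is a minimal Python illustration, not Riak's actual Erlang backend API: `StorageBackend`, `InMemoryBackend`, `AppendOnlyBackend`, and `KeyValueStore` are all hypothetical names. The append-only variant is only loosely inspired by log-structured designs such as Bitcask.

```python
from abc import ABC, abstractmethod
from typing import Optional


class StorageBackend(ABC):
    """Hypothetical interface that every backend must satisfy."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None:
        ...

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        ...


class InMemoryBackend(StorageBackend):
    """Simplest possible backend: a plain dictionary."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


class AppendOnlyBackend(StorageBackend):
    """Append-only log plus an index pointing at the latest entry per key."""

    def __init__(self) -> None:
        self._log = []    # entries are never overwritten, only appended
        self._index = {}  # key -> position of the newest entry in the log

    def put(self, key: str, value: bytes) -> None:
        self._index[key] = len(self._log)
        self._log.append((key, value))

    def get(self, key: str) -> Optional[bytes]:
        pos = self._index.get(key)
        return None if pos is None else self._log[pos][1]


class KeyValueStore:
    """The store talks only to the interface, so backends are swappable."""

    def __init__(self, backend: StorageBackend) -> None:
        self._backend = backend

    def put(self, key: str, value: bytes) -> None:
        self._backend.put(key, value)

    def get(self, key: str) -> Optional[bytes]:
        return self._backend.get(key)
```

Swapping `InMemoryBackend()` for `AppendOnlyBackend()` when constructing `KeyValueStore` changes how data is kept without touching the calling code.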
A very wide, sparsely populated table structure
- It includes a number of column families that specify the keys for this particular table structure.
- It originated with the Bigtable paper, presented in 2006 at the Symposium on Operating Systems Design and Implementation (OSDI)
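A wide, sparse table with column families can be modelled as nested maps. The sketch below is a toy model under that assumption: `WideColumnTable` and its methods are hypothetical and do not mirror the client API of Bigtable, Cassandra, or HBase.

```python
from collections import defaultdict


class WideColumnTable:
    """Toy model: row key -> column family -> column qualifier -> value."""

    def __init__(self, families):
        # Assumption: families are declared up front, qualifiers are free-form.
        self._families = set(families)
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, qualifier, value):
        if family not in self._families:
            raise KeyError(f"unknown column family: {family}")
        self._rows[row_key][family][qualifier] = value

    def get(self, row_key, family, qualifier, default=None):
        # Use .get at each level so that reads never create empty rows.
        row = self._rows.get(row_key, {})
        return row.get(family, {}).get(qualifier, default)

    def columns(self, row_key):
        """List only the columns this row actually stores (sparse)."""
        row = self._rows.get(row_key, {})
        return sorted(f"{fam}:{q}" for fam, cols in row.items() for q in cols)
```

Each row stores only the columns it has values for, so two rows in the same table can have entirely different column sets without any padding for the missing ones.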
Use cases
- In many cases, these systems are combined with batch-oriented systems that use MapReduce as the processing model for advanced querying
- Like a document store, but with a table schema
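The MapReduce processing model mentioned above can be sketched in a single process. This is an illustrative, non-distributed version: `map_reduce`, `country_mapper`, `sum_reducer`, and the sample rows are all hypothetical names and data.

```python
from collections import defaultdict


def map_phase(rows, mapper):
    """Apply the mapper to every row and emit (key, value) pairs."""
    for row in rows:
        yield from mapper(row)


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into one result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def map_reduce(rows, mapper, reducer):
    return reduce_phase(shuffle(map_phase(rows, mapper)), reducer)


def country_mapper(row):
    # Rows are sparse, so skip those without a "country" column.
    if "country" in row:
        yield row["country"], 1


def sum_reducer(key, values):
    return sum(values)
```

Calling `map_reduce(rows, country_mapper, sum_reducer)` on a list of row dictionaries counts rows per country and ignores rows where that column is absent. A real batch system would run the map and reduce phases on many machines.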
Vendor
- Google Cloud Bigtable
- Apache Cassandra
- HBase
- A document can be XML, JSON, YAML, or a LibreOffice/MS 365 file
- Documents are stored and retrieved in a schemaless fashion
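Schemaless storage and retrieval can be sketched with JSON documents held in an in-memory map. `DocumentStore` and its methods are hypothetical and only approximate what a real document database does. In particular, `find` is a full scan with no indexes.

```python
import json
import uuid


class DocumentStore:
    """Toy document store: no schema is declared or checked on insert."""

    def __init__(self):
        self._docs = {}  # document id -> serialized JSON text

    def insert(self, document):
        doc_id = str(uuid.uuid4())
        self._docs[doc_id] = json.dumps(document)
        return doc_id

    def get(self, doc_id):
        raw = self._docs.get(doc_id)
        return None if raw is None else json.loads(raw)

    def find(self, field, value):
        """Scan all documents; ones lacking the field simply do not match."""
        matches = []
        for raw in self._docs.values():
            doc = json.loads(raw)
            if field in doc and doc[field] == value:
                matches.append(doc)
        return matches
```

Documents with different fields can share the store, and a query on a field that some documents lack returns only those that have it.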
Vendor