Designing Data-Intensive Applications
Data Storage
There are two main types of database storage engines: log-structured storage engines and page-oriented storage engines.
The simplest possible data storage mechanism is an append-only log of key-value pairs. Writes are very simple and take constant time. However, querying for a specific key requires scanning the entire log in the worst case, so reads have O(n) time complexity. This quickly becomes infeasible as the database grows (imagine searching for a specific user among Facebook’s 2.8B active users!).
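The trade-off can be seen in a minimal sketch of such a log. The class name AppendOnlyLog, the "key,value" line format, and the file name db.log are all assumptions made for illustration; the sketch also assumes keys contain no commas and values contain no newlines.

```python
import os
import tempfile


class AppendOnlyLog:
    """Toy key-value store backed by an append-only text file (illustrative only)."""

    def __init__(self, path):
        self.path = path
        # Create the file if it does not exist yet.
        open(self.path, "a", encoding="utf-8").close()

    def set(self, key, value):
        # Write path: append one "key,value" record to the end of the file.
        # Nothing already in the file is read or modified.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(f"{key},{value}\n")

    def get(self, key):
        # Read path: scan every record. The last match wins, because newer
        # records are always appended after older ones.
        result = None
        with open(self.path, "r", encoding="utf-8") as f:
            for line in f:
                k, _, v = line.rstrip("\n").partition(",")
                if k == key:
                    result = v
        return result


path = os.path.join(tempfile.mkdtemp(), "db.log")
db = AppendOnlyLog(path)
db.set("user:1", "alice")
db.set("user:2", "bob")
db.set("user:1", "alicia")   # an update is just another append
latest = db.get("user:1")    # has to scan all three records
missing = db.get("user:3")   # also scans all three records
```

Every call to get reads the whole file, which is the O(n) cost described above. Avoiding that scan requires some extra structure on top of the log, such as an index.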
Data Models
Relational model
An object-relational mapping (ORM) framework is generally used to work with data in a manner compatible with object-oriented programming languages.
Data is normalized. Total database size can be smaller since duplication is reduced. However, queries may require reading from multiple tables and joining them. Another benefit is that a change can be made in one place by editing the referenced row.
Schema on write: the schema is enforced when data is written. This can make it difficult to modify the schema, since changing the structure of existing data requires an expensive migration.
Declarative query language. The query optimizer selects the best way to execute the request.
Examples of relational databases: Oracle, MySQL, Microsoft SQL Server, PostgreSQL
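As a rough illustration of normalization, joins, and declarative queries, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The users and regions tables and their contents are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema on write: the table structure is declared before any row is stored.
conn.executescript("""
    CREATE TABLE regions (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        region_id INTEGER NOT NULL REFERENCES regions(id)
    );
    INSERT INTO regions VALUES (1, 'Greater Seattle Area');
    INSERT INTO users VALUES (1, 'Alice', 1), (2, 'Bob', 1);
""")

# Declarative query: it states which rows are wanted, not how to find them.
# Reading a user's region name requires a join across the two tables.
query = """
    SELECT users.name, regions.name
    FROM users JOIN regions ON users.region_id = regions.id
    ORDER BY users.id
"""
before = conn.execute(query).fetchall()

# The region name is stored once, so renaming it is a single-row update
# that every referencing user picks up on the next read.
conn.execute("UPDATE regions SET name = 'Seattle Metro' WHERE id = 1")
after = conn.execute(query).fetchall()
conn.close()
```

Both users show the new region name after one UPDATE, at the cost of a join on every read.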
Document model
Examples of document model databases: MongoDB
Better scalability. Non-restrictive schema. Specialized query operations?
Better locality - All data is stored in a single document. Avoids performing multiple queries and performing joins.
Schema on read: the structure of the data is implicit and is interpreted only when the data is read. The document self-describes the schema.
Imperative query language.
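Locality and schema on read can be sketched with plain JSON documents, without assuming any particular document database. The field names and the display_name helper below are hypothetical and exist only for this example.

```python
import json

# Two documents in the same collection with different shapes.
# Nothing rejected the second one when it was written.
raw_documents = [
    '{"id": 1, "name": "Alice Smith", '
    '"positions": [{"title": "Engineer", "org": "Acme"}]}',
    '{"id": 2, "first_name": "Bob", "last_name": "Jones", "positions": []}',
]


def display_name(doc):
    # Schema on read: the reading code decides how to interpret each
    # document, handling both the old and the new field layout.
    if "name" in doc:
        return doc["name"]
    return f'{doc["first_name"]} {doc["last_name"]}'


docs = [json.loads(raw) for raw in raw_documents]
names = [display_name(d) for d in docs]

# Locality: the positions are nested inside the user document, so a
# single read returns them without a second query or a join.
titles = [p["title"] for p in docs[0]["positions"]]
```

The flexibility has a cost: the knowledge of which shapes exist now lives in application code such as display_name instead of in a schema the database enforces.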