Lesson 1
Consistency models: strong, causal, eventual, read-your-writes, monotonic reads
We examine how consistency works in distributed systems, covering the strong, causal, and eventual models along with the read-your-writes and monotonic-reads guarantees. The lesson explains what each one promises, the anomalies each can expose, and how applications choose a model that matches user expectations.
Strong consistency guarantees
Eventual consistency and convergence
Causal consistency and ordering
Read-your-writes and monotonic reads
Choosing models for applications

Lesson 2
Distributed consensus algorithms: Paxos, Raft, and practical implementations (etcd, Consul)
We introduce the Paxos and Raft consensus algorithms and their roles in leader election, log replication, and configuration changes, then connect these ideas to real tools such as etcd and Consul, which are used for tracking shared state, distributed locks, and coordination.
Consensus problem and safety goals
Paxos algorithm core ideas
Raft algorithm and log replication
Cluster membership and reconfiguration
Using etcd and Consul in practice

Lesson 3
Sharding and partitioning strategies: range, hash, and directory-based
We go into detail on sharding and partitioning strategies, including range, hash, and directory-based methods, focusing on data distribution, hotspot avoidance, rebalancing, and request routing, and on how to choose and refine a scheme as workload and data grow.
Range-based partitioning design
Hash-based sharding and hashing
Directory and lookup-based routing
Rebalancing and resharding methods
Avoiding hotspots and skewed keys

Lesson 4
Replication models: leader-follower, multi-leader, and leaderless patterns
We cover leader-follower, multi-leader, and leaderless replication, explaining how writes and reads flow, how failures, replication lag, and conflicts are handled, and how each model affects latency, throughput, durability, and operational cost in globally distributed deployments.
Leader-follower replication flows
Multi-leader replication and conflicts
Leaderless quorum-based replication
Replication lag and read consistency
Operational trade-offs of each model

Lesson 5
CAP theorem and trade-offs between consistency, availability, and partition tolerance
We explore the CAP theorem and what it means for distributed databases, clarifying how consistency, availability, and partition tolerance interact, and how real systems navigate the trade-offs using practical designs and service-level goals.
Formal statement of the CAP theorem
Consistency vs availability in practice
Partition tolerance in real networks
Designing around CAP with SLAs

Lesson 6
Network partitions, latency, and failure modes across WAN links
We examine how network partitions, latency, and failures appear across wide-area links, covering timeouts, partial failures, and split-brain scenarios, and how to design detection, retry, and fallback mechanisms that keep systems stable under stress.
Characteristics of WAN links
Detecting partitions and timeouts
Handling partial and asymmetric failures
Split-brain risks and mitigation
Graceful degradation strategies

Lesson 7
Idempotency, retries, and at-least-once vs exactly-once semantics
We explain idempotency as the basis for safe retries, distinguish at-least-once, at-most-once, and exactly-once semantics, and show techniques for deduplication, request tracking, and message handling in unreliable distributed environments.
Defining idempotent operations
Designing safe retry mechanisms
At-least-once vs at-most-once
Exactly-once semantics limitations
Deduplication and request tracking

Lesson 8
Concurrency control: optimistic vs pessimistic, MVCC, conflict resolution techniques
We look at concurrency control in distributed databases, comparing optimistic and pessimistic approaches, explaining MVCC internals, and showing conflict detection and resolution methods that preserve correctness while allowing high concurrency.
Pessimistic locking in distributed systems
Optimistic control and validation
MVCC snapshots and version chains
Conflict detection and resolution
Deadlocks, timeouts, and retries

Lesson 9
Physical topology patterns: single region, active-passive, active-active, and hybrid
We describe deployment topologies for distributed databases, including single-region, active-passive, active-active, and hybrid layouts, and assess their effects on latency, failover, data consistency, and operational complexity.
Single-region deployment trade-offs
Active-passive failover patterns
Active-active multi-region setups
Hybrid and tiered topology designs
Latency, RPO, and RTO considerations