Distributed Systems

Distributed System communication


Consistency: Consistency means that all clients see the same data at the same time, regardless of which node they connect to.
Availability: Availability means that any client requesting data receives a response, even if some of the nodes are down.
Partition Tolerance: A partition indicates a communication break between two nodes. Partition tolerance means that the system continues to operate despite network partitions.


Availability Percentages versus Service Downtime

Availability %         Downtime per Year   Downtime per Month   Downtime per Week
90% (1 nine)           36.5 days           72 hours             16.8 hours
99% (2 nines)          3.65 days           7.20 hours           1.68 hours
99.5% (2.5 nines)      1.83 days           3.60 hours           50.4 minutes
99.9% (3 nines)        8.76 hours          43.8 minutes         10.1 minutes
99.99% (4 nines)       52.56 minutes       4.32 minutes         1.01 minutes
99.999% (5 nines)      5.26 minutes        25.9 seconds         6.05 seconds
99.9999% (6 nines)     31.5 seconds        2.59 seconds         0.605 seconds
99.99999% (7 nines)    3.15 seconds        0.259 seconds        0.0605 seconds
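
The downtime figures above follow from simple arithmetic: downtime = (1 - availability) x period length. A minimal sketch of that calculation is shown here. The function name `allowed_downtime` and the period lengths (365-day year, 30-day month, 7-day week) are assumptions made for illustration, so monthly results can differ slightly from figures computed with an average-length month.

```python
from datetime import timedelta

# Assumed period lengths: a 365-day year, a 30-day month, a 7-day week.
PERIODS = {
    "year": timedelta(days=365),
    "month": timedelta(days=30),
    "week": timedelta(days=7),
}


def allowed_downtime(availability_percent: float) -> dict[str, timedelta]:
    """Return the maximum downtime per period for a given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    unavailable_fraction = 1 - availability_percent / 100
    # Multiplying a timedelta by a float scales the duration.
    return {name: length * unavailable_fraction for name, length in PERIODS.items()}


if __name__ == "__main__":
    for name, downtime in allowed_downtime(99.9).items():
        minutes = downtime.total_seconds() / 60
        print(f"99.9% availability allows {minutes:.1f} minutes of downtime per {name}")
```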


Mean time between failures (MTBF) and mean time to repair (MTTR)
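
These two metrics are commonly combined into a steady-state availability estimate, availability = MTBF / (MTBF + MTTR). The sketch below illustrates that relationship; the function name `steady_state_availability` and the sample numbers are hypothetical.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return availability as a fraction: MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must be non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical component: fails on average every 999 hours, takes 1 hour to repair.
    availability = steady_state_availability(mtbf_hours=999, mttr_hours=1)
    print(f"Availability: {availability:.3%}")
```

Under this model, availability improves either by failing less often (higher MTBF) or by recovering faster (lower MTTR).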


Scalability: the ability of a system to handle an increasing amount of workload without compromising performance.
Size scalability
  Description: Ability to add resources to handle more workload
  Example: Adding more CPUs to handle more requests
Administrative scalability
  Description: Capacity for multiple users to share a single distributed system
  Example: Multiple companies sharing a cloud-based system
Geographical scalability
  Description: Ability to cater to a broad geographical region
  Example: A search engine serving users in multiple countries
Vertical scalability
  Description: Scaling up by providing additional capabilities to an existing device
  Example: Adding more RAM to a server
Horizontal scalability
  Description: Scaling out by increasing the number of machines in the network
  Example: Adding more nodes to a distributed system


Aspects of maintainability:
  • Ease of ensuring smooth system operations under normal circumstances and of returning to normal conditions after a fault. Important for maintaining system availability and reducing downtime.
  • Simplicity of the code base, making it easy to understand and maintain. Important for reducing maintenance time and costs.
  • Capability of the system to integrate modified, new, and unforeseen features without difficulty. Important for adapting to changing requirements and improving system functionality.
Maintainability (M)
  Definition: Probability that the service will restore its functions within a specified time of fault occurrence
  Formula: M = probability of restoring the component to its fully active form within a specified time
  Desired value: High M
Mean Time To Repair (MTTR)
  Definition: Average amount of time required to repair and restore a failed component
  Formula: MTTR = total maintenance time / total number of repairs
  Desired value: Low MTTR
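
As a worked illustration of the MTTR formula, the sketch below computes MTTR from a list of repair durations. It also estimates M empirically as the fraction of repairs finished within a time limit; that estimator, the function names, and the sample durations are assumptions made for this example and are not defined in these notes.

```python
def mean_time_to_repair(repair_minutes: list[float]) -> float:
    """MTTR = total maintenance time / total number of repairs."""
    if not repair_minutes:
        raise ValueError("at least one repair is required")
    return sum(repair_minutes) / len(repair_minutes)


def maintainability_estimate(repair_minutes: list[float], limit_minutes: float) -> float:
    """Assumed estimator for M: share of repairs completed within the limit."""
    if not repair_minutes:
        raise ValueError("at least one repair is required")
    within_limit = sum(1 for minutes in repair_minutes if minutes <= limit_minutes)
    return within_limit / len(repair_minutes)


if __name__ == "__main__":
    durations = [30, 45, 15]  # hypothetical repair times in minutes
    print("MTTR (minutes):", mean_time_to_repair(durations))
    print("Estimated M within 30 minutes:", maintainability_estimate(durations, 30))
```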

Fault tolerance

Replication
  Description: Replicating both services and data to swap out failed nodes or data stores with healthy ones
  Considerations: Consistency vs. availability trade-off; synchronous vs. asynchronous updates
Checkpointing
  Description: Saving the system's state in stable storage when the system state is consistent
  Considerations: Consistency vs. availability trade-off; consistent vs. inconsistent state at checkpoint time
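
Saving state to stable storage so that a restarted process can resume from it is commonly called checkpointing. A minimal single-process sketch follows. The function names, the JSON file format, and the write-then-rename approach are assumptions chosen for illustration; a real distributed checkpoint must also coordinate across nodes so that the saved state is globally consistent.

```python
import json
import os
import tempfile


def save_checkpoint(state: dict, path: str) -> None:
    """Write state atomically: fill a temp file, flush it to disk, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to stable storage
        # os.replace swaps the file in one step, so readers never see a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path: str, default: dict) -> dict:
    """Return the last saved state, or the default if no checkpoint exists."""
    try:
        with open(path) as checkpoint_file:
            return json.load(checkpoint_file)
    except FileNotFoundError:
        return default


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        checkpoint_path = os.path.join(workdir, "state.json")
        save_checkpoint({"processed": 42}, checkpoint_path)
        print(load_checkpoint(checkpoint_path, default={"processed": 0}))
```

Writing to a temporary file first means a crash in the middle of a save leaves the previous checkpoint intact.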

Types of Data Center Servers

Web Servers
  Role: Handle API calls from clients behind the load balancer (mostly serve static content)
  Example resources: Medium memory and storage resources, good computational resources
Application Servers
  Role: Run core application software and business logic (primarily provide dynamic content)
  Example resources: Extensive computational and storage resources, volatile and non-volatile storage, up to 256 GB RAM and 6.5 TB storage
Storage Servers
  Role: Store and manage structured and non-structured data
  Example resources: Structured (SQL) and non-structured (NoSQL) data management systems, storage capacity up to 120 TB, exabytes of storage, 32 GB RAM

Load balancing

Local vs. global load balancing

Definition
  Local: Balancing within a data center.
  Global: Balancing traffic across multiple geographical regions.
Purpose
  Local: Improving efficiency and better resource utilization within a data center.
  Global: Distributing traffic intelligently across multiple geographical regions.
Scope
  Local: Within a data center.
  Global: Across multiple geographical regions.
Technology used
  Local: Reverse proxy.
  Global: Load Balancing as a Service (LBaaS).
Installation location
  Local: Within the data center.
  Global: Can be installed on-premises or obtained through LBaaS.
Load balancing technique
  Local: Divides incoming requests among the pool of available servers.
  Global: Uses techniques such as DNS and round-robin to perform load balancing.
Limitations
  Local: Limited control over the client's behavior, small packet size (512 bytes), clients can't determine the closest address to establish a connection with.
  Global: Can suffer from uneven load distribution on end servers; keeps distributing the IP addresses of crashed servers until the TTL of the cached entries expires.
Use of ADCs (application delivery controllers)
  Local: Used as an additional layer of load balancing.
  Global: ADCs can implement GSLB (global server load balancing).
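
To make the round-robin technique and the crashed-server problem concrete, here is a minimal sketch of a round-robin balancer that skips servers marked as down. The class `RoundRobinBalancer`, its method names, and the server names are hypothetical. In practice a health checker would call `mark_down`, whereas plain round-robin DNS has no such step and keeps handing out the address of a crashed server until cached entries expire.

```python
class RoundRobinBalancer:
    """Hand out servers in rotation, skipping any that are marked unhealthy."""

    def __init__(self, servers: list[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._healthy = set(servers)
        self._next = 0

    def mark_down(self, server: str) -> None:
        self._healthy.discard(server)

    def mark_up(self, server: str) -> None:
        if server in self._servers:
            self._healthy.add(server)

    def pick(self) -> str:
        # Try each server at most once per call, starting from the rotation pointer.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    print([balancer.pick() for _ in range(4)])
    balancer.mark_down("app-2")
    print([balancer.pick() for _ in range(3)])
```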


Relational vs. non-relational (NoSQL) databases

Data structure
  Relational: Organized in one or more tables/relations
  NoSQL: Can be structured, semi-structured, or unstructured data
Query language
  Relational: Structured Query Language (SQL)
  NoSQL: Various languages depending on the type of NoSQL database
Schema
  Relational: Follows a strict schema and requires data to conform to it
  NoSQL: Dynamic schema allows for flexible data
Scalability
  Relational: Primarily vertical scaling; can scale horizontally by splitting tables across nodes
  NoSQL: Can scale horizontally by adding more nodes
ACID properties
  Relational: Provides full ACID compliance
  NoSQL: Often sacrifices some level of consistency for availability and partition tolerance
Use case
  Relational: Best for structured data with complex relationships and strict integrity constraints
  NoSQL: Best for unstructured or semi-structured data and high scalability needs
Types of NoSQL Database

Key-value Database
  Stores data as key-value pairs using hash tables, with the key serving as a unique or primary key, and values being anything from simple scalar values to complex objects. Efficient for session-oriented applications, such as web applications. Example databases include Amazon DynamoDB, Redis, and Memcached DB.
Document Database
  Designed to store and retrieve documents in formats like XML, JSON, BSON, etc. Documents are composed of a hierarchical tree data structure that can include maps, collections, and scalar values. Suitable for unstructured catalog data and content management applications. Example databases include MongoDB and Google Cloud Firestore.
Graph Database
  Uses the graph data structure to store data, where nodes represent entities and edges show relationships between entities. Allows storing data once and interpreting it differently based on relationships. Suitable for social applications, data regulation and privacy, machine learning research, and financial services-based applications. Example databases include Neo4j, OrientDB, and InfiniteGraph.
Columnar Database
  Stores data in columns instead of rows, enabling quick and efficient access to all entries in a database column. Suitable for large numbers of aggregation and data analytics queries. Example databases include Cassandra, HBase, Hypertable, and Amazon SimpleDB.
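
The difference between row and column storage can be sketched with plain Python structures. This is an illustration of the layout idea only, not how any of the databases named above store data internally. The sample records and the helper `to_columns` are hypothetical, and the helper assumes every record has the same fields.

```python
# Row-oriented layout: one record per entry, all fields stored together.
rows = [
    {"order_id": 1, "customer": "ada", "amount": 30.0},
    {"order_id": 2, "customer": "bob", "amount": 12.5},
    {"order_id": 3, "customer": "ada", "amount": 7.5},
]


def to_columns(records: list[dict]) -> dict[str, list]:
    """Pivot row-oriented records into one list of values per column."""
    columns: dict[str, list] = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: an aggregate has to visit every record and pick out one field.
total_from_rows = sum(record["amount"] for record in rows)

# Column layout: the same aggregate reads a single contiguous list.
total_from_columns = sum(columns["amount"])

if __name__ == "__main__":
    print(columns["amount"], total_from_rows, total_from_columns)
```

Because an aggregation touches only the column it needs, column layouts tend to suit analytics queries, while row layouts suit reading or writing whole records.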


  • Cloud computing is running applications on computing resources managed by cloud providers. When using cloud computing, we do not have to purchase or manage hardware ourselves.
  • Serverless computing builds on the convenience of cloud computing with even more automation. It enables developers to build and run applications without having to provision cloud servers. The serverless provider handles the infrastructure and automatically scales the computing resources up or down as needed. This provides a great developer experience since developers can focus on the application code itself, without having to worry about scaling.