Web Scalability for Startup Engineers
I am currently reading Web Scalability for Startup Engineers by Artur Ejsmont. I plan to write my thoughts as I progress through the book.
Principles of Good Software Design
Simplicity
Local simplicity
When you look at a class, you should be able to quickly understand how it works without having to understand other parts of the system.
Module level - understand what the module does without having to understand its individual functions
Application level - understand key modules and high-level functionality without needing to know individual class details
System level - understand top-level applications and their responsibilities without knowledge of implementation
Avoid Overengineering, Test Driven Development
In an attempt to design solutions that are easy to modify and extend, it is easy to overengineer. Overengineered solutions add complexity and development cost for functionality that may never be needed or used. Follow the YAGNI principle: you aren’t gonna need it.
Martin Fowler on YAGNI
Test driven development helps avoid unnecessary code by focusing strictly on implementing system requirements.
Examples to study
Hadoop is a large and complex platform, but it does a great job of hiding complexity from end users. The MapReduce whitepaper is recommended reading.
Google Maps API is also a great example of a simple, flexible API that solves complex problems.
Loose Coupling
Coupling can be measured by how much two systems know about and depend on each other. An example of no coupling would be two systems that are completely unaware of each other: each can be modified independently without needing to modify the other. Systems that are tightly coupled cannot be modified independently. A change in one requires a change in the other.
A loosely coupled system provides several benefits:
- Hides complexity
- Easier to understand
- Can be modified easily, with few code changes to other systems
Practices to reduce coupling:
- Encapsulation. Only share what is absolutely required
- Avoid circular dependencies
- Avoid requiring clients to call the public interface or API in a specific order (e.g., avoid initialization functions, if possible)
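To make the encapsulation and call-order practices concrete, here is a minimal sketch. The `RateLimiter` class and its methods are hypothetical names invented for illustration, not something from the book. The object is fully usable as soon as it is constructed, and its counter is an internal detail that callers never touch.

```python
class RateLimiter:
    """Hypothetical example: a small public surface with no required call order."""

    def __init__(self, max_calls: int) -> None:
        # Everything the object needs arrives through the constructor,
        # so there is no separate init() step a caller could forget.
        self._max_calls = max_calls
        self._calls = 0  # internal state, not part of the public interface

    def allow(self) -> bool:
        """Return True while the caller is still under the limit."""
        if self._calls >= self._max_calls:
            return False
        self._calls += 1
        return True


limiter = RateLimiter(max_calls=2)
print(limiter.allow(), limiter.allow(), limiter.allow())
```

Because callers only see `allow()`, the counting strategy could later change (for example, to a time window) without touching any calling code.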
Don’t Repeat Yourself (DRY)
Coding to Contract
Draw Diagrams
Single Responsibility
Open-Closed Principle
The open-closed principle refers to “open for extension and closed for modification.” It reduces the need to change existing code and makes future changes cheaper. The best example of this principle is the plugin systems provided by modern web browsers, where a plugin can modify the behavior of the browser (e.g., block ads) without requiring any code changes to the browser itself. Uncle Bob has a blog post discussing the open-closed principle.
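The browser plugin idea can be sketched in a few lines. This is a toy model under my own assumptions: `Browser`, `register_filter`, and `block_ads` are hypothetical names, and real browser extension APIs look nothing like this. The point is only that new behavior is added by registering a function, not by editing `Browser`.

```python
from typing import Callable, List

# A filter receives a URL and returns True if the request should be blocked.
UrlFilter = Callable[[str], bool]


class Browser:
    """Toy browser that is closed for modification but open for extension."""

    def __init__(self) -> None:
        self._filters: List[UrlFilter] = []

    def register_filter(self, url_filter: UrlFilter) -> None:
        # Extension point: plugins hook in here.
        self._filters.append(url_filter)

    def load(self, url: str) -> str:
        if any(url_filter(url) for url_filter in self._filters):
            return "blocked"
        return f"loaded {url}"


def block_ads(url: str) -> bool:
    """Hypothetical plugin: block anything served from an 'ads.' host."""
    return "//ads." in url


browser = Browser()
browser.register_filter(block_ads)
print(browser.load("https://ads.example.com/banner"))
print(browser.load("https://example.com/"))
```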
Dependency Injection and Inversion of Control
James Shore:
Dependency Injection is a 25-dollar term for a 5-cent concept.
Put simply, dependency injection means giving an object its instance variables instead of having the object create them itself. This makes it easy to unit test individual classes, because a test can pass in a fake dependency.
James Shore provides a great explanation of dependency injection on his blog.
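Here is a minimal sketch of that idea. The `Greeter`, `SystemClock`, and `FixedClock` classes are hypothetical names I made up for illustration. `Greeter` never constructs its own clock; whoever creates it hands one in, so a test can substitute a clock with a fixed time.

```python
from datetime import datetime


class SystemClock:
    """Real dependency: reads the current hour from the system."""

    def current_hour(self) -> int:
        return datetime.now().hour


class FixedClock:
    """Test double: always reports the hour it was given."""

    def __init__(self, hour: int) -> None:
        self._hour = hour

    def current_hour(self) -> int:
        return self._hour


class Greeter:
    def __init__(self, clock) -> None:
        # The dependency is injected rather than created here.
        self._clock = clock

    def greeting(self) -> str:
        if self._clock.current_hour() < 12:
            return "Good morning"
        return "Good afternoon"


# Production code injects the real clock; tests inject a fixed one.
print(Greeter(SystemClock()).greeting())
print(Greeter(FixedClock(9)).greeting())
```

Had `Greeter` called `datetime.now()` directly, its output would depend on when the test happened to run.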
Inversion of control is the broader principle, and dependency injection is one form of it. An answer on StackOverflow provides a great concrete example of IoC.
Designing for Scale
Adding more nodes
Adding indistinguishable components
One of the simplest ways to scale is to distribute load over multiple identical servers. A load balancer distributes traffic across the servers, and all servers should respond identically.
A stateless service is one that does not depend on local state, so processing a request does not affect how the service behaves on later requests.
Stateless services are the easiest to scale this way because there is no state to keep in sync between servers. By default, stateless services always respond identically. Stateful services, however, require extra work to synchronize state.
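As a rough illustration, the sketch below puts a round-robin balancer in front of three interchangeable stateless servers. Round-robin is my own choice of strategy here, and the `Server` and `RoundRobinBalancer` classes are hypothetical, in-process stand-ins for real machines.

```python
from itertools import cycle
from typing import List, Tuple


class Server:
    """Stateless server: the response depends only on the request."""

    def __init__(self, name: str) -> None:
        self.name = name

    def handle(self, request: str) -> str:
        return f"echo:{request}"


class RoundRobinBalancer:
    """Sends each request to the next server in a fixed rotation."""

    def __init__(self, servers: List[Server]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)

    def handle(self, request: str) -> Tuple[str, str]:
        server = next(self._servers)
        # Return the server name too, only to show where the request went.
        return server.name, server.handle(request)


balancer = RoundRobinBalancer([Server("a"), Server("b"), Server("c")])
for _ in range(4):
    print(balancer.handle("ping"))
```

Every server returns the same response for the same request, so adding a fourth server to the list needs no other change. That would not hold if `handle` kept per-server data such as a session dictionary.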
Functional partitioning
Dividing the system into smaller subsystems based on functionality
Functional partitioning is a key principle of service oriented architecture. The general idea is to split the system into multiple loosely coupled functional groups that each handle specific responsibilities. For example, you can move the database, cache, and message queue servers into their own subsystems, which lets you scale each subsystem independently. It is also common to partition the application service (the monolith) into smaller functional subsystems, so that multiple engineering teams can develop their subsystems independently and in parallel.
Data Partitioning
Keeping a subset of the data on each machine
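One common way to decide which machine holds which record is to hash the key and take it modulo the number of machines. The sketch below assumes that scheme; the notes above do not say which scheme the book uses, and `shard_for` and `ShardedStore` are hypothetical names, with plain dictionaries standing in for separate machines.

```python
import hashlib
from typing import Any, Dict, List, Optional


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (not Python's hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Each dict stands in for one machine holding a subset of the data."""

    def __init__(self, num_shards: int) -> None:
        if num_shards < 1:
            raise ValueError("num_shards must be at least 1")
        self._shards: List[Dict[str, Any]] = [{} for _ in range(num_shards)]

    def _shard(self, key: str) -> Dict[str, Any]:
        return self._shards[shard_for(key, len(self._shards))]

    def put(self, key: str, value: Any) -> None:
        self._shard(key)[key] = value

    def get(self, key: str) -> Optional[Any]:
        return self._shard(key).get(key)

    def shard_sizes(self) -> List[int]:
        return [len(shard) for shard in self._shards]


store = ShardedStore(num_shards=4)
for user_id in range(100):
    store.put(f"user:{user_id}", {"id": user_id})
print(store.shard_sizes())
```

A known drawback of plain modulo hashing is that changing the number of shards remaps most keys, so real systems often use a different mapping to make adding machines cheaper.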