Web Scalability for Startup Engineers



I am currently reading Web Scalability for Startup Engineers by Artur Ejsmont. I plan to write my thoughts as I progress through the book.

Principles of Good Software Design

Simplicity

Local simplicity

When you look at a class, you should be able to quickly understand how it works without having to understand other parts of the system.

Module level - understand what the module does without having to understand its individual functions

Application level - understand key modules and high-level functionality without needing to know individual class details

System level - understand top-level applications and their responsibilities without knowledge of implementation

Avoid Overengineering, Test Driven Development

In an attempt to design solutions that are easy to modify and extend, a design can sometimes become overengineered. Overengineered solutions increase complexity and development cost for functionality that may never be needed or used. Follow the YAGNI principle: You Aren't Gonna Need It.

Martin Fowler on YAGNI

Test driven development avoids unnecessary code by focusing strictly on implementing system requirements.

Examples to study

Hadoop is a large and complex platform, but it does a great job of hiding complexity from end users. The MapReduce whitepaper is recommended reading.

The Google Maps API is also a great example of a simple, flexible API that solves complex problems.

Loose Coupling

Coupling can be measured by how much two systems know about and depend on each other. An example of no coupling would be two systems that are completely unaware of each other. Each system can be modified independently without needing to modify the other. Systems that are tightly coupled cannot be modified independently. A change in one would require a change in the other.

A loosely coupled system provides several benefits:

Practices to reduce coupling:

Don’t Repeat Yourself (DRY)

Coding to Contract

Draw Diagrams

Single Responsibility

Open-Closed Principle

The open-closed principle states that code should be “open for extension and closed for modification.” It reduces the need to change existing code and makes future changes cheaper. A good example of this principle is the plugin system provided by modern web browsers: plugins can modify the behavior of the browser (e.g., block ads) without requiring any code changes to the browser itself. Uncle Bob has a blog post discussing the open-closed principle.
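
To make the plugin idea concrete, here is a minimal sketch in Python. The `Browser` class, the `register` method, and the `block_ads` plugin are hypothetical names invented for illustration; they are not taken from the book or from any real browser API.

```python
from typing import Callable, List

# Hypothetical plugin type: a function that takes page text and returns page text.
Plugin = Callable[[str], str]


class Browser:
    """Core class: closed for modification, open for extension through plugins."""

    def __init__(self) -> None:
        self._plugins: List[Plugin] = []

    def register(self, plugin: Plugin) -> None:
        self._plugins.append(plugin)

    def render(self, page: str) -> str:
        # Each plugin may transform the page, in registration order.
        for plugin in self._plugins:
            page = plugin(page)
        return page


def block_ads(page: str) -> str:
    """An extension written later, without touching the Browser class."""
    return "\n".join(line for line in page.splitlines() if "[ad]" not in line)


browser = Browser()
browser.register(block_ads)
result = browser.render("headline\n[ad] buy now\nstory")
```

New behavior arrives by registering another function, so the code inside `Browser` never has to change.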

Dependency Injection and Inversion of Control

James Shore:

Dependency Injection is a 25-dollar term for a 5-cent concept.

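Simply put, dependency injection means giving an object its instance variables instead of having the object create them itself. This makes it easy to unit test individual classes, because a test can pass in a substitute dependency.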
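
A small Python sketch of the idea, assuming a hypothetical `SignupService` that needs something able to send mail. Both class names are made up for this example.

```python
class FakeMailer:
    """Test double (hypothetical) that records messages instead of sending them."""

    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))


class SignupService:
    def __init__(self, mailer):
        # The dependency is handed in by the caller rather than constructed here.
        self._mailer = mailer

    def register(self, email):
        self._mailer.send(email, "Welcome!")
        return True


# In a unit test, inject the fake; in production, inject a real mailer object
# that exposes the same send(to, body) method.
mailer = FakeMailer()
service = SignupService(mailer)
service.register("a@example.com")
```

Because `SignupService` never creates its own mailer, the test can inspect `mailer.sent` without any email leaving the machine.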

James Shore provides a great explanation of dependency injection on his blog.

Inversion of control is the broader principle, of which dependency injection is one specific form. An answer on StackOverflow provides a great concrete example of IoC.
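
One way to picture inversion of control is a framework that calls your code instead of your code calling the framework. The sketch below uses a hypothetical `EventLoop` class written for this post; it is not the example from the StackOverflow answer.

```python
from typing import Callable, Dict, List, Tuple


class EventLoop:
    """Hypothetical mini framework: it owns the control flow."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[str], str]] = {}

    def on(self, event: str, handler: Callable[[str], str]) -> None:
        self._handlers[event] = handler

    def run(self, events: List[Tuple[str, str]]) -> List[str]:
        # The framework decides when application code runs.
        # Events with no registered handler are skipped.
        return [
            self._handlers[name](payload)
            for name, payload in events
            if name in self._handlers
        ]


loop = EventLoop()
loop.on("greet", lambda name: f"hello {name}")
outputs = loop.run([("greet", "ada"), ("unknown", "x"), ("greet", "bob")])
```

The application only registers a handler; the loop decides when, and whether, that handler is invoked.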

Designing for Scale

Adding more nodes

Adding indistinguishable components

One of the simplest ways to scale is to distribute load over multiple identical servers. A load balancer distributes traffic across the servers, and every server should respond identically.

Stateless service is a term used to indicate that a service does not depend on the local state, so processing requests does not affect the way the service behaves.

Stateless services are the easiest to scale this way because there is no state to share between servers. By default, stateless services always respond identically. Stateful services, however, require extra work to synchronize state.
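
As a rough illustration, the following Python sketch rotates requests across identical servers in round-robin order. The server names, the `handle_request` function, and the `RoundRobinBalancer` class are assumptions made for this example; real load balancers offer more strategies than simple rotation.

```python
import itertools


def handle_request(request: dict) -> str:
    """Stateless handler: the response depends only on the request contents."""
    return f"hello {request['user']}"


class RoundRobinBalancer:
    """Hypothetical balancer that hands each request to the next server in turn."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def route(self, request):
        server = next(self._cycle)
        # Every server runs the same stateless code, so any of them can answer.
        return server, handle_request(request)


balancer = RoundRobinBalancer(["web1", "web2", "web3"])
results = [balancer.route({"user": "ada"}) for _ in range(4)]
```

The four requests land on `web1`, `web2`, `web3`, and `web1` again, and each produces the same response, which is what makes the servers interchangeable.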

Functional partitioning

Dividing the system into smaller subsystems based on functionality

Functional partitioning is a key principle of service-oriented architecture. The general idea is to split the system into multiple loosely coupled functional groups that each handle specific responsibilities. For example, you can separate the database, cache, and message queue servers into their own subsystems. This provides the ability to scale each subsystem independently. It is also common to partition the application service (monolith) into smaller functional subsystems. This allows multiple engineering teams to develop each subsystem independently and in parallel.
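
A simple way to picture this is a routing table that sends each kind of request to its own pool of servers. The path prefixes and pool names below are invented for illustration.

```python
# Hypothetical mapping from URL prefix to the server pool that owns that function.
SERVICE_POOLS = {
    "/profiles": ["profiles-1", "profiles-2"],
    "/search": ["search-1"],
    "/checkout": ["checkout-1", "checkout-2", "checkout-3"],
}


def pool_for(path: str) -> list:
    """Return the server pool responsible for the given request path."""
    for prefix, pool in SERVICE_POOLS.items():
        if path == prefix or path.startswith(prefix + "/"):
            return pool
    raise LookupError(f"no service owns path {path!r}")


search_pool = pool_for("/search/query")
checkout_pool = pool_for("/checkout")
```

Each pool can grow or shrink on its own: here checkout has three servers while search has one, and neither change affects the other.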

Data Partitioning

Keeping a subset of the data on each machine
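
One common way to decide which machine holds which record is to hash the record's key. The sketch below is a minimal in-memory illustration with hypothetical names (`shard_for`, `PartitionedStore`), where each dictionary stands in for a separate machine. Plain modulo hashing is only one option, and it forces most keys to move when the number of shards changes.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class PartitionedStore:
    """Hypothetical store in which each dict stands in for one machine."""

    def __init__(self, num_shards: int) -> None:
        self._shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self._shards[shard_for(key, len(self._shards))][key] = value

    def get(self, key: str):
        # The same hash sends reads to the shard that received the write.
        return self._shards[shard_for(key, len(self._shards))].get(key)

    def shard_sizes(self):
        return [len(shard) for shard in self._shards]


store = PartitionedStore(4)
for i in range(100):
    store.put(f"user:{i}", {"id": i})
```

Every shard ends up holding only part of the 100 records, yet any record can still be found by hashing its key again.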

Design for Self-Healing