Getting a million users is infinitely harder than scaling a system to handle a million users. Most systems could run comfortably on a Raspberry Pi.
Traefik - Modern HTTP reverse proxy and load balancer that makes deploying microservices easy. (Hello World with Traefik)
Kit - Standard library for microservices written in Go.
Kong - Cloud-Native API Gateway & Service Mesh.
Disque - Distributed message broker.
Mesh - Tool for building distributed applications.
Raft - Raft distributed consensus algorithm implemented in Rust.
hraftd - Reference use of HashiCorp's Raft implementation.
libp2p specification - Technical specifications for the libp2p networking stack.
Qri - Global dataset version control system (GDVCS) built on the distributed web.
Project Oak - Meaningful control of data in distributed systems.
mudb - Collection of modules for building realtime client-server networked applications.
Verdi - Framework for formally verifying distributed systems implementations in Coq.
PingCAP Talent Plan - Series of training courses about writing distributed systems in Go and Rust.
Protocol Labs - Build protocols, systems, and tools to improve the internet.
Dark Crystal - Open source R&D affinity. Exploring the potential of new and existing technologies in crypto-space to encourage horizontal group collaboration.
Protozoa - Web developers, facilitators, crypto-engineers. Experts in Node.js & distributed systems.
Akka - Build highly concurrent, distributed, and resilient message-driven applications on the JVM.
Distributed Components - Provides reusable infrastructure for formally verifying distributed systems using the Coq proof assistant.
LF - Fully Decentralized Fully Replicated Key/Value Store.
Awesome Consensus - Curated selection of artisanal consensus algorithms and hand-crafted distributed lock services.
Rezolus - Tool for collecting detailed systems performance telemetry and exposing burst patterns through high-resolution telemetry.
Cadence - Distributed, scalable, durable, and highly available orchestration engine to execute asynchronous long-running business logic in a scalable and resilient way.
Pilosa - Open source, distributed bitmap index that dramatically accelerates queries across multiple, massive data sets.
Finagle - Fault tolerant, protocol-agnostic RPC system.
Chaos Monkey - Resiliency tool that helps applications tolerate random instance failures.
Faust - Python Stream Processing.
Titanoboa - Community version of a fully distributed, highly scalable, and fault-tolerant workflow orchestration platform for the JVM.
Buoyant - Helps you deploy and run Linkerd, the fully open source, ultralight service mesh.
Grappa - Runtime system for scaling irregular applications on commodity clusters.
Apache Mesos - Cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks.
Gleam - Fast, efficient, and scalable distributed map/reduce system with DAG execution, in memory or on disk, written in pure Go. Runs standalone or distributed.
etcd - Distributed reliable key-value store for the most critical data of a distributed system.
etcdadm - Command-line tool for operating an etcd cluster. It makes it easy to create a new cluster, add a member to, or remove a member from an existing cluster.
SwarmKit - Toolkit for orchestrating distributed systems at any scale. It includes primitives for node discovery, raft-based consensus, task scheduling and more.
Golimit - Distributed and decentralized rate limiter based on Uber's Ringpop.
Awesome Scalability - Patterns of Scalable, Reliable, and Performant Large-Scale Systems.
Amazon Builders' Library - How Amazon builds and operates software.
Jepsen - Distributed Systems Safety Research.
ION - Distributed RTC system written in pure Go and Flutter.
Veneur - Distributed, fault-tolerant pipeline for runtime data.
Temporal - Open source microservices orchestration engine for running mission critical code at any scale. (Code) (Docs) (Why I joined Temporal) (Go SDK)
Paxakos - Rust implementation of a distributed consensus algorithm based on Leslie Lamport's Paxos.
Riemann - Network event stream processing system, in Clojure.
Submitit - Lightweight tool for submitting Python functions for computation within a Slurm cluster.
CAP FAQ
The Reactive Principles - Design Principles for Distributed Applications.
Paxi - Framework that implements WPaxos and other Paxos protocol variants.
Rafting Trip - Learn about network programming, concurrency, distributed systems, and more as you tackle the challenge of implementing the Raft distributed consensus algorithm.
Disel: Distributed Separation Logic - Separation-style logic for compositional verification of distributed systems.
raft-zero - Implementation of the Raft consensus algorithm on top of the act-zero actor framework.
raft-playground - Application to simulate and test a Raft cluster, using raft-zero.
Grafana Tempo - Open source, easy-to-use and high-scale distributed tracing backend. (Web) (Announcement) (HN)
Testing Distributed Systems - Curated list of resources on testing distributed systems. (Code)
Braft - Industrial-grade C++ implementation of the RAFT consensus algorithm.
MirBFT Library - Consensus library implementing the Mir consensus protocol.
Load Shedding Strategies - Demonstration of load shedding and how it can make your services more resilient during outages and help them come back online quicker.
Meld - Decentralized shared state.
Compartmentalized Paxos - Scaling Replicated State Machines with Compartmentalization. (Tweet)
Consensus: Bridging Theory and Practice - PhD dissertation on the Raft consensus algorithm.
Jepsen - Framework for distributed systems verification, with fault injection. Clojure library.
Distributed Systems in Rust - Training course about distributed systems in Rust.
rsraft - Raft implementation in Rust.
Porcupine - Fast linearizability checker for testing the correctness of distributed systems.
Namazu - Programmable Fuzzy Scheduler for Testing Distributed Systems.
Byztime - Byzantine-fault-tolerant protocol for synchronizing time among a group of peers, without reliance on any external time authority.
unitalk - Distributed chat system that can be used for chat rooms or state synchronization.
Maelstrom - Workbench for learning distributed systems by writing your own.