Distributed systems are hard. Even a few interacting agents can lead to tens of thousands or even millions of unique system states. At that scale, it's impossible to test for, or even reason about, every possible edge case. We need better tools not just for building systems, but for understanding them.
To truly understand distributed systems, we need to turn to software modeling, or "formal methods". A few hours of modeling catches complex bugs that would take weeks or months of development to discover. With software modeling, we can build distributed systems more confidently, quickly, and safely.
This talk will focus on one software modeling tool called TLA+, which has seen spectacular success in real-world projects. We will also demonstrate it on an example of a distributed system and closeout by discussing learning resources.