The Kubernetes project does a lot of testing: on the order of 10,000 jobs per day, covering everything from build and unit tests, to end-to-end testing on real clusters deployed from source, all the way up to ~5000-node scalability and performance tests.
The system handling all of this leverages Kubernetes, naturally, and of course has a number of nautically-named components. This system is Prow, and it is used to manage automatic validation and merging of human-approved pull requests and to verify branch health leading up to each release.
With Prow, each job is a single-container pod, created in a dedicated build and test cluster by “plank”, a microservice running in the services cluster. Each Prow component (roughly outlined above, along with Testgrid) is a small Go service structured around managing these one-off single-pod “ProwJobs”.
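To make the "one job, one single-container pod" idea concrete, here is a minimal sketch in Go. It is not Prow's actual code: `JobSpec`, `PodForJob`, the `example.com/job` label, and the image name are hypothetical, and the `Pod` types mirror only a handful of fields from the Kubernetes Pod API rather than importing the real client libraries. Setting the restart policy to `Never` is an assumption chosen to illustrate a one-off job.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// JobSpec is a hypothetical, simplified description of a test job.
// It is NOT Prow's real ProwJob type.
type JobSpec struct {
	Name    string
	Image   string
	Command []string
}

// The types below mirror only a small subset of the Kubernetes Pod API.
type Container struct {
	Name    string   `json:"name"`
	Image   string   `json:"image"`
	Command []string `json:"command,omitempty"`
}

type PodSpec struct {
	RestartPolicy string      `json:"restartPolicy"`
	Containers    []Container `json:"containers"`
}

type Metadata struct {
	Name   string            `json:"name"`
	Labels map[string]string `json:"labels,omitempty"`
}

type Pod struct {
	APIVersion string   `json:"apiVersion"`
	Kind       string   `json:"kind"`
	Metadata   Metadata `json:"metadata"`
	Spec       PodSpec  `json:"spec"`
}

// PodForJob turns one job into one single-container, run-once pod.
func PodForJob(job JobSpec) Pod {
	return Pod{
		APIVersion: "v1",
		Kind:       "Pod",
		Metadata: Metadata{
			Name: job.Name,
			// Hypothetical label key, used here only for illustration.
			Labels: map[string]string{"example.com/job": job.Name},
		},
		Spec: PodSpec{
			// Assumption: a one-off job should not be restarted in place.
			RestartPolicy: "Never",
			Containers: []Container{
				{Name: "test", Image: job.Image, Command: job.Command},
			},
		},
	}
}

func main() {
	pod := PodForJob(JobSpec{
		Name:    "unit-tests",
		Image:   "example.com/test-image:latest", // placeholder image
		Command: []string{"make", "test"},
	})
	out, err := json.MarshalIndent(pod, "", "  ")
	if err != nil {
		panic(err)
	}
	// Prints the pod manifest as JSON.
	fmt.Println(string(out))
}
```

In a real deployment the resulting object would be submitted to the build cluster's API server; the sketch stops at building and printing the manifest so that it runs without a cluster.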
Using Kubernetes frees us from worrying about most of the resource management and scheduling / bin-packing of these jobs once they have been created, and it has generally been a pleasant experience.
Prow, via “hook”, also provides a number of GitHub automation plugins, which supply things like issue and pull request slash commands for applying and removing labels, opening and closing issues, and so on.
This has been especially helpful because GitHub’s permissions model is not very granular, and we’d like contributors to be able to label issues without write permissions.
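As a rough illustration of how a slash-command plugin might interpret a comment, the following Go sketch scans a comment body for label commands. The command names `/label` and `/remove-label`, along with `LabelChange` and `ParseLabelCommands`, are assumptions made for this example; they are not taken from hook's plugin code, and a real plugin would go on to call the GitHub API on the commenter's behalf.

```go
package main

import (
	"fmt"
	"strings"
)

// LabelChange records a requested label addition or removal.
type LabelChange struct {
	Label  string
	Remove bool
}

// ParseLabelCommands scans a comment body line by line and collects the
// hypothetical "/label <name>" and "/remove-label <name>" commands.
// A command is only recognized when it is the first word on its line
// and is followed by exactly one argument.
func ParseLabelCommands(body string) []LabelChange {
	var changes []LabelChange
	for _, line := range strings.Split(body, "\n") {
		fields := strings.Fields(line)
		if len(fields) != 2 {
			continue
		}
		switch fields[0] {
		case "/label":
			changes = append(changes, LabelChange{Label: fields[1]})
		case "/remove-label":
			changes = append(changes, LabelChange{Label: fields[1], Remove: true})
		}
	}
	return changes
}

func main() {
	comment := "Thanks for the report!\n/label bug\n/remove-label needs-triage\n"
	for _, c := range ParseLabelCommands(comment) {
		action := "add"
		if c.Remove {
			action = "remove"
		}
		fmt.Printf("%s label %q\n", action, c.Label)
	}
}
```

Because the bot, not the commenter, performs the resulting label change, a contributor without write permissions can still triage issues this way.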
If any of this sounds interesting to you, come check out Prow’s source code and join our SIG Testing meetings for more.
There are many other tools that didn’t make the diagram or discussion above; you can find these, and more about everything, at github.com/kubernetes/test-infra.
These are all open source, except Testgrid, which is actually a publicly hosted and configured version of an internal tool developed at Google. We hope to open source a more performant rewrite of Testgrid sometime in Spring 2018.