Use yes to reproduce flaky tests
I use yes to saturate CPUs and reproduce flaky test failures. Pair it with a small loop helper to rerun the test until it fails.
SlateDB occasionally has a flaky test failure: one that occurs randomly. Such failures are usually timing dependent. They usually crop up as GitHub Actions failures. GitHub’s runners are notoriously unstable; they are often overloaded, and you share them with noisy neighbors.
I find that I need to saturate my laptop’s CPU to replicate the flaky test failure. To do so, I use the yes command.
NAME
     yes – be repetitively affirmative
SYNOPSIS
     yes [expletive]
DESCRIPTION
     The yes utility outputs expletive, or, by default, “y”, forever.
SEE ALSO
     jot(1), seq(1)
HISTORY
     The yes command appeared in Version 7 AT&T UNIX.
I found the trick on Stack Overflow a while back and have been using it ever since. In fact, I’ve added a saturate function to my .zshrc:
# With an argument, spawn N background `yes` processes writing to /dev/null.
# With no argument, print the number of running `yes` processes.
#
# Usage:
#   saturate      # print count
#   saturate 20   # spawn 20 processes
saturate() {
  if [[ $# -eq 0 ]]; then
    pgrep -x yes | wc -l
    return
  fi
  local n="$1"
  local i
  for ((i = 0; i < n; i++)); do
    yes > /dev/null &
  done
}
The comment above is pretty self-explanatory. I usually use saturate 40. Once I’m done, I do pkill -9 yes.
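The 40 is simply the number that works on my laptop. If you want to size the load to the machine instead, one option is to derive the count from the number of online CPUs. This is a minimal sketch: load_count is a hypothetical helper name, the default factor of 2 is an arbitrary assumption, and it relies on getconf _NPROCESSORS_ONLN, which is available on Linux and macOS.

```shell
# Hypothetical helper: print <factor> times the number of online CPUs.
# The default factor of 2 is an assumption; tune it for your machine.
# Usage: load_count [factor]
load_count() {
  local factor="${1:-2}"
  local cores
  cores="$(getconf _NPROCESSORS_ONLN)"
  echo $((cores * factor))
}
```

With that in place, saturate "$(load_count 4)" would spawn four yes processes per core.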
This command pairs nicely with a loop command that I have in .zshrc:
# Little loop helper function
# Call loop <command> to run in a loop until a non-zero exit is returned.
loop() {
  local count=1
  while true; do
    echo "loop_iter(#$count)"
    "$@" || break
    count=$((count+1))
  done
}
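One thing loop does not do is stop by itself if the test never fails. A capped variant might look like the sketch below. loop_max is a hypothetical name, not something from my .zshrc, and it returns the failing command's exit status so it can be used in scripts.

```shell
# Hypothetical variant of loop: give up after <max> passing iterations.
# Usage: loop_max <max> <command> [args...]
loop_max() {
  local max="$1"
  shift
  local count=1
  local rc=0
  while [ "$count" -le "$max" ]; do
    echo "loop_iter(#$count)"
    "$@" || rc=$?
    if [ "$rc" -ne 0 ]; then
      echo "failed on iteration $count with exit status $rc"
      return "$rc"
    fi
    count=$((count + 1))
  done
  echo "no failure in $max iterations"
}
```

For example, loop_max 200 cargo test --test flaky_test would stop at the first failure or after 200 clean runs, whichever comes first.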
This allows me to run a test in a loop until it fails. For example, I can do loop cargo test --test flaky_test. All together, the commands look like:
saturate 40
loop cargo test --test flaky_test
pkill -9 yes
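If you would rather not kill every yes process on the machine, the three steps can be folded into one function that remembers the PIDs it started. The following is a rough sketch for bash or zsh, not what I actually run. with_load is a hypothetical name, and if you interrupt it with Ctrl-C the cleanup may not run, so pkill is still the fallback.

```shell
# Hypothetical wrapper (bash or zsh): run a command under CPU load, then stop
# only the `yes` processes that this call started.
# Usage: with_load <n> <command> [args...]
with_load() {
  local n="$1"
  shift
  local -a pids
  pids=()
  local i
  for ((i = 0; i < n; i++)); do
    yes > /dev/null &
    pids+=("$!")
  done

  # Run the command and remember its exit status
  local rc=0
  "$@" || rc=$?

  # Stop and reap the load generators started above
  if [[ ${#pids[@]} -gt 0 ]]; then
    kill "${pids[@]}" 2> /dev/null || true
    wait "${pids[@]}" 2> /dev/null || true
  fi
  return "$rc"
}
```

Assuming the loop function is already defined in the same shell, the whole session becomes with_load 40 loop cargo test --test flaky_test.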