Debugging Bitcoin Core Functional Tests

Debugging Bitcoin Core Functional Tests

I was trying to improve the functional tests for bumpfee, a Bitcoin Core wallet feature that lets you increase the fee of a transaction that’s unconfirmed and stuck. Unfortunately I introduced a bug in the test, which I’m still in the process of tracking down. Every disadvantage has its advantage, so I took the opportunity to better understand the functional test framework and its powerful debugging tools.

Thanks to everyone who pointed me in the right direction on IRC (as well as for tips on how to use IRC without going insane).

You can view the changes I made, including the bug, in this pull request. Caveat: this is a PR to my own fork: don’t make pull requests like this to It’s generally a bad idea to change so many things at the same time, if only because it’s too much burden for code reviewers.

To reproduce the error yourself (assuming OSX or Linux, see here for Windows build instructions at the time this post was written):

The test will start running and you’ll see a log message “Running tests”.

Log statements

There’s not much information between the “Running tests” and “Assertion failed” log messages. To see more, switch the log level from the default INFO to DEBUG:

./ --loglevel=DEBUG

As I was investigating, I added several self.log.debug statements to the test file, to get a better sense of what was going on before the error. Now my log looked something like this:

I used self.log.debug(rbftx)to print information about the RBF transaction that the test generated, self.log.debug(rbf_node.getrawmempool()) to show that the transaction made it into the mempool of the node that created it.

self.log.debug(peer_node.getrawmempool()) shows that it didn’t propagate to the other node. At least not immedidately, which makes sense: synchronisation of both mempools is not expected to happen until the next statement sync_mempools((rbf_node, peer_node)). This test helper function waits for both mempols to be identical and fails otherwise.

A good way to learn more about what’s going on is to intentionally break things. For example if I modify the spend_one_input() helper function to pay a 1 BTC miner fee, the test predicatably fails in a different way:

Print RPC commands

The functional tests work by sending commands to the test nodes via RPC. You can log these commands and their responses using ./ --tracerpc. Now you can clearly see sync_mempools in action:

sync_mempools works by sending both nodes the getrawmempool command and comparing the result. After a while it gives up and throws an error.

We still don’t know why the mempools aren’t synchronizing. So let’s dig deeper.

View node log files

So far we’ve been looking at the test logs. But there’s more: each test node has its own directory which includes a log file that you can inspect.

Note the path given after “Initializing test directory”, in this example /var/…/test4ivt2s9y. The logs for the node that created the test transaction are in /var/…/test4ivt2s9y/node1/regtest/debug.log.

You can change the level of detail in these blogs by adding "-debug=all" to self.extra_args in set_test_params(). More fine-grained log options can be found via ../../src/bitcoind --help:

Output debugging information (default: 0, supplying <category> is
optional). If <category> is not supplied or if <category> = 1,
output all debugging information. <category> can be: net, tor,
mempool, http, bench, zmq, db, rpc, estimatefee, addrman,
selectcoins, reindex, cmpctblock, rand, prune, proxy, mempoolrej,
libevent, coindb, qt, leveldb.

You can even combine the log files of all test nodes, in order to get a chronological picture of what happened:

test/functional/ /var/.../test4ivt2s9y > combined.log

I published my combined logs here:

It’s clear that node1 sent the transaction to node0 and there’s no obvious error message. One interesting observation is that node1 broadcast the transaction twice, even though the test was only run once.

View other node artifacts

Both node directories contain a file mempool.dat. Although you need other tools to really inspect their contents, in my case it was trivial to see that this file was empty for node0 and not empty for node1, consistent with the--tracerpc output above.

Use Python debugger

So we still don’t know what went wrong. Perhaps through manually interacting with the test nodes we can find out more. One way to do that is to use --pdbonfailure.

This gives you a Python console, where you ask things like:self.nodes[0].getrawmempool() and notice it’s still empty.

Let’s try a manual broadcast:

Aha, that’s interesting! After this manual broadcast, we find that our transaction finally made it to node0. This still doesn’t solve our mystery, but at least provides another clue.

Inspect running nodes

If things get really desperate, you can leave the nodes running after the test, using--noshutdown. That way you can poke at it using some other tool.

Use serendipity

No, that’s not a tool, it’s a proccess. Just go do other stuff. Eventually you might run into a solution. It turned out the test nodes thought they were still in IBD (Initial Blockchain Download), during which process they don’t synchronize their mempools. To tell the test nodes IBD is over, you need to mine an additional block using peer_node.generate(1). So I broke the tests by removing peer_node.generate(110).

More tests welcome

There’s still plenty of tests to write and improve in Bitcoin Core. Some integration tests, like the one in this article, are written in Python. Those could be a good place to start, until you’re a bit more familiar with the codebase. Please follow recommended practices. There are also C++ integration tests, as well as unit tests.

Leave a Reply

Your email address will not be published. Required fields are marked *