Investigation: failing faster when validating txs with missing inputs #1358
In various common scenarios, the mempool will (re)apply a tx to a ledger state that does not contain all of the necessary inputs (usually because the tx has already been applied before). Therefore, it is desirable for tx (re)application to fail fast in these cases.
The purpose of this PR is just to summarize the effort to look into improvements here. Ideally, no modification to Consensus would be necessary in the end; Ledger should be able to do everything on their side.
Specifically, this PR tries two things:

- Try to use `reapplyTx` instead of `applyTx` when the tx is already in the mempool. The idea here is to avoid crypto work that will be unnecessary, as tx validation will eventually fail due to missing inputs anyway. This tx is a concrete example of a tx where this saves time.
- Try to restrict the UTxO set to the potentially needed inputs of the tx before calling the ledger. That's exactly what UTxO HD does, so this PR branch is targeting UTxO-HD targeting `main` #1267.

To test/benchmark this, this PR modifies the `--repro-mempool-and-forge` db-analyser pass. When processing a block, it adds all of its transactions to the mempool, and then individually tries to add each tx again, measuring how long it takes until the tx is rejected by the mempool. The output data is stored in `readd-txs.csv`.

I ran `db-analyser` like this (takes ~30min) and used the mean of two runs.
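The two ideas above can be illustrated with a toy model. This is only a sketch: the types (`Tx`, `UTxO`, `TxError`), the helper `restrictTo`, and the toy `applyTx`/`reapplyTx` are hypothetical stand-ins and not the actual Consensus or Ledger API. It also assumes, for illustration, that full validation does the expensive witness check before the cheap input check, which is what makes a tx "slow to fail".

```haskell
module Main (main) where

import           Data.Set (Set)
import qualified Data.Set as Set

-- Toy model only: none of these definitions are the real
-- ouroboros-consensus or cardano-ledger types/functions.
type TxIn = Int

data Tx = Tx
  { txInputs  :: Set TxIn
  , txOutputs :: Set TxIn
  , txSigOk   :: Bool  -- stands in for the result of expensive crypto checks
  }

newtype UTxO = UTxO (Set TxIn)
  deriving (Eq, Show)

data TxError = MissingInputs (Set TxIn) | BadSignature
  deriving (Eq, Show)

-- Cheap check: are all inputs of the tx present in the UTxO set?
checkInputs :: UTxO -> Tx -> Either TxError ()
checkInputs (UTxO utxo) tx
  | Set.null missing = Right ()
  | otherwise        = Left (MissingInputs missing)
  where
    missing = txInputs tx `Set.difference` utxo

-- Placeholder for the expensive part of validation.
checkWitnesses :: Tx -> Either TxError ()
checkWitnesses tx = if txSigOk tx then Right () else Left BadSignature

-- Remove the spent inputs and add the new outputs.
spend :: UTxO -> Tx -> UTxO
spend (UTxO utxo) tx =
  UTxO ((utxo `Set.difference` txInputs tx) `Set.union` txOutputs tx)

-- Full validation (assumed order: crypto first, inputs second).
applyTx :: UTxO -> Tx -> Either TxError UTxO
applyTx utxo tx = do
  checkWitnesses tx
  checkInputs utxo tx
  pure (spend utxo tx)

-- Revalidation of a previously validated tx: skips the crypto work.
reapplyTx :: UTxO -> Tx -> Either TxError UTxO
reapplyTx utxo tx = do
  checkInputs utxo tx
  pure (spend utxo tx)

-- Restrict the UTxO set to the entries a tx could possibly need.
restrictTo :: Set TxIn -> UTxO -> UTxO
restrictTo ins (UTxO utxo) = UTxO (utxo `Set.intersection` ins)

main :: IO ()
main = do
  let utxo0 = UTxO (Set.fromList [1, 2, 3])
      tx    = Tx (Set.fromList [1, 2]) (Set.fromList [10]) True
  case applyTx utxo0 tx of
    Left err    -> putStrLn ("unexpected rejection: " ++ show err)
    Right utxo1 -> do
      -- Re-adding the same tx: its inputs are already spent.
      print (applyTx utxo1 tx)    -- fails only after the crypto check
      print (reapplyTx utxo1 tx)  -- fails without doing the crypto check
      print (reapplyTx (restrictTo (txInputs tx) utxo1) tx)
```

In this toy model all three re-add attempts are rejected with a `MissingInputs` error; the difference is only in how much work is done before the rejection.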
Plots are created using a very ad-hoc script.
Here, GHC mutator time is used, and `mut_{baseline,patched}` refer to versions against current `main` (i.e. without UTxO HD), and `mut_utxohd_{baseline,patched}` to versions in this branch. `_baseline` means that the mempool hasn't been patched to use `reapplyTx`, whereas `_patched` means exactly that.
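For readers unfamiliar with GHC mutator time, the following is a minimal sketch of how it can be measured around a single action using `GHC.Stats` from `base`. The helper name `timeMutatorNs` is hypothetical, and whether the modified db-analyser pass uses CPU or elapsed mutator time, or this exact approach, is an assumption. RTS statistics are only available when the program runs with `+RTS -T` (or another stats-enabling flag), so the sketch returns `Nothing` otherwise.

```haskell
module Main (main) where

import Control.Exception (evaluate)
import Data.Int (Int64)
import GHC.Stats (getRTSStats, getRTSStatsEnabled, mutator_cpu_ns)

-- Hypothetical helper: run an action and report the mutator CPU time
-- (in nanoseconds) it consumed, i.e. excluding time spent in the GC.
-- Returns Nothing for the timing if RTS stats are not enabled.
timeMutatorNs :: IO a -> IO (a, Maybe Int64)
timeMutatorNs act = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then do
      r <- act
      pure (r, Nothing)
    else do
      before <- mutator_cpu_ns <$> getRTSStats
      r      <- act
      after  <- mutator_cpu_ns <$> getRTSStats
      pure (r, Just (after - before))

main :: IO ()
main = do
  -- 'evaluate' forces the result so the work happens inside the timed action.
  (total, mNs) <- timeMutatorNs (evaluate (sum [1 .. 1000000 :: Int]))
  print total
  case mNs of
    Nothing -> putStrLn "RTS stats disabled; rerun with +RTS -T"
    Just ns -> putStrLn ("mutator time: " ++ show (ns `div` 1000) ++ " microseconds")
```

Compiling with `-rtsopts` (or `-with-rtsopts=-T`) is needed for the `+RTS -T` flag to be accepted.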
Short summary of the data (in microseconds):
Correlation table:
Thoughts:
- "`mut_baseline` vs `mut_utxohd_baseline`" and "`mut_patched` vs `mut_utxohd_patched`".

A next step would be to profile individual txs to understand why they are currently relatively "slow to fail". I did this already for a few txs (that's e.g. how I saw that the tx mentioned above is slow to fail due to crypto without the mempool patch) like this:

- `db-analyser` pass with `--num-blocks-to-process 1`, and make use of `{start,stop}ProfTimer` (see the code for an example).
- (`+RTS -pj`).