Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Illegal tx detection in context of whole bundle #2263

Merged
merged 8 commits into from
Dec 12, 2023

Conversation

ParthDesai
Copy link
Contributor

@ParthDesai ParthDesai commented Nov 22, 2023

Description

This PR updates domain runtime as follows in first commit:

  1. Modification of check_transaction_validity runtime api: The modification allows it to validate unsigned and SelfContained evm txs along with signed txs.
  2. Addition of check_transaction_validity_and_do_pre_dispatch api: This api validates the transaction as per check_transaction_validity + also preform pre_dispatch on the extrinsic to have changes like nonce updation persist in the buffer of RuntimeApiImpl.
  3. Modification of storage_keys_for_verifying_transaction_validity: The modification changes function signature to make passing AccountId optional.
  4. Addition of check_bundle_extrinsics_validity: This runtime api internally uses check_transaction_validity_and_do_pre_dispatch to validates extrinsic in the bundle and in case any illegal tx is detected it returns Err.

Changes to the domain client is as follows in second commit:

  1. In domain proposer: While gathering tx to build a bundle, we utilize a single runtime instance to maintain context over multiple runtime calls to check_transaction_validity_and_do_pre_dispatch. In case an illegal tx is detected, it is ignored and its changes are rolled back from the context saved in that runtime instance. the call to the check_transaction_validity_and_do_pre_dispatch is shifted last to make sure that tx's changes do not persist if any check failes
  2. In block preprocessor: We are using single runtime instance and check_transaction_validity_and_do_pre_dispatch to maintain context when iterating through the bundle transactions. This check is after inherent extrinsic as inherent extrinsic will always fail the validation check.

Code contributor checklist:

@ParthDesai ParthDesai force-pushed the illegal-tx-detection-first-draft branch 2 times, most recently from fad1133 to 8d24955 Compare November 24, 2023 06:55
@ParthDesai ParthDesai force-pushed the illegal-tx-detection-first-draft branch from 8d24955 to 11a2f8a Compare November 24, 2023 16:11
domains/client/block-preprocessor/src/lib.rs Show resolved Hide resolved
domains/client/block-preprocessor/src/lib.rs Outdated Show resolved Hide resolved
domains/client/block-preprocessor/src/stateless_runtime.rs Outdated Show resolved Hide resolved
domains/primitives/runtime/src/lib.rs Outdated Show resolved Hide resolved
storage_keys: storage_keys_for_verifying_tx_validity_inner(signer, block_number, Self::extrinsic_era(uxt)),
let maybe_signer = check_transaction_validity_inner(&lookup, uxt, block_number, block_hash)?;

let storage_keys = storage_keys_for_verifying_tx_validity_inner(maybe_signer, block_number, Self::extrinsic_era(uxt));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we still need to return these storage keys here, since the generation of fraud proof happens when there is mismatch of ER::inbo_bundles instead of when this check return error.

let transaction_validity_result =
runtime_api_instance.execute_in_transaction(|api| {
let transaction_validity_result = api
.check_transaction_validity_and_do_pre_dispatch(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are doing this check on bundle level, should we do this check on individual extrinsics above?
Would they go through the same approach making individual extrinsic check redundant ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need the individual extrinsic level check IMHO since the bundle level is a superset of it. Ideally, we should perform the check on the block level (like the consensus chain does when producing block) but bundle level is the best we can do.

domains/runtime/evm/src/lib.rs Outdated Show resolved Hide resolved
// We invoke `pre_dispatch` in addition to `validate_transaction`(even though the validation is almost same) as that will add the side effect of SignedExtension in the storage buffer
// which would help to maintain context across multiple transaction validity check against same
// runtime instance.
match xt.signed {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it is desired to perform a more exhaustive validation here to cover different types of tx and checks, the storage_keys only contains a few storages used in the SignedExtra check which may not be exhaustive, i.e. if the pre_dispatch of a tx requires access to another storage which is not included in the storage_keys then we are unable to construct a valid fraud proof. I think this is a broader problem we need to think on cc @vedhavyas

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NingLin-P I have found a way. Idea is to use ProofRecorder to get the keys accessed during runtime execution. These keys are not public at the moment. I have opened a PR here: paritytech/polkadot-sdk#2561

Once that is in, to create storage proof, we simply have to execute desired runtime call and get the accessed keys from ProofRecorder.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ParthDesai You should also patch this in our fork so that we use it right away instead of waiting until we upgrade substrate

let transaction_validity_result =
runtime_api_instance.execute_in_transaction(|api| {
let transaction_validity_result = api
.check_transaction_validity_and_do_pre_dispatch(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need the individual extrinsic level check IMHO since the bundle level is a superset of it. Ideally, we should perform the check on the block level (like the consensus chain does when producing block) but bundle level is the best we can do.

Copy link
Member

@vedhavyas vedhavyas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused with so many new API being introduced/changed but we would just need one API to check the bundle validity here 🤔

// Using `check_transaction_and_do_pre_dispatch` instead of `check_transaction_validity`
// to maintain side-effect in the storage buffer.
let is_legal_tx = if api_version == 2 {
runtime_api.check_transaction_and_do_pre_dispatch(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are doing a full bundle check during the bundle generation, shouldn't we use the same here and mark bunlde invalid if bundle check fails?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do that. If this check fails it is marked as invalid.

crates/sp-domains/src/lib.rs Show resolved Hide resolved
domains/client/block-preprocessor/src/stateless_runtime.rs Outdated Show resolved Hide resolved

/// New in v2.
/// Checks validity of bundle extrinsics. Internally calls `check_transaction_and_do_pre_dispatch`
fn check_bundle_extrinsics_validity(bundle_extrinsics: Vec<<Block as BlockT>::Extrinsic>, block_number: NumberFor<Block>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont see this being used during bundle proposal. I'm confused why there are so many api when all we need is this API to check the bundle validity and return the extrinsic index that fails the precheck and the storage proof until this extrinsic. This also make me think how big can the storage proof be

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also make me think how big can the storage proof be

This depends on the exact storage accessed during the pre_dispatch of the illegal tx, but the size should be guarded by the excessive transaction check since pre_dispatch is also part of the execution of the tx.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vedhavyas Reason why we have used check_transaction_and_do_pre_dispatch and not check_bundle_extrinsics_validity both during bundle proposal and bundle validation is because they both are equivalent in the sense bundle one deals with vector of extrinsic and the other one deal with single extrinsic.

I think, I am starting to see how it is easy to be confused by this. Let me remove this api. We can always add it back if it is required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vedhavyas Regarding storage proof, idea is that you only check for extrinsic signed by same signer. Which covers both nonce and tx payment check. This would mean storage proof is bounded to storage keys used during validation of all extrinsics signed by same signer.

@ParthDesai
Copy link
Contributor Author

I'm confused with so many new API being introduced/changed but we would just need one API to check the bundle validity here 🤔

At the time of writing this PR, idea was to have two equivalent apis one operate on single tx and other one operate on set of tx. Looking back, I realized that check_bundle_extrinsics is not needed. We can always use check_transaction_and_do_pre_dispatch instead. I will be removing that.

Copy link
Member

@NingLin-P NingLin-P left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me in general and I'm okay to move forward and leave the storage keys collect in a separate PR.

Some nits are left, mostly just some cleanup and the test (PTAL at how other tests using the helper function to reduce the size of the test)

fn check_transaction_validity(
uxt: &<Block as BlockT>::Extrinsic,
block_number: NumberFor<Block>,
block_hash: <Block as BlockT>::Hash
) -> Result<(), CheckTxValidityError>;

/// Checks the validity of extrinsic
/// v2 can check validity of unsigned and self contained extrinsics as well.
fn check_transaction_validity(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is essentially a wrapper of TaggedTransactionQueue::validate_transaction with additional storage keys returned in the error, since we will get the storage keys in other ways I think this runtime API can be removed and the CheckTxValidityError can be replaced with TransactionValidityError

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I can remove the new version of the API. But, as stated in another comment, we cannot deprecate a runtime api without adding newer version of that runtime api. So, old version need to stay as it is till we upgrade to new network.

fn storage_keys_for_verifying_transaction_validity(
account_id: opaque::AccountId,
block_number: NumberFor<Block>,
tx_era: Option<Era>,
) -> Result<Vec<Vec<u8>>, VerifyTxValidityError>;

/// Returns the storage keys of states accessed in the API `check_transaction_validity`
/// In v2, signer account id is optional to support getting storage keys for unsigned tx as well.
fn storage_keys_for_verifying_transaction_validity(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this runtime API is used to return a fixed set of storage key should also deprecate, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem is, we cannot deprecate a runtime api without adding newer version of that runtime api in its place. As far as I understand, what this translates to is there is no clear upgrade path on how to remove/deprecate an api without introducing a new one.

Ref: https://paritytech.github.io/polkadot-sdk/master/sp_api/macro.decl_runtime_apis.html

I guess we need to remove it in next version of the network.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are going to have a maintenance branch for gemini-3g soon, so the compatibility with gemini-3g won't be a concern anymore. cc @jfrank-summit

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If breaking changes are introduced, PR should be marked with corresponding label

Copy link
Contributor Author

@ParthDesai ParthDesai Dec 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Label added. Will remove those apis as well as api versioning requirement as well.

Comment on lines 987 to 992
// Initializing block related storage required for validation
System::initialize(
&(System::block_number() + BlockNumber::one()),
&block_hash,
&Default::default(),
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will call multiple times for each extrinsic and the result will be preserved as we use the same runtime API instance for all extrinsic. From the System::initialize implementation it seems fine though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We deliberately invoke System::initialize here as it is required to validate the SignedExtension. This is something I imported from Executive::validate_transaction.

crates/sp-domains-fraud-proof/src/tests.rs Outdated Show resolved Hide resolved
crates/sp-domains-fraud-proof/src/tests.rs Outdated Show resolved Hide resolved
crates/sp-domains-fraud-proof/src/tests.rs Outdated Show resolved Hide resolved
crates/sp-domains-fraud-proof/src/tests.rs Outdated Show resolved Hide resolved
crates/sp-domains-fraud-proof/src/tests.rs Outdated Show resolved Hide resolved
crates/sp-domains-fraud-proof/src/tests.rs Outdated Show resolved Hide resolved
}

#[tokio::test(flavor = "multi_thread")]
async fn check_bundle_validity_runtime_api_should_work() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this check is duplicated with runtime_instance_assumptions_are_correct

Copy link
Contributor Author

@ParthDesai ParthDesai Dec 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests are mostly similar, but the runtime assumption test is there to make sure that the assumptions we made throughout the code regarding runtime api is valid.

While this test is more concerned about testing the api.

Anyways, I have renamed runtime assumption test api to be more clear.

@ParthDesai ParthDesai added breaking-execution This PR introduces breaking changes to the execution breaking-runtime This PR introduces breaking changes to the runtime breaking-consensus This PR introduces breaking changes to the consensus and removed breaking-execution This PR introduces breaking changes to the execution breaking-consensus This PR introduces breaking changes to the consensus labels Dec 9, 2023
@ParthDesai ParthDesai force-pushed the illegal-tx-detection-first-draft branch from 8858d88 to d1ba55f Compare December 10, 2023 05:09
@ParthDesai ParthDesai force-pushed the illegal-tx-detection-first-draft branch from d1ba55f to e022889 Compare December 10, 2023 05:14
@ParthDesai
Copy link
Contributor Author

@NingLin-P I have simplified the runtime api changes now (deleted storage keys related apis and the tx validity api). And accordingly tests are also more clear and simplified with changes suggested by the review incorporated.

Copy link
Member

@NingLin-P NingLin-P left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM 👍

crates/sp-domains-fraud-proof/src/tests.rs Outdated Show resolved Hide resolved
domains/client/block-preprocessor/src/lib.rs Outdated Show resolved Hide resolved
domains/runtime/evm/src/lib.rs Outdated Show resolved Hide resolved
domains/runtime/evm/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Member

@vedhavyas vedhavyas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense but why did we break compatibility here?
I apologize for any confusion but we still want to maintain compatibility for gemini-3g though

@ParthDesai
Copy link
Contributor Author

Make sense but why did we break compatibility here? I apologize for any confusion but we still want to maintain compatibility for gemini-3g though

@vedhavyas As per @NingLin-P and my conversation with @jfrank-summit this fraud proof will be part of the next iteration of the network. So, I just cleaned up legacy stuff related to storage keys and validating txs.

@vedhavyas
Copy link
Member

@vedhavyas As per @NingLin-P and my conversation with @jfrank-summit this fraud proof will be part of the next iteration of the network. So, I just cleaned up legacy stuff related to storage keys and validating txs.

I agree but we cannot upgrade on Gemini-3g at this state. You can leave a TODO to cleanup once we are prepping for the next network. We have bunch of TODOs that will be gone before next network

Copy link
Member

@vedhavyas vedhavyas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM without approval since some comments needs to be resolved

@ParthDesai
Copy link
Contributor Author

@NingLin-P @vedhavyas Comments are resolved now. If there are no further concern, can you guys approve it?

@ParthDesai ParthDesai enabled auto-merge December 12, 2023 17:14
@ParthDesai ParthDesai added this pull request to the merge queue Dec 12, 2023
Merged via the queue into main with commit 67758ca Dec 12, 2023
10 checks passed
@ParthDesai ParthDesai deleted the illegal-tx-detection-first-draft branch December 12, 2023 18:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-execution This PR introduces breaking changes to the execution breaking-runtime This PR introduces breaking changes to the runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants