Implementation of EIP-1153: Transient Storage using Disk Persistence and Lifecycle Management #1588

Merged: 52 commits into master from feat/transient_storage on Jan 14, 2025

Conversation

snissn
Contributor

@snissn snissn commented Nov 20, 2024

Description

This WIP PR introduces transient storage support in the Filecoin EVM, conceptually aligning with Ethereum's EIP-1153. Transient storage functionality is implemented with a tombstone mechanism for lifecycle tracking keyed by transaction identifiers (origin and nonce). While the core implementation is in place, work remains on validating the transient data's lifecycle and completing associated tests.


Current Progress

State Modifications:

  1. Persistent Transient Data:

    • A Cid variable has been added to represent transient storage.
    pub transient_state: Cid
  2. Lifecycle Tracking:

    • Introduced TransientDataLifespan to track lifecycle with origin ActorID and nonce.
    pub struct TransientDataLifespan {
        pub origin: ActorID,
        pub nonce: u64,
    }

New Operations:

  1. TLOAD: Retrieves transient data while ensuring lifecycle validity.
  2. TSTORE: Updates transient data and lifecycle.
  3. Lifecycle validation (TODO): Compare transaction metadata with stored tombstone.
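
The lifecycle validation step can be sketched as a plain equality check between the stored tombstone and the current transaction. This is a minimal illustration rather than the PR's code: `ActorID` is assumed to be a `u64` alias, and `is_live` is a hypothetical helper name.

```rust
/// Assumption: `ActorID` is a `u64` alias here, for illustration only.
pub type ActorID = u64;

/// Tombstone identifying the transaction that wrote the transient data.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TransientDataLifespan {
    pub origin: ActorID,
    pub nonce: u64,
}

/// Hypothetical helper: transient data is live only if it was written by
/// the transaction that is currently executing.
pub fn is_live(stored: &TransientDataLifespan, current: &TransientDataLifespan) -> bool {
    stored == current
}
```

Under this sketch, a change in either the origin or the nonce marks the stored data as expired.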

Remaining TODOs

  1. Liveliness Check for Transient Data:

    • Validate that transient data is only accessible within the valid transaction context by comparing the tombstone (stored origin and nonce) with the current transaction.

    Code Location:

    • get_transient_storage and set_transient_storage in actors/evm/src/interpreter/system.rs.
    // TODO check tombstone for liveliness of data
  2. Test Coverage:

    • Write tests to verify:
      • Transient data lifecycle (e.g., expiration between transactions).
      • Liveliness checks during nested and delegate calls.
      • Behavior when accessing invalid or expired transient data.
    • Existing transient storage operations (TLOAD, TSTORE) are partially tested.

    Code Location:

    • actors/evm/src/interpreter/instructions/storage.rs (Test module)
    // TODO test transient storage lifecycle
  3. Edge Case Handling:

    • Ensure transient data is cleared appropriately in all scenarios, including contract self-destruct or re-creation.

Testing

Implemented Tests:

  • test_tload: Validates TLOAD operation when data exists.
  • test_tload_oob: Ensures out-of-bounds TLOAD returns zero.
  • test_tstore: Confirms TSTORE updates transient data correctly.

Pending Tests:

  • Liveliness and lifecycle tests (TODO in the test module).
  • System-wide validation tests for lifecycle management.

Tradeoffs and Considerations

  1. Increased Storage Overhead:

    • Adds transient_state and transient_data_lifespan to State.
  2. Gas Inefficiency:

    • Current approach persists transient data, diverging from EIP-1153's ephemeral storage design.

Checklist

  • Core implementation of TLOAD and TSTORE.
  • Integration with State and System.
  • Basic tests for TLOAD and TSTORE.
  • Lifecycle validation for transient data (TODO).
  • Comprehensive test coverage (TODO).

Next Steps


References


Review Feedback

Incorporates feedback from @Stebalien:

  • Minimize unnecessary writes when clearing transient data.
  • Keep State clean unless other updates occur.

…around comparing the liveliness of the transient data and validating that functionality in a test
@snissn snissn requested a review from Stebalien November 20, 2024 02:53
@snissn snissn self-assigned this Nov 20, 2024
…bine transient_slots with transient_data_lifespan anyway.
…DO reinitialize does not properly clear state KAMT due to clone/reference issues that are being debugged
// Reinitialize the transient_slots with a fresh KAMT
//let transient_store = self.rt.store().clone();
//self.transient_slots = StateKamt::new_with_config(transient_store, KAMT_CONFIG.clone());
// TODO XXX reinitialize does not currently work due to blockstore reference issues
Member

I'd be interested to know what these errors are and how you end up diagnosing the problem, for my own educational purposes because your commented code looks like it should work.

Member

It's because self.rt.store() doesn't implement clone, while &... does, so rust is helpfully creating a reference for you.

Note: the new function explicitly requires that the store be clonable.

Options are:

  1. Lift that requirement up to the impl level.
  2. Use self.transient_slots.into_store() to "take" the existing store.
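
The `.clone()` behaviour described above can be reproduced in isolation. In this sketch `Store` is a hypothetical stand-in for a store type that does not implement `Clone`; it is not the runtime's actual blockstore type.

```rust
/// Hypothetical stand-in for a store type that does NOT implement `Clone`.
pub struct Store {
    pub id: u32,
}

/// Calling `.clone()` on a `&Store` cannot clone the `Store` itself, so
/// method resolution falls back to `Clone for &Store` and copies the
/// reference. The result has type `&Store`, not `Store`.
#[allow(noop_method_call)]
pub fn clone_of_ref(store: &Store) -> &Store {
    store.clone()
}
```

The returned reference points at the same `Store` that was passed in, so no new store is created.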

Contributor Author

@Stebalien I've tried a few different variations of the "taking" option that you are suggesting. I didn't quite get anything to compile however.

Could you explain the two options in more depth? For lifting the requirement to the impl level, I'm not sure what you mean: which requirement, and which impl? For the second option, could you help me with taking ownership of the datastore in order to satisfy the requirements of new_with_config? I'm still learning the ins and outs of this in Rust and would be very grateful. I would not mind a code commit either if it's a simple lift for you. Thanks in advance!

Member

Ah, so, this is going to be a bit of a pain. The issue is that you can't leave transient_slots uninitialized, so you need something to put there while you construct the new Kamt. E.g., you can have Option<StateKamt<...>> and put None there while you replace it.

But... doing that will also help facilitate lazy loading (you can leave it as None until the user first tries to load a transient slot). So it's not the end of the world.
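
A minimal sketch of the `Option` placeholder idea, assuming a toy `Kamt<S>` that owns its store. `Kamt`, `System`, and their methods are illustrative stand-ins and not the real `StateKamt` API; only the `into_store` name is taken from the discussion above.

```rust
use std::collections::HashMap;

/// Toy stand-in for a KAMT that owns its store (not the real `StateKamt`).
pub struct Kamt<S> {
    store: S,
    slots: HashMap<u64, u64>,
}

impl<S> Kamt<S> {
    pub fn new(store: S) -> Self {
        Kamt { store, slots: HashMap::new() }
    }

    /// Consume the map and hand back the store it owned.
    pub fn into_store(self) -> S {
        self.store
    }
}

pub struct System<S> {
    /// `None` is the placeholder while the map is being rebuilt.
    transient_slots: Option<Kamt<S>>,
}

impl<S> System<S> {
    pub fn new(store: S) -> Self {
        System { transient_slots: Some(Kamt::new(store)) }
    }

    /// Replace the map with a fresh one, reusing the store without cloning it.
    pub fn reset(&mut self) {
        // `take` leaves `None` behind, so the field is never uninitialized.
        if let Some(old) = self.transient_slots.take() {
            self.transient_slots = Some(Kamt::new(old.into_store()));
        }
    }

    pub fn set(&mut self, key: u64, value: u64) {
        if let Some(kamt) = self.transient_slots.as_mut() {
            kamt.slots.insert(key, value);
        }
    }

    pub fn get(&self, key: u64) -> Option<u64> {
        self.transient_slots.as_ref()?.slots.get(&key).copied()
    }
}
```

Because `reset` moves the store out of the old map, this sketch places no `Clone` bound on `S`.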

Member

If you're stuck, I'd address the rest of the feedback first. Then I'll see what I can do to fix this part.

Contributor Author

Thanks! I'm now returning from the Thanksgiving break and will look into everything this week.

Contributor Author

I have learned many things since starting this :) I see that my previous attempt went very much against the RAII ideas that Rust enforces; I didn't realize that was the issue. We don't need to destroy and create a new KAMT object; we simply need to clear the one we have.

As a placeholder, I have implemented clear as a loop that deletes each key, until an efficient clear() can be used. I opened PR filecoin-project/ref-fvm#2092 with an efficient implementation of clear in KAMT. I hope it can be approved; from there it will be easy for me to upgrade kamt in builtin-actors to the version that supports clear.
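
The placeholder approach could look roughly like the following. A `BTreeMap` stands in for the KAMT here, and `clear_by_deleting` is a hypothetical name; the real KAMT iteration and delete calls are not shown.

```rust
use std::collections::BTreeMap;

/// Placeholder `clear`: collect the keys first, then delete them one by one.
/// The keys are collected up front because the map cannot be mutated while
/// it is being iterated.
pub fn clear_by_deleting(map: &mut BTreeMap<u64, u64>) {
    let keys: Vec<u64> = map.keys().copied().collect();
    for key in keys {
        map.remove(&key);
    }
}
```

This costs one delete per key, which is the inefficiency a dedicated clear() would avoid.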

@snissn
Contributor Author

snissn commented Dec 4, 2024

Remaining issues:

  • Upgrade KAMT to support a clear() method that supports efficient resetting of KAMT object
  • Comprehensive test coverage of reentry and nested contracts. Confirm whether ref-fvm MockRuntime tests can support delegate calls, or whether that test can only be done at a higher level of abstraction such as Lotus
  • Ensure lazy load on first use

@BigLep
Member

BigLep commented Dec 4, 2024

2024-12-04 conversation between FilB and FilOz:

  • For testing, should be able to do recursive calls.
  • Look at the fevm-test-kit for some inspiration.
  • Test this in builtin-actors (better testing framework) (look at integration-tests .. evm-tests)
    • @snissn found an example test to hook on to.
  • Test this in lotus as well.
  • migrations
    • a migration does need to get written but it is straightforward w
  • discussion on clear and whether it's needed here

@rvagg
Member

rvagg commented Dec 4, 2024

I did a quick look for a "simple" state migration that might be similar to this; I haven't found a great example yet but part of the v9 migration did introduce a couple of fields for FIP-0029 to the miner actor, BeneficiaryTerm and PendingBeneficiaryTerm onto a miner's Info, which were initialised with empty values. Unfortunately it's tangled up with a bunch of other complicated migration things, but I think the idea is present there:

@snissn
Contributor Author

snissn commented Dec 5, 2024

following up from a sync with @rvagg @BigLep and @Stebalien with my own copy of notes:

  1. I will look into integration_tests/src/tests/evm_test.rs for remaining testing needs

  2. clear() isn't needed and isn't the best way to implement this

Instead the determination of transient data validity should happen when the actor state is loaded

State - should contain the (lifespan and kamt root) or be null

System - only needs a valid KAMT root or null

  • tload checks for null and returns 0 early if null. Otherwise it reads from the KAMT without needing to check any lifecycle info
  • tstore checks null, and if null initializes a new empty KAMT. from there it saves and sets the saved_state_root to null
  3. a migration will be needed but it's a simple one
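
The design in these notes can be sketched as follows. This is a simplified model under stated assumptions, not the merged code: a `HashMap` with `u64` keys and values stands in for the KAMT root and for EVM words, `ActorID` is assumed to be a `u64` alias, and the `load` and `save` method names are hypothetical.

```rust
use std::collections::HashMap;

pub type ActorID = u64; // assumption, for illustration only

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TransientDataLifespan {
    pub origin: ActorID,
    pub nonce: u64,
}

/// Persisted pairing of the slots with the transaction that wrote them.
#[derive(Debug, Clone)]
pub struct TransientData {
    pub lifespan: TransientDataLifespan,
    pub slots: HashMap<u64, u64>, // stand-in for a KAMT root
}

pub struct State {
    pub transient_data: Option<TransientData>,
}

pub struct System {
    current: TransientDataLifespan,
    transient_slots: Option<HashMap<u64, u64>>,
}

impl System {
    /// Validity is decided once, at load time: stale data becomes `None`.
    pub fn load(state: &State, current: TransientDataLifespan) -> Self {
        let transient_slots = state
            .transient_data
            .as_ref()
            .filter(|data| data.lifespan == current)
            .map(|data| data.slots.clone());
        System { current, transient_slots }
    }

    /// Returns 0 early when there is no live transient data.
    pub fn tload(&self, key: u64) -> u64 {
        match &self.transient_slots {
            None => 0,
            Some(slots) => slots.get(&key).copied().unwrap_or(0),
        }
    }

    /// Lazily creates an empty map on first write.
    pub fn tstore(&mut self, key: u64, value: u64) {
        self.transient_slots
            .get_or_insert_with(HashMap::new)
            .insert(key, value);
    }

    /// Persist the slots together with the current transaction's lifespan.
    pub fn save(&self) -> State {
        State {
            transient_data: self.transient_slots.as_ref().map(|slots| TransientData {
                lifespan: self.current,
                slots: slots.clone(),
            }),
        }
    }
}
```

In this model neither `tload` nor `tstore` inspects the lifespan, since the check has already happened in `load`.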

Member

@Stebalien Stebalien left a comment

LGTM from a code perspective!

Co-authored-by: Steven Allen <steven@stebalien.com>
@rvagg
Member

rvagg commented Dec 13, 2024

https://github.com/filecoin-project/go-state-types/compare/mikers/evm-transient-data-nv17?expand=1#diff-61808733785f19d73d06ebbd0b984828297ac80b9443c2a9071d88f1fd521b89R31-R32 Nonce and TransientData will need to be flipped in order in your go-state-types representation of this state (order matters because it's encoded as an array). But otherwise that migration looks great.

actors/evm/tests/basic.rs Outdated Show resolved Hide resolved
require(retrievedValue == value, "TLOAD did not retrieve stored value within transaction");
}

function testLifecycleValidationSubsequentTransaction() public {
Member

wouldn't this one be // Test 2.2?

Member

@rvagg rvagg left a comment

well done, particularly on the epic test to actual-changes ratio; my suggestions are simply suggestions, up to you whether you bother

Co-authored-by: Rod Vagg <rod@vagg.org>
@snissn
Contributor Author

snissn commented Dec 13, 2024

> LGTM from a code perspective!

That's great I think

> https://github.com/filecoin-project/go-state-types/compare/mikers/evm-transient-data-nv17?expand=1#diff-61808733785f19d73d06ebbd0b984828297ac80b9443c2a9071d88f1fd521b89R31-R32 Nonce and TransientData will need to be flipped in order in your go-state-types representation of this state (order matters because it's encoded as an array). But otherwise that migration looks great.

Thank you!! That could have been so frustrating to catch later on!

snissn and others added 3 commits December 12, 2024 22:05
@snissn snissn changed the title [WIP] Implementation of EIP-1153: Transient Storage using Disk Persistence and Lifecycle Management Implementation of EIP-1153: Transient Storage using Disk Persistence and Lifecycle Management Dec 14, 2024
@snissn
Contributor Author

snissn commented Dec 14, 2024

@rvagg thanks for the feedback on the Solidity tests! I ran forge fmt, updated the comments, and also removed the no-longer-used Solidity test files that I had folded back into the TransientStorageTest file at some point during development!

@snissn
Contributor Author

snissn commented Jan 10, 2025

@Stebalien I think this is OK to merge with FIP0097 moving from last call to accepted! I see a "Merge when ready" button here on GitHub. Before I click it I want to coordinate with you. I confirm that I want this merged into the repo to begin testing on the butterfly network with @rjan90. Can you confirm that it looks good to you, and either merge or confirm that I should click the "Merge when ready" button (whichever is easier)? Thanks!

@Stebalien Stebalien added this pull request to the merge queue Jan 14, 2025
Merged via the queue into master with commit fdbdfc6 Jan 14, 2025
15 checks passed
@Stebalien Stebalien deleted the feat/transient_storage branch January 14, 2025 18:04
@snissn
Contributor Author

snissn commented Jan 15, 2025

@Stebalien thanks!!
