Skip to content

core/state: introduce the TransitionState object (verkle transition part 1) #31634

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

gballet
Copy link
Member

@gballet gballet commented Apr 14, 2025

This is the first part of #31532

It maintains a series of conversion maker which are to be updated by the conversion code (in a follow-up PR, this is a breakdown of a larger PR to make things easier to review). They can be used in two modes:

  • At genesis, by forcing the setting of a conversion marker as "ended". It replaces EnableVerkleAtGenesis.
  • During the conversion, by storing the conversion markers when the block has been processed. This is meant to be written in a function that isn't currently present, hence this TODO.

Part of #31583

@gballet gballet marked this pull request as ready for review April 14, 2025 10:49
@gballet gballet added the verkle label Apr 14, 2025
@gballet gballet force-pushed the rebase-master-add-transition-state branch from e3f7af8 to 86183a8 Compare July 15, 2025 10:27
@rjl493456442
Copy link
Member

rjl493456442 commented Jul 25, 2025

I have added some core changes into this PR, specifically:

  • remove the transition state pre-initialization around the Genesis, which i believe it's not the best solution
  • change the CurrentAccountAddress to CurrentAccountHash, getting rid of the pointer

Let's ship the transition state structure definition in this PR first and then pile up more stuff on top.

@rjl493456442 rjl493456442 force-pushed the rebase-master-add-transition-state branch from 20731b8 to 86183a8 Compare July 28, 2025 07:55
// TransitionState is a structure that holds the progress markers of the
// translation process.
type TransitionState struct {
CurrentAccountAddress *common.Address // addresss of the last translated account
Copy link
Member

@rjl493456442 rjl493456442 Jul 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a pointer here is unnecessarily complex, we can replace it with the account address hash instead, reasons:

  • Verkle migration is traversed in the order of account address hashes
  • A zero hash can represent that nothing has been migrated yet
  • Using a hash allows the entire TransitionState to remain a pure value type, with no pointers involved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have already discussed that multiple times. I could confirm that not using the pointer was breaking the changes owing to some side effect somewhere else that I don't remember just now, I'll have to debug when I get back to Berlin - but this PR has already been sitting for >7 months and it's time to be productive and merge it. Once I am able to perform transitions with master, we can redesign this since we'd have a reference.

Regarding the hash, this means that if you have a storage tree for which the conversion takes more than one block, you need to resolve the address from the hash to open the storage tree. that, in turns, means that you need something like the preimage store, which is very slow. What we use is some side file that has been generated to be in enumeration order, and we would have to backtrack that file for an unknown amount of bytes (well, we could store that offset as well, but then what's the difference with a pointer?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the hash, this means that if you have a storage tree for which the conversion takes more than one block, you need to resolve the address from the hash to open the storage tree. that, in turns, means that you need something like the preimage store, which is very slow.

I don't agree with this. The entire conversion is hash-based, we traverse in trie order, which has nothing to do with the account address. I do agree it's quite annoying that the ordering is determined by the hash.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's time to be productive and merge it

I totally agree. But the process of moving things forward involves integrating codes piece by piece, in the proper way. Unfortunately, some necessary changes are inevitable sometimes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could confirm that not using the pointer was breaking the changes owing to some side effect somewhere else that I don't remember just now

Yeah, maybe I miss something. I can live with the current workaround. But I will always keep my suggestions to simplify the code.

}

// Copy returns a deep copy of the TransitionState object.
func (ts *TransitionState) Copy() *TransitionState {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function can be removed if the AccountAddress is replaced with AccountAddressHash

@@ -184,6 +236,7 @@ func (db *CachingDB) Reader(stateRoot common.Hash) (Reader, error) {
readers = append(readers, newFlatReader(snap))
}
}
ts := db.LoadTransitionState(stateRoot)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very expensive before the Verkle activation;

The TransitionState won't be existent before the verkle migration, each lookup ends up with an non-existent database lookup, which traverses the entire LSM Tree layers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a fair point. and I see that your (now deleted) changes serve as a guard. But it lacks the ability to distinguish between the transition and post-transition phases. That might still save some hits though.

Copy link
Member

@rjl493456442 rjl493456442 Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I suggested previously, we should expose IsVerkle information to the state reader, so that we can easily distinguish these scenarios respectively:

  • Before the verkle activation: IsVerkle == false

  • The first verkle block: IsVerkle == true && rawdb.ReadTransitionState(stateRoot) == null

  • The blocks within the conversion:

    • the conversion hasn't started: IsVerkle == true && rawdb.ReadTransitionState(stateRoot) != null && state.Start == false && state.End == false
    • the conversion has started: IsVerkle == true && rawdb.ReadTransitionState(stateRoot) != null && state.Start == true && state.End == false
  • The blocks after the conversion: IsVerkle == true && rawdb.ReadTransitionState(stateRoot) != null && state.End == true

#32084 this change simplifies the thing, exposing the IsVerkle without passing the flag around the entire codebase.


The last unsolved puzzle for me is: how to get rid of TransitionState checking
after the conversion. We can probably store a singleton entry, tracking the block
number and block hash for the last block of conversion.

If deep reorg occurs and rewinds the chain back to conversion, this flag should
be removed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benefit is: we can skip the entire TransitionState checking before the verkle, it's the main blocker from merging this change.

@@ -223,7 +276,11 @@ func (db *CachingDB) ReadersWithCacheStats(stateRoot common.Hash) (ReaderWithSta

// OpenTrie opens the main account trie at a specific root hash.
func (db *CachingDB) OpenTrie(root common.Hash) (Trie, error) {
if db.triedb.IsVerkle() {
ts := db.LoadTransitionState(root)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very expensive before the Verkle activation;

The TransitionState won't be existent before the verkle migration, each lookup ends up with an non-existent database lookup, which traverses the entire LSM Tree layers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the one I don't think we can replace with db.triedb.IsVerkle() because we won't be able to tell the difference between a transition and a post-transition. This being said, it might be enough to guard a call to db.LoadTransitionState...

The only part I'm not clear on, is how you would check that the db is loaded? don't you have to do some kind of empty marker reading either way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the one I don't think we can replace with db.triedb.IsVerkle() because we won't be able to tell the difference between a transition and a post-transition.

The idea for checking db.triedb.IsVerkle first is to skip the entire TransitionState checking before the verkle.
This state is essential to distinguish between a transition and a post-transition.

@@ -235,10 +292,11 @@ func (db *CachingDB) OpenTrie(root common.Hash) (Trie, error) {

// OpenStorageTrie opens the storage trie of an account.
func (db *CachingDB) OpenStorageTrie(stateRoot common.Hash, address common.Address, root common.Hash, self Trie) (Trie, error) {
ts := db.LoadTransitionState(stateRoot)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very expensive before the Verkle activation;

The TransitionState won't be existent before the verkle migration, each lookup ends up with an non-existent database lookup, which traverses the entire LSM Tree layers.

@rjl493456442 rjl493456442 added this to the 1.16.3 milestone Aug 5, 2025
@rjl493456442 rjl493456442 merged commit cf50026 into ethereum:master Aug 5, 2025
2 of 3 checks passed
rjl493456442 added a commit that referenced this pull request Aug 15, 2025
…32366)

This add some of the changes that were missing from #31634. It
introduces the `TransitionTrie`, which is a façade pattern between the
current MPT trie and the overlay tree.

---------

Signed-off-by: Guillaume Ballet <[email protected]>
Co-authored-by: rjl493456442 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants