Aura node architecture
The Aura node can be divided into four primary components: Ingester, Synchronizer, Backfiller, and API. These core workers operate continuously to ensure data consistency and availability.
- Ingester and Backfiller: These components are responsible for populating the main database, RocksDB. The ingester handles real-time data ingestion, while the backfiller ensures that any missed data is retroactively inserted into RocksDB.
- Synchronizer: This worker moves data from RocksDB to PostgreSQL, enabling more complex API queries that require relational data handling and advanced querying capabilities.
- API: The API layer is responsible for serving data to external consumers, providing easy access to indexed information.
Additional Workers
In addition to the primary workers, there are six auxiliary workers available for specialized tasks:
- Migrator
- RawBackfiller
- RawBackup
- ColumnCopier
- ColumnRemover
- ForkDetector
This module receives the latest updates from the Solana blockchain, indexes them, and stores them in the primary database—RocksDB.
The core function of this module is to retrieve updates from the data source (either Redis or TCP stream), parse the data, and save the relevant transaction and account information into RocksDB.
It can source the latest transactions and account updates via Redis or a direct TCP connection, depending on the configuration of the message_source parameter. While both options are supported, using Redis has a significant advantage: during Aura node updates, no data will be lost. Redis stores all updates during downtime, eliminating the need to re-parse the entire accounts snapshot to restore the current network state after an update.
The Aura node processes both account updates and transactions. Account updates are essential for handling NFTs created with the MPL Metadata or MPL Core Solana programs. On the other hand, transactions are needed for processing compressed NFTs (cNFTs), as all cNFT data is stored within instruction arguments and events.
The system uses two key components for parsing and storing data into RocksDB:
- AccountsProcessor: Responsible for parsing and saving account-related data.
- BubblegumTxProcessor: Handles the processing of compressed NFT transactions, saving relevant data to RocksDB.

Both of these workers exclusively save data to RocksDB.
Additionally, there is a BatchMintPersister, which processes batch minting operations through a separate queue from accounts and transactions. Its task is to download a JSON file containing the assets to be minted, reconstruct the local Merkle tree, and verify that the provided data in the JSON file is accurate. If everything checks out, it stores the assets from the file into RocksDB.
In addition to the main ingester task, which collects the latest NFT updates, there are several auxiliary workers responsible for identifying and filling gaps in the data. These workers include SignatureFetcher, SequenceConsistentGapFiller, and ForkCleaner.
- SignatureFetcher: This worker fetches all transaction signatures for the Bubblegum program using the Solana RPC. As new transactions are processed (from sources like TCP or Redis), their signatures are stored in RocksDB. SignatureFetcher compares the signatures in RocksDB with those returned by the RPC. If discrepancies are found, it indicates a gap in the data. When a gap is detected, SignatureFetcher retrieves the missing transactions from the RPC and processes them to maintain data consistency.
- SequenceConsistentGapFiller: This worker detects gaps in Merkle tree sequences. Each cNFT update action (e.g., transfer or update) increments the tree sequence, and these sequences are tracked in RocksDB for each tree. If a sequence is missing (e.g., seq n+1 is skipped), the worker signals the need to reprocess blocks in the range where the gap occurred (i.e., from seq n to seq n+2). It pushes the affected block numbers to the force_reingestable_slots column family. Another worker then iterates over this column family, downloading and parsing the required blocks to fill the sequence gap.
- ForkCleaner: This worker periodically checks the LeafSignature column family in RocksDB, which stores signatures for all processed transactions. ForkCleaner looks for signatures that exist in forks, typically indicated by identical signatures with different slots and Merkle tree sequences. Upon detecting a fork, ForkCleaner removes entries from CLItems that correspond to forked sequences and deletes those sequences from the TreeSeqIdx column family. This allows SequenceConsistentGapFiller to identify the missing sequences and trigger the reprocessing of affected transactions.
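The gap detection described for SequenceConsistentGapFiller can be illustrated with a minimal sketch. It assumes the per-tree sequence index is available as an in-memory map from sequence number to slot; the `SequenceGap` type and `find_sequence_gaps` function are hypothetical names used only for illustration and are not taken from the Aura codebase.

```rust
use std::collections::BTreeMap;

/// A gap in one tree's sequence numbers, together with the slots that bound it.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub struct SequenceGap {
    /// First missing sequence number.
    pub missing_from: u64,
    /// Last missing sequence number.
    pub missing_to: u64,
    /// Slot of the last sequence seen before the gap (seq n).
    pub slot_from: u64,
    /// Slot of the first sequence seen after the gap (seq n+2 in the text's example).
    pub slot_to: u64,
}

/// Walks the sequences of a single tree in ascending order and reports every
/// place where consecutive entries are more than one apart.
pub fn find_sequence_gaps(seq_to_slot: &BTreeMap<u64, u64>) -> Vec<SequenceGap> {
    let mut gaps = Vec::new();
    let mut prev: Option<(u64, u64)> = None;
    for (&seq, &slot) in seq_to_slot {
        if let Some((prev_seq, prev_slot)) = prev {
            if seq > prev_seq + 1 {
                gaps.push(SequenceGap {
                    missing_from: prev_seq + 1,
                    missing_to: seq - 1,
                    slot_from: prev_slot,
                    slot_to: slot,
                });
            }
        }
        prev = Some((seq, slot));
    }
    gaps
}
```

In this sketch, the slot range `slot_from..=slot_to` is what a caller would queue for reprocessing, which corresponds to pushing block numbers to force_reingestable_slots.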
The Backfiller is a background job that performs different tasks depending on its mode. Some of them are one-time jobs and some are continuous.
It has four modes of operation, each briefly described below.
This is a one-time job.
The consumer is the DirectBlockParser, which is a struct with a Bubblegum transactions parser.
The produced item is the BackfillSource. The inner object can either be a BigTable client or an RPC client.
It launches the SlotsCollector with the start-from and parse-until parameters to collect slots (u64 numbers) for a pubkey.
The SlotsCollector saves slots to the BubblegumSlots Rocks CF.
Then the TransactionsParser is launched. It uses the BubblegumSlotGetter to get slots to process from the BubblegumSlots CF.
The block producer here is the BackfillSource.
It processes blocks, saves transaction results, and then removes the processed slot numbers from the BubblegumSlots CF and adds them to the IngestableSlots CF.
It doesn’t save any parameters and doesn’t save raw blocks.
This is a one-time job.
The consumer is RocksDB.
The producer is the BackfillSource (either BigTable or RPC).
Slots to start from and parse until will be taken from the config.
The SlotsCollector collects slots and saves them to the BubblegumSlots Rocks CF.
The TransactionsParser is launched to get each block by slot number and persist it to the Rocks RawBlock CF.
Once a block is persisted, its number is dropped from BubblegumSlots and added to the IngestableSlots Rocks CF.
It doesn’t save any parameters.
This is a one-time job.
The consumer is the DirectBlockParser, which is a struct with a Bubblegum transactions parser.
The producer is RocksDB.
At the beginning, the TransactionsParser takes the slot to start from. It can take this value either from the config or it will start the iteration from the beginning of the raw_blocks_cbor Rocks CF.
For the DirectBlockParser, the already_processed_slot function always returns false, so it will parse everything.
The block is extracted from the producer - RocksDB.
The consumer receives the block and processes it. More specifically, it parses transactions, calls get_ingest_transaction_results() to get the TransactionResult, and saves it to RocksDB.
Once it has parsed all the blocks, it saves the maximum slot number to the LastFetchedSlot RocksDB parameter. This allows us to restart the backfiller in PersistAndIngest mode and it will start collecting new slots and blocks we don't have yet in the DB.
Once it finishes its job, it will not do any post backfill jobs.
This is a continuous job.
Three workers run in this mode: a slot collector, a block fetcher and saver, and a block parser.
The consumer is Rocks.
The producer for slots is the BackfillSource (BigTable or RPC).
It takes the parse_until slot from the RocksDB LastFetchedSlot parameter. If there is no value, it takes it from the config.
Slot numbers are saved to the BubblegumSlots Rocks CF.
The consumer is RocksDB.
The producer is the BackfillSource (BigTable or RPC).
From the BubblegumSlots Rocks CF, slots are extracted, and then the block is downloaded with the help of the BackfillSource.
Once the block is downloaded and saved, the slot is dropped from the BubblegumSlots CF and added to the IngestableSlots CF so that the next worker can parse it.
The consumer is the DirectBlockParser, which is a struct with a Bubblegum transactions parser.
The producer is RocksDB.
The IngestableSlotGetter returns slots from the IngestableSlots CF, then blocks are extracted from the Rocks.
Once a block is received, it’s parsed, and the slot is dropped from the IngestableSlots CF.
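The way a slot moves between column families in this mode can be sketched as follows. This is a simplified, single-threaded illustration that uses in-memory collections as stand-ins for the BubblegumSlots, IngestableSlots, and raw block column families; the `Store` type, the function names, and the `fetch_block` callback are assumptions made for the example and do not mirror the real worker code.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// In-memory stand-ins for the column families named in the text.
/// Hypothetical type, for illustration only.
#[derive(Default)]
pub struct Store {
    pub bubblegum_slots: BTreeSet<u64>,
    pub ingestable_slots: BTreeSet<u64>,
    pub raw_blocks: BTreeMap<u64, Vec<u8>>,
    /// Slots whose blocks were parsed; stands in for saved transaction results.
    pub parsed_slots: Vec<u64>,
}

/// Worker 1: record the slots reported by the backfill source.
pub fn collect_slots(store: &mut Store, slots_from_source: &[u64]) {
    store.bubblegum_slots.extend(slots_from_source.iter().copied());
}

/// Worker 2: download and persist the block for the lowest queued slot, then
/// hand the slot over to the parser. Returns the persisted slot, if any.
pub fn persist_next_block<F>(store: &mut Store, mut fetch_block: F) -> Option<u64>
where
    F: FnMut(u64) -> Option<Vec<u8>>,
{
    let slot = *store.bubblegum_slots.iter().next()?;
    // If the download fails, the slot stays queued for a later attempt.
    let block = fetch_block(slot)?;
    store.raw_blocks.insert(slot, block);
    store.bubblegum_slots.remove(&slot);
    store.ingestable_slots.insert(slot);
    Some(slot)
}

/// Worker 3: parse the stored block for the lowest ingestable slot and drop
/// the slot from the queue. Returns the parsed slot, if any.
pub fn parse_next_block(store: &mut Store) -> Option<u64> {
    let slot = *store.ingestable_slots.iter().next()?;
    if !store.raw_blocks.contains_key(&slot) {
        return None;
    }
    // Real parsing of Bubblegum transactions would happen here.
    store.parsed_slots.push(slot);
    store.ingestable_slots.remove(&slot);
    Some(slot)
}
```

Because each stage only removes a slot from its input queue after its own work succeeded, a restart in this sketch resumes from whatever is still queued.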
This worker is responsible for downloading JSON files. During database synchronization, the Synchronizer assigns tasks to download any missing JSONs. The JSON Processor handles these tasks by retrieving them from PostgreSQL and then storing the downloaded JSON files into RocksDB.
Coming soon...
Coming soon...
Since the Aura node uses two databases—RocksDB and PostgreSQL—a tool is needed to ensure data consistency between them. For this, the AssetsUpdateIdx column family in RocksDB stores all asset update indexes. These indexes consist of the sequence number, slot, and account public key. The sequence is an internal counter tracking updates and is unrelated to the Merkle tree sequence. Every update that the ingester processes is saved to this column family.
On the PostgreSQL side, the same index is stored to track the latest synchronized update.
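A minimal sketch of this cursor-based synchronization is shown below. It assumes the update index can be modeled as an ordered set of (sequence, slot, pubkey) keys and that the key stored on the PostgreSQL side is passed in as `last_synced`; `UpdateKey` and `next_batch` are hypothetical names, and the batching limit is an assumption added for the example.

```rust
use std::collections::BTreeSet;
use std::ops::Bound;

pub type Pubkey = [u8; 32];

/// One entry of the update index. Field order matters: the derived ordering
/// compares seq first, then slot, then pubkey.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct UpdateKey {
    pub seq: u64,
    pub slot: u64,
    pub pubkey: Pubkey,
}

/// Returns up to `limit` index entries that come strictly after the last
/// synchronized key, or from the start if nothing was synchronized yet.
pub fn next_batch(
    index: &BTreeSet<UpdateKey>,
    last_synced: Option<UpdateKey>,
    limit: usize,
) -> Vec<UpdateKey> {
    let lower = match last_synced {
        Some(key) => Bound::Excluded(key),
        None => Bound::Unbounded,
    };
    let bounds: (Bound<UpdateKey>, Bound<UpdateKey>) = (lower, Bound::Unbounded);
    index.range(bounds).take(limit).copied().collect()
}
```

After a batch is written to PostgreSQL, the last key of that batch would become the new `last_synced` value, so an interrupted run can continue where it stopped.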
Coming soon...
Legacy utility for migrating JSONs and tasks from one database to another.
This is a separate tool that performs all the functions of the Backfiller from the Ingester, except for direct ingestion. It shares the same codebase as the Ingester.
The tool is primarily used for downloading large amounts of raw blocks or parsing blocks that have already been downloaded. A separate binary was created to allow these processes to run independently. It’s typically necessary to use this tool when setting up a new node.
In the scripts/ directory, you’ll find two bash scripts to execute these processes:
- run-ingest-persisted (for parsing already downloaded blocks)
- run-slots-persisting (for downloading and saving raw blocks)
This tool creates backups of raw blocks and JSONs. It works by iterating through all blocks and JSONs in the source RocksDB and copying them to the target RocksDB. This is particularly useful when you want to create a backup of raw, non-indexed data. The backup can later be used by the Aura node itself or by other indexers that utilize different data structures.
This tool copies column data from one RocksDB to another. The source database is opened in secondary mode. It's primarily a development tool, useful for debugging purposes.
As the name suggests, this tool is used to drop specific columns from RocksDB. Like the Column Copier, it's mainly a development tool, helpful for bug fixing and debugging various cases.
This binary is designed to detect transactions that were part of a fork, particularly identifying cNFTs that were updated in these forked transactions.
The script was necessary because the previous fork cleaner could incorrectly handle data removal when a fork occurred. Specifically, if the same asset is updated in multiple blocks (one of which is forked) and those blocks have different sequences, the cleaner doesn't properly resolve the discrepancy. It may remove one sequence but leave the other, which can lead to problems. If the sequence from the forked block (which may be higher) is dropped, the tool won’t backfill the lower sequence that was accepted by the majority of validators.
It's important to run this binary with the indexer turned off.
Once a fork is detected, the binary removes the corresponding sequences. Afterward, when the indexer is relaunched, the SequenceConsistentGapFiller identifies any gaps in the sequences and fills them appropriately.
The current version of the fork cleaner handles forks efficiently, so this tool doesn't need to be run continuously.
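The detection rule mentioned for forks (the same signature recorded under different slots and sequences) can be sketched as below. This is only an illustration: the `LeafUpdate` type and `forked_sequences` function are hypothetical, and how a slot is judged to be on the canonical chain is left to a caller-supplied closure because the text does not describe it.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One recorded leaf update. Hypothetical type, for illustration only.
pub struct LeafUpdate {
    pub signature: String,
    pub slot: u64,
    pub seq: u64,
}

/// Groups updates by signature and, for signatures seen under more than one
/// (slot, seq) pair, returns the sequences that came from non-canonical slots.
/// `is_canonical_slot` is an assumption: the caller decides which slots count
/// as part of the accepted chain.
pub fn forked_sequences<F>(updates: &[LeafUpdate], is_canonical_slot: F) -> BTreeSet<u64>
where
    F: Fn(u64) -> bool,
{
    let mut by_signature: BTreeMap<&str, BTreeSet<(u64, u64)>> = BTreeMap::new();
    for update in updates {
        by_signature
            .entry(update.signature.as_str())
            .or_default()
            .insert((update.slot, update.seq));
    }

    let mut forked = BTreeSet::new();
    for occurrences in by_signature.values() {
        // A signature seen only once gives no evidence of a fork.
        if occurrences.len() < 2 {
            continue;
        }
        for &(slot, seq) in occurrences {
            if !is_canonical_slot(slot) {
                forked.insert(seq);
            }
        }
    }
    forked
}
```

The returned sequences are the ones a cleaner would remove, leaving SequenceConsistentGapFiller to notice the resulting gaps and refill them.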
As mentioned earlier, the Aura node utilizes two types of storage: RocksDB and PostgreSQL. RocksDB serves as the primary storage for all processed data, while PostgreSQL functions as an index storage solution for complex API queries, such as searchAsset.
Below is a description of the data stored in each database.
Stores static information about assets, such as immutable properties.
Key
- asset pubkey
Fields
- pubkey
- specification_asset_class
- royalty_target_type
- created_at
- edition_address
Holds dynamic details of assets.
Key
- asset pubkey
Fields
- pubkey
- is_compressible
- is_compressed
- is_frozen
- supply
- seq
- is_burnt
- was_decompressed
- onchain_data
- creators
- royalty_amount
- url
- chain_mutability
- lamports
- executable
- metadata_owner
- raw_name
- mpl_core_plugins
- mpl_core_unknown_plugins
- rent_epoch
- num_minted
- current_size
- plugins_json_version
- mpl_core_external_plugins
- mpl_core_unknown_external_plugins
Stores a mapping between metadata accounts and mint accounts so that the metadata key does not have to be recalculated each time.
Key
- metadata pubkey
Fields
- pubkey
- mint_key
Stores data related to the authority or control over the asset.
Key
- asset pubkey
Fields
- pubkey
- authority
- slot_updated
- write_version
Contains data about the current owner of each asset.
Key
- asset pubkey
Fields
- pubkey
- owner
- delegate
- owner_type
- owner_delegate_seq
Stores leaf data related to assets as part of a Merkle tree structure.
Key
- asset pubkey
Fields
- pubkey
- tree_id
- leaf
- nonce
- data_hash
- creator_hash
- leaf_seq
- slot_updated
Contains collection-level data for assets that belong to specific collections.
Key
- asset pubkey
Fields
- pubkey
- collection
- is_collection_verified
- authority
Stores off-chain data associated with the assets, such as the metadata JSON file.
Key
- url
Fields
- url
- metadata
Holds CLItem data emitted by the Account Compression program during instruction execution.
Key
- Merkle tree node id + tree pubkey
Fields
- cli_node_idx
- cli_tree_key
- cli_leaf_idx
- cli_seq
- cli_level
- cli_hash
- slot_updated
Stores leaf nodes from the Merkle tree.
Key
- Merkle tree node id + tree pubkey
Fields
- cli_leaf_idx
- cli_tree_key
- cli_node_idx
Stores the numbers of slots that contain transactions related to the Bubblegum program.
Key
- slot number
Stores the numbers of slots that still need to be processed. This column family is populated by the backfiller when it runs in either IngestPersisted or PersistAndIngest mode.
Key
- slot number
Stores the numbers of slots that must be re-parsed because of a gap in the data. This column family is used only if sequence_consistent_checker is active. If it finds a gap in a tree sequence, it writes the slots that have to be re-parsed to this column family. The slot_force_persister then iterates over these slots, downloads the blocks, and parses them.
Key
- slot number
Stores raw block data in CBOR format.
Key
- block number
Fields
- data
Stores the index of updated assets. This column is used by synchronizer to keep RocksDB and PostgreSQL in sync.
Key
- sequence + slot + pubkey
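One reason a composite key like this is useful is that RocksDB's default comparator orders keys bytewise, so a suitable encoding lets iteration follow update order. The sketch below assumes a big-endian layout of 8 bytes of sequence, 8 bytes of slot, and a 32-byte pubkey; this layout and the function names are assumptions for illustration and are not confirmed to match the encoding Aura actually uses.

```rust
/// Encodes (sequence, slot, pubkey) so that bytewise key order matches
/// numeric order of sequence, then slot. The big-endian layout is an
/// assumption made for this example.
pub fn encode_update_key(seq: u64, slot: u64, pubkey: &[u8; 32]) -> Vec<u8> {
    let mut key = Vec::with_capacity(8 + 8 + 32);
    key.extend_from_slice(&seq.to_be_bytes());
    key.extend_from_slice(&slot.to_be_bytes());
    key.extend_from_slice(pubkey);
    key
}

/// Decodes a key produced by `encode_update_key`; returns None if the input
/// does not have the expected 48-byte length.
pub fn decode_update_key(bytes: &[u8]) -> Option<(u64, u64, [u8; 32])> {
    if bytes.len() != 48 {
        return None;
    }
    let mut seq_bytes = [0u8; 8];
    seq_bytes.copy_from_slice(&bytes[0..8]);
    let mut slot_bytes = [0u8; 8];
    slot_bytes.copy_from_slice(&bytes[8..16]);
    let mut pubkey = [0u8; 32];
    pubkey.copy_from_slice(&bytes[16..48]);
    Some((
        u64::from_be_bytes(seq_bytes),
        u64::from_be_bytes(slot_bytes),
        pubkey,
    ))
}
```

With little-endian integers, sequence 256 would sort before sequence 1 under a bytewise comparator, which is why the sketch uses big-endian.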
Maps slots to asset updates.
Key
- slot + pubkey
Stores the sequence index for Merkle trees. Every sequence update is saved here.
Key
- tree pubkey + sequence
Fields
- slot
Stores the pubkeys of trees that have gaps in their sequences.
Key
- tree pubkey
Stores either Edition or MasterEdition information.
Key
- asset pubkey
Fields
Edition
- key
- parent
- edition
- write_version
MasterEdition
- key
- supply
- max_supply
- write_version
Contains data about token accounts.
Key
- asset pubkey
Fields
- pubkey
- mint
- delegate
- owner
- frozen
- delegated_amount
- slot_updated
- amount
- write_version
Stores a boolean flag indicating whether the wallet's token balance is zero.
Key
- owner wallet + token account pubkey
Fields
- is_zero_balance
- write_version
Stores a boolean flag indicating whether the wallet's token balance is zero. Unlike TokenAccountOwnerIdx, the data is also sorted by mint.
Key
- mint + owner + token account
Fields
- is_zero_balance
- write_version
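Composite keys such as owner + token account make it possible to list all entries for one owner with a single range scan. The sketch below models the owner index as an ordered in-memory map and returns the token accounts of an owner whose balance is not zero; the function name and the map representation are assumptions for illustration and do not reflect the real RocksDB access code.

```rust
use std::collections::BTreeMap;

pub type Pubkey = [u8; 32];

/// Lists the token accounts of `owner` whose `is_zero_balance` flag is false.
/// The map key is (owner wallet, token account pubkey), the value is the
/// `is_zero_balance` flag. Hypothetical helper, for illustration only.
pub fn non_empty_accounts_of(
    index: &BTreeMap<(Pubkey, Pubkey), bool>,
    owner: Pubkey,
) -> Vec<Pubkey> {
    // All keys that share the owner prefix lie between these two bounds.
    let start = (owner, [0u8; 32]);
    let end = (owner, [0xffu8; 32]);
    index
        .range(start..=end)
        .filter(|(_, is_zero_balance)| !**is_zero_balance)
        .map(|((_, token_account), _)| *token_account)
        .collect()
}
```

Putting the mint in front of the owner, as in the second index, would allow the same kind of scan to be narrowed to one mint first.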
Stores compressed assets signatures.
Key
- key + leaf id + sequence
Fields
- transaction signature
- instruction name
- slot
A queue for batch mint operations to process.
Key
- file hash
Fields
- file_hash
- url
- created_at_slot
- signature
- download_attempts
- persisting_state
- staker
- collection_mint
Stores batch mints which did not pass verification.
Key
- status + file hash
Fields
- status
- file_hash
- url
- created_at_slot
- signature
- download_attempts
- staker
Stores downloaded batch mint information.
Key
- file hash
Fields
- batch_mint
- tree_id
- batch_mints
- raw_metadata_map
- max_depth
- max_buffer_size
- staker
Stores RocksDB migration version.
Key
- version number
Stores token prices in USD by its symbol.
Key
- token symbol
Fields
- price
Represents information about an asset preview stored on the Storage service.
Key
- asset's url hash
Fields
- size
- failed
A RocksDB column family used as a queue of asset URLs to be sent to the Storage service, where they are downloaded and saved as previews.
Key
- url
Fields
- timestamp
- download_attempts
Represents information about a background job, which can be either a one-time job or a scheduled job that is launched repeatedly at a given interval.
Key
- job id
Fields
- job_id
- run_interval_sec
- last_run_epoch_time
- last_run_status
- state
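How the interval and last-run fields could drive scheduling is sketched below. The struct keeps only the fields needed for the decision, and treating a missing interval as a one-time job is an assumption of this example; `ScheduledJob` and `is_due` are hypothetical names, and the `state` and `last_run_status` fields are ignored here.

```rust
/// Subset of the job record needed to decide whether a job should run.
/// Hypothetical type, for illustration only.
pub struct ScheduledJob {
    pub job_id: String,
    /// None is used here to model a one-time job (assumption).
    pub run_interval_sec: Option<u64>,
    /// None means the job has never run.
    pub last_run_epoch_time: Option<u64>,
}

/// Returns true if the job should be launched at `now_epoch_sec`.
pub fn is_due(job: &ScheduledJob, now_epoch_sec: u64) -> bool {
    match (job.last_run_epoch_time, job.run_interval_sec) {
        // Never ran before: run it now.
        (None, _) => true,
        // One-time job that already ran: do not run again.
        (Some(_), None) => false,
        // Recurring job: run once the interval has elapsed.
        (Some(last), Some(interval)) => now_epoch_sec >= last.saturating_add(interval),
    }
}
```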
Stores information about token inscriptions.
Key
- asset pubkey
Fields
- authority
- root
- content_type
- encoding
- inscription_data_account
- order
- size
- validation_hash
- write_version
Stores raw inscription data.
Key
- asset pubkey
Fields
- pubkey
- data
- write_version
This column family contains sequence updates for each leaf in the tree.
Key
- set of Signature+TreeId+leafId
Fields
- data: hash map with slots and sequences
Below is a short description of the tables the Aura node has in PostgreSQL.
Stores asset creators.
- pubkey
- creator
- verified
- slot_updated
Stores asset authorities.
- pubkey
- authority
- slot_updated
Stores all the asset info.
- pubkey
- specification_version
- specification_asset_class
- royalty_target_type
- royalty_amount
- slot_created
- owner
- owner_type
- delegate
- collection
- is_collection_verified
- is_burnt
- is_compressible
- is_compressed
- is_frozen
- supply
- metadata_url_id
- slot_updated
- authority_fk
Batch mints queue.
- file_name
- state
- error
- url
- tx_reward
- created_a
Stores the last synced asset. Used by the synchronizer to keep the two databases in sync.
- id
- last_synced_asset_update_key
Stores tasks for the JSON downloader, which processes NFT metadata.
- metadata_url
- status
- locked_until
- attempts
- max_attempts
- error
- id
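The locked_until, attempts, and max_attempts columns suggest a claim-and-retry scheme for download tasks. The sketch below shows one way such fields could be used; the status values, the `Task` type, and the `can_claim` and `claim` functions are assumptions for illustration, since the text lists the columns but not the rules the JSON downloader applies.

```rust
/// Possible task states. These variants are assumptions; the text only says
/// that a `status` column exists.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskStatus {
    Pending,
    Running,
    Success,
    Failed,
}

/// In-memory model of one task row. Hypothetical type, for illustration only.
pub struct Task {
    pub metadata_url: String,
    pub status: TaskStatus,
    /// Epoch seconds until which another worker holds the task, if any.
    pub locked_until: Option<u64>,
    pub attempts: u32,
    pub max_attempts: u32,
}

/// A task can be claimed if it has not succeeded, still has attempts left,
/// and is not currently locked by another worker.
pub fn can_claim(task: &Task, now: u64) -> bool {
    let unlocked = task.locked_until.map_or(true, |until| until <= now);
    task.status != TaskStatus::Success && task.attempts < task.max_attempts && unlocked
}

/// Claims the task for `lock_secs` seconds and counts the attempt.
/// Returns false, leaving the task unchanged, if it cannot be claimed.
pub fn claim(task: &mut Task, now: u64, lock_secs: u64) -> bool {
    if !can_claim(task, now) {
        return false;
    }
    task.status = TaskStatus::Running;
    task.attempts += 1;
    task.locked_until = Some(now.saturating_add(lock_secs));
    true
}
```

In a real deployment the check and the update would have to happen atomically in PostgreSQL so that two workers cannot claim the same row.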