doc: add description to readme
n00m4d committed Oct 10, 2024
1 parent 9edb0c1 commit bb1f361
Showing 4 changed files with 32 additions and 1 deletion.
5 changes: 4 additions & 1 deletion Makefile
@@ -2,7 +2,10 @@

SHELL := /bin/bash

SNAPSHOTDIR=./plerkle_snapshot/snapshot/*.tar.zst
ifneq (,$(wildcard .env))
include .env
export $(shell sed 's/=.*//' .env)
endif

export IMAGE_NAME=solana-snapshot-etl

23 changes: 23 additions & 0 deletions README.md
@@ -22,6 +22,7 @@ It is built on the following principles.
1. Plerkle -> Geyser Plugin that sends raw information to a message bus using Messenger
2. Messenger -> A message bus agnostic Messaging Library that sends Transaction, Account, Block and Slot updates in the Plerkle Serialization format.
3. Plerkle Serialization -> FlatBuffers based serialization code and schemas. This is the wire-format of Plerkle.
4. Plerkle Snapshot -> ETL for Solana accounts snapshot.

## Developing

@@ -170,3 +171,25 @@ NOTE WE DO NOT PUBLISH THE PLUGIN ANY MORE:

plerkle_messenger-https://crates.io/crates/plerkle_messenger
plerkle_serialization-https://crates.io/crates/plerkle_serialization

## Snapshot ETL

The Plerkle snapshot tool parses Solana account snapshots. The repository includes pre-configured `geyser-config.json` and `etl-config.json` files that are ready to use. The only setting you might want to change is the list of programs in `geyser-config.json`; the rest can stay as is.

Before running the tool, create a `.env` file modeled after `.env.example`. In it, specify the path to the directory containing the snapshots and the Plerkle messenger configuration.
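
The Makefile includes `.env` when it exists and exports each variable name it finds there, using `sed 's/=.*//'` to strip the values. The sketch below shows what that expression yields for a small file. The variable names `SNAPSHOT_DIR` and `MESSENGER_URL` are hypothetical and used only for illustration; the real names are the ones listed in `.env.example`.

```shell
# Hypothetical .env contents; the real variable names come from .env.example.
tmpdir=$(mktemp -d)
cat > "$tmpdir/.env" <<'EOF'
SNAPSHOT_DIR=/data/snapshots
MESSENGER_URL=redis://localhost:6379
EOF

# Same sed expression as in the Makefile: delete everything from the first '='
# to the end of each line, leaving only the variable names.
names=$(sed 's/=.*//' "$tmpdir/.env")
echo "$names"

rm -r "$tmpdir"
```

Lines without an `=` pass through `sed` unchanged, so it is probably safest to keep `.env` to plain `NAME=value` lines.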

Once everything is set up, build the Docker image for the ETL by running:

```
make build
```

This creates a Docker image with the Geyser plugin and the ETL fully built and ready to use.

The next step is to run the ETL:

```
make stream
```

This command launches the ETL Docker container, which loads the snapshot archives and the Geyser plugin binary and streams all accounts from the snapshot to the plugin.
3 changes: 3 additions & 0 deletions plerkle_snapshot/README.md
@@ -6,6 +6,9 @@

**`solana-snapshot-etl` efficiently extracts all accounts in a snapshot** to load them into an external system.

> [!IMPORTANT]
> This code is a fork of the [original repository](https://github.com/riptl/solana-snapshot-etl.git) and has been modified, so its behavior differs from the original implementation.

## Motivation

Solana nodes periodically backup their account database into a `.tar.zst` "snapshot" stream.
2 changes: 2 additions & 0 deletions plerkle_snapshot/src/bin/solana-snapshot-etl/main.rs
@@ -71,7 +71,9 @@ fn _main() -> Result<(), Box<dyn std::error::Error>> {

#[derive(Deserialize)]
pub struct Config {
// path to the built Geyser binary
pub libpath: String,
// path to the Geyser config file
pub geyser_conf_path: String,
pub throttle_nanos: u64,
}