feat: add command `download` to download public node snapshots #13598
Conversation
CodSpeed Performance Report: Merging #13598 will not alter performance.
Let's also maybe add a list of pre-set snapshot URLs so that the user doesn't need to find the URL themselves? And we can default to one of them, while the `--help` menu shows alternatives/fallbacks?
What would you advise so that I can fetch the same block height they use in their URL? @matias-gonz @joshieDo My aim is to build a default "dynamic" URL with the corresponding block height, following this pattern:
Snapshots can be quite big (>2TB with an archival node).
The current PR would mean that we'd need double that space, since we're downloading everything first and then decompressing. Ideally we'd pipe what we download straight into the decompression stage.
Basically replicating the following behaviour:

```shell
wget -O - https://downloads.merkle.io/reth-2025-01-06.tar.lz4 | tar -I lz4 -xvf -
```

Regarding the merkle URL, it's possible to find the latest archive via https://downloads.merkle.io/latest.txt
Thanks, I've added this option.
```rust
#[arg(
    long,
    short,
    help = "Custom URL to download the snapshot from",
    long_help = "Specify a snapshot URL or let the command propose a default one.\n\
        \n\
        Available snapshot sources:\n\
        - https://downloads.merkle.io (default, mainnet archive)\n\
        - https://publicnode.com/snapshots (full nodes & testnets)\n\
        \n\
        If no URL is provided, the latest mainnet archive snapshot\n\
        will be proposed for download from merkle.io"
)]
```
No need to propose/ask for further user input. If no `--url` is provided, then default to the merkle one.
```rust
/// Spawn `tar`, reading the lz4-decompressed stream from stdin and
/// extracting it into `target_dir`.
fn spawn_tar_process(target_dir: &Path, lz4_stdout: Stdio) -> Result<Child> {
    Ok(ProcessCommand::new("tar")
        .arg("-xf")
        .arg("-") // Read from stdin
        .arg("-C")
        .arg(target_dir)
        .stdin(lz4_stdout)
        .stderr(Stdio::inherit())
        .spawn()?)
}
```
Could this be using the `lz4` and `tar` crates instead? We shouldn't be depending on external binaries imo.
```rust
stream_and_extract(&url, data_dir.data_dir()).await?;
info!("Snapshot downloaded and extracted successfully");
```
```diff
-info!("Snapshot downloaded and extracted successfully");
+info!(target: "reth::cli", "Snapshot downloaded and extracted successfully");
```
and in other places
Closes #13469.
Example of use: