Reformatting Sentinel-1 burstDB for Cloud-optimized reading
Sentinel-1 Burst ID Map, version 20220530, generated by the SAR-MPC service https://sar-mpc.eu/test-data-sets/
1,127,661 polygons representing all Sentinel-1 ascending and descending IW mode bursts
The reformatted files are added as v1 release artifacts in this repository
Below are notes on how the files were created with GDAL, along with basic stats
wget -nc https://sar-mpc.eu/files/S1_burstid_20220530.zip
# 506M Jan 30 2023 S1_burstid_20220530.zip
unzip S1_burstid_20220530.zip
ls -ltrh S1_burstid_20220530/IW/sqlite/burst_map_IW_000001_375887.sqlite3
# 279M Jun 8 2022 S1_burstid_20220530/IW/sqlite/burst_map_IW_000001_375887.sqlite3
gdalinfo --version
# GDAL 3.8.4, released 2024/02/08
ogrinfo S1_burstid_20220530/IW/sqlite/burst_map_IW_000001_375887.sqlite3 burst_id_map -so
INFO: Open of `S1_burstid_20220530/IW/sqlite/burst_map_IW_000001_375887.sqlite3'
using driver `SQLite' successful.
Layer name: burst_id_map
Geometry: 3D Multi Polygon
Feature Count: 1127661
Extent: (-180.000000, -78.665277) - (180.000000, 87.393141)
Layer SRS WKT:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]
Data axis to CRS axis mapping: 2,1
FID Column = OGC_FID
Geometry Column = GEOMETRY
burst_id: Integer (0.0)
subswath_name: String (0.0)
relative_orbit_number: Integer (0.0)
time_from_anx_sec: Real (0.0)
orbit_pass: String (0.0)
Note the "Geometry: 3D Multi Polygon" line in the output above: multipolygons are used so that bursts crossing the dateline can be split. We can save space by dropping the Z dimension and writing 2D geometries (the -dim 2 option below).
Seek-Optimized ZIP (SOZip, https://github.com/sozip/sozip-spec) is used for the GPKG and FlatGeobuf files. It reduces total storage size while still allowing random access inside the compressed file, which minimizes data transfer when reading directly from a URL.
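Reading a SOZip file straight from a URL relies on chaining GDAL's /vsizip/ and /vsicurl/ virtual file systems. The sketch below only builds and prints such a path. The URL is a placeholder (an assumption, not the real release asset location) and the helper name vsi_zip_url_path is hypothetical.

```shell
#!/bin/sh
# Placeholder URL (assumption): replace with the real URL of the zipped file.
ZIP_URL="https://example.com/burst_id_map.gpkg.zip"

# Build a GDAL virtual path that chains /vsicurl/ (HTTP range requests)
# with /vsizip/ (read one member of the archive without extracting it).
vsi_zip_url_path() {
    # $1: URL of the .zip archive, $2: file name inside the archive
    printf '/vsizip//vsicurl/%s/%s\n' "$1" "$2"
}

REMOTE_PATH=$(vsi_zip_url_path "$ZIP_URL" burst_id_map.gpkg)
echo "$REMOTE_PATH"
```

With a real URL, the printed path can be passed to any GDAL tool, for example: ogrinfo -so "$REMOTE_PATH" burst_id_map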
time ogr2ogr -f GPKG -dim 2 burst_id_map.gpkg S1_burstid_20220530/IW/sqlite/burst_map_IW_000001_375887.sqlite3
# 6.79s user 1.07s system 102% cpu 7.693 total
# 299M Apr 12 15:38 burst_id_map.gpkg
time sozip burst_id_map.gpkg.zip burst_id_map.gpkg
# 9.85s user 0.60s system 811% cpu 1.288 total
# 147M Apr 12 15:39 burst_id_map.gpkg.zip
time ogr2ogr -f FlatGeobuf -dim 2 burst_id_map.fgb S1_burstid_20220530/IW/sqlite/burst_map_IW_000001_375887.sqlite3
# 4.81s user 1.13s system 95% cpu 6.239 total
# 339M Apr 12 15:41 burst_id_map.fgb
time sozip burst_id_map.fgb.zip burst_id_map.fgb
# 9.08s user 0.61s system 759% cpu 1.276 total
# 150M Apr 12 15:41 burst_id_map.fgb.zip
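As a rough check of what SOZip saves, the sizes reported above can be turned into percentages. This is a small sketch using awk. The helper name report_saving is hypothetical, and the numbers are copied by hand from the ls output in these notes.

```shell
#!/bin/sh
# Print the relative size reduction between an original and a zipped file.
report_saving() {
    # $1: label, $2: original size, $3: SOZip size (both in M, as shown by ls -h)
    awk -v label="$1" -v orig="$2" -v zipped="$3" \
        'BEGIN { printf "%s: %dM -> %dM (%.0f%% smaller)\n", label, orig, zipped, 100 * (1 - zipped / orig) }'
}

report_saving GPKG 299 147
report_saving FlatGeobuf 339 150
```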
Expect much better read performance with GeoParquet >= 1.1 and GDAL >= 3.9 (see the comment on opengeospatial/geoparquet#188)
https://gdal.org/drivers/vector/parquet.html
conda install -c conda-forge libgdal-arrow-parquet
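After installing the package, it can be worth confirming that GDAL actually sees the Parquet driver before converting. The following is a minimal sketch. The helper name has_driver is hypothetical, and it assumes the usual "Name -vector- (flags): description" layout of the ogrinfo --formats listing.

```shell
#!/bin/sh
# Succeed if the driver named $1 appears in a format listing read from stdin.
has_driver() {
    grep -qiE "^[[:space:]]*$1[[:space:]]+-"
}

if command -v ogrinfo >/dev/null 2>&1; then
    if ogrinfo --formats | has_driver Parquet; then
        echo "Parquet driver available"
    else
        echo "Parquet driver missing: install libgdal-arrow-parquet"
    fi
else
    echo "GDAL not found on PATH"
fi
```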
TODO: experiment with creation options and compression, for example SORT_BY_BBOX=YES
time ogr2ogr -f Parquet -dim 2 burst_id_map.parquet S1_burstid_20220530/IW/sqlite/burst_map_IW_000001_375887.sqlite3
# 2.97s user 0.43s system 57% cpu 5.954 total
# 131M Apr 12 15:53 burst_id_map.parquet
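One way to approach the creation-option experiment is to generate one ogr2ogr command per compression codec and compare the resulting file sizes. The sketch below only prints the commands and has not been run for these notes. SORT_BY_BBOX and COMPRESSION are layer creation options of the GDAL Parquet driver, but SORT_BY_BBOX needs GDAL >= 3.9 (newer than the 3.8.4 used above), and the available codecs depend on how Arrow was built. The helper name parquet_cmd and the output file names are hypothetical.

```shell
#!/bin/sh
SRC="S1_burstid_20220530/IW/sqlite/burst_map_IW_000001_375887.sqlite3"

# Print (rather than run) an ogr2ogr command for one compression codec.
parquet_cmd() {
    # $1: value for the COMPRESSION layer creation option, e.g. ZSTD
    codec_lc=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    printf 'ogr2ogr -f Parquet -dim 2 -lco SORT_BY_BBOX=YES -lco COMPRESSION=%s burst_id_map_%s.parquet %s\n' \
        "$1" "$codec_lc" "$SRC"
}

for codec in SNAPPY ZSTD; do
    parquet_cmd "$codec"
done
```

Piping the output to sh would run the conversions; comparing the results with ls -lh then shows which codec gives the smallest file.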