Add SMCE_Basics nb with file upload
michele-ornl authored Oct 8, 2024
1 parent 0fb7fe0 commit 5daac50
Showing 1 changed file with 337 additions and 0 deletions.
337 changes: 337 additions & 0 deletions book/tutorials/NotebookBasics/SMCE_Basics.ipynb
@@ -0,0 +1,337 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "6513f2b1-40c2-48ef-a7ef-66a9b7bf7930",
"metadata": {},
"source": [
"# BioSCape SMCE Basics\n",
"## <center>BioSCape Data Skills Workshop: From the Field to the Image</center>\n",
"![Bioscape](images/121229-87.png)\n",
"\n",
"[**BioSCape**](https://www.bioscape.io/), the Biodiversity Survey of the Cape, is NASA’s first biodiversity-focused airborne and field campaign, conducted in South Africa in 2023. BioSCape’s primary objective is to study the structure, function, and composition of the region’s ecosystems, and how and why they are changing. \n",
"\n",
"BioSCape's airborne dataset is unprecedented, with `AVIRIS-NG`, `PRISM`, and `HyTES` imaging spectrometers capturing spectral data across the UV, visible, and infrared at high resolution and `LVIS` acquiring coincident full-waveform lidar. BioSCape's `field dataset` is equally impressive, with 18 PI-led projects collecting data on the diversity and phylogeny of plants, kelp and phytoplankton, eDNA, landscape acoustics, plant traits, blue carbon accounting, and more.\n",
"\n",
"This workshop will equip participants with the skills to find, subset, and visualize the various BioSCape field and airborne (imaging spectroscopy and full-waveform lidar) data sets. Participants will learn data skills through worked examples in terrestrial and aquatic ecosystems, including: wrangling lidar data, performing band math calculations, calculating spectral diversity metrics, machine learning and image classification, and mapping functional traits using partial least squares regression. The workshop format is a mix of expert talks and interactive coding notebooks and will be run through the BioSCape Cloud computing environment.\n",
"\n",
"**Date:** October 9 - 11, 2024, Cape Town, South Africa\n",
"\n",
"**Host:** NASA’s Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC), in close collaboration with BioSCape, the South African Environmental Observation Network (SAEON), the University of Wisconsin Madison (Phil Townsend), The Nature Conservancy (Glenn Moncrieff), the University of California Merced (Erin Hestir), the University of Cape Town (Jasper Slingsby), Jet Propulsion Laboratory (Kerry Cawse-Nicholson), and UNESCO.\n",
"\n",
"**Instructors:** \n",
"- In-person contributors: Anabelle Cardoso, Erin Hestir, Phil Townsend, Henry Frye, Glenn Moncrieff, Jasper Slingsby, Michele Thornton, Rupesh Shrestha\n",
"- Virtual contributors: Kerry Cawse-Nicholson, Nico Stork, Kyle Kovach\n",
"\n",
"**Audience:** This training is primarily intended for government natural resource management agency representatives and field technicians in South Africa, as well as local academics and students, especially those connected to the BioSCape Team. \n",
"\n",
"----"
]
},
{
"cell_type": "markdown",
"id": "5b72bf55-b854-4d67-94c4-fd33966fa5df",
"metadata": {},
"source": [
"## Overview \n",
"----\n",
"This tutorial explores the BioSCape Science Managed Cloud Environment (SMCE), including how to access and browse its data holdings using Python."
]
},
{
"cell_type": "markdown",
"id": "0e6538e7-086e-4fbc-bb7a-21663154a936",
"metadata": {},
"source": [
"### Load Python Modules"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "63e38fc4-363c-4701-ac33-a9e8bc98cf05",
"metadata": {},
"outputs": [],
"source": [
"import s3fs\n",
"\n",
"# anon=False makes S3Fs use the AWS credentials configured in this environment\n",
"s3 = s3fs.S3FileSystem(anon=False)"
]
},
{
"cell_type": "markdown",
"id": "e224b4ae-e7df-4149-bdfe-7067220046f7",
"metadata": {},
"source": [
"## Explore the BioSCape SMCE S3 data holdings\n",
"\n",
"Let's start by exploring the BioSCape airborne data currently available in Amazon S3 cloud storage. This S3 storage is specific to the SMCE, which provides data access and an analytics environment to the BioSCape Science Team as well as interested researchers. \n",
"We'll learn how to access the S3 object storage from our working environment, as well as how to bring other data to the analysis.\n",
"\n",
"- **SMCE** = Science Managed Cloud Environment\n",
"- **S3** = Amazon Simple Storage Service (S3), a cloud storage service that allows users to store and retrieve data\n",
"- **S3 Bucket** = the basic container that holds data in S3. Buckets can be likened to top-level file folders, although S3 is object storage rather than a true file system\n",
"- **S3Fs** = an open-source Python library that provides a filesystem-like interface for accessing objects on S3, so a bucket can be browsed much like a local directory\n",
"  - The top-level class **`S3FileSystem`** holds connection information and allows typical file-system style operations like `ls`, `cp`, `mv`\n",
"  - `ls` is a UNIX command to list computer files and directories"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e9d5ecaf-04b1-4ad5-b39d-a55b471bcdc2",
"metadata": {},
"outputs": [],
"source": [
"# Use S3Fs to list the BioSCape data on the BioSCape SMCE S3 storage\n",
"files = s3.ls('bioscape-data/')\n",
"files"
]
},
{
"cell_type": "markdown",
"id": "2627ce44-b4ac-410c-8bf2-4def61ee4437",
"metadata": {},
"source": [
"## BioSCape Flight Data\n",
"\n",
"The BioSCape Campaign (Oct - Nov, 2023) flew 4 airborne instruments on 2 aircraft. \n",
"\n",
"- `Gulfstream 3`: **AVIRIS-Next Generation** and **PRISM**\n",
"- `Gulfstream 5`: **HyTES** and **LVIS**\n",
"\n",
"The NASA Jet Propulsion Laboratory provides the **BioSCape Data Portal** (https://popo.jpl.nasa.gov/mmgis-aviris/?mission=BIOSCAPE) which shows flight line visualization, quick look images, and BioSCape Team Regions of Interest (ROIs). Download access to preliminary airborne data is available."
]
},
{
"cell_type": "markdown",
"id": "575a612b-1ef5-40ec-b016-1cd2c98b2f29",
"metadata": {},
"source": [
"### Let's look deeper into each airborne folder on the SMCE\n",
"#### **AVIRIS-Next Generation (AVNG)**\n",
"- We'll spend a little more time looking into the AVIRIS-NG files as these data are a focus of many Notebooks"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "689c7069-7bb0-4fa9-a500-b8ac629bc1ec",
"metadata": {},
"outputs": [],
"source": [
"AVNG_flightlines = s3.ls('bioscape-data/AVNG/')\n",
"AVNG_flightlines[:10]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6b12d72d-f2e7-428a-9e2a-a9daeaa8d0de",
"metadata": {},
"outputs": [],
"source": [
"AVNGfl_count = len(AVNG_flightlines)\n",
"AVNGfl_count"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "df46a7e3-5846-4548-9b57-d068f223db8e",
"metadata": {},
"outputs": [],
"source": [
"# looking into the ang20231022t092801 folder\n",
"AVNG_scenes = s3.ls('bioscape-data/AVNG/ang20231022t092801')\n",
"AVNG_scenes"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c20fbbd2-b932-446d-8981-baae543bca79",
"metadata": {},
"outputs": [],
"source": [
"# Explore the AVNG folder that holds an AVNG scene's data\n",
"AVNG_scene_data = s3.ls('bioscape-data/AVNG/ang20231022t092801/ang20231022t092801_000')\n",
"AVNG_scene_data"
]
},
{
"cell_type": "markdown",
"id": "783776a6-a849-48f1-8d11-a50351817b97",
"metadata": {},
"source": [
"#### File **Naming Conventions** provide information about each flight line, scene, and data type.\n",
"\n",
"- **`L1B`** files are AVIRIS-NG calibrated radiance measured at the sensor\n",
"- **`L2A`** files are AVIRIS-NG surface reflectance\n",
"\n",
"| File name pattern | Description |\n",
"| -------- | --- |\n",
"| `*_RFL_ORT_QL.tif` | Reflectance Quick Look image (3 bands) |\n",
"| `*_RFL_ORT` | Reflectance ENVI binary file (425 bands) |\n",
"| `*_RFL_ORT.hdr` | Reflectance ENVI header file (text file) |\n",
"\n",
"<center><img src=\"images/ANG_naming.jpg\"/></center>"
]
},
{
"cell_type": "markdown",
"id": "b8b7d7f6-6acd-4361-8e21-c3067fdae062",
"metadata": {},
"source": [
"#### Open a file from an S3 Bucket - S3Fs\n",
"- Calling `open()` on an **`S3FileSystem`** (typically using a context manager) provides an `S3File` for read or write access to a particular key.\n",
"- The resulting file object can be used with other libraries that consume the file interface, such as `gzip` or `pandas`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8cd9871f-6ae7-4698-b2f6-8b3c50c71fd6",
"metadata": {},
"outputs": [],
"source": [
"# Print the first 12 lines of an ENVI header file\n",
"hdr_link = 'bioscape-data/AVNG/ang20231022t092801/ang20231022t092801_000/ang20231022t092801_000_L2A_OE_main_27577724_RFL_ORT.hdr'\n",
"with s3.open(hdr_link, mode='r') as hdr:\n",
"    for i in range(12):\n",
"        line = next(hdr).strip()\n",
"        print(line)"
]
},
{
"cell_type": "markdown",
"id": "7fc574ee-d9d7-42d7-ad1e-37f916610f5a",
"metadata": {},
"source": [
"#### **Portable Remote Imaging Spectrometer (PRISM)**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "daebaac1-6bdd-448b-8b4e-073767a75d12",
"metadata": {},
"outputs": [],
"source": [
"PRISM_flightlines = s3.ls('bioscape-data/PRISM')\n",
"PRISM_flightlines"
]
},
{
"cell_type": "markdown",
"id": "f8b40218-fb6e-4718-bc0f-ecd446c771f1",
"metadata": {},
"source": [
"#### Level 2 (L2) PRISM data are available in the SMCE\n",
"- ENVI file format in binary/header pairs"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0c9dc196-ece2-4b02-ae12-cad2db9460db",
"metadata": {},
"outputs": [],
"source": [
"PRISM_flightlines = s3.ls('bioscape-data/PRISM/L2')\n",
"PRISM_flightlines[:10]"
]
},
{
"cell_type": "markdown",
"id": "7c4d3699-a044-49cf-99f4-e91a1cb753ab",
"metadata": {},
"source": [
"- The PRISM L2 data are available in ENVI format as binary/header file pairs"
]
},
{
"cell_type": "markdown",
"id": "71e4e9dc-aaca-4329-baaf-af10f79fd764",
"metadata": {},
"source": [
"#### **Land, Vegetation, and Ice Sensor (LVIS)**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "562510bd-3502-4c4e-b52c-c124ec042cec",
"metadata": {},
"outputs": [],
"source": [
"LVIS_flightlines = s3.ls('bioscape-data/LVIS/')\n",
"LVIS_flightlines"
]
},
{
"cell_type": "markdown",
"id": "9fe61cdb-4e34-4658-8377-a216086001d9",
"metadata": {},
"source": [
"#### Let's look into the LVIS Level 2 (L2) data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "eabb26dc-90ab-4109-b229-36cf64c70d38",
"metadata": {},
"outputs": [],
"source": [
"LVIS_flightlines_L2 = s3.ls('bioscape-data/LVIS/L2')\n",
"LVIS_flightlines_L2[:10]"
]
},
{
"cell_type": "markdown",
"id": "bdf4c503-9d69-4320-bd34-eebb5167b7a4",
"metadata": {},
"source": [
"- LVIS L2 data are ASCII text files"
]
},
{
"cell_type": "markdown",
"id": "34273fbc-d0d3-4a82-96cb-4aed81f41974",
"metadata": {},
"source": [
"#### **Hyperspectral Thermal Emission Spectrometer (HyTES)**\n",
"- BioSCape HyTES data are currently available from the [**NASA JPL HyTES**](https://hytes.jpl.nasa.gov/order!) distribution site"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ec9707de-5bed-4b8d-ac7e-839eb29f944c",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "global-global-BioSCape",
"language": "python",
"name": "conda-env-global-global-BioSCape-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
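
The notebook above describes the AVIRIS-NG file naming conventions and prints the start of an ENVI header. The sketch below illustrates both ideas with the standard library only, so it runs without S3 access. It is not part of the commit: the helper names `parse_avng_name` and `parse_envi_header` are hypothetical, the name pattern is inferred only from the single example path in the notebook, reading the timestamp as an acquisition time is an assumption, and the sample header values are made up (except the 425 bands listed in the notebook's table).

```python
import re
from datetime import datetime
from pathlib import PurePosixPath

# Assumed pattern, inferred from the one example key in the notebook:
# ang<YYYYMMDD>t<HHMMSS>_<scene>_<level>_<product tags>
AVNG_NAME = re.compile(
    r"^(?P<flightline>ang(?P<stamp>\d{8}t\d{6}))"
    r"_(?P<scene>\d{3})_(?P<level>L\d[A-Z])_(?P<tags>.+)$"
)


def parse_avng_name(key):
    """Split an AVIRIS-NG object key into its naming-convention parts."""
    path = PurePosixPath(key)
    match = AVNG_NAME.match(path.stem)
    if match is None:
        raise ValueError(f"unrecognized AVIRIS-NG name: {path.name}")
    return {
        "flightline": match["flightline"],
        # Assumption: the stamp encodes the acquisition date and time
        "acquired": datetime.strptime(match["stamp"], "%Y%m%dt%H%M%S"),
        "scene": int(match["scene"]),
        "level": match["level"],
        "tags": match["tags"].split("_"),
        "extension": path.suffix,
    }


def parse_envi_header(text):
    """Parse ENVI header text into a dict of lower-cased keys and string values."""
    lines = iter(text.splitlines())
    if next(lines, "").strip() != "ENVI":
        raise ValueError("not an ENVI header: first line must be 'ENVI'")
    fields = {}
    for line in lines:
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, value = (part.strip() for part in line.split("=", 1))
        # Brace-delimited values may continue over several lines
        while value.startswith("{") and "}" not in value:
            value += " " + next(lines, "}").strip()
        if value.startswith("{"):
            value = value.strip("{}").strip()
        fields[key.lower()] = value
    return fields


info = parse_avng_name(
    "bioscape-data/AVNG/ang20231022t092801/ang20231022t092801_000/"
    "ang20231022t092801_000_L2A_OE_main_27577724_RFL_ORT.hdr"
)

# Made-up header text for illustration only
sample_header = """ENVI
description = {
  hypothetical reflectance scene }
samples = 100
lines = 200
bands = 425
interleave = bil
wavelength = { 377.2, 382.2,
  387.2 }
"""
hdr_fields = parse_envi_header(sample_header)
shape = (int(hdr_fields["lines"]), int(hdr_fields["samples"]), int(hdr_fields["bands"]))
print(info["flightline"], info["level"], shape)
```

In the notebook environment, the same parser could be given real header text by calling `parse_envi_header(hdr.read())` inside the `with s3.open(hdr_link, mode='r') as hdr:` block. Real headers may contain fields this sketch does not handle.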
