Commit

Merge pull request #140 from nerc-project/gpu_drivers_setup
information on how to setup GPU drivers
jtriley authored Oct 18, 2023
2 parents d2318ec + 40f911c commit 8a0b923
Showing 4 changed files with 75 additions and 7 deletions.
@@ -415,7 +415,7 @@ as shown below:
!!! info "Default User name based on OS"
- **all Ubuntu images**: ubuntu
- **all CentOS images**: centos
- **all Rocky Linux images**: centos
- **all Rocky Linux images**: rocky
- **all Fedora images**: fedora
- **all Debian images**: debian
- **all RHEL images**: cloud-user
@@ -486,7 +486,7 @@ for more information.
!!! info "Default User name based on OS"
- **all Ubuntu images**: ubuntu
- **all CentOS images**: centos
- **all Rocky Linux images**: centos
- **all Rocky Linux images**: rocky
- **all Fedora images**: fedora
- **all Debian images**: debian
- **all RHEL images**: cloud-user
@@ -549,7 +549,7 @@ for more information.
!!! info "Default User name based on OS"
- **all Ubuntu images**: ubuntu
- **all CentOS images**: centos
- **all Rocky Linux images**: centos
- **all Rocky Linux images**: rocky
- **all Fedora images**: fedora
- **all Debian images**: debian
- **all RHEL images**: cloud-user
61 changes: 59 additions & 2 deletions docs/openstack/create-and-connect-to-the-VM/flavors.md
@@ -52,8 +52,8 @@ memory with default of 20 GB root disk at a rate of $0.026 / hr of wall time.
is integrated into a specialized hardware such as GPUs that produce unprecedented
performance boosts for technical computing workloads.

There are two flavors within the GPU tier, one featuring the newer
**NVidia A100s** along with **NVidia V100s**.
There are three flavors within the GPU tier, one featuring the newer
**NVidia A100s** along with **NVidia V100s** and **NVidia K80s**.

The **"gpu-su-a100"** flavor is provided from Lenovo SR670 (2x Intel 8268 2.9 GHz,
48 cores, 384 GB memory, 4x NVidia A100) servers. These latest GPUs deliver
@@ -66,6 +66,25 @@ wall time.
|gpu-su-a100.1 |1 |1 |24 |95 |20 |$1.803 |
|gpu-su-a100.2 |2 |2 |48 |190 |20 |$3.606 |

!!! note "How to set up the NVIDIA driver for a **"gpu-su-a100"** flavor-based VM?"
After launching a VM with an **NVidia A100** GPU flavor, you will need to
set up the NVIDIA driver in order to use GPU-based codes and libraries.
Run the following commands to install the NVIDIA driver and CUDA
version required for this flavor.
**Note:** These commands are **ONLY** applicable to VMs based on the
"**ubuntu-22.04-x86_64**" image. You might need to find the corresponding
packages for your own OS of choice.

sudo apt update
sudo apt -y install nvidia-driver-495
# Just click *Enter* if any popups appear!
# Confirm and verify that you can see the NVIDIA device attached to your VM
lspci | grep -i nvidia
# 00:05.0 3D controller: NVIDIA Corporation GA100 [A100 PCIe 40GB] (rev a1)
sudo reboot
# SSH back to your VM and then you will be able to use nvidia-smi command
nvidia-smi

The **"gpu-su-v100"** flavor is provided from Dell R740xd (2x Intel Xeon Gold 6148,
40 core, 768GB memory, 1x NVidia V100) servers. The base unit is 48 vCPU, 192 GB
memory with default of 20 GB root disk at a rate of $1.214 / hr of wall time.
@@ -74,6 +93,25 @@ memory with default of 20 GB root disk at a rate of $1.214 / hr of wall time.
|---------------|-----|-----|-------|---------|-------------|-----------|
|gpu-su-v100.1 |1 |1 |48 |192 |20 |$1.214 |

!!! note "How to set up the NVIDIA driver for a **"gpu-su-v100"** flavor-based VM?"
After launching a VM with an **NVidia V100** GPU flavor, you will need to
set up the NVIDIA driver in order to use GPU-based codes and libraries.
Run the following commands to install the NVIDIA driver and CUDA
version required for this flavor.
**Note:** These commands are **ONLY** applicable to VMs based on the
"**ubuntu-22.04-x86_64**" image. You might need to find the corresponding
packages for your own OS of choice.

sudo apt update
sudo apt -y install nvidia-driver-470
# Just click *Enter* if any popups appear!
# Confirm and verify that you can see the NVIDIA device attached to your VM
lspci | grep -i nvidia
# 00:05.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 PCIe 32GB] (rev a1)
sudo reboot
# SSH back to your VM and then you will be able to use nvidia-smi command
nvidia-smi

The **"gpu-su-k80"** flavor is provided from Supermicro X10DRG-H (2x Intel
E5-2620 2.40GHz, 24 core, 128GB memory, 4x NVidia K80) servers. The base unit
is 6 vCPU, 31 GB memory with default of 20 GB root disk at a rate of $0.463 /
@@ -85,6 +123,25 @@ hr of wall time.
|gpu-su-k80.2 |2 |2 |12 |62 |20 |$0.926 |
|gpu-su-k80.4 |4 |4 |24 |124 |20 |$1.852 |
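
The hourly rates in the flavor tables translate into a charge by multiplying the rate by the hours of wall time. The following is a minimal sketch of that arithmetic, not an official billing tool: `flavor_cost` is a hypothetical helper name, and the rate is copied from the table above. Storage is charged separately and is not included.

```shell
# Estimate the compute charge for a flavor.
# flavor_cost is a hypothetical helper; it is not part of any NERC tooling.
# Usage: flavor_cost <rate in $ per hour> <hours of wall time>
flavor_cost() {
    awk -v r="$1" -v h="$2" 'BEGIN { printf "%.2f\n", r * h }'
}

# gpu-su-k80.2 at $0.926 / hr for 24 hours of wall time
flavor_cost 0.926 24   # prints 22.22
```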

!!! note "How to set up the NVIDIA driver for a **"gpu-su-k80"** flavor-based VM?"
After launching a VM with an **NVidia K80** GPU flavor, you will need to
set up the NVIDIA driver in order to use GPU-based codes and libraries.
Run the following commands to install the NVIDIA driver and CUDA
version required for this flavor.
**Note:** These commands are **ONLY** applicable to VMs based on the
"**ubuntu-22.04-x86_64**" image. You might need to find the corresponding
packages for your own OS of choice.

sudo apt update
sudo apt -y install nvidia-driver-470
# Just click *Enter* if any popups appear!
# Confirm and verify that you can see the NVIDIA device attached to your VM
lspci | grep -i nvidia
# 00:05.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
sudo reboot
# SSH back to your VM and then you will be able to use nvidia-smi command
nvidia-smi
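
The three driver notes differ only in the package name, so the choice can be expressed as a small lookup. This is a sketch under stated assumptions rather than part of the commit: `nvidia_driver_for_flavor` is a hypothetical function name, and the mapping simply restates the packages listed in the notes for the "ubuntu-22.04-x86_64" image.

```shell
# Map a NERC GPU flavor name to the driver package named in the notes.
# Assumption: the VM uses the ubuntu-22.04-x86_64 image; other images
# need their own packages. The function name is hypothetical.
nvidia_driver_for_flavor() {
    case "$1" in
        gpu-su-a100*)             echo "nvidia-driver-495" ;;
        gpu-su-v100*|gpu-su-k80*) echo "nvidia-driver-470" ;;
        *) echo "unknown GPU flavor: $1" >&2; return 1 ;;
    esac
}

# Example usage on the VM (left commented out so the snippet has no side effects):
#   sudo apt update
#   sudo apt -y install "$(nvidia_driver_for_flavor gpu-su-k80.2)"
```

After the reboot, `nvidia-smi` should list the attached GPU, as in the notes.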

!!! question "NERC IaaS Storage Tiers Cost"
Storage both **OpenStack Swift (object storage)** and
**Cinder (block storage/ volumes)** are charged separately at a rate of
13 changes: 12 additions & 1 deletion docs/openstack/create-and-connect-to-the-VM/launch-a-VM.md
@@ -38,9 +38,20 @@ is 1.

![VM Launch Instance Source](images/launch_source.png)

!!! info "How to override the flavor's Default root disk volume size"
If you don't specify a custom value for "**Volume Size (GB)**", it will
be set to the root disk size of your selected Flavor. For more about the
default root disk size, refer to [this documentation](flavors.md).
You can override this value by entering your own custom value (in GB); that
space is then available as a Volume attached to the instance to enable
persistent storage.

!!! danger "Important Note"
- To create an image that uses the boot volume sized according to the flavor
ensure that "No" is selected under the "Create New Volume" section.
ensure that "No" is selected under the "Create New Volume" section. This will
**NOT** create persistent block storage in the form of a Volume, which can be
risky, so we recommend frequently taking a snapshot of the running instance in
case you need to recover some important state of your instance.

- When you deploy a non-ephemeral instance (i.e. Creating a new volume), and
indicate "Yes" in "Delete Volume on Instance delete", then when you delete
@@ -23,7 +23,7 @@ Default usernames for all the base images are:

- **all Ubuntu images**: ubuntu
- **all CentOS images**: centos
- **all Rocky Linux images**: centos
- **all Rocky Linux images**: rocky
- **all Fedora images**: fedora
- **all Debian images**: debian
- **all RHEL images**: cloud-user