An R Package for the Maximum Likelihood Evaluation on Large-Scale Spatial Datasets using Many-core Systems.

ExaGeoStatR

ExaGeoStatR is an R wrapper for the ExaGeoStat framework (https://github.com/ecrc/exageostat), a parallel, high-performance unified software package for geostatistics on manycore systems.

ExaGeoStatR v1.2.0

  1. Large-scale synthetic geostatistics data generator.
  2. Support for exact computation of the Maximum Likelihood Estimation (MLE) function on shared-memory, GPU, or distributed-memory systems.
  3. Support for exact prediction.
  4. Support for approximate computation (i.e., Diagonal Super-Tile (DST) and Tile Low-Rank (TLR)) of the Maximum Likelihood Estimation (MLE) function on shared-memory, GPU, or distributed-memory systems.

Getting Started

Installation

Software dependencies

  1. An optimized BLAS/CBLAS/LAPACK/LAPACKE implementation, e.g., AMD Core Math Library (ACML), Arm Performance Libraries, ATLAS, Intel Math Kernel Library (MKL), or OpenBLAS.
  2. Portable Hardware Locality (hwloc).
  3. NLopt.
  4. GNU Scientific Library (GSL).
  5. StarPU.
  6. Chameleon.
  7. HiCMA.

As of ExaGeoStatR v1.2.0, any of these dependencies that are not already present on the system are installed automatically with the package (OpenBLAS is the default BLAS library).

Prerequisites

We recommend installing the following R packages before you begin, so that all of the R examples can run.

To install devtools, type at the R prompt:

install.packages("devtools")

To install geoR, type at the R prompt:

install.packages("geoR")

Or

  1. Download the latest geoR version (*.tar.gz) from http://www.leg.ufpr.br/geoR

  2. Install from the Linux prompt (with root/sudo permissions), replacing "*" below with the current version number.

R CMD INSTALL geoR*.tar.gz

To install fields, type at the R prompt:

install.packages("fields", dependencies = TRUE)

To install spam, type at the R prompt:

install.packages("spam")

To install GpGp, type at the R prompt:

install.packages("GpGp")

Install latest ExaGeoStatR version hosted on GitHub (parallel installation)

library("devtools")
Sys.setenv(MKLROOT="/opt/intel/mkl")
install_git(url="https://github.com/ecrc/exageostatR")

Install latest ExaGeoStatR version hosted on GitHub (sequential installation)

library("devtools")
Sys.setenv(MKLROOT="/opt/intel/mkl")
Sys.setenv(MAKE="make -j 1")
install_git(url="https://github.com/ecrc/exageostatR")

Install latest ExaGeoStatR version hosted on GitHub with GPU support

library("devtools")
Sys.setenv(MKLROOT="/opt/intel/mkl")
install_git(url="https://github.com/ecrc/exageostatR", configure.args=c('--enable-cuda'))

Install latest ExaGeoStatR version hosted on GitHub with MPI support

library("devtools")
Sys.setenv(MKLROOT="/opt/intel/mkl")
install_git(url="https://github.com/ecrc/exageostatR", configure.args=c('--enable-mpi'))

Get the latest ExaGeoStatR release hosted on GitHub

  1. Download exageostat_1.2.0.tar.gz from the release.
  2. Use R to install exageostat_1.2.0.tar.gz:
install.packages("exageostat_1.2.0.tar.gz", repos=NULL)

Features of ExaGeoStatR

Operations:

  1. Generate synthetic spatial datasets (i.e., locations & environmental measurements).
  2. Maximum likelihood evaluation using dense matrices.
  3. Maximum likelihood evaluation using compressed matrices based on Tile Low-Rank (TLR).
  4. Maximum likelihood evaluation using matrices based on Diagonal Super-Tile (DST).
  5. Predicting missing values on predefined spatial locations.

Supported Covariance Functions

  1. Univariate Matérn (Gaussian/Stationary)
  2. Univariate Matérn with Nugget (Gaussian/Stationary)
  3. Flexible Bivariate Matérn (Gaussian/Stationary)
  4. Parsimonious Bivariate Matérn (Gaussian/Stationary)
  5. Parsimonious Trivariate Matérn (Gaussian/Stationary)
  6. Univariate Space/Time Matérn (Gaussian/Stationary)
  7. Bivariate Space/Time Matérn (Gaussian/Stationary)
  8. Tukey g-and-h Univariate Matérn (non-Gaussian/Stationary)
  9. Tukey g-and-h Univariate Power Exponential (non-Gaussian/Stationary)

R Examples

Users can find many test examples in the tests directory.

Example of a Batch Job Script to Submit an R Script to a Distributed Environment

#!/bin/bash
#SBATCH --job-name=job_name
#SBATCH --output=output_file.txt
#SBATCH --partition=XXXX
#SBATCH --nodes=4
#SBATCH --ntasks=4
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=31
#SBATCH --time 00:30:00

# RExample.r includes one of the examples in the `tests` directory.
srun Rscript RExample.r

Known Issues

Intel MKL FATAL ERROR

Error
Intel MKL FATAL ERROR: Cannot load libmkl_avx512.so or libmkl_def.so

or

symbol lookup error: /opt/PATH_TO_MKL/lib/intel64/libmkl_intel_thread.so: undefined symbol: __kmpc_global_thread_num

Solution

Intel community solution

We recommend the following solution:

  1. Navigate to the MKL installation directory and find the exact path of libmkl_def.so.
  2. Export LD_PRELOAD with the MKL libraries listed below, replacing PATH_TO_MKL with that path.
export LD_PRELOAD=/PATH_TO_MKL/lib/intel64/libmkl_def.so:/PATH_TO_MKL/lib/intel64/libmkl_avx2.so:/PATH_TO_MKL/lib/intel64/libmkl_core.so:/PATH_TO_MKL/lib/intel64/libmkl_intel_lp64.so:/PATH_TO_MKL/lib/intel64/libmkl_intel_thread.so:/PATH_TO_MKL/lib/intel64/libiomp5.so
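
The preload list above can be assembled programmatically, which avoids typing the MKL path six times. The following is only a minimal sketch: the function name build_mkl_preload and the variable MKL_LIB_DIR are hypothetical and are not part of ExaGeoStatR or MKL. It assumes all six libraries named above sit in one directory, and it skips (with a warning) any that are missing rather than preloading a path that does not exist.

```shell
# Hypothetical helper (not part of ExaGeoStatR or MKL): print a
# colon-separated LD_PRELOAD value built from the libraries listed in
# the solution above that actually exist under the given directory.
build_mkl_preload() {
    local lib_dir="$1"
    local preload=""
    local lib
    for lib in libmkl_def.so libmkl_avx2.so libmkl_core.so \
               libmkl_intel_lp64.so libmkl_intel_thread.so libiomp5.so; do
        if [ -f "$lib_dir/$lib" ]; then
            # Add a colon only when the list is already non-empty.
            preload="${preload:+$preload:}$lib_dir/$lib"
        else
            echo "warning: $lib_dir/$lib not found, skipping" >&2
        fi
    done
    printf '%s\n' "$preload"
}

# Usage (MKL_LIB_DIR is a placeholder for your own MKL library directory):
#   export LD_PRELOAD="$(build_mkl_preload "$MKL_LIB_DIR")"
```

If the helper warns that a library was skipped (on some installations libiomp5.so may live outside the MKL directory), locate that file separately and append its full path to LD_PRELOAD by hand.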

Notes

  1. The data directory includes datasets from the "Competition on Spatial Statistics for Large Datasets" manuscript.
  2. The geoR, fields, spam, and GpGp packages are only required to run some examples related to the benchmarking framework.
