phenology/infrastructure-AWS-EMR

Amazon EMR with Dask

This repository contains scripts and documentation for deploying an Amazon EMR cluster with Dask to run the high-spatial-resolution modelling.

  • spark-emr-hello-world: a quick-start example of setting up an Amazon EMR cluster with Spark using the AWS CLI and running a test job.
  • dask-emr-hello-world: a quick-start example of setting up an Amazon EMR cluster with Dask using the AWS CLI and running a test job.
  • conda-pack: setting up an Amazon EMR cluster with pre-installed packages.
  • dask-ecs: setting up Dask with Amazon ECS and Docker.
  • dask-emr-terraform: a Terraform module for Amazon EMR with Dask.
  • cgc-test: a test analysis notebook that also illustrates how to set up the environment.
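
To check what a month of usage cost, query AWS Cost Explorer from the AWS CLI. The End date is exclusive, so End=2021-04-01 is needed to cover all of March 2021:

aws ce get-cost-and-usage --time-period Start=2021-03-01,End=2021-04-01 --metrics "BlendedCost" "UnblendedCost" "UsageQuantity" --granularity MONTHLY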
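
The --time-period for a whole calendar month can be derived rather than typed by hand. The following is a minimal sketch and not part of the repository: the helper names month_period and cost_command are hypothetical, and it assumes Cost Explorer treats End as exclusive, so the period ends on the first day of the following month.

```python
from datetime import date
from typing import List, Tuple


def month_period(year: int, month: int) -> Tuple[str, str]:
    """Return (start, end) ISO dates for one calendar month; end is exclusive."""
    start = date(year, month, 1)
    # Roll over to January of the next year after December.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start.isoformat(), end.isoformat()


def cost_command(year: int, month: int) -> List[str]:
    """Build the argument list for the cost query shown above (hypothetical helper)."""
    start, end = month_period(year, month)
    return [
        "aws", "ce", "get-cost-and-usage",
        "--time-period", f"Start={start},End={end}",
        "--metrics", "BlendedCost", "UnblendedCost", "UsageQuantity",
        "--granularity", "MONTHLY",
    ]


if __name__ == "__main__":
    # Print the command for March 2021 so it can be pasted into a shell.
    print(" ".join(cost_command(2021, 3)))
```

The argument list can also be passed to subprocess.run on a machine where the AWS CLI is installed and configured with credentials.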

Example pricing:

  • An on-demand m5.xlarge EC2 instance costs 0.192 USD/hour, plus an EMR charge of 0.048 USD/hour per instance. A 3-node EMR cluster therefore costs 0.72 USD/hour, or 17.28 USD/day.
  • An on-demand m5.24xlarge EC2 instance costs 4.608 USD/hour, plus an EMR charge of 0.27 USD/hour per instance. A 3-node EMR cluster therefore costs 14.63 USD/hour, or 351.22 USD/day.
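
The arithmetic behind these totals is (EC2 price + EMR price) × number of nodes, multiplied by 24 for a full day. As a minimal sketch, the function below reproduces it; the name cluster_cost is hypothetical, and the prices are the ones quoted above, which may differ by region and over time.

```python
from typing import Tuple


def cluster_cost(ec2_usd_per_hour: float,
                 emr_usd_per_hour: float,
                 nodes: int = 3) -> Tuple[float, float]:
    """Return (hourly, daily) cost in USD for a cluster of identical nodes."""
    # Each node is billed the EC2 price plus the EMR charge.
    hourly = (ec2_usd_per_hour + emr_usd_per_hour) * nodes
    daily = hourly * 24
    return hourly, daily


if __name__ == "__main__":
    # Prices taken from the examples above (USD/hour per instance).
    examples = {
        "m5.xlarge": (0.192, 0.048),
        "m5.24xlarge": (4.608, 0.27),
    }
    for name, (ec2, emr) in examples.items():
        hourly, daily = cluster_cost(ec2, emr)
        print(f"{name}: {hourly:.2f} USD/hour, {daily:.2f} USD/day")
```

Changing nodes gives the estimate for a larger or smaller cluster of the same instance type.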

About

Scripts to set up an Amazon EMR cluster
