docs: add Dockerfile and .yml for building conda env #528

Merged (13 commits) on Dec 21, 2023
10 changes: 6 additions & 4 deletions Dockerfile
@@ -3,8 +3,8 @@ FROM --platform=linux/x86_64 continuumio/miniconda3:23.10.0-1

# Add files
ADD ./tutorials /home/deeprank2/tutorials
ADD ./env/environment.yml /home
ADD ./env/requirements.txt /home
ADD ./env/environment.yml /home/deeprank2
ADD ./env/requirements.txt /home/deeprank2

# Install
RUN \
@@ -17,8 +17,10 @@ RUN \
mv mkdssp-4.4.0-linux-x64 /usr/local/bin/mkdssp && \
chmod a+x /usr/local/bin/mkdssp && \
## Conda and pip deps
conda env create -f /home/environment.yml && \
conda env create -f /home/deeprank2/environment.yml && \
## Get the data for running the tutorials
if [ -d "/home/deeprank2/tutorials/data_raw" ]; then rm -Rf /home/deeprank2/tutorials/data_raw; fi && \
if [ -d "/home/deeprank2/tutorials/data_processed" ]; then rm -Rf /home/deeprank2/tutorials/data_processed; fi && \
wget https://zenodo.org/records/8349335/files/data_raw.zip && \
unzip data_raw.zip -d data_raw && \
mv data_raw /home/deeprank2/tutorials
@@ -31,4 +33,4 @@ ENV PATH /opt/conda/envs/deeprank2/bin:$PATH
WORKDIR /home/deeprank2

# Define default command
CMD ["bash"]
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--NotebookApp.token=''","--NotebookApp.password=''", "--allow-root"]
63 changes: 36 additions & 27 deletions README.md
@@ -36,12 +36,14 @@ DeepRank2 extensive documentation can be found [here](https://deeprank2.rtfd.io/
- [Deeprank2](#deeprank2)
- [Overview](#overview)
- [Table of contents](#table-of-contents)
- [Installation](#installation)
- [Dockerfile](#dockerfile)
- [Non-pythonic dependencies](#non-pythonic-dependencies)
- [Pythonic dependencies](#pythonic-dependencies)
- [Installations](#installations)
- [Containerized Installation](#containerized-installation)
- [Local/remote installation](#localremote-installation)
- [Non-pythonic dependencies](#non-pythonic-dependencies)
- [Pythonic dependencies](#pythonic-dependencies)
- [Install DeepRank2](#install-deeprank2)
- [Test installation](#test-installation)
- [Contributing](#contributing)
- [Contributing](#contributing)
- [Data generation](#data-generation)
- [Datasets](#datasets)
- [GraphDataset](#graphdataset)
@@ -50,36 +52,39 @@ DeepRank2 extensive documentation can be found [here](https://deeprank2.rtfd.io/
- [Computational performances](#computational-performances)
- [Package development](#package-development)

## Installation
## Installations

Note that the package officially supports ubuntu-latest OS only, whose functioning is widely tested through the continuous integration workflows.

### Dockerfile
You can either install DeepRank2 in a [dockerized container](#containerized-installation), which will allow you to run our [tutorial notebooks](https://github.com/DeepRank/deeprank2/tree/main/tutorials), or you can [install the package locally](#localremote-installation).

In order to try out the package without worrying about your OS and without the need of installing all the required dependencies, we created a `Dockerfile` that can be used for taking care of everything in a suitable container. After having cloned the repository and installed [Docker](https://docs.docker.com/engine/install/), run the following commands from the root of the repository.
### Containerized Installation

In order to try out the package without worrying about your OS and without the need of installing all the required dependencies, we created a `Dockerfile` that can be used for taking care of everything in a suitable container. After having cloned the repository and installed [Docker](https://docs.docker.com/engine/install/), run the following commands (you may need to have sudo permission) from the root of the repository.

Build the Docker image:

```bash
docker build -t deeprank2 .
```

SSH to a running container:
Run the Docker container:

```bash
docker run -it --expose 3000 -p 3000:3000 deeprank2
docker run -p 8888:8888 deeprank2
```

Run the tutorials' notebooks from within the running container:
```bash
cd tutorials
jupyter notebook --ip 0.0.0.0 --no-browser --allow-root --port 3000
```
This assumes that your application inside the container is listening on port 8888, and you want to map it to port 8888 on your host machine. Open a browser and go to `http://localhost:8888` to access the application running inside the Docker container and run the tutorials' notebooks.
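
If port 8888 is already in use on the host, the same mapping can be pointed at a different host port. The following is a minimal sketch, not part of the repository: `HOST_PORT`, `notebook_url` and `run_tutorials` are hypothetical names, and 8889 is an arbitrary example value.

```bash
# Host port to publish the notebook server on; 8889 is an arbitrary example value.
HOST_PORT="${HOST_PORT:-8889}"

# Print the URL to open in the browser for a given host port.
notebook_url() {
    echo "http://localhost:$1"
}

# Start the container, mapping the chosen host port to container port 8888.
run_tutorials() {
    echo "Open $(notebook_url "$HOST_PORT") once the server has started"
    docker run -p "${HOST_PORT}:8888" deeprank2
}
```

Calling `run_tutorials` starts the container; the container side of the mapping stays at 8888 because that is the port the notebook server listens on.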

Now you can run the tutorials' notebooks. More details about their content can be found [here](https://github.com/DeepRank/deeprank2/blob/main/tutorials/TUTORIAL.md). Note that the Docker container downloads only the raw PDB files, which are needed as a starting point for the tutorials. You can obtain the processed HDF5 files by running the `data_generation_xxx.ipynb` notebooks. Because Docker containers are limited in memory resources, we limit the number of data points processed in the tutorials. Please install the package locally to fully leverage its capabilities.
More details about the tutorials' content can be found [here](https://github.com/DeepRank/deeprank2/blob/main/tutorials/TUTORIAL.md). Note that the Docker container downloads only the raw PDB files, which are needed as a starting point for the tutorials. You can obtain the processed HDF5 files by running the `data_generation_xxx.ipynb` notebooks. Because Docker containers are limited in memory resources, we limit the number of data points processed in the tutorials. Please install the package locally to fully leverage its capabilities.

### Non-pythonic dependencies
After running the tutorials, you may want to remove the (quite large) Docker image from your machine. In this case, remember to [stop the container](https://docs.docker.com/engine/reference/commandline/stop/) and then [remove the image](https://docs.docker.com/engine/reference/commandline/image_rm/). More general information about Docker can be found on the [official website docs](https://docs.docker.com/get-started/).

Instructions are updated as of 14/09/2023.
### Local/remote installation

#### Non-pythonic dependencies

Instructions are up to date as of 14/09/2023.

Before installing deeprank2 you need to install some dependencies:

@@ -90,34 +95,39 @@ Before installing deeprank2 you need to install some dependencies:
* [GCC](https://gcc.gnu.org/install/)
* Check if gcc is installed: `gcc --version`. If this gives an error, run `sudo apt-get install gcc`.
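
The GCC check above can be wrapped in a small script, sketched here under the assumption of an Ubuntu system with `apt-get`; `have_tool` is a hypothetical helper name.

```bash
# Report whether a command is available on PATH; returns 0 if found, non-zero otherwise.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool gcc; then
    gcc --version
else
    echo "gcc not found; on Ubuntu it can be installed with: sudo apt-get install gcc"
fi
```

The same helper can be reused for other command-line dependencies by passing a different command name.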

### Pythonic dependencies
#### Pythonic dependencies

Instructions are updated as of 14/09/2023.
Instructions are up to date as of 14/09/2023.

Then, you can use the YML file we provide for creating a [conda environment](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) containing the latest stable release of the package and all the other necessary conda and pip dependencies (CPU only, Python 3.10):

```bash
# Ensure you are in your base environment
conda activate
# Create the environment
conda env create -f env/environment.yml
# Activate the environment
conda activate deeprank2
```

Alternatively, if you are a MacOS user, if the .YML file installation is not successful, or if you want to use CUDA or Python 3.11, you can install each dependency separately, and then the latest stable release of the package using the PyPI package manager. Also in this case, we advise using a [conda environment](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html). In case of issues during installation you should always refer to the official documentation which is linked below:
Alternatively, if you are a MacOS user, if the YML file installation is not successful, or if you want to use CUDA or Python 3.11, you can install each dependency separately, and then the latest stable release of the package using the PyPI package manager. Also in this case, we advise using a [conda environment](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html). In case of issues during installation, please refer to the official documentation for each package (linked below), as our instructions may be out of date:

* [MSMS](https://anaconda.org/bioconda/msms): `conda install -c bioconda msms`.
* [Here](https://ssbio.readthedocs.io/en/latest/instructions/msms.html) for MacOS with M1 chip users.
* [PyTorch](https://pytorch.org/get-started/locally/)
* [PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html) and its optional dependencies: `torch_scatter`, `torch_sparse`, `torch_cluster`, `torch_spline_conv`.
* [PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html) `conda install pyg -c pyg`
* Also install all [optional additions to PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html#installation-from-wheels), namely: `torch_scatter`, `torch_sparse`, `torch_cluster`, `torch_spline_conv`.
* For MacOS with M1 chip users only install [the conda version of PyTables](https://www.pytables.org/usersguide/installation.html).

#### Install DeepRank2

Finally do:

```bash
pip install deeprank2
```

Alternatively, get all the new developments by cloning the repo and installing the editable version of the package with:
Alternatively, get the latest updates by cloning the repo and installing the editable version of the package with:

```bash
git clone https://github.com/DeepRank/deeprank2
@@ -129,17 +139,16 @@ The `test` extra is optional, and can be used to install test-related dependenci

#### Test installation

If you have installed the package from a cloned repository (the latter option above), you can check that all components were installed correctly, using pytest.
If you have installed the package from a cloned repository (the latter option above), you can check that all components were installed correctly, using pytest (run `pip install pytest` if you did not install it above).
The quick test should be sufficient to ensure that the software works, while the full test (a few minutes) will cover a much broader range of settings to ensure everything is correct.

Run `pytest tests/test_integration.py` for the quick test or just `pytest` for the full test (expect a few minutes to run).

### Contributing
## Contributing

If you would like to contribute to the package in any way, please see [our guidelines](CONTRIBUTING.rst).

The following section serves as a first guide to start using the package, using protein-protein Interface (PPI) queries
as example. For an enhanced learning experience, we provide in-depth [tutorial notebooks](https://github.com/DeepRank/deeprank2/tree/main/tutorials) for generating PPI data, generating SVR data, and for the training pipeline.
The following section serves as a first guide to start using the package, using protein-protein Interface (PPI) queries as example. For an enhanced learning experience, we provide in-depth [tutorial notebooks](https://github.com/DeepRank/deeprank2/tree/main/tutorials) for generating PPI data, generating SVR data, and for the training pipeline.
For more details, see the [extended documentation](https://deeprank2.rtfd.io/).

## Data generation
51 changes: 29 additions & 22 deletions docs/installation.md
@@ -1,33 +1,36 @@
# Installation
# Installations

Note that the package officially supports ubuntu-latest OS only, whose functioning is widely tested through the continuous integration workflows.

## Dockerfile
You can either install DeepRank2 in a [dockerized container](#containerized-installation), which will allow you to run our [tutorial notebooks](https://github.com/DeepRank/deeprank2/tree/main/tutorials), or you can [install the package locally](#localremote-installation).

In order to try out the package without worrying about your OS and without the need of installing all the required dependencies, we created a `Dockerfile` that can be used for taking care of everything in a suitable container. After having cloned the repository and installed [Docker](https://docs.docker.com/engine/install/), run the following commands from the root of the repository.
## Containerized Installation

In order to try out the package without worrying about your OS and without the need of installing all the required dependencies, we created a `Dockerfile` that can be used for taking care of everything in a suitable container. After having cloned the repository and installed [Docker](https://docs.docker.com/engine/install/), run the following commands (you may need to have sudo permission) from the root of the repository.

Build the Docker image:

```bash
docker build -t deeprank2 .
```

SSH to a running container:
Run the Docker container:

```bash
docker run -it --expose 3000 -p 3000:3000 deeprank2
docker run -p 8888:8888 deeprank2
```

Run the tutorials' notebooks from within the running container:
```bash
cd tutorials
jupyter notebook --ip 0.0.0.0 --no-browser --allow-root --port 3000
```
This assumes that your application inside the container is listening on port 8888, and you want to map it to port 8888 on your host machine. Open a browser and go to `http://localhost:8888` to access the application running inside the Docker container and run the tutorials' notebooks.

Now you can run the tutorials' notebooks. More details about their content can be found [here](https://github.com/DeepRank/deeprank2/blob/main/tutorials/TUTORIAL.md). Note that the Docker container downloads only the raw PDB files, which are needed as a starting point for the tutorials. You can obtain the processed HDF5 files by running the `data_generation_xxx.ipynb` notebooks. Because Docker containers are limited in memory resources, we limit the number of data points processed in the tutorials. Please install the package locally to fully leverage its capabilities.
More details about the tutorials' content can be found [here](https://github.com/DeepRank/deeprank2/blob/main/tutorials/TUTORIAL.md). Note that the Docker container downloads only the raw PDB files, which are needed as a starting point for the tutorials. You can obtain the processed HDF5 files by running the `data_generation_xxx.ipynb` notebooks. Because Docker containers are limited in memory resources, we limit the number of data points processed in the tutorials. Please install the package locally to fully leverage its capabilities.

## Non-pythonic dependencies
After running the tutorials, you may want to remove the (quite large) Docker image from your machine. In this case, remember to [stop the container](https://docs.docker.com/engine/reference/commandline/stop/) and then [remove the image](https://docs.docker.com/engine/reference/commandline/image_rm/). More general information about Docker can be found on the [official website docs](https://docs.docker.com/get-started/).

Instructions are updated as of 14/09/2023.
## Local/remote installation

### Non-pythonic dependencies

Instructions are up to date as of 14/09/2023.

Before installing deeprank2 you need to install some dependencies:

@@ -38,34 +41,39 @@ Before installing deeprank2 you need to install some dependencies:
* [GCC](https://gcc.gnu.org/install/)
* Check if gcc is installed: `gcc --version`. If this gives an error, run `sudo apt-get install gcc`.

## Pythonic dependencies
### Pythonic dependencies

Instructions are updated as of 14/09/2023.
Instructions are up to date as of 14/09/2023.

Then, you can use the YML file we provide for creating a [conda environment](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) containing the latest stable release of the package and all the other necessary conda and pip dependencies (CPU only, Python 3.10):

```bash
# Ensure you are in your base environment
conda activate
# Create the environment
conda env create -f env/environment.yml
# Activate the environment
conda activate deeprank2
```

Alternatively, if you are a MacOS user, if the .YML file installation is not successful, or if you want to use CUDA or Python 3.11, you can install each dependency separately, and then the latest stable release of the package using the PyPI package manager. Also in this case, we advise using a [conda environment](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html). In case of issues during installation you should always refer to the official documentation which is linked below:
Alternatively, if you are a MacOS user, if the YML file installation is not successful, or if you want to use CUDA or Python 3.11, you can install each dependency separately, and then the latest stable release of the package using the PyPI package manager. Also in this case, we advise using a [conda environment](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html). In case of issues during installation, please refer to the official documentation for each package (linked below), as our instructions may be out of date:

* [MSMS](https://anaconda.org/bioconda/msms): `conda install -c bioconda msms`.
* [Here](https://ssbio.readthedocs.io/en/latest/instructions/msms.html) for MacOS with M1 chip users.
* [PyTorch](https://pytorch.org/get-started/locally/)
* [PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html) and its optional dependencies: `torch_scatter`, `torch_sparse`, `torch_cluster`, `torch_spline_conv`.
* [PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html) `conda install pyg -c pyg`
* Also install all [optional additions to PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html#installation-from-wheels), namely: `torch_scatter`, `torch_sparse`, `torch_cluster`, `torch_spline_conv`.
* For MacOS with M1 chip users only install [the conda version of PyTables](https://www.pytables.org/usersguide/installation.html).

#### Install DeepRank2

Finally do:

```bash
pip install deeprank2
```

Alternatively, get all the new developments by cloning the repo and installing the editable version of the package with:
Alternatively, get the latest updates by cloning the repo and installing the editable version of the package with:

```bash
git clone https://github.com/DeepRank/deeprank2
@@ -77,15 +85,14 @@ The `test` extra is optional, and can be used to install test-related dependenci

### Test installation

If you have installed the package from a cloned repository (the latter option above), you can check that all components were installed correctly, using pytest.
If you have installed the package from a cloned repository (the latter option above), you can check that all components were installed correctly, using pytest (run `pip install pytest` if you did not install it above).
The quick test should be sufficient to ensure that the software works, while the full test (a few minutes) will cover a much broader range of settings to ensure everything is correct.

Run `pytest tests/test_integration.py` for the quick test or just `pytest` for the full test (expect a few minutes to run).

## Contributing
# Contributing

If you would like to contribute to the package in any way, please see [our guidelines](CONTRIBUTING.rst).

The following section serves as a first guide to start using the package, using protein-protein Interface (PPI) queries
as example. For an enhanced learning experience, we provide in-depth [tutorial notebooks](https://github.com/DeepRank/deeprank2/tree/main/tutorials) for generating PPI data, generating SVR data, and for the training pipeline.
The following section serves as a first guide to start using the package, using protein-protein Interface (PPI) queries as example. For an enhanced learning experience, we provide in-depth [tutorial notebooks](https://github.com/DeepRank/deeprank2/tree/main/tutorials) for generating PPI data, generating SVR data, and for the training pipeline.
For more details, see the [extended documentation](https://deeprank2.rtfd.io/).