
Merge pull request #98 from christophbrgr/master
adds support for multiple inputs via json files
christophbrgr authored Sep 19, 2019
2 parents 649e041 + 3d55c28 commit 31fd85a
Showing 35 changed files with 867 additions and 356 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -9,3 +9,6 @@
 .vscode
 *.code-workspace
 /framework/htmlcov/
+keras-modelhub:1.0.3
+pytorch-modelhub*
+modelhub:test
2 changes: 1 addition & 1 deletion .travis.yml
@@ -5,7 +5,7 @@ python:
 cache: pip
 install:
 - pip install flask-cors==3.0.6
-- pip install Flask==0.12.2
+- pip install Flask==0.12.3
 - pip install numpy
 - pip install Pillow==5.1.0
 - pip install SimpleITK==1.1.0
3 changes: 2 additions & 1 deletion Dockerfile_modelhub
@@ -36,4 +36,5 @@ WORKDIR /output
 WORKDIR /contrib_src
 
 # Run /data/run.py when the container launches
-CMD ["python", "run.py"]
+# If you need Python2, change this command:
+CMD ["python3", "run.py"]
8 changes: 4 additions & 4 deletions docs/source/conf.py
@@ -59,17 +59,17 @@
 
 # General information about the project.
 project = u'modelhub'
-copyright = u'2018, Ahmed Hosny, Michael Schwier'
-author = u'Ahmed Hosny, Michael Schwier'
+copyright = u'2019, Ahmed Hosny, Michael Schwier'
+author = u'Ahmed Hosny, Michael Schwier, Christoph Berger'
 
 # The version info for the project you're documenting, acts as replacement for
 # |version| and |release|, also used in various other places throughout the
 # built documents.
 #
 # The short X.Y version.
-version = u'0.3'
+version = u'0.4'
 # The full version, including alpha/beta/rc tags.
-release = u'0.3.0'
+release = u'0.4.0'
 
 # The language for content autogenerated by Sphinx. Refer to documentation
 # for a list of supported languages.
21 changes: 14 additions & 7 deletions docs/source/contribute.md
@@ -1,12 +1,12 @@
 ## Contribute Your Model to Modelhub
 
-The following figure gives an overview of the necessary steps to packaging your model
+The following figure gives an overview of the necessary steps to package your model
 with the Modelhub framework and eventually contributing it to the Modelhub collection.
 Read further for detailed explanations of all steps.
 
 <img width="75%" alt="modelhub contribution steps" src="https://raw.githubusercontent.com/modelhub-ai/modelhub-engine/master/docs/source/images/contribution_process.png">
 
-**_HINT_** Take a look at an already integrated model to understand how it looks when finished ([AlexNet](https://github.com/modelhub-ai/AlexNet) is a good and simple example).
+**_HINT_** Take a look at an already integrated model to understand how it looks when finished ([AlexNet](https://github.com/modelhub-ai/AlexNet) is a good and simple example). If you have a more complex model with more than one input for a single inference, have a look at one of the BraTS models, e.g. [lfb-rwth](https://github.com/modelhub-ai/lfb-rwth).
 
 ### Prerequisites
 
@@ -15,12 +15,13 @@ To package a model with our framework you need to have the following prerequisit
 - Python 2.7 or Python 3.6 (or higher)
 - [Docker](https://docs.docker.com/install/)
 - Clone of the [modelhub-engine repository](https://github.com/modelhub-ai/modelhub-engine.git) (`git clone https://github.com/modelhub-ai/modelhub-engine.git`)
+- For GPU support, you need Docker version >= 19.03 and must follow the [instructions here](https://github.com/NVIDIA/nvidia-docker#quickstart).
 <br/><br/>

 ### 1. Prepare Docker image
 
 1. Write a dockerfile preparing/installing all third party dependencies your model needs
-(e.g. the deep learning library you are using). Use the `ubuntu:16.04` Docker image as base.
+(e.g. the deep learning library you are using). Use the `ubuntu:16.04` Docker image as base. If you want to use CUDA and GPU acceleration, you can also use one of the `nvidia/cuda` images as base.
 
 You can find examples of dockerfiles for DL environments in the model repositories of
 [modelhub-ai on github](https://github.com/modelhub-ai) (e.g. for [squeezenet](https://github.com/modelhub-ai/squeezenet/blob/master/dockerfiles/caffe2)).
@@ -43,7 +44,7 @@ To package a model with our framework you need to have the following prerequisit
 (required if you want to publish your model on Modelhub, so the image can
 be found when starting a model for the first time. If you don't plan to publish on Modelhub, this step is optional).
 
-- **_NOTE_** We are planning to provide a few pre-build Docker images for the most common deep
+- **_NOTE_** We are planning to provide a few pre-built Docker images for the most common deep
 learning frameworks, so you do not have to build them yourself. For now we only have a small set.
 You can find the existing
 [pre-build images on DockerHub](https://hub.docker.com/u/modelhub/) - use the ones that end with '-modelhub' (the ones that don't end with '-modelhub' have only the pure DL environment without
@@ -69,6 +70,9 @@ To package a model with our framework you need to have the following prerequisit
 4. Populate the configuration file _contrib_src/model/config.json_ with the relevant information about your model.
 Please refer to the [schema](https://github.com/modelhub-ai/modelhub/blob/master/config_schema.json) for
 allowed values and structure.
+<br/>
+Version 0.4 and up breaks compatibility with older versions of the schema; please validate your configuration file against the current schema if you are submitting a new model. Old models remain compatible and don't need to be changed unless you update the modelhub-engine version of the Docker image. For single-input models, assign the key `"single"` to your input as in the schema above. <br/><br/>
+**_HINT_** For more details on how to set up your model for various input scenarios and implement your own ImageLoader class, see the [IO Configuration documentation](https://modelhub.readthedocs.io/en/latest/modelio.html).
 <br/><br/>
 
 5. Place your pre-trained model file(s) into the _contrib_src/model/_ folder.
@@ -80,7 +84,7 @@ To package a model with our framework you need to have the following prerequisit
 
 7. Open _contrib_src/inference.py_ and replace the model initialization and inference with your
 model specific code. The template example shows how to integrate models in ONNX format and running
-them in caffe2. If your are using a different model format and/or backend you have to change this.
+them in caffe2. If you are using a different model format and/or backend you have to change this.
 
 There are only two lines you have to modify. In the `__init__` function change the following line,
 which loads the model:
@@ -90,6 +94,8 @@ To package a model with our framework you need to have the following prerequisit
 self._model = onnx.load('model/model.onnx')
 ```
 
+If your model receives more than one file as input, the `input` argument of `infer` is a dictionary matching the input schema specified in `config.json`. You would then need to pass each individual input through preprocessing and on to your inference function. For example, accessing the input `image_pose` would look like this: `input["image_pose"]["fileurl"]`.
+<br/>
 In the `infer` function change the following line, which runs the model prediction on the input data:
 
 ```python
@@ -239,7 +245,8 @@ To package a model with our framework you need to have the following prerequisit

 2. Run `python start.py YOUR_MODEL_FOLDER_NAME` and check if the web app for your model looks and
 works as expected. **TODO:** Add info on how to use the web app, because the command just
-starts the REST API, which the web frontend is accessing.
+starts the REST API, which the web frontend is accessing. <br/>
+**_NOTE_** If your code uses CUDA on a GPU, you have to add the `-g` flag to `start.py` to enforce the use of the GPU version of Docker. This is only required for testing; once your model is added to the index, the right mode (GPU or CPU) is automatically queried. Run `python start.py -h` for more info.
 <br/><br/>
 
 3. Run `python start.py YOUR_MODEL_FOLDER_NAME -e` and check if the jupyter notebook _contrib_src/sandbox.ipynb_
@@ -270,7 +277,7 @@ To package a model with our framework you need to have the following prerequisit
 1. `git clone https://github.com/modelhub-ai/modelhub.git` (or update if you cloned already).
 <br/><br/>
 
-2. Add your model to the model index list _models.json_.
+2. Add your model to the model index list _models.json_. If your model needs a GPU to run, add `"gpu" : true` to the parameters for your model. This tells the start script to run the model with GPU acceleration.
 <br/><br/>
 
 3. Send us a pull request.
9 changes: 5 additions & 4 deletions docs/source/index.rst
@@ -6,12 +6,12 @@
 Welcome to Modelhub's documentation!
 ====================================
 
 Crowdsourced through contributions by the scientific research community,
 modelhub is a repository of deep learning models pretrained for a wide variety
 of medical applications. Modelhub highlights recent trends in deep learning
 applications, enables transfer learning approaches and promotes reproducible science.
 
 .. note:: This documentation should contain all essential technical information
    about the Modelhub project and how to contribute models. It is, however,
    still work-in-progress, so possibly you need to be a little patient
    and persistent. If you find anything unclear, need help, or have
@@ -32,6 +32,7 @@ Contents:
 
    quickstart
    overview
    contribute
+   modelio
    modelhubapi
    modelhublib
74 changes: 74 additions & 0 deletions docs/source/modelio.md
@@ -0,0 +1,74 @@
## Modelhub IO Configuration

### Input Configuration for Single Inputs

#### As a User
If the model only requires a single image or other type of file for inference, you can simply pass a URL or a path to a local file to the API. For example, you can detect objects using YOLO-v3 by running `python start.py yolo-v3` and then use the API like this:
```
http://localhost:80/api/predict?fileurl=http://example.org/cutedogsandcats.jpg
```
The API then returns the prediction in the specified format. For a thorough description of the API, have a look at its [documentation](https://modelhub.readthedocs.io/en/latest/modelhubapi.html). <br/><br/>
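For example, the same request issued from Python — a sketch using the third-party `requests` package; the file URL is a placeholder and the model must already be running:
```python
import requests

# Ask a running model for a prediction on a single input file.
# The query parameter mirrors the REST call shown above.
response = requests.get(
    "http://localhost:80/api/predict",
    params={"fileurl": "http://example.org/cutedogsandcats.jpg"},
)
print(response.json())  # prediction in the model's configured output format
```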

#### As a Collaborator submitting a new Model
For single inputs, please create a configuration for your model according to the [example configuration](https://github.com/modelhub-ai/modelhub/blob/master/example_config_single_input.json). It is important that you keep the key `"single"` in the config, as the API uses this for accessing the dimension constraints when loading an image. Populate the rest of the configuration file as stated in the contribution guide and the [schema](https://github.com/modelhub-ai/modelhub/blob/master/config_schema.json). Validate your config file against our config schema with a JSON validator, e.g. [this one](https://www.jsonschemavalidator.net).<br/>
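For orientation, a minimal sketch of the input block for a single-input model — apart from the `"single"` key, the field names and values here are illustrative assumptions, so always validate your file against the linked schema:
```json
{
    "input": {
        "single": {
            "format": ["image/png", "image/jpeg"],
            "dim_limits": [
                {"min": 1, "max": 4},
                {"min": 32},
                {"min": 32}
            ]
        }
    }
}
```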
Take care to choose the right MIME type for your input; this format will be checked by the API when users call the predict function and load a file. We support a few extra MIME types in addition to the standard MIME types:
| MIME type | File extension | Description |
| --- | --- | --- |
| `"application/nii"` | .nii | Nifti-1 image |
| `"application/nii-gzip"` | .nii.gz | Compressed Nifti-1 image |
| `"application/nrrd"` | .nrrd | NRRD image |
| `"application/octet-stream"` | .npy | Numpy Array File |


<br/><br/>
If you need other types not supported in the standard MIME types and by our extension, please open an [issue on Github](https://github.com/modelhub-ai/modelhub/issues).
<br/><br/>

### Input Configuration for Multiple Inputs

#### As a User
When you use a model that needs more than a single input file for a prediction, you have to pass a JSON file with all the inputs needed for that model. You can have a look at an example [here](https://github.com/modelhub-ai/modelhub/blob/master/example_input_file_multiple_inputs.json). <br/>
The important points to keep in mind are (a minimal sketch follows after this list):
- There has to be a `format` key with `"application/json"` so that the API can handle the file
- Each of the other keys describes one input and has to have a `format` (see the MIME types above) and a `fileurl`
<br/><br/>
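A sketch of such an input file — the input keys (`t1`, `flair`) and file paths are illustrative and must match the keys in the model's configuration:
```json
{
    "format": "application/json",
    "t1": {
        "format": "application/nii-gzip",
        "fileurl": "/data/patient1/t1.nii.gz"
    },
    "flair": {
        "format": "application/nii-gzip",
        "fileurl": "http://example.org/patient1/flair.nii.gz"
    }
}
```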
The `fileurl` can contain a path to a local file (which has to be accessible by the Docker container running the model) or a URL to a file on the web. The REST API can handle both, including a mixture of local and web links, while the Python API can only access local paths. <br/>
Passing an input file to the REST API would then look like this:
```
http://localhost:80/api/predict?fileurl=http://example.org/fourimagesofdogs.json
```
For a thorough description of the API, have a look at its [documentation](https://modelhub.readthedocs.io/en/latest/modelhubapi.html).
<br/><br/>


#### As a Collaborator submitting a new Model
For multiple inputs, please create a configuration for your model according to the [example configuration](https://github.com/modelhub-ai/modelhub/blob/master/example_config_multiple_inputs.json). The `format` key has to be present at the `input` level and must be `application/json`, as all input files are passed to the API in a single JSON file.
<br/>
The other keys stand for one input file each and must contain a valid format (e.g. `application/dicom`) and dimensions. You can additionally add a description for the input.
<br/>
Populate the rest of the configuration file as stated in the contribution guide and the [schema](https://github.com/modelhub-ai/modelhub/blob/master/config_schema.json). Validate your config file against our config schema with a JSON validator, e.g. [this one](https://www.jsonschemavalidator.net).<br/><br/>
To access the files passed to your model in the `infer` function, use the keys you specified in the configuration and in the input json file. For example, suppose you have an input with key `t1`: you can access the path to the file in `infer` through the passed dictionary: `input["t1"]["fileurl"]`. This way you can always be sure that you are accessing the right file. <br/><br/>
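To make this concrete, a minimal sketch of an `infer` implementation along these lines — the `t1` key matches the example above, while the preprocessing and prediction calls are placeholders you would replace with your own code:
```python
def infer(self, input):
    # 'input' mirrors the input json file, e.g.
    # {"format": "application/json", "t1": {"format": ..., "fileurl": ...}}
    t1_path = input["t1"]["fileurl"]
    # placeholders: load/preprocess the file and run your model on it
    t1_array = self._imageProcessor.loadAndPreprocess(t1_path)
    return self._run_model(t1_array)  # hypothetical helper
```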
**_HINT_** You can implement additional classes for loading your images by writing your own class that extends the `ImageLoader` class and adding it to the chain of responsibility for loading the images; a sketch follows below. One good example is the [lfb-rwth-brats](https://github.com/modelhub-ai/lfb-rwth-brats) model.
<br/><br/>
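A rough sketch of such a custom loader — the import path and the `_load` hook are assumptions about the modelhublib interface, so check the framework source and the lfb-rwth-brats example for the actual class layout:
```python
import numpy as np
from modelhublib.imageloaders import ImageLoader  # import path assumed


class MyFormatLoader(ImageLoader):
    """Hypothetical loader for a file format the stock loaders do not cover."""

    def _load(self, input):
        # hook name assumed: read the file at 'input' and return an object
        # your image processor understands, e.g. a numpy array
        return np.fromfile(input, dtype=np.float32)
```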
Additionally, mismatches between the config file and the input file the user passes to the API are automatically checked before the input is passed to your model.
<br/><br/>
**_HINT_** Check out existing models with multiple inputs to see how they implemented the input handling of multiple inputs, for example one of the BraTS models, e.g. [lfb-rwth-brats](https://github.com/modelhub-ai/lfb-rwth-brats).
3 changes: 2 additions & 1 deletion docs/source/quickstart.md
@@ -7,7 +7,8 @@ But since you are here, follow these steps to get modelhub running on your local
 1. **Install Docker** (if not already installed)
 
 Follow the [official Docker instructions](https://docs.docker.com/install/) to install Docker CE.
-Docker is required to run models.
+Docker is required to run models.<br/>
+**GPU Support**: If you want to run models that require GPU acceleration, please use Docker version >= 19.03 and follow the installation instructions for the [Nvidia-Docker toolkit](https://github.com/NVIDIA/nvidia-docker#quickstart).
 <br/><br/>
 
 2. **Install Python 2.7 or 3.6 (or higher)** (if not already installed)
46 changes: 45 additions & 1 deletion framework/modelhubapi/pythonapi.py
@@ -91,6 +91,9 @@ def predict(self, input_file_path, numpyToFile=True, url_root=""):
         Args:
             input_file_path (str): Path to input file to run inference on.
+                Either a direct input file or a json containing paths to all
+                input files needed for the model to predict. The appropriate
+                structure for the json can be found in the documentation.
             numpyToFile (bool): Only effective if prediction is a numpy array.
                 Indicates if numpy outputs should be saved and a path to it is
                 returned. If false, a json-serializable list representation of
@@ -108,7 +111,8 @@ def predict(self, input_file_path, numpyToFile=True, url_root=""):
         try:
             config = self.get_config()
             start = time.time()
-            output = self.model.infer(input_file_path)
+            input = self._unpack_inputs(input_file_path)
+            output = self.model.infer(input)
             output = self._correct_output_list_wrapping(output, config)
             end = time.time()
             output_list = []
@@ -134,12 +138,46 @@ def predict(self, input_file_path, numpyToFile=True, url_root=""):
                 }
             }
         except Exception as e:
+            print(e)
             return {'error': repr(e)}
 
 
     # -------------------------------------------------------------------------
     # Private helper functions
     # -------------------------------------------------------------------------
 
+    def _unpack_inputs(self, file_path):
+        """
+        This utility function returns a dictionary with the inputs if a
+        json file with multiple input files is specified; otherwise it just
+        returns the file_path unchanged for single inputs.
+        It also converts the fileurl to a valid string (avoids html escaping).
+        """
+        if file_path.lower().endswith('.json'):
+            input_dict = self._load_json(file_path)
+            for key, value in input_dict.items():
+                if key == "format":
+                    continue
+                input_dict[key]["fileurl"] = str(value["fileurl"])
+            return self._check_input_compliance(input_dict)
+        else:
+            return file_path
+
+
+    def _check_input_compliance(self, input_dict):
+        """
+        Checks if the input dictionary has all the files needed as specified
+        in the model config file and raises an IOError if not.
+        * TODO: Check the other way round?
+        """
+        config = self.get_config()["model"]["io"]["input"]
+        for key in config.keys():
+            if key not in input_dict:
+                raise IOError("The input json does not match the input schema in the "
+                              "configuration file")
+        return input_dict
+
+
     def _load_txt_as_dict(self, file_path, return_key):
         try:
             with io.open(file_path, mode='r', encoding='utf-8') as f:
@@ -157,6 +195,12 @@ def _load_json(self, file_path):
         except Exception as e:
             return {'error': str(e)}
 
+    def _write_json(self, file_path, output_dict):
+        try:
+            with open(file_path, mode='w') as f:
+                json.dump(output_dict, f, ensure_ascii=False)
+        except Exception as e:
+            return {'error': str(e)}
 
     def _correct_output_list_wrapping(self, output, config):
         if not isinstance(output, list):
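To round this off, a hypothetical usage sketch of the Python API with a multi-input json file — the class name, constructor arguments, and paths are assumptions for illustration only, so consult the framework source for the real signatures:
```python
from modelhubapi.pythonapi import ModelHubAPI  # import path as in this diff
from inference import Model  # hypothetical contributor model class

api = ModelHubAPI(Model(), contrib_src_dir="/contrib_src")  # signature assumed
# predict() accepts either a direct input file or a json file that maps
# each configured input key to a format and fileurl
result = api.predict("/data/inputs.json")
print(result)
```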