How to setup GERBIL
This page describes the steps needed to set up a local GERBIL system. Note that you do not need to do this if you only want to test an annotator or a dataset, since that can be done using our online system.
- Java 1.7 or newer
- Maven 2
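Before building, it can help to confirm that both tools are available. The following is a minimal sketch, not part of GERBIL: `require_cmd` and `check_gerbil_prereqs` are hypothetical helper names, and it assumes the tools are installed under their usual executable names `java` and `mvn`. It only checks that the commands are on the PATH; the versions still have to be inspected by hand.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named command is on the PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "missing prerequisite: $1" >&2
    return 1
}

# Check both prerequisites (assumed executable names: java, mvn).
# Versions are not parsed here; use `java -version` and `mvn -version`.
check_gerbil_prereqs() {
    status=0
    require_cmd java || status=1
    require_cmd mvn || status=1
    return "$status"
}
```

After sourcing the snippet, `check_gerbil_prereqs && echo "prerequisites found"` reports whether both commands were located.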
The easiest way to get GERBIL is to download it from GitHub.
git clone -b master https://github.com/AKSW/gerbil.git
If you are using Linux, you can simply run start.sh, which will download the data, extract it, and start the system listening on http://localhost:1234/gerbil .
If you are using another operating system or want to download the data to a specific folder, you can get the data from https://github.com/AKSW/gerbil/releases/download/v1.2.5/gerbil_data.zip . Please extract the data and configure the system.
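A manual download could look roughly like the sketch below. It assumes `curl` and `unzip` are installed; `fetch_gerbil_data` is a hypothetical helper name, and the target folder is a free choice. The internal layout of the archive is not described on this page, so check which folder it actually unpacks to before configuring the data path.

```shell
#!/bin/sh
# URL taken from the text above.
GERBIL_DATA_URL="https://github.com/AKSW/gerbil/releases/download/v1.2.5/gerbil_data.zip"

# Hypothetical helper: download gerbil_data.zip and unpack it into $1
# (defaults to the current directory).
fetch_gerbil_data() {
    target_dir="${1:-.}"
    mkdir -p "$target_dir" || return 1
    # -L follows redirects to the release asset; -f fails on HTTP errors.
    curl -fL -o "$target_dir/gerbil_data.zip" "$GERBIL_DATA_URL" || return 1
    # -o overwrites existing files without prompting; -d sets the output folder.
    unzip -o "$target_dir/gerbil_data.zip" -d "$target_dir"
}
```

For example, `fetch_gerbil_data /opt/gerbil` would place the archive and its contents below /opt/gerbil.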
If you have used the start.sh script, the system should already be configured. Otherwise, open src/main/properties/gerbil.properties and set org.aksw.gerbil.DataPath to the folder containing the data extracted from the gerbil_data.zip file.
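Editing the file by hand is enough, but the change can also be scripted. The sketch below assumes the file uses plain `key=value` lines; `set_gerbil_data_path` is a hypothetical helper, not something shipped with GERBIL. It rewrites an existing org.aksw.gerbil.DataPath line in place and appends one if none is found.

```shell
#!/bin/sh
# Hypothetical helper: set org.aksw.gerbil.DataPath in a properties file.
# $1 = properties file, $2 = folder holding the extracted data.
set_gerbil_data_path() {
    props_file="$1"
    data_dir="$2"
    tmp_file="$(mktemp)" || return 1
    # Rewrite an existing assignment in place; append one if none exists.
    awk -v key="org.aksw.gerbil.DataPath" -v val="$data_dir" '
        $0 ~ ("^[ \t]*" key "[ \t]*=") { print key "=" val; found = 1; next }
        { print }
        END { if (!found) print key "=" val }
    ' "$props_file" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }
    # Copy the content back so the original file keeps its permissions.
    cat "$tmp_file" > "$props_file"
    rm -f "$tmp_file"
}
```

A call such as `set_gerbil_data_path src/main/properties/gerbil.properties /path/to/gerbil_data` would then point GERBIL at the extracted folder; the second argument is a placeholder for your own path.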
GERBIL can be started by running the start.sh script or by running
mvn clean tomcat:run
Note that GERBIL may show the following warning while starting:
[main] WARN [org.aksw.gerbil.datasets.datahub.DatahubNIFLoader] - <Couldn't get any datasets with the gerbil tag from DataHubIO. Exception: org.springframework.web.client.HttpClientErrorException: 404 Not Found>
This warning shouldn't cause any problems and can be ignored.
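Since startup can take a while, one way to find out when the web application is reachable is to poll its URL. This is a sketch under assumptions: `wait_for_gerbil` is a hypothetical helper, `curl` must be installed, and the default URL is the one start.sh is described as using above. A different setup may listen elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# $1 = URL, $2 = number of attempts, $3 = seconds to sleep between attempts.
wait_for_gerbil() {
    url="${1:-http://localhost:1234/gerbil}"
    attempts="${2:-30}"
    delay="${3:-2}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f fails on HTTP errors, -s hides progress, -L follows redirects,
        # --max-time bounds each request to five seconds.
        if curl -fsL --max-time 5 -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `wait_for_gerbil && echo "GERBIL is up"` waits for up to 30 attempts with the defaults before giving up.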
Due to licensing restrictions, we are not allowed to upload the following datasets:
- AIDA/CoNLL
- Microposts 2013
- Microposts 2014
Regarding the annotators, it is possible that a key or registration is needed to use them without limitations. Please take a look at this wiki page: https://github.com/AKSW/gerbil/wiki/How-to-get-API-keys
We provide a Docker image (dicegroup/gerbil) and a Docker Compose file to run GERBIL easily. After checking out the git project, the compose file can be run by executing the following:
docker-compose up
The following directories inside the container might be worth mounting to a local directory:
Directory | Description |
---|---|
/usr/local/tomcat/gerbil_properties | The directory contains all properties files used to configure GERBIL. |
/usr/local/tomcat/database | The directory contains GERBIL's database. It should be mounted to persist experiment results. |
/usr/local/tomcat/cache | The directory contains caches. Mounting it can speed up future experiments. |
/usr/local/tomcat/datasets | The directory contains the datasets. |
/usr/local/tomcat/indexes | The GERBIL image comes without the two indexes that the start.sh script would download. These indexes can be downloaded to a local directory (using scripts/download_indexes.sh ) and mounted into the container using this directory. |
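If the image is started directly rather than through the compose file, the mounts from the table could be assembled as in the sketch below. `run_gerbil_container` and the `DOCKER_BIN` variable are hypothetical names introduced for illustration. No port is published here because the container port is not stated on this page; the compose file in the repository is the reference for that.

```shell
#!/bin/sh
# Hypothetical helper: start the dicegroup/gerbil image with the directories
# from the table mounted below a local base directory ($1, absolute path).
# Set DOCKER_BIN=echo to print the command instead of running it.
run_gerbil_container() {
    base_dir="$1"
    # Build the argument list for `docker run` step by step.
    set -- run --rm
    for name in gerbil_properties database cache datasets indexes; do
        set -- "$@" -v "$base_dir/$name:/usr/local/tomcat/$name"
    done
    "${DOCKER_BIN:-docker}" "$@" dicegroup/gerbil
}
```

Running `DOCKER_BIN=echo run_gerbil_container /srv/gerbil` prints the resulting command, which is a convenient way to review the mounts before starting the container; /srv/gerbil is only a placeholder.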
The Docker container will copy the properties files and the datasets that come with the gerbil_data.zip file into the gerbil_properties and datasets directories if they are not already there. This behavior can be deactivated by setting the environment variables GERBIL_COPY_DIRS and/or GERBIL_COPY_PROPS to false.
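With plain `docker run`, environment variables are passed using the `-e` flag. The sketch below switches off both copy steps; `run_gerbil_without_copy` and `DOCKER_BIN` are hypothetical names, and either variable can be left out to keep the corresponding default behavior.

```shell
#!/bin/sh
# Hypothetical helper: start the container with both copy steps switched off.
# Extra arguments (for example -v mounts) are passed through to `docker run`.
# Set DOCKER_BIN=echo to print the command instead of running it.
run_gerbil_without_copy() {
    "${DOCKER_BIN:-docker}" run --rm \
        -e GERBIL_COPY_DIRS=false \
        -e GERBIL_COPY_PROPS=false \
        "$@" dicegroup/gerbil
}
```

When the compose file is used instead, the same two variables would go into the service's environment section.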