Elasticsearch 2.4.2 and Geoshape indexing improvements #1327

Open
wants to merge 13 commits into
base: titan11
4 changes: 2 additions & 2 deletions docs/elasticsearch.txt
@@ -11,7 +11,7 @@ Elasticsearch is a flexible and powerful open source, distributed, real-time sea
Titan supports http://elasticsearch.org[Elasticsearch] as an index backend. Here are some of the Elasticsearch features supported by Titan:

* *Full-Text*: Supports all `Text` predicates to search for text properties that match a given word, prefix or regular expression.
* *Geo*: Supports the `Geo.WITHIN` condition to search for points that fall within a given circle. Only supports points for indexing and circles for querying.
* *Geo*: Supports all `Geo` predicates to search for geo properties that are intersecting, within, disjoint to or contained in a given query geometry. Supports points, lines and polygons for indexing. Supports circles, boxes and polygons for querying point properties and all shapes for querying non-point properties.
* *Numeric Range*: Supports all numeric comparisons in `Compare`.
* *Flexible Configuration*: Supports embedded or remote operation, custom transport and discovery, and open-ended settings customization.
* *TTL*: Supports automatically expiring indexed elements.
@@ -404,7 +404,7 @@ Classpath or Field errors
When you see an exception referring to Lucene implementation details, make sure you don't have a conflicting version of Lucene on the classpath. The exception may look like this:

[source, text]
java.lang.NoSuchFieldError: LUCENE_4_10_4
java.lang.NoSuchFieldError: LUCENE_5_5_2
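
One way to check which Lucene jar is actually being loaded is to ask the JVM where a class came from. The sketch below is illustrative and not part of Titan: `ClasspathProbe` and `locate` are hypothetical names, and it assumes the `LUCENE_x_y_z` constants are declared on `org.apache.lucene.util.Version`, as they are in Lucene 4.x and 5.x.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Titan or Lucene.
class ClasspathProbe {

    /** Returns the location a class was loaded from, or a short note if that is unavailable. */
    static String locate(String className) {
        try {
            // Load without initializing, so a broken static initializer cannot fail the probe.
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes from the bootstrap loader usually have no code source.
            if (src == null || src.getLocation() == null) {
                return "(bootstrap or unknown source)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Assumed to be the class that declares the LUCENE_x_y_z version constants.
        System.out.println(locate("org.apache.lucene.util.Version"));
    }
}
```

If the printed location points at a Lucene 4.x jar while Elasticsearch 2.4.2 is in use, that jar is a likely source of the `NoSuchFieldError`.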

Optimizing Elasticsearch
~~~~~~~~~~~~~~~~~~~~~~~~
1 change: 1 addition & 0 deletions docs/hadoop.txt
@@ -85,6 +85,7 @@ giraph.maxMessagesInMemory=100000
spark.master=local[*]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=com.thinkaurelius.titan.hadoop.serialize.TitanKryoRegistrator
----

[source, gremlin]
2 changes: 1 addition & 1 deletion docs/lucene.txt
@@ -27,7 +27,7 @@ Feature Support
~~~~~~~~~~~~~~~

* *Full-Text*: Supports all `Text` predicates to search for text properties that match a given word, prefix or regular expression.
* *Geo*: Supports the `Geo.WITHIN` condition to search for points that fall within a given geographic shape. Only supports points for indexing and circles and boxes for querying.
* *Geo*: Supports `Geo` predicates to search for geo properties that are intersecting, within, or contained in a given query geometry. Supports points, lines and polygons for indexing. Supports circles and boxes for querying point properties and all shapes for querying non-point properties.
* *Numeric Range*: Supports all numeric comparisons in `Compare`.
* *Temporal*: Nanosecond granularity temporal indexing.

20 changes: 16 additions & 4 deletions docs/searchpredicates.txt
@@ -34,8 +34,14 @@ See <<text-search>> for more information about full-text and string search.
Geo Predicate
~~~~~~~~~~~~~

The `Geo` enum specifies the geo-location predicate `geoWithin` which holds true if one geometric object contains the other.
The `Geo` enum specifies geo-location predicates.

* `geoIntersect` which holds true if the two geometric objects have at least one point in common (opposite of `geoDisjoint`).
* `geoWithin` which holds true if the property value lies completely within the query geometry.
* `geoDisjoint` which holds true if the two geometric objects have no points in common (opposite of `geoIntersect`).
* `geoContains` which holds true if the property value completely contains the query geometry.

See <<geo-search>> for more information about geo search.
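
The direction of each predicate is easy to mix up, so here is a minimal sketch of the four relations on axis-aligned boxes. It is plain Java with no Titan dependency: `GeoRelationSketch` and `Box` are hypothetical stand-ins, not the `Geoshape` API, and the argument order (stored value first, query geometry second) is assumed from the `test(Object value, Object condition)` signature in this PR.

```java
// Hypothetical illustration of the predicate semantics; not Titan's implementation.
class GeoRelationSketch {

    /** Axis-aligned box given as SW lat, SW lng, NE lat, NE lng. */
    static final class Box {
        final double south, west, north, east;

        Box(double south, double west, double north, double east) {
            this.south = south;
            this.west = west;
            this.north = north;
            this.east = east;
        }

        /** True if the two boxes share at least one point. */
        boolean intersects(Box o) {
            return south <= o.north && o.south <= north
                && west <= o.east && o.west <= east;
        }

        /** True if the other box lies completely inside this one. */
        boolean contains(Box o) {
            return south <= o.south && o.north <= north
                && west <= o.west && o.east <= east;
        }
    }

    // 'value' is the stored property, 'query' is the geometry passed to the predicate.
    static boolean geoIntersect(Box value, Box query) { return value.intersects(query); }
    static boolean geoDisjoint(Box value, Box query)  { return !value.intersects(query); }
    static boolean geoWithin(Box value, Box query)    { return query.contains(value); }
    static boolean geoContains(Box value, Box query)  { return value.contains(query); }
}
```

With a small box stored as the value and a larger enclosing box as the query, `geoWithin` and `geoIntersect` hold while `geoContains` and `geoDisjoint` do not. Swapping the two arguments flips `geoWithin` and `geoContains`.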

Query Examples
~~~~~~~~~~~~~~
@@ -88,7 +94,7 @@ Additional data types will be supported in the future.

Geoshape Data Type
~~~~~~~~~~~~~~~~~~
The Geoshape data type supports representing a point, circle or box. However all index backends currently only support indexing points.
The Geoshape data type supports representing a point, circle, box, line, polygon, multi-point, multi-line and multi-polygon. Index backends currently support indexing points, lines and polygons. Indexing multi-point, multi-line and multi-polygon properties has not been tested.
Geospatial index lookups are only supported via mixed indexes.

To construct a Geoshape use the following methods:
@@ -100,8 +106,14 @@ Geoshape.point(37.97, 23.72)
Geoshape.circle(37.97, 23.72, 50)
//SW lat, SW lng, NE lat, NE lng
Geoshape.box(37.97, 23.72, 38.97, 24.72)
//lat1,lng1,lat2,lng2,...
Geoshape.line(37.97, 23.72, 37.97, 24.72, 38.97, 24.72, 38.97, 23.72)
Geoshape.polygon(37.97, 23.72, 37.97, 24.72, 38.97, 24.72, 38.97, 23.72, 37.97, 23.72)

Additional Geoshape constructors for building lines and polygons from a list of coordinate pairs, Spatial4j Shape or JTS Geometry are also available.
Note that, unlike above, the coordinate order is (lon,lat) when providing a list of coordinate pairs.

In addition when importing a graph via GraphSON Point may be represented by:
In addition, when importing a graph via GraphSON the geometry may be represented by GeoJSON:
[source, java]
//string
"37.97, 23.72"
@@ -124,7 +136,7 @@ In addition when importing a graph via GraphSON Point may be represented by:
"coordinates": [125.6, 10.1]
}

link:http://geojson.org/[GeoJSON] may be specified as Point, Circle or Polygon. However polygons must form a box.
link:http://geojson.org/[GeoJSON] may be specified as Point, Circle, LineString or Polygon. Polygons must be closed.
Note that, unlike the Titan API, GeoJSON specifies coordinates in (lng, lat) order.
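
The two constraints above (a closed ring and lng-before-lat ordering) can be sketched in plain Java. This is an illustration only: `GeoJsonSketch` and its `polygon` method are hypothetical and are not the parser added by this PR. It takes coordinates in the same lat, lng order as `Geoshape.polygon` and emits a GeoJSON string.

```java
// Hypothetical helper for illustration; not part of Titan.
class GeoJsonSketch {

    /** Builds a GeoJSON Polygon from lat/lng pairs given as lat1, lng1, lat2, lng2, ... */
    static String polygon(double... latLng) {
        // A GeoJSON linear ring needs at least four positions.
        if (latLng.length < 8 || latLng.length % 2 != 0) {
            throw new IllegalArgumentException("need at least four lat/lng pairs");
        }
        // The first and last positions must be identical for the ring to be closed.
        int last = latLng.length - 2;
        if (latLng[0] != latLng[last] || latLng[1] != latLng[last + 1]) {
            throw new IllegalArgumentException("ring is not closed");
        }
        StringBuilder sb = new StringBuilder("{\"type\": \"Polygon\", \"coordinates\": [[");
        for (int i = 0; i < latLng.length; i += 2) {
            if (i > 0) {
                sb.append(", ");
            }
            // GeoJSON order is [lng, lat], so each pair is swapped here.
            sb.append('[').append(latLng[i + 1]).append(", ").append(latLng[i]).append(']');
        }
        return sb.append("]]}").toString();
    }
}
```

Calling `GeoJsonSketch.polygon(37.97, 23.72, 37.97, 24.72, 38.97, 24.72, 38.97, 23.72, 37.97, 23.72)` with the same coordinates as the `Geoshape.polygon` example yields a ring whose first position is `[23.72, 37.97]`, that is, longitude first.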

Collections
2 changes: 1 addition & 1 deletion docs/solr.txt
@@ -9,7 +9,7 @@ Solr is the popular, blazing fast open source enterprise search platform from th
Titan supports http://lucene.apache.org/solr/[Solr] as an index backend. Here are some of the Solr features supported by Titan:

* *Full-Text*: Supports all `Text` predicates to search for text properties that match a given word, prefix or regular expression.
* *Geo*: Supports the `Geo.WITHIN` condition to search for points that fall within a given circle. Only supports points for indexing and circles for querying.
* *Geo*: Supports all `Geo` predicates to search for geo properties that are intersecting, within, disjoint to or contained in a given query geometry. Supports points, lines and polygons for indexing. Supports circles, boxes and polygons for querying point properties and all shapes for querying non-point properties.
* *Numeric Range*: Supports all numeric comparisons in `Compare`.
* *TTL*: Supports automatically expiring indexed elements.
* *Temporal*: Millisecond granularity temporal indexing.
24 changes: 24 additions & 0 deletions docs/textsearch.txt
@@ -84,8 +84,32 @@ summary = mgmt.makePropertyKey('booksummary').dataType(String.class).make()
mgmt.buildIndex('booksBySummary', Vertex.class).addKey(summary, Mapping.TEXTSTRING.asParameter()).buildMixedIndex("search")
mgmt.commit()


Note that the data will be stored in the index twice, once for exact matching and once for fuzzy matching.


[[geo-search]]
Geo Mapping
~~~~~~~~~~~

By default, Titan supports indexing geo properties with point type and querying geo properties by circle or box. To index a non-point geo property with support for querying by any geoshape type, specify the mapping as `Mapping.PREFIX_TREE`:

[source, gremlin]
mgmt = graph.openManagement()
name = mgmt.makePropertyKey('border').dataType(Geoshape.class).make()
mgmt.buildIndex('borderIndex', Vertex.class).addKey(name, Mapping.PREFIX_TREE.asParameter()).buildMixedIndex("search")
mgmt.commit()

Additional parameters can be specified to tune the configuration of the underlying prefix tree mapping. These optional parameters include the number of levels used in the prefix tree as well as the associated precision.

[source, gremlin]
mgmt = graph.openManagement()
name = mgmt.makePropertyKey('border').dataType(Geoshape.class).make()
mgmt.buildIndex('borderIndex', Vertex.class).addKey(name, Mapping.PREFIX_TREE.asParameter(), Parameter.of("index-geo-max-levels", 18), Parameter.of("index-geo-dist-error-pct", 0.0125)).buildMixedIndex("search")
mgmt.commit()

Note that some indexing backends (e.g. Solr) may require additional external schema configuration to support and tune indexing non-point properties.
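
To get a feel for what a level count means, the sketch below estimates the cell size at a given depth. It rests on an assumption the docs above do not state: that the prefix tree is a quadtree whose cells halve in each dimension per level. Geohash-based trees subdivide differently, so treat the result as an order-of-magnitude guide. `PrefixTreeSketch` and `cellWidthMeters` are hypothetical names.

```java
// Hypothetical back-of-the-envelope helper; not part of Titan or any index backend.
class PrefixTreeSketch {

    /** Approximate length of the equator in meters. */
    static final double EQUATOR_METERS = 40_075_017.0;

    /**
     * Approximate east-west cell width at the equator, assuming a quadtree
     * in which every level halves the cell width.
     */
    static double cellWidthMeters(int levels) {
        return EQUATOR_METERS / Math.pow(2, levels);
    }

    public static void main(String[] args) {
        // 18 is the value used for index-geo-max-levels in the example above.
        System.out.printf("%.1f m%n", cellWidthMeters(18));
    }
}
```

Under that assumption, 18 levels correspond to cells roughly 150 m wide at the equator. Each additional level halves that width, which is why deeper trees give finer precision at the cost of more index terms.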

Field Mapping
~~~~~~~~~~~~~

19 changes: 12 additions & 7 deletions pom.xml
@@ -84,15 +84,15 @@
<hbase100.core.version>1.0.2</hbase100.core.version>
<hbase100.version>${hbase100.core.version}</hbase100.version>
<jackson1.version>1.9.2</jackson1.version>
<jackson2.version>2.4.4</jackson2.version>
<jackson2.version>2.6.6</jackson2.version>
<!-- ES depends on Lucene. This ES dependency can affect the
version used by the titan-lucene module. When updating
the ES version, also consider the version of Lucene, and
vice-versa. -->
<lucene.version>4.10.4</lucene.version>
<elasticsearch.version>1.5.1</elasticsearch.version>
<lucene.version>5.5.2</lucene.version>
<elasticsearch.version>2.4.2</elasticsearch.version>
<commons.beanutils.version>1.7.0</commons.beanutils.version>
<joda.version>1.6.2</joda.version>
<joda.version>2.8.2</joda.version>
<concurrentlinkedhashmap.version>1.3</concurrentlinkedhashmap.version>
<antlr2.version>2.7.7</antlr2.version>
<antlr.version>3.2</antlr.version>
@@ -454,6 +454,11 @@
<artifactId>jackson-annotations</artifactId>
<version>${jackson2.version}</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.module</groupId>
<artifactId>jackson-module-scala_2.10</artifactId>
<version>${jackson2.version}</version>
</dependency>
<dependency>
<groupId>joda-time</groupId>
<artifactId>joda-time</artifactId>
@@ -798,7 +803,7 @@
<dependency>
<groupId>commons-cli</groupId>
<artifactId>commons-cli</artifactId>
<version>1.2</version>
<version>1.3.1</version>
</dependency>
<dependency>
<groupId>org.jboss.netty</groupId>
@@ -808,14 +813,14 @@
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty</artifactId>
<version>3.6.6.Final</version>
<version>3.10.5.Final</version>
</dependency>

<!-- Spatial4j -->
<dependency>
<groupId>com.spatial4j</groupId>
<artifactId>spatial4j</artifactId>
<version>0.4.1</version>
<version>0.5</version>
</dependency>

<!-- Package prefix is org.apache.commons.httpclient -->
11 changes: 11 additions & 0 deletions titan-core/pom.xml
@@ -63,6 +63,11 @@
<groupId>com.spatial4j</groupId>
<artifactId>spatial4j</artifactId>
</dependency>
<dependency>
<groupId>com.vividsolutions</groupId>
<artifactId>jts</artifactId>
<version>1.13</version>
</dependency>
<dependency>
<groupId>commons-collections</groupId>
<artifactId>commons-collections</artifactId>
@@ -99,6 +104,12 @@
<groupId>com.google.code.findbugs</groupId>
<artifactId>jsr305</artifactId>
</dependency>
<!-- The Noggit JSON parsing library is needed for GeoJSON parsing -->
<dependency>
<groupId>org.noggit</groupId>
<artifactId>noggit</artifactId>
<version>0.6</version>
</dependency>
</dependencies>
<build>
<directory>${basedir}/target</directory>
@@ -69,7 +69,7 @@ public TitanPredicate negate() {
},

/**
* Whether one geographic region is completely contains within another
* Whether one geographic region is completely within another
*/
WITHIN {
@Override
@@ -90,6 +90,34 @@ public boolean hasNegation() {
return false;
}

@Override
public TitanPredicate negate() {
throw new UnsupportedOperationException();
}
},

/**
* Whether one geographic region completely contains another
*/
CONTAINS {
@Override
public boolean test(Object value, Object condition) {
Preconditions.checkArgument(condition instanceof Geoshape);
if (value == null) return false;
Preconditions.checkArgument(value instanceof Geoshape);
return ((Geoshape) value).contains((Geoshape) condition);
}

@Override
public String toString() {
return "contains";
}

@Override
public boolean hasNegation() {
return false;
}

@Override
public TitanPredicate negate() {
throw new UnsupportedOperationException();
@@ -123,4 +151,7 @@ public static <V> P<V> geoDisjoint(final V value) {
public static <V> P<V> geoWithin(final V value) {
return new P(Geo.WITHIN, value);
}
public static <V> P<V> geoContains(final V value) {
return new P(Geo.CONTAINS, value);
}
}