
LeveragingAvro

Zoltan Farkas edited this page Apr 29, 2019 · 29 revisions

Leveraging Avro as a data format.

Avro is one of the many serialization formats created in the last 20 years. For a good introduction, and a comparison with probably the two most popular alternatives, see

In this demo project we use Avro for:

  • wire format.
  • log format.

Why Avro for the wire format?

  1. Multiple encodings support:
    • binary for efficiency.
    • json for interoperability and debugging.
    • csv for interoperability and debugging.
  2. Extensible. You can add your own metadata to the schema (@beta, @displayName, ...).
  3. Avro schemas have a JSON representation.
  4. Multiple language support.
  5. Open source.
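
To make the "binary for efficiency" point concrete, here is a minimal hand-rolled sketch of how the Avro binary encoding lays out a small record. It does not use the Avro library; it only follows the rules from the Avro specification (longs as zig-zag varints, strings as a length followed by UTF-8 bytes, record fields written in schema order without names). The record shape {id: long, name: string} and the class name AvroBinarySketch are made up for illustration and are not part of the demo project.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class AvroBinarySketch {

  /** Writes a long as described in the Avro spec: zig-zag first, then base-128 varint. */
  static void writeLong(long value, ByteArrayOutputStream out) {
    long zigzag = (value << 1) ^ (value >> 63); // small magnitudes become small unsigned values
    while ((zigzag & ~0x7FL) != 0) {
      out.write((int) ((zigzag & 0x7F) | 0x80)); // low 7 bits with the continuation bit set
      zigzag >>>= 7;
    }
    out.write((int) zigzag); // last group, continuation bit clear
  }

  /** Writes a string as its UTF-8 byte length (encoded as a long) followed by the bytes. */
  static void writeString(String value, ByteArrayOutputStream out) {
    byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
    writeLong(utf8.length, out);
    out.write(utf8, 0, utf8.length);
  }

  /** Encodes the hypothetical record {id: long, name: string}: field values only, in schema order. */
  static byte[] encodeRecord(long id, String name) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    writeLong(id, out);
    writeString(name, out);
    return out.toByteArray();
  }

  public static void main(String[] args) {
    byte[] binary = encodeRecord(64, "test");
    String json = "{\"id\":64,\"name\":\"test\"}";
    System.out.println("binary bytes: " + binary.length);
    System.out.println("json bytes:   " + json.getBytes(StandardCharsets.UTF_8).length);
  }
}
```

For this sample record the binary form takes 7 bytes against 23 for the JSON text. The saving comes from the binary form carrying no field names or punctuation, which is also why a reader needs the writer schema to decode it.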

Demo of a REST endpoint.

Start up the demo app as described at.

Let's try to get some data from:

[image]

As you can observe, the writer schema info is provided by the content-schema HTTP header:

Content-Length: 505
content-schema: {"type":"array","items":{"$ref":"org.spf4j.demo:jaxrs-spf4j-demo-schema:0.3:2"}}
Content-Type: application/avro-x+json;charset=UTF-8

Removing ?_Accept=application/json yields the more efficient binary response:

Content-Length: 220
content-schema: {"type":"array","items":{"$ref":"org.spf4j.demo:jaxrs-spf4j-demo-schema:0.3:2"}}
Content-Type: application/avro

If we want the data in CSV format, we can use ?_Accept=text/csv, since this endpoint is compliant:

Content-Length: 376
content-schema: {"type":"array","items":{"$ref":"org.spf4j.demo:jaxrs-spf4j-demo-schema:0.3:2"}}
Content-Type: text/csv;fmt=avro;charset=UTF-8
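
The three requests above can be reproduced from Java with the JDK's built-in HTTP client. The sketch below is only an illustration: the class name AcceptOverrideSketch and its helper methods are hypothetical, and the endpoint URL has to be supplied by you, since it depends on where your demo app is running. It issues one GET per format and prints the Content-Type and content-schema headers of each response.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class AcceptOverrideSketch {

  /** Appends the _Accept query parameter; null means "no override", i.e. the binary default. */
  static URI withAccept(String endpoint, String mediaType) {
    return mediaType == null
        ? URI.create(endpoint)
        : URI.create(endpoint + "?_Accept=" + mediaType);
  }

  /** Performs the GET and returns "<Content-Type> | <content-schema>" for the response. */
  static String describe(URI uri) throws Exception {
    HttpClient client = HttpClient.newHttpClient();
    HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
    HttpResponse<byte[]> response =
        client.send(request, HttpResponse.BodyHandlers.ofByteArray());
    String contentType = response.headers().firstValue("Content-Type").orElse("?");
    String schema = response.headers().firstValue("content-schema").orElse("?");
    return contentType + " | " + schema;
  }

  public static void main(String[] args) throws Exception {
    if (args.length == 0) {
      System.out.println("usage: AcceptOverrideSketch <endpoint-url>");
      return;
    }
    // JSON, binary (no override) and CSV, in the order used in the text.
    for (String mediaType : new String[] {"application/json", null, "text/csv"}) {
      URI uri = withAccept(args[0], mediaType);
      System.out.println(uri + " -> " + describe(uri));
    }
  }
}
```

The response body is read as raw bytes on purpose: only the JSON and CSV variants are text, and the binary variant needs the writer schema named in content-schema to be decoded.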

All of the above is magically served by an endpoint definition like:

  @GET
  @Produces(value = {"application/avro", "application/avro-x+json", "application/octet-stream", "application/json", "text/csv"})
  List<DemoRecordInfo> getRecords();

This functionality is implemented by the spf4j avro feature and leverages Avro references.
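
Note that the content-schema header shown earlier does not carry the full record schema, only a "$ref" pointing at it. The sketch below splits such a reference into its parts. It assumes the reference has the shape groupId:artifactId:version:schemaId, which is what the example value org.spf4j.demo:jaxrs-spf4j-demo-schema:0.3:2 suggests; this shape is an assumption, not a statement about the spf4j API. The class SchemaRefSketch is hypothetical.

```java
final class SchemaRefSketch {

  final String groupId;
  final String artifactId;
  final String version;
  final String schemaId;

  private SchemaRefSketch(String groupId, String artifactId, String version, String schemaId) {
    this.groupId = groupId;
    this.artifactId = artifactId;
    this.version = version;
    this.schemaId = schemaId;
  }

  /** Splits a reference assumed to look like groupId:artifactId:version:schemaId. */
  static SchemaRefSketch parse(String ref) {
    String[] parts = ref.split(":");
    if (parts.length != 4) {
      throw new IllegalArgumentException("expected 4 colon-separated parts, got: " + ref);
    }
    return new SchemaRefSketch(parts[0], parts[1], parts[2], parts[3]);
  }

  public static void main(String[] args) {
    SchemaRefSketch ref = parse("org.spf4j.demo:jaxrs-spf4j-demo-schema:0.3:2");
    System.out.println("artifact: " + ref.groupId + ":" + ref.artifactId + ":" + ref.version);
    System.out.println("schema id within the artifact: " + ref.schemaId);
  }
}
```

A client that knows how to fetch the referenced artifact can resolve the full writer schema from these parts, which keeps the header small no matter how large the schema is.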

Why Avro for logs?

  • structure. No need to write custom parsers. See for the record structure.
  • efficiency. Smaller in size due to the binary format and built-in compression.

An example of how to use Avro for logs (leveraging spf4j-logback and spf4j-jaxrs-actuator) is at.

Show latest logs in text format:

[image]

Show latest logs in JSON format:

[image]

Browse cluster log files:

[image]

Show all log files from a particular node:

[image]

Download a log file:

[image]
