This repository has been archived by the owner on Jan 29, 2019. It is now read-only.

United States Reports PDF Miner

Dump the contents of PDF documents published by the Supreme Court of the United States into JSON lists of tokens indicating document structure.

Building & Installation

You will need a Java JDK and Apache Maven, since the commands below use java and mvn.

At the command prompt:

$ mvn clean install

To build a self-contained .jar file:

$ mvn package
$ java -jar usreportsminer-[VERSION]-jar-with-dependencies.jar [PDF FILE]

Where [VERSION] is the current build version and [PDF FILE] is the path to a Supreme Court opinion PDF file.

Development

The program is just enough Java to jerry-rig an Apache PDFBox document renderer to a stub Graphics2D, feed the resulting Token objects to Gson for JSON output, and get me out of Java-land.
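
As a rough illustration of that last step, here is a minimal sketch of turning a list of tokens into a JSON array. It makes several assumptions: the Token class below, with its type and text fields, is hypothetical and may not match the project's real Token; and a small hand-written serializer stands in for Gson so that the example runs on a bare JDK. It does not touch PDFBox or Graphics2D at all.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenDump {
    // Hypothetical token shape; the project's real Token class may carry different fields.
    public static final class Token {
        final String type;
        final String text;

        public Token(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Wrap a string in double quotes, escaping the characters JSON requires.
    public static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // Serialize tokens as a JSON array of {"type": ..., "text": ...} objects.
    // This is a stand-in for what Gson would do, not the project's actual code.
    public static String toJson(List<Token> tokens) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < tokens.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            Token t = tokens.get(i);
            sb.append("{\"type\":").append(quote(t.type))
              .append(",\"text\":").append(quote(t.text)).append('}');
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        // Made-up sample tokens, only to show the shape of the output.
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token("heading", "Syllabus"));
        tokens.add(new Token("text", "Held: \"Affirmed.\""));
        System.out.println(toJson(tokens));
    }
}
```

With Gson on the classpath, the hand-written toJson and quote methods would presumably collapse to a single new Gson().toJson(tokens) call, which is likely why the original leans on it.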