United States Reports PDF Miner

Dump the contents of PDF documents published by the Supreme Court of the United States into JSON lists of tokens indicating document structure.

Building & Installation

You will need:

JDK 1.6
Apache Maven

At the command prompt:

$ mvn clean install

To build and run a self-contained .jar file:

$ mvn package
$ java -jar usreportsminer-[VERSION]-jar-with-dependencies.jar [PDF FILE]

Here, [VERSION] is the current build version and [PDF FILE] is the path to a Supreme Court opinion PDF file.

Development

The program is just enough Java to jury-rig an Apache PDFBox document renderer to a stub Graphics2D, feed the resulting Token objects to Gson for JSON output, and get me out of Java-land.
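As a rough illustration of that pipeline, here is a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the Token fields (type, text), the RecordingSurface stand-in for the stub Graphics2D, and the hand-written toJson method used in place of Gson are hypothetical and are not taken from the actual source.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenSketch {
    // Hypothetical token: a structural kind plus the text it covers.
    static final class Token {
        final String type;
        final String text;
        Token(String type, String text) { this.type = type; this.text = text; }
    }

    // Stand-in for the stub Graphics2D. A real stub would override
    // drawString(...) and record what the renderer draws instead of painting it.
    static final class RecordingSurface {
        final List<Token> tokens = new ArrayList<Token>();
        void drawString(String text) { tokens.add(new Token("text", text)); }
        void newLine() { tokens.add(new Token("newline", "")); }
    }

    // Escapes only backslashes and double quotes; a real serializer such as
    // Gson also handles control characters.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Hand-rolled replacement for Gson, used here only to stay self-contained.
    static String toJson(List<Token> tokens) {
        StringBuilder out = new StringBuilder("[");
        for (int i = 0; i < tokens.size(); i++) {
            Token t = tokens.get(i);
            if (i > 0) out.append(",");
            out.append("{\"type\":\"").append(escape(t.type))
               .append("\",\"text\":\"").append(escape(t.text)).append("\"}");
        }
        return out.append("]").toString();
    }

    public static void main(String[] args) {
        RecordingSurface surface = new RecordingSurface();
        surface.drawString("Syllabus");
        surface.newLine();
        System.out.println(toJson(surface.tokens));
    }
}
```

The point of the design is that the renderer never needs to know it is not drawing to a screen: it calls the usual drawing methods, and the surface turns each call into a token that can be serialized afterwards.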
