Jackson dataformat: Spreadsheet

Let’s get rid of the Spread Sh!t.

This Jackson extension library is a component that memory-efficiently reads and writes the SpreadsheetML format^[1] using Apache POI and StAX via Jackson abstraction. — Implemented as full streaming implementation, which allows full access (streaming, data-binding, and tree-model).

💬	The same abstraction is provided for the Horrible Microsoft Excel format^[2]. However, not based on Event API, so may not be suitable for large .xls files.

Table of Contents

Installation
- Maven
- Gradle
Busy Developers' Guide to Features
And more.
- Supported Types
- References
License

Installation

📙	This project is currently under development. If you are interested, you can always use the latest snapshot artifacts from OSS Sonatype.

Maven

<repository>
    <id>sonatype-snapshots</id>
    <url>https://s01.oss.sonatype.org/content/repositories/snapshots/</url>
</repository>

<dependency>
    <groupId>io.github.scndry</groupId>
    <artifactId>jackson-dataformat-spreadsheet</artifactId>
    <version>0.0.1-SNAPSHOT</version>
</dependency>

Gradle

repositories {
    maven { url "https://s01.oss.sonatype.org/content/repositories/snapshots/" }
}

dependencies {
    implementation "io.github.scndry:jackson-dataformat-spreadsheet:0.0.1-SNAPSHOT"
}

Busy Developers' Guide to Features

Want to use HSSF and XSSF read and write spreadsheets in a hurry? This guide is for you. — Apache POI’s Quick Guide.

Reading Spreadsheet

First, let’s look at input-entries.xlsx file to read.

Table 1. input-entries.xlsx

Column A	Column B
1	2
3	4

Then let’s define the Entry class annotated @DataGrid.

@DataGrid
public class Entry {
    private int a;
    private int b;
    // Constructors, Getters, Setters and toString
}

We use SpreadsheetMapper to read the file into an Entry object, so let’s set it up now.

SpreadsheetMapper mapper = new SpreadsheetMapper();
File file = new File("input-entries.xlsx");

Once we have SpreadsheetMapper configured, simply use readValue.

Entry value = mapper.readValue(file, Entry.class);
// Output: Entry(a=1, b=2)

You can use readValues to read the entire list.

List<Entry> values = mapper.readValues(file, Entry.class);
// Output: [Entry(a=1, b=2), Entry(a=3, b=4)]

By default, it reads the first sheet. If you need to find and read a specific sheet by name or index, you can use SheetInput.

String sheetName = "Entries";
SheetInput<File> input = SheetInput.source(file, sheetName);
List<Entry> values = mapper.readValues(input, Entry.class);
// Output: [Entry(a=1, b=2), Entry(a=3, b=4)]

Writing Spreadsheet

We also use SpreadsheetMapper to write an Entry object to the file. Just call writeValue.

SpreadsheetMapper mapper = new SpreadsheetMapper();
File file = new File("output-entries.xlsx");
Entry value = new Entry(1, 2);
mapper.writeValue(file, value);

The output of the above in the file will be:

Table 2. output-entries.xlsx

a	b
1	2

This time, let’s write a list.

List<Entry> values = Arrays.asList(new Entry(1, 2), new Entry(3, 4));
mapper.writeValue(file, values);

java.lang.IllegalArgumentException: `valueType` MUST be specified to write a value of a Collection or array type

Oh, Sheet!

Okay. Let’s add a valueType to the 3rd parameter.

Class<Entry> valueType = Entry.class;
mapper.writeValue(file, values, valueType);

Table 3. output-entries.xlsx

a	b
1	2
3	4

Now let’s annotate on properties with @DataColumn to write each column’s name.

@DataGrid
public class Entry {
    @DataColumn("Column A")
    private int a;
    @DataColumn("Column B")
    private int b;
    // Constructors, Getters, Setters and toString
}

Table 4. output-entries.xlsx

Column A	Column B
1	2
3	4

And also you can use SheetOutput to write the sheet name. By default, the created sheet is named "Sheet" + indexOfSheet.

String sheetName = "Entries";
SheetOutput<File> output = SheetOutput.target(file, sheetName);
mapper.writeValue(output, values, Entry.class);

Nested Object (de)Structuring

Spreadsheets are a flat data representation of a 2-dimensional structure, but most of the POJOs we have take complex nested structures. Of course, it provides (de)structuring these complex models, as you can see below:

ID	NAME	ZIPCODE	ADDRESS LINE 1	ADDRESS LINE 2	DESIGNATION	SALARY
1	John Doe	12345	123 Main St.	Anytown, USA	CEO	$300,000.00

As we are Object-Oriented Developers, you probably don’t want to manage a lot of column data in a flat model. Let’s first define the well-structured Employee class.

@DataGrid
class Employee {
    int id;
    String name;
    Address address;
    Employment employment;
    // ...
}

class Address {
    String zipcode;
    AddressLines addressLines;
    // ...
}

class AddressLines {
    String line1;
    String line2;
    // ...
}

class Employment {
    Designation designation;
    long salary;
    // ...
}

enum Designation {
    CEO, CTO, SM, ARCH, INT
}

No additional code is required when deserializing data into nested structures, you can deserialize the same way.

Employee value = mapper.readValue(input, Employee.class);

Here’s the Employee object that output will be:

Employee(
    id=1
    name=John doe
    address=Address(
        zipcode=12345
        addressLines=AddressLines(
            line1=123 Main St.
            line2=Anytown, USA
        )
    )
    employment=Emeployment(
        designation=CEO
        salary=300000
    )
)

Just as can be deserialized without any additional code, you can serialize these nested objects.

mapper.writeValue(output, values, Employee.class);

Writing the Nested List

Although limited, it supports serializing nested list structures. However, deserializing is not supported due to implementation complexity.

@DataGrid
class Outer {
    @DataColumn("A") int a;
    List<Inner> inners;
    @DataColumn("E") int e;
    // ...
}

class Inner {
    @DataColumn("B") int b;
    @DataColumn("C") int[] c;
    @DataColumn("D") int d;
    // ...
}

Outer value = new Outer(
        1, Arrays.asList(
            new Inner(2, new int[]{3, 4}, 5),
            new Inner(6, new int[]{7, 8}, 9)),
        10);
mapper.writeValue(output, value);

This will lead to the desired result like this:

A	B	C	D	E
1	2	3	5	10
`BLANK`	`BLANK`	4	`BLANK`	`BLANK`
`BLANK`	6	7	9	`BLANK`
`BLANK`	`BLANK`	8	`BLANK`	`BLANK`

Annotations and Schema

For most column schema generation requirements, @DataGrid and @DataColumn annotations will provide all you need. The two have almost similar attributes, so only listing attributes of @DataColumn.

Attributes in @DataColumn

String value(): The column’s name to write in header
String style(): The name of the cellStyle to apply to the column, it should be registered with StylesBuilder.
String headerStyle(): Same as style(), but only applied to the column’s header.
int width(): Fixed width for the column.
OptBoolean autoSize(): Adjusts the column width to fit the contents. Not accurate for fullwidth forms like the CJK characters.
int minWidth(): Minimum width of the auto-sized column.
int maxWidth(): Maximum width of the auto-sized column.

Attribute Resolution

Column schema uses particular order designed to allow for value overrides. Attributes are considered in the following order:

@DataColumn's attribute on property.
@DataGrid's attribute on the class in which the property is declared.
@DataGrid's attribute of the class of which the class in which the property is declared is a member.

For a better understanding, let’s look at the code below:

@DataGrid(autoSizeColumn = OptBoolean.FALSE)
class Foo {
    @DataColumn(style = "Foo_a_style")
    int a;
    @DataColumn(headerStyle = "Foo_b_headerStyle")
    int b;
    // ...
}

@DataGrid(columnHeaderStyle = "Bar_columnHeaderStyle")
class Bar {
    Foo foo;
    // ...
}

@DataGrid(columnStyle = "Baz_columnStyle", autoSizeColumn = OptBoolean.TRUE)
class Baz {
    Foo foo;
    // ...
}

Schema is generated and used when (de)serializing within SpreadsheetMapper, but you can generate using sheetSchemaFor.

SpreadsheetSchema foo = mapper.sheetSchemaFor(Foo.class);
SpreadsheetSchema bar = mapper.sheetSchemaFor(Bar.class);
SpreadsheetSchema baz = mapper.sheetSchemaFor(Baz.class);

Column attributes of schemas generated for each of the above types are as follows:

Column	Style	Header Style	Auto Size
foo.a	`"Foo_a_style"`	`""`	`FALSE`
foo.b	`""`	`"Foo_b_headerStyle"`	`FALSE`
bar.foo.a	`"Foo_a_style"`	`"Bar_columnHeaderStyle"`	`FALSE`
bar.foo.b	`""`	`"Foo_b_headerStyle"`	`FALSE`
baz.foo.a	`"Foo_a_style"`	`""`	`FALSE`
baz.foo.b	`"Baz_columnStyle"`	`"Foo_b_headerStyle"`	`FALSE`

Build Your Own Styles

@DataGrid(columnHeaderStyle = "baseHeader")
class Employee {
    // ...
    Address address;
    Employment employment;
    // ...
}

class Address {
    @DataColumn(value = "ZIPCODE", style = "zipcode")
    String zipcode;
    // ...
}

@DataGrid(columnHeaderStyle = "employmentHeader")
class Employment {
    // ...
    @DataColumn(value = "SALARY", style = "salary", headerStyle = "salaryHeader")
    long salary;
    // ...
}

StylesBuilder builder = new StylesBuilder()
        .cellStyle(/*name*/ "baseHeader")
            .border().thin()
            .fillForegroundColor(IndexedColors.GREY_25_PERCENT)
            .fillPattern().solidForeground()
            .font().bold().end()
            .end()
        .cellStyle(/*name*/ "employmentHeader", /*cloneStyleFrom*/ "baseHeader")
            .fillForegroundColor(198, 239, 206)
            .end()
        .cellStyle("salaryHeader", "baseHeader")
            .fillForegroundColor(0xFFC7CE)
            .font().bold().color(0x9C0006).end()
            .end()
        .cellStyle("salary")
            .dataFormat("[Red][>=100000]$#,##0.00;$#,##0.00")
            .end()
        .cellStyle("zipcode")
            .dataFormat().text()
            .end();
SpreadsheetMapper mapper = SpreadsheetMapper.builder()
        .stylesBuilder(builder)
        .build();
mapper.writeValue(output, values, type);

Built-in data formats for convenience

general(): "General"
text(): "@"
numberInt(): "0"
numberFloat(): "0.00"
numberIntWithComma(): "#,##0"
numberFloatWithComma(): "#,##0.00"
date(): "yyyy-mm-dd"
dateTime(): "yyyy-mm-dd hh:mm:ss"

SpreadsheetMapper mapper = SpreadsheetMapper.builder()
        .addModule(new ExcelDateModule())
        .build();

Change Origin Address

CellAddress address = ...;
SpreadsheetMapper mapper = SpreadsheetMapper.builder()
        .origin(address)
        .origin("B2")
        .origin(1, 1)
        .build();
mapper.writeValue(output, values, type);

`BLANK`	`BLANK`	`BLANK`
`BLANK`	Column A	Column B
`BLANK`	1	2
`BLANK`	3	4

Overwrite Column Names

SpreadsheetMapper mapper = SpreadsheetMapper.builder()
        .columnNameResolver(prop -> "Overwrite " + prop.getName().toUpperCase())
        .build();
mapper.writeValue(output, values, type);

Overwrite A	Overwrite B
1	2
3	4

enum ColumnCode {
    A("Code A"), B("Code B");
    String text;
    // ...
}

@Retention(RetentionPolicy.RUNTIME)
@interface NameOf {
    ColumnCode value();
}

@DataGrid
class Entity {
    @NameOf(ColumnCode.A)
    @DataColumn("It will be overwritten")
    int a;
    @NameOf(ColumnCode.B)
    int b;
    // ...
}

ColumnNameResolver byText = AnnotatedNameResolver.forValue(NameOf.class, ColumnCode::getText);
SpreadsheetMapper mapper = SpreadsheetMapper.builder()
        .columnNameResolver(byText)
        .build();
mapper.writeValue(output, values, type);

Code A	Code B
1	2
3	4

Reading Under Control

SpreadsheetReader reader = mapper.sheetReaderFor(Entry.class);
try (SheetMappingIterator<Entry> iterator = reader.readValues(input)) {
    while (iterator.hasNext()) {
        Entry value = iterator.next();
        SheetLocation location = iterator.getCurrentLocation();
        boolean done = ...;
        if (done) {
            break;
        }
    }
}

SpreadsheetMapper mapper = SpreadsheetMapper.builder();
        .enable(SheetParser.Feature.BREAK_ON_BLANK_ROW)
        .build();

On/Off Features in SheetParser.Feature

BLANK_ROW_AS_NULL (default true)
BREAK_ON_BLANK_ROW (default false)

And more.

Supported Types

Deserialization
- java.io.File
- java.io.InputStream
- io.github.scndry.jackson.dataformat.spreadsheet.deser.SheetInput<T>
- org.apache.poi.ss.usermodel.Sheet
Serialization
- java.io.File
- java.io.OutputStream
- io.github.scndry.jackson.dataformat.spreadsheet.ser.SheetOutput<T>
- org.apache.poi.ss.usermodel.Sheet

References

License

Project is licensed under Apache License 2.0.

1. XLSX — XSSF (XML SpreadSheet Format)

2. XLS — HSSF (Horrible SpreadSheet Format)

Name		Name	Last commit message	Last commit date
Latest commit History 34 Commits
.github		.github
gradle/wrapper		gradle/wrapper
src		src
.editorconfig		.editorconfig
.gitignore		.gitignore
LICENSE		LICENSE
README.adoc		README.adoc
build.gradle.kts		build.gradle.kts
gradlew		gradlew
gradlew.bat		gradlew.bat
lombok.config		lombok.config
settings.gradle.kts		settings.gradle.kts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Jackson dataformat: Spreadsheet

Installation

Maven

Gradle

Busy Developers' Guide to Features

Reading Spreadsheet

Writing Spreadsheet

Nested Object (de)Structuring

Writing the Nested List

Annotations and Schema

Attribute Resolution

Build Your Own Styles

Change Origin Address

Overwrite Column Names

Reading Under Control

And more.

Supported Types

References

License

About

Contributors 2

Languages

License

scndry/jackson-dataformat-spreadsheet

Folders and files

Latest commit

History

Repository files navigation

Jackson dataformat: Spreadsheet

Installation

Maven

Gradle

Busy Developers' Guide to Features

Reading Spreadsheet

Writing Spreadsheet

Nested Object (de)Structuring

Writing the Nested List

Annotations and Schema

Attribute Resolution

Build Your Own Styles

Change Origin Address

Overwrite Column Names

Reading Under Control

And more.

Supported Types

References

License

About

Topics

Resources

License

Stars

Watchers

Forks

Contributors 2

Languages