Let’s get rid of the Spread Sh!t.
This Jackson extension library is a component that memory-efficiently reads and writes the SpreadsheetML format[1] using Apache POI and StAX via Jackson abstraction. — Implemented as full streaming implementation, which allows full access (streaming, data-binding, and tree-model).
💬
|
The same abstraction is provided for the Horrible Microsoft Excel format[2]. However, not based on Event API, so may not be suitable for large .xls files. |
📙
|
This project is currently under development. If you are interested, you can always use the latest snapshot artifacts from OSS Sonatype. |
<repository>
<id>sonatype-snapshots</id>
<url>https://s01.oss.sonatype.org/content/repositories/snapshots/</url>
</repository>
<dependency>
<groupId>io.github.scndry</groupId>
<artifactId>jackson-dataformat-spreadsheet</artifactId>
<version>0.0.1-SNAPSHOT</version>
</dependency>
Want to use HSSF and XSSF read and write spreadsheets in a hurry? This guide is for you. — Apache POI’s Quick Guide.
First, let’s look at input-entries.xlsx file to read.
Column A | Column B |
---|---|
1 |
2 |
3 |
4 |
Then let’s define the Entry
class annotated @DataGrid
.
@DataGrid
public class Entry {
private int a;
private int b;
// Constructors, Getters, Setters and toString
}
We use SpreadsheetMapper
to read the file into an Entry
object, so let’s set it up now.
SpreadsheetMapper mapper = new SpreadsheetMapper();
File file = new File("input-entries.xlsx");
Once we have SpreadsheetMapper
configured, simply use readValue
.
Entry value = mapper.readValue(file, Entry.class);
// Output: Entry(a=1, b=2)
You can use readValues
to read the entire list.
List<Entry> values = mapper.readValues(file, Entry.class);
// Output: [Entry(a=1, b=2), Entry(a=3, b=4)]
By default, it reads the first sheet.
If you need to find and read a specific sheet by name or index, you can use SheetInput
.
String sheetName = "Entries";
SheetInput<File> input = SheetInput.source(file, sheetName);
List<Entry> values = mapper.readValues(input, Entry.class);
// Output: [Entry(a=1, b=2), Entry(a=3, b=4)]
We also use SpreadsheetMapper
to write an Entry
object to the file.
Just call writeValue
.
SpreadsheetMapper mapper = new SpreadsheetMapper();
File file = new File("output-entries.xlsx");
Entry value = new Entry(1, 2);
mapper.writeValue(file, value);
The output of the above in the file will be:
a |
b |
1 |
2 |
This time, let’s write a list.
List<Entry> values = Arrays.asList(new Entry(1, 2), new Entry(3, 4));
mapper.writeValue(file, values);
java.lang.IllegalArgumentException: `valueType` MUST be specified to write a value of a Collection or array type
Oh, Sheet!
Okay.
Let’s add a valueType
to the 3rd parameter.
Class<Entry> valueType = Entry.class;
mapper.writeValue(file, values, valueType);
a |
b |
1 |
2 |
3 |
4 |
Now let’s annotate on properties with @DataColumn
to write each column’s name.
@DataGrid
public class Entry {
@DataColumn("Column A")
private int a;
@DataColumn("Column B")
private int b;
// Constructors, Getters, Setters and toString
}
Column A |
Column B |
1 |
2 |
3 |
4 |
And also you can use SheetOutput
to write the sheet name.
By default, the created sheet is named "Sheet" + indexOfSheet
.
String sheetName = "Entries";
SheetOutput<File> output = SheetOutput.target(file, sheetName);
mapper.writeValue(output, values, Entry.class);
Spreadsheets are a flat data representation of a 2-dimensional structure, but most of the POJOs we have take complex nested structures. Of course, it provides (de)structuring these complex models, as you can see below:
ID | NAME | ZIPCODE | ADDRESS LINE 1 | ADDRESS LINE 2 | DESIGNATION | SALARY |
---|---|---|---|---|---|---|
1 |
John Doe |
12345 |
123 Main St. |
Anytown, USA |
CEO |
$300,000.00 |
As we are Object-Oriented Developers, you probably don’t want to manage a lot of column data in a flat model.
Let’s first define the well-structured Employee
class.
@DataGrid
class Employee {
int id;
String name;
Address address;
Employment employment;
// ...
}
class Address {
String zipcode;
AddressLines addressLines;
// ...
}
class AddressLines {
String line1;
String line2;
// ...
}
class Employment {
Designation designation;
long salary;
// ...
}
enum Designation {
CEO, CTO, SM, ARCH, INT
}
No additional code is required when deserializing data into nested structures, you can deserialize the same way.
Employee value = mapper.readValue(input, Employee.class);
Here’s the Employee
object that output will be:
Employee( id=1 name=John doe address=Address( zipcode=12345 addressLines=AddressLines( line1=123 Main St. line2=Anytown, USA ) ) employment=Emeployment( designation=CEO salary=300000 ) )
Just as can be deserialized without any additional code, you can serialize these nested objects.
mapper.writeValue(output, values, Employee.class);
Although limited, it supports serializing nested list structures. However, deserializing is not supported due to implementation complexity.
@DataGrid
class Outer {
@DataColumn("A") int a;
List<Inner> inners;
@DataColumn("E") int e;
// ...
}
class Inner {
@DataColumn("B") int b;
@DataColumn("C") int[] c;
@DataColumn("D") int d;
// ...
}
Outer value = new Outer(
1, Arrays.asList(
new Inner(2, new int[]{3, 4}, 5),
new Inner(6, new int[]{7, 8}, 9)),
10);
mapper.writeValue(output, value);
This will lead to the desired result like this:
A | B | C | D | E |
---|---|---|---|---|
1 |
2 |
3 |
5 |
10 |
|
|
4 |
|
|
|
6 |
7 |
9 |
|
|
|
8 |
|
|
For most column schema generation requirements, @DataGrid
and @DataColumn
annotations will provide all you need.
The two have almost similar attributes, so only listing attributes of @DataColumn
.
@DataColumn
-
String value()
: The column’s name to write in header -
String style()
: Thename
of thecellStyle
to apply to the column, it should be registered withStylesBuilder
. -
String headerStyle()
: Same asstyle()
, but only applied to the column’s header. -
int width()
: Fixed width for the column. -
OptBoolean autoSize()
: Adjusts the column width to fit the contents. Not accurate for fullwidth forms like the CJK characters. -
int minWidth()
: Minimum width of the auto-sized column. -
int maxWidth()
: Maximum width of the auto-sized column.
Column schema uses particular order designed to allow for value overrides. Attributes are considered in the following order:
-
@DataColumn
's attribute on property. -
@DataGrid
's attribute on the class in which the property is declared. -
@DataGrid
's attribute of the class of which the class in which the property is declared is a member.
For a better understanding, let’s look at the code below:
@DataGrid(autoSizeColumn = OptBoolean.FALSE)
class Foo {
@DataColumn(style = "Foo_a_style")
int a;
@DataColumn(headerStyle = "Foo_b_headerStyle")
int b;
// ...
}
@DataGrid(columnHeaderStyle = "Bar_columnHeaderStyle")
class Bar {
Foo foo;
// ...
}
@DataGrid(columnStyle = "Baz_columnStyle", autoSizeColumn = OptBoolean.TRUE)
class Baz {
Foo foo;
// ...
}
Schema is generated and used when (de)serializing within SpreadsheetMapper
, but you can generate using sheetSchemaFor
.
SpreadsheetSchema foo = mapper.sheetSchemaFor(Foo.class);
SpreadsheetSchema bar = mapper.sheetSchemaFor(Bar.class);
SpreadsheetSchema baz = mapper.sheetSchemaFor(Baz.class);
Column attributes of schemas generated for each of the above types are as follows:
Column | Style | Header Style | Auto Size |
---|---|---|---|
foo.a |
|
|
|
foo.b |
|
|
|
bar.foo.a |
|
|
|
bar.foo.b |
|
|
|
baz.foo.a |
|
|
|
baz.foo.b |
|
|
|
@DataGrid(columnHeaderStyle = "baseHeader")
class Employee {
// ...
Address address;
Employment employment;
// ...
}
class Address {
@DataColumn(value = "ZIPCODE", style = "zipcode")
String zipcode;
// ...
}
@DataGrid(columnHeaderStyle = "employmentHeader")
class Employment {
// ...
@DataColumn(value = "SALARY", style = "salary", headerStyle = "salaryHeader")
long salary;
// ...
}
StylesBuilder builder = new StylesBuilder()
.cellStyle(/*name*/ "baseHeader")
.border().thin()
.fillForegroundColor(IndexedColors.GREY_25_PERCENT)
.fillPattern().solidForeground()
.font().bold().end()
.end()
.cellStyle(/*name*/ "employmentHeader", /*cloneStyleFrom*/ "baseHeader")
.fillForegroundColor(198, 239, 206)
.end()
.cellStyle("salaryHeader", "baseHeader")
.fillForegroundColor(0xFFC7CE)
.font().bold().color(0x9C0006).end()
.end()
.cellStyle("salary")
.dataFormat("[Red][>=100000]$#,##0.00;$#,##0.00")
.end()
.cellStyle("zipcode")
.dataFormat().text()
.end();
SpreadsheetMapper mapper = SpreadsheetMapper.builder()
.stylesBuilder(builder)
.build();
mapper.writeValue(output, values, type);
-
general()
:"General"
-
text()
:"@"
-
numberInt()
:"0"
-
numberFloat()
:"0.00"
-
numberIntWithComma()
:"#,##0"
-
numberFloatWithComma()
:"#,##0.00"
-
date()
:"yyyy-mm-dd"
-
dateTime()
:"yyyy-mm-dd hh:mm:ss"
SpreadsheetMapper mapper = SpreadsheetMapper.builder()
.addModule(new ExcelDateModule())
.build();
CellAddress address = ...;
SpreadsheetMapper mapper = SpreadsheetMapper.builder()
.origin(address)
.origin("B2")
.origin(1, 1)
.build();
mapper.writeValue(output, values, type);
|
|
|
|
Column A |
Column B |
|
1 |
2 |
|
3 |
4 |
SpreadsheetMapper mapper = SpreadsheetMapper.builder()
.columnNameResolver(prop -> "Overwrite " + prop.getName().toUpperCase())
.build();
mapper.writeValue(output, values, type);
Overwrite A |
Overwrite B |
1 |
2 |
3 |
4 |
enum ColumnCode {
A("Code A"), B("Code B");
String text;
// ...
}
@Retention(RetentionPolicy.RUNTIME)
@interface NameOf {
ColumnCode value();
}
@DataGrid
class Entity {
@NameOf(ColumnCode.A)
@DataColumn("It will be overwritten")
int a;
@NameOf(ColumnCode.B)
int b;
// ...
}
ColumnNameResolver byText = AnnotatedNameResolver.forValue(NameOf.class, ColumnCode::getText);
SpreadsheetMapper mapper = SpreadsheetMapper.builder()
.columnNameResolver(byText)
.build();
mapper.writeValue(output, values, type);
Code A |
Code B |
1 |
2 |
3 |
4 |
SpreadsheetReader reader = mapper.sheetReaderFor(Entry.class);
try (SheetMappingIterator<Entry> iterator = reader.readValues(input)) {
while (iterator.hasNext()) {
Entry value = iterator.next();
SheetLocation location = iterator.getCurrentLocation();
boolean done = ...;
if (done) {
break;
}
}
}
SpreadsheetMapper mapper = SpreadsheetMapper.builder();
.enable(SheetParser.Feature.BREAK_ON_BLANK_ROW)
.build();
SheetParser.Feature
-
BLANK_ROW_AS_NULL
(defaulttrue
) -
BREAK_ON_BLANK_ROW
(defaultfalse
)
-
Deserialization
-
java.io.File
-
java.io.InputStream
-
io.github.scndry.jackson.dataformat.spreadsheet.deser.SheetInput<T>
-
org.apache.poi.ss.usermodel.Sheet
-
-
Serialization
-
java.io.File
-
java.io.OutputStream
-
io.github.scndry.jackson.dataformat.spreadsheet.ser.SheetOutput<T>
-
org.apache.poi.ss.usermodel.Sheet
-
Project is licensed under Apache License 2.0.