[Bug] Display incorrect datatype in partitioned parquet file #286

JonasTan2015 · 2024-10-17T13:31:02Z

Background

I have a Parquet file with the following schema:

id: varchar,
date: varchar

When I put the file in a partitioned directory, tst-data/date=2024-01-01/file.parquet, and opened it in Tad, the UI showed the date column's data type as Date.

But when I moved the same file to a non-partitioned directory, tst-data/file.parquet, Tad showed the date column data type was varchar.
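
To illustrate how the directory name alone could produce a Date type, here is a minimal sketch of Hive-style partition type inference. This is an assumption about the mechanism, not Tad's actual code; the helper name `infer_partition_types` and the type labels are hypothetical.

```python
from datetime import date
from typing import Dict


def infer_partition_types(path: str) -> Dict[str, str]:
    """Guess a type for each key=value directory segment (hypothetical helper)."""
    inferred = {}
    for segment in path.split("/")[:-1]:  # skip the file name itself
        if "=" not in segment:
            continue  # not a Hive-style partition directory
        key, value = segment.split("=", 1)
        try:
            date.fromisoformat(value)  # value parses as an ISO date
            inferred[key] = "date"
        except ValueError:
            inferred[key] = "varchar"
    return inferred


# The two locations from this report
partitioned = infer_partition_types("tst-data/date=2024-01-01/file.parquet")
flat = infer_partition_types("tst-data/file.parquet")
```

With inference like this, the partitioned path yields a date-typed `date` column purely from the directory name, while the flat path yields nothing, which would match the difference I saw between the two locations.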

This was misleading. When I viewed a partitioned parquet file generated by my Apache Spark job, I thought the job had written the wrong data type, and it took me some time to figure out that the discrepancy came from Tad.

Expected Behavior

As a file viewer, Tad should display data types exactly as they are stored in the file and should not infer them from partition directory names.
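
A rough sketch of the behavior I would expect, assuming the viewer has both the file's own schema and any types guessed from the path. The function `displayed_types` and the dictionaries are hypothetical and only illustrate the precedence rule, not how Tad is implemented.

```python
def displayed_types(file_schema, partition_types):
    """Return the types to show, letting the file's own schema win (hypothetical)."""
    shown = dict(partition_types)  # start from path-derived guesses
    shown.update(file_schema)      # types stored in the file override them
    return shown


# Schema stored in the parquet file from this report
file_schema = {"id": "varchar", "date": "varchar"}
# Type guessed from the directory name date=2024-01-01
partition_types = {"date": "date"}

shown = displayed_types(file_schema, partition_types)
```

Here `shown["date"]` stays `"varchar"`, because the column exists in the file and the file's schema takes precedence over anything derived from the directory name.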
