Each attribute in `metadata` needs to exist somewhere else in the OME-Zarr file #212

tcompa · 2022-11-17T14:35:03Z

This would be the best-case scenario. Let's see whether we can enforce it strictly.

jluethi · 2022-11-22T09:56:07Z

Result from some conversation with people using OME-Zarrs:

- We should not be using the OME-XML transitionary metadata. It's not meant for long-term use, and we don't need to get into that.
- We should be able to store arbitrary key-value pairs with metadata in the .zattrs. Let's think about where in the .zattrs they should go for each case and, if it's something more general, let's try to contribute it back to the spec.
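
As a rough illustration of the second point, here is a minimal sketch of adding a custom key-value pair to the `.zattrs` of a Zarr v2 group with the standard library only. The helper name `update_zattrs`, the key `fractal_custom` and its content are hypothetical and are not something agreed upon in this thread; the set of reserved keys is also just an assumption about which spec-defined blocks should not be overwritten.

```python
import json
from pathlib import Path

# Assumption: top-level keys that the OME-NGFF spec already defines and that
# custom metadata should therefore not overwrite.
RESERVED_KEYS = {"multiscales", "omero", "plate", "well"}


def update_zattrs(group_dir: Path, key: str, value: dict) -> dict:
    """Merge one custom key into the .zattrs file of a Zarr v2 group."""
    if key in RESERVED_KEYS:
        raise ValueError(f"Refusing to overwrite spec-defined key {key!r}")
    zattrs_path = group_dir / ".zattrs"
    # Start from the existing attributes, if any, so nothing else is lost.
    attrs = json.loads(zattrs_path.read_text()) if zattrs_path.exists() else {}
    attrs[key] = value
    zattrs_path.write_text(json.dumps(attrs, indent=4))
    return attrs
```

A call could look like `update_zattrs(Path("plate.zarr/B/03/0"), "fractal_custom", {"channel_ids": ["A01_C01"]})`, where the path is only a placeholder.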

tcompa · 2022-11-22T10:06:25Z

> We should not be using the OME-XML transitionary metadata.

Are we currently using any of those? (or: is there an easily-accessible list?)

jluethi · 2022-11-22T10:12:18Z

> Are we currently using any of those? (or: is there an easily-accessible list?)

No, we aren't using it. It would have been one of the options to store additional metadata, see details here: https://ngff.openmicroscopy.org/latest/#bf2raw

I'm glad we don't need to get into this :)

tcompa · 2022-11-22T10:16:55Z

Current items that are in metadata and that we could think about moving to .zattrs:

1. Lists of plates/wells/images. These are already present in the .zattrs files, but it is convenient to duplicate them in the metadata to avoid a lot of parsing and string styling here and there.
2. Number of pyramid levels. This is also already available in the .zattrs files. If we accept that tasks should start by reading this information, then we can remove it from the metadata and only use the OME-NGFF information.
3. Coarsening factor. This is not directly available, but it can be retrieved with a bit of work from the scale transformations. It would be a good candidate (in my view) for being stored in the .zattrs, but I'm not sure I would suggest it as part of the specs - since it would mix with the scale information.
4. Channel list. This is totally fractal-custom, and it currently has an (ordered!) list like ["A01_C01", ...]. To be updated based on Refactor: How do we refer to channels? #211.

Then there are a few attributes which we use to smoothly propagate some parameters within pairs of associated tasks (server-side ref fractal-analytics-platform/fractal-server#6). This is related to #177, that is, the choice of how to split arguments into two sets: the ones to be filled in automatically by fractal (e.g. component will most likely be here) and the ones to be specified as part of args.

Currently, we use
5. original_paths, to pass the image folders from zarr-creation task to yokogawa_to_zarr
6. replicate_zarr attributes (again, there are some paths of the original file to replicate), which are then used for the MIP task. This one we could in principle refactor, since it is probably a bit redundant.

Unless I've missed something, this is the current list.
Note that some of these parameters already exist in two forms, for (non-)multiplexing cases.
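
For the pyramid-level and coarsening-factor items above, a minimal sketch of what "reading it from the OME-NGFF information" could look like. It assumes the `multiscales` layout of OME-NGFF 0.4 (each dataset carries a `scale` entry in `coordinateTransformations`) and that the last two axes are y and x. The function names and the example values are made up for illustration.

```python
from typing import Any


def get_scale(dataset: dict[str, Any]) -> list[float]:
    """Return the `scale` vector of one multiscale dataset."""
    for transformation in dataset["coordinateTransformations"]:
        if transformation["type"] == "scale":
            return transformation["scale"]
    raise ValueError(f"No scale transformation in dataset {dataset.get('path')!r}")


def num_pyramid_levels(zattrs: dict[str, Any]) -> int:
    """Number of pyramid levels = number of datasets in the first multiscale."""
    return len(zattrs["multiscales"][0]["datasets"])


def coarsening_xy(zattrs: dict[str, Any]) -> int:
    """Infer the XY coarsening factor from the scales of levels 0 and 1."""
    datasets = zattrs["multiscales"][0]["datasets"]
    if len(datasets) < 2:
        raise ValueError("At least two pyramid levels are needed")
    scale_0, scale_1 = get_scale(datasets[0]), get_scale(datasets[1])
    # Assumption: the last two axes are (y, x).
    ratio_y = scale_1[-2] / scale_0[-2]
    ratio_x = scale_1[-1] / scale_0[-1]
    if round(ratio_y) != round(ratio_x):
        raise ValueError(f"Inconsistent coarsening: y={ratio_y}, x={ratio_x}")
    return round(ratio_x)


# Made-up example with three levels and a coarsening factor of 2.
EXAMPLE_ZATTRS = {
    "multiscales": [
        {
            "datasets": [
                {"path": str(level), "coordinateTransformations": [
                    {"type": "scale", "scale": [1.0, 0.1625 * 2**level, 0.1625 * 2**level]}
                ]}
                for level in range(3)
            ]
        }
    ]
}
```

This only compares the first two levels; a stricter version could check that every pair of consecutive levels gives the same ratio.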

tcompa · 2022-11-22T10:20:47Z

Broadly speaking, I think that adding some key-value pairs to some .zattrs could be extremely helpful, e.g., for #211 (and maybe somehow also for #199 or #200).

jluethi · 2022-11-22T11:01:04Z

From the task side, it's very attractive if a task only needs a path to an OME-Zarr file and content parameters (like a model choice for cellpose), and can get the rest of the metadata directly from the OME-Zarr file.

Things like the list of plates, wells (& images?) are not that though, because they are used by Fractal, not by the individual task, right? If it makes sense to have some of that metadata on the Fractal side, that doesn't take away anything from the generalizability of the tasks.

=> Things that are needed for the tasks to run should be read from the OME-Zarr metadata wherever possible. Additional metadata is OK to be Fractal-specific. For me, this goes hand in hand with your suggestion of separating component info from args; maybe there are some inputs like component that we keep Fractal-specific.

Concretely:

- Pyramid levels may be a nice thing to load from the metadata. It would be nice if tasks can reliably read it from the OME-Zarr (but not urgent).
- Coarsening factor: Hmm, I don't think we should use OME-Zarr metadata for things we don't envision contributing back to the spec. Fine if we think that it will make it into the spec. But for this case: either we continue to use our metadata (default) or we process it reliably from the existing metadata in the OME-Zarr.
- Let's tackle the channel list! That will be an interesting question of where it should go and whether we add information like the light-path specific parts (e.g. A01_C01) or some other metadata as key-value pairs to the OME-Zarr. This again strengthens the point from Refactor: How do we refer to channels? #211 that we may need multiple ways to refer to channels, also depending on what metadata is available in the OME-Zarr.
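
To make the "multiple ways to refer to channels" idea a bit more concrete, here is a hedged sketch of a lookup that accepts either a channel label or a custom identifier stored as an extra key-value pair. It assumes channels live in the `omero` block of the image `.zattrs` with a `label` per channel, as in the OME-NGFF `omero` metadata. The extra key `custom_id` and the function name are hypothetical; where such a key should actually go is exactly the open question here.

```python
from typing import Any, Optional


def find_channel_index(
    zattrs: dict[str, Any],
    *,
    label: Optional[str] = None,
    custom_id: Optional[str] = None,
    custom_key: str = "custom_id",  # hypothetical key, not part of the spec
) -> int:
    """Return the index of the unique channel matching a label or a custom ID."""
    if (label is None) == (custom_id is None):
        raise ValueError("Specify exactly one of `label` and `custom_id`")
    key, wanted = ("label", label) if label is not None else (custom_key, custom_id)
    channels = zattrs.get("omero", {}).get("channels", [])
    matches = [i for i, channel in enumerate(channels) if channel.get(key) == wanted]
    if len(matches) != 1:
        raise ValueError(f"Expected one channel with {key}={wanted!r}, found {len(matches)}")
    return matches[0]


# Made-up example metadata with two channels.
EXAMPLE_IMAGE_ZATTRS = {
    "omero": {
        "channels": [
            {"label": "DAPI", "custom_id": "A01_C01"},
            {"label": "GFP", "custom_id": "A02_C02"},
        ]
    }
}
```

Because the channel order in the array follows the order of this list, the returned index could be used directly to select the channel axis.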

tcompa · 2022-12-01T16:20:36Z

Note that once PR #239 is merged we won't have channel_list in metadata any more.

tcompa · 2023-09-15T11:39:41Z

Yesterday we re-discussed this issue, and we identified multiple (current) uses of metadata. Most of them should be deprecated, in favor of more specific sources of information.

- Provide read/write access to component list ("read" from fractal-server, "write" from a combination of tasks and fractal-server). We plan to defer this functionality to a new task (ref Introduce import-ome-zarr task #521). TBD later if this has to be a standard task or the "init" phase of a new, more complex, task object.
- Store dataset information (e.g. coarsening_xy and the number of pyramid levels). This use is now deferred to Extract attributes from ome-zarr rather than from metadata (whenever possible) #351; and then this use should be deprecated, and these parameters should not belong to Dataset.meta.
- Store the dataset history -> this should not pass through meta any more - ref Move Dataset.meta["history"] into Dataset.history fractal-server#838.
- Exchange information in a pair of tasks - ref How do task pairs share information? #299. This is a way of using meta as a temporary buffer.

Since each use of metadata is related to a specific issue, I'm closing this one.

jluethi added this to Fractal Project Management Nov 17, 2022

jluethi moved this to TODO in Fractal Project Management Nov 17, 2022

tcompa mentioned this issue Dec 1, 2022

Refactor: How do we refer to channels? #211

Closed

jluethi mentioned this issue Jan 20, 2023

Where does omero metadata go? #232

Closed

jluethi mentioned this issue Apr 14, 2023

When should tasks write metadata files? Should they write them to disk at all? fractal-analytics-platform/fractal-server#621

Closed

4 tasks

This was referenced Jul 10, 2023

Workflow submission scenarios fractal-analytics-platform/fractal-server#261

Closed

Extract attributes from ome-zarr rather than from metadata (whenever possible) #351

Closed

tcompa added the flexibility Support more workflow-execution use cases label Sep 15, 2023

tcompa mentioned this issue Sep 15, 2023

Introduce import-ome-zarr task #521

Closed

tcompa closed this as completed Sep 15, 2023

github-project-automation bot moved this from TODO to Done in Fractal Project Management Sep 15, 2023

jluethi removed this from Fractal Project Management Apr 10, 2024
