Add Pydantic model for `Channel`s (#386)
For the record, and assuming we proceed as described above: if we add a default value to one of the internal variables (e.g. `wavelength_id: str = "something"`), then this is correctly written into the JSON Schema, meaning that it could be available in fractal-web. Note, however, that this will not be used by fractal-server when populating …
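As a hedged illustration of the point above (field names are borrowed from the discussion, not the actual fractal model), a default on a Pydantic field is emitted as a `default` entry in the generated JSON Schema:

```python
from typing import Optional

from pydantic import BaseModel


class Channel(BaseModel):
    # Example field with a default, as in the comment above.
    wavelength_id: str = "something"
    label: Optional[str] = None


# Pydantic v2 exposes model_json_schema(); v1 used .schema().
if hasattr(Channel, "model_json_schema"):
    schema = Channel.model_json_schema()
else:
    schema = Channel.schema()

# The "default" key is what a frontend like fractal-web could display.
print(schema["properties"]["wavelength_id"])
```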
I guess that's a price that should be fine to pay, especially if we go the pydantic.v1 route once reasonable?
I'd even call this a potential win,
Pity, that would be neat.
That would be great to make the tasks a bit simpler, i.e. having less input validation at the start.
Can we briefly discuss this tomorrow? Not sure I fully understand this, and I'd be a big fan of the downloaded workflow actually containing all the args that were run (though if there's no way to change the defaults on the server side anymore, the risks here do decrease a bit).
I'm generally OK with moving in this direction, but this also forces us to take a more strict choice out of the two following options:

```python
from fractal_tasks_core.channels import Channel
from fractal_tasks_core.tasks.create_ome_zarr import create_ome_zarr

my_channels = [Channel(...something...)]
create_ome_zarr(..., my_channels, ...)
```
Up to now we were keeping pydantic on the …
Sure, let's rediscuss it. What I'm saying, briefly, is that if there is a default for a specific Channel attribute (e.g. …)
As of the call this morning, we'll proceed with this approach for `Channel`s.

@jluethi feel free to mention other cases where this approach may be relevant, apart from `Channel`s.
Given that the channel contains metadata that we put in the omero metadata and should be quite general, I'm happy with going with Channel here.
Ok, those would be useful to have as well eventually, let's track that separately.
I will keep an eye on this when reviewing the core tasks' default values. One example that comes to mind: could we make …
For the record, here is the schema for omero channels:

```json
{
  "type": "object",
  "properties": {
    "window": {
      "type": "object",
      "properties": {
        "end": { "type": "number" },
        "max": { "type": "number" },
        "min": { "type": "number" },
        "start": { "type": "number" }
      },
      "required": ["start", "min", "end", "max"]
    },
    "label": { "type": "string" },
    "family": { "type": "string" },
    "color": { "type": "string" },
    "active": { "type": "boolean" }
  },
  "required": ["window", "color"]
}
```

to which we are for sure adding some optional first-level properties like:
There is a (minor) decision to take, concerning the nesting of the window attributes:

```python
from typing import Optional

from pydantic import BaseModel


class ChannelWindow(BaseModel):
    min: float
    max: float
    start: float
    end: float


class Channel(BaseModel):
    wavelength_id: str
    window: ChannelWindow
    label: Optional[str]
```
… which brings us closer to the ome-ngff schema for omero channels. This choice, however, will break our current examples, because we introduced some additional first-level attributes, so that

```json
{
  ...
  "start": 0,
  ...
}
```

will need to become something like

```json
{
  "window": {"start": 0, ...}
}
```

I'm in favor of this change, to move our model closer to the specs, but it's not strictly needed; thus I can avoid introducing it if we are annoyed by the breaking change (@jluethi).
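To make the breaking change concrete, here is a minimal sketch (using simplified models in the spirit of the ones above, with numeric window fields): Pydantic coerces the new nested input into a `ChannelWindow`, while the old flat `start` key is simply not a `Channel` field anymore:

```python
from typing import Optional

from pydantic import BaseModel


class ChannelWindow(BaseModel):
    start: float
    end: float


class Channel(BaseModel):
    wavelength_id: str
    label: Optional[str] = None
    window: Optional[ChannelWindow] = None


# New nested form: the inner dict is coerced into a ChannelWindow instance.
new_style = Channel(wavelength_id="A01_C01", window={"start": 0, "end": 800})
print(type(new_style.window).__name__)

# Old flat form: "start" is not a Channel field, so with the default
# `extra` behavior it is silently ignored and window stays unset.
old_style = Channel(wavelength_id="A01_C01", start=0)
print(old_style.window)
```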
Some additional context: what we are currently doing goes in the same direction as https://github.com/JaneliaSciComp/pydantic-ome-ngff (see … for an example), which in my view is very relevant for long-term integration with ome-ngff through formal validation against known schemas (or Pydantic models). In the current (very preliminary!) status of pydantic-ome-ngff:
That said, I think it'll be worth mentioning this example on that repo at some point, to show our interest and possibly provide additional motivation. To be clear, I don't think we should depend on that package any time soon, but if it is transformed into an "official" encoding of the specs (we are not there yet), then I'd be quite interested in using it.
Some more additional context, with no need for comments; see this code block from https://github.com/czbiohub/iohub/blob/main/iohub/ngff_meta.py#L231:

```python
class WindowDict(TypedDict):
    """Dictionary of contrast limit settings"""

    start: float
    end: float
    min: float
    max: float


class ChannelMeta(MetaBase):
    """Channel display settings without clear documentation from the NGFF spec.

    https://docs.openmicroscopy.org/omero/5.6.1/developers/Web/WebGateway.html#imgdata
    """

    active: bool = False
    coefficient: float = 1.0
    color: ColorType = "FFFFFF"
    family: str = "linear"
    inverted: bool = False
    label: str = None
    window: WindowDict = None
```
I'm aware of …
Now's the time for such breaking changes :) I'm also in favor of moving closer to their model. We can't fully adopt their spec for our input, though: we will always calculate the correct min & max, never ask the user (they are based on whether the image is a uint8 or uint16). Thus, we will have a slightly different input spec than their definition for what we request as inputs. I'm also not foreseeing asking the user for … Other than that, big fan of moving closer to them and, in the far future, even depending on a package like that!
We have always been using these two default values (coefficient=1 and inverted=False), since the major channel refactor. They were also present in the first examples we discussed, and they are part of the transitional OME-NGFF omero metadata (https://docs.openmicroscopy.org/omero/5.6.1/developers/Web/WebGateway.html#imgdata). I can take them away, if there is any reason, or leave them as they are.
Yes, indeed. We will have to set them as optional, and then assign them inside our own logic. But that's a reasonable trade-off, IMO.
This will also be the case for …
I'd leave them hard-coded until we figure out if we want to do something with them
Agreed, that's a good trade-off for these things! Same for …
Worth mentioning: as of 153fb06, we have a test for … This makes it clear, once again, that introducing a deeper integration of the OME-NGFF schemas will be a very interesting idea for the future.
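The actual test at 153fb06 is not reproduced here; a hand-rolled sketch of the kind of compliance check it could perform, based only on the schema quoted earlier in this thread (`window` and `color` required, numeric window bounds), might look like:

```python
def check_omero_channel(channel: dict) -> list:
    """Return a list of problems for one omero-channel dict, following the
    schema quoted above: `window` and `color` are required, and `window`
    must carry numeric start/end/min/max."""
    problems = []
    for key in ("window", "color"):
        if key not in channel:
            problems.append(f"missing required key: {key}")
    window = channel.get("window", {})
    for key in ("start", "end", "min", "max"):
        if not isinstance(window.get(key), (int, float)):
            problems.append(f"window.{key} is not a number")
    return problems


ok = {"color": "00FFFF", "window": {"start": 110, "end": 800, "min": 0, "max": 65535}}
bad = {"window": {"start": 110}}
print(check_omero_channel(ok))   # -> []
print(check_omero_channel(bad))
```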
Very nice!
Quick question: should we fail when we try processing an existing zarr file that does not comply (i.e. that doesn't have the …)? (I guess the latter is true, or we will be a bit more strict in what we can process.)
As far as inputs to Fractal tasks are concerned, the window should be something users can optionally specify. They may specify a start & end (which affects display of the image). I thought the omero metadata supports not having start & end specified; if they are not specified, the napari viewer just handles that, as far as I know. Thus, we only optionally ask for start & end from the user. We always provide a window, but so far this window always included min & max, while start & end were optional.
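A sketch of that logic (helper name and dtype table are hypothetical, not the actual fractal-tasks-core code): min & max always derive from the image dtype, while start & end are optional user inputs that fall back to the full range:

```python
# Hypothetical dtype-to-range table; real code could use numpy's iinfo.
DTYPE_RANGES = {"uint8": (0, 255), "uint16": (0, 65535)}


def build_window(dtype: str, start=None, end=None) -> dict:
    """Build an omero-style window dict: min/max come from the dtype,
    start/end come from the user when given, else span the full range."""
    lo, hi = DTYPE_RANGES[dtype]
    return {
        "min": lo,
        "max": hi,
        "start": lo if start is None else start,
        "end": hi if end is None else end,
    }


print(build_window("uint16", start=110, end=800))
print(build_window("uint8"))
```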
Also, the omero metadata is labeled as transitional, see https://ngff.openmicroscopy.org/latest/index.html#omero-md. Thus, I would not expect that all OME-Zarrs we ever process will have omero metadata, so I wouldn't be too strict here.
What about an existing zarr? Does it have to have windows specified? EDIT: the answer was in the second comment.
Agreed. This is the current behavior (in the PR) for when we create a zarr ourselves. |
I'm now merging #410, but another caveat is due (and we can continue in a different issue if it requires a discussion) about how to handle existing OME-Zarrs. Right now, in #410, I did use the new …
For case 1, it's OK to be as strict as possible (that is, a user cannot include arbitrary metadata in a newly created OME-Zarr), but this could be an issue for case 2. What should we do if we receive an OME-Zarr created outside Fractal that includes some non-standard metadata? With the current version of #410, all these metadata would be discarded upon reading from disk. This is not a severe issue at the moment, because we only read the channels metadata (in case 2) and never write them back to disk. It would become an issue if we have a task that also writes the channel metadata back to disk, because in that case we would be losing some of the original non-standard metadata.
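For what it's worth, Pydantic itself offers a middle ground here; a sketch (not the #410 implementation) where unknown first-level keys survive a read/dump round-trip via `extra="allow"`:

```python
from pydantic import BaseModel


# `extra="allow"` keeps unknown keys on the model instead of dropping them.
class Channel(BaseModel, extra="allow"):
    wavelength_id: str


# An OME-Zarr created outside Fractal, with a non-standard key:
data = {"wavelength_id": "A01_C01", "vendor_specific_key": "xyz"}
channel = Channel(**data)

# Dump back to a dict (v2: model_dump; v1: dict): the extra key survives,
# so writing the channel metadata back to disk would not lose it.
dump = channel.model_dump() if hasattr(channel, "model_dump") else channel.dict()
print(dump)
```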
Good points. There are 2 areas where this may become relevant: …
For the record, what was described in #386 (comment) is now tested as part of fractal-demos, with create-ome-zarr arguments that look like:

```json
{
  "allowed_channels": [
    {
      "color": "00FFFF", // renamed from `colormap` to `color`, as per specs
      "wavelength_id": "A01_C01",
      "label": "DAPI",
      "window": { // this nesting of `window` is new
        "start": 110,
        "end": 800
      }
    },
    {
      "color": "FF00FF",
      "wavelength_id": "A01_C02",
      "label": "nanog",
      "window": {
        "start": 110,
        "end": 290
      }
    },
    {
      "color": "FFFF00",
      "wavelength_id": "A02_C03",
      "label": "Lamin B1",
      "window": {
        "start": 110,
        "end": 1600
      }
    }
  ],
  "coarsening_xy": 2,
  "num_levels": 5,
  "image_extension": "png"
}
```
Branching from #377.

For the moment I could only obtain something interesting via pydantic, while I got nowhere when trying with `dataclasses` or plain classes. A basic example would be this script, which prints this schema.

Notes:

- Attribute access becomes `channel.label`, rather than `channel["label"]` (not a big deal, but propagating this into the `fractal_tasks_core.lib` subpackage could be annoying, because suddenly any other package using our `lib` will have to depend on pydantic at runtime).
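The original script and schema are linked above rather than inlined; a minimal reconstruction of the idea (model fields assumed from this thread) might look like:

```python
import json
from typing import Optional

from pydantic import BaseModel


class Channel(BaseModel):
    wavelength_id: str
    label: Optional[str] = None


# Attribute access, as mentioned in the note above:
channel = Channel(wavelength_id="A01_C01", label="DAPI")
print(channel.label)

# Print the generated JSON Schema (v2: model_json_schema; v1: schema):
if hasattr(Channel, "model_json_schema"):
    schema = Channel.model_json_schema()
else:
    schema = Channel.schema()
print(json.dumps(schema, indent=2))
```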