Workflow submission scenarios #261

jluethi · 2022-11-14T09:16:54Z

While working out the web GUI interactions, I came up with a more detailed version of workflow submissions. I'm summarizing them here so that we have a centralized overview of the scenarios (see also the draft of the new functional specs, submission section).

1. A user submits a full workflow for the first time (already supported).
2. A user reruns a workflow from the beginning (should be supported in the 1.x release, right?).
3. A user runs a workflow until a given step (e.g. in a 4-task workflow, the user runs the first 2 tasks => running the preprocessing steps, but not the image analysis steps yet).
4. A user continues the run of a workflow (e.g. after scenario 3, the user now runs the second part of the workflow, tasks 3 & 4).
5. A user tests parameters of the workflow on a subset of the data. E.g. the user ran scenario 3 (tasks 1 & 2), then runs multiple options of task 3 on a subset of the data. These subsets are saved to temporary files, not overwriting the main OME-Zarr file => see Running a workflow on a subset of data #109.
6. A user reruns part of a workflow (e.g. reruns tasks 3 & 4). There need to be some restrictions on what can be rerun; I will cover this in a separate issue.
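
To make the difference between the full-run, partial-run and continue scenarios concrete, here is a minimal sketch of selecting a contiguous range of tasks. The function `select_tasks` and the parameters `first_task_index` / `last_task_index` are hypothetical names chosen for illustration, not an existing API.

```python
from typing import Optional


def select_tasks(
    task_list: list[str],
    first_task_index: Optional[int] = None,  # hypothetical parameter name
    last_task_index: Optional[int] = None,   # hypothetical parameter name
) -> list[str]:
    """Return the contiguous range of tasks to run (both bounds inclusive)."""
    first = 0 if first_task_index is None else first_task_index
    last = len(task_list) - 1 if last_task_index is None else last_task_index
    if not 0 <= first <= last < len(task_list):
        raise ValueError(f"Invalid task range: {first}..{last}")
    return task_list[first : last + 1]


workflow = ["task1", "task2", "task3", "task4"]

full_run = select_tasks(workflow)                        # full submission or rerun
first_half = select_tasks(workflow, last_task_index=1)   # run until a given step
second_half = select_tasks(workflow, first_task_index=2) # continue the run
```

The same call with `first_task_index=2` would also describe the partial rerun of tasks 3 & 4; the restrictions on when that is allowed are out of scope for this sketch.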

Here is a sketch of these scenarios:

There is a main workflow (see discussion here: #236) that the user sees & where the user edits parameters. They first submit tasks 1 & 2 (workflow submission ID1). Then, the user experiments with options for task 3 (workflow submission IDs 2-4). Finally, they submit the chosen parameters on the whole dataset & run tasks 3 & 4 (workflow submission ID5).

tcompa · 2023-05-23T11:47:05Z

I'm dumping here some meeting notes concerning the "running a workflow from task A to task B" feature.

Running workflow from A to B: to be more clearly defined => what do we need to clarify? User stories? Implementation approach?

User stories:

1. I’m not ready for the whole workflow yet. I just want to look at the image first (=> run 0 to n)
2. Continue a workflow: n to m
3. Something failed at step 3 (because input parameters were wrong). Let me correct parameters and rerun from there
   - Tricky rerun part: clean up old outputs? Partial failure? => not initial scope [reserved keyword argument of the task]
4. I’m just testing things (trying some parameters)

Goal: Tackle user stories 1-3

Assumptions:

Current status
- Datasets are small metadata layers (a collection of resources => paths + metadata).
- Running a workflow has an input & an output. Those are for the whole workflow => no clear access to intermediate states.
- Use case 3 is tricky => no info on what the last valid dataset is [some history is stored, but only at the end of the workflow].
- The database is never touched from the beginning to the end of the workflow (separation of concerns between runner & db).
- Job status is retrieved from disk (from the metadata.json, which is updated after every task [at least the history is updated, metadata dict updates are optional]) => this file is only used by the job monitoring endpoint.
- At the end of the workflow, the db is updated.

New
- If a workflow fails, should the metadata be updated?
- What we can do now: update to the last valid state, i.e. to the state at the end of step 2.
- Writing the last valid state
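
A minimal sketch of the "last valid state" idea, assuming the runner rewrites a `metadata.json` file after every successful task and never touches the db. The function `run_tasks`, the task callables and the exact file layout are assumptions made for illustration; only the `history` key and the file name come from the notes above.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

# A task is a (name, function) pair; the function returns optional metadata updates.
Task = tuple[str, Callable[[dict], dict]]


def run_tasks(tasks: list[Task], metadata: dict, metadata_file: Path) -> bool:
    """Run tasks in order, persisting metadata to disk after each success."""
    metadata_file.write_text(json.dumps(metadata))
    for name, func in tasks:
        try:
            update = func(metadata)
        except Exception:
            # The file on disk still holds the state after the last valid task.
            return False
        metadata = {**metadata, **update}
        metadata["history"] = metadata.get("history", []) + [name]
        metadata_file.write_text(json.dumps(metadata))
    return True


def ok(meta: dict) -> dict:
    return {}  # metadata dict updates are optional


def failing(meta: dict) -> dict:
    raise ValueError("wrong input parameters")


with tempfile.TemporaryDirectory() as tmp:
    metadata_file = Path(tmp) / "metadata.json"
    tasks = [("task1", ok), ("task2", ok), ("task3", failing), ("task4", ok)]
    succeeded = run_tasks(tasks, {}, metadata_file)
    # What a monitoring endpoint (or a db update on failure) could read back:
    last_valid = json.loads(metadata_file.read_text())
```

In this sketch a failure at step 3 leaves the history of steps 1 and 2 on disk, which is the state a later "rerun from step 3" would need to start from.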

Requirement: We need to select the correct task to restart from
- The user needs to start from the correct dataset (the output dataset).
- This will require an additional check on the I/O compatibility of tasks: we currently only check the input and output of the whole workflow, not of each task.
- Tasks can modify dataset types.

Q: Does the user need to define the output dataset? If the task defines (& changes) the output type, not necessarily
=> Tasks define their input/output
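
The per-task I/O check could look roughly like the sketch below, assuming each task declares one input type and one output type. `TaskSpec`, `check_io_chain` and the type names `"image"` / `"zarr"` are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskSpec:
    # Hypothetical description of a task: each task defines its own input/output.
    name: str
    input_type: str
    output_type: str


def check_io_chain(tasks: list[TaskSpec], dataset_type: str) -> str:
    """Check every task against the dataset type it receives.

    Returns the dataset type after the last task, or raises at the first mismatch.
    """
    current = dataset_type
    for task in tasks:
        if task.input_type != current:
            raise ValueError(
                f"{task.name} expects {task.input_type!r}, got {current!r}"
            )
        current = task.output_type  # tasks can modify the dataset type
    return current


workflow = [
    TaskSpec("convert", "image", "zarr"),
    TaskSpec("preprocess", "zarr", "zarr"),
    TaskSpec("analyze", "zarr", "zarr"),
]

# Full run starting from the raw-image dataset.
final_type = check_io_chain(workflow, "image")

# Restarting from task 2 only works with the output dataset of task 1.
restart_type = check_io_chain(workflow[1:], "zarr")
```

With this kind of check the output type follows from the tasks themselves, so the user would not necessarily need to declare it.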

The runner has no access to the db => monitoring & updates go via the metadata file

tcompa · 2023-07-10T12:39:39Z

Use case 6:
Switch back and forth between 3D and 2D OME-Zarrs, e.g. go back to 3D dataset after performing MIP.

Likely this could go through a more structured Dataset.meta attribute. Instead of just a list parallelization_level -> list_of_components, we could have something a bit more structured (possibly mimicking some part of the OME-Zarr structure).
This work would happen mostly on the task side, and possibly lead to some limited fractal-server updates.

The guiding principle in this should be that "Each attribute in metadata needs to exist somewhere else in the OME-Zarr file", and we would also need to rely on "Bootstrap metadata from an existing OME-Zarr file".
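
As an illustration of bootstrapping metadata from an existing OME-Zarr file, here is a minimal sketch that rebuilds component lists from the `plate` and `well` attributes of an OME-NGFF plate stored on disk (Zarr v2 layout, `.zattrs` files). The function name `bootstrap_components` and the shape of the returned dict are assumptions, not the actual `Dataset.meta` structure.

```python
import json
import tempfile
from pathlib import Path


def bootstrap_components(plate_dir: Path) -> dict[str, list[str]]:
    """Rebuild well/image component lists from the OME-Zarr attributes on disk."""
    plate_attrs = json.loads((plate_dir / ".zattrs").read_text())
    wells = [well["path"] for well in plate_attrs["plate"]["wells"]]
    images = []
    for well in wells:
        well_attrs = json.loads((plate_dir / well / ".zattrs").read_text())
        images += [f"{well}/{img['path']}" for img in well_attrs["well"]["images"]]
    # Hypothetical layout: parallelization level -> list of components.
    return {"well": wells, "image": images}


# Build a tiny fake plate (attributes only, no pixel data) to exercise the function.
with tempfile.TemporaryDirectory() as tmp:
    plate_dir = Path(tmp) / "plate.zarr"
    (plate_dir / "B" / "03").mkdir(parents=True)
    (plate_dir / ".zattrs").write_text(
        json.dumps({"plate": {"wells": [{"path": "B/03"}]}})
    )
    (plate_dir / "B" / "03" / ".zattrs").write_text(
        json.dumps({"well": {"images": [{"path": "0"}]}})
    )
    components = bootstrap_components(plate_dir)
```

Every value in the result exists in the OME-Zarr attributes themselves, so nothing has to be kept in sync by hand.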

Other related fractal-tasks-core issues:

jluethi · 2023-09-27T12:47:14Z

Reviewed. Functionality of 1-4 is already implemented, 5 is indeed covered with:
fractal-analytics-platform/fractal-tasks-core#342
fractal-analytics-platform/fractal-tasks-core#279

jluethi added this to the 1) Improve Workflow Flexibility & Data Lifecycle milestone Nov 14, 2022

jluethi added the Overview label Nov 14, 2022

jluethi added this to Fractal Project Management Nov 14, 2022

jluethi moved this to TODO in Fractal Project Management Nov 14, 2022

jluethi mentioned this issue Nov 14, 2022

Restrictions for rerunning partial workflows #262

Closed

jluethi added the High Priority Current Priorities & Blocking Issues label Dec 14, 2022

jluethi added Priority Important, but not the highest priority and removed High Priority Current Priorities & Blocking Issues labels Jan 11, 2023

tcompa added the High Priority Current Priorities & Blocking Issues label Mar 7, 2023

jluethi removed the High Priority Current Priorities & Blocking Issues label Mar 15, 2023

tcompa added High Priority Current Priorities & Blocking Issues and removed Priority Important, but not the highest priority labels Jul 6, 2023

This was referenced Jul 6, 2023

Simplify the function that executes a WorkflowTask list #777

Closed

Support execution of a workflow subset #783

Closed

Review I/O of tasks #785

Closed

Support "resume workflow execution" #788

Closed

tcompa mentioned this issue Jul 10, 2023

Extract attributes from ome-zarr rather than from metadata (whenever possible) fractal-analytics-platform/fractal-tasks-core#351

Closed

tcompa mentioned this issue Sep 14, 2023

UNKNOWN ERROR Original error: 'image' when rerunning a failed Cellpose task #842

Closed

tcompa added the flexibility Support more workflow-execution use cases label Sep 15, 2023

jluethi closed this as completed Sep 27, 2023

github-project-automation bot moved this from TODO to Done in Fractal Project Management Sep 27, 2023

jluethi removed this from Fractal Project Management Apr 10, 2024
