Delayed tasks and branched workflows

Daryll_Strauss · January 29, 2015, 8:38pm

Hi Fireworks Folks,

We’re building an application that manages data and processing within a facility. I’ve started looking at Fireworks as a possible solution to some of our workflow tasks. The philosophy and toolset seem to match really well with our development. I’ve been reading the docs and running the tutorials, but I’m left with a few questions for which I’m hoping you can suggest some approaches.

The first question is how to handle “human” tasks. An example of one of our human tasks would be loading a tape. We want to execute a workflow where some of the data is currently offline. I’d like to have a task that basically pauses the workflow and asks the human to load the tape. When the tape is loaded we restart the workflow and finish the processing.

One option seems to be to fizzle the workflow and then resume it. The problem is that fizzle seems too harsh. Since any exception is shown as a fizzle, it might be problematic to mix pauses with real failures.

Another option would be to have a task that just went to sleep and woke up periodically to check external sources to see if the task is done. My concern there is that I don’t want to hold up a compute resource when the task is sleeping. Could I lower the priority on that task and let other tasks run? Fundamentally it feels sort of wrong to leave a task sitting idle like that.

Is there a better approach?

I’m also wondering if there’s some way to do branched workflows. For example, I might want to have one path that occurs if the task executes successfully and a different branch if the task fails. The failure path might attempt to perform some self healing, notify someone, etc, before it loops back to retry the task. Another example is a conditional task that looks at its input and decides which branch of a workflow to take. My failure path might keep a counter of the number of times a task is retried and give up if it has tried too many times.

It seems like I could create a set of fireworks templates that break up the task into before, doit, heal, and after, and then have the workflow launch before, which eventually launches doit. If doit works, it launches after. If it doesn’t, it launches heal, which again launches doit. If they passed around a retry counter, the heal routine could eventually fail if the retry counter was too high. That just seems fairly complicated and would make monitoring much more difficult, since you’d have a bunch of separate fireworks to track for the original workflow.

Is there a way to create conditional branches in the workflow?

Thanks,

Daryll

Anubhav_Jain · January 30, 2015, 12:40am

Hi Daryll,

Thanks for reaching out -

In terms of “pausing” a workflow to wait for a human task, have you considered the “defuse” and “reignite” options? I am now coming to regret the silly names, but this is basically just pause and restart (see http://pythonhosted.org//FireWorks/reference.html#interpretation-of-state-of-fws-and-wfs). A defused Firework will not get mixed up with a “fizzled” (failed) one, so you won’t be confusing real failures with the innocent pauses. A Firework can be defused manually (through the command line) or automatically after executing the previous Firework. For the manual option, see the docs:

http://pythonhosted.org/FireWorks/defuse_tutorial.html

For the automatic option, your Firework can return a FWAction with defuse_children set to True:

http://pythonhosted.org//FireWorks/guide_to_writing_firetasks.html

Thus, in this latter case your job can automatically pause itself, and then someone can “reignite” it when ready.
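
A minimal sketch of the automatic option, under stated assumptions: the spec keys `tape_id` and `tape_online` and the helper names are hypothetical, and `tape_online` stands in for whatever real check the facility uses. Only `FWAction` and its `stored_data` and `defuse_children` arguments come from FireWorks. The import is done lazily so the decision logic can be read and exercised on its own.

```python
def request_tape_kwargs(tape_id, tape_online):
    """Build keyword arguments for FWAction (hypothetical helper).

    If the tape is offline, ask FireWorks to defuse (pause) the child
    Fireworks so they wait until someone reignites them.
    """
    if tape_online:
        return {"stored_data": {"tape_id": tape_id, "status": "online"}}
    return {
        "stored_data": {"tape_id": tape_id, "status": "waiting for operator"},
        "defuse_children": True,
    }


def pause_for_tape_action(fw_spec):
    """Logic one might place in a Firetask's run_task(self, fw_spec)."""
    from fireworks import FWAction  # lazy import; requires FireWorks

    tape_id = fw_spec["tape_id"]                 # assumed spec key
    online = fw_spec.get("tape_online", False)   # stand-in for a real check
    return FWAction(**request_tape_kwargs(tape_id, online))
```

Once the operator has loaded the tape, the paused Fireworks can be resumed with the reignite command described in the defuse tutorial linked above (for example `lpad reignite_fws -i <fw_id>`).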

There are also options for modifying the workflow automatically in response to the calculation behavior. This again involves using the FWAction object at the end of a task and the docs are here:

http://pythonhosted.org//FireWorks/guide_to_writing_firetasks.html

In particular, the “detour” option can be used for self-healing behavior. With this you can run additional Workflow(s) for the healing, and FireWorks will bring you back into your “main” workflow when they finish. Finally, note that there is an “_allow_fizzled_parents” option in the spec which will run the next job in the Workflow even if the parent is FIZZLED. This lets you implement automatic error handling even for FIZZLED jobs; see:

http://pythonhosted.org//FireWorks/reference.html#reserved-keywords-in-fw-spec
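
As a rough illustration of the heal-and-retry idea with a retry counter, here is one possible sketch. The `retries` spec key, the limit of 3, the echo command, and the function names are all assumptions made for the example. Because a Firework that raises an exception cannot return an FWAction, this function would have to be called either from a task that catches its own failure, or from a child Firework whose spec sets `_allow_fizzled_parents` to True.

```python
MAX_RETRIES = 3  # assumed limit, not a FireWorks default


def should_retry(retries, max_retries=MAX_RETRIES):
    """True while the retry counter is still below the limit."""
    return retries < max_retries


def heal_and_retry_action(fw_spec, doit_task):
    """Return an FWAction that detours through heal, then doit again.

    doit_task is the Firetask to rerun after healing.
    """
    from fireworks import Firework, FWAction, ScriptTask, Workflow

    retries = fw_spec.get("retries", 0)  # assumed spec key
    if not should_retry(retries):
        # An uncaught exception marks this Firework as FIZZLED.
        raise RuntimeError("giving up after %d retries" % retries)

    # Placeholder healing step; a real one might remount storage, etc.
    heal = Firework([ScriptTask.from_str("echo 'healing'")], name="heal")
    # Pass the incremented counter along to the retried task.
    retry = Firework([doit_task], spec={"retries": retries + 1},
                     name="doit (retry)")
    detour = Workflow([heal, retry], {heal: [retry]})
    # After the detour finishes, the original children run as usual.
    return FWAction(detours=[detour])
```

Since the detour is attached to the original workflow, the heal and retry steps stay visible as part of that one workflow rather than as separate ones to track.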

For branching, you can use the “additions” option, and tailor your “additions” based on what branch to take. You can also break out of the normal course of the Workflow and start something new by combining defuse_children=True with additions=[new_workflows].
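
A small sketch of a conditional branch built on “additions”, with several assumptions: the `record_count` spec key, the threshold, the branch names, and the echo commands are invented for illustration. The task inspects its input, picks a branch, and adds the matching Firework to the workflow.

```python
def choose_branch(record_count, threshold=1000):
    """Pick a branch name from the input size (hypothetical rule)."""
    return "bulk" if record_count > threshold else "simple"


def branch_action(fw_spec, abandon_children=False):
    """Logic one might place in a Firetask's run_task(self, fw_spec)."""
    from fireworks import Firework, FWAction, ScriptTask

    branch = choose_branch(fw_spec.get("record_count", 0))  # assumed key
    commands = {
        "bulk": "echo 'taking the bulk path'",
        "simple": "echo 'taking the simple path'",
    }
    next_fw = Firework([ScriptTask.from_str(commands[branch])], name=branch)
    # With abandon_children=True the original children are defused, so
    # only the newly added branch continues.
    return FWAction(additions=[next_fw], defuse_children=abandon_children)
```

Calling `branch_action(fw_spec, abandon_children=True)` corresponds to the combination of defuse_children=True and additions mentioned above.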

I hope that helps get you started, but if you run into any problems let us know.

Best,

Anubhav
