Hello,
I want to converge something in my workflow, and for that I need an array in the fw_spec that is filled gradually. Naturally, I used FWActions with mod_spec and '_push'. Everything works as I expected when multiple Firetasks that use mod_spec are put in a single Firework, but I encountered unexpected behaviour when I put each Firetask in its own Firework.
More precisely, assume I am pushing to an array located at {'mynumbers': {'numbers': array}}:
- If I use 4 Firetasks that each push a number (1, 2, 3, 4) and then a Firetask that prints the fw_spec and put them all in a single Firework, I get the expected array:
'mynumbers': {'numbers': [1, 2, 3, 4]}}
- If I construct 4 Fireworks that each push a number and then print the spec (two Firetasks per Firework) and chain those 4 in a workflow, the results after each step are:
'mynumbers': {'numbers': [1]}}
'mynumbers': {'numbers': [1, 2]}}
'mynumbers': {'numbers': [2, 3]}}
'mynumbers': {'numbers': [3, 4]}}
This makes some sense, since the spec is only inherited by the first child Firework and not the second, but I still think that the behaviour is not very intuitive. Furthermore, I tried to “fix” this behaviour by adding update_spec = fw_spec to the FWAction, so that it now reads:
- FWAction(mod_spec = [{'_push': {out_str: number}}], update_spec = fw_spec)
However, after four pushes this leads to the following array in the spec: 'mynumbers': {'numbers': [1, 2, 2, 3, 3, 4]}}
I have to say that I do not understand why this would be the result, and it shows me that I really do not understand how the fw_spec and its updates/modifications work.
I thought that FWActions modify or update the spec for the current Firework, with the last Firetask in the Firework modifying the spec for the next Firework, but I guess this is not correct. Can someone comment on this?
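Both sequences reported above can be reproduced with a small pure-Python toy model. It is only a sketch and not FireWorks code: `push` and `run_chain` are hypothetical helpers, and the model rests on two assumptions that are not confirmed anywhere in this post: (1) an action returned by a Firetask is applied to the running Firework's own spec and to the spec of its direct child only, and (2) `update_spec = fw_spec` hands over references to the live spec values, so a later '_push' in the same Firework also changes what is passed on.

```python
import copy


def push(spec, path, value):
    """Append value to the list at the nested key path, creating it if needed."""
    node = spec
    for key in path[:-1]:
        node = node.setdefault(key, {})
    node.setdefault(path[-1], []).append(value)


def run_chain(values, path, pass_whole_spec=False):
    """Toy model of a linear chain of Fireworks that each push one value.

    Returns the spec each Firework would print after its own push.
    """
    specs = [{} for _ in values]  # every Firework starts with an empty spec
    printed = []
    for i, value in enumerate(values):
        spec = specs[i]
        # Assumption 2: a shallow copy shares nested values with the live spec.
        update = dict(spec) if pass_whole_spec else {}
        push(spec, path, value)  # the push acts on the running Firework ...
        printed.append(copy.deepcopy(spec))
        if i + 1 < len(specs):
            child = specs[i + 1]
            child.update(copy.deepcopy(update))  # stored copy for the child
            push(child, path, value)  # ... and on its direct child only
    return printed


path = ['mynumbers', 'numbers']
for printed_spec in run_chain([1, 2, 3, 4], path):
    print(printed_spec)
print(run_chain([1, 2, 3, 4], path, pass_whole_spec=True)[-1])
```

In this model the first call ends with [3, 4] and the second with [1, 2, 2, 3, 3, 4], matching what I observed. Whether FireWorks really works this way would still need confirmation from someone who knows the internals.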
Since my workflow does not allow me to push all the values for the convergence study from a single Firework, I really need to find a way to extend an array which is shared between Fireworks, and I found the following workaround:
- Instead of using '_push', I first grab the array from the fw_spec, extend it in the Firetask using a normal .append(), and then use '_set' to replace the previous version of the array with the new one. This replicates the behaviour I got when using '_push' in a single Firework, but as with all workarounds, I am not fully happy.
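The read, append, '_set' idea can be illustrated without FireWorks using plain dictionaries. This is a minimal sketch, not library code: `get_nested`, `set_nested` and `run_set_chain` are hypothetical helpers, and the model assumes that a '_set' returned by a Firetask is applied to the running Firework's spec and to the spec of its direct child only.

```python
def get_nested(spec, path):
    """Return the value at the nested key path, or None if a key is missing."""
    node = spec
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def set_nested(spec, path, value):
    """Overwrite the value at the nested key path, creating parents if needed."""
    node = spec
    for key in path[:-1]:
        node = node.setdefault(key, {})
    node[path[-1]] = value


def run_set_chain(values, path):
    """Toy chain of Fireworks: each reads the list, appends, and sets it."""
    specs = [{} for _ in values]  # every Firework starts with an empty spec
    printed = []
    for i, value in enumerate(values):
        spec = specs[i]
        numbers = list(get_nested(spec, path) or [])  # grab the current array
        numbers.append(value)                         # extend it locally
        set_nested(spec, path, numbers)               # '_set' on this Firework
        printed.append(list(numbers))
        if i + 1 < len(specs):
            # '_set' hands the complete list to the direct child (assumption)
            set_nested(specs[i + 1], path, list(numbers))
    return printed


print(run_set_chain([1, 2, 3, 4], ['mynumbers', 'numbers']))
```

Because each step hands over the complete list rather than a single new element, the child always starts from everything collected so far, and the chain ends with [1, 2, 3, 4].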
Please advise whether the '_set' workaround above is the way to go, or whether I can adapt the update_spec attempt so that I can still use '_push'. Also, please tell me whether the mod_spec '_push' method is actually intended to behave as it does, that is, differently within a single Firework and between different Fireworks.
Here follows the code for all my examples given above, so you can reproduce the behaviour yourselves.
Many thanks, Michael
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from fireworks import Workflow, LaunchPad, FWAction, FiretaskBase, \
    Firework
from fireworks.core.rocket_launcher import rapidfire
from fireworks.utilities.fw_utilities import explicit_serialize


def GetValueFromNestedDict(dictionary, key_list):
    """Return the value of a nested dict from a list of keys."""
    if len(key_list) > 1:
        if key_list[0] in dictionary:
            return GetValueFromNestedDict(dictionary[key_list[0]],
                                          key_list[1:])
        else:
            return None
    return dictionary.get(key_list[0])


@explicit_serialize
class FT_PrintSpec(FiretaskBase):
    """Prints the spec of the current Firework to the screen."""

    def run_task(self, fw_spec):
        import pprint
        pprint.pprint(fw_spec)


@explicit_serialize
class FT_PushNumber(FiretaskBase):
    """Push a number to the end of an array in the spec using mod_spec _push."""
    required_params = ['number', 'numbers_loc']

    def run_task(self, fw_spec):
        # Nested keys are joined with '->' to address the array in the spec.
        out_str = '->'.join(self['numbers_loc'])
        number = self['number']
        return FWAction(mod_spec = [{'_push': {out_str: number}}])


@explicit_serialize
class FT_PushNumberandUpdate(FiretaskBase):
    """Push a number with mod_spec _push and also pass fw_spec as update_spec."""
    required_params = ['number', 'numbers_loc']

    def run_task(self, fw_spec):
        out_str = '->'.join(self['numbers_loc'])
        number = self['number']
        return FWAction(mod_spec = [{'_push': {out_str: number}}],
                        update_spec = fw_spec)


@explicit_serialize
class FT_PushNumberWithSet(FiretaskBase):
    """Append a number to an array in the spec and store it using mod_spec _set."""
    required_params = ['number', 'numbers_loc']

    def run_task(self, fw_spec):
        numbers = GetValueFromNestedDict(fw_spec, self['numbers_loc'])
        out_str = '->'.join(self['numbers_loc'])
        if numbers is None:
            # No array in the spec yet: start a new one.
            number = [self['number']]
            return FWAction(mod_spec = [{'_set': {out_str: number}}])
        else:
            # Extend the existing array and replace the old version.
            numbers.append(self['number'])
            return FWAction(mod_spec = [{'_set': {out_str: numbers}}])


def FW_PushNumber(number, number_loc, spec=None):
    FT1 = FT_PushNumber(number=number, numbers_loc=number_loc)
    FT2 = FT_PrintSpec()
    FW = Firework([FT1, FT2], spec=spec, name='Push and Print')
    return FW


def FW_PushNumberAndUpdate(number, number_loc, spec=None):
    FT1 = FT_PushNumberandUpdate(number=number, numbers_loc=number_loc)
    FT2 = FT_PrintSpec()
    FW = Firework([FT1, FT2], spec=spec, name='Push, Update and Print')
    return FW


def FW_PushNumberFourTimes(number1, number2, number3, number4, number_loc,
                           spec=None):
    FT1 = FT_PushNumber(number=number1, numbers_loc=number_loc)
    FT2 = FT_PushNumber(number=number2, numbers_loc=number_loc)
    FT3 = FT_PushNumber(number=number3, numbers_loc=number_loc)
    FT4 = FT_PushNumber(number=number4, numbers_loc=number_loc)
    FT5 = FT_PrintSpec()
    FW = Firework([FT1, FT5, FT2, FT5, FT3, FT5, FT4, FT5], spec=spec,
                  name='Push 4 times and Print')
    return FW


def FW_PushNumberWithSet(number, number_loc, spec=None):
    FT1 = FT_PushNumberWithSet(number=number, numbers_loc=number_loc)
    FT2 = FT_PrintSpec()
    FW = Firework([FT1, FT2], spec=spec, name='Set and Print')
    return FW


if __name__ == "__main__":
    num_loc = ['mynumbers', 'numbers']
    lpad = LaunchPad.auto_load()

    wf1 = Workflow([FW_PushNumberFourTimes(1, 2, 3, 4, num_loc)],
                   name='Push in single FW')
    lpad.add_wf(wf1)

    FW1 = FW_PushNumber(1, num_loc)
    FW2 = FW_PushNumber(2, num_loc)
    FW3 = FW_PushNumber(3, num_loc)
    FW4 = FW_PushNumber(4, num_loc)
    wf2 = Workflow([FW1, FW2, FW3, FW4], {FW1: [FW2], FW2: [FW3], FW3: [FW4]},
                   name='Push in four separate FWs')
    lpad.add_wf(wf2)

    FW1 = FW_PushNumberWithSet(1, num_loc)
    FW2 = FW_PushNumberWithSet(2, num_loc)
    FW3 = FW_PushNumberWithSet(3, num_loc)
    FW4 = FW_PushNumberWithSet(4, num_loc)
    wf3 = Workflow([FW1, FW2, FW3, FW4], {FW1: [FW2], FW2: [FW3], FW3: [FW4]},
                   name='Push using _set in four separate FWs')
    lpad.add_wf(wf3)

    FW1 = FW_PushNumberAndUpdate(1, num_loc)
    FW2 = FW_PushNumberAndUpdate(2, num_loc)
    FW3 = FW_PushNumberAndUpdate(3, num_loc)
    FW4 = FW_PushNumberAndUpdate(4, num_loc)
    wf4 = Workflow([FW1, FW2, FW3, FW4], {FW1: [FW2], FW2: [FW3], FW3: [FW4]},
                   name='Push And Update in four separate FWs')
    lpad.add_wf(wf4)

    rapidfire(lpad)