Inconsistent behaviour of mod_spec _push?

mwo · May 5, 2020, 11:16am

Hello,

I want to converge something in my workflow, and for that I need an array in the fw_spec that is filled gradually. Naturally, I used FWActions with mod_spec and '_push'. Everything works as I expected when multiple Firetasks that use mod_spec are put in a single Firework, but I encountered unexpected behaviour when I put each Firetask in its own Firework.

More precisely assuming I am pushing to an array located at {'mynumbers': {'numbers': array}}:

If I use 4 Firetasks that each push a number (1, 2, 3, 4), add a Firetask that prints the fw_spec, and put them all in a single Firework, I get the expected array: 'mynumbers': {'numbers': [1, 2, 3, 4]}
If I construct 4 Fireworks that each push a number and then print the spec (two Firetasks per Firework) and chain those 4 in a workflow, the results after each step are: 'mynumbers': {'numbers': [1]}, 'mynumbers': {'numbers': [1, 2]}, 'mynumbers': {'numbers': [2, 3]}, 'mynumbers': {'numbers': [3, 4]}.

This makes some sense, since the pushed value is only inherited by the direct child Firework and not by the one after it, but I still think that the behaviour is not very intuitive. Furthermore, I tried to "fix" this behaviour by adding update_spec = fw_spec to the FWAction, so that it now reads:

FWAction(mod_spec = [{'_push': {out_str: number}}], update_spec = fw_spec)
However, after four pushes this leads to the following array in the spec: 'mynumbers': {'numbers': [1, 2, 2, 3, 3, 4]}. I have to say that I do not understand why this would be the result, and it shows me that I do not really understand how the fw_spec and its updates/modifications work.

I thought that FWActions modify or update the spec for the current Firework, with the last Firetask in the Firework modifying the spec for the next Firework, but I guess this is not correct. Can someone comment on this?
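
To make the pattern from the four-Firework case concrete, here is a toy model in plain Python that reproduces the arrays I see. It does not use FireWorks at all; the function names push and run_chain are made up for this sketch, and the propagation rule (a push is applied to the running spec and to the stored spec of the direct child only) is my assumption from the output, not something I have confirmed in the FireWorks source.

```python
import copy


def push(spec, path, value):
    """Toy stand-in for a '_push' modifier: append value at a nested path."""
    node = spec
    for key in path[:-1]:
        node = node.setdefault(key, {})
    node.setdefault(path[-1], []).append(value)


def run_chain(numbers, path):
    """Run a linear chain of toy Fireworks, one push per Firework.

    Assumption: each push affects the running spec of the current
    Firework and the stored spec of its direct child, nothing else.
    """
    stored_specs = [{} for _ in numbers]  # spec of each Firework before it runs
    seen = []
    for i, number in enumerate(numbers):
        running = copy.deepcopy(stored_specs[i])
        push(running, path, number)  # effect inside the current Firework
        seen.append(running)
        if i + 1 < len(numbers):
            push(stored_specs[i + 1], path, number)  # effect on the direct child
    return seen


for spec in run_chain([1, 2, 3, 4], ["mynumbers", "numbers"]):
    print(spec)
```

Under this assumption, each Firework only ever sees its own number plus the one pushed by its parent, which is the [1], [1, 2], [2, 3], [3, 4] sequence from above.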

Since my workflow does not allow me to push all the values for the convergence study from a single Firework, I really need to find a way to extend an array which is shared between Fireworks, and I found the following workaround:

Instead of using '_push', I first grab the array from the fw_spec, extend it in the Firetask using a normal .append(), and then use '_set' to replace the previous version of the array with the new one. This replicates the behaviour I got when using '_push' in a single Firework, but as with all workarounds, I am not fully happy.
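
As a sanity check on why this works, here is the read-append-'_set' idea as a toy model in plain Python, without FireWorks. The helpers get_nested, set_nested and run_chain_with_set are made up for this sketch, and it assumes that a modification is applied to the running spec and to the stored spec of the direct child only.

```python
import copy


def get_nested(spec, path):
    """Return the value at a nested path, or None if any key is missing."""
    node = spec
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def set_nested(spec, path, value):
    """Toy stand-in for a '_set' modifier: overwrite the value at a nested path."""
    node = spec
    for key in path[:-1]:
        node = node.setdefault(key, {})
    node[path[-1]] = value


def run_chain_with_set(numbers, path):
    """Run a linear chain of toy Fireworks that each set the whole array."""
    stored_specs = [{} for _ in numbers]  # spec of each Firework before it runs
    seen = []
    for i, number in enumerate(numbers):
        running = copy.deepcopy(stored_specs[i])
        extended = (get_nested(running, path) or []) + [number]
        set_nested(running, path, extended)  # effect inside the current Firework
        seen.append(running)
        if i + 1 < len(numbers):
            # The child receives the complete array, not just the new number.
            set_nested(stored_specs[i + 1], path, list(extended))
    return seen


for spec in run_chain_with_set([1, 2, 3, 4], ["mynumbers", "numbers"]):
    print(spec)
```

Because every '_set' hands the complete array to the child, nothing is lost along the chain, at the cost of each Firetask having to read the current array from the fw_spec first.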

Please advise me whether the '_set' workaround described above is the way to go, or whether I can adapt the update_spec approach so that I can still use '_push'. Also, please tell me whether the mod_spec '_push' method is actually intended to behave this way, that is, differently within a single Firework and between different Fireworks.

Here is the code for all the examples given above, so you can reproduce the behaviour yourself. Many thanks, Michael

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from fireworks import (Workflow, LaunchPad, FWAction, FiretaskBase,
                       Firework)
from fireworks.core.rocket_launcher import rapidfire
from fireworks.utilities.fw_utilities import explicit_serialize


def GetValueFromNestedDict(dictionary, key_list):
    """Return the value of a nested dict from a list of keys."""
    if len(key_list) > 1:
        if key_list[0] in dictionary:
            return GetValueFromNestedDict(dictionary[key_list[0]],
                                          key_list[1:])
        else:
            return None
    return dictionary.get(key_list[0])

@explicit_serialize
class FT_PrintSpec(FiretaskBase):
    """Print the spec of the current Firework to the screen."""
    
    def run_task(self, fw_spec):
        import pprint
        pprint.pprint(fw_spec)

@explicit_serialize
class FT_PushNumber(FiretaskBase):
    """Push a number to the end of an array in the spec using mod_spec _push."""
    required_params = ['number', 'numbers_loc']
    def run_task(self, fw_spec):
        out_str = '->'.join(self['numbers_loc'])
        number = self['number']
        return FWAction(mod_spec = [{'_push': {out_str: number}}])
    
@explicit_serialize
class FT_PushNumberandUpdate(FiretaskBase):
    """Push a number with mod_spec _push and also pass fw_spec as update_spec."""
    required_params = ['number', 'numbers_loc']
    def run_task(self, fw_spec):
        out_str = '->'.join(self['numbers_loc'])
        number = self['number']
        return FWAction(mod_spec = [{'_push': {out_str: number}}],
                        update_spec = fw_spec)
    
@explicit_serialize
class FT_PushNumberWithSet(FiretaskBase):
    """Push a number to the end of an array in the spec using mod_spec _set."""
    required_params = ['number', 'numbers_loc']
    def run_task(self, fw_spec):
        numbers = GetValueFromNestedDict(fw_spec, self['numbers_loc'])
        out_str = '->'.join(self['numbers_loc'])
        if numbers is None:
            number = [self['number']]
            return FWAction(mod_spec = [{'_set': {out_str: number}}])
        else:
            numbers.append(self['number'])
            return FWAction(mod_spec = [{'_set': {out_str: numbers}}])
    
def FW_PushNumber(number, number_loc, spec=None):
    FT1 = FT_PushNumber(number=number, numbers_loc=number_loc)
    FT2 = FT_PrintSpec()
    FW = Firework([FT1, FT2], spec=spec, name='Push and Print')
    return FW

def FW_PushNumberAndUpdate(number, number_loc, spec=None):
    FT1 = FT_PushNumberandUpdate(number=number, numbers_loc=number_loc)
    FT2 = FT_PrintSpec()
    FW = Firework([FT1, FT2], spec=spec, name='Push, Update and Print')
    return FW

def FW_PushNumberFourTimes(number1,  number2, number3, number4, number_loc,
                           spec=None):
    FT1 = FT_PushNumber(number=number1, numbers_loc=number_loc)
    FT2 = FT_PushNumber(number=number2, numbers_loc=number_loc)
    FT3 = FT_PushNumber(number=number3, numbers_loc=number_loc)
    FT4 = FT_PushNumber(number=number4, numbers_loc=number_loc)
    FT5 = FT_PrintSpec()
    FW = Firework([FT1, FT5, FT2, FT5, FT3, FT5, FT4, FT5], spec=spec,
                  name='Push 4 times and Print')
    return FW

def FW_PushNumberWithSet(number, number_loc, spec=None):
    FT1 = FT_PushNumberWithSet(number=number, numbers_loc=number_loc)
    FT2 = FT_PrintSpec()
    FW = Firework([FT1, FT2], spec=spec, name='Set and Print')
    return FW

if __name__ == "__main__":
    
    num_loc = ['mynumbers', 'numbers']
    lpad = LaunchPad.auto_load()
    
    wf1 = Workflow([FW_PushNumberFourTimes(1, 2, 3, 4, num_loc)],
                   name='Push in single FW')
    lpad.add_wf(wf1)
    
    FW1 = FW_PushNumber(1, num_loc)
    FW2 = FW_PushNumber(2, num_loc)
    FW3 = FW_PushNumber(3, num_loc)
    FW4 = FW_PushNumber(4, num_loc)
    wf2 = Workflow([FW1, FW2, FW3, FW4], {FW1: [FW2], FW2: [FW3], FW3: [FW4]},
                   name='Push in four separate FWs')
    lpad.add_wf(wf2)
    
    FW1 = FW_PushNumberWithSet(1, num_loc)
    FW2 = FW_PushNumberWithSet(2, num_loc)
    FW3 = FW_PushNumberWithSet(3, num_loc)
    FW4 = FW_PushNumberWithSet(4, num_loc)
    wf3 = Workflow([FW1, FW2, FW3, FW4], {FW1: [FW2], FW2: [FW3], FW3: [FW4]},
                    name='Push using _set in four separate FWs')
    lpad.add_wf(wf3)
    
    FW1 = FW_PushNumberAndUpdate(1, num_loc)
    FW2 = FW_PushNumberAndUpdate(2, num_loc)
    FW3 = FW_PushNumberAndUpdate(3, num_loc)
    FW4 = FW_PushNumberAndUpdate(4, num_loc)
    wf4 = Workflow([FW1, FW2, FW3, FW4], {FW1: [FW2], FW2: [FW3], FW3: [FW4]},
                   name='Push And Update in four separate FWs')
    lpad.add_wf(wf4)
    
    rapidfire(lpad)