Question about customized DupeFinder

Hi,

I’m trying to customize DupeFinder, but it doesn’t work.

I wrote the following script.

my_fw_for_test.py

from fireworks import explicit_serialize, FiretaskBase, FWAction, Firework

from fireworks.features.dupefinder import DupeFinderBase

path = ["/home/taka/test_duplicate_check"] # directory where script exists

@explicit_serialize

class DoNothingTask(FiretaskBase):

required_params = [“name”]

def run_task(self, fw_spec):

   return FWAction()

class ValueDupeFinder(DupeFinderBase):

_fw_name = ‘ValueDupeFinder’

def verify(self, spec1, spec2):

   """

   Method that checks whether two specs are identical enough to be considered duplicates.

   Return true if duplicated.

   Args:

   spec1 (dict)

   spec2 (dict)

   Returns:

       bool

   """

   return abs(spec1["value"] - spec2["value"]) < 10

def query(self, spec):

   """

   Given a spec, returns a database query that gives potential candidates for duplicated Fireworks.

   Args:

       spec (dict): spec to check for duplicates

   """

   return {"$and": [{"launches": {"$ne": []}}, {"spec": spec}]}

class DupTestFW(Firework):

_fw_name = “FirstFW”

def init(self, value, parents=None):

    self._task = [DoNothingTask(name="nothing")]

   super().__init__(self._task, parents=parents, name=self._fw_name,

                    spec={"_dupefinder": ValueDupeFinder(),

                          "value": value})

``

main.py

from future import division, unicode_literals, print_function

``

import sys

from fireworks import LaunchPad, Workflow

from my_fw_for_test import DupTestFW

lpad = LaunchPad.auto_load()

def add_wf(name):

first = DupTestFW(value=float(sys.argv[1]))

wf = Workflow([first], name=name)

lpad.add_wf(wf)

if name == “main”:

add_wf(“test_duplicate_check”)

I executed command as following:

lpad reset
python main.py 0 && rlaunch singleshot
python main.py 5 && rlaunch singleshot
python main.py 100 && rlaunch singleshot

``

I expect that my ValueDupeFinder regard only the second FW as duplicated FW.

However, actually neither the second nor third FW are regarded as duplicated FW by ValueDupeFinder.

I also try rewrite ValueDupeFinder.query as

return {"$and": [{“launches”: {"$ne": []}}]}

``

, then both of the second and third FW are regarded as duplicated FW by ValueDupeFinder. (I did “lpad reset” at the first.)

I read the code of _steal_launches method of fireworks.core.launchpad.py.

It seems that DupeFinder.verify is not called, then I think all matched FW by “query” will be regarded as duplicated.

Or, did I do something wrong?

I will appreciate any advice and help.

Thanks,

Akira Takahashi

Hi Akira,

I checked and you are correct: I also did not see the “verify()” function being called.

That said, the previous version of FWs definitely had this code. For example, I reverted to FWS version 1.1.4 and the code for steal_launches looks like this:

for potential_match in self.fireworks.find(m_query):
    self.m_logger.debug(
        'Verifying for duplicates, fw_ids: {}, {}'.format(thief_fw.fw_id,
                                                          potential_match['fw_id']))
    spec1 = dict(thief_fw.to_dict()['spec']) # defensive copy
    spec2 = dict(potential_match['spec']) # defensive copy
    if m_dupefinder.verify(spec1, spec2): # verify the match
        # steal the launches
        victim_fw = self.get_fw_by_id(potential_match['fw_id'])

I am currently at the MRS conference, but will look into what happened and try to produce a patch as soon as I can. It might be as simple as restoring the above code, but I would like to see when and why it was removed in the first place. You can track the progress here:

[https://github.com/materialsproject/fireworks/issues/310](https://github.com/materialsproject/fireworks/issues/310)

Thank you for pointing this out!
Anubhav
···

On Mon, Nov 26, 2018 at 6:02 AM Akira Takahashi [email protected] wrote:

Hi,

I’m trying to customize DupeFinder, but it doesn’t work.

I wrote the following script.

my_fw_for_test.py

from fireworks import explicit_serialize, FiretaskBase, FWAction, Firework

from fireworks.features.dupefinder import DupeFinderBase

path = ["/home/taka/test_duplicate_check"] # directory where script exists

@explicit_serialize

class DoNothingTask(FiretaskBase):

required_params = [“name”]

def run_task(self, fw_spec):

   return FWAction()

class ValueDupeFinder(DupeFinderBase):

_fw_name = ‘ValueDupeFinder’

def verify(self, spec1, spec2):

   """
   Method that checks whether two specs are identical enough to be considered duplicates.
   Return true if duplicated.
   Args:
   spec1 (dict)
   spec2 (dict)
   Returns:
       bool
   """
   return abs(spec1["value"] - spec2["value"]) < 10

def query(self, spec):

   """
   Given a spec, returns a database query that gives potential candidates for duplicated Fireworks.
   Args:
       spec (dict): spec to check for duplicates
   """
   return {"$and": [{"launches": {"$ne": []}}, {"spec": spec}]}

class DupTestFW(Firework):

_fw_name = “FirstFW”

def init(self, value, parents=None):

    self._task = [DoNothingTask(name="nothing")]
   super().__init__(self._task, parents=parents, name=self._fw_name,
                    spec={"_dupefinder": ValueDupeFinder(),
                          "value": value})

``

main.py

from future import division, unicode_literals, print_function

``

import sys

from fireworks import LaunchPad, Workflow

from my_fw_for_test import DupTestFW

lpad = LaunchPad.auto_load()

def add_wf(name):

first = DupTestFW(value=float(sys.argv[1]))

wf = Workflow([first], name=name)

lpad.add_wf(wf)

if name == “main”:

add_wf(“test_duplicate_check”)

I executed command as following:

lpad reset
python main.py 0 && rlaunch singleshot
python main.py 5 && rlaunch singleshot
python main.py 100 && rlaunch singleshot

``

I expect that my ValueDupeFinder regard only the second FW as duplicated FW.

However, actually neither the second nor third FW are regarded as duplicated FW by ValueDupeFinder.

I also try rewrite ValueDupeFinder.query as

return {"$and": [{“launches”: {"$ne": []}}]}

``

, then both of the second and third FW are regarded as duplicated FW by ValueDupeFinder. (I did “lpad reset” at the first.)

I read the code of _steal_launches method of fireworks.core.launchpad.py.

It seems that DupeFinder.verify is not called, then I think all matched FW by “query” will be regarded as duplicated.

Or, did I do something wrong?

I will appreciate any advice and help.

Thanks,

Akira Takahashi

You received this message because you are subscribed to the Google Groups “fireworkflows” group.

To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].

To post to this group, send email to [email protected].

Visit this group at https://groups.google.com/group/fireworkflows.

For more options, visit https://groups.google.com/d/optout.


Best,
Anubhav

I think I have fixed this, see:

and perhaps comment there if you see further issues. Thanks again for bringing this up!

···

On Monday, November 26, 2018 at 10:24:30 AM UTC-5, ajain wrote:

Hi Akira,

I checked and you are correct: I also did not see the “verify()” function being called.

That said, the previous version of FWs definitely had this code. For example, I reverted to FWS version 1.1.4 and the code for steal_launches looks like this:

for potential_match in self.fireworks.find(m_query):
    self.m_logger.debug(
        'Verifying for duplicates, fw_ids: {}, {}'.format(thief_fw.fw_id,
                                                          potential_match['fw_id']))
    spec1 = dict(thief_fw.to_dict()['spec']) # defensive copy
    spec2 = dict(potential_match['spec']) # defensive copy
    if m_dupefinder.verify(spec1, spec2): # verify the match
        # steal the launches
        victim_fw = self.get_fw_by_id(potential_match['fw_id'])


I am currently at the MRS conference, but will look into what happened and try to produce a patch as soon as I can. It might be as simple as restoring the above code, but I would like to see when and why it was removed in the first place. You can track the progress here:

[https://github.com/materialsproject/fireworks/issues/310](https://github.com/materialsproject/fireworks/issues/310)

Thank you for pointing this out!
Anubhav

On Mon, Nov 26, 2018 at 6:02 AM Akira Takahashi [email protected] wrote:

Hi,

I’m trying to customize DupeFinder, but it doesn’t work.

I wrote the following script.

my_fw_for_test.py

from fireworks import explicit_serialize, FiretaskBase, FWAction, Firework

from fireworks.features.dupefinder import DupeFinderBase

path = ["/home/taka/test_duplicate_check"] # directory where script exists

@explicit_serialize

class DoNothingTask(FiretaskBase):

required_params = [“name”]

def run_task(self, fw_spec):

   return FWAction()

class ValueDupeFinder(DupeFinderBase):

_fw_name = ‘ValueDupeFinder’

def verify(self, spec1, spec2):

   """
   Method that checks whether two specs are identical enough to be considered duplicates.
   Return true if duplicated.
   Args:
   spec1 (dict)
   spec2 (dict)
   Returns:
       bool
   """
   return abs(spec1["value"] - spec2["value"]) < 10

def query(self, spec):

   """
   Given a spec, returns a database query that gives potential candidates for duplicated Fireworks.
   Args:
       spec (dict): spec to check for duplicates
   """
   return {"$and": [{"launches": {"$ne": []}}, {"spec": spec}]}

class DupTestFW(Firework):

_fw_name = “FirstFW”

def init(self, value, parents=None):

    self._task = [DoNothingTask(name="nothing")]
   super().__init__(self._task, parents=parents, name=self._fw_name,
                    spec={"_dupefinder": ValueDupeFinder(),
                          "value": value})

``

main.py

from future import division, unicode_literals, print_function

``

import sys

from fireworks import LaunchPad, Workflow

from my_fw_for_test import DupTestFW

lpad = LaunchPad.auto_load()

def add_wf(name):

first = DupTestFW(value=float(sys.argv[1]))

wf = Workflow([first], name=name)

lpad.add_wf(wf)

if name == “main”:

add_wf(“test_duplicate_check”)

I executed command as following:

lpad reset
python main.py 0 && rlaunch singleshot
python main.py 5 && rlaunch singleshot
python main.py 100 && rlaunch singleshot

``

I expect that my ValueDupeFinder regard only the second FW as duplicated FW.

However, actually neither the second nor third FW are regarded as duplicated FW by ValueDupeFinder.

I also try rewrite ValueDupeFinder.query as

return {"$and": [{“launches”: {"$ne": []}}]}

``

, then both of the second and third FW are regarded as duplicated FW by ValueDupeFinder. (I did “lpad reset” at the first.)

I read the code of _steal_launches method of fireworks.core.launchpad.py.

It seems that DupeFinder.verify is not called, then I think all matched FW by “query” will be regarded as duplicated.

Or, did I do something wrong?

I will appreciate any advice and help.

Thanks,

Akira Takahashi

You received this message because you are subscribed to the Google Groups “fireworkflows” group.

To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].

To post to this group, send email to [email protected].

Visit this group at https://groups.google.com/group/fireworkflows.

For more options, visit https://groups.google.com/d/optout.


Best,
Anubhav

(I am sorry that I mistakenly submit reply and delete it)

Hi Anubhav,

Thanks for updating.

Isn’t the update reflected on version 1.8.1?

I checked https://github.com/materialsproject/fireworks/releases, but the commit seems not reflected.

Therefore, I cloned from GitHub directly and tested.

But actually, I have a problem.

Since my ValueDupeFinder check the “value” key of spec, in the 1527th line of fireworks/core/launchpad.py, i.e.,

m_dupefinder.verify({}, {}) # is implemented test

raises KeyError.

Further more, I think the following code might be more defensive

try:

m_dupefinder.verify({}, {}) # is implemented test

except NotImplementedError:

verified = True

            else:

spec2 = dict(potential_match[‘spec’]) # defensive copy

verified = m_dupefinder.verify(spec1, spec2)

spec1 = dict(thief_fw.to_dict()[‘spec’]) # defensive copy

``

, instead of

try:

m_dupefinder.verify({}, {}) # is implemented test

spec1 = dict(thief_fw.to_dict()[‘spec’]) # defensive copy

spec2 = dict(potential_match[‘spec’]) # defensive copy

verified = m_dupefinder.verify(spec1, spec2)

except NotImplementedError:

verified = True

since NotImplementedError may cause in the function verify(spec1, spec2).

I know it is very rare case, but it is hard to find if occurs.

Thanks,

Akira Takahashi

2018年11月29日木曜日 0時50分31秒 UTC+9 Anubhav Jain:

···

I think I have fixed this, see:

https://github.com/materialsproject/fireworks/issues/310

and perhaps comment there if you see further issues. Thanks again for bringing this up!

On Monday, November 26, 2018 at 10:24:30 AM UTC-5, ajain wrote:

Hi Akira,

I checked and you are correct: I also did not see the “verify()” function being called.

That said, the previous version of FWs definitely had this code. For example, I reverted to FWS version 1.1.4 and the code for steal_launches looks like this:

for potential_match in self.fireworks.find(m_query):
    self.m_logger.debug(
        'Verifying for duplicates, fw_ids: {}, {}'.format(thief_fw.fw_id,
                                                          potential_match['fw_id']))
    spec1 = dict(thief_fw.to_dict()['spec']) # defensive copy
    spec2 = dict(potential_match['spec']) # defensive copy
    if m_dupefinder.verify(spec1, spec2): # verify the match
        # steal the launches
        victim_fw = self.get_fw_by_id(potential_match['fw_id'])


I am currently at the MRS conference, but will look into what happened and try to produce a patch as soon as I can. It might be as simple as restoring the above code, but I would like to see when and why it was removed in the first place. You can track the progress here:

[https://github.com/materialsproject/fireworks/issues/310](https://github.com/materialsproject/fireworks/issues/310)

Thank you for pointing this out!
Anubhav

On Mon, Nov 26, 2018 at 6:02 AM Akira Takahashi [email protected] wrote:

Hi,

I’m trying to customize DupeFinder, but it doesn’t work.

I wrote the following script.

my_fw_for_test.py

from fireworks import explicit_serialize, FiretaskBase, FWAction, Firework

from fireworks.features.dupefinder import DupeFinderBase

path = ["/home/taka/test_duplicate_check"] # directory where script exists

@explicit_serialize

class DoNothingTask(FiretaskBase):

required_params = [“name”]

def run_task(self, fw_spec):

   return FWAction()

class ValueDupeFinder(DupeFinderBase):

_fw_name = ‘ValueDupeFinder’

def verify(self, spec1, spec2):

   """
   Method that checks whether two specs are identical enough to be considered duplicates.
   Return true if duplicated.
   Args:
   spec1 (dict)
   spec2 (dict)
   Returns:
       bool
   """
   return abs(spec1["value"] - spec2["value"]) < 10

def query(self, spec):

   """
   Given a spec, returns a database query that gives potential candidates for duplicated Fireworks.
   Args:
       spec (dict): spec to check for duplicates
   """
   return {"$and": [{"launches": {"$ne": []}}, {"spec": spec}]}

class DupTestFW(Firework):

_fw_name = “FirstFW”

def init(self, value, parents=None):

    self._task = [DoNothingTask(name="nothing")]
   super().__init__(self._task, parents=parents, name=self._fw_name,
                    spec={"_dupefinder": ValueDupeFinder(),
                          "value": value})

``

main.py

from future import division, unicode_literals, print_function

``

import sys

from fireworks import LaunchPad, Workflow

from my_fw_for_test import DupTestFW

lpad = LaunchPad.auto_load()

def add_wf(name):

first = DupTestFW(value=float(sys.argv[1]))

wf = Workflow([first], name=name)

lpad.add_wf(wf)

if name == “main”:

add_wf(“test_duplicate_check”)

I executed command as following:

lpad reset
python main.py 0 && rlaunch singleshot
python main.py 5 && rlaunch singleshot
python main.py 100 && rlaunch singleshot

``

I expect that my ValueDupeFinder regard only the second FW as duplicated FW.

However, actually neither the second nor third FW are regarded as duplicated FW by ValueDupeFinder.

I also try rewrite ValueDupeFinder.query as

return {"$and": [{“launches”: {"$ne": []}}]}

``

, then both of the second and third FW are regarded as duplicated FW by ValueDupeFinder. (I did “lpad reset” at the first.)

I read the code of _steal_launches method of fireworks.core.launchpad.py.

It seems that DupeFinder.verify is not called, then I think all matched FW by “query” will be regarded as duplicated.

Or, did I do something wrong?

I will appreciate any advice and help.

Thanks,

Akira Takahashi

You received this message because you are subscribed to the Google Groups “fireworkflows” group.

To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].

To post to this group, send email to [email protected].

Visit this group at https://groups.google.com/group/fireworkflows.

For more options, visit https://groups.google.com/d/optout.


Best,
Anubhav

Hi Akira,

Thanks! Your solution is indeed more correct.

It should be reflected in FWS v1.8.2, hopefully this works for you.

Best,

Anubhav

···

On Tuesday, December 4, 2018 at 12:23:58 AM UTC-8, Akira Takahashi wrote:

(I am sorry that I mistakenly submit reply and delete it)

Hi Anubhav,

Thanks for updating.

Isn’t the update reflected on version 1.8.1?

I checked https://github.com/materialsproject/fireworks/releases, but the commit seems not reflected.

Therefore, I cloned from GitHub directly and tested.

But actually, I have a problem.

Since my ValueDupeFinder check the “value” key of spec, in the 1527th line of fireworks/core/launchpad.py, i.e.,

m_dupefinder.verify({}, {}) # is implemented test

raises KeyError.

Further more, I think the following code might be more defensive

try:

m_dupefinder.verify({}, {}) # is implemented test

except NotImplementedError:

verified = True

            else:

spec1 = dict(thief_fw.to_dict()[‘spec’]) # defensive copy
spec2 = dict(potential_match[‘spec’]) # defensive copy

verified = m_dupefinder.verify(spec1, spec2)

``

, instead of

try:

m_dupefinder.verify({}, {}) # is implemented test

spec1 = dict(thief_fw.to_dict()[‘spec’]) # defensive copy

spec2 = dict(potential_match[‘spec’]) # defensive copy

verified = m_dupefinder.verify(spec1, spec2)

except NotImplementedError:

verified = True

since NotImplementedError may cause in the function verify(spec1, spec2).

I know it is very rare case, but it is hard to find if occurs.

Thanks,

Akira Takahashi

2018年11月29日木曜日 0時50分31秒 UTC+9 Anubhav Jain:

I think I have fixed this, see:

https://github.com/materialsproject/fireworks/issues/310

and perhaps comment there if you see further issues. Thanks again for bringing this up!

On Monday, November 26, 2018 at 10:24:30 AM UTC-5, ajain wrote:

Hi Akira,

I checked and you are correct: I also did not see the “verify()” function being called.

That said, the previous version of FWs definitely had this code. For example, I reverted to FWS version 1.1.4 and the code for steal_launches looks like this:

for potential_match in self.fireworks.find(m_query):
    self.m_logger.debug(
        'Verifying for duplicates, fw_ids: {}, {}'.format(thief_fw.fw_id,
                                                          potential_match['fw_id']))
    spec1 = dict(thief_fw.to_dict()['spec']) # defensive copy
    spec2 = dict(potential_match['spec']) # defensive copy
    if m_dupefinder.verify(spec1, spec2): # verify the match
        # steal the launches
        victim_fw = self.get_fw_by_id(potential_match['fw_id'])


I am currently at the MRS conference, but will look into what happened and try to produce a patch as soon as I can. It might be as simple as restoring the above code, but I would like to see when and why it was removed in the first place. You can track the progress here:

[https://github.com/materialsproject/fireworks/issues/310](https://github.com/materialsproject/fireworks/issues/310)

Thank you for pointing this out!
Anubhav

On Mon, Nov 26, 2018 at 6:02 AM Akira Takahashi [email protected] wrote:

Hi,

I’m trying to customize DupeFinder, but it doesn’t work.

I wrote the following script.

my_fw_for_test.py

from fireworks import explicit_serialize, FiretaskBase, FWAction, Firework

from fireworks.features.dupefinder import DupeFinderBase

path = ["/home/taka/test_duplicate_check"] # directory where script exists

@explicit_serialize

class DoNothingTask(FiretaskBase):

required_params = [“name”]

def run_task(self, fw_spec):

   return FWAction()

class ValueDupeFinder(DupeFinderBase):

_fw_name = ‘ValueDupeFinder’

def verify(self, spec1, spec2):

   """
   Method that checks whether two specs are identical enough to be considered duplicates.
   Return true if duplicated.
   Args:
   spec1 (dict)
   spec2 (dict)
   Returns:
       bool
   """
   return abs(spec1["value"] - spec2["value"]) < 10

def query(self, spec):

   """
   Given a spec, returns a database query that gives potential candidates for duplicated Fireworks.
   Args:
       spec (dict): spec to check for duplicates
   """
   return {"$and": [{"launches": {"$ne": []}}, {"spec": spec}]}

class DupTestFW(Firework):

_fw_name = “FirstFW”

def init(self, value, parents=None):

    self._task = [DoNothingTask(name="nothing")]
   super().__init__(self._task, parents=parents, name=self._fw_name,
                    spec={"_dupefinder": ValueDupeFinder(),
                          "value": value})

``

main.py

from future import division, unicode_literals, print_function

``

import sys

from fireworks import LaunchPad, Workflow

from my_fw_for_test import DupTestFW

lpad = LaunchPad.auto_load()

def add_wf(name):

first = DupTestFW(value=float(sys.argv[1]))

wf = Workflow([first], name=name)

lpad.add_wf(wf)

if name == “main”:

add_wf(“test_duplicate_check”)

I executed command as following:

lpad reset
python main.py 0 && rlaunch singleshot
python main.py 5 && rlaunch singleshot
python main.py 100 && rlaunch singleshot

``

I expect that my ValueDupeFinder regard only the second FW as duplicated FW.

However, actually neither the second nor third FW are regarded as duplicated FW by ValueDupeFinder.

I also try rewrite ValueDupeFinder.query as

return {"$and": [{“launches”: {"$ne": []}}]}

``

, then both of the second and third FW are regarded as duplicated FW by ValueDupeFinder. (I did “lpad reset” at the first.)

I read the code of _steal_launches method of fireworks.core.launchpad.py.

It seems that DupeFinder.verify is not called, then I think all matched FW by “query” will be regarded as duplicated.

Or, did I do something wrong?

I will appreciate any advice and help.

Thanks,

Akira Takahashi

You received this message because you are subscribed to the Google Groups “fireworkflows” group.

To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].

To post to this group, send email to [email protected].

Visit this group at https://groups.google.com/group/fireworkflows.

For more options, visit https://groups.google.com/d/optout.


Best,
Anubhav

Hi Anubhav,

Thanks for reply and update.

I’m sorry that my explanation wasn’t enough.

At the previous post, I had two problems, i.e.

  1. m_dupefinder.verify({}, {}) checks empty dicts, so in my ValueDupeFinder spec1[“value”] raises KeyError.

  2. Using try-else would be more defensive.

And the solution solved only the second problem and I didn’t write a solution of the first problem.

(Now I think that if DupeFinderBase.verify returns True instead of raising NotImplemetedError,

is_implemeted_test is not neccessary.

But I’m not sure this is a good solution…)

Now I also think that “query” condition should include {‘spec.value’: {$exists: True}},

or my customized verify method should handle KeyError like

def verify(self, spec1, spec2):

  if "value" not in spec1 or "value" not in spec2:
       return False
  return abs(spec1["value"] - spec2["value"]) < 10

``

because there might be FW whose spec does not have “value” key.

In conclusion, I will reconsider my customized query and verify to handle KeyError.

Anyway I appreciate your kind help.

Thanks,

Akira Takahashi

2018年12月7日金曜日 6時49分10秒 UTC+9 Anubhav Jain:

···

Hi Akira,

Thanks! Your solution is indeed more correct.

It should be reflected in FWS v1.8.2, hopefully this works for you.

Best,

Anubhav

On Tuesday, December 4, 2018 at 12:23:58 AM UTC-8, Akira Takahashi wrote:

(I am sorry that I mistakenly submit reply and delete it)

Hi Anubhav,

Thanks for updating.

Isn’t the update reflected on version 1.8.1?

I checked https://github.com/materialsproject/fireworks/releases, but the commit seems not reflected.

Therefore, I cloned from GitHub directly and tested.

But actually, I have a problem.

Since my ValueDupeFinder check the “value” key of spec, in the 1527th line of fireworks/core/launchpad.py, i.e.,

m_dupefinder.verify({}, {}) # is implemented test

raises KeyError.

Further more, I think the following code might be more defensive

try:

m_dupefinder.verify({}, {}) # is implemented test

except NotImplementedError:

verified = True

            else:

spec1 = dict(thief_fw.to_dict()[‘spec’]) # defensive copy
spec2 = dict(potential_match[‘spec’]) # defensive copy

verified = m_dupefinder.verify(spec1, spec2)

``

, instead of

try:

m_dupefinder.verify({}, {}) # is implemented test

spec1 = dict(thief_fw.to_dict()[‘spec’]) # defensive copy

spec2 = dict(potential_match[‘spec’]) # defensive copy

verified = m_dupefinder.verify(spec1, spec2)

except NotImplementedError:

verified = True

since NotImplementedError may cause in the function verify(spec1, spec2).

I know it is very rare case, but it is hard to find if occurs.

Thanks,

Akira Takahashi

2018年11月29日木曜日 0時50分31秒 UTC+9 Anubhav Jain:

I think I have fixed this, see:

https://github.com/materialsproject/fireworks/issues/310

and perhaps comment there if you see further issues. Thanks again for bringing this up!

On Monday, November 26, 2018 at 10:24:30 AM UTC-5, ajain wrote:

Hi Akira,

I checked and you are correct: I also did not see the “verify()” function being called.

That said, the previous version of FWs definitely had this code. For example, I reverted to FWS version 1.1.4 and the code for steal_launches looks like this:

for potential_match in self.fireworks.find(m_query):
    self.m_logger.debug(
        'Verifying for duplicates, fw_ids: {}, {}'.format(thief_fw.fw_id,
                                                          potential_match['fw_id']))
    spec1 = dict(thief_fw.to_dict()['spec']) # defensive copy
    spec2 = dict(potential_match['spec']) # defensive copy
    if m_dupefinder.verify(spec1, spec2): # verify the match
        # steal the launches
        victim_fw = self.get_fw_by_id(potential_match['fw_id'])


I am currently at the MRS conference, but will look into what happened and try to produce a patch as soon as I can. It might be as simple as restoring the above code, but I would like to see when and why it was removed in the first place. You can track the progress here:

[https://github.com/materialsproject/fireworks/issues/310](https://github.com/materialsproject/fireworks/issues/310)

Thank you for pointing this out!
Anubhav

On Mon, Nov 26, 2018 at 6:02 AM Akira Takahashi [email protected] wrote:

Hi,

I’m trying to customize DupeFinder, but it doesn’t work.

I wrote the following script.

my_fw_for_test.py

from fireworks import explicit_serialize, FiretaskBase, FWAction, Firework

from fireworks.features.dupefinder import DupeFinderBase

path = ["/home/taka/test_duplicate_check"] # directory where script exists

@explicit_serialize

class DoNothingTask(FiretaskBase):

required_params = [“name”]

def run_task(self, fw_spec):

   return FWAction()

class ValueDupeFinder(DupeFinderBase):

_fw_name = ‘ValueDupeFinder’

def verify(self, spec1, spec2):

   """
   Method that checks whether two specs are identical enough to be considered duplicates.
   Return true if duplicated.
   Args:
   spec1 (dict)
   spec2 (dict)
   Returns:
       bool
   """
   return abs(spec1["value"] - spec2["value"]) < 10

def query(self, spec):

   """
   Given a spec, returns a database query that gives potential candidates for duplicated Fireworks.
   Args:
       spec (dict): spec to check for duplicates
   """
   return {"$and": [{"launches": {"$ne": []}}, {"spec": spec}]}

class DupTestFW(Firework):

_fw_name = “FirstFW”

def init(self, value, parents=None):

    self._task = [DoNothingTask(name="nothing")]
   super().__init__(self._task, parents=parents, name=self._fw_name,
                    spec={"_dupefinder": ValueDupeFinder(),
                          "value": value})

``

main.py

from future import division, unicode_literals, print_function

``

import sys

from fireworks import LaunchPad, Workflow

from my_fw_for_test import DupTestFW

lpad = LaunchPad.auto_load()

def add_wf(name):

first = DupTestFW(value=float(sys.argv[1]))

wf = Workflow([first], name=name)

lpad.add_wf(wf)

if name == “main”:

add_wf(“test_duplicate_check”)

I executed command as following:

lpad reset
python main.py 0 && rlaunch singleshot
python main.py 5 && rlaunch singleshot
python main.py 100 && rlaunch singleshot

``

I expect that my ValueDupeFinder regard only the second FW as duplicated FW.

However, actually neither the second nor third FW are regarded as duplicated FW by ValueDupeFinder.

I also try rewrite ValueDupeFinder.query as

return {"$and": [{“launches”: {"$ne": []}}]}

``

, then both of the second and third FW are regarded as duplicated FW by ValueDupeFinder. (I did “lpad reset” at the first.)

I read the code of _steal_launches method of fireworks.core.launchpad.py.

It seems that DupeFinder.verify is not called, then I think all matched FW by “query” will be regarded as duplicated.

Or, did I do something wrong?

I will appreciate any advice and help.

Thanks,

Akira Takahashi

You received this message because you are subscribed to the Google Groups “fireworkflows” group.

To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].

To post to this group, send email to [email protected].

Visit this group at https://groups.google.com/group/fireworkflows.

For more options, visit https://groups.google.com/d/optout.


Best,
Anubhav

Hi Akira,

I have modified the code again in the latest Github commit: https://github.com/materialsproject/fireworks/commit/ae7082eb010bbbd6a7c7884a3ac9a0e3ae907978

If this one looks good to you, I can do another official release; let me know!

If t

···

On Thursday, December 6, 2018 at 10:55:29 PM UTC-8, Akira Takahashi wrote:

Hi Anubhav,

Thanks for reply and update.

I’m sorry that my explanation wasn’t enough.

At the previous post, I had two problems, i.e.

  1. m_dupefinder.verify({}, {}) checks empty dicts, so in my ValueDupeFinder spec1[“value”] raises KeyError.
  1. Using try-else would be more defensive.

And the solution solved only the second problem and I didn’t write a solution of the first problem.

(Now I think that if DupeFinderBase.verify returns True instead of raising NotImplemetedError,

is_implemeted_test is not neccessary.

But I’m not sure this is a good solution…)

Now I also think that “query” condition should include {‘spec.value’: {$exists: True}},

or my customized verify method should handle KeyError like

def verify(self, spec1, spec2):

  if "value" not in spec1 or "value" not in spec2:
       return False
  return abs(spec1["value"] - spec2["value"]) < 10

``

because there might be FW whose spec does not have “value” key.

In conclusion, I will reconsider my customized query and verify to handle KeyError.

Anyway I appreciate your kind help.

Thanks,

Akira Takahashi

2018年12月7日金曜日 6時49分10秒 UTC+9 Anubhav Jain:

Hi Akira,

Thanks! Your solution is indeed more correct.

It should be reflected in FWS v1.8.2, hopefully this works for you.

Best,

Anubhav

On Tuesday, December 4, 2018 at 12:23:58 AM UTC-8, Akira Takahashi wrote:

(I am sorry that I mistakenly submit reply and delete it)

Hi Anubhav,

Thanks for updating.

Isn’t the update reflected on version 1.8.1?

I checked https://github.com/materialsproject/fireworks/releases, but the commit seems not reflected.

Therefore, I cloned from GitHub directly and tested.

But actually, I have a problem.

Since my ValueDupeFinder check the “value” key of spec, in the 1527th line of fireworks/core/launchpad.py, i.e.,

m_dupefinder.verify({}, {}) # is implemented test

raises KeyError.

Further more, I think the following code might be more defensive

try:

m_dupefinder.verify({}, {}) # is implemented test

except NotImplementedError:

verified = True

            else:

spec1 = dict(thief_fw.to_dict()[‘spec’]) # defensive copy
spec2 = dict(potential_match[‘spec’]) # defensive copy

verified = m_dupefinder.verify(spec1, spec2)

``

, instead of

try:

m_dupefinder.verify({}, {}) # is implemented test

spec1 = dict(thief_fw.to_dict()[‘spec’]) # defensive copy

spec2 = dict(potential_match[‘spec’]) # defensive copy

verified = m_dupefinder.verify(spec1, spec2)

except NotImplementedError:

verified = True

since NotImplementedError may cause in the function verify(spec1, spec2).

I know it is very rare case, but it is hard to find if occurs.

Thanks,

Akira Takahashi

2018年11月29日木曜日 0時50分31秒 UTC+9 Anubhav Jain:

I think I have fixed this, see:

https://github.com/materialsproject/fireworks/issues/310

and perhaps comment there if you see further issues. Thanks again for bringing this up!

On Monday, November 26, 2018 at 10:24:30 AM UTC-5, ajain wrote:

Hi Akira,

I checked and you are correct: I also did not see the “verify()” function being called.

That said, the previous version of FWs definitely had this code. For example, I reverted to FWS version 1.1.4 and the code for steal_launches looks like this:

for potential_match in self.fireworks.find(m_query):
    self.m_logger.debug(
        'Verifying for duplicates, fw_ids: {}, {}'.format(thief_fw.fw_id,
                                                          potential_match['fw_id']))
    spec1 = dict(thief_fw.to_dict()['spec']) # defensive copy
    spec2 = dict(potential_match['spec']) # defensive copy
    if m_dupefinder.verify(spec1, spec2): # verify the match
        # steal the launches
        victim_fw = self.get_fw_by_id(potential_match['fw_id'])


I am currently at the MRS conference, but will look into what happened and try to produce a patch as soon as I can. It might be as simple as restoring the above code, but I would like to see when and why it was removed in the first place. You can track the progress here:

[https://github.com/materialsproject/fireworks/issues/310](https://github.com/materialsproject/fireworks/issues/310)

Thank you for pointing this out!
Anubhav

On Mon, Nov 26, 2018 at 6:02 AM Akira Takahashi [email protected] wrote:

Hi,

I’m trying to customize DupeFinder, but it doesn’t work.

I wrote the following script.

my_fw_for_test.py

from fireworks import explicit_serialize, FiretaskBase, FWAction, Firework

from fireworks.features.dupefinder import DupeFinderBase

path = ["/home/taka/test_duplicate_check"] # directory where script exists

@explicit_serialize

class DoNothingTask(FiretaskBase):

required_params = [“name”]

def run_task(self, fw_spec):

   return FWAction()

class ValueDupeFinder(DupeFinderBase):

_fw_name = ‘ValueDupeFinder’

def verify(self, spec1, spec2):

   """
   Method that checks whether two specs are identical enough to be considered duplicates.
   Return true if duplicated.
   Args:
   spec1 (dict)
   spec2 (dict)
   Returns:
       bool
   """
   return abs(spec1["value"] - spec2["value"]) < 10

def query(self, spec):

   """
   Given a spec, returns a database query that gives potential candidates for duplicated Fireworks.
   Args:
       spec (dict): spec to check for duplicates
   """
   return {"$and": [{"launches": {"$ne": []}}, {"spec": spec}]}

class DupTestFW(Firework):

_fw_name = “FirstFW”

def init(self, value, parents=None):

    self._task = [DoNothingTask(name="nothing")]
   super().__init__(self._task, parents=parents, name=self._fw_name,
                    spec={"_dupefinder": ValueDupeFinder(),
                          "value": value})

``

main.py

from future import division, unicode_literals, print_function

``

import sys

from fireworks import LaunchPad, Workflow

from my_fw_for_test import DupTestFW

lpad = LaunchPad.auto_load()

def add_wf(name):

first = DupTestFW(value=float(sys.argv[1]))

wf = Workflow([first], name=name)

lpad.add_wf(wf)

if name == “main”:

add_wf(“test_duplicate_check”)

I executed command as following:

lpad reset
python main.py 0 && rlaunch singleshot
python main.py 5 && rlaunch singleshot
python main.py 100 && rlaunch singleshot

``

I expect that my ValueDupeFinder regard only the second FW as duplicated FW.

However, actually neither the second nor third FW are regarded as duplicated FW by ValueDupeFinder.

I also try rewrite ValueDupeFinder.query as

return {"$and": [{“launches”: {"$ne": []}}]}

``

, then both of the second and third FW are regarded as duplicated FW by ValueDupeFinder. (I did “lpad reset” at the first.)

I read the code of _steal_launches method of fireworks.core.launchpad.py.

It seems that DupeFinder.verify is not called, then I think all matched FW by “query” will be regarded as duplicated.

Or, did I do something wrong?

I will appreciate any advice and help.

Thanks,

Akira Takahashi

You received this message because you are subscribed to the Google Groups “fireworkflows” group.

To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].

To post to this group, send email to [email protected].

Visit this group at https://groups.google.com/group/fireworkflows.

For more options, visit https://groups.google.com/d/optout.


Best,
Anubhav

Hi Anubhav,

I checked my dupefinder works as expected, and now there is no problem for me.

Thank you very much!

Best,

Akira Takahashi
2018年12月8日土曜日 2時23分14秒 UTC+9 Anubhav Jain:

···

Hi Akira,

I have modified the code again in the latest Github commit: https://github.com/materialsproject/fireworks/commit/ae7082eb010bbbd6a7c7884a3ac9a0e3ae907978

If this one looks good to you, I can do another official release; let me know!

If t

On Thursday, December 6, 2018 at 10:55:29 PM UTC-8, Akira Takahashi wrote:

Hi Anubhav,

Thanks for reply and update.

I’m sorry that my explanation wasn’t enough.

At the previous post, I had two problems, i.e.

  1. m_dupefinder.verify({}, {}) checks empty dicts, so in my ValueDupeFinder spec1[“value”] raises KeyError.
  1. Using try-else would be more defensive.

And the solution solved only the second problem and I didn’t write a solution of the first problem.

(Now I think that if DupeFinderBase.verify returns True instead of raising NotImplemetedError,

is_implemeted_test is not neccessary.

But I’m not sure this is a good solution…)

Now I also think that “query” condition should include {‘spec.value’: {$exists: True}},

or my customized verify method should handle KeyError like

def verify(self, spec1, spec2):

  if "value" not in spec1 or "value" not in spec2:
       return False
  return abs(spec1["value"] - spec2["value"]) < 10

``

because there might be FW whose spec does not have “value” key.

In conclusion, I will reconsider my customized query and verify to handle KeyError.

Anyway I appreciate your kind help.

Thanks,

Akira Takahashi

2018年12月7日金曜日 6時49分10秒 UTC+9 Anubhav Jain:

Hi Akira,

Thanks! Your solution is indeed more correct.

It should be reflected in FWS v1.8.2, hopefully this works for you.

Best,

Anubhav

On Tuesday, December 4, 2018 at 12:23:58 AM UTC-8, Akira Takahashi wrote:

(I am sorry that I mistakenly submit reply and delete it)

Hi Anubhav,

Thanks for updating.

Isn’t the update reflected on version 1.8.1?

I checked https://github.com/materialsproject/fireworks/releases, but the commit seems not reflected.

Therefore, I cloned from GitHub directly and tested.

But actually, I have a problem.

Since my ValueDupeFinder check the “value” key of spec, in the 1527th line of fireworks/core/launchpad.py, i.e.,

m_dupefinder.verify({}, {}) # is implemented test

raises KeyError.

Further more, I think the following code might be more defensive

try:

m_dupefinder.verify({}, {}) # is implemented test

except NotImplementedError:

verified = True

            else:

spec1 = dict(thief_fw.to_dict()[‘spec’]) # defensive copy
spec2 = dict(potential_match[‘spec’]) # defensive copy

verified = m_dupefinder.verify(spec1, spec2)

``

, instead of

try:

m_dupefinder.verify({}, {}) # is implemented test

spec1 = dict(thief_fw.to_dict()[‘spec’]) # defensive copy

spec2 = dict(potential_match[‘spec’]) # defensive copy

verified = m_dupefinder.verify(spec1, spec2)

except NotImplementedError:

verified = True

since NotImplementedError may cause in the function verify(spec1, spec2).

I know it is very rare case, but it is hard to find if occurs.

Thanks,

Akira Takahashi

2018年11月29日木曜日 0時50分31秒 UTC+9 Anubhav Jain:

I think I have fixed this, see:

https://github.com/materialsproject/fireworks/issues/310

and perhaps comment there if you see further issues. Thanks again for bringing this up!

On Monday, November 26, 2018 at 10:24:30 AM UTC-5, ajain wrote:

Hi Akira,

I checked and you are correct: I also did not see the “verify()” function being called.

That said, the previous version of FWs definitely had this code. For example, I reverted to FWS version 1.1.4 and the code for steal_launches looks like this:

for potential_match in self.fireworks.find(m_query):
    self.m_logger.debug(
        'Verifying for duplicates, fw_ids: {}, {}'.format(thief_fw.fw_id,
                                                          potential_match['fw_id']))
    spec1 = dict(thief_fw.to_dict()['spec']) # defensive copy
    spec2 = dict(potential_match['spec']) # defensive copy
    if m_dupefinder.verify(spec1, spec2): # verify the match
        # steal the launches
        victim_fw = self.get_fw_by_id(potential_match['fw_id'])


I am currently at the MRS conference, but will look into what happened and try to produce a patch as soon as I can. It might be as simple as restoring the above code, but I would like to see when and why it was removed in the first place. You can track the progress here:

[https://github.com/materialsproject/fireworks/issues/310](https://github.com/materialsproject/fireworks/issues/310)

Thank you for pointing this out!
Anubhav

On Mon, Nov 26, 2018 at 6:02 AM Akira Takahashi [email protected] wrote:

Hi,

I’m trying to customize DupeFinder, but it doesn’t work.

I wrote the following script.

my_fw_for_test.py

from fireworks import explicit_serialize, FiretaskBase, FWAction, Firework

from fireworks.features.dupefinder import DupeFinderBase

path = ["/home/taka/test_duplicate_check"] # directory where script exists

@explicit_serialize

class DoNothingTask(FiretaskBase):

required_params = [“name”]

def run_task(self, fw_spec):

   return FWAction()

class ValueDupeFinder(DupeFinderBase):

_fw_name = ‘ValueDupeFinder’

def verify(self, spec1, spec2):

   """
   Method that checks whether two specs are identical enough to be considered duplicates.
   Return true if duplicated.
   Args:
   spec1 (dict)
   spec2 (dict)
   Returns:
       bool
   """
   return abs(spec1["value"] - spec2["value"]) < 10

def query(self, spec):

   """
   Given a spec, returns a database query that gives potential candidates for duplicated Fireworks.
   Args:
       spec (dict): spec to check for duplicates
   """
   return {"$and": [{"launches": {"$ne": []}}, {"spec": spec}]}

class DupTestFW(Firework):

_fw_name = “FirstFW”

def init(self, value, parents=None):

    self._task = [DoNothingTask(name="nothing")]
   super().__init__(self._task, parents=parents, name=self._fw_name,
                    spec={"_dupefinder": ValueDupeFinder(),
                          "value": value})

``

main.py

from future import division, unicode_literals, print_function

``

import sys

from fireworks import LaunchPad, Workflow

from my_fw_for_test import DupTestFW

lpad = LaunchPad.auto_load()

def add_wf(name):

first = DupTestFW(value=float(sys.argv[1]))

wf = Workflow([first], name=name)

lpad.add_wf(wf)

if name == “main”:

add_wf(“test_duplicate_check”)

I executed command as following:

lpad reset
python main.py 0 && rlaunch singleshot
python main.py 5 && rlaunch singleshot
python main.py 100 && rlaunch singleshot

``

I expect that my ValueDupeFinder regard only the second FW as duplicated FW.

However, actually neither the second nor third FW are regarded as duplicated FW by ValueDupeFinder.

I also try rewrite ValueDupeFinder.query as

return {"$and": [{“launches”: {"$ne": []}}]}

``

, then both of the second and third FW are regarded as duplicated FW by ValueDupeFinder. (I did “lpad reset” at the first.)

I read the code of _steal_launches method of fireworks.core.launchpad.py.

It seems that DupeFinder.verify is not called, then I think all matched FW by “query” will be regarded as duplicated.

Or, did I do something wrong?

I will appreciate any advice and help.

Thanks,

Akira Takahashi

You received this message because you are subscribed to the Google Groups “fireworkflows” group.

To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].

To post to this group, send email to [email protected].

Visit this group at https://groups.google.com/group/fireworkflows.

For more options, visit https://groups.google.com/d/optout.


Best,
Anubhav