Deleting fireworks

Dear all,
How do I properly delete a fw without breaking the queue launcher, and more generally the db? I suspect the short answer is not to delete a fw at all, but rather to defuse or archive it.

Imagine I have a simple wf with two fws, where fw2 depends on fw1.
Since fw1 was FIZZLED and fw2 was still WAITING, I removed fw2 with:

lpad.delete_fws([fw2_id]) 

Now, whenever I query the database for this wf, I get a ValueError:

lpad.get_wf_by_fw_id(fw1_id) #or fw2_id
ValueError: No Firework exists with id: fw2_id

I get the same error with qlaunch rapidfire:

 $: qlaunch rapidfire
2022-12-14 11:06:11,809 INFO getting queue adapter
2022-12-14 11:06:11,809 INFO Found previous block, using /RUNS/block_2022-12-14-09-11-59-823101
2022-12-14 11:06:11,829 ERROR ----|vvv|----
2022-12-14 11:06:11,829 ERROR Error with queue launcher rapid fire!
2022-12-14 11:06:11,832 ERROR Traceback (most recent call last):
  File "/home/mdi0316/miniconda3/envs/kubas/lib/python3.10/site-packages/fireworks/queue/queue_launcher.py", line 287, in rapidfire
    or (nlaunches == 0 and not launchpad.future_run_exists(fworker))
  File "/home/mdi0316/miniconda3/envs/kubas/lib/python3.10/site-packages/fireworks/core/launchpad.py", line 919, in future_run_exists
    if any(self.get_fw_dict_by_id(i)["state"] == "WAITING" for i in children):
  File "/home/mdi0316/miniconda3/envs/kubas/lib/python3.10/site-packages/fireworks/core/launchpad.py", line 919, in <genexpr>
    if any(self.get_fw_dict_by_id(i)["state"] == "WAITING" for i in children):
  File "/home/mdi0316/miniconda3/envs/kubas/lib/python3.10/site-packages/fireworks/core/launchpad.py", line 499, in get_fw_dict_by_id
    raise ValueError(f"No Firework exists with id: {fw_id}")
ValueError: No Firework exists with id: 1965

2022-12-14 11:06:11,832 ERROR ----|^^^|----

The question is threefold:

  1. How do I prevent this from happening next time, and how do I remove only one wf without destroying my db (probably archive/defuse instead of delete)?
  2. Is there a way to fix my db now? Can I add the missing fw by hand?
  3. I understand that lpad complains if you ask for a missing fw specifically. But qlaunch checking the old wf in the same block is a separate problem. If qlaunch could skip the missing fw, that would also work around the problem.

Thanks
Marco

Update

qlaunch rapidfire is not “totally” broken:

2022-12-14 13:49:19,382 INFO getting queue adapter
2022-12-14 13:49:19,383 INFO Found previous block, using /mnt/lustrefs/data/mdi0316/WORK_CLUSTER/KUBAS/RUNS/block_2022-12-14-11-02-20-754912
2022-12-14 13:49:20,290 INFO Launching a rocket!
2022-12-14 13:49:20,301 INFO Created new dir /mnt/lustrefs/data/mdi0316/WORK_CLUSTER/KUBAS/RUNS/block_2022-12-14-11-02-20-754912/launcher_2022-12-14-12-49-20-299965
2022-12-14 13:49:20,302 INFO moving to launch_dir /mnt/lustrefs/data/mdi0316/WORK_CLUSTER/KUBAS/RUNS/block_2022-12-14-11-02-20-754912/launcher_2022-12-14-12-49-20-299965
2022-12-14 13:49:20,319 INFO submitting queue script
2022-12-14 13:49:20,482 INFO Job submission was successful and job_id is 28275140
2022-12-14 13:49:20,498 INFO Sleeping for 5 seconds...zzz...

--- a bunch of other successful submissions ---

2022-12-14 13:49:55,870 ERROR ----|vvv|----
2022-12-14 13:49:55,870 ERROR Error with queue launcher rapid fire!
2022-12-14 13:49:55,925 ERROR Traceback (most recent call last):
  File "/home/mdi0316/miniconda3/envs/kubas/lib/python3.10/site-packages/fireworks/queue/queue_launcher.py", line 287, in rapidfire
    or (nlaunches == 0 and not launchpad.future_run_exists(fworker))
  File "/home/mdi0316/miniconda3/envs/kubas/lib/python3.10/site-packages/fireworks/core/launchpad.py", line 919, in future_run_exists
    if any(self.get_fw_dict_by_id(i)["state"] == "WAITING" for i in children):
  File "/home/mdi0316/miniconda3/envs/kubas/lib/python3.10/site-packages/fireworks/core/launchpad.py", line 919, in <genexpr>
    if any(self.get_fw_dict_by_id(i)["state"] == "WAITING" for i in children):
  File "/home/mdi0316/miniconda3/envs/kubas/lib/python3.10/site-packages/fireworks/core/launchpad.py", line 499, in get_fw_dict_by_id
    raise ValueError(f"No Firework exists with id: {fw_id}")
ValueError: No Firework exists with id: 1965

2022-12-14 13:49:55,925 ERROR ----|^^^|----

  1. To prevent this from happening in the future, please don’t delete individual fireworks. Note that if fw2 doesn’t depend on fw1 completing (such that fw1 can simply be deleted anyway), they probably shouldn’t be put in a dependency chain in the first place. Some of the more proper ways to archive/delete are covered in: Canceling (pausing), restarting, and deleting Workflows — FireWorks 2.0.3 documentation

  2. I think you have two options … You can try to repair, but this would mean manually adding in appropriate firework docs in the firework collection and appropriate launch docs in the launches collection. It may be simpler to archive or just delete the entire workflow (if still possible) and just re-enter a new one with the fireworks you want to execute.

  3. Unsure what is happening with qlaunch, but my best guess is that it is identifying your orphaned firework as the next job to run and then pulling up the workflow to update its state. When doing that, it runs into an error since the workflow is now incomplete. If qlaunch pulls up any firework to run other than your orphaned one, it is likely going to run fine. One thing you can do to prevent it from trying to run the orphaned firework (apart from the defuse/archive/etc. mentioned previously) is to set the priority of the “good” workflows to be high and that of the “bad” workflow to be low.
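
As a rough sketch of the suggestions above (untested against your database): the first helper lists fw ids that a workflow document still references but that no longer exist, and the second retires a whole workflow through LaunchPad's defuse_wf, archive_wf, or delete_wf instead of deleting single fireworks. The helper names and the mode argument are hypothetical, and the sketch assumes that workflow documents keep their fw ids in a "nodes" list.

```python
def find_dangling_fw_ids(workflow_docs, existing_fw_ids):
    """Return {workflow _id: [missing fw_ids]} for workflows that reference
    fireworks which are no longer present.

    Assumption: each workflow document lists its fw ids under "nodes".
    """
    existing = set(existing_fw_ids)
    dangling = {}
    for doc in workflow_docs:
        missing = sorted(set(doc.get("nodes", [])) - existing)
        if missing:
            dangling[doc.get("_id")] = missing
    return dangling


# Hypothetical mode names mapped to the LaunchPad methods they call.
_RETIRE_METHODS = {
    "defuse": "defuse_wf",    # stop the workflow from running, keep everything
    "archive": "archive_wf",  # soft delete, the documents stay in the db
    "delete": "delete_wf",    # remove the whole workflow, not a single fw
}


def retire_workflow(lpad, fw_id, mode="archive"):
    """Retire the entire workflow that contains fw_id."""
    if mode not in _RETIRE_METHODS:
        raise ValueError(f"mode must be one of {sorted(_RETIRE_METHODS)}")
    getattr(lpad, _RETIRE_METHODS[mode])(fw_id)
```

With a real LaunchPad (for example lpad = LaunchPad.auto_load()), the inputs for the check could come from lpad.workflows.find({}, {"nodes": 1}) and lpad.fireworks.distinct("fw_id"), and lpad.set_priority(fw_id, priority) can be used to lower the priority of a firework you don't want picked up first. Whether archive_wf or delete_wf still succeeds on a workflow that is already missing a firework is something you would have to try.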

One more note - if you want fw2 to execute regardless of whether fw1 completes successfully or not, there is an _allow_fizzled_parents reserved keyword you can put in the spec of fw2 and set to True. Then fw2 will execute even if fw1 is unsuccessful. A bit more about reserved keywords here: Reference material — FireWorks 2.0.3 documentation
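
A minimal sketch of how that keyword could be set, assuming a two-step workflow like yours. The helper names and the shell commands are placeholders I made up for illustration; only the _allow_fizzled_parents key comes from FireWorks itself.

```python
def allow_fizzled_parents(spec=None):
    """Return a copy of spec with the reserved keyword set to True."""
    new_spec = dict(spec or {})
    new_spec["_allow_fizzled_parents"] = True
    return new_spec


def build_two_step_workflow():
    """Build fw1 -> fw2, where fw2 is allowed to run even if fw1 fizzles.

    Requires FireWorks to be installed; the commands are placeholders.
    """
    from fireworks import Firework, ScriptTask, Workflow

    fw1 = Firework(ScriptTask.from_str("echo 'step 1'"), name="fw1")
    fw2 = Firework(
        ScriptTask.from_str("echo 'step 2'"),
        spec=allow_fizzled_parents(),
        name="fw2",
    )
    # fw2 is a child of fw1
    return Workflow([fw1, fw2], links_dict={fw1: [fw2]})
```

The resulting workflow can then be added with lpad.add_wf(build_two_step_workflow()).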