Queue rapidfire doesn't realize it can run jobs

I’m running a test example that submits to a Slurm queue:

from fireworks.queue.queue_launcher import rapidfire as rapidfirequeue
from fireworks.queue.queue_launcher import launch_rocket_to_queue
from fireworks.core.fworker import FWorker
import fireworks.fw_config
from fireworks.utilities.fw_serializers import load_object_from_file
from fireworks import Firework, Workflow, LaunchPad, ScriptTask
qadapter = load_object_from_file(fireworks.fw_config.QUEUEADAPTER_LOC)
launchpad = LaunchPad()
launchpad.reset('', require_password=False)
fw1 = Firework(ScriptTask.from_str('echo "hello" >> hello.txt'))
fw2 = Firework(ScriptTask.from_str('echo "goodbye" >> goodbye.txt'))
wf = Workflow([fw1, fw2], name="test workflow")
launchpad.add_wf(wf)
rapidfirequeue(launchpad, FWorker(), qadapter)

My qadapter file looks like:
_fw_name: CommonAdapter
_fw_q_type: SLURM
rocket_launch: rlaunch -w /home/mattsj/my_fworker.yaml -l /home/mattsj/my_launchpad.yaml singleshot
ntasks: 1
cpus_per_task: 8
walltime: '5-00:00:00'
queue: month-long-cpu
account: mattsj
job_name: null
logdir: /home/mattsj/fw_logs
pre_rocket: null
post_rocket: null

The fworker file is an exact copy of the tutorial example.

I’m on version 2.0.3 from conda-forge.

When I run the above Python script, the FW_job.out file among the output files contains:
No FireWorks are ready to run and match query! {'$or': [{'spec._fworker': {'$exists': False}}, {'spec._fworker': None}, {'spec._fworker': 'my first fireworker'}]}

Dumping the fireworks afterwards to dictionaries I get:

{'spec': {'_tasks': [{'script': ['echo "hello" >> hello.txt'],
'use_shell': True,
'_fw_name': 'ScriptTask'}]},
'fw_id': 2,
'created_on': '2022-07-11T21:19:46.133113',
'updated_on': '2022-07-11T21:19:46.138695',
'state': 'READY',
'name': 'Unnamed FW'}

and
{'spec': {'_tasks': [{'script': ['echo "goodbye" >> goodbye.txt'],
'use_shell': True,
'_fw_name': 'ScriptTask'}]},
'fw_id': 1,
'created_on': '2022-07-11T21:19:46.133171',
'updated_on': '2022-07-11T21:19:46.138699',
'state': 'READY',
'name': 'Unnamed FW'}

So it seems to not be running them because it thinks the current FWorker isn’t suitable.
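As a sanity check on that reading, here is a minimal plain-Python sketch of the three clauses in the query from the log. It is not FireWorks or MongoDB code: `matches_fworker_query` is a hypothetical helper that only mimics the `$or` on `spec._fworker` against a plain dict. Under that assumption, a spec with no `_fworker` key, like the two dumped above, would satisfy the first clause, which hints that the query itself may not be what excludes them.

```python
def matches_fworker_query(spec, fworker_name):
    """Mimic the $or query from the log on a plain spec dict (illustrative only)."""
    if "_fworker" not in spec:                # {'spec._fworker': {'$exists': False}}
        return True
    if spec["_fworker"] is None:              # {'spec._fworker': None}
        return True
    return spec["_fworker"] == fworker_name   # {'spec._fworker': 'my first fireworker'}


# Spec copied from the first dumped Firework above; it has no '_fworker' key.
dumped_spec = {
    "_tasks": [
        {
            "script": ['echo "hello" >> hello.txt'],
            "use_shell": True,
            "_fw_name": "ScriptTask",
        }
    ]
}

print(matches_fworker_query(dumped_spec, "my first fireworker"))
```

If the specs match but the job still reports nothing to run, one thing worth checking is whether the script and the rlaunch command inside the queue job are pointed at the same launchpad.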

This seems to have been a configuration issue. When I loaded the my_launchpad.yaml, my_fworker.yaml and my_qadapter.yaml files manually and passed those objects to rapidfire, the issue went away.
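For anyone landing here later, a rough sketch of what loading the files manually can look like. The helper names `config_paths` and `rapidfire_with_explicit_config` are made up for illustration, and it assumes all three YAML files sit in one directory (the thread only states that for the launchpad and fworker files). Treat it as one possible arrangement, not the exact script used above.

```python
from pathlib import Path


def config_paths(config_dir):
    """Return the three config file paths, assuming the tutorial's file names."""
    base = Path(config_dir)
    return {
        "launchpad": base / "my_launchpad.yaml",
        "fworker": base / "my_fworker.yaml",
        "qadapter": base / "my_qadapter.yaml",
    }


def rapidfire_with_explicit_config(config_dir):
    """Load all three objects from files and pass them to the queue rapidfire."""
    # Imported here so the path helper above works without FireWorks installed.
    from fireworks import LaunchPad
    from fireworks.core.fworker import FWorker
    from fireworks.queue.queue_launcher import rapidfire
    from fireworks.utilities.fw_serializers import load_object_from_file

    paths = config_paths(config_dir)
    launchpad = LaunchPad.from_file(str(paths["launchpad"]))
    fworker = FWorker.from_file(str(paths["fworker"]))
    qadapter = load_object_from_file(str(paths["qadapter"]))
    rapidfire(launchpad, fworker, qadapter)


# Example (needs a reachable MongoDB and a Slurm queue, so it is left commented out):
# rapidfire_with_explicit_config("/home/mattsj")
```

The point of loading from files is that the script and the rlaunch command in the qadapter then read the same launchpad and fworker settings, rather than the script falling back to defaults.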

Hi @mjohnson541.

You may need to specify the my_fworker.yaml file when you create the FWorker object, in the same way as for the launchpad and qadapter. In other words, FWorker() may not be finding your config file (presumably /home/mattsj/my_fworker.yaml).

Not sure though, because I usually run the qlaunch via command line and not via python.

Dear all,
I am in a similar situation:
I previously submitted a number of workflows with qlaunch rapidfire, and they were running fine.

After a while I had to pause some of them, and since then (a few hours now) FireWorks has been running only 5 workflows in total, while there are still 16 Slurm jobs running.

I have tried lpad.set_priority(), but that changes the priority inside a workflow, whereas I want to push a specific FW to start.

Do you have any suggestions?
Thanks
Marco

Hi Marco,
Do you still have qlaunch rapidfire running?

No, it exited on its own after sleeping for 60 seconds.

I’m not sure why this is happening. It needs more investigation.
I would try rerunning one of the READY FWs to see whether qlaunch picks it up again.
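A rough sketch of that check from Python, in case it helps. `rerun_and_report` is a hypothetical helper; it only relies on the LaunchPad methods `rerun_fw` and `get_fw_by_id`. The file name and the fw_id in the commented usage are placeholders, not values from this thread.

```python
def rerun_and_report(launchpad, fw_id):
    """Rerun one Firework and return its state afterwards."""
    launchpad.rerun_fw(fw_id)  # ask the launchpad to rerun this Firework
    return launchpad.get_fw_by_id(fw_id).state


# Example (assumes a my_launchpad.yaml in the working directory; fw_id is a placeholder):
# from fireworks import LaunchPad
# lp = LaunchPad.from_file("my_launchpad.yaml")
# print(lp.get_fw_ids({"state": "READY"}))  # list candidate READY Fireworks
# print(rerun_and_report(lp, fw_id=1))
```

After the rerun, starting qlaunch rapidfire again and watching whether that Firework gets picked up should show whether the launchpad side or the queue side is stuck.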