Configuring qlaunch

Hello,

When submitting jobs on our local cluster, I simply run $ qsub parallel.sh and the job goes into our queue. I’m having problems incorporating FireWorks into this. I have reached the tutorial about launching rockets through a queue.

This is what was returned after running $ qlaunch singleshot:

Database at /data_piglet/lansan/atomate/config/FW_config.yaml is getting selected.

Found many potential paths for LAUNCHPAD_LOC: ['/data_piglet/lansan/atomate/config/my_launchpad.yaml', '/data_piglet/lansan/atomate/fw_tutorials/queue/queue_tests/my_launchpad.yaml']

Choosing as default: /data_piglet/lansan/atomate/config/my_launchpad.yaml

Found many potential paths for FWORKER_LOC: ['/data_piglet/lansan/atomate/config/my_fworker.yaml', '/data_piglet/lansan/atomate/fw_tutorials/queue/queue_tests/my_fworker.yaml']

Choosing as default: /data_piglet/lansan/atomate/config/my_fworker.yaml

Found many potential paths for QUEUEADAPTER_LOC: ['/data_piglet/lansan/atomate/config/my_qadapter.yaml', '/data_piglet/lansan/atomate/fw_tutorials/queue/queue_tests/my_qadapter.yaml']

Choosing as default: /data_piglet/lansan/atomate/config/my_qadapter.yaml

2017-12-07 20:36:10,186 INFO moving to launch_dir /data_piglet/lansan/atomate/fw_tutorials/queue/queue_tests

/data_piglet/lansan/atomate/atomate_env/lib/python3.4/site-packages/fireworks/queue/queue_adapter.py:142: UserWarning: Key logdir has been specified in qadapter but it is not present in template, please check template (/data_piglet/lansan/atomate/atomate_env/lib/python3.4/site-packages/fireworks/user_objects/queue_adapters/SLURM_template.txt) for supported keys.

.format(subs_key, self.template_file))

2017-12-07 20:36:10,187 INFO submitting queue script

2017-12-07 20:36:10,191 ERROR ----|vvv|----

2017-12-07 20:36:10,192 ERROR Running the command: sbatch caused an error…

2017-12-07 20:36:10,194 ERROR Traceback (most recent call last):

File "/data_piglet/lansan/atomate/atomate_env/lib/python3.4/site-packages/fireworks/user_objects/queue_adapters/common_adapter.py", line 204, in submit_to_queue

p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

File "/usr/lib64/python3.4/subprocess.py", line 856, in __init__

restore_signals, start_new_session)

File "/usr/lib64/python3.4/subprocess.py", line 1460, in _execute_child

raise child_exception_type(errno_num, err_msg)

FileNotFoundError: [Errno 2] No such file or directory: 'sbatch'

2017-12-07 20:36:10,194 ERROR ----|^^^|----

2017-12-07 20:36:10,195 ERROR ----|vvv|----

2017-12-07 20:36:10,195 ERROR Error writing/submitting queue script!

2017-12-07 20:36:10,196 ERROR Traceback (most recent call last):

File "/data_piglet/lansan/atomate/atomate_env/lib/python3.4/site-packages/fireworks/queue/queue_launcher.py", line 136, in launch_rocket_to_queue

raise RuntimeError('queue script could not be submitted, check queue '

RuntimeError: queue script could not be submitted, check queue script/queue adapter/queue server status!

Seems like a lot of errors. I am fairly new to FireWorks and would really like to configure this to our cluster.

Best,

Ralph

parallel_group2.sh (1.32 KB)

Hi Ralph

There are many types of queue management software (e.g., SLURM and PBS).

Based on your parallel.sh you have PBS, but based on the error message trying to execute sbatch it looks like your my_qadapter.yaml file is configured to try SLURM.
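One quick way to confirm which scheduler a login node offers is to check which submit commands are on the PATH. The snippet below is only a sketch: the mapping from command name to queue type is an assumption made for illustration (qsub is also used by schedulers other than PBS, such as SGE), and the function name is hypothetical, not part of FireWorks.

```python
import shutil

# Submit commands to probe for. The mapping to queue types is an
# assumption for illustration; qsub alone does not prove PBS, since
# other schedulers (e.g. SGE) also provide a qsub command.
SUBMIT_COMMANDS = {"SLURM": "sbatch", "PBS": "qsub"}


def available_schedulers(which=shutil.which):
    """Return {queue_type: path} for each submit command found on PATH."""
    found = {}
    for q_type, cmd in SUBMIT_COMMANDS.items():
        path = which(cmd)
        if path is not None:
            found[q_type] = path
    return found


if __name__ == "__main__":
    found = available_schedulers()
    if not found:
        print("Neither sbatch nor qsub was found on PATH")
    for q_type, path in found.items():
        print(f"{q_type}: {path}")
```

If this reports only qsub, that matches the FileNotFoundError for sbatch in the log above: the adapter is asking for a command that does not exist on the machine.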

You need to:

  1. Locate your my_qadapter.yaml file. The messages indicate that the one being used is /data_piglet/lansan/atomate/config/my_qadapter.yaml. However, they also show that you have copies of this file (and of the other config files) in more than one location. You may want to fix that at a later point, e.g. by deleting the copies you don’t plan to use.

  2. Change the “_fw_q_type” parameter in this file to be “PBS” (probably currently says SLURM)

  3. Modify the other parameters in my_qadapter.yaml to match your desired queue settings in terms of number of nodes, walltime, etc. The parameters you can specify depend on which queuing system you set in step 2. For example, for PBS the possible template variables are found in the code: https://github.com/materialsproject/fireworks/blob/master/fireworks/user_objects/queue_adapters/PBS_template.txt

These parameters should match the types of settings that are already in your parallel.sh file.

  4. Run qlaunch singleshot again.
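
For steps 2 and 3, a PBS version of my_qadapter.yaml might look roughly like what the sketch below prints. This is an illustration under assumptions, not a definitive config: the keys nnodes, ppnode, walltime and queue are taken from the PBS template as I recall it, so please check them against the template linked above; the numeric values and the helper function are hypothetical and should be replaced with whatever your parallel.sh requests.

```python
def render_pbs_qadapter(rocket_launch, nnodes=1, ppnode=1,
                        walltime="00:02:00", queue=None):
    """Build the text of a minimal my_qadapter.yaml for a PBS queue.

    Key names are assumed from the FireWorks PBS template; verify them
    there before relying on this output.
    """
    lines = [
        "_fw_name: CommonAdapter",
        "_fw_q_type: PBS",  # was presumably SLURM before
        f"rocket_launch: {rocket_launch}",
        f"nnodes: {nnodes}",
        f"ppnode: {ppnode}",
        f"walltime: '{walltime}'",
        # YAML null means "leave this line out of the queue script"
        f"queue: {queue if queue is not None else 'null'}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Paths are the defaults reported in the log; values are placeholders.
    config_dir = "/data_piglet/lansan/atomate/config"
    command = (
        f"rlaunch -w {config_dir}/my_fworker.yaml "
        f"-l {config_dir}/my_launchpad.yaml singleshot"
    )
    print(render_pbs_qadapter(command, nnodes=1, ppnode=1))
```

The printed text can be compared against the existing my_qadapter.yaml; the logdir key that triggered the UserWarning is deliberately left out here.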

Best,
Anubhav