Parallel processing in FireWorks

Dear Fireworks support,

I tried the Python example at the end of the official FireWorks workflow tutorial:

https://pythonhosted.org/FireWorks/workflow_tutorial.html

with one small change, appending " && sleep some_seconds" to each command:

task1 = ScriptTask.from_str('echo "Ingrid is the CEO." && sleep 1')

task2 = ScriptTask.from_str('echo "Jill is a manager." && sleep 100')

task3 = ScriptTask.from_str('echo "Jack is a manager." && sleep 100')

task4 = ScriptTask.from_str('echo "Kip is an intern." && sleep 1')

Then I discovered that task2 and task3 are executed sequentially, not in parallel as the diagram shows.

Could you tell me how to make task2 and task3 run in parallel, from Python?

Thanks in advance!

Regards,

Alex Li

Bioinformatics

Dupont Pioneer

Hi Alex,

In the diamond workflow, after FW1 runs, the next two jobs (FWs 2 and 3) become “READY” to run simultaneously. However, if you use a single serial worker (e.g., rlaunch rapidfire), that worker pulls one job at a time from the database and runs it before pulling the next. So FWs 2 and 3 run serially, as you noted.

To run in parallel, you need to either (i) have multiple workers pulling jobs or (ii) use a launch mode that processes jobs in parallel.

For (i), simply open two terminals, run “rlaunch rapidfire --nlaunches infinite” in each, and keep those running. To see the effect, you may need a workflow that does a non-trivial amount of processing; for example, instead of just a print statement, use a task that sleeps for 100 seconds before printing. With two different workers (either on the same machine or on different machines), your jobs will be pulled and executed in parallel.

For (ii), you can read about the mlaunch mode of launching:

https://pythonhosted.org/FireWorks/multi_job.html

Best,

Anubhav
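The serial-versus-parallel behaviour described above can be illustrated without FireWorks or MongoDB. The following is only a standard-library sketch, not FireWorks code: the names `LINKS`, `DURATIONS`, `run_task`, `run_diamond`, and `overlaps` are hypothetical, threads stand in for workers, and the sleep times are scaled down from the ones in the question.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

# Diamond dependency graph: parent -> children (same shape as the tutorial workflow).
LINKS = {"fw1": ["fw2", "fw3"], "fw2": ["fw4"], "fw3": ["fw4"], "fw4": []}
# Sleep times in seconds stand in for real work (scaled down from the thread).
DURATIONS = {"fw1": 0.05, "fw2": 0.5, "fw3": 0.5, "fw4": 0.05}


def run_task(name):
    """Pretend to run one job; return its name plus start and end timestamps."""
    start = time.monotonic()
    time.sleep(DURATIONS[name])
    return name, start, time.monotonic()


def run_diamond(num_workers):
    """Run the graph with `num_workers` workers; return {name: (start, end)}."""
    parents = {name: set() for name in LINKS}
    for parent, children in LINKS.items():
        for child in children:
            parents[child].add(parent)

    done, submitted, running, spans = set(), set(), set(), {}
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        while len(done) < len(LINKS):
            # A job becomes ready once all of its parents have completed.
            for name in LINKS:
                if name not in submitted and parents[name] <= done:
                    running.add(pool.submit(run_task, name))
                    submitted.add(name)
            finished, running = wait(running, return_when=FIRST_COMPLETED)
            for future in finished:
                name, start, end = future.result()
                spans[name] = (start, end)
                done.add(name)
    return spans


def overlaps(a, b):
    """True if two (start, end) intervals overlap in time."""
    return a[0] < b[1] and b[0] < a[1]


if __name__ == "__main__":
    for n in (1, 2):
        spans = run_diamond(n)
        print(n, "worker(s): fw2/fw3 overlap =", overlaps(spans["fw2"], spans["fw3"]))
```

With a single worker, fw2 and fw3 are both ready at the same time but still run back to back; with two workers their run intervals overlap, which is the effect of options (i) and (ii).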


Anubhav,

I took a look at the documentation you mentioned and the source code of mlaunch_run.py, and modified the way to launch the rockets in the diamond example this way:

from fireworks.features.multi_launcher import launch_multiprocess

launch_multiprocess(launchpad, FWorker(), 'INFO', 4, 4, 1)

but it seems that FW2 and FW3 are still run sequentially, not in parallel.

Any suggestions? I would prefer to do it in Python rather than with the mlaunch tools.

Thanks in advance again!

Alex.


Can you attach the full source code?


You received this message because you are subscribed to the Google Groups “fireworkflows” group.

To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].

To view this discussion on the web visit https://groups.google.com/d/msgid/fireworkflows/64416604-69fa-4c35-b46c-d544073e9316%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Anubhav,

The complete source code is at the end of my reply. I think FW2 and FW3 now run in parallel, after I adjusted the sleep times to see the order more clearly. Sorry for not doing this correctly earlier; I have several versions of the test scripts. However, after I saw “Kip is an intern” (i.e., FW4 is done), my script did not exit and kept printing messages like these endlessly:

2015-06-20 15:13:03,379 INFO Checking for FWs to run… : (Process-2)

2015-06-20 15:13:03,381 INFO Sleeping for 2 secs : (Process-2)

2015-06-20 15:13:03,388 INFO Checking for FWs to run… : (Process-3)

2015-06-20 15:13:03,389 INFO Sleeping for 2 secs : (Process-3)

2015-06-20 15:13:04,617 INFO Checking for FWs to run… : (Process-4)

2015-06-20 15:13:04,619 INFO Sleeping for 2 secs : (Process-4)

2015-06-20 15:13:04,687 INFO Checking for FWs to run… : (Process-5)

2015-06-20 15:13:04,688 INFO Sleeping for 2 secs : (Process-5)

My new question is:

How can I write a Python script that runs only the rockets defined in my script and exits once those jobs are done?

Thanks again!

Alex.

The source code.

from fireworks import Firework, Workflow, FWorker, LaunchPad, ScriptTask
from fireworks.features.multi_launcher import launch_multiprocess

launchpad = LaunchPad(port=27017)
launchpad.reset('', require_password=False)

# define four individual FireWorks used in the Workflow
task1 = ScriptTask.from_str('echo "Ingrid is the CEO."')
task2 = ScriptTask.from_str('echo "Jill is a manager." && sleep 50')
task3 = ScriptTask.from_str('echo "Jack is a manager." && sleep 50')
task4 = ScriptTask.from_str('echo "Kip is an intern."')

fw1 = Firework(task1)
fw2 = Firework(task2)
fw3 = Firework(task3)
fw4 = Firework(task4)

# assemble Workflow from FireWorks and their connections by id
workflow = Workflow([fw1, fw2, fw3, fw4], {fw1: [fw2, fw3], fw2: [fw4], fw3: [fw4]})

# store workflow and launch it locally
launchpad.add_wf(workflow)
launch_multiprocess(launchpad, FWorker(), 'INFO', 4, 4, 2)

# end of the script


With my colleague’s help, I tried this combination and it works as expected:
launch_multiprocess(launchpad, FWorker(), 'INFO', 1, 4, 2)

I will test a real case next.

Thanks very much anyway!

Alex.
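One possible reading of why changing the fourth argument from 4 to 1 lets the script exit is sketched below. It assumes (the thread does not confirm this) that the argument is a per-process launch count and that each process keeps polling until it has made that many launches. The code is a standard-library toy model, not FireWorks internals; `worker`, `simulate`, `quota`, and `max_idle_polls` are hypothetical names, and the workers here take turns rather than running concurrently.

```python
import time
from collections import deque


def worker(ready_jobs, quota, sleep_secs=0.01, max_idle_polls=3):
    """Launch jobs until `quota` launches have been made.

    Returns (launched, gave_up). `max_idle_polls` exists only so this demo
    terminates; a polling loop without such a cap would keep checking and
    sleeping indefinitely once the queue is empty.
    """
    launched, idle = 0, 0
    while launched < quota:
        if ready_jobs:
            ready_jobs.popleft()()  # pull the next ready job and run it
            launched += 1
            idle = 0
        elif idle >= max_idle_polls:
            return launched, True  # would still be "Sleeping for N secs"
        else:
            idle += 1
            time.sleep(sleep_secs)
    return launched, False  # quota reached, so the worker exits cleanly


def simulate(num_jobs, num_workers, quota):
    """Let each worker in turn draw from one shared queue of no-op jobs."""
    ready_jobs = deque((lambda: None) for _ in range(num_jobs))
    return [worker(ready_jobs, quota) for _ in range(num_workers)]


if __name__ == "__main__":
    print("quota 4:", simulate(4, 4, 4))
    print("quota 1:", simulate(4, 4, 1))
```

In this model, four jobs shared by four workers with a quota of 4 leave three workers that never reach their quota and keep polling, which resembles the repeated log lines earlier in the thread. With a quota of 1, each worker launches one job and exits.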
