Share fireworks DB across multiple supercomputer environments.

Hi

I have checked the website and forum but don’t see an answer to this question: Is it possible to use FireWorks across multiple supercomputers, such as a local cluster and various other supercomputers that would normally share data via scp? I’m commonly running jobs on three different supercomputers, and being able to collect all the run info and keep it organized would be fantastic.

Thanks
Kirk

Hi Kirk,

Sure, we use FWS across multiple supercomputers - there is no issue as long as each machine can access the LaunchPad database. Some of the details are below:
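For context, “access” just means each cluster has a my_launchpad.yaml pointing at the same Mongo server. A minimal sketch (the hostname and credentials below are placeholders, not real values):

    # my_launchpad.yaml -- an identical copy on every supercomputer
    host: mongo.example.edu    # placeholder; your central Mongo host
    port: 27017
    name: fireworks            # database name
    username: fw_user          # placeholder credentials
    password: fw_pass

As long as the login (or compute) nodes of each machine can reach that host/port, lpad and rlaunch behave the same everywhere.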

  • If you just connect to the same LaunchPad from different supercomputers, you can run jobs from multiple places. There are some additional features like “FireWorkers” that help configure which jobs get run where (a minimal sketch follows the links below). A lot of this is covered here:

http://pythonhosted.org/FireWorks/controlworker.html

http://pythonhosted.org/FireWorks/worker_tutorial.html
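To make the routing concrete, here is a small sketch of the FireWorker setup from those tutorials (the names and category strings are made up): each machine runs with its own my_fworker.yaml, and a Firework can be pinned to a machine (or class of machines) via the reserved _category key in its spec:

    # my_fworker.yaml on machine A (name/category are placeholders)
    name: machine_A
    category: machine_A_jobs
    query: '{}'

    # in your workflow-building script:
    from fireworks import Firework, ScriptTask

    # only FireWorkers whose category is "machine_A_jobs" will pull this job
    fw = Firework(ScriptTask.from_str('echo "run me on machine A"'),
                  spec={'_category': 'machine_A_jobs'})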

  • If you want to pass data between the different machines, everything should work fine as long as you pass the data via the dynamic workflow features, which use the central Mongo database (a toy sketch follows below):

http://pythonhosted.org/FireWorks/dynamic_wf_tutorial.html

That is because the dynamic workflows method pushes the information back to the central LaunchPad database, where it can be accessed and queried by any machine. So multiple machines are not a problem.
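As a toy sketch of that pattern, a FireTask running on machine A can return an FWAction; the update_spec/stored_data it carries is written to the LaunchPad, so a child Firework pulled later by machine B sees it in its fw_spec. The task and key names here are invented for illustration:

    from fireworks import FireTaskBase, FWAction, explicit_serialize

    @explicit_serialize
    class AnalyzeTask(FireTaskBase):
        """Toy task: computes a value and pushes it to the central DB."""

        def run_task(self, fw_spec):
            result = fw_spec['input_value'] * 2
            # update_spec is merged into the child Firework's spec via the
            # LaunchPad, so the child can run on an entirely different
            # machine; stored_data just archives the value for querying
            return FWAction(update_spec={'analyzed_value': result},
                            stored_data={'result': result})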

  • If you require output files (i.e., actual files, not JSON data inside MongoDB) to be shared between machines, e.g. job 1 on machine A produces an output file that needs to be passed to job 2 on machine B, there is currently nothing in FWS that does this for you, and you will need to write at least a little custom code. There is a FileTransferTask that can help with this, but on its own it would not be sufficient. At one point I considered adding an easy way to store file data in a central GridFS database through FireWorks, but I never implemented it.
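For what it’s worth, the FileTransferTask piece of that might look like the sketch below (hostnames and paths are made up; 'rtransfer' mode pushes files over SFTP and assumes paramiko plus working SSH keys between the machines). The part FWS won’t do for you, e.g. making job 2 on machine B wait until the file has actually landed, is the custom glue mentioned above:

    from fireworks import Firework
    from fireworks.user_objects.firetasks.fileio_tasks import FileTransferTask

    # after job 1's main task, ship its output to machine B (placeholder values)
    push_files = FileTransferTask({'files': ['job1_output.dat'],
                                   'dest': '/scratch/kirk/job2_inputs/',
                                   'mode': 'rtransfer',
                                   'server': 'machineB.example.edu'})
    fw1 = Firework([push_files])  # would normally follow job 1's real tasks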

Best,

Anubhav


Dear Anubhav

This is great news. Thanks so much for the information and the how-to. I’m going to give FireWorks on multiple supercomputers a shot.

Best Regards
Kirk
