I ran into a problem with too many open files during the execution of big workflows (several thousand fireworks). The system is currently under development, so this is not an issue yet, but it will become one, because the final system is intended to run many workflows of several thousand fireworks each on a dedicated cluster.
Below you can find the state of a simple ScriptTask that FIZZLED, together with the call stack that led to the problem.
My system limit was 1024 open files (ulimit -n). I have increased this limit to 4096 and everything runs fine now, but I wonder how I can reduce the number of files opened by FireWorks:
- Should I reduce the number of fireworks and increase the number of tasks inside them, or something along those lines (see the sketch after this list)?
- Should I run the ScriptTask with use_shell=False?
- … any advice would be appreciated.
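To make the first two options concrete, here is a minimal sketch of what I have in mind, assuming the documented ScriptTask parameters (script, use_shell); the echo commands and the argv-list form for use_shell=False are my own placeholders/assumptions:

```python
from fireworks import Firework, LaunchPad, ScriptTask, Workflow

# Option 1: pack several ScriptTasks into a single Firework instead of
# creating one Firework per command (the echo commands are placeholders).
packed_fw = Firework(
    [ScriptTask.from_str('echo "step {}"'.format(i)) for i in range(5)],
    name="packed_steps",
)

# Option 2: run the command without an intermediate shell (shell=False in
# subprocess.Popen); the command is then given as an argv list instead of
# a single shell string.
no_shell_fw = Firework(
    ScriptTask({"script": [["echo", "ending correl_S2 workflow"]],
                "use_shell": False}),
    name="final_step",
)

lp = LaunchPad.auto_load()  # reads the usual my_launchpad.yaml
lp.add_wf(Workflow([packed_fw, no_shell_fw]))
```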
And last but not least: thanks a lot for the good work and for this very nice tool that makes my life easier!
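For reference, this is how I checked and raised the descriptor limit from within Python while debugging (a sketch using the standard resource module, POSIX only; the 4096 value matches my new ulimit):

```python
import resource

# Current soft/hard limits on open file descriptors (equivalent to ulimit -n)
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft={}, hard={}".format(soft, hard))

# Raise the soft limit; an unprivileged process cannot exceed the hard limit
resource.setrlimit(resource.RLIMIT_NOFILE, (min(4096, hard), hard))
```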
"_details": null,
"_failed_task_n": 0,
"_stacktrace":
    Traceback (most recent call last):
      File "/usr/lib/python3.4/site-packages/fireworks/core/rocket.py", line 211, in run
        m_action = t.run_task(my_spec)
      File "/usr/lib/python3.4/site-packages/fireworks/user_objects/firetasks/script_task.py", line 37, in run_task
        return self._run_task_internal(fw_spec, stdin)
      File "/usr/lib/python3.4/site-packages/fireworks/user_objects/firetasks/script_task.py", line 48, in _run_task_internal
        shell=self.use_shell)
      File "/usr/lib64/python3.4/subprocess.py", line 859, in __init__
        restore_signals, start_new_session)
      File "/usr/lib64/python3.4/subprocess.py", line 1359, in _execute_child
        errpipe_read, errpipe_write = os.pipe()
    OSError: [Errno 24] Too many open files
"_message": "runtime error during task",
"echo \"ending correl_S2 workflow\""