When I run pollmach runstruct_vasp, I occasionally get the error shown at the end of the output below.
I run as follows:
> maps -d &
> pollmach runstruct_vasp
The output looks like:
…
Finding best cluster expansion…
1 1 1 1.32221
done! sort: fflush failed: standard output: Broken pipe
sort: write error
This error always happens after ezvasp has finished running but before runstruct_vasp has completed. I looked through runstruct_vasp, but I do not see sort being called anywhere in that script. I think the error may be occurring in the maps executable itself, but I’m not sure.
Will this error impact the results? (everything seems to be running fine)
Is there anything I can do to fix this?
If you want my input files, I can send them, but it does not seem to matter which alloy I choose. Also, this error doesn’t always happen. On average it occurs in ~10-20% of the runstruct_vasp runs.
This is a bit difficult to debug without having access to your machine…
But we can try a few things.
Try running runstruct_vasp within a structure directory like this:
csh -v runstruct_vasp
and see at which step it fails.
This script calls other scripts: ezvasp and extract_vasp
so you can try putting the "csh -v" prefix in front of the calls to those scripts within runstruct_vasp.
My guess is that a command called just after sort in a pipeline dies prematurely, so sort can no longer send it data, hence the error. The above will let us see which command dies.
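To illustrate the mechanism guessed at above, here is a minimal sketch of a pipeline in which the reader exits before sort has finished writing. It is a hypothetical reproduction, not the command from the ATAT scripts: the use of seq and head, the line count, and the choice to ignore SIGPIPE are all assumptions made for the demonstration. It assumes GNU coreutils; other sort implementations may word the message differently.

```shell
#!/bin/sh
# Hypothetical reproduction of a "Broken pipe" from sort (not taken from ATAT).
# Use the C locale so the error text is not translated.
LC_ALL=C
export LC_ALL

# 'trap "" PIPE' makes the subshell and its children ignore SIGPIPE.
# Normally sort would be killed silently by SIGPIPE; with the signal ignored,
# its write fails with EPIPE and sort reports the failure on stderr.
# head exits after one line, while sort still has over a megabyte to write,
# which is far more than a pipe buffer holds.
err=$( (trap '' PIPE; seq 1 200000 | sort | head -n 1 >/dev/null) 2>&1 )

# Show what sort complained about.
echo "stderr from the pipeline:"
echo "$err"
```

If this prints a sort "Broken pipe" message on your machine, it suggests the original error is of the same kind: harmless to the reader, which already got the lines it wanted, but noisy. Whether the message shows up at all can depend on how SIGPIPE is handled in the environment the scripts run in.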
Thanks for the help! I was able to track down the error and find a fix (at least for me). The error occurs when running the extract_vasp script. Specifically when executing the following command:
For some reason, this command is unstable for me. I get the above-mentioned fflush error roughly once every 5-10 times I run it. To fix this, I simply broke the command up into separate smaller commands as follows:
After more testing, the error came back for me even with the above modifications. I think this error is machine-dependent, which makes it really tricky to debug. I modified the above command slightly as follows, and this finally does seem to fix the problem on all machines (at least for me). Sorry for the confusion. - Justin
Thanks for sharing your efforts!
BTW, I have gotten this type of error (I think) when too many processes are running and the OS no longer allows the user to start another command. The problem is intermittent because the load changes over time. (This happened to me under cygwin.)