LAMMPS under HTCondor

Hello,

Has anyone used LAMMPS (stable_16Mar2018) under HTCondor (https://research.cs.wisc.edu/htcondor/) to run it on multiple cores/machines?

I’m trying to do it, but every time I do, jobs immediately go to sleep mode on all cores.

Any help is appreciated.

this is a topic better discussed with the system managers of your condor pool. they should know best what can work and what not.
you will have to tell them exactly how you are running LAMMPS.

axel.

Hi Axel,

Thanks so much for your response.

Indeed, I’m the system manager of the condor pool as well.

I set it up and have been maintaining it.

I’ve successfully run GROMACS on the pool, but with lammps I’m having difficulty.

I was wondering if anyone in the lammps community has had this experience before and whether or not it can be fixed.

Hi Axel,

Thanks so much for your response.

Indeed, I’m the system manager of the condor pool as well.

then you should be in the best position to debug the situation. it should be extremely simple for you to attach a debugger to the stalled processes and identify where they get stuck.

I set it up and have been maintaining it.

I’ve successfully run GROMACS on the pool, but with lammps I’m having difficulty.

I was wondering if anyone in the lammps community has had this experience before and whether or not it can be fixed.

unlikely. i don’t know much about condor, but i would first check if you can run a serial executable. i would avoid MPI and instead use OpenMP for parallelization (either via USER-OMP, KOKKOS, or USER-INTEL), and finally, i would avoid reading from stdin and instead use the -in flag.

axel.
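
A minimal sketch of the suggestions above (OpenMP instead of MPI, and -in instead of stdin), written as a small Python helper that generates an HTCondor submit description for a single-machine, multi-threaded job. The executable name lmp_omp, the input file in.lj, the thread count and the output file names are all hypothetical placeholders, and the sketch assumes a LAMMPS binary built with the USER-OMP package. It has not been tested on any particular pool.

```python
def lammps_submit_description(executable="lmp_omp", input_file="in.lj", threads=4):
    """Return the text of an HTCondor submit description for one OpenMP LAMMPS job.

    All defaults are placeholders: adjust the executable path, the input file
    and the thread count to match the local installation.
    """
    lines = [
        # Vanilla universe: a single job on one machine, no MPI launcher involved.
        "universe = vanilla",
        f"executable = {executable}",
        # -in reads the input script from a file instead of stdin;
        # -sf omp and -pk omp select the USER-OMP styles and the thread count.
        f"arguments = -in {input_file} -sf omp -pk omp {threads}",
        # Ask HTCondor for as many cores as OpenMP threads.
        f"request_cpus = {threads}",
        f'environment = "OMP_NUM_THREADS={threads}"',
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        f"transfer_input_files = {input_file}",
        "output = lammps.$(Cluster).out",
        "error = lammps.$(Cluster).err",
        "log = lammps.$(Cluster).log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write the description to a file that can be passed to condor_submit.
    with open("lammps.sub", "w") as handle:
        handle.write(lammps_submit_description())
```

Running the script writes lammps.sub, which could then be submitted with condor_submit lammps.sub. Setting threads=1 and dropping the -sf/-pk arguments would give the serial sanity check mentioned first.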

Thanks so much Alex.

When you say “avoid MPI” is OpenMPI included?

you are the admin of your condor pool, it is your job to study its documentation, research in forums and mailing lists or the web in general. it took me less than 5 minutes to see that parallel jobs need special treatment and to see lots of related discussions to read through. this has become off-topic for this mailing list. unless there actually is somebody subscribed here, that has specific LAMMPS related experience with running parallel jobs on a condor pool and is willing to share it publicly, there is nothing more to discuss here.

Thanks so much Alex.

it is axel, please.

When you say “avoid MPI” is OpenMPI included?

is this a rhetorical question?

axel.