Using multiple nodes on an HPC cluster

Dear fellows,
I have a system of around 4000 molecules of H2O and CO2. The HPC cluster I use for LAMMPS has 24 processors per node. On one node, the simulation runs at 63.115 timesteps/s. To speed it up, I tried using three nodes, but the speed dropped to 26.637 timesteps/s. Some people say this is because my system is not big enough to benefit from multiple nodes. But if I just want to reduce the wall-clock time, how can I use multiple nodes effectively?
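To put numbers on the slowdown, here is a small Python sketch of the arithmetic. It is only a rough check, not part of the LAMMPS setup. The function name `scaling` is made up for illustration, and the figure of 3 atoms per molecule is an assumption (all-atom models for both H2O and CO2), so the atoms-per-core values are approximate.

```python
# Rough scaling check using the two timings quoted in this post.
CORES_PER_NODE = 24  # from the cluster description above


def scaling(base_rate, base_nodes, new_rate, new_nodes):
    """Return (speedup, parallel efficiency) of the new run relative to the base run."""
    speedup = new_rate / base_rate
    # Ideal speedup equals the ratio of node counts; efficiency is the fraction achieved.
    efficiency = speedup / (new_nodes / base_nodes)
    return speedup, efficiency


speedup, efficiency = scaling(63.115, 1, 26.637, 3)
print(f"speedup on 3 nodes: {speedup:.2f}x, parallel efficiency: {efficiency:.1%}")

# Assumption: 3 atoms per molecule, so roughly 12000 atoms in total.
atoms = 4000 * 3
for nodes in (1, 3):
    cores = nodes * CORES_PER_NODE
    print(f"{nodes} node(s): {cores} cores, about {atoms // cores} atoms per core")
```

Under that assumption, the run goes from about 500 atoms per core on one node to fewer than 200 on three, while the speedup is below 1, meaning the three-node run is slower than the single-node run.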
Thanks a lot.