Moji,
You are making the strategic error of doing lots of little steps (Fortran code, Yukawa, etc.) and then trying to make one huge step (1/r^256 potential with rRESPA). It is natural to be impatient and try to jump to the end, but you will learn a lot more and do better quality work if you try to take reasonable steps. Enough about strategy; here are some tactics:
1. Make sure that you are conserving energy. For the PBC case this is easy. As soon as you add fixed walls, you need to be careful that you are bookkeeping the wall potential energy correctly
*I have checked the energy conservation for PBC with just interaction between particles (YUKAWA with Rc=9.0/kappa , kappa=1.0) for several time steps and we have agreements on dt = 0.5*
2. I am not sure how MD timestep stability limit scales with repulsive exponent, but it might be much worse than your estimated value of dt=0.0001 for 1/r^256.
*In our group some has done some simulation with u = 1/(r^256) and dt = 0.00025 has worked (we have implemented the scheme with fortran code, based on Daan Frenkel's book and also allen tildsley's MTS scheme). I shoule mention that the cut off for this potential is around 1.2*sigma*
3. I am not even sure that double precision representation will be accurate enough to resolve 1/r^256 adequately for energy conservation
4. I suggest you do some testing of energy conservation versus dt for cases, like 12, 14, 16, 20, 40, 80.
5. Only attempt to use rRESPA when you understand the single-step case well, and probably then test it on 1/r^12, 14, etc.
6. Remember, even if you get rRESPA to work, it may not deliver the timestep increase that you expect.
7. Finally, I strongly suggest using wall/reflect. It represents the effect of an infinitely hard wall, up to the resolution of the timestep. It does not conserve energy as well as a smooth potential, but error is O(dt). So, you can empirically find the timestep that is small enough for your purposes.
*we discussed about the reflect wall but we decided make this out of the table!
AND ALSO THIS IS MY SCRIPT:
# 3D Yukawa potential with walls
units lj
dimension 3
atom_style atomic
boundary p p p
read_restart restart.relax
change_box all z final -1.3517 102.2876 boundary p p f
reset_timestep 0
pair_style yukawa/smooth 1.0 9.0
pair_coeff 1 1 1.0 9.0
pair_modify shift yes
neighbor 0.3 bin
neigh_modify every 1 delay 0 check yes
fix solnvt all nvt temp 0.00202 0.00202 1.0
fix wallti all wall/rw256 zlo EDGE 1.0 1.0 9.0 zhi EDGE 1.0 1.0 9.0 units box
timestep 0.5
thermo_style custom step time pe ke etotal temp press density
thermo_modify format float "% .15e"
thermo 1000
dump solnvtdump all custom 100000 dumpsol.lammpstrj x y z
run 1000000
write_restart restart.relax