# RDF output

I used these commands to find the radial distribution function.

"compute myRDF all rdf 100"

"fix 1 all ave/time 100 1 100 c_myRDF file tmp.rdf mode vector"

The output file has 4 columns. The first is the bin number, the second is the bin's coordinate, and the third and fourth columns are g(r) and the coordination number, respectively.
To plot the RDF, I simply need to plot the 2nd and 3rd columns. But the maximum value of g(r) is pretty high, e.g. 1300.
Do I need to scale g(r) by the system density (number of atoms / volume)?

Also, based on the LAMMPS documentation, "A coordination number coord(r) is the sum of g(r) values for all bins up to and including the current bin."
But in my output, the sum of the 3rd column does not equal the last value in the 4th column (the coordination number for the last bin).
Is there something wrong with the way I am interpreting the RDF output data?

I used these commands to find the radial distribution function.

"compute myRDF all rdf 100"

"fix 1 all ave/time 100 1 100 c_myRDF file tmp.rdf mode vector "

The output file has 4 columns. The first is the bin number, the second
is the bin's coordinate, and the third and fourth columns are g(r) and
the coordination number, respectively.
To plot the RDF, I simply need to plot the 2nd and 3rd columns. But the
maximum value of g(r) is pretty high, e.g. 1300.

So? Have you looked at what g(r) represents? It is not impossible for
it to be very large under certain circumstances. So it is not the
absolute value that you need to look at, but whether the value is
consistent with its physical interpretation.

Do I need to scale g(r) by the system density (number of atoms /
volume)?

See my previous comment.

Also, based on the LAMMPS documentation, "A coordination number
coord(r) is the sum of g(r) values for all bins up to and including the
current bin."
But in my output, the sum of the 3rd column does not equal the last
value in the 4th column (the coordination number for the last bin).
Is there something wrong with the way I am interpreting the RDF output
data?

Are you by any chance doing variable-cell simulations?

axel.

As Axel said, you should know the physical interpretation (and definition) of g(r) when computing it, if you want to make sense of the output. Knowing the definition answers nearly all of your questions. However, there are two reasons I can think of for large peak values in g(r). First, a perfect storm of small system size, too small a bin size, and not sampling long enough; in this case your large peaks are simply fluctuations (this should be easy to tell if you know the interpretation of g(r)). Second, you have a relatively dilute system that is condensing. In any case, it is not likely that LAMMPS is computing your RDF output incorrectly.

1. Coordination number: The definition of coordination number in the
documentation is slightly off. Instead of:

"A coordination number coord(r) is also calculated, which is the sum
of g(r) values for all bins up to and including the current bin."

it should read:

"A coordination number coord(r) is also calculated, which is the
number of atoms of type jtypeN within the current bin or closer,
averaged over atoms of type itypeN. This is calculated as the volume
or area sum of g(r) values over all bins up to and including the
current bin, scaled by the average volume density of atoms of type
jtypeN."
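The corrected definition can be illustrated with a short sketch (a minimal illustration under assumptions not stated in the thread: a 3D system, uniform bin width, and `rho_j` standing for the average number density of type jtypeN atoms):

```python
import numpy as np

def coordination_number(r, g, rho_j):
    """Cumulative volume sum of g(r): coord(r_k) is the sum over bins
    i <= k of rho_j * g(r_i) * 4*pi*r_i^2 * dr, where the r_i are bin
    centers and dr is the (assumed uniform) bin width."""
    dr = r[1] - r[0]                        # uniform bin width
    shell_vol = 4.0 * np.pi * r**2 * dr     # spherical-shell volume per bin
    return np.cumsum(rho_j * g * shell_vol)

# Sanity check with an ideal gas, where g(r) = 1 everywhere:
# coord(r) should approach rho_j * (4/3)*pi*r^3.
r = np.linspace(0.005, 0.995, 100)          # bin centers, dr = 0.01
coord = coordination_number(r, np.ones_like(r), rho_j=1.0)
```

For g(r) = 1 the last value comes out close to (4/3)π ≈ 4.19, not 100, which is why a plain sum of the g(r) column does not reproduce the 4th column.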

2. Large RDF value: As Eric suggested, arbitrarily large values of
the RDF can be produced by an ordered cluster in a box that is mostly
empty. In such cases, the LAMMPS calculation of the RDF is not going to
be realistic, but the coordination number is still good. Here is a
minimal script that demonstrates the problem for a single pair of
atoms in a big box. Note that the coordination number is still spot
on.

```
$ more in.pair
# RDF for a single pair of atoms
units lj
atom_style atomic
atom_modify sort 0 0
region box block 0 1.0e6 0 1.0e6 0 1.0e6
create_box 1 box
create_atoms 1 single 0 0 0
create_atoms 1 single 0.995 0 0
mass * 1.0
pair_style lj/cut 1.0
pair_coeff * * 0.0 0.0 1.0
neighbor 0.0 nsq
compute myRDF all rdf 100
fix avrdf all ave/time 100 1 100 c_myRDF file rdf.dat mode vector
run 0 pre no post no
```

```
$ lmp_mac_mpi -in in.pair -echo none
LAMMPS (12 Aug 2013)
Created orthogonal box = (0 0 0) to (1e+06 1e+06 1e+06)
1 by 1 by 1 MPI processor grid
Created 1 atoms
Created 1 atoms
Setting up run ...
Memory usage per processor = 2.33475 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 0 0 0 0
Loop time of 7.86781e-06 on 1 procs for 0 steps with 2 atoms

$ tail -2 rdf.dat
99 0.985 0 0
100 0.995 4.01893e+18 1
```
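As a cross-check of the last line of rdf.dat above: with one neighbor per atom in a single occupied bin, the normalization reduces to g = 1/(rho * V_shell). A small sketch, assuming exact spherical-shell volumes and bin edges inferred from cutoff/nbins:

```python
import numpy as np

# Hedged cross-check of the rdf.dat tail: 2 atoms in a (1e6)^3 box,
# cutoff 1.0 split into 100 bins, one neighbor per atom at r = 0.995.
N = 2
rho = N / 1.0e6**3                          # overall number density
r_lo, r_hi = 0.99, 1.00                     # edges of the last bin
shell_vol = 4.0 / 3.0 * np.pi * (r_hi**3 - r_lo**3)
g = 1.0 / (rho * shell_vol)                 # 1 neighbor per atom in this bin
coord = rho * shell_vol * g                 # exactly 1, matching column 4
print(f"g = {g:.5e}")
```

The printed value lands at ~4.0189e+18, in line with the 4.01893e+18 that LAMMPS reports, while the coordination number works out to exactly 1.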

Thanks to all.

I am going to compare my g(r) with experiments to find the crystallographic structure (BCT, HX, or WZ). The maximum value in the experimental g(r) is 11 for the WZ structure, but mine is 1300 (LAMMPS).
This is the reason I cannot interpret the g(r) result from LAMMPS.
Also, I believe LAMMPS calculates exactly the radial distribution function that I want:
g(r) = (number of atoms in nth bin / volume of nth bin) / (system density)
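That formula can be sketched directly (a minimal, hedged implementation assuming an all-all RDF in a fully periodic cubic box, which is not the nanowire geometry of this thread; the function and variable names are illustrative):

```python
import numpy as np

def rdf(pos, box, nbins, rmax):
    """All-all g(r) for positions in a periodic cubic box of side `box`."""
    n = len(pos)
    rho = n / box**3                                # system density
    edges = np.linspace(0.0, rmax, nbins + 1)
    counts = np.zeros(nbins)
    for i in range(n - 1):
        d = pos[i + 1:] - pos[i]
        d -= box * np.round(d / box)                # minimum-image convention
        dist = np.linalg.norm(d, axis=1)
        counts += 2.0 * np.histogram(dist, bins=edges)[0]  # count i-j and j-i
    shell_vol = 4.0 / 3.0 * np.pi * (edges[1:]**3 - edges[:-1]**3)
    # (atoms in bin / bin volume) / density, averaged over the n atoms:
    return counts / (n * shell_vol * rho)

# Two atoms at separation 0.995 in a 100^3 box, 100 bins up to rmax = 1.0:
pos = np.array([[0.0, 0.0, 0.0], [0.995, 0.0, 0.0]])
g = rdf(pos, box=100.0, nbins=100, rmax=1.0)
```

With only two atoms in a mostly empty box, the single occupied bin already reaches g(r) of order 10^6, illustrating why a dilute system produces huge peaks.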

In my model the simulation box is scaled to model tensile loading, and I have a periodic boundary condition only along the length of the NW.
So you are right, Axel: I am doing a variable-cell simulation. Does that cause any problem in calculating g(r)?

Eric and Aidan:
Yes, I used a very large box to capture the surface effect in the lower dimensions of the NW, so this may be the reason for the high peaks in g(r).
As for the bin size, I do not think so: I used different bin sizes to check the sensitivity of my results, and in all cases I got high peaks.

Thanks to all.

I am going to compare my g(r) with experiments to find the
crystallographic structure (BCT, HX, or WZ). The maximum value in the
experimental g(r) is 11 for the WZ structure, but mine is 1300 (LAMMPS).
This is the reason I cannot interpret the g(r) result from LAMMPS.
Also, I believe LAMMPS calculates exactly the radial distribution
function that I want:
g(r) = (number of atoms in nth bin / volume of nth bin) / (system density)

In my model the simulation box is scaled to model tensile loading, and I
have a periodic boundary condition only along the length of the NW.
So you are right, Axel: I am doing a variable-cell simulation. Does that
cause any problem in calculating g(r)?

It can lead to inconsistencies when doing the integration, as your
particle density is changing during the simulation. The coordination
number can still be computed exactly if you keep the original histogram
data around; the g(r) plugin in VMD does that, for example.
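The point about keeping the raw histogram can be sketched as follows (hypothetical names; `counts[k]` is assumed to be the raw number of neighbors found in bin k, summed over all atoms and samples):

```python
import numpy as np

def coord_from_counts(counts, n_atoms, n_samples=1):
    """Coordination number straight from raw pair counts: a cumulative
    sum per atom, with no volume or density normalization that a
    changing cell could skew."""
    return np.cumsum(counts) / (n_atoms * n_samples)

# e.g. 2 atoms: 0 neighbors in bin 0, 2 in bin 1, 4 in bin 2:
coord = coord_from_counts(np.array([0, 2, 4]), n_atoms=2)
```

This gives coord = [0, 1, 3] neighbors per atom, independent of any box-volume bookkeeping.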

Eric and Aidan:
Yes, I used a very large box to capture the surface effect in the lower
dimensions of the NW, so this may be the reason for the high peaks in
g(r).
As for the bin size, I do not think so: I used different bin sizes to
check the sensitivity of my results, and in all cases I got high peaks.

The g(r) normalization is only meaningful for a (homogeneous) bulk
system. If you have a system with surfaces (and vacuum), this will mess
up the volume used for computing the particle density and thus make the
absolute value of g(r) meaningless. The peak positions are still
correct, and so are the relative heights. Well, almost: there is a tiny
system-size-dependent error from using a finite-size system, due to the
excluded volume of the reference particle.

axel.
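Since g(r) scales inversely with the density used in the normalization, the absolute scale could in principle be recovered if the volume actually occupied by material were estimated separately (a hedged sketch; `v_material` is an assumed input that must come from elsewhere, not something LAMMPS provides):

```python
def rescale_rdf(g_box, v_box, v_material):
    """Rescale an RDF normalized with the full-box density to one
    normalized with the density of the occupied volume (g ~ 1/rho)."""
    return g_box * (v_material / v_box)

# e.g. a peak of 1300 from a box in which only ~1% is material:
g_peak = rescale_rdf(1300.0, v_box=100.0, v_material=1.0)
```

A peak of 1300 from a box that is ~99% vacuum would rescale to 13, the same order as the experimental value of 11 quoted above.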