Dear all,
I want to compute the standard deviation. First I compute the variance with avesq-ave^2, then compute its square root. The variance can be negative if it is very small and near zero, like -2e-17
. So I want to set it to zero if it is negative. In lammps source code, I could use MAX(variance, 0)
to do the limitation. But in lammps script, there seems to be only a ternary function in the variable that can do this. So I try variable Std equal ternary(v_Var>0,sqrt(v_Var),0)
. Then I encountered this error:
ERROR on proc 0: Variable Std: Sqrt of negative value in variable formula
Why did lammps compute the sqrt of a negative value even if I specify the condition? With this question, I check the source code.

No matter whether arg2
and arg3
are valid, they will always be computed. In C++, the ternary opeartor has short-circuit mechanism, i.e., only one of arg2
and arg3
is calculated but not all of them.
In some cases, only one of the args is valid and computing the invalid value will yield errors. For example, only compute sqrt(x)
when x>=0, only compute a/b when abs(b)>1e-15, etc.
I don’t know if it is appropriate if the source code is modified as follows:
if (tree->type == TERNARY) {
arg1 = collapse_tree(tree->first);
if (tree->first->type != VALUE) return 0.0;
tree->type = VALUE;
if (arg1) tree->value = collapse_tree(tree->second);
else tree->value = collapse_tree(tree->extra[0]);
return tree->value;
}
Counter example: In Fortran there is no short-circuiting.
The LAMMPS input script language is neither C++ nor Fortran.
To find out, please post a feature request issue on GitHub and assign it to @sjplimp who implemented this feature.
This is because you are using the worst formula possible. You are less likely to have these numerical issues if you compute the variance as a sum of squared distances. See here.
Because LAMMPS evaluates the arguments in variable formulas before performing the computation. It is not a programming language, and it does not implement lazy evaluation explicitly, so this is the behavior implemented in all three argument functions. I am unsure your correction would be understood as standard behavior of the code.
Note that you use ternary to bypass an error thrown at execution because of lazy numerical analyses. Replacing a bad estimator by a value you know is wrong is bad practice IMHO. It is sometimes better to have an error thrown at you to know that what you attempt to do is wrong rather than correcting the code so it doesn’t tell you about it.
3 Likes
Indeed. But this is what the doc of compute reduce
recommends:
The avesq setting does the same as sumsq , then divides the sum of squares by the number of values. The last two options can be useful for calculating the variance of some quantity (e.g., variance = avesq ave ).
If I use the direct formula, the LAMMPS commands might be this:
compute ave all reduce ave v_foo
variable diffsq atom (v_foo-c_ave)^2
compute var all reduce ave v_diffsq
In this case, why not add a new mode of compute reduce
so that the users can get the variance in one line:
compute var all reduce var v_foo
and remove avesq
from the modes so that they will always get a valid variance (I can’t think of any other use for avesq besides variance calculation).
This has been done in the recent fix ave/moments
.
This is a use suggestion, not a recommendation per se. If you see that this method does not fit to your use, there is no need to try and use it at all cost.
What fix ave/moments
does is calculating the time average/variance of global quantities. But what I want is calculating the spatial average/variance of a global vector, e.g., the result of fix ave/chunk
, or that of a per-atom vector, e.g., an atom-style variable.