including 3rd-party code in LAMMPS (license issue)

Question 1)
If I create a fix that includes code from Eigen, will it be considered for inclusion with LAMMPS? Eigen is a popular linear algebra library which uses the MPL2 license, and it is a big library (I think it’s over 100000 lines of code). To be clear, I am not proposing to include the entire Eigen library with LAMMPS. I only need about 200 lines of code from one of their files.

— details —
The MPL2 license (that Eigen uses) is less restrictive than the LGPL. Even so, I am guessing that if I use any Eigen code in my LAMMPS fix, I would be forced to include the MPL2 license somewhere in my source files which contains their code. So I would put the contents of their MPL2 license in one big comment at the end of one of my source files. (Would that be okay?)

Andrew

P.S.
— alternative —
I am aware that one of the existing LAMMPS files (“pppm_disp.cpp”) includes some code for calculating eigenvalues. I would need to modify that code heavily, so I prefer to use the Eigen code instead. I can do that if I have to. On that note, it occurs to me that many different fixes might eventually need to calculate eigenvalues and eigenvectors. So perhaps it makes sense to have a file in the “src/” directory that calculates eigenvectors and eigenvalues that different fixes can refer to (“pppm_disp.cpp” is in the KSPACE directory).

Question 1)
If I create a fix that includes code from Eigen, will it be considered for inclusion with LAMMPS? Eigen is a popular linear algebra library which uses the MPL2 license, and it is a big library (I think it’s over 100000 lines of code). To be clear, I am not proposing to include the entire Eigen library with LAMMPS. I only need about 200 lines of code from one of their files.

the USER-SMD package already uses code from Eigen (Eigen3, to be exact), however we do not bundle the corresponding code, but you have to install Eigen. When using CMake to configure/build LAMMPS, the presence of Eigen is detected and if not, it will try to download a copy and use that. Since Eigen is a template library, this is portable and simple.
i would strongly recommend against copying code from Eigen, as that would “decouple” the code from the Eigen distribution and - for example - no bugfixes or other improvements would be applied.
The biggest problem we had in figuring out in making (most of) LAMMPS compatible with LGPL was to identify pieces of code that people had liberally (and sometimes in violation of the license) copied into LAMMPS without giving attribution or any other way of identifying it easily.

— details —
The MPL2 license (that Eigen uses) is less restrictive than the LGPL. Even so, I am guessing that if I use any Eigen code in my LAMMPS fix, I would be forced to include the MPL2 license somewhere in my source files which contains their code. So I would put the contents of their MPL2 license in one big comment at the end of one of my source files. (Would that be okay?)

see above. we would want to avoid that.

Andrew

P.S.
— alternative —
I am aware that one of the existing LAMMPS files (“pppm_disp.cpp”) includes some code for calculating eigenvalues. I would need to modify that code heavily, so I prefer to use the Eigen code instead. I can do that if I have to. On that note, it occurs to me that many different fixes might eventually need to calculate eigenvalues and eigenvectors. So perhaps it makes sense to have a file in the “src/” directory that calculates eigenvectors and eigenvalues that different fixes can refer to (“pppm_disp.cpp” is in the KSPACE directory).

The alternative to using Eigen would be to use BLAS/LAPACK, which is the traditional library for linear algebra operations (e.g. USER-ATC, USER-AWPMD). in that case you can link to either an installed (and vendor optimized) BLAS/LAPACK library (like MKL, OpenBLAS) or compile and link to the subset in the lib/linalg/ folder, which contains the subset required for compiling and linking with just the BLAS/LAPACK functions used by different packages in LAMMPS.

having custom code or copies (e.g. from numerical recipes, which is problematic if copied literally from the numerical recipes provide files) for preforming such tasks is not such a good idea for maintenance reasons. there are a few files with library functions in LAMMPS for simple matrix/vector/math operations already available. the problem with these is usually that they require a specific layout of the data (same for BLAS/LAPACK), which is often considered too much of an inconvenience.

there is not a clean and consistent situation in LAMMPS for historical reasons. the code has grown over many, many years and in parts people didn’t pay too much attention to all those details affecting licensing and maintainability. now that we have changed things in that respect and do automatic integration and regression testing, the requirements for overall code quality and consistency are higher. so we are somewhat stuck between having bad examples already in the distributions, but at the same time are applying higher standards to new contributions (at least most of the time and to the best of our abilities).

HTH,
Axel.

Question 1)
If I create a fix that includes code from Eigen, will it be considered for inclusion with LAMMPS? Eigen is a popular linear algebra library which uses the MPL2 license, and it is a big library (I think it’s over 100000 lines of code). To be clear, I am not proposing to include the entire Eigen library with LAMMPS. I only need about 200 lines of code from one of their files.

the USER-SMD package already uses code from Eigen (Eigen3, to be exact), however we do not bundle the corresponding code, but you have to install Eigen. When using CMake to configure/build LAMMPS, the presence of Eigen is detected and if not, it will try to download a copy and use that. Since Eigen is a template library, this is portable and simple.

Hi Axel.
Thank you for your thoughtful, thorough, and sweeping answer to my question.

Incidentally, the fix in question is “fix bond/react” which is currently in USER-MISC. (I am adding new functionality to this fix.) I will say that I prefer your approach, and I think it’s cool you figured out a way to automate installing Eigen using CMAKE.

My concern is that following your advice would prevent end-users from being able to use fix bond/react (or any of the other fixes in USER-MISC), unless they figure out how to use CMAKE. In an ideal world, they would take the time to learn this useful skill. In the real world, they might not. I am reluctant to introduce new code which adds the first external dependency to USER-MISC (or fix bond/react). It would be different if other fixes in USER-MISC also require Eigen.

For now, I will try to adapt the code in “KSPACE/pppm_disp.cpp” (specifically the “qr_alg()” function) to suit my needs. (After some half-assed googling, I convinced myself that this code does not appear to be an exact copy of anybody else’s work.)

i would strongly recommend against copying code from Eigen, as that would “decouple” the code from the Eigen distribution and - for example - no bugfixes or other improvements would be applied.
The biggest problem we had in figuring out in making (most of) LAMMPS compatible with LGPL was to identify pieces of code that people had liberally (and sometimes in violation of the license) copied into LAMMPS without giving attribution or any other way of identifying it easily.

I can only imagine…

— details —
The MPL2 license (that Eigen uses) is less restrictive than the LGPL. Even so, I am guessing that if I use any Eigen code in my LAMMPS fix, I would be forced to include the MPL2 license somewhere in my source files which contains their code. So I would put the contents of their MPL2 license in one big comment at the end of one of my source files. (Would that be okay?)

see above. we would want to avoid that.

Agreed. Don’t worry. I won’t do this.

Andrew

P.S.
— alternative —
I am aware that one of the existing LAMMPS files (“pppm_disp.cpp”) includes some code for calculating eigenvalues. I would need to modify that code heavily, so I prefer to use the Eigen code instead. I can do that if I have to. On that note, it occurs to me that many different fixes might eventually need to calculate eigenvalues and eigenvectors. So perhaps it makes sense to have a file in the “src/” directory that calculates eigenvectors and eigenvalues that different fixes can refer to (“pppm_disp.cpp” is in the KSPACE directory).

The alternative to using Eigen would be to use BLAS/LAPACK, which is the traditional library for linear algebra operations (e.g. USER-ATC, USER-AWPMD). in that case you can link to either an installed (and vendor optimized) BLAS/LAPACK library (like MKL, OpenBLAS) or compile and link to the subset in the lib/linalg/ folder, which contains the subset required for compiling and linking with just the BLAS/LAPACK functions used by different packages in LAMMPS.

having custom code or copies (e.g. from numerical recipes, which is problematic if copied literally from the numerical recipes provide files) for preforming such tasks is not such a good idea for maintenance reasons. there are a few files with library functions in LAMMPS for simple matrix/vector/math operations already available. the problem with these is usually that they require a specific layout of the data (same for BLAS/LAPACK), which is often considered too much of an inconvenience.

there is not a clean and consistent situation in LAMMPS for historical reasons. the code has grown over many, many years and in parts people didn’t pay too much attention to all those details affecting licensing and maintainability. now that we have changed things in that respect and do automatic integration and regression testing, the requirements for overall code quality and consistency are higher. so we are somewhat stuck between having bad examples already in the distributions, but at the same time are applying higher standards to new contributions (at least most of the time and to the best of our abilities).

HTH,
Axel.

Thank you very much.
I’m glad I asked before proceeding.

Andrew

[…]

My concern is that following your advice would prevent end-users from being able to use fix bond/react (or any of the other fixes in USER-MISC), unless they figure out how to use CMAKE. In an ideal world, they would take the time to learn this useful skill. In the real world, they might not. I am reluctant to introduce new code which adds the first external

the same feature is also available for the conventional build (similarly for other packages with downloadable libraries or tools). you would type:

make lib-smd args="-b"

dependency to USER-MISC (or fix bond/react). It would be different if other fixes in USER-MISC also require Eigen.

the way fix bond/react has grown, it would justify moving it to its own package, and possible split the file into multiple files and consider certain variants with additional functionality being implemented as a derived class, where possible and meaningful. then the external dependency could be just copied from lib/smd and accordingly in the CMake configuration.

For now, I will try to adapt the code in “KSPACE/pppm_disp.cpp” (specifically the “qr_alg()” function) to suit my needs. (After some half-assed googling, I convinced myself that this code does not appear to be an exact copy of anybody else’s work.)

i have already voiced my concerns about duplicating code. if anything, please consider generalizing the interface, and make it a member of the MathExtra or utils namespace
if you look at my current draft PRs #1731 and #1721, i am currently doing some “spring cleaning” and consolidating use of utility functions, adding more “safe” I/O functions and reducing (trivial) compiler warnings (so that the ones that matter and indicate problems stand out more). long term goal is to get to a point, where we can justify requiring more strict coding standards from contributors and run more strict (automated) quality control tools on the code. as LAMMPS is growing, maintaining consistency and code quality is a growing problem and i want to keep it in check for as much and as long as possible in the time available. looking back at the last 3-4 years, we have already made large leaps ahead and especially the commonly used code paths have seen some significant auditing and cleanups.

axel.

Hi Steve

Axel has already provided his opinion. (I really appreciate the time he spent replying.)
I think I’d like to hear yours:

Jacob Gissinger and I are adding cool features to fix bond/react. But we need to be able to calculate eigenvectors of matrices. (See below.) Here are the options:

  1. An increasing number of fixes rely on being able to calculate eigenvalues and eigenvectors. But the existing LAMMPS source does not include a function for doing this on arbitrary-size matrices. I propose we modify the standard Makefiles and Cmake configuration files to automatically download Eigen C++ template library. (This is currently the standard open-source matrix diagonalization code.) As Axel pointed out, this has already been done for USER-SMD (which uses Eigen). I propose that we make this a general dependency of LAMMPS (similar to MPI and FFTW3). Are you open to this?

  2. Move the fix we are working on into a new package (eg “USER-REACT”) and ask users to follow specific installation instructions when using that package. (This is how “USER-SMD” currently works.) Jacob and I both want to avoid that if at all possible. It will reduce the number of people who can get fix bond/react to work.

  3. If I modified file a from “Eigen” and included it with our new fix, this would not prevent us from distributing the fix with LAMMPS using the LGPL license. Eigen uses the MPL license, which is compatible with LGPL and GPL. I only need to add a short preamble at the beginning of the file stating that it is compatible with both the MPL and LGPL licenses. (The wording of that preamble is at the end of this message.) Axel really does not like this, and I don’t blame him.

  4. Write my own (probably half-assed) function from scratch to diagonalzie matrices. It sounds easy, but it’s subtle work.

Which option would you prefer?

More details below, but you cans skip them:

For now, I will try to adapt the code in “KSPACE/pppm_disp.cpp” (specifically the “qr_alg()”
function) to suit my needs. (After some half-assed googling, I convinced myself that this
code does not appear to be an exact copy of anybody else’s work.)
Andrew

For the record, this effort failed: The “qr_alg()” function included in “pppm_disp.cpp” does not work. (The function always fails to converge. However since that function is apparently never invoked during normal usage, there is no reason to file a bug report.) There’s some Rosetta code that implements the QR algorithm but it uses another odd license (the “GFDL” license) and I cannot tell if it is compatible with LGPL. My efforts finding the QR algorithm on github with a clearly spelled-out license failed as well.

Andrew:
P.S. For completeness, here is text I would have to put at the beginning of the file to make it compatible with MPL and LGPL:

This file is free software: you may copy, redistribute and/or modify
it under the terms of the GNU General Public License as published by the
Free Software Foundation, either version XX of the License, or (at your
option) any later version.

This file is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see [http://www.gnu.org/licenses/](http://www.gnu.org/licenses/).

Copyright (c) 20XX (L)GPL-only Contributor1 [[email protected]](mailto:[email protected])

This file incorporates work covered by the following copyright and
permission notice:

This Source Code Form is subject to the terms of the Mozilla Public
License, v. 2.0. If a copy of the MPL was not distributed with this
file, You can obtain one at [http://mozilla.org/MPL/2.0/](http://mozilla.org/MPL/2.0/).

Copyright (c) 20XX, MPL Contributor1 [[email protected]](mailto:[email protected])
Copyright (c) 20XX, MPL and (L)GPL Contributor2 [[email protected]](mailto:[email protected])