I've recently done a comparison of the OpenKIM AIREBO model and the LAMMPS AIREBO code and found a number of discrepancies in the force calculation of the AIREBO LJ interaction.
Several groups have reported issues with AIREBO in the past [1,2,3], and this discussion is probably related to those reports.
The code from OpenKIM seems to be a version of the serial code developed by the Stuart group.
In all cases I believe that these discrepancies are indeed bugs in LAMMPS' AIREBO implementation: I took the derivatives of the potential analytically to see which variant should be right.
I've collected all the fixes that are necessary in the AIREBO code in LAMMPS in a pull request.
For details, see the individual commit messages; each commit addresses exactly one issue.
To list the issues:
1. In bondorderLJ(), Etmp is calculated twice, so its value is twice the expected value.
2. In bondorderLJ(), the derivatives of the cosines in p_ij do not take into account that rij is scaled.
3. In bondorderLJ(), the sum omega term does not take into account that rij is scaled.
4. In FLJ(), the LJ term is modified, and the formula contributing to the derivative of the modified LJ term mixes up format and signs.
In addition, I also found a discrepancy in the pi^rc_CC spline for Nconj > 8.
However, I am not sure about this one; the original papers can IMO be interpreted such that any of the splines is correct.
This discrepancy also probably has a much lower impact on force accuracy, so it is not as important.
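As a complement to the analytic check of the derivatives mentioned above, a force routine can also be compared against a numerical derivative of its own energy. The following is only a minimal sketch of that idea: it uses a plain 12-6 Lennard-Jones pair term as a stand-in, not the AIREBO LJ term, and the function names (lj_energy, lj_force, fd_force, max_mismatch) are hypothetical and do not come from LAMMPS or OpenKIM.

```python
def lj_energy(r, eps=1.0, sigma=1.0):
    """Plain 12-6 Lennard-Jones pair energy (stand-in, not the AIREBO term)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


def lj_force(r, eps=1.0, sigma=1.0):
    """Analytic force magnitude, -dE/dr, for the energy above."""
    sr6 = (sigma / r) ** 6
    return 24.0 * eps * (2.0 * sr6 * sr6 - sr6) / r


def fd_force(energy, r, h=1e-5):
    """Central finite-difference estimate of -dE/dr."""
    return -(energy(r + h) - energy(r - h)) / (2.0 * h)


def max_mismatch(distances):
    """Largest |analytic - numerical| force over the given distances."""
    return max(abs(lj_force(r) - fd_force(lj_energy, r)) for r in distances)
```

If the analytic force dropped a factor or flipped a sign, max_mismatch would be of the order of the force itself rather than of the finite-difference error.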
It would be great if someone else could try my patch to see if there are any issues, and possibly run it against other AIREBO implementations to verify its efficacy.
Thanks a lot,
PS. I tried to conform to the GitHub workflow that you have, but please forgive me if I overlooked something. I am more than happy to get pointers on how to improve this in the future.
PPS. Steps to reproduce my observations:
1. Grab the current LAMMPS version
2. Enable Manybody and OpenKIM
3. Install OpenKIM API with AIREBO model included 
4. Build LAMMPS, once with patch, once without
5. Run provided input files with LAMMPS w/o patch
6. vimdiff dump.kim dump.lammps
-> see the deviations, especially in timesteps 8-10
7. Run provided input files with LAMMPS w/ patch
8. vimdiff dump.kim dump.lammps
-> note that the deviations are noticeably reduced
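The visual comparison in steps 6 and 8 can also be done numerically. The sketch below is one possible way to do it, under the assumption that both dumps are LAMMPS text dumps whose "ITEM: ATOMS" header lists at least the columns id, fx, fy and fz; the provided dump files may use different columns, in which case the column names need adjusting. The function names are hypothetical helpers, not part of LAMMPS.

```python
def read_dump(lines):
    """Parse LAMMPS text dump lines into {timestep: {atom_id: (fx, fy, fz)}}.

    Assumes the ATOMS header contains the columns id, fx, fy, fz.
    """
    frames = {}
    it = iter(lines)
    step = None
    natoms = 0
    for line in it:
        line = line.strip()
        if line == "ITEM: TIMESTEP":
            step = int(next(it))
        elif line == "ITEM: NUMBER OF ATOMS":
            natoms = int(next(it))
        elif line.startswith("ITEM: ATOMS"):
            cols = line.split()[2:]
            idx = [cols.index(c) for c in ("id", "fx", "fy", "fz")]
            frame = {}
            for _ in range(natoms):
                fields = next(it).split()
                atom_id = int(fields[idx[0]])
                frame[atom_id] = tuple(float(fields[i]) for i in idx[1:])
            frames[step] = frame
    return frames


def max_force_deviation(a, b):
    """Return {timestep: largest absolute force-component difference}."""
    result = {}
    for step in sorted(set(a) & set(b)):
        result[step] = max(
            abs(x - y)
            for atom_id in a[step]
            for x, y in zip(a[step][atom_id], b[step][atom_id])
        )
    return result


def compare_files(path_a, path_b):
    """Read two dump files and return the per-timestep maximum deviation."""
    with open(path_a) as fa, open(path_b) as fb:
        return max_force_deviation(read_dump(fa), read_dump(fb))
```

Calling compare_files("dump.kim", "dump.lammps") once without and once with the patch gives one number per timestep, matched by atom id, so the two runs can be compared directly.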
The input file is a reduced (to 50 atoms/10 timesteps) example from a bigger verification suite.
The scripts rerun a trajectory that exercises the offending code regions, so each trial really is based on the same atom positions.
For your convenience, I've also included the resulting dump files.
5: You might (depending on your system) have to add "perror.f" to the source line in the src/models/*AIREBO*/Makefile