# A Cautionary Tale

by Paul Edmon, October 16, 2019

First, in order to not bury the lede: if you are a chemist who uses or has used the "Willoughby-Hoye" Python scripts, please be advised that they contain a bug which may impact your results.  You are advised to re-verify your results with the updated, patched version.  See these articles for more details:

https://pubs.acs.org/doi/10.1021/acs.orglett.9b03216

https://arstechnica.com/information-technology/2019/10/chemists-discover-cross-platform-python-scripts-not-so-cross-platform/
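The root cause reported in those articles was an ordering assumption: Python's `glob.glob()` makes no guarantee about the order of the file names it returns, so the order can differ between operating systems and filesystems, and any script that implicitly depends on that order can give different answers on different machines.  A minimal sketch of that class of bug and its one-line fix (the file names here are illustrative, not the chemists' actual data files):

```python
# glob.glob() returns file names in an OS- and filesystem-dependent
# order.  Code that implicitly relies on that order is not portable;
# sorting explicitly makes the order deterministic on every platform.
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    # Create some files in a deliberately non-alphabetical order.
    for name in ["b.out", "a.out", "c.out"]:
        open(os.path.join(d, name), "w").close()

    unsorted_files = glob.glob(os.path.join(d, "*.out"))  # order not guaranteed
    sorted_files = sorted(unsorted_files)                 # deterministic everywhere

    print([os.path.basename(f) for f in sorted_files])
    # → ['a.out', 'b.out', 'c.out']
```

The fix in the patched scripts amounts to exactly this: wrapping the `glob.glob()` call in `sorted()` so the processing order no longer depends on the platform.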

While this particular bug only impacts chemists using these scripts, it should serve as a cautionary tale for the rest of the numerical community.  As a numerical astrophysicist, I recall a similar episode with the hydrodynamics code ZEUS.  In that case the code was being used in regimes the authors never intended, and the results in those regimes were found to be flawed and unreliable.  In another case I have seen, MRI image results depended on which machine the analysis ran on, such that one could not trust the results being presented.

In graduate school, my advisor Professor Tom Jones at the University of Minnesota gave this advice to his students: "Always double and triple check your results, even if they look correct."  Just because a code appears to be producing the correct answer does not mean it is doing so for the right reasons, or that it will do so in all regimes.  Undiagnosed bugs can produce apparently correct results in one code path and erroneous results in another.  The assumptions made by the author of the code need to be known by the user, so that the user knows where the code is meant to be used and has been tested, and where it has not.

This does not mean one needs to be ultra-paranoid about everything coming out of the numerical sciences, but it should make us skeptical, as good scientists are.  A few rules to follow to avoid these pitfalls:

  1. Always sanity test your results.
  2. Make sure you understand how the code you are using works, what regimes it is valid in, and why (i.e. avoid black boxes).
  3. When moving to a new platform, test that your code is still producing valid results.  Same with code upgrades and changes.
  4. Write your code to be platform agnostic and portable, and generate a robust testing suite.
  5. Recall that computers only do what they are told and people are fallible.
  6. Stay up to date with compilers and libraries especially bug fixes.
  7. Avoid building a virtual house of cards in terms of specific versions of libraries and types of machines your code is valid on.
  8. Have multiple different ways to independently check results.
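
Several of these rules can be built directly into code.  A minimal sketch of rules 1, 4, and 8, using a deliberately simple example (numerical integration; the function names are illustrative): a known-answer sanity test plus a cross-check against an independent algorithm, written so it can be re-run in a test suite on every new platform and after every upgrade:

```python
# Sanity-check a numerical result two ways: against a known analytic
# answer (rule 1) and against an independent algorithm (rule 8).
# Running these checks on every platform and after every upgrade is
# what rules 3 and 4 ask for.
import math

def trapezoid_integral(f, a, b, n=10_000):
    """Primary method: composite trapezoid rule."""
    h = (b - a) / n
    return h * (0.5 * f(a) + 0.5 * f(b) + sum(f(a + i * h) for i in range(1, n)))

def midpoint_integral(f, a, b, n=10_000):
    """Independent check: composite midpoint rule, a different algorithm."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def test_known_answer():
    # Rule 1: the integral of sin(x) over [0, pi] is exactly 2.
    assert abs(trapezoid_integral(math.sin, 0.0, math.pi) - 2.0) < 1e-6

def test_independent_methods_agree():
    # Rule 8: two independent algorithms should agree to tolerance.
    t = trapezoid_integral(math.sin, 0.0, math.pi)
    m = midpoint_integral(math.sin, 0.0, math.pi)
    assert abs(t - m) < 1e-6

test_known_answer()
test_independent_methods_agree()
print("all checks passed")
```

The point is not the integration scheme itself but the pattern: every result worth publishing should have at least one test with a known answer and at least one independent path to the same number.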

There is more to be said about ensuring that your code and science are robust but the rules above should help you avoid falling into a situation where you produce results and even write papers but then find that it was all for naught due to a small error that no one noticed at the time.

That said, we are all human, and mistakes will be made even when the most diligent care is taken.  Thus, if you do find an error or bug in your code, do not just fix it quietly: notify the community and correct or retract any results you have published.  Honesty, humility, and openness are, as always, the best policy, as they make the whole computational community better.

CC BY-NC-SA 4.0: This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.