## Monday, January 16, 2017

### Open access publishing options in 2017

I just noticed that my go-to journal increased its APC again. There is now a flat fee of $1095, so I am re-evaluating my options for impact-neutral OA publishing. I don't think PeerJ is greedy, so I think the most likely explanation is that their old model was not sustainable. I now feel I have been a bit too hard on some other OA publishers (e.g. here and here, but not here). While price and impact-neutrality are the main considerations, open peer review is a nice bonus that I became accustomed to from PeerJ. In my experience it makes for much better reviews and keeps the tone civil.

$0. Royal Society Open Science still has an APC waiver and open peer review.

€750. Research Ideas and Outcomes (disclaimer: I am a subject editor). Open peer review.

$1000 F1000Research. Open peer review.

$1095 PeerJ. Open peer review.

$1350 Cogent Chemistry. Has a "pay what you can" policy. Closed peer review. HT +Stephan P. A. Sauer

$1495 PLoS ONE. Closed peer review.

$1675 Scientific Reports. Closed peer review.

$2000 ACS Omega. Price for CC-BY by ACS member ($140/year). Closed peer review.

So it looks like Royal Society Open Science is the next thing for me to try, as long as the APC waiver is in place. I should also say that I had a very good experience publishing in Chemical Science ($0, closed peer review) recently, but not all my papers are appropriate for that journal. Similarly, the cost for publishing in ACS Central Science under the CC-BY licence is $500 for non-members.

Let me know if I have missed anything.

This work is licensed under a Creative Commons Attribution 4.0

## Sunday, January 15, 2017

### Making your computational protocol available to the non-expert

I recently read this paper by Jonathan Goodman and co-workers, which I learned about through this highlight by Steven Bachrach. The DP4 method is a protocol for computing chemical shifts of organic molecules using DFT and comparing them to experimental values. This paper automates the method, switches to free software packages (NWCHEM instead of Gaussian and TINKER instead of Macromodel), and tests the applicability for drug-like molecules. The Python and Java code is made available on GitHub under the MIT license. I like everything about this paper and what follows is not a criticism of it.

The method is clearly aimed at organic chemists who use NMR to figure out what they made or isolated. Let's say they want to try DP4 to see how well it works on some molecule they are currently working on.

What's needed to get started
1. Access to a multicore Linux computer. The method requires quite a few B3LYP/6-31G(d,p) NMR calculations and, given the typical size of organic molecules, it will probably not be practical to even test this method on a desktop computer. Even if it is, the instructions for PyDP4 assume you are using Linux, so you'd have to somehow deal with that if you, like many, have a Windows machine.
2. Installation. You have to install NWCHEM, TINKER, and OpenBabel, and configure PyDP4.
3. Coordinates. PyDP4 requires an sdf file as input. You have to figure out what that is and how to make one.
4. Familiarity with Linux. All this assumes that you are familiar with Linux. How many synthetic organic chemists are?

If you'll be using DP4 a lot, all of this may be worth doing, but perhaps not just to try it? If you don't have access to a Linux cluster, buying one for the occasional NMR calculation may be hard to justify. If one is convinced/determined enough, the most likely solution would probably be to find and pay an undergrad to do all this using an older computer you were gonna throw out anyway. Or maybe your department has a shared cluster and a sysadmin who could handle the installation.

Alternative 1: Web server
One alternative is to make DP4 available as a web server, where the user can upload the sdf file and other data. If one includes a GUI all 4 problems are solved ... for the user. The problem for the developer is that this could eat up a lot of your own computational resources. One could probably do something smart to only use idle cycles, but the best case scenario (lots of users) also becomes the worst case scenario. Perhaps there's a way to crowdsource this?

Alternative 2: VM (VirtualBox)
Another alternative is to make DP4 available as a virtual machine (VM). This mostly solves the installation issue. The main problem here is that the user still needs to find a reasonably powerful computer to run this on. The other problem is that the developer needs to test the VM installation on various operating systems and keep it up to date as new ones appear. Perhaps there's a way to crowdsource all this?

Alternative 3: Amazon Web Services or Google Compute Engine
Another alternative is to make DP4 available as a VM image for AWS or GCE. This mostly solves the CPU and installation issues.
The user creates an AWS or GCE account, imports the VM image, and then pays Amazon or Google for computer time using a credit card. For reasonably sized molecules the cost would probably be less than $10/molecule as far as I can tell.

I don't have any direct experience with AWS or GCE so I don't know how slick the interface can be made. All examples I have seen have involved ssh to the AWS/GCE console, so some Linux knowledge is required.
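
As a rough sanity check on a per-molecule figure like the one above, the arithmetic can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function name `estimate_cost_usd`, the number of candidate isomers and conformers, the core-hours per NMR calculation, and the price per core-hour are hypothetical placeholders, not measured DP4 timings or quoted AWS/GCE prices.

```python
def estimate_cost_usd(n_isomers, conformers_per_isomer,
                      core_hours_per_calc, usd_per_core_hour):
    """Estimate the cloud bill for one molecule (hypothetical cost model).

    Assumes one NMR calculation per conformer of each candidate isomer
    and a flat price per core-hour, ignoring storage and data transfer.
    """
    if n_isomers < 1 or conformers_per_isomer < 1:
        raise ValueError("need at least one isomer and one conformer")
    n_calcs = n_isomers * conformers_per_isomer
    return n_calcs * core_hours_per_calc * usd_per_core_hour


# Placeholder inputs only; replace them with your own timings and prices.
cost = estimate_cost_usd(n_isomers=4, conformers_per_isomer=10,
                         core_hours_per_calc=4.0, usd_per_core_hour=0.05)
print(f"Estimated cost: ${cost:.2f} per molecule")
```

The point of writing it out is that the estimate scales linearly with the number of conformers, so flexible molecules are where the bill would grow.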

Alternative 4: AWS/GCE-based Web server
Another alternative is to combine 1 and 3. The problem here is how to bill the individual user for their CPU usage. There are probably ways to do this but it's starting to sound like a lot of work to set up and manage. Perhaps by adding a surcharge one could pay someone to handle this on a part-time basis. Perhaps existing companies would be interested in offering such a service?
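
To make the billing idea concrete, here is a minimal Python sketch of per-user accounting with a surcharge. It is only an illustration under assumed inputs: the job-record layout `(user, cores, wall_hours)`, the function name `bill_users`, the price per core-hour, and the 20% surcharge are all hypothetical, and a real service would also need payment handling that is not shown.

```python
from collections import defaultdict


def bill_users(jobs, usd_per_core_hour, surcharge=0.20):
    """Return {user: amount_in_usd} for a list of finished jobs.

    jobs: iterable of (user, cores, wall_hours) tuples (hypothetical format).
    surcharge: fraction added on top of the raw compute cost, e.g. to pay
    for part-time administration.
    """
    core_hours = defaultdict(float)
    for user, cores, wall_hours in jobs:
        core_hours[user] += cores * wall_hours
    return {
        user: round(hours * usd_per_core_hour * (1 + surcharge), 2)
        for user, hours in core_hours.items()
    }


# Made-up usage records for two users.
jobs = [("alice", 16, 2.5), ("bob", 8, 1.0), ("alice", 16, 0.5)]
for user, amount in bill_users(jobs, usd_per_core_hour=0.05).items():
    print(f"{user}: ${amount:.2f}")
```

The accounting itself is simple; the work mentioned above is in collecting reliable usage records and actually charging people.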

Licensing issues
As far as I can tell the licenses of NWCHEM, TINKER, and OpenBabel allow for all 4 alternatives.

The bigger issue
A key step in making a computational chemistry-based method such as DP4 usable by the non-expert is clearly automation and careful testing. Another is using free software (I have access to Gaussian but I am not going to buy Macromodel just to try out DP4!). Kudos to Goodman and co-workers for doing this. But if we want to target the non-experts, I think we should try to go a bit further. One could even imagine something like this in the impact/dissemination section of a proposal: