## Friday, July 17, 2015

### Planned papers for 2015 - six months in

In January I wrote about the papers I plan to publish and made this list:

Submitted
1. Søs Torpenholt, Leonardo De Maria, Mats H. M. Olsson, Lars H. Christensen, Michael Skjøt, Peter Westh, Jan H. Jensen and Leila Lo Leggio "Effect of mutations on the thermostability of Aspergillus aculeatus β-1,4-galactanase" Computational and Structural Biotechnology Journal, submitted
2. Lars A. Bratholm, Anders S. Christensen, Thomas Hamelryck, and Jan H. Jensen "Bayesian inference of protein structure from chemical shift data" PeerJ, submitted. Preprint

Probable
3. Automated prediction of the NMR structure of the protein CI-2.
4. Linear scaling HF-3c calculations by interface to FMO2 in GAMESS
5. Thermodynamics of binding. I plan to turn my recent blogposts (with 2 more to come) on this topic into a perspective article.
6. ProCS14. I need to turn this masters thesis into a paper (how will I find the time?).
7. NMR structure of the protein AKMT.
8. Benchmarking of PM6 and DFTB3 for barrier heights computed using enzyme active site models.
9. Predicting binding free energies for CB7
10. Probabilistic treatment of distance restraints in protein structure determination

Six months later this list is now:

Published
1. Søs Torpenholt, Leonardo De Maria, Mats H. M. Olsson, Lars H. Christensen, Michael Skjøt, Peter Westh, Jan H. Jensen and Leila Lo Leggio "Effect of mutations on the thermostability of Aspergillus aculeatus β-1,4-galactanase" Computational and Structural Biotechnology Journal 2015, 13, 256–264. DOI
2. Lars A. Bratholm, Anders S. Christensen, Thomas Hamelryck, and Jan H. Jensen "Bayesian inference of protein structure from chemical shift data" PeerJ 2015, 3:e861. DOI
5. Jan H. Jensen "Predicting accurate absolute binding energies in aqueous solution: thermodynamic considerations for electronic structure methods" PCCP 2015, 17, 12441-12451. DOI

Actively being worked on
3. Automated prediction of the NMR structure of the protein CI-2.
We found a bug in the CamShift implementation. This took a while to fix and to re-run a bunch of stuff, including the calculations in Paper 2. The bug doesn't seem to affect the results much. Lars sent me the first rough draft of the paper this week. There are still some calculations missing but it's coming together. My goal is a finished manuscript by the end of August.

6. ProCS15
As per usual, when writing up we found a bunch of small things that needed to be fixed/rerun/checked and since the MS student is now a PhD student in another group this, of course, takes time. I think we are at the point where we have all the data we need and know what to say, but enough has changed that it's a matter of writing the paper from scratch.  I'm working on this now. My goal is a finished manuscript by the end of August.

Very likely
8. Benchmarking of PM6 and DFTB3 for barrier heights computed using enzyme active site models.
Jimmy is now actively working on this and generating data at a rapid pace.  Current goal is to submit in early October at the latest, so that it will be published in 2015.  That means a first rough draft in early September, at the latest.

Probably next year
4. Linear scaling HF-3c calculations by interface to FMO2 in GAMESS
Jimmy basically needs to find a bug related to basis set normalization for heavier elements and run some benchmarks, once he is done with "paper 8".  I would like to submit this in December.

7. NMR structure of the protein AKMT//10. Probabilistic treatment of distance restraints in protein structure determination
These two papers are actually related and may be combined into 1.  However, there is some fundamental work with regard to "10" that still needs to be worked out and tested.

9. Predicting binding free energies for CB7
An undergraduate is now working on this, so there is a good chance we'll have something to publish in 2016.

New: 11. Protein structure refinement using ProCS15
We are getting some undergrads to work on this in September, so there is a good chance we'll have something to publish in 2016.

## Wednesday, July 15, 2015

### Does this experiment measure ΔG° or ΔA°?

I recently came across this interesting paper from 1974 in which they measure the equilibrium constants for hydrogen bond formation between methanol and amines in the gas phase. I find it interesting because it's a concrete example of the practical considerations that underlie thermodynamics, in particular whether the experiment best approximates constant pressure or constant volume.

Adapted from Millen & Mines J. Chem. Soc., Faraday Trans. 2, 1974, 70, 693-699. (c) Royal Society of Chemistry. Reproduced with permission.

Here's a picture of the experimental set-up. What you don't see is that "The entire apparatus was immersed in a glass-fronted thermostat controlled within $\pm$0.1 $^\circ$C." In essence you have two containers A and B connected by a valve C, which is initially closed. You know the volume of container A ($V_{\rm{A}}$) and B ($V_{\rm{B}}$) from calibration experiments using nitrogen gas for some level of mercury in the manometers.

To start an experiment you fill container A with some gas (e.g. methanol) by briefly opening valve D and you do the same for some amine in container B, but you make sure that the pressure in B is higher than that in A. (I'll call the molecules in container A, $A$ and similarly for B.) Then you let the system sit a while so that the temperature of the gas is the same as the thermostat. Then you record the equilibrium pressures in container A ($p_{\rm{A,i}}$) and B ($p_{\rm{B,i}}$) by reading off the heights of the mercury columns (to within $\pm$0.05 mm Hg) using a cathetometer. The increase in volume due to the mercury moving can be calculated using the radius of the manometer tube and added to the measured volumes to yield $V_{\rm{A,i}}$ and $V_{\rm{B,i}}$.

Then you open valve C briefly and let some gas flow from B into A, wait for equilibration, and re-measure the pressure in A ($p$) and B ($p_{\rm{B, f}}$).  The pressure in volume A
$$p=p_A+p_B+p_{A_2}+p_{B_2}+p_{AB}$$
is a sum of the pressures of the individual species now present in container A, such as molecule $A$, dimers of $A$ ($A_2$), etc. This expression can be rewritten as
$$p=p_A+p_B+K_{A_2}p_A^2+K_{B_2}p_B^2+K_{AB}p_Ap_B$$
where $K_{AB}$ is the equilibrium constant we are after.

$K_{A_2}$ and $K_{B_2}$ can be measured by similar experiments on pure $A$ and $B$, but we have two additional unknowns, $p_A$ and $p_B$, so we need two additional equations:
$$\pi_A=p_A+2K_{A_2}p_A^2 +K_{AB}p_Ap_B$$
$$\pi_B=p_B+2K_{B_2}p_B^2 +K_{AB}p_Ap_B$$
Here $\pi_A$ is the "formal" pressure of $A$, which is "the pressure the compound would exert if present in the vapour phase solely as monomer obeying the ideal gas law", and similarly for $B$.
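
The three equations above can be solved numerically for $p_A$, $p_B$ and $K_{AB}$. Here is a minimal Python sketch of one way to do it. It is not necessarily the procedure Millen and Mines used: it assumes the substitution $x = K_{AB}p_Ap_B$, which turns each formal-pressure equation into a quadratic in the monomer pressure, followed by a simple bisection on $x$. The function names and the numbers in the usage example are made up for illustration.

```python
import math


def monomer_pressure(formal, x, k_dimer):
    """Solve formal = p + 2*k_dimer*p**2 + x for the monomer pressure p >= 0."""
    c = formal - x
    # Numerically stable root of 2*k_dimer*p**2 + p - c = 0 (also works for k_dimer = 0)
    return 2.0 * c / (1.0 + math.sqrt(1.0 + 8.0 * k_dimer * c))


def solve_k_ab(p_total, pi_a, pi_b, k_a2, k_b2):
    """Return (p_A, p_B, K_AB) consistent with the total and formal pressures."""

    def residual(x):
        # Total pressure predicted for a trial x = K_AB*p_A*p_B, minus the measured one
        p_a = monomer_pressure(pi_a, x, k_a2)
        p_b = monomer_pressure(pi_b, x, k_b2)
        return p_a + p_b + k_a2 * p_a**2 + k_b2 * p_b**2 + x - p_total

    lo, hi = 0.0, min(pi_a, pi_b)
    if residual(lo) < 0.0 or residual(hi) > 0.0:
        raise ValueError("no physically meaningful solution for these pressures")

    # The residual decreases monotonically with x, so bisection is safe
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid

    x = 0.5 * (lo + hi)
    p_a = monomer_pressure(pi_a, x, k_a2)
    p_b = monomer_pressure(pi_b, x, k_b2)
    return p_a, p_b, x / (p_a * p_b)


# Hypothetical numbers in mm Hg (K values in 1/mm Hg), not taken from the paper
p_a, p_b, k_ab = solve_k_ab(131.37, 51.3, 81.44, 1.0e-4, 5.0e-5)
print(p_a, p_b, k_ab)
```

Note that adding the two formal-pressure equations and subtracting the total pressure gives $\pi_A+\pi_B-p=K_{A_2}p_A^2+K_{B_2}p_B^2+K_{AB}p_Ap_B$, so in the absence of self-association the pressure deficit is exactly the partial pressure of the complex.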

The initial formal pressure of $A$ can be computed from the initial pressure measurement ($p_{\rm{A,i}}$) and the second virial coefficient of $A$ ($B_A(T)$)
$$\pi_{A,\rm{i}} = \frac{p_{\rm{A,i}}}{1+B_A(T)/V_{\rm{A,i}}}$$
and from this we can compute the final formal pressure of $A$ using Boyle's law.
$$\pi_A = \frac{\pi_{A,\rm{i}}V_{\rm{A,i}}}{V_{\rm{A,f}}}$$
$V_{\rm{A,f}} \ne V_{\rm{A,i}}$ because the level of mercury in the manometer changes but $\Delta V_{\rm{A}}$ can easily be computed knowing the radius of the manometer tube.

To get the formal pressure of $B$ in container A we convert the initial and final pressures measured for container B to formal pressures and compute the number of moles of $B$ transferred to volume A
$$\Delta n_B = \frac{\pi_{\rm{B,i}}V_{\rm{B,i}}-\pi_{\rm{B,f}}V_{\rm{B,f}}}{RT}$$
and use this value to compute the formal pressure of $B$ in volume A, $\pi_B$
$$\pi_B = \frac{\Delta n_B RT}{V_{\rm{A,f}}}$$
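
The bookkeeping in the last four equations is easy to get wrong, so here is a small Python sketch of it. It simply transcribes the formulas as written above; the function names, the choice of litres and mm Hg as units, and the example numbers are my own assumptions for illustration.

```python
R_L_MMHG = 62.363598  # gas constant in L mm Hg / (mol K)


def formal_pressure(p_measured, second_virial, volume):
    """Formal pressure from a measured pressure, as in the virial expression above."""
    return p_measured / (1.0 + second_virial / volume)


def boyle(pi_initial, v_initial, v_final):
    """Formal pressure after the container volume changes from v_initial to v_final."""
    return pi_initial * v_initial / v_final


def moles_transferred(pi_b_i, v_b_i, pi_b_f, v_b_f, temperature):
    """Moles of B that left container B, from its initial and final formal pressures."""
    return (pi_b_i * v_b_i - pi_b_f * v_b_f) / (R_L_MMHG * temperature)


def formal_pressure_of_b_in_a(delta_n_b, temperature, v_a_f):
    """Formal pressure of the transferred B in the final volume of container A."""
    return delta_n_b * R_L_MMHG * temperature / v_a_f


# Hypothetical example: pressures in mm Hg, volumes in L, T in K
T = 298.15
pi_a = boyle(formal_pressure(100.0, 0.0, 1.00), 1.00, 1.01)
dn_b = moles_transferred(200.0, 1.00, 150.0, 0.99, T)
pi_b = formal_pressure_of_b_in_a(dn_b, T, 1.01)
print(pi_a, dn_b, pi_b)
```

Since $RT$ cancels when $\Delta n_B$ is inserted into the expression for $\pi_B$, the result depends only on the formal pressures and volumes, as long as the temperature is the same before and after.
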
What about the standard free energy change?
Colloquially speaking, if we use the raw pressure data $K_{AB}$ will have units of (mm Hg)$^{-1}$ but after we convert it to bar$^{-1}$ we can compute a standard free energy change from $K_{AB}$. The question is whether this free energy change corresponds to a Gibbs ($\Delta G^\circ$) or Helmholtz free energy ($\Delta A^\circ$) change or something in between. The Gibbs free energy corresponds to constant pressure while the Helmholtz free energy corresponds to constant volume, and neither seems to strictly apply. The short answer is "I don't know" while the long answer is:

I think the standard free energy change most closely represents $\Delta A^\circ$
While volumes A and B are connected under equilibrium conditions, $K_{AB}$ represents equilibrium measurements done on both volumes A and B, so the thermodynamic system is comprised of both volumes.  The manometer tubes most likely have the same radius to maximize error cancellation, so the decrease in volume B would be quite similar to the increase in volume A, leading to a small net volume change for the system.
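
Whichever label turns out to be the right one, the numerical step from $K_{AB}$ to a standard free energy change is the same. Here is a minimal Python sketch, assuming a standard pressure of 1 bar and the usual relation $-RT\ln(K_{AB}\,p^\circ)$; the function names and the example value of $K_{AB}$ are hypothetical.

```python
import math

R = 8.314462618            # gas constant in J / (mol K)
MMHG_IN_PA = 133.322387415  # conventional mm Hg in Pa
BAR_IN_PA = 1.0e5


def k_per_mmhg_to_per_bar(k_per_mmhg):
    """Convert an association constant from 1/(mm Hg) to 1/bar."""
    return k_per_mmhg * BAR_IN_PA / MMHG_IN_PA


def standard_free_energy(k_per_bar, temperature, p_standard_bar=1.0):
    """Standard free energy change in J/mol from K in 1/bar (assumes p° = 1 bar)."""
    return -R * temperature * math.log(k_per_bar * p_standard_bar)


# Hypothetical K_AB in 1/(mm Hg) at 25 °C, not a value from the paper
k_bar = k_per_mmhg_to_per_bar(2.0e-4)
print(k_bar, standard_free_energy(k_bar, 298.15) / 1000.0, "kJ/mol")
```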

What do you think?

## Friday, May 15, 2015

I recently came across this tweet which makes very innovative use of pictures to ask a multiple choice question.  After some experimentation I made this quiz yesterday

So far, the tweet has received 2,568 impressions and 741 engagements, which is twitter-speak for the number of people who saw the tweet and clicked on it, respectively. This is in large part thanks to retweets by RealTimeChem, A-Level Chemistry and EiC, each with several thousand followers.

The question is taken straight from my teaching material and 741 is far more students than I reach in a year of teaching at the University of Copenhagen. Did I just teach my first (Nano)MOOC?

Anyway, since I use peer instruction in all my courses I have tons more questions so this won't be the last #twitterquiz I post.

If you would like to make your own #twitterquiz you can find the images here.

## Sunday, May 10, 2015

### My teaching statement from 1996

Here is the teaching statement I wrote when I was applying for tenure track positions in the US in 1996. Glad to see the WWW is still around today, otherwise it would have spoiled all my teaching plans :). What this would look like if I had to write it today is a topic for another blog post.

TEACHING PHILOSOPHY
Jan H. Jensen

Lecturing.  Even though I am a theoretical chemist I strongly believe that doing experimental demonstrations is one of the best ways of teaching and I have used it whenever I could.  It is a great way to show chemical concepts at work and it makes everything less abstract.  I have used experiments both as a way of introducing a topic and as a summary.  In both cases I generally ask the students to predict and explain the outcome to me first.  Ideally, I would like to perform one short demonstration per lecture but an appropriate demonstration for a given topic can be hard to find.  So I would like to develop some new demonstrations as part of my teaching efforts.

Invariably we as chemists must explain these experiments and other phenomena in terms of atoms and molecules and this can be very hard.  For example, it is easy enough to say that heat is the random motion of molecules, and quite another to make the students imagine what we think this really looks like.  I think the easiest solution is to use computer graphics to generate animation to describe these things.  Computer animations are routinely done in computational research and there is no reason why one could not create and show these animations during lecture.  Furthermore, additional demonstrations could easily be made available outside of class if a computer lab is available.  Many of the programs used to calculate and display chemical results are easy enough to use so that students can be given access to them outside class.

In addition I would like to explore an attractive and increasingly viable alternative to the traditional computer labs, namely the World Wide Web.  The development of the JAVA language will soon make it possible to imbed 3-dimensional objects and movies, which can be rotated and manipulated interactively, in Web pages.  In effect, this will enable instructors to make interactive course notes or textbooks available to students in a format that most students already will be familiar with.  Furthermore, this material can be accessed from almost anywhere and at any time.

I have intentionally kept my comments fairly general because I believe they apply not only to introductory chemistry (where I have the most experience) but also physical chemistry and graduate classes.  Suitable experimental demonstrations may become harder to perform in-class for higher level courses, but these classes are generally small enough so that the demonstrations could be performed even in a research laboratory.

Research.  Science is best taught through research.  Based on personal experience I believe that undergraduate students should become involved in research as early as possible, and I plan to actively encourage that.  Basic computational chemistry has a fairly easy learning curve and students can quickly get started.  The subsequent interpretation of results will naturally lead the student to learn basic chemical and quantum mechanical principles.

It is hard to say how one would teach graduate students in general since that clearly depends on the individual student.  However, I will address one general expectation.  Students will be required to take on both computational and theoretical/programming projects, though not necessarily an equal amount of both.  Students whose main interest is computational chemistry should have at least some modest ability to alter the computer programs with which they work, so that their research is not totally determined by the capabilities of the programs they employ.  Theory/programming oriented students have to learn how to effectively apply the tools they develop.  Planning and performing a thorough theoretical study of a particular chemical problem is a non-trivial task that must be taught.

Shared students between experimental and theoretical groups are becoming increasingly popular, and I think it is a positive development.  Such students would not be required to take on a theoretical/programming problem.

### Python peer instruction questions

I am teaching a molecular simulations/intro to python course and have just finished drafting the last sets of peer instruction questions. Here's the last question. The idea is that they have to write a small python program on the spot but this might be more of a take home question.  Can you do it?

I'm not always this "evil".  Here's a more typical one.

## Saturday, May 2, 2015

### Koding.com: Installing numpy and matplotlib and transferring files

note to self:

"kpm install pip"
"sudo apt-get install python-dev"
"sudo pip install numpy"  (takes a while)
"sudo apt-get install libfreetype6-dev libxft-dev "
"sudo easy_install matplotlib "

To transfer files from Koding.com to your computer:
1. Move the file into the web directory (e.g. coordinates_end.png).
2. Go to the VM settings modal and find the assigned URL (e.g. http://ujkkbe932623.jhjensen.koding.io/).
3. The file can then be accessed at http://ujkkbe932623.jhjensen.koding.io/coordinates_end.png

If you want to transfer a .py file change the extension to .txt

## Saturday, April 25, 2015

### The main reason I use OA? It makes my research better

When I started publishing OA the answer was "The people who paid for this research should be able to read about the results".  Now the answer is more complex and difficult to fit into 140 characters. Hence this blog post.

The OA movement has three important "side effects":
1. Pioneered by PLoS ONE, many OA journals have removed perceived "impact" as a review criterion
2. Pioneered by PLoS ONE, many OA journals are mega-journals where the appropriateness of the topic of the manuscript to the journal is not an acceptance criterion.
3. Most OA journals allow you to make your manuscript public prior to submission

I have found that points 1-2 have made my research much less risk-averse.  I can focus on truly challenging and long-term research questions without worrying whether or where I will be able to publish.  Before it was: "In order to do X I have to do Y and Z first, but where will I publish Y and Z?" or "If I manage to do X and Y this will sail into Journal Z". Now it is "It's important to find out about X; let's try it and publish what we find".

I can share our work at any stage in any way I see fit.  We put all our manuscripts, MS and PhD theses on preprint servers such as arXiv and we get a lot of great feedback long before the "official reviews" arrive.

It still puzzles me when I see tweets like "Manuscript submitted to X!" and "Paper finally accepted in Y!!!" without a link to a preprint.  What's the information really communicated here?  "Hope to score a publication point soon!"? and "Another line on my CV!!"?

Of course "The people who paid for this research should be able to read about the results" is still a major factor but it has become so much more.