Thursday, December 3, 2015

Virtual lab notebook in IPython

I use IPython a lot, either as a command prompt or in a Jupyter notebook. It is great for analysing data on the go. During such an analysis you may generate some files, either intermediate data or figures.

Sometimes, weeks, months or years later, you would like to remember exactly what you did to generate a given file. This matters for scientific reproducibility, for checking that you followed the right method, or just for reusing a handy bit of code.

If you copied the code into a script file, easy. If you kept your notebook properly organised and never deleted the interesting cell, piece of cake. If it was yesterday and you can press the up key N times to find the right lines in the history, painful but doable. But none of this is a guarantee or a rigorous method. One day you will decide that writing the script is not worth it, another day you will delete the wrong cell in your notebook, and months later, when you have to come back to this analysis, the history will be gone.

It is really akin to the lab notebook, the one in which you note the date, the temperature, the sample name, the procedure you follow, the results of the measurements and all the qualitative observations. Some of it may not seem to matter much at the moment, but if you forget to write it down, you will never be able to reproduce your experiment.

How to make IPython generate this virtual notebook for you?

You want one file per IPython session containing all the commands you type. You can achieve this manually by typing at the beginning of each session:

%logstart mylogfile append


But of course, you will forget.

We have to make this automatic. This is made possible by the profile configuration of IPython. If you have never created a profile for IPython, type in a terminal

ipython profile create

This will generate a default profile in HOME/.ipython/profile_default
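
If you are not sure where that directory ends up on your machine, a short check like the one below can help. It is only a sketch: it assumes the default layout (`~/.ipython`, or the directory named by the `IPYTHONDIR` environment variable if you have set it), and `default_config_path` is a name made up for this example, not part of IPython.

```python
import os


def default_config_path(environ=None):
    """Guess where the default profile's ipython_config.py lives.

    Assumption: IPython uses $IPYTHONDIR when it is set, else ~/.ipython.
    """
    environ = os.environ if environ is None else environ
    ipython_dir = environ.get("IPYTHONDIR") or os.path.join(
        os.path.expanduser("~"), ".ipython")
    return os.path.join(ipython_dir, "profile_default", "ipython_config.py")


if __name__ == "__main__":
    path = default_config_path()
    if os.path.exists(path):
        print("%s (found)" % path)
    else:
        print("%s (missing, run 'ipython profile create')" % path)
```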

Edit the file HOME/.ipython/profile_default/ipython_config.py and look for the line
#c.InteractiveShell.logappend = ''

Instead, write the following lines
import os
from time import strftime

# One log file per session, named after the date and time the session starts.
ldir = os.path.join(os.path.expanduser("~"), '.ipython')
filename = os.path.join(ldir, strftime('%Y-%m-%d_%H-%M') + ".py")
notnew = os.path.exists(filename)
with open(filename, 'a') as file_handle:
    if notnew:
        # Another session started in the same minute: add a separator.
        file_handle.write("# =================================\n")
    else:
        file_handle.write("#!/usr/bin/env python\n# %s.py\n"
                          "# IPython automatic logging file\n" %
                          strftime('%Y-%m-%d'))
    # End with a newline so the first logged command starts on its own line.
    file_handle.write("# %s\n# =================================\n" %
                      strftime('%H:%M'))

# Start logging to the given file in append mode. Use `logfile` to specify a log
# file to **overwrite** logs to.
c.InteractiveShell.logappend = filename

And there you are: each time you open an IPython session, either console or notebook, a file is created in HOME/.ipython. The name of this file contains the date and time at which the session was opened (format YYYY-MM-DD_hh-mm). Everything you type during this session is written automatically to this file.
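
Months later, the question is usually "which session produced this file?". Here is a minimal sketch for answering it, assuming the log files sit in `~/.ipython` and follow the YYYY-MM-DD_hh-mm naming described above. `find_sessions` is a hypothetical helper written for this post, not something IPython provides.

```python
import glob
import os
from datetime import datetime


def find_sessions(needle, log_dir=None):
    """Return (session start, line number, line) for each logged line containing needle."""
    if log_dir is None:
        log_dir = os.path.join(os.path.expanduser("~"), ".ipython")
    hits = []
    # Names sort lexicographically, which for this format is also chronological.
    for path in sorted(glob.glob(os.path.join(log_dir, "*.py"))):
        stem = os.path.splitext(os.path.basename(path))[0]
        try:
            started = datetime.strptime(stem, "%Y-%m-%d_%H-%M")
        except ValueError:
            continue  # not one of the automatic session logs
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    hits.append((started, number, line.rstrip("\n")))
    return hits
```

Calling `find_sessions("figure3.pdf")` (the file name is just an example) would list every logged line mentioning that name, together with the start time of the session it was typed in.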