Session 4: Analysis pipelines and IPython Notebook

A slightly more useful sqer script

Put the following in sqer/calc-lengths.py:

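#! /usr/bin/env python
"""Print the length of every sequence in the given FASTA/FASTQ files."""
import argparse
import screed
import sqer

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('filenames', nargs='+')

    args = parser.parse_args()

    for filename in args.filenames:
        # screed.open yields one record per sequence in the file
        records = screed.open(filename)
        for record in records:
            # one length per line, so the output is easy to load later
            print(len(record.sequence))

if __name__ == '__main__':
    main()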

then

chmod +x calc-lengths.py
git add calc-lengths.py
git commit -am "added calc-lengths.py"
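
For intuition about what the loop in calc-lengths.py prints, here is a rough pure-Python equivalent that does not use screed. It is only a sketch: it assumes plain FASTA input (screed also reads FASTQ), and the function name `fasta_lengths` is made up for this illustration.

```python
def fasta_lengths(lines):
    """Yield the length of each sequence in an iterable of FASTA lines."""
    length = None  # stays None until the first '>' header is seen
    for line in lines:
        line = line.strip()
        if line.startswith('>'):
            # A new record starts: report the previous one, if any.
            if length is not None:
                yield length
            length = 0
        elif length is not None:
            # Sequence data may be wrapped across several lines.
            length += len(line)
    if length is not None:
        yield length


# Hypothetical in-memory example instead of a real file.
example = [">seq1", "ACGT", "AC", ">seq2", "GGG"]
for n in fasta_lengths(example):
    print(n)
```

On the two-record example this prints 6 and then 3, one length per line, which is the same shape of output that calc-lengths.py produces.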

Exercises

Write a test for calc-lengths.py!
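
One possible starting point for such a test is sketched below; treat it as a suggestion rather than the expected answer. It runs a command, reads its output, and compares the numbers to what you expect. The helper name `run_and_collect_lengths` and the file name `test.fa` are hypothetical, and a stand-in command is used so the sketch runs without screed installed.

```python
import subprocess
import sys


def run_and_collect_lengths(cmd):
    """Run cmd and return its standard output as a list of integers."""
    output = subprocess.check_output(cmd)
    return [int(line) for line in output.decode().split()]


def test_lengths():
    # Stand-in command so this sketch runs anywhere. In a real test, replace
    # it with ['./calc-lengths.py', 'test.fa'] (test.fa being a small FASTA
    # file you wrote yourself) and the lengths you expect for that file.
    cmd = [sys.executable, '-c', 'print(4); print(2)']
    assert run_and_collect_lengths(cmd) == [4, 2]


test_lengths()
```

Keeping the test input tiny and hand-written means you can work out the expected lengths by eye.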

Write a little analysis pipeline

Create a directory pipeline under sqer:

mkdir pipeline

and copy in the ‘trinity-nematostella.fa.gz’ file from the training files into this directory (any FASTA/FASTQ file will do here), gunzip it, and then rename it to assembly.fa.

Now, create pipeline/Makefile containing the following (the command line under lengths.txt must start with a tab character, not spaces):

all: lengths.txt

lengths.txt: assembly.fa
	../calc-lengths.py assembly.fa > lengths.txt

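Now, when you type ‘make’, it will run your analysis pipeline. Because lengths.txt depends on assembly.fa, make reruns the command only when lengths.txt is missing or older than assembly.fa. (...pretend that ‘calc-lengths.py’ takes a long time or something :)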
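
The way make decides whether to rerun a rule can be modeled roughly in Python as below. This is a simplified sketch of the timestamp comparison only, not make's actual implementation; the function name `needs_rebuild` is invented for this example.

```python
import os


def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any dependency."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    # Any dependency modified after the target makes the target stale.
    return any(os.path.getmtime(dep) > target_time for dep in dependencies)


# Mirrors the rule "lengths.txt: assembly.fa" from the Makefile.
if needs_rebuild('lengths.txt', ['assembly.fa']):
    print('make would rerun calc-lengths.py')
else:
    print('lengths.txt is up to date')
```

This is why a second ‘make’ right after the first does nothing: lengths.txt is already newer than assembly.fa.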

Start up IPython Notebook

From within the pipeline directory, run:

ipython notebook --pylab=inline

Click on ‘New Notebook’. In this new notebook, enter:

data = numpy.loadtxt('lengths.txt')
hist(data, bins=100)
xlabel('Sequence lengths')
ylabel('N sequences with that length')
title('Sequence length spectrum')
savefig('hist.pdf')

and hit ‘Shift-ENTER’ to execute.

Voila!
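
The notebook cell relies on names that the --pylab option imports for you (hist, xlabel, savefig and so on). If you ever want the same plot from a plain Python script, a sketch with explicit imports might look like the following. The function name `plot_length_histogram` is made up here, and the 'Agg' backend is chosen on the assumption that you only want a file written, with no window opened.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend: writes files, opens no window
import matplotlib.pyplot as plt
import numpy as np


def plot_length_histogram(lengths_file, output_file, bins=100):
    """Plot a histogram of the lengths in lengths_file and save it."""
    # ndmin=1 keeps a one-line file from loading as a 0-d scalar.
    data = np.loadtxt(lengths_file, ndmin=1)
    fig, ax = plt.subplots()
    ax.hist(data, bins=bins)
    ax.set_xlabel('Sequence lengths')
    ax.set_ylabel('N sequences with that length')
    ax.set_title('Sequence length spectrum')
    fig.savefig(output_file)
    plt.close(fig)
    return len(data)
```

Calling plot_length_histogram('lengths.txt', 'hist.pdf') from the pipeline directory would write the same kind of figure as the notebook cell.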

Save the notebook (File... save...)

Now, do (from within the pipeline directory):

ls -1 assembly.fa lengths.txt > .gitignore
git add Makefile .gitignore *.ipynb
git commit -am "analysis makefile and notebook"

and then:

git push origin master

Now go find the raw URL to your notebook on GitHub, copy it, and then paste it in at:

http://nbviewer.ipython.org

Voila!

Additional IPython resources:

The ipynb site: http://ipython.org/notebook.html
A gallery of interesting notebooks: https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks
The matplotlib gallery: http://matplotlib.org/gallery.html

Note that you can use ‘%loadpy’ in IPython Notebook to grab code from online and import it into your notebook automagically.
