Archives

Legs of Snakes and Religion of Biologists

We wrote about the bladderwort genome paper earlier.

Carnivorous Bladderwort Plant Genome Shows Some Designers are More Intelligent than Others

The authors discovered that the genome of the carnivorous plant was packed with protein-coding genes and had thoroughly purged its ‘junk DNA’. It was the clearest evidence that junk DNA lacks function: if 80% of the genome, junk DNA included, were functional as claimed by ENCODE, the 22 Gb pine genome would code for far more function than the 82 Mb bladderwort genome. Yet, externally, we do not see pine trees doing anything more than bladderworts. Both are multicellular, both grow leaves, roots and shoots, and both perform photosynthesis. Sure, pine trees are much bigger than bladderworts, but no correlation between physical size and genome size has been shown in plants.

Dr. Jonathan Eisen did not agree and came up with the most logically distorted argument in support of junk DNA.

Most plants have junk DNA, one doesn’t. Thus, junk DNA is useless.

Most reptiles have legs, snakes don’t. Thus, legs are useless.

The analogy is flawed, as pointed out by T. Ryan Gregory:

Genome Reduction in Bladderworts vs. Leg Loss in Snakes

No, that isn’t the logic, and the legless snakes or eyeless cave fishes analogy is flawed. Why?

1. We know that legs and eyes are functional, and we know what they are functional for (walking and seeing, respectively). By contrast, we do not have strong evidence that non-coding DNA is functional or what it may be functional for. Worse, the very existence of so much non-coding DNA itself is taken as “evidence” that it must be doing something. Therefore, the observation of a plant that lacks a substantial amount of non-coding DNA but gets by just fine suggests that this kind of DNA isn’t strictly necessary in order to make a complex plant.

2. If most of the non-coding DNA in a larger genome does serve an important regulatory function, then it means this plant with a tiny genome must have evolved a totally different system for regulating its genes. This strikes me as a rather large assumption — and in any case, it’s one for which we have no evidence. As such, I would argue that it is at least as parsimonious to take this small genome as evidence that non-coding DNA in general does not serve a key regulatory function for the most part.

3. When snakes lost their legs or cave fishes lost their eyes, they also lost the specific ability that legs or eyes provided. Legless snakes can’t walk, because the function of legs is walking. Eyeless fishes can’t see, because the function of eyes is seeing. The proposed function for non-coding DNA is gene regulation. Unlike the snake or fish example, the bladderwort has lost most of its non-coding DNA but it can still regulate all of its genes just fine.

What T. Ryan Gregory wrote in so much text was summarized by Dan Graur in three sentences.

Thus, a better analogy would be

Most plants have junk DNA, one doesn’t. Thus, junk DNA is useless.
Most people have imaginary friends (religion), some don’t. Thus, imaginary friends are useless.


An (unnecessary) imaginary friend

We presume Dr. Graur did not mean to count NIH-funded biologists among those without religion, because, for some, religious practices and leaders evolved with time.

Introduction to Chaos and Nonlinear Dynamics for Biologists, Perl Code Included

The mathematics of chaos and nonlinear dynamics is so important for biological modeling that we thought it would help our readers if we explained it in simple language. The toy model presented below was already mentioned in an earlier commentary, but here we add code and other details so that you can play with it.

The concept of a toy model is very important in data analysis and in understanding ‘big data’, that is, large amounts of likely useless data. For example, imagine you are studying how people get fat. You may collect extensive data on their food habits, what languages they speak, how many sentences they utter in a day and how many hours they sleep, but ultimately find that their weight gain is directly proportional to the number of bottles of sugary syrup they drink. So, the toy model is a straight line relating weight gain to sugar intake, while the other factors contribute noise.

Mathematicians studied many types of simple toy models and observed that complex behavior appeared in very simple models as long as they included some nonlinearity. Try this equation (the logistic map):

x_{n+1} = r * x_n * (1 - x_n)

It is a discrete equation and here is how to run it. Let us say r=1. We start with x_0 between 0 and 1 (let’s say 0.5), and compute x_1 = r * x_0 * (1 - x_0) = 0.25. Then we plug x_1 into the equation and compute x_2, and so on. Here is the code you can play with, where r is the input parameter.


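#!/usr/bin/perl
# Iterate x -> r*x*(1-x) and print "step value" on each line.
use strict;
use warnings;

my $r = shift @ARGV;                 # input parameter r, e.g. 3.2
die "usage: $0 r\n" unless defined $r;

my $x = 0.5;                         # starting value x_0

for my $i (0 .. 19999) {
    $x = $r * $x * (1 - $x);         # x_{n+1} = r * x_n * (1 - x_n)
    print "$i $x\n";
}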

What will you see at different values of r?

With r between 0 and 1, the population will eventually die, independent of the initial population.

With r between 1 and 2, the population will quickly approach a steady state value, independent of the initial population.

With r between 2 and 3, the population will also eventually approach the same value, but first will fluctuate around that value for some time. The rate of convergence is linear, except for r=3, when it is dramatically slow, less than linear.

With r between 3 and approximately 3.44949, from almost all initial conditions the population will approach permanent oscillations between two values. These two values are dependent on r.

With r between 3.44949 and 3.54409 (approximately), from almost all initial conditions the population will approach permanent oscillations among four values.

With r increasing beyond 3.54409, from almost all initial conditions the population will approach oscillations among 8 values, then 16, 32, etc.

At r approximately 3.56995 is the onset of chaos, at the end of the period-doubling cascade. From almost all initial conditions we can no longer see any oscillations of finite period.

Beyond r = 4, the values eventually leave the interval [0,1] and diverge for almost all initial values.

Play with the equation yourself and see how it behaves. You can think of x_0 as the part of land on earth covered by trees, and x_1 as the part covered by trees next year. Imagine you have a model where 'climate change' increases the amount of rainfall, which changes the part of land on earth covered by trees in a nonlinear way. Or x_0 can be the proportion of bugs in a forest, the part of a test tube filled with bacteria, the normalized count of a type of transcription factor in a cell, and so on.
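The regimes listed above can be checked numerically. The following is a minimal sketch in Python (not the Perl script above), assuming a fixed burn-in of 2000 steps and a tolerance of 1e-9; the function names `logistic_orbit` and `detect_period` are ours and purely illustrative. It discards the transient and then looks for the smallest period at which the remaining values repeat.

```python
def logistic_orbit(r, x0=0.5, burn_in=2000, keep=256):
    """Iterate x -> r*x*(1-x), discard a transient, return the next `keep` values."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1 - x)
    orbit = []
    for _ in range(keep):
        x = r * x * (1 - x)
        orbit.append(x)
    return orbit


def detect_period(orbit, max_period=64, tol=1e-9):
    """Return the smallest p with orbit[i] ~= orbit[i+p] for every i, or None."""
    for p in range(1, max_period + 1):
        if all(abs(orbit[i] - orbit[i + p]) < tol
               for i in range(len(orbit) - p)):
            return p
    return None  # no short period found (assumed cut-off: max_period)


if __name__ == "__main__":
    for r in (0.8, 1.5, 2.5, 3.2, 3.5, 3.9):
        print(r, detect_period(logistic_orbit(r)))
```

A result of `None` only means that no period up to `max_period` was found within the chosen tolerance; it is a hint of chaotic behavior, not a proof. Close to the bifurcation points (for example r=3) convergence is slow, so a longer burn-in may be needed there.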

Nanopore Sequencing by Divine Intervention?

It is not clear why nanopore technology, among all sequencing approaches, creates so much controversy. Oxford Nanopore has been ‘unveiling’ its ultra-low-cost USB stick sequencer since 2011:

Jan 2011: Cluster Sequencing with Oxford Nanopore’s GridION System

Feb 2012: Making sequencing simpler with nanopores

Sep 2012: Benchtop sequencers ship off

Nov 2012: Oxford Nanopore to Unveil New DNA Sequencers This Week

Mar 2013: MinION: A complete DNA sequencer on a USB stick

but there is only one problem. They did not send any data to Mike the mad biologist (Dear Oxford Nanopore: How About Some Data?) or other less mad ones for that matter.

—————————————————-

In contrast, the latest nanopore controversy is about a new paper that presented plenty of data, but the data look so good that others are wondering whether the researchers had help from God. If not, their paper would be worthy of Nobel prizes in every branch of science. Strangely, none of those critical breakthroughs are described in the paper. (h/t: @infoecho)

The authors built a protein transistor working at superconducting temperature and passed the DNA molecule through it. Supposedly, the gate voltage was modulated as DNA polymerase incorporated additional nucleotides, causing nucleotide-specific spikes in the current between drain and source.

The claims were questioned by everyone, including physicists, chemists and biologists. However, one group did not find them to be exaggerated (see at the bottom).

Industry Experts Question Claims of Sequencing Study Published in Nature Nanotechnology

The group questioning the paper has issues with components of the physics,
enzymology and chemistry behind the study.

One issue, said Lindsay, is that the study describes using superconducting materials
at the interface of the protein transistor and probes to reduce signal decay.
However, he said, “none of us know of a superconducting material that works at the
same temperature as a polymerase.”

Superconducting materials operate at well below freezing with even so-called high
temperature superconductors operating below -135 degrees Celsius. Polymerases, on
the other hand, generally operate around room temperature

Another issue relates to the bioelectronics and chemistry. The study’s authors say
that they applied a voltage across the electrodes of up to 9.0 volts. However, at
that high of a voltage, water would become hydrolyzed, generating hydrogen and
oxygen gases. If that happened, said Wanunu, it would be difficult to measure
signals that were a property of the enzyme.

In order to not hydrolyze water, he said the applied voltage would have to be below
around 1.5 volts.

Finally, the scientists had concerns with the enzymology described in the paper.

Which group of ‘scientists’ did not question the paper? Economists of course, where exaggeration is standard operating procedure :)

How about a Chaotic Genome Assembler?

Earlier we discussed conifer genome assembly in two commentaries.

Steven Salzberg at #bog13: Assembling 22Gb Conifer Genome

#bog13 Posters on Pine and Coelacanth

One important point overlooked in the commentary was that the University of Maryland (UMD) did most of the work on the whole-genome assembly with MaSuRCA, and the acronym itself stands for Maryland Super-Reads with Celera Assembler. The leaders of the UMD effort are Aleksey Zimin and Jim Yorke.

Dr. Yorke is a distinguished mathematician with major contributions to chaos theory. In 1975, he and his student Tien-Yien Li used the word ‘chaos’ (as in chaos theory) for the first time in a very interesting paper, “Period Three Implies Chaos”.


For readers unfamiliar with chaos theory, it was one of the greatest advances in mathematics in the 20th century. It showed that the perturbation method of long-term forecasting was invalid in many nonlinear systems, and that ‘complex’ behavior in nature could be derived from simple nonlinear equations. As an example, try the following discrete equation:

x_{n+1} = r * x_n * (1 - x_n)

If you increase r above 3.56995, the series will start to behave erratically. With a small change in r, you will see large divergence in the output over time.
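To see this divergence for yourself, here is a small Python sketch. It assumes the discrete equation is the logistic map x_{n+1} = r * x_n * (1 - x_n) from our earlier commentary; the helper names `trajectory` and `max_gap`, the perturbation of 1e-9 and the 200 steps are illustrative choices of ours. It runs the map twice with almost identical r and reports the largest gap between the two runs.

```python
def trajectory(r, x0=0.5, steps=200):
    """Return [x_0, x_1, ..., x_steps] for x_{n+1} = r * x_n * (1 - x_n)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs


def max_gap(r, dr=1e-9, x0=0.5, steps=200):
    """Largest difference between runs with parameters r and r + dr."""
    a = trajectory(r, x0, steps)
    b = trajectory(r + dr, x0, steps)
    return max(abs(u - v) for u, v in zip(a, b))


if __name__ == "__main__":
    for r in (2.5, 3.2, 3.9):
        print(r, max_gap(r))
```

For r well below 3.56995 the two runs should stay practically on top of each other, while for a value such as r=3.9 the tiny change in r grows until the two series have nothing to do with each other.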

We know what you are thinking. How did the biodiversity paper published in Nature (Severe Biodiversity Loss Expected by Year 2080) predict the future in 2080, given that even the simplest nonlinear system has chaotic uncertainty? Here is how. Chaos theory was considered to be the greatest mathematical discovery of the 20th century, but it was extremely inconvenient for central planners, who wanted to convey a feeling of certainty, confidence and omniscience. A theory claiming that a small change in input could dramatically change long-term output was completely unacceptable. For that reason, chaos theory was ‘disproved’ by a very famous inventor using a method better than mathematics, namely politics. Central planners have never had to look back since then.

In the meantime, those of you still stuck in the mathematical glories of the past century may enjoy our earlier commentary on the complexity of sandpiles – What Does BTW Stand for?.

External Memory Generalized Suffix and LCP Arrays Construction

Felipe A. Louza, one of our readers, posted a link to his paper that others may find helpful. It has been accepted at the 24th Annual Symposium on Combinatorial Pattern Matching (CPM 2013).

A suffix array is a data structure that, together with the LCP array, allows solving many string processing problems in a very efficient fashion. In this article we introduce eGSA, the first external memory algorithm to construct both generalized suffix and LCP arrays for sets of strings. Our algorithm relies on a combination of buffers, induced sorting and a heap. Performance tests with real DNA sequence sets of size up to 8.5 GB showed that eGSA can indeed be applied to sets of large sequences with efficient running time on a low-cost machine. Compared to the algorithm that most closely resembles eGSA purpose, eSAIS, eGSA reduced the time spent to construct the arrays by a factor of 2.5–4.8.
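For readers who have not met these two arrays before, here is a tiny in-memory Python sketch of what they contain. It is emphatically not eGSA: it handles a single string, sorts suffixes naively, and keeps everything in RAM, whereas the paper is about sets of strings in external memory. The function names are ours; the LCP step follows the well-known linear-time scan usually attributed to Kasai and co-authors.

```python
def suffix_array(text):
    """Naive construction: sort suffix start positions by the suffix itself."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    """lcp[k] = length of the common prefix of suffixes sa[k-1] and sa[k]."""
    n = len(text)
    rank = [0] * n
    for pos, start in enumerate(sa):
        rank[start] = pos
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]          # suffix just before suffix i in sorted order
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1                   # the next suffix shares at least h-1 characters
    return lcp


if __name__ == "__main__":
    text = "banana"
    sa = suffix_array(text)
    print(sa)
    print(lcp_array(text, sa))
```

The naive sort copies suffixes and is only suitable for toy inputs; the whole point of algorithms such as eGSA and eSAIS is to avoid exactly this when the data do not fit in memory.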

Our previous threads on similar topics are:

On Constructing Suffix Arrays and LCP Arrays in External Memory

Construction of Suffix Array in External Memory – Follow Up

Also, in this context, we would like to mention another (old) paper that readers may enjoy. We believe we first saw it while digging through Jared Simpson’s SGA code.

Dynamic Extended Suffix Arrays

by M. Salson, T. Lecroq, M. Leonard, L. Mouchard.

In this article, we are presenting an algorithm that modifies the suffix array and the Longest Common Prefix (LCP) array when the text is edited (insertion, substitution or deletion of a letter or a factor). This algorithm is based on a recent four-stage algorithm developed for dynamic Burrows-Wheeler Transforms (BWT). For minimizing the space complexity, we are sampling the Suffix Array, a technique used in BWT-based compressed indexes. We furthermore explain how this technique can be adapted for maintaining a sample of the Extended Suffix Array, containing a sample of the Suffix Array, a sample of the Inverse Suffix Array and the whole LCP array. Our practical experiments show that it operates very well in practice, being quicker than the fastest suffix array construction algorithm.

Slides:

Dynamic Burrows Wheeler Transform

Severe Biodiversity Loss Expected by Year 2080

An article from Nature is so hilarious that it got us wide awake at 5 AM. After opening the Yahoo front page, we noticed a story:

On the Brink: Climate Change Endangers Common Species

Here are the key sentences to spare you from reading the link.

Under a “business as usual” scenario, where greenhouse gas emissions aren’t significantly reduced, about 50 percent of plants and one-third of animals are likely to vanish from half of the places they are now found by 2080.
…………….
Not too late
It’s not too late to do something to prevent the widespread loss of species, however. The study found that if emissions are slowed and ultimately begin being reduced by 2017, about 60 percent of the losses can be avoided, Warren said.

In a nutshell, the authors of the Nature study ‘simulated’ changes in the world economy between today and 2080, and the death, migration and behavior of all species between today and 2080, to make two very specific predictions.

As an exercise, we request readers to place themselves in 1910, close their eyes and try to predict the world of 1980. Prior to 1910, the world, and especially Europe, had seen general peace over the previous seven decades. Germany was technologically the most advanced nation and was going strong under an emperor. China was living under another imperial regime and had not yet seen the Xinhai Revolution. The Romanovs were ruling over Russia and had suffered a surprise loss to rapidly industrializing Japan (another empire) only five years earlier. Quantum mechanics had not been discovered. Einstein had barely left his job at the patent office.

Excellent. Sitting in 1910, now that you have been able to predict economic events between 1910 and 1980, predict the amount of horseshit to be generated in the world in 1980. Why horseshit? Because Henry Ford did not start mass production of cars until 1914, and the horse and buggy was still the primary mode of transportation.

The above two models are not enough. Next you need to forecast how every duckling around the globe will be affected by the smell of an incredible amount of horseshit over the next 70 years. Yes, that is essentially what the Nature article did in its analysis of changes in global biodiversity. It claimed to have a model to forecast the changes experienced by every crow, every cow, every tiger, every fly and every bigfoot over the next 70 years due to predicted changes in the amount of horseshit carbon dioxide.

At this point, you have completed only half of the analysis, and need to redo it all under the alternate scenario dubbed ‘solution’. Let us say the solution is (was) to feed horses a different type of grass from a small town named Los Angeles, whose population stood at an incredibly large 300,000 after 300% growth over a decade. You do the calculation again, forecast the counts of ducklings, crows, cows, tigers, flies and bigfoots in 1980 and tell everyone how the world would be much better if they implemented your ‘solution’ instead of going with ‘business as usual’.

We are so saddened to see the above analysis in a scientific journal that we need to take a break from blogging for a few days. During the break, we will also ponder why an Economist article cannot forecast India’s future over the next seven years, let alone the next seventy years.

Ways to Find Homologous Genes

The focus of this commentary is sequence homology, defined as shared ancestry between two segments of sequence. It includes two broad classes of relations between genes – orthology and paralogy. More recently, Ken Wolfe added a new term, ‘ohnology’, to describe paralogous genes whose shared ancestry comes from whole genome duplication events. As an example, the bladderwort genome paper presents a very informative analysis of various whole genome duplication events in plants. Check their Fig. 1 for example.

Orthology

Orthology refers to shared ancestry of a gene in two species. For example, the human Hox genes in four genomic clusters are orthologous to the corresponding four sets of Hox genes in another species.

Paralogy

Paralogy refers to shared ancestry caused by gene duplication. Hox1, Hox2, Hox3 and Hox4 clusters in human genome are paralogous to each other.

————————————-

Now that a large number of complete genome sequences are being published online, we looked for the best approaches to find orthologous genes in those genomes and came across a very informative Biostars thread.

What is the best method to find orthologous genes of a species?

The rest of this commentary is our lame attempt to summarize the excellent discussions in the above thread.

Bioinformatics approaches for finding orthologies

They fall into two categories – reciprocal BLAST and construction of phylogenetic trees. The reciprocal BLAST approach is fast but less accurate, and the phylogenetic tree approach is slower but more accurate.
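As a rough illustration of the reciprocal BLAST idea, the Python sketch below picks reciprocal best hits from two BLAST result sets in the standard 12-column tabular format (`-outfmt 6`), where the last column is the bit score. It assumes you have already run species A against species B and B against A; the function names and the gene identifiers in the usage example are made up, and real pipelines usually add e-value and coverage filters that are omitted here.

```python
def best_hits(blast_lines):
    """Map each query ID to its highest-bitscore subject ID."""
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


def row(query, subject, bitscore):
    """Build a fake tabular BLAST line; only columns 1, 2 and 12 matter here."""
    filler = ["90.0", "100", "10", "0", "1", "100", "1", "100", "1e-50"]
    return "\t".join([query, subject] + filler + [str(bitscore)])


if __name__ == "__main__":
    # Hypothetical hits, for illustration only.
    a_vs_b = [row("geneA1", "geneB1", 180), row("geneA1", "geneB2", 90),
              row("geneA2", "geneB2", 150), row("geneA3", "geneB1", 100)]
    b_vs_a = [row("geneB1", "geneA1", 180), row("geneB1", "geneA3", 100),
              row("geneB2", "geneA2", 150), row("geneB2", "geneA1", 90)]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In the toy data, geneA3 hits geneB1, but geneB1 prefers geneA1, so geneA3 is left without an ortholog call. Cases like this, typically caused by duplication or gene loss, are where the tree-based methods are expected to do better.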

BlastO is a BLAST web-service developed by Princeton researcher Yi Zhou to allow you to do BLAST on orthologous genes.

——————————————
Databases

Several online databases exist, from where researchers can download pre-computed orthology tables for a set of organisms.

MetaPhOrs

MetaPhOrs can be described as an orthology database of orthology databases. It combines results from many other orthology databases using a phylogeny-based method and develops more accurate tables.

MetaPhOrs is a public repository of phylogeny-based orthology and paralogy predictions that were computed using resources available in seven popular homology prediction services (PhylomeDB, EnsemblCompara, EggNOG, OrthoMCL, COG, Fungal Orthogroups, and TreeFam).

MetaPhOrs data can be downloaded from this ftp site. We do not know how often they update their tables, because keeping up with the newly released genomes is the biggest challenge faced by all groups maintaining orthology databases. Currently it includes 863 organisms, but a large number of them are prokaryotes.

COG and eggNOG

COG database is maintained at NCBI.

Clusters of Orthologous Groups of proteins (COGs) were delineated by comparing protein sequences encoded in complete genomes, representing major phylogenetic lineages. Each COG consists of individual proteins or groups of paralogs from at least 3 lineages and thus corresponds to an ancient conserved domain.

FTP site for COG eukaryotic data is here. Its eukaryotic cluster includes only seven organisms.

EggNOG is EMBL’s improvement over COG.

EggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups) is a database of orthologous groups of genes. The orthologous groups are annotated with functional description lines (derived by identifying a common denominator for the genes based on their various annotations), with functional categories (i.e derived from the original COG/KOG categories).

eggNOG’s database currently counts 721,801 orthologous groups in 1133 species, covering 4,396,591 proteins (built from 5,214,234 proteins).

Their data can be downloaded from here.

EnsemblCompara

EnsemblCompara does very extensive work of building trees for orthologous genes. We do not know how much they differ from eggNOG, because we are too lazy to read the papers. Biostars thread says that Ensembl is phylogeny based and eggNOG is BLAST-based, but the download page of eggNOG does list some trees.

Gene trees are constructed using a representative protein for every gene in Ensembl: proteins are clustered using hcluster_sg based on WU-BLAST scores, and each cluster of proteins is aligned using M-Coffee. Finally, TreeBeST is used to produce a gene tree from each multiple alignment, reconciling it with the species tree to call duplication events. Homologues are deduced from these trees. We also determine gene gain and loss events using the CAFE software.

————————————————

Possibly not being updated:

InParanoid and MultiParanoid

InParanoid was last updated in June 2009, when nobody expected more than five Illumina machines to be sold. MultiParanoid was last updated three years prior to that. Oh well.

OrthoMCL, phylofacts, TreeFam, PHOG – are they being updated?

If we would like to use a database that is updated regularly with the arrival of new genomes from all kinds of places, which one should we use?

Titus Brown’s Commentary on the Cost of Open Science

Titus Brown, a professor at MSU, is conducting a social experiment with his bioinformatics research approach that he calls ‘open science’. He posted an informative commentary about what he has learned so far, along with his suggestions for what institutions should do. We agree with many things he said, and the following commentary is only on the points we disagree with.

Disagreement 1

Titus wrote:

In the specific realm of biology and software, I think there’s a strong argument to be made that the future belongs to those who try to build good software. I hope so. But I’m getting tired of the slow pace, and I’m not sure how to accelerate things — discussion and ideas here. (I hope to have some good news on this front in a few weeks, BTW.)

What he and others call ‘open science’ is really not science, but rather technology development.

In the context of technology development, ‘openness’ was invented many centuries ago and it was called patent.

The word patent originates from the Latin patere, which means “to lay open” (i.e., to make available for public inspection).

When an inventor is granted a patent, he tells others about his know-how and others agree to pay him a fee for using his invention. There is always a risk that others will tweak his method and find a better solution. There is also a risk that others will get ahead a lot faster by leveraging his discovery. Everything Titus mentioned about his experience of ‘openness’ has been known to technology developers for centuries.

Companies often choose between getting a ‘patent’ or keeping a ‘trade secret’ (closed discovery). Here is a classic example of trade secret leaking out.

RC4 was designed by Ron Rivest of RSA Security in 1987. While it is officially termed “Rivest Cipher 4”, the RC acronym is alternatively understood to stand for “Ron’s Code” (see also RC2, RC5 and RC6).

RC4 was initially a trade secret, but in September 1994 a description of it was anonymously posted to the Cypherpunks mailing list. It was soon posted on the sci.crypt newsgroup, and from there to many sites on the Internet. The leaked code was confirmed to be genuine as its output was found to match that of proprietary software using licensed RC4. Because the algorithm is known, it is no longer a trade secret. The name RC4 is trademarked, so RC4 is often referred to as ARCFOUR or ARC4 (meaning alleged RC4) to avoid trademark problems. RSA Security has never officially released the algorithm; Rivest has, however, linked to the English Wikipedia article on RC4 in his own course notes. RC4 has become part of some commonly used encryption protocols and standards, including WEP and WPA for wireless cards and TLS.

Academic scholars also had a system for sharing their ideas and making sure they were properly cited. The honor system was called ‘plagiarism’. We covered the history of plagiarism here. Plagiarism was not equivalent to copyright violation until academics sold their research journals to commercial entities. Instead anyone, who did not properly cite prior discoveries, was considered an unethical writer. Check NIH guidelines for ethical writing and plagiarism, and you will find those rules still being mentioned.

That system has been thrown out of the window, and today scientists (‘technology developers’) have to curry favor (aka ‘marketing’ in the tech world) to get their papers cited.

Disagreement 2

All that reinvention of the openness wheel is fine and dandy except for one problem. Unlike the patenting world, which is supported by users of an invention paying the inventor for his openness, the open world of Titus is sustained through government money. Based on Titus’s blog post and a few of the comments, a consensus needs to be created for ‘science’ to gain acceptance, and ‘openness’ is the method for gaining consensus.

That creates several problems for science, because major scientific discoveries were never popular at the time of discovery. Euler’s formula, described by Feynman as “one of the most remarkable, almost astounding, formulas in all of mathematics”, was not very popular in 18th century Europe full of farmers. Darwin’s theory was not accepted for many decades. What is the guarantee that the scientific discovery of today that people will refer to in the year 2300 is popular today?

The above paragraph is not true for technology development, and almost all cool examples cited by readers of Dr. Brown’s blog are technological discoveries. Therefore, we believe disagreement 1 is more important in this context than disagreement 2.

Should Genome Assembly Improve with More and More Reads?

Conventional wisdom says yes, but that may not work for de Bruijn graph-based assemblers. Last year, we were experimenting with Velvet and noticed something strange. When we randomly removed half of the reads from a genomic library, the ‘assembly quality’ seemed to have improved. Assembly quality was measured in terms of N50 of scaffolds and not using any other detailed analysis. What was going on?
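For reference, N50 is the length L such that scaffolds of length L or longer contain at least half of all assembled bases. A minimal Python sketch of that definition follows (the function name `n50` is ours); it also shows why the number alone can mislead: wrongly joining two scaffolds raises N50 even though the assembly got worse.

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


if __name__ == "__main__":
    honest = [100, 100, 100, 100]   # four correct scaffolds
    misjoined = [200, 100, 100]     # two of them glued together, rightly or wrongly
    print(n50(honest), n50(misjoined))
```

The statistic rewards contiguity only, so it cannot distinguish a correct join from a misassembly.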

We did k-mer analysis of the assembled scaffolds. It was clear that the assembly was wrong, because some k-mers supposedly present only once in the genome were found more than 5-6 times in the assembly. Here was our best explanation for the observation. A de Bruijn graph-based assembler splits all reads into k-mers and forms a giant graph. That graph is resolved into contigs. At times, the program sees overlaps between two branches (such as the second figure here) and creates multiple contigs all terminating at the junction in the figure.

What happens when you remove some reads? The data become sparse, and one branch or the other of the graph may not appear at all. The assembler then does not pause at the junction and reports one branch or the other as a contiguous sequence, even when it is not.

The second best explanation was a possible defect in Velvet’s scaffolding routine. We did not have to explore further to resolve between the possibilities.
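The k-mer check described above can be sketched in a few lines of Python. This is a simplified illustration rather than the analysis we actually ran: it assumes you already have a set of k-mers believed to be single-copy in the genome, it ignores reverse complements, and the function names and toy sequences are hypothetical.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every k-mer (forward strand only) across a list of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def overrepresented(assembly_counts, single_copy_kmers, max_copies=1):
    """K-mers expected once in the genome that occur more often in the assembly."""
    return {kmer: assembly_counts[kmer]
            for kmer in single_copy_kmers
            if assembly_counts[kmer] > max_copies}


if __name__ == "__main__":
    scaffolds = ["ACGTACGTAC", "TTACGTAA"]       # toy 'assembly'
    counts = count_kmers(scaffolds, k=4)
    print(overrepresented(counts, {"ACGT", "GTAA"}))
```

For a real genome you would use a dedicated k-mer counter and canonical k-mers (a k-mer and its reverse complement counted together); a plain Python `Counter` is only practical for small test cases.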

#bog13 Posters on Pine and Coelacanth

We decided to post on talks and posters from the Biology of Genomes 2013 meeting for our readers who could not attend the meeting at CSHL. The process has been quite time consuming, because neither Cold Spring Harbor Lab nor the conference organizers posted abstracts and a detailed list of authors online. So, we have to Google the name of the first author, find their email address and then personally request a copy of their slides or posters.

One request was rejected with the excuse that our blog is not ‘neutral’. The snub is likely related to our reporting of ENCODE hype. The characterization is odd, because we try to be as neutral as science allows us to be, and have never distorted a scientific analysis that is convincingly correct. Unfortunately, science itself is quite brutal to bad theories. In fact, science can be described as extremely biased by contemporary media standards (think CNN Crossfire), because it does not place an evolution pundit and an intelligent design pundit on the same podium and let them ‘debate’ the issues.

That aside, we received a few slides and posters and will continue to post them in the coming days.

Loblolly Pine

It is quite an odd day to keep switching between the largest sequenced plant genome (22 Gb) and the smallest one (82 Mb). Earlier we wrote about Steven Salzberg’s talk on the genome assembly of loblolly pine. Daniela Puiu from Johns Hopkins University sent us the associated poster that our readers may find informative.

In it, their group presented some comparative statistics after assembling their reads with the SOAPdenovo2 assembler. We have one complaint about their conclusion, however. Based on our understanding of the algorithm, short SOAPdenovo2 contigs are a feature and not a bug. If we understand correctly, SOAPdenovo2 makes up for them in its scaffolding step by proceeding hierarchically from PE reads to reads with longer mate pairs.

Please click on the image to see a full view of their poster.


Regarding availability of their program, Dr. Salzberg and Daniela Puiu informed us:

DP:

The MaSuRCA assembler is available on the University of Maryland ftp server under: ftp://ftp.genome.umd.edu/pub/MSR-CA/

SS:

The specific pipeline we’re using for pine isn’t in any format we could share right now, but as Daniela says, the code is available. There’s lots of other pieces too – all will be made freely available as soon as we can.

Coelacanth

We wrote earlier about the coelacanth genome paper and the authors’ Google Hangout meeting. Chris Amemiya sent us his BOG13 slides that our readers may find informative.

When you click on the link, you will download a zip file and will have to unzip it on your computer. The poster is in PowerPoint format.
