Challenges in Assembling Fish Genomes


Assembling fish genomes is a complex task because their sequences contain large amounts of repeats and polymorphism. Lex Nederbragt from the Norwegian Sequencing Center has been working on two large fish genomes – those of Atlantic cod and Atlantic salmon. Readers may enjoy the slides he shared with us from one of his recent talks.

Lex’s slides are helpful for three reasons –

(i) They make no assumptions about what the reader knows about sequencing, and start from the basics. So even readers who are only beginning to dabble in sequencing projects can gain from his insights.

(ii) Lex is very knowledgeable about many different sequencing platforms, and recommends choosing the technology based on the problem at hand. Many bioinformaticians work in a mode where someone else (a non-bioinformatician) decides to sequence a number of libraries from a genome or transcriptome, then hands over the files and asks the bioinformatician to work their magic. It would be far more productive and efficient if the planning of a sequencing project took input from the bioinformaticians on which choices could improve their analysis. As an example, Lex showed how error-corrected PacBio reads could be used to improve the long-range quality of the assembly.

(iii) Lex was an ‘early adopter’ of PacBio reads in his assembly projects, and has successfully shown how to improve the quality of a de novo assembly by using those ‘noisy beasts’. When we ask others about PacBio, the opinions we hear are – (a) “we do not plan to use them, because they are too noisy”, (b) “we tried incorporating them into a project, but it did not go anywhere. In the meantime, we got too much unprocessed Illumina data to take care of.” Even the published success stories are mostly about bacterial genomes, or about finishing previously assembled large genomes with plenty of Sanger data (see Baylor paper here). Those starting large projects on previously unexplored genomes can learn a lot from Lex.

