Commercial Value of Efficient Metagenomics Pipeline

In his blog, Titus Brown asked for ideas to make his open-source algorithm discovery project more exciting. Here is one.

Assembling metagenomes from ocean water and soil samples may appear to have little commercial value, but a Boston-based startup suggests otherwise. Strictly speaking, we should not call it a startup, because Sanofi has already committed to buying the company provided it meets its goals.

Warp Drive is being launched with $125 million in funding from Third Rock and French pharmaceutical giant Sanofi (NYSE: SNY). Greylock Partners also participated in the financing. Warp Drive was co-founded by Greg Verdine, a Harvard University chemical biologist and venture partner at Third Rock, along with Harvard University genomics expert George Church, and biochemist James Wells of the University of California at San Francisco.


Warp Drive refers to its core platform as a “genomic search engine.” The company’s ultimate goal is to develop the technology to the point where it will be able to comb through naturally derived substances—such as plants and soil—and sequence the genomes of the microbes hidden in them. Ideally, says Borisy, the platform will be able to use that information to help scientists uncover new molecules that have the highest probability of hitting disease targets.

Nature has been one of the drug industry’s richest sources of pharmaceutical success stories. The menu of products that originated in the wild includes diabetes drug exenatide (Byetta), derived from Gila monster saliva; heart failure treatment digoxin, which comes from the foxglove plant; and ziconotide (Prialt), a pain treatment from the cone snail. “Nature is an incredible medicinal chemist,” Borisy says. “Nature can drug targets in ways we’ve never been able to figure out how to do.”

Metagenome assembly is a big part of their pipeline, according to the job description they posted. Do not laugh at the ‘minimum requirements’. The list was likely written by a manager with little bioinformatics experience, who threw in as many buzzwords as he could. Maybe Keith Robison of the Omics! Omics! blog, who is an employee there, and the author of Kevin’s GATTACA blog, who makes fun of such ridiculous job requirements, should have a chat.

Minimum Requirements:

Master’s Degree and 10+ years’ work experience or a Ph.D. and 6+ years relevant work experience in the biological sciences, computer science or computational science

Experience with de novo assembly of microbial or metagenomic genomes from short read or single molecule data using tools such as Ray, MIRA, Velvet, Celera, ALLPATHS

Experience using assembly refinement and scaffolding tools such as AMOS, SSPACE

Experience developing and maintaining tools in Perl utilizing BioPerl and other open source frameworks. Python/BioPython will also be considered. Experience in programming in R a plus.

Experience with short read data manipulation, analysis & trimming tools, e.g. samtools,

Experience developing SQL databases and executing complex queries (joins across many tables, recursive joins) on such. Experience with NoSQL databases a plus.

Experience working with Linux clusters, particularly the configuring and execution of jobs under Sun/Oracle Grid Engine.

Experience developing interactive web tools using HTML5.

General sequence analysis background with extensive use of BLAST, HMMER, CLUSTAL, PHYLIP and similar tools.

Excellent communication skills, including the production of clear scientific updates using PowerPoint and good scientific visualization skills

Able to work collaboratively in multi-disciplinary environment to accomplish program and company goals.

How do you go from metagenome assembly to drug discovery? By looking for gene clusters with certain signatures. We do not have time to explain, but you may start with the following two papers (among many):

Automated genome mining for natural products

The Natural Product Domain Seeker NaPDoS: A Phylogeny Based Bioinformatic Tool to Classify Secondary Metabolite Gene Diversity
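To give a rough flavor of what "looking for gene clusters with certain signatures" can mean, here is a minimal Python sketch. It is not the method of either paper, nor anything Warp Drive has described. It assumes the genes on an assembled contig have already been annotated with protein domains (for example by a profile-HMM search), and it simply groups neighboring genes that carry a "signature" domain. The domain labels (KS for ketosynthase, C for condensation, the two domain types NaPDoS classifies), the Gene record, the function name, the 10 kb gap and the minimum of two signature genes are all illustrative assumptions.

```python
from dataclasses import dataclass

# Assumed signature domains for illustration only: KS (ketosynthase, polyketide
# synthases) and C (condensation, nonribosomal peptide synthetases).
SIGNATURE_DOMAINS = frozenset({"KS", "C"})


@dataclass(frozen=True)
class Gene:
    """Hypothetical record for one annotated gene on a single contig."""
    gene_id: str
    start: int          # start coordinate on the contig (bp)
    end: int            # end coordinate on the contig (bp)
    domains: frozenset  # domain labels assigned by an upstream annotation step


def find_candidate_clusters(genes, max_gap=10_000, min_signature_genes=2):
    """Group signature-bearing genes whose neighbours lie within max_gap bp.

    Returns a list of clusters; each cluster is a list of Gene records that
    carry at least one signature domain. Genes without signature domains are
    ignored here, although a real tool would report them as part of the locus.
    """
    hits = sorted(
        (g for g in genes if g.domains & SIGNATURE_DOMAINS),
        key=lambda g: g.start,
    )
    clusters, current = [], []
    for gene in hits:
        # Close the running cluster when the next hit is too far away.
        if current and gene.start - current[-1].end > max_gap:
            if len(current) >= min_signature_genes:
                clusters.append(current)
            current = []
        current.append(gene)
    if len(current) >= min_signature_genes:
        clusters.append(current)
    return clusters


# Toy contig: g1 and g3 sit close together, g4 is an isolated hit.
example_genes = [
    Gene("g1", 1_000, 4_000, frozenset({"KS"})),
    Gene("g2", 4_500, 6_000, frozenset({"AT"})),
    Gene("g3", 6_500, 9_000, frozenset({"C"})),
    Gene("g4", 50_000, 53_000, frozenset({"KS"})),
]
example_clusters = find_candidate_clusters(example_genes)
cluster_ids = [[g.gene_id for g in cluster] for cluster in example_clusters]
```

On the toy contig, g1 and g3 end up in one candidate cluster, while the lone hit g4 is dropped. The quality of the assembly matters here: if a cluster is split across short contigs, no amount of clever domain searching will put it back together, which is why an efficient metagenome assembler is worth money to such a company.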


2 comments to Commercial Value of Efficient Metagenomics Pipeline

  • Titus Brown

    If someone wants to give me a $200k gift or a grant, I will solve this assembly problem for them ;).

  • admin

    Seems like they are spending $125 million to solve what you are doing. You need to raise your price-tag to get attention :)
