The World of Biological Databases

The Homolog.us blog is written by professional janitors dedicated to cleaning up US science. During lunch breaks and other time off from the job, we discuss bioinformatics. The name 'homolog.us' is not a spelling mistake; it is derived by taking the Arabic translation of the 'O' in the original word.

Please follow us on twitter – @homolog_us.

-------------------------------------------------------------------------------------------------------------

The world of biological databases is a gigantic mess, and the problem gets bigger as we go from genome sequences to transcriptomes and other expression-related datasets. The mess is expected to get messier as a growing volume of NGS sequences becomes available. Let us explain in more detail.

To understand the larger context, we need to start from the principles of evolution. Life on earth originated and evolved from a common ancestor over billions of years. The common ancestor of vertebrates likely appeared hundreds of millions of years ago, during the Cambrian explosion. Externally, different organisms evolved unique sets of body parts to address different environmental challenges, but those body parts, such as fish fins, bird wings and human legs, were built from the same genetic toolkit. In that sense, a researcher investigating the fish eye should be able to draw on transcriptional studies of the human eye or the mouse eye.

Unfortunately, transcriptome data sets from different sources are not easily comparable, and the difficulty becomes more acute as we move further apart on the evolutionary tree. Often the reason is more political than technical. Scientific communities are divided into groups such as ‘cancer researchers’, ‘entomologists’, ‘fish geneticists’, ‘neurologists’, ‘plant biologists’, etc. Cross-comparison of the data generated by those groups is barely possible at the level of genome sequences, and when it comes to gene expression, all bets are off.
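To make the comparison problem concrete, here is a minimal sketch of what cross-species comparison requires even in the best case: expression values from two organisms can only be lined up through a table of orthologous genes, and anything missing from that table simply drops out. Everything in it is an assumption made for illustration. The function name `compare_by_ortholog`, the gene IDs, the ortholog pairs and the expression values are all made up and do not come from any real database.

```python
# Toy illustration only: gene IDs, ortholog pairs and expression values are made up.

def compare_by_ortholog(expr_a, expr_b, orthologs):
    """Pair expression values from two species through an ortholog table.

    expr_a, expr_b: dict mapping gene ID -> expression value
    orthologs: iterable of (gene_in_a, gene_in_b) pairs
    Returns (paired rows, genes of species A that could not be compared).
    """
    paired = []
    matched_a = set()
    for gene_a, gene_b in orthologs:
        # A pair is usable only if both genes were actually measured.
        if gene_a in expr_a and gene_b in expr_b:
            paired.append((gene_a, gene_b, expr_a[gene_a], expr_b[gene_b]))
            matched_a.add(gene_a)
    unmatched = sorted(set(expr_a) - matched_a)
    return paired, unmatched


# Hypothetical eye expression data from two communities' databases.
fish_eye = {"fish_g1": 12.0, "fish_g2": 3.5, "fish_g3": 0.8}
mouse_eye = {"mouse_g1": 10.2, "mouse_g2": 4.1}
# Hypothetical ortholog table; "mouse_g9" was never measured in the mouse set.
orthologs = [("fish_g1", "mouse_g1"), ("fish_g2", "mouse_g9")]

paired, unmatched = compare_by_ortholog(fish_eye, mouse_eye, orthologs)
print(paired)
print(unmatched)
```

In this toy case only one of the three fish genes survives the join. Real data sets add further obstacles that the sketch ignores, such as different gene identifiers, platforms and normalization schemes across databases.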

Here is a large list, on a wiki page, of the biological databases created to date. We are sure the list is far from complete, because we checked for two or three databases that we are familiar with and they were not included. For example, the Signal database at the Salk Institute is a very useful transcriptome database for plants that is not mentioned on the wiki page. The Rise database from BGI is helpful to researchers working on rice. SpBase hosts data for the sea urchin community. Readers are welcome to mention any other useful database that is not included on the wiki page.

Useful links:
Wiki: list of biological databases
Metadatabase: a database that links to all biological databases
GEO: NCBI gene expression repository
SRA: NCBI short read archive, which also hosts transcriptome data
Arrayexpress: gene expression repository at EBI
emouse: gene expression database for mouse




-------------------------------------------------------------------------------------------------------------
Heroes and Heroines of New Media--2014

I am strongly influenced by Charles Hugh Smith, who runs the insightful blog Of Two Minds. I hope he will not mind if I copy his style of acknowledging the supporters of our blog.

Our blog is deeply honored by the generous contribution of the following readers. Without their patronage, this site would go away.

Outstandingly Generous:  
Amemiya C. Schnable J. Bowman B.

We are also looking for subscribers to help us finish the tutorials. Please see this post for details.
