Concept

Ensembl genome database project

Summary
Ensembl genome database project is a scientific project at the European Bioinformatics Institute, which provides a centralized resource for geneticists, molecular biologists and other researchers studying the genomes of our own species and other vertebrates and model organisms. Ensembl is one of several well known genome browsers for the retrieval of genomic information. Similar databases and browsers are found at NCBI and the University of California, Santa Cruz (UCSC). The human genome consists of three billion base pairs, which code for approximately 20,000–25,000 genes. However the genome alone is of little use, unless the locations and relationships of individual genes can be identified. One option is manual annotation, whereby a team of scientists tries to locate genes using experimental data from scientific journals and public databases. However this is a slow, painstaking task. The alternative, known as automated annotation, is to use the power of computers to do the complex pattern-matching of protein to DNA. The Ensembl project was launched in 1999 in response to the imminent completion of the Human Genome Project, with the initial goals of automatically annotate the human genome, integrate this annotation with available biological data and make all this knowledge publicly available. In the Ensembl project, sequence data are fed into the gene annotation system (a collection of software "pipelines" written in Perl) which creates a set of predicted gene locations and saves them in a MySQL database for subsequent analysis and display. Ensembl makes these data freely accessible to the world research community. All the data and code produced by the Ensembl project is available to download, and there is also a publicly accessible database server allowing remote access. In addition, the Ensembl website provides computer-generated visual displays of much of the data. Over time the project has expanded to include additional species (including key model organisms such as mouse, fruitfly and zebrafish) as well as a wider range of genomic data, including genetic variations and regulatory features.
About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.