Transcription (biology)Transcription is the process of copying a segment of DNA into RNA. The segments of DNA transcribed into RNA molecules that can encode proteins are said to produce messenger RNA (mRNA). Other segments of DNA are copied into RNA molecules called non-coding RNAs (ncRNAs). mRNA comprises only 1–3% of total RNA samples. Less than 2% of the human genome can be transcribed into mRNA (Human genome#Coding vs. noncoding DNA), while at least 80% of mammalian genomic DNA can be actively transcribed (in one or more types of cells), with the majority of this 80% considered to be ncRNA.
GeneIn biology, the word gene (from γένος, génos; meaning generation or birth or gender) can have several different meanings. The Mendelian gene is a basic unit of heredity and the molecular gene is a sequence of nucleotides in DNA that is transcribed to produce a functional RNA. There are two types of molecular genes: protein-coding genes and noncoding genes. During gene expression, the DNA is first copied into RNA. The RNA can be directly functional or be the intermediate template for a protein that performs a function.
De novo gene birthDe novo gene birth is the process by which new genes evolve from DNA sequences that were ancestrally non-genic. De novo genes represent a subset of novel genes, and may be protein-coding or instead act as RNA genes. The processes that govern de novo gene birth are not well understood, although several models exist that describe possible mechanisms by which de novo gene birth may occur. Although de novo gene birth may have occurred at any point in an organism's evolutionary history, ancient de novo gene birth events are difficult to detect.
Coding regionThe coding region of a gene, also known as the coding sequence (CDS), is the portion of a gene's DNA or RNA that codes for protein. Studying the length, composition, regulation, splicing, structures, and functions of coding regions compared to non-coding regions over different species and time periods can provide a significant amount of important information regarding gene organization and evolution of prokaryotes and eukaryotes. This can further assist in mapping the human genome and developing gene therapy.
Conserved sequenceIn evolutionary biology, conserved sequences are identical or similar sequences in nucleic acids (DNA and RNA) or proteins across species (orthologous sequences), or within a genome (paralogous sequences), or between donor and receptor taxa (xenologous sequences). Conservation indicates that a sequence has been maintained by natural selection. A highly conserved sequence is one that has remained relatively unchanged far back up the phylogenetic tree, and hence far back in geological time.
Regulator geneA regulator gene, regulator, or regulatory gene is a gene involved in controlling the expression of one or more other genes. Regulatory sequences, which encode regulatory genes, are often at the five prime end (5') to the start site of transcription of the gene they regulate. In addition, these sequences can also be found at the three prime end (3') to the transcription start site. In both cases, whether the regulatory sequence occurs before (5') or after (3') the gene it regulates, the sequence is often many kilobases away from the transcription start site.
Reference genomeA reference genome (also known as a reference assembly) is a digital nucleic acid sequence database, assembled by scientists as a representative example of the set of genes in one idealized individual organism of a species. As they are assembled from the sequencing of DNA from a number of individual donors, reference genomes do not accurately represent the set of genes of any single individual organism. Instead a reference provides a haploid mosaic of different DNA sequences from each donor.
Human geneticsHuman genetics is the study of inheritance as it occurs in human beings. Human genetics encompasses a variety of overlapping fields including: classical genetics, cytogenetics, molecular genetics, biochemical genetics, genomics, population genetics, developmental genetics, clinical genetics, and genetic counseling. Genes are the common factor of the qualities of most human-inherited traits. Study of human genetics can answer questions about human nature, can help understand diseases and the development of effective treatment and help us to understand the genetics of human life.
Genome projectGenome projects are scientific endeavours that ultimately aim to determine the complete genome sequence of an organism (be it an animal, a plant, a fungus, a bacterium, an archaean, a protist or a virus) and to annotate protein-coding genes and other important genome-encoded features. The genome sequence of an organism includes the collective DNA sequences of each chromosome in the organism. For a bacterium containing a single chromosome, a genome project will aim to map the sequence of that chromosome.
Whole genome sequencingWhole genome sequencing (WGS), also known as full genome sequencing, complete genome sequencing, or entire genome sequencing, is the process of determining the entirety, or nearly the entirety, of the DNA sequence of an organism's genome at a single time. This entails sequencing all of an organism's chromosomal DNA as well as DNA contained in the mitochondria and, for plants, in the chloroplast. Whole genome sequencing has largely been used as a research tool, but was being introduced to clinics in 2014.