The coding region of a gene, also known as the coding sequence (CDS), is the portion of a gene's DNA or RNA that codes for protein. Comparing the length, composition, regulation, splicing, structure, and function of coding regions with those of non-coding regions, across species and over time, can reveal a great deal about gene organization and the evolution of prokaryotes and eukaryotes. This can in turn assist in mapping the human genome and developing gene therapy.

Although the term is sometimes used interchangeably with exon, the two are not the same: exons comprise the coding region as well as the 3' and 5' untranslated regions of the RNA, so an exon is only partially made up of coding region. The 3' and 5' untranslated regions, which do not code for protein, are termed non-coding regions and are not discussed on this page. Coding regions are also often confused with exomes, but the terms are distinct: the exome refers to all exons within a genome, whereas a coding region refers to a single section of DNA or RNA that codes for a specific protein.

In 1978, Walter Gilbert published "Why Genes in Pieces?", which first explored the idea that the gene is a mosaic: each full nucleic acid strand is not coded continuously but is interrupted by "silent" non-coding regions. This was the first indication that a distinction was needed between the parts of the genome that code for protein, now called coding regions, and those that do not.

Evidence suggests a general interdependence between base composition patterns and coding region availability. The coding region is thought to have a higher GC-content than non-coding regions. Further research found that the longer the coding strand, the higher its GC-content. Short coding strands remain comparatively GC-poor, similar to the low GC-content of the translational stop codons TAG, TAA, and TGA.
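
The GC-content comparison described above can be made concrete with a minimal sketch. The function name `gc_content` and the sequence `toy_cds` are hypothetical, chosen only for illustration; the toy sequence is made up and is not real genomic data. The sketch simply counts G and C bases and divides by the sequence length, and it shows why the three stop codons count as GC-poor.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


# The three translational stop codons named in the text.
STOP_CODONS = ("TAG", "TAA", "TGA")

# Hypothetical toy coding sequence (illustration only, not real data).
toy_cds = "ATGGCCGGCTGCTAA"

for codon in STOP_CODONS:
    # Each stop codon has at most one G or C out of three bases.
    print(codon, round(gc_content(codon), 2))

print("toy CDS", round(gc_content(toy_cds), 2))
```

Each stop codon has a GC fraction of at most one third (TAA has none), while the toy sequence was deliberately built to be GC-rich. Real analyses would compute the same ratio over annotated coding and non-coding regions of an actual genome.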

Related courses (6)
BIOENG-110: General Biology
The aim of the course is to provide a general overview of the biology of cells and organisms. We will discuss this in the context of the life of cells and organisms, with an emphasis on
BIOENG-320: Synthetic biology
This advanced Bachelor/Master level course will cover fundamentals and approaches at the interface of biology, chemistry, engineering and computer science for diverse fields of synthetic biology. This
BIO-105: Cellular biology and biochemistry for engineers
Basic course in biochemistry as well as cellular and molecular biology for non-life science students enrolling at the Master or PhD thesis level from various engineering disciplines. It reviews essent
Related concepts (22)
Gene
In biology, the word gene (from Greek γένος, génos, meaning generation, birth, or gender) can have several different meanings. The Mendelian gene is a basic unit of heredity, and the molecular gene is a sequence of nucleotides in DNA that is transcribed to produce a functional RNA. There are two types of molecular genes: protein-coding genes and noncoding genes. During gene expression, the DNA is first copied into RNA. The RNA can be directly functional or can serve as the intermediate template for a protein that performs a function.
Single-nucleotide polymorphism
In genetics and bioinformatics, a single-nucleotide polymorphism (SNP, pronounced "snip"; plural SNPs) is a germline substitution of a single nucleotide at a specific position in the genome that is present in a sufficiently large fraction of the population considered (generally regarded as 1% or more). For example, a G nucleotide present at a specific location in a reference genome may be replaced by an A in a minority of individuals. The two possible nucleotide variations of this SNP, G or A, are called alleles.
Frameshift mutation
A frameshift mutation (also called a framing error or a reading frame shift) is a genetic mutation caused by indels (insertions or deletions) of a number of nucleotides in a DNA sequence that is not divisible by three. Due to the triplet nature of gene expression by codons, the insertion or deletion can change the reading frame (the grouping of the codons), resulting in a completely different translation from the original. The earlier in the sequence the deletion or insertion occurs, the more altered the protein.
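
The effect of a frameshift on the reading frame can be sketched in a few lines. Everything named here (`translate`, `original`, `frameshifted`, `in_frame`) is hypothetical and for illustration only, and the sequence is a made-up toy example. The codon table is the standard genetic code written in the conventional T, C, A, G ordering, with `*` marking stop codons.

```python
from itertools import product

# Standard genetic code in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a DNA sequence codon by codon, stopping at the first stop codon."""
    protein = []
    # Trailing bases that do not form a complete codon are ignored.
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy sequence: ATG GCC AAA GGT TGA
original = "ATGGCCAAAGGTTGA"
# Delete one base (not divisible by three): the reading frame shifts.
frameshifted = original[:3] + original[4:]
# Delete three bases (one whole codon): the reading frame is preserved.
in_frame = original[:3] + original[6:]

print(translate(original))      # MAKG
print(translate(frameshifted))  # MPKV
print(translate(in_frame))      # MKG
```

Deleting a single base changes every downstream codon, whereas deleting a whole codon removes one amino acid and leaves the rest of the translation intact.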
