Publication

Towards mouse genetic-specific RNA-sequencing read mapping

Maxime Jan, Ioannis Xenarios
2022
Journal paper
Abstract

Genetic variations affect behavior and cause disease but understanding how these variants drive complex traits is still an open question. A common approach is to link the genetic variants to intermediate molecular phenotypes such as the transcriptome using RNA-sequencing (RNA-seq). Paradoxically, these variants between the samples are usually ignored at the beginning of RNA-seq analyses of many model organisms. This can skew the transcriptome estimates that are used later for downstream analyses, such as expression quantitative trait locus (eQTL) detection. Here, we assessed the impact of reference-based analysis on the transcriptome and eQTLs in a widely-used mouse genetic population: the BXD panel of recombinant inbred lines. We highlight existing reference bias in the transcriptome data analysis and propose practical solutions which combine available genetic variants, genotypes, and genome reference sequence. The use of custom BXD line references improved downstream analysis compared to classical genome reference. These insights would likely benefit genetic studies with a transcriptomic component and demonstrate that genome references need to be reassessed and improved.

About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Ontological neighbourhood
Related concepts (34)
Transcriptome
The transcriptome is the set of all RNA transcripts, including coding and non-coding, in an individual or a population of cells. The term can also sometimes be used to refer to all RNAs, or just mRNA, depending on the particular experiment. The term transcriptome is a portmanteau of the words transcript and genome; it is associated with the process of transcript production during the biological process of transcription. The early stages of transcriptome annotations began with cDNA libraries published in the 1980s.
Reference genome
A reference genome (also known as a reference assembly) is a digital nucleic acid sequence database, assembled by scientists as a representative example of the set of genes in one idealized individual organism of a species. As they are assembled from the sequencing of DNA from a number of individual donors, reference genomes do not accurately represent the set of genes of any single individual organism. Instead a reference provides a haploid mosaic of different DNA sequences from each donor.
Complex traits
Complex traits, also known as quantitative traits, are traits that do not behave according to simple Mendelian inheritance laws. More specifically, their inheritance cannot be explained by the genetic segregation of a single gene. Such traits show a continuous range of variation and are influenced by both environmental and genetic factors. Compared to strictly Mendelian traits, complex traits are far more common, and because they can be hugely polygenic, they are studied using statistical techniques such as quantitative genetics and quantitative trait loci (QTL) mapping rather than classical genetics methods.
Show more
Related publications (80)

Identifying genetic and dietary modulators of metabolic disorders using systems genetics

Xiaoxu Li

Long-term consumption of lipid-rich foods can contribute to common metabolic diseases and systemic low-grade inflammation. However, dietary responses and the development of non-communicable diseases are shaped by genetic factors and gene-by-environment int ...
EPFL2023

Evaluation of genetic demultiplexing of single-cell sequencing data from model species

Clement Helsens

Single-cell sequencing (sc-seq) provides a species agnostic tool to study cellular processes. However, these technologies are expensive and require sufficient cell quantities and biological replicates to avoid artifactual results. An option to address thes ...
LIFE SCIENCE ALLIANCE LLC2023

Genome-Wide Association Study of Alzheimer's Disease Brain Imaging Biomarkers and Neuropsychological Phenotypes in the European Medical Information Framework for Alzheimer's Disease Multimodal Biomarker Discovery Dataset

Christopher Clark

Alzheimer's disease (AD) is the most frequent neurodegenerative disease with an increasing prevalence in industrialized, aging populations. AD susceptibility has an established genetic basis which has been the focus of a large number of genome-wide associa ...
FRONTIERS MEDIA SA2022
Show more
Related MOOCs (6)
Neuroscience Reconstructed: Cell Biology
This course will provide the fundamental knowledge in neuroscience required to understand how the brain is organised and how function at multiple scales is integrated to give rise to cognition and beh
Neuroscience Reconstructed: Cell Biology
This course will provide the fundamental knowledge in neuroscience required to understand how the brain is organised and how function at multiple scales is integrated to give rise to cognition and beh
Neuroscience Reconstructed: Genetics and Brain Development
This course will provide the fundamental knowledge in neuroscience required to understand how the brain is organised and how function at multiple scales is integrated to give rise to cognition and beh
Show more

Graph Chatbot

Chat with Graph Search

Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.

DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.