Quantitative research
Quantitative research is a research strategy that focuses on quantifying the collection and analysis of data. It is formed from a deductive approach where emphasis is placed on the testing of theory, shaped by empiricist and positivist philosophies. Associated with the natural, applied, formal, and social sciences, this research strategy promotes the objective empirical investigation of observable phenomena to test and understand relationships.
Exploratory data analysis
In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods. A statistical model may or may not be used, but EDA is primarily for seeing what the data can tell us beyond formal modeling, and it thereby contrasts with traditional hypothesis testing. Exploratory data analysis has been promoted by John Tukey since 1970 to encourage statisticians to explore the data and possibly formulate hypotheses that could lead to new data collection and experiments.
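As a small illustration of summarizing a data set's main characteristics before any formal modeling, the sketch below computes a five-number summary (minimum, lower quartile, median, upper quartile, maximum), a summary commonly associated with Tukey's approach to EDA. The function name `five_number_summary`, the sample values, and the choice of the "inclusive" quartile method are assumptions made for this example; other quartile conventions give slightly different values.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a sequence of at least two numbers."""
    ordered = sorted(values)
    # quantiles(n=4) returns the three cut points Q1, median, Q3.
    # method="inclusive" treats the data as the whole population
    # (an assumption made for this sketch).
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Hypothetical measurements, made up for illustration only.
sample = [7, 1, 9, 3, 5, 2, 8, 4, 6]
summary = five_number_summary(sample)
print(summary)
```

In practice such a summary would usually be paired with a plot, such as a box plot or histogram, since EDA relies heavily on graphics.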
Biostatistics
Biostatistics (also known as biometry) is a branch of statistics that applies statistical methods to a wide range of topics in biology. It encompasses the design of biological experiments, the collection and analysis of data from those experiments, and the interpretation of the results. Biostatistical modeling forms an important part of numerous modern biological theories. Genetic studies have, since their beginning, used statistical concepts to understand observed experimental results.
Ensembl genome database project
The Ensembl genome database project is a scientific project at the European Bioinformatics Institute that provides a centralized resource for geneticists, molecular biologists, and other researchers studying the genomes of our own species and other vertebrates and model organisms. Ensembl is one of several well-known genome browsers for the retrieval of genomic information. Similar databases and browsers are found at NCBI and the University of California, Santa Cruz (UCSC).
Correlation function (statistical mechanics)
In statistical mechanics, the correlation function is a measure of the order in a system, as characterized by a mathematical correlation function. Correlation functions describe how microscopic variables, such as spin and density, at different positions are related. More specifically, correlation functions quantify how microscopic variables co-vary with one another on average across space and time. A classic example of such spatial correlations is in ferro- and antiferromagnetic materials, where the spins prefer to align parallel and antiparallel with their nearest neighbors, respectively.
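The ferro- and antiferromagnetic example can be made concrete with a toy calculation. The sketch below assumes an idealized one-dimensional chain of spins taking values +1 or -1 with periodic boundaries, and computes the average product of spins separated by a distance r. The function name `spin_correlation` and the perfectly ordered configurations are hypothetical choices for illustration; real systems at finite temperature would require averaging over many fluctuating configurations.

```python
def spin_correlation(spins, r):
    """Average of s_i * s_(i+r) over a 1D chain with periodic boundaries."""
    n = len(spins)
    return sum(spins[i] * spins[(i + r) % n] for i in range(n)) / n


# Idealized, perfectly ordered chains (assumed for illustration only).
ferro = [1] * 8          # all spins parallel
antiferro = [1, -1] * 4  # neighboring spins antiparallel

for r in (1, 2):
    print(r, spin_correlation(ferro, r), spin_correlation(antiferro, r))
```

For the parallel chain the correlation is +1 at every separation, while for the alternating chain it is -1 between nearest neighbors and +1 between next-nearest neighbors, reflecting the antiparallel ordering.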
Software industry
The software industry includes businesses that develop, maintain, and publish software under different business models, mainly either "license/maintenance based" (on-premises) or "cloud based" (such as SaaS, PaaS, IaaS, MBaaS, MSaaS, DCaaS, etc.). The industry also includes software services, such as training, documentation, consulting, and data recovery. The software and computer services industry spends more than 11% of its net sales on research and development, the second-highest share among industries after pharmaceuticals and biotechnology.
Commercial software
Commercial software, or, rarely, payware, is computer software that is produced for sale or that serves commercial purposes. Commercial software can be proprietary software or free and open-source software. While creating software by programming is a time- and labor-intensive process, comparable to the creation of physical goods, reproducing, duplicating, and sharing software as a digital good is disproportionately easy by comparison. Unlike almost all physical goods and products, it requires no special machines or expensive additional resources.
Pan-genome
In the fields of molecular biology and genetics, a pan-genome (pangenome or supragenome) is the entire set of genes from all strains within a clade. More generally, it is the union of all the genomes of a clade. The pan-genome can be broken down into a "core pangenome" that contains genes present in all individuals, a "shell pangenome" that contains genes present in two or more strains, and a "cloud pangenome" that contains genes only found in a single strain.
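The core/shell/cloud breakdown can be sketched with simple set operations. The example below assumes each strain is represented as a set of gene identifiers and that the shell excludes genes already counted as core, that is, shell genes occur in two or more strains but not in all of them. The function name `classify_pangenome`, the strain labels, and the gene names are hypothetical.

```python
from collections import Counter


def classify_pangenome(strain_genes):
    """Split the genes of a clade into (core, shell, cloud) sets.

    strain_genes maps a strain name to the set of genes found in that strain.
    """
    n_strains = len(strain_genes)
    # Count in how many strains each gene occurs.
    counts = Counter(g for genes in strain_genes.values() for g in set(genes))
    core = {g for g, c in counts.items() if c == n_strains}
    cloud = {g for g, c in counts.items() if c == 1 and n_strains > 1}
    # Assumption: shell = everything that is neither core nor cloud.
    shell = set(counts) - core - cloud
    return core, shell, cloud


# Made-up strains and genes, for illustration only.
strains = {
    "A": {"g1", "g2", "g3"},
    "B": {"g1", "g2", "g4"},
    "C": {"g1", "g5"},
}
core, shell, cloud = classify_pangenome(strains)
pan_genome = core | shell | cloud  # union of all genomes in the clade
print(sorted(core), sorted(shell), sorted(cloud))
```

Here the pan-genome is recovered as the union of the three categories, matching the description of it as the union of all genomes in the clade.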
Sequence profiling tool
A sequence profiling tool in bioinformatics is a type of software that presents information related to a genetic sequence, gene name, or keyword input. Such tools generally take a query such as a DNA, RNA, or protein sequence or ‘keyword’ and search one or more databases for information related to that sequence. Summaries and aggregate results are provided in a standardized format, describing information that would otherwise have required visits to many smaller sites or direct literature searches to compile.