Data PreprocessingData preprocessing can refer to manipulation or dropping of data before it is used in order to ensure or enhance performance, and is an important step in the data mining process. The phrase "garbage in, garbage out" is particularly applicable to data mining and machine learning projects. Data collection methods are often loosely controlled, resulting in out-of-range values, impossible data combinations, and missing values, amongst other issues. Analyzing data that has not been carefully screened for such problems can produce misleading results.
Clinical metagenomic sequencingClinical metagenomic next-generation sequencing (mNGS) is the comprehensive analysis of microbial and host genetic material (DNA or RNA) in clinical samples from patients by next-generation sequencing. It uses the techniques of metagenomics to identify and characterize the genome of bacteria, fungi, parasites, and viruses without the need for a prior knowledge of a specific pathogen directly from clinical specimens.
Exploratory data analysisIn statistics, exploratory data analysis (EDA) is an approach of analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling and thereby contrasts traditional hypothesis testing. Exploratory data analysis has been promoted by John Tukey since 1970 to encourage statisticians to explore the data, and possibly formulate hypotheses that could lead to new data collection and experiments.
Data warehouseIn computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis and is considered a core component of business intelligence. Data warehouses are central repositories of integrated data from one or more disparate sources. They store current and historical data in one single place that are used for creating analytical reports for workers throughout the enterprise. This is beneficial for companies as it enables them to interrogate and draw insights from their data and make decisions.
Data miningData mining is the process of extracting and discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal of extracting information (with intelligent methods) from a data set and transforming the information into a comprehensible structure for further use. Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD.
Transport phenomenaIn engineering, physics, and chemistry, the study of transport phenomena concerns the exchange of mass, energy, charge, momentum and angular momentum between observed and studied systems. While it draws from fields as diverse as continuum mechanics and thermodynamics, it places a heavy emphasis on the commonalities between the topics covered. Mass, momentum, and heat transport all share a very similar mathematical framework, and the parallels between them are exploited in the study of transport phenomena to draw deep mathematical connections that often provide very useful tools in the analysis of one field that are directly derived from the others.
Mass transferMass transfer is the net movement of mass from one location (usually meaning stream, phase, fraction or component) to another. Mass transfer occurs in many processes, such as absorption, evaporation, drying, precipitation, membrane filtration, and distillation. Mass transfer is used by different scientific disciplines for different processes and mechanisms. The phrase is commonly used in engineering for physical processes that involve diffusive and convective transport of chemical species within physical systems.
Development theoryDevelopment theory is a collection of theories about how desirable change in society is best achieved. Such theories draw on a variety of social science disciplines and approaches. In this article, multiple theories are discussed, as are recent developments with regard to these theories. Depending on which theory that is being looked at, there are different explanations to the process of development and their inequalities. Modernization theory Modernization theory is used to analyze the processes in which modernization in societies take place.
Nucleic acidNucleic acids are biopolymers, macromolecules, essential to all known forms of life. They are composed of nucleotides, which are the monomer components: a 5-carbon sugar, a phosphate group and a nitrogenous base. The two main classes of nucleic acids are deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). If the sugar is ribose, the polymer is RNA; if the sugar is deoxyribose, a version of ribose, the polymer is DNA. Nucleic acids are chemical compounds that are found in nature.
Preimplantation genetic diagnosisPreimplantation genetic diagnosis (PGD or PIGD) is the genetic profiling of embryos prior to implantation (as a form of embryo profiling), and sometimes even of oocytes prior to fertilization. PGD is considered in a similar fashion to prenatal diagnosis. When used to screen for a specific genetic disease, its main advantage is that it avoids selective abortion, as the method makes it highly likely that the baby will be free of the disease under consideration.