Data science: Data science is an interdisciplinary academic field that uses statistics, scientific computing, scientific methods, processes, algorithms and systems to extract or extrapolate knowledge and insights from noisy, structured, and unstructured data. Data science also integrates domain knowledge from the underlying application domain (e.g., natural sciences, information technology, and medicine). Data science is multifaceted and can be described as a science, a research paradigm, a research method, a discipline, a workflow, and a profession.
Biopsy: A biopsy is a medical test commonly performed by a surgeon, an interventional radiologist, or an interventional cardiologist. The process involves extraction of sample cells or tissues for examination to determine the presence or extent of a disease. The tissue is then fixed, dehydrated, embedded, sectioned, stained and mounted before it is generally examined under a microscope by a pathologist; it may also be analyzed chemically. When an entire lump or suspicious area is removed, the procedure is called an excisional biopsy.
Granulocyte-macrophage colony-stimulating factor: Granulocyte-macrophage colony-stimulating factor (GM-CSF), also known as colony-stimulating factor 2 (CSF2), is a monomeric glycoprotein secreted by macrophages, T cells, mast cells, natural killer cells, endothelial cells and fibroblasts that functions as a cytokine. The pharmaceutical analogs of naturally occurring GM-CSF are called sargramostim and molgramostim. Unlike granulocyte colony-stimulating factor, which specifically promotes neutrophil proliferation and maturation, GM-CSF affects more cell types, especially macrophages and eosinophils.
Neutrophil: Neutrophils (also known as neutrocytes, heterophils or polymorphonuclear leukocytes) are a type of white blood cell. More specifically, they form the most abundant type of granulocytes and make up 40% to 70% of all white blood cells in humans. They form an essential part of the innate immune system, with their functions varying in different animals. They are formed from stem cells in the bone marrow and differentiate into subpopulations of neutrophil-killers and neutrophil-cagers.
Data dredging: Data dredging (also known as data snooping or p-hacking) is the misuse of data analysis to find patterns in data that can be presented as statistically significant, thus dramatically increasing the risk of false positives while understating it. This is done by performing many statistical tests on the data and reporting only those that come back with significant results.
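The effect of running many tests and reporting only the significant ones can be sketched numerically. The following is a minimal illustration, not a definitive treatment: it assumes independent tests, a significance level of 0.05, and true null hypotheses throughout (so every "significant" result is a false positive). The function names `familywise_error_rate` and `simulate_dredging` are hypothetical, introduced only for this example.

```python
import random


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among independent tests of true nulls."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_dredging(num_tests: int, alpha: float = 0.05,
                      trials: int = 20000, seed: int = 0) -> float:
    """Estimate the same probability by simulation.

    Assumption: under a true null hypothesis, the p-value of a continuous
    test is uniformly distributed on [0, 1], so each test is modelled as a
    single uniform draw.
    """
    rng = random.Random(seed)
    studies_with_a_hit = 0
    for _ in range(trials):
        # One "study": run num_tests tests, keep only whether any was significant.
        if any(rng.random() < alpha for _ in range(num_tests)):
            studies_with_a_hit += 1
    return studies_with_a_hit / trials


if __name__ == "__main__":
    for k in (1, 5, 20, 100):
        print(k, round(familywise_error_rate(k), 3), round(simulate_dredging(k), 3))
```

Under these assumptions, a single test has a 5% false-positive risk, but reporting the best of 20 tests raises the chance of at least one spurious "finding" to roughly 64%, which is the gap between the stated and the actual risk that the entry describes.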
Acute myeloid leukemia: Acute myeloid leukemia (AML) is a cancer of the myeloid line of blood cells, characterized by the rapid growth of abnormal cells that build up in the bone marrow and blood and interfere with normal blood cell production. Symptoms may include feeling tired, shortness of breath, easy bruising and bleeding, and increased risk of infection. Occasionally, spread may occur to the brain, skin, or gums. As an acute leukemia, AML progresses rapidly, and is typically fatal within weeks or months if left untreated.
Testing hypotheses suggested by the data: In statistics, hypotheses suggested by a given dataset, when tested with the same dataset that suggested them, are likely to be accepted even when they are not true. This is because circular reasoning (double dipping) would be involved: something seems true in the limited data set; therefore we hypothesize that it is true in general; therefore we wrongly test it on the same, limited data set, which seems to confirm that it is true.
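The circularity described above can be illustrated with a small simulation. This is a sketch under stated assumptions: every feature and the target are pure Gaussian noise, so no real relationship exists; the "hypothesis suggested by the data" is simply the feature most correlated with the target. The names `pearson` and `double_dipping_demo`, and all parameter values, are hypothetical choices for this example.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def double_dipping_demo(num_features=100, num_samples=30, repeats=100, seed=0):
    """Return (mean |r| on the suggesting data, mean |r| on fresh data)."""
    rng = random.Random(seed)
    same_data, fresh_data = [], []
    for _ in range(repeats):
        target = [rng.gauss(0, 1) for _ in range(num_samples)]
        features = [[rng.gauss(0, 1) for _ in range(num_samples)]
                    for _ in range(num_features)]
        # Hypothesis suggested by the data: the best-correlated feature.
        best = max(features, key=lambda f: abs(pearson(f, target)))
        # Circular test: evaluate the hypothesis on the data that suggested it.
        same_data.append(abs(pearson(best, target)))
        # Proper test: everything is noise, so a fresh sample of the chosen
        # feature and the target is just a new pair of independent draws.
        new_feature = [rng.gauss(0, 1) for _ in range(num_samples)]
        new_target = [rng.gauss(0, 1) for _ in range(num_samples)]
        fresh_data.append(abs(pearson(new_feature, new_target)))
    return sum(same_data) / repeats, sum(fresh_data) / repeats


if __name__ == "__main__":
    same, fresh = double_dipping_demo()
    print(f"mean |r| on the suggesting data: {same:.2f}")
    print(f"mean |r| on fresh data:          {fresh:.2f}")
```

In this setup the selected feature looks clearly correlated on the data that suggested it, while the same hypothesis evaluated on fresh data shows only the weak correlation expected from noise.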
Endothelium: The endothelium (pl.: endothelia) is a single layer of squamous endothelial cells that line the interior surface of blood vessels and lymphatic vessels. The endothelium forms an interface between circulating blood or lymph in the lumen and the rest of the vessel wall. Endothelial cells form the barrier between vessels and tissue and control the flow of substances and fluid into and out of a tissue. Endothelial cells in direct contact with blood are called vascular endothelial cells, whereas those in direct contact with lymph are known as lymphatic endothelial cells.
Large numbers: Large numbers are numbers significantly larger than those typically used in everyday life (for instance in simple counting or in monetary transactions), appearing frequently in fields such as mathematics, cosmology, cryptography, and statistical mechanics. They are typically large positive integers, or more generally, large positive real numbers, but may also be other numbers in other contexts. Googology is the study of nomenclature and properties of large numbers.
Exploratory data analysis: In statistics, exploratory data analysis (EDA) is an approach of analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling, and it thereby contrasts with traditional hypothesis testing. Exploratory data analysis has been promoted by John Tukey since 1970 to encourage statisticians to explore the data, and possibly formulate hypotheses that could lead to new data collection and experiments.
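As one illustration of summarizing a data set's main characteristics, the five-number summary (minimum, lower quartile, median, upper quartile, maximum) is a summary commonly associated with EDA; the entry above does not name it, so treat it as an example choice. The sketch below assumes Python 3.8 or later for `statistics.quantiles`, uses the "inclusive" quartile method (other conventions give slightly different quartiles), and runs on made-up sample values. The function name `five_number_summary` is hypothetical.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a non-empty numeric sample."""
    ordered = sorted(values)
    # n=4 yields the three quartile cut points; "inclusive" treats the
    # sample minimum and maximum as the 0th and 100th percentiles.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


if __name__ == "__main__":
    sample = [12, 4, 7, 2, 15, 9, 4, 10, 5]  # illustrative values only
    print(five_number_summary(sample))
```

A summary like this is typically read alongside a plot such as a box plot or histogram rather than in place of one.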