InstrumentationInstrumentation is a collective term for measuring instruments, used for indicating, measuring and recording physical quantities. It is also a field of study about the art and science about making measurement instruments, involving the related areas of metrology, automation, and control theory. The term has its origins in the art and science of scientific instrument-making. Instrumentation can refer to devices as simple as direct-reading thermometers, or as complex as multi-sensor components of industrial control systems.
Statistical model validationIn statistics, model validation is the task of evaluating whether a chosen statistical model is appropriate or not. Oftentimes in statistical inference, inferences from models that appear to fit their data may be flukes, resulting in a misunderstanding by researchers of the actual relevance of their model. To combat this, model validation is used to test whether a statistical model can hold up to permutations in the data.
Cross-validation (statistics)Cross-validation, sometimes called rotation estimation or out-of-sample testing, is any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set. Cross-validation is a resampling method that uses different portions of the data to test and train a model on different iterations. It is mainly used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice.
Pulp (paper)Pulp is a lignocellulosic fibrous material prepared by chemically or mechanically separating cellulose fibers from wood, fiber crops, waste paper, or rags. Mixed with water and other chemical or plant-based additives, pulp is the major raw material used in papermaking and the industrial production of other paper products. Before the widely acknowledged invention of papermaking by Cai Lun in China around 105 AD, paper-like writing materials such as papyrus and amate were produced by ancient civilizations using plant materials which were largely unprocessed.
Regression validationIn statistics, regression validation is the process of deciding whether the numerical results quantifying hypothesized relationships between variables, obtained from regression analysis, are acceptable as descriptions of the data. The validation process can involve analyzing the goodness of fit of the regression, analyzing whether the regression residuals are random, and checking whether the model's predictive performance deteriorates substantially when applied to data that were not used in model estimation.
Environmental effects of paperThe environmental effects of paper are significant, which has led to changes in industry and behaviour at both business and personal levels. With the use of modern technology such as the printing press and the highly mechanized harvesting of wood, disposable paper became a relatively cheap commodity, which led to a high level of consumption and waste. The rise in global environmental issues such as air and water pollution, climate change, overflowing landfills and clearcutting have all lead to increased government regulations.
Training, validation, and test data setsIn machine learning, a common task is the study and construction of algorithms that can learn from and make predictions on data. Such algorithms function by making data-driven predictions or decisions, through building a mathematical model from input data. These input data used to build the model are usually divided into multiple data sets. In particular, three data sets are commonly used in different stages of the creation of the model: training, validation, and test sets.
Sampling (statistics)In statistics, quality assurance, and survey methodology, sampling is the selection of a subset or a statistical sample (termed sample for short) of individuals from within a statistical population to estimate characteristics of the whole population. Statisticians attempt to collect samples that are representative of the population. Sampling has lower costs and faster data collection compared to recording data from the entire population, and thus, it can provide insights in cases where it is infeasible to measure an entire population.
Sample size determinationSample size determination is the act of choosing the number of observations or replicates to include in a statistical sample. The sample size is an important feature of any empirical study in which the goal is to make inferences about a population from a sample. In practice, the sample size used in a study is usually determined based on the cost, time, or convenience of collecting the data, and the need for it to offer sufficient statistical power.
Statistical inferenceStatistical inference is the process of using data analysis to infer properties of an underlying distribution of probability. Inferential statistical analysis infers properties of a population, for example by testing hypotheses and deriving estimates. It is assumed that the observed data set is sampled from a larger population. Inferential statistics can be contrasted with descriptive statistics. Descriptive statistics is solely concerned with properties of the observed data, and it does not rest on the assumption that the data come from a larger population.