Student's t-testA t-test is a type of statistical analysis used to compare the averages of two groups and determine if the differences between them are more likely to arise from random chance. It is any statistical hypothesis test in which the test statistic follows a Student's t-distribution under the null hypothesis. It is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known (typically, the scaling term is unknown and therefore a nuisance parameter).
Receiver operating characteristicA receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied. The ROC curve is the plot of the true positive rate (TPR) against the false positive rate (FPR), at various threshold settings. The ROC can also be thought of as a plot of the power as a function of the Type I Error of the decision rule (when the performance is calculated from just a sample of the population, it can be thought of as estimators of these quantities).
Rough setIn computer science, a rough set, first described by Polish computer scientist Zdzisław I. Pawlak, is a formal approximation of a crisp set (i.e., conventional set) in terms of a pair of sets which give the lower and the upper approximation of the original set. In the standard version of rough set theory (Pawlak 1991), the lower- and upper-approximation sets are crisp sets, but in other variations, the approximating sets may be fuzzy sets. The following section contains an overview of the basic framework of rough set theory, as originally proposed by Zdzisław I.
False positives and false negativesA false positive is an error in binary classification in which a test result incorrectly indicates the presence of a condition (such as a disease when the disease is not present), while a false negative is the opposite error, where the test result incorrectly indicates the absence of a condition when it is actually present. These are the two kinds of errors in a binary test, in contrast to the two kinds of correct result (a and a ).
Supervised learningSupervised learning (SL) is a paradigm in machine learning where input objects (for example, a vector of predictor variables) and a desired output value (also known as human-labeled supervisory signal) train a model. The training data is processed, building a function that maps new data on expected output values. An optimal scenario will allow for the algorithm to correctly determine output values for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way (see inductive bias).
False positive rateIn statistics, when performing multiple comparisons, a false positive ratio (also known as fall-out or false alarm ratio) is the probability of falsely rejecting the null hypothesis for a particular test. The false positive rate is calculated as the ratio between the number of negative events wrongly categorized as positive (false positives) and the total number of actual negative events (regardless of classification). The false positive rate (or "false alarm rate") usually refers to the expectancy of the false positive ratio.
Feature engineeringFeature engineering or feature extraction or feature discovery is the process of extracting features (characteristics, properties, attributes) from raw data. Due to deep learning networks, such as convolutional neural networks, that are able to learn it by itself, domain-specific- based feature engineering has become obsolete for vision and speech processing.
Universal setIn set theory, a universal set is a set which contains all objects, including itself. In set theory as usually formulated, it can be proven in multiple ways that a universal set does not exist. However, some non-standard variants of set theory include a universal set. Many set theories do not allow for the existence of a universal set. There are several different arguments for its non-existence, based on different choices of axioms for set theory. In Zermelo–Fraenkel set theory, the axiom of regularity and axiom of pairing prevent any set from containing itself.
Comparative advantageIn an economic model, agents have a comparative advantage over others in producing a particular good if they can produce that good at a lower relative opportunity cost or autarky price, i.e. at a lower relative marginal cost prior to trade. Comparative advantage describes the economic reality of the work gains from trade for individuals, firms, or nations, which arise from differences in their factor endowments or technological progress.
Fuzzy setIn mathematics, fuzzy sets (a.k.a. uncertain sets) are sets whose elements have degrees of membership. Fuzzy sets were introduced independently by Lotfi A. Zadeh in 1965 as an extension of the classical notion of set. At the same time, defined a more general kind of structure called an L-relation, which he studied in an abstract algebraic context. Fuzzy relations, which are now used throughout fuzzy mathematics and have applications in areas such as linguistics , decision-making , and clustering , are special cases of L-relations when L is the unit interval [0, 1].