Data scienceData science is an interdisciplinary academic field that uses statistics, scientific computing, scientific methods, processes, algorithms and systems to extract or extrapolate knowledge and insights from noisy, structured, and unstructured data. Data science also integrates domain knowledge from the underlying application domain (e.g., natural sciences, information technology, and medicine). Data science is multifaceted and can be described as a science, a research paradigm, a research method, a discipline, a workflow, and a profession.
Regression analysisIn statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one or more independent variables (often called 'predictors', 'covariates', 'explanatory variables' or 'features'). The most common form of regression analysis is linear regression, in which one finds the line (or a more complex linear combination) that most closely fits the data according to a specific mathematical criterion.
Condorcet methodA Condorcet method (pronkɒndɔrˈseɪ; kɔ̃dɔʁsɛ) is an election method that elects the candidate who wins a majority of the vote in every head-to-head election against each of the other candidates, that is, a candidate preferred by more voters than any others, whenever there is such a candidate. A candidate with this property, the pairwise champion or beats-all winner, is formally called the Condorcet winner. The head-to-head elections need not be done separately; a voter's choice within any given pair can be determined from the ranking.
Robust statisticsRobust statistics are statistics with good performance for data drawn from a wide range of probability distributions, especially for distributions that are not normal. Robust statistical methods have been developed for many common problems, such as estimating location, scale, and regression parameters. One motivation is to produce statistical methods that are not unduly affected by outliers. Another motivation is to provide methods with good performance when there are small departures from a parametric distribution.
Linear regressionIn statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression. This term is distinct from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.
Decision-making softwareDecision-making software (DM software) is software for computer applications that help individuals and organisations make choices and take decisions, typically by ranking, prioritizing or choosing from a number of options. An early example of DM software was described in 1973. Before the advent of the World Wide Web, most DM software was spreadsheet-based, with the first web-based DM software appearing in the mid-1990s. Nowadays, many DM software products (mostly web-based) are available – e.g.
Schulze methodThe Schulze method (ˈʃʊltsə) is an electoral system developed in 1997 by Markus Schulze that selects a single winner using votes that express preferences. The method can also be used to create a sorted list of winners. The Schulze method is also known as Schwartz Sequential dropping (SSD), cloneproof Schwartz sequential dropping (CSSD), the beatpath method, beatpath winner, path voting, and path winner.
System of units of measurementA system of units of measurement, also known as a system of units or system of measurement, is a collection of units of measurement and rules relating them to each other. Systems of measurement have historically been important, regulated and defined for the purposes of science and commerce. Instances in use include the International System of Units or () (the modern form of the metric system), the British imperial system, and the United States customary system.
Borda countThe Borda count is a family of positional voting rules which gives each candidate, for each ballot, a number of points corresponding to the number of candidates ranked lower. In the original variant, the lowest-ranked candidate gets 0 points, the next-lowest gets 1 point, etc., and the highest-ranked candidate gets n − 1 points, where n is the number of candidates. Once all votes have been counted, the option or candidate with the most points is the winner.
Logistic regressionIn statistics, the logistic model (or logit model) is a statistical model that models the probability of an event taking place by having the log-odds for the event be a linear combination of one or more independent variables. In regression analysis, logistic regression (or logit regression) is estimating the parameters of a logistic model (the coefficients in the linear combination).