Computational statisticsComputational statistics, or statistical computing, is the bond between statistics and computer science. It means statistical methods that are enabled by using computational methods. It is the area of computational science (or scientific computing) specific to the mathematical science of statistics. This area is also developing rapidly, leading to calls that a broader concept of computing should be taught as part of general statistical education.
HomoiconicityIn computer programming, homoiconicity (from the Greek words homo- meaning "the same" and icon meaning "representation") is a property of some programming languages. A language is homoiconic if a program written in it can be manipulated as data using the language, and thus the program's internal representation can be inferred just by reading the program itself. This property is often summarized by saying that the language treats code as data.
Array programmingIn computer science, array programming refers to solutions that allow the application of operations to an entire set of values at once. Such solutions are commonly used in scientific and engineering settings. Modern programming languages that support array programming (also known as vector or multidimensional languages) have been engineered specifically to generalize operations on scalars to apply transparently to vectors, matrices, and higher-dimensional arrays.
Comparison of statistical packagesThe following tables compare general and technical information for a number of statistical analysis packages. Support for various ANOVA methods Support for various regression methods. Support for various time series analysis methods. Support for various statistical charts and diagrams.
S (programming language)S is a statistical programming language developed primarily by John Chambers and (in earlier versions) Rick Becker and Allan Wilks of Bell Laboratories. The aim of the language, as expressed by John Chambers, is "to turn ideas into software, quickly and faithfully". The modern implementation of S is R, a part of the GNU free software project. S-PLUS, a commercial product, was formerly sold by TIBCO Software. S is one of several statistical computing languages that were designed at Bell Laboratories, and first took form between 1975–1976.
StataStata (ˈsteit@, , alternatively ˈstaet@, occasionally stylized as STATA) is a general-purpose statistical software package developed by StataCorp for data manipulation, visualization, statistics, and automated reporting. It is used by researchers in many fields, including biomedicine, economics, epidemiology, and sociology. Stata was initially developed by Computing Resource Center in California and the first version was released in 1985. In 1993, the company moved to College Station, TX and was renamed Stata Corporation, now known as StataCorp.
Free statistical softwareFree statistical software is a practical alternative to commercial packages. Many of the free to use programs aim to be similar in function to commercial packages, in that they are general statistical packages that perform a variety of statistical analyses. Many other free to use programs were designed specifically for particular functions, like factor analysis, power analysis in sample size calculations, classification and regression trees, or analysis of missing data.
Ggplot2ggplot2 is an open-source data visualization package for the statistical programming language R. Created by Hadley Wickham in 2005, ggplot2 is an implementation of Leland Wilkinson's Grammar of Graphics—a general scheme for data visualization which breaks up graphs into semantic components such as scales and layers. ggplot2 can serve as a replacement for the base graphics in R and contains a number of defaults for web and print display of common scales. Since 2005, ggplot2 has grown in use to become one of the most popular R packages.
RStudioRStudio is an integrated development environment for R, a programming language for statistical computing and graphics. It is available in two formats: RStudio Desktop is a regular desktop application while RStudio Server runs on a remote server and allows accessing RStudio using a web browser. The RStudio integrated development environment (IDE) is available with the GNU Affero General Public License version 3. The AGPL v3 is an open source license that guarantees the freedom to share the code.
Heat mapA heat map (or heatmap) is a 2-dimensional data visualization technique that represents the magnitude of individual values within a dataset as a color. The variation in color may be by hue or intensity. "Heat map" is a relatively new term, but the practice of shading matrices has existed for over a century. Heat maps originated in 2D displays of the values in a data matrix. Larger values were represented by small dark gray or black squares (pixels) and smaller values by lighter squares.