Modes of convergence
In mathematics, there are many senses in which a sequence or a series is said to be convergent. This article describes the various modes (senses or species) of convergence in the settings where they are defined. For a list of modes of convergence, see Modes of convergence (annotated index). Note that each of the following objects is a special case of the types preceding it: sets, topological spaces, uniform spaces, topological abelian groups (TAGs), normed spaces, Euclidean spaces, and the real/complex numbers.
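As a reminder of what convergence means in the most general topological setting listed above (a standard definition, included here only for orientation, not taken from this article):

```latex
% Convergence of a sequence in a topological space X:
% x_n -> x iff every open neighborhood of x eventually contains the sequence.
\[
  x_n \to x
  \quad\Longleftrightarrow\quad
  \forall\, U \ni x \text{ open},\ \exists N \in \mathbb{N}:\ n \geq N \implies x_n \in U .
\]
```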
Monotone convergence theorem
In the mathematical field of real analysis, the monotone convergence theorem is any of a number of related theorems proving the convergence of monotonic sequences (sequences that are non-decreasing or non-increasing) that are also bounded. Informally, the theorems state that if a sequence is increasing and bounded above, then it converges to its supremum; in the same way, if a sequence is decreasing and bounded below, it converges to its infimum.
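Stated compactly (a standard formulation of the sequence version, matching the informal description above):

```latex
% Monotone convergence theorem for real sequences:
% a non-decreasing sequence bounded above converges to its supremum.
\[
  a_1 \leq a_2 \leq a_3 \leq \cdots \leq M
  \quad\Longrightarrow\quad
  \lim_{n \to \infty} a_n \;=\; \sup_{n \in \mathbb{N}} a_n \;\leq\; M .
\]
```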
Topologies on spaces of linear maps
In mathematics, particularly functional analysis, spaces of linear maps between two vector spaces can be endowed with a variety of topologies. Studying spaces of linear maps and these topologies can give insight into the spaces themselves. The article operator topologies discusses topologies on spaces of linear maps between normed spaces, whereas this article discusses topologies on such spaces in the more general setting of topological vector spaces (TVSs).
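As one concrete family of such topologies (a sketch, assuming the codomain Y is locally convex with continuous seminorms q): the topology of uniform convergence on a family 𝒢 of bounded subsets of X is generated by seminorms of the following form.

```latex
% Seminorms generating the G-topology on the space L(X, Y) of
% continuous linear maps, for G a bounded subset of X in the family
% and q a continuous seminorm on the locally convex space Y:
\[
  p_{G,q}(T) \;=\; \sup_{x \in G} q(Tx), \qquad T \in L(X, Y).
\]
% Taking the family to be all finite sets gives pointwise convergence;
% taking it to be all bounded sets gives uniform convergence on bounded sets.
```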
Variational autoencoder
In machine learning, a variational autoencoder (VAE) is an artificial neural network architecture introduced by Diederik P. Kingma and Max Welling. It is part of the families of probabilistic graphical models and variational Bayesian methods. Variational autoencoders are often associated with the autoencoder model because of the architectural affinity between the two, but there are significant differences in the goal and mathematical formulation. Variational autoencoders are probabilistic generative models that require neural networks as only a part of their overall structure.
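To make the probabilistic formulation concrete, the sketch below computes one forward pass of the two quantities a VAE is built around: a reparameterized sample z = mu + sigma * eps from the approximate posterior, and the negative evidence lower bound (reconstruction error plus a KL term). This is a minimal numpy sketch: the "encoder" and "decoder" are random linear maps standing in for trained neural networks, and all dimensions are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions, chosen only for illustration.
x_dim, z_dim = 8, 2
x = rng.normal(size=x_dim)

# Stand-in "encoder": random linear maps producing the mean and
# log-variance of q(z|x). A real VAE uses a trained neural network here.
W_mu = rng.normal(size=(z_dim, x_dim))
W_logvar = rng.normal(size=(z_dim, x_dim))
mu, logvar = W_mu @ x, W_logvar @ x

# Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
# so that gradients can flow through mu and sigma during training.
eps = rng.normal(size=z_dim)
z = mu + np.exp(0.5 * logvar) * eps

# Stand-in "decoder": another random linear map back to x-space.
W_dec = rng.normal(size=(x_dim, z_dim))
x_hat = W_dec @ z

# Negative ELBO = reconstruction error + KL(q(z|x) || N(0, I)),
# using the closed form of the KL divergence for diagonal Gaussians.
recon = np.sum((x - x_hat) ** 2)
kl = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))
print("negative ELBO (up to constants):", recon + kl)
```

Training a real VAE would minimize this quantity over the encoder and decoder weights; only the reparameterized sampling and the two-term objective are essential to the formulation.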
Formal proof
In logic and mathematics, a formal proof or derivation is a finite sequence of sentences (called well-formed formulas in the case of a formal language), each of which is an axiom, an assumption, or follows from the preceding sentences in the sequence by a rule of inference. It differs from a natural language argument in that it is rigorous, unambiguous and mechanically verifiable. If the set of assumptions is empty, then the last sentence in a formal proof is called a theorem of the formal system.
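As a minimal illustration (a standard textbook example, not drawn from this article): a formal proof of Q from the assumptions P and P → Q, with modus ponens as the only rule of inference.

```latex
% A formal proof of Q from the assumptions {P, P -> Q}:
%   1. P            (assumption)
%   2. P -> Q       (assumption)
%   3. Q            (from 1 and 2 by modus ponens)
\[
  \{\, P,\; P \to Q \,\} \;\vdash\; Q .
\]
```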
Radius of convergence
In mathematics, the radius of convergence of a power series is the radius of the largest disk, centered at the center of the series, in which the series converges. It is either a non-negative real number or infinity. When it is positive, the power series converges absolutely and uniformly on compact sets inside the open disk of radius equal to the radius of convergence, and it is the Taylor series of the analytic function to which it converges.
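The radius can be computed from the coefficients; the Cauchy–Hadamard formula below is the standard expression, stated here for a series with coefficients c_n centered at a.

```latex
% Cauchy–Hadamard formula for the radius of convergence R of
% the power series sum_{n} c_n (z - a)^n:
\[
  \frac{1}{R} \;=\; \limsup_{n \to \infty} |c_n|^{1/n},
\]
% with the conventions 1/0 = infinity and 1/infinity = 0.
```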
Formal system
A formal system is an abstract structure used for inferring theorems from axioms according to a set of rules. These rules, which are used for carrying out the inference of theorems from axioms, are the logical calculus of the formal system. A formal system is essentially an "axiomatic system". In 1921, David Hilbert proposed to use such a system as the foundation for knowledge in mathematics. A formal system may represent a well-defined system of abstract thought.
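To make the definition concrete, consider a toy formal system (a hypothetical example chosen for illustration, not one discussed in this article): the alphabet is {a}, the single axiom is the string a, and the single rule of inference appends an a to any theorem.

```latex
% A toy formal system over the alphabet {a}:
%   Axiom:  a
%   Rule:   from a theorem x, infer xa
% Its theorems are exactly the strings a, aa, aaa, ...
\[
  \text{Theorems} \;=\; \{\, a^n : n \geq 1 \,\}.
\]
```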
Mean squared error
In statistics, the mean squared error (MSE) or mean squared deviation (MSD) of an estimator (of a procedure for estimating an unobserved quantity) measures the average of the squares of the errors—that is, the average squared difference between the estimated values and the actual value. MSE is a risk function, corresponding to the expected value of the squared error loss. The fact that MSE is almost always strictly positive (and not zero) is because of randomness or because the estimator does not account for information that could produce a more accurate estimate.
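In symbols (the standard definition for an estimator of a parameter, together with the usual bias-variance decomposition):

```latex
% MSE of an estimator \hat{\theta} of a parameter \theta,
% and its bias-variance decomposition (Bias = E[\hat{\theta}] - \theta):
\[
  \operatorname{MSE}(\hat{\theta})
  \;=\; \mathbb{E}\!\left[(\hat{\theta} - \theta)^2\right]
  \;=\; \operatorname{Var}(\hat{\theta}) + \operatorname{Bias}(\hat{\theta})^2 .
\]
```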
Continuous uniform distribution
In probability theory and statistics, the continuous uniform distributions or rectangular distributions are a family of symmetric probability distributions. Such a distribution describes an experiment where there is an arbitrary outcome that lies between certain bounds. The bounds are defined by the parameters a and b, which are the minimum and maximum values. The interval can either be closed (i.e. [a, b]) or open (i.e. (a, b)). Therefore, the distribution is often abbreviated U(a, b), where U stands for uniform distribution.
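For reference, the density and mean of U(a, b) are the standard ones (not specific to this article):

```latex
% Probability density function and mean of the continuous
% uniform distribution U(a, b):
\[
  f(x) =
  \begin{cases}
    \dfrac{1}{b - a} & a \leq x \leq b, \\[4pt]
    0 & \text{otherwise,}
  \end{cases}
  \qquad
  \mathbb{E}[X] = \frac{a + b}{2} .
\]
```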
Stochastic gradient descent
Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties (e.g. differentiable or subdifferentiable). It can be regarded as a stochastic approximation of gradient descent optimization, since it replaces the actual gradient (calculated from the entire data set) by an estimate thereof (calculated from a randomly selected subset of the data).
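To make the contrast with full-batch gradient descent concrete, here is a minimal numpy sketch on a synthetic least-squares problem; the data, learning rate, and iteration count are arbitrary choices for illustration. Each step uses the gradient of a single randomly chosen example in place of the full-data gradient.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic least-squares problem: find w minimizing the
# average of f_i(w) = (x_i . w - y_i)^2 over the data set.
n, d = 200, 3
X = rng.normal(size=(n, d))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=n)

w = np.zeros(d)
lr = 0.01
for step in range(2000):
    i = rng.integers(n)                       # pick one example at random
    grad_i = 2.0 * (X[i] @ w - y[i]) * X[i]   # gradient of f_i, not the full sum
    w -= lr * grad_i                          # stochastic gradient step

print("estimate:", w)  # should land close to w_true
```

Replacing the single index i with the full data set (or a mini-batch) recovers ordinary (mini-batch) gradient descent; the stochastic version trades a noisier update for a much cheaper per-step cost.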