Wave interferenceIn physics, interference is a phenomenon in which two coherent waves are combined by adding their intensities or displacements with due consideration for their phase difference. The resultant wave may have greater intensity (constructive interference) or lower amplitude (destructive interference) if the two waves are in phase or out of phase, respectively. Interference effects can be observed with all types of waves, for example, light, radio, acoustic, surface water waves, gravity waves, or matter waves as well as in loudspeakers as electrical waves.
Interaction informationThe interaction information is a generalization of the mutual information for more than two variables. There are many names for interaction information, including amount of information, information correlation, co-information, and simply mutual information. Interaction information expresses the amount of information (redundancy or synergy) bound up in a set of variables, beyond that which is present in any subset of those variables. Unlike the mutual information, the interaction information can be either positive or negative.
Conditional mutual informationIn probability theory, particularly information theory, the conditional mutual information is, in its most basic form, the expected value of the mutual information of two random variables given the value of a third. For random variables , , and with support sets , and , we define the conditional mutual information as This may be written in terms of the expectation operator: . Thus is the expected (with respect to ) Kullback–Leibler divergence from the conditional joint distribution to the product of the conditional marginals and .
Data PreprocessingData preprocessing can refer to manipulation or dropping of data before it is used in order to ensure or enhance performance, and is an important step in the data mining process. The phrase "garbage in, garbage out" is particularly applicable to data mining and machine learning projects. Data collection methods are often loosely controlled, resulting in out-of-range values, impossible data combinations, and missing values, amongst other issues. Analyzing data that has not been carefully screened for such problems can produce misleading results.
Natural language processingNatural language processing (NLP) is an interdisciplinary subfield of linguistics and computer science. It is primarily concerned with processing natural language datasets, such as text corpora or speech corpora, using either rule-based or probabilistic (i.e. statistical and, most recently, neural network-based) machine learning approaches. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them.
Protein structure predictionProtein structure prediction is the inference of the three-dimensional structure of a protein from its amino acid sequence—that is, the prediction of its secondary and tertiary structure from primary structure. Structure prediction is different from the inverse problem of protein design. Protein structure prediction is one of the most important goals pursued by computational biology; and it is important in medicine (for example, in drug design) and biotechnology (for example, in the design of novel enzymes).
Structural genomicsStructural genomics seeks to describe the 3-dimensional structure of every protein encoded by a given genome. This genome-based approach allows for a high-throughput method of structure determination by a combination of experimental and modeling approaches. The principal difference between structural genomics and traditional structural prediction is that structural genomics attempts to determine the structure of every protein encoded by the genome, rather than focusing on one particular protein.
Threading (protein sequence)In molecular biology, protein threading, also known as fold recognition, is a method of protein modeling which is used to model those proteins which have the same fold as proteins of known structures, but do not have homologous proteins with known structure. It differs from the homology modeling method of structure prediction as it (protein threading) is used for proteins which do not have their homologous protein structures deposited in the Protein Data Bank (PDB), whereas homology modeling is used for those proteins which do.
Coherence (physics)In physics, coherence expresses the potential for two waves to interfere. Two monochromatic beams from a single source always interfere. Physical sources are not strictly monochromatic: they may be partly coherent. Beams from different sources are mutually incoherent. When interfering, two waves add together to create a wave of greater amplitude than either one (constructive interference) or subtract from each other to create a wave of minima which may be zero (destructive interference), depending on their relative phase.
Variation of informationIn probability theory and information theory, the variation of information or shared information distance is a measure of the distance between two clusterings (partitions of elements). It is closely related to mutual information; indeed, it is a simple linear expression involving the mutual information. Unlike the mutual information, however, the variation of information is a true metric, in that it obeys the triangle inequality. Suppose we have two partitions and of a set into disjoint subsets, namely and .