Open sourceOpen source is source code that is made freely available for possible modification and redistribution. Products include permission to use the source code, design documents, or content of the product. The open-source model is a decentralized software development model that encourages open collaboration. A main principle of open-source software development is peer production, with products such as source code, blueprints, and documentation freely available to the public.
Data integrationData integration involves combining data residing in different sources and providing users with a unified view of them. This process becomes significant in a variety of situations, which include both commercial (such as when two similar companies need to merge their databases) and scientific (combining research results from different bioinformatics repositories, for example) domains. Data integration appears with increasing frequency as the volume (that is, big data) and the need to share existing data explodes.
Big dataBig data primarily refers to data sets that are too large or complex to be dealt with by traditional data-processing application software. Data with many entries (rows) offer greater statistical power, while data with higher complexity (more attributes or columns) may lead to a higher false discovery rate. Though used sometimes loosely partly because of a lack of formal definition, the interpretation that seems to best describe big data is the one associated with a large body of information that we could not comprehend when used only in smaller amounts.
Binary relationIn mathematics, a binary relation associates elements of one set, called the domain, with elements of another set, called the codomain. A binary relation over sets X and Y is a new set of ordered pairs (x, y) consisting of elements x in X and y in Y. It is a generalization of the more widely understood idea of a unary function. It encodes the common concept of relation: an element x is related to an element y, if and only if the pair (x, y) belongs to the set of ordered pairs that defines the binary relation.
Universal algebraUniversal algebra (sometimes called general algebra) is the field of mathematics that studies algebraic structures themselves, not examples ("models") of algebraic structures. For instance, rather than take particular groups as the object of study, in universal algebra one takes the class of groups as an object of study. Algebraic structure In universal algebra, an algebra (or algebraic structure) is a set A together with a collection of operations on A. An n-ary operation on A is a function that takes n elements of A and returns a single element of A.
TriplestoreA triplestore or RDF store is a purpose-built database for the storage and retrieval of triples through semantic queries. A triple is a data entity composed of subject–predicate–object, like "Bob is 35" or "Bob knows Fred". Much like a relational database, information in a triplestore is stored and retrieved via a query language. Unlike a relational database, a triplestore is optimized for the storage and retrieval of triples. In addition to queries, triples can usually be imported and exported using Resource Description Framework (RDF) and other formats.
Structure (mathematical logic)In universal algebra and in model theory, a structure consists of a set along with a collection of finitary operations and relations that are defined on it. Universal algebra studies structures that generalize the algebraic structures such as groups, rings, fields and vector spaces. The term universal algebra is used for structures of first-order theories with no relation symbols. Model theory has a different scope that encompasses more arbitrary first-order theories, including foundational structures such as models of set theory.
Signature (logic)In logic, especially mathematical logic, a signature lists and describes the non-logical symbols of a formal language. In universal algebra, a signature lists the operations that characterize an algebraic structure. In model theory, signatures are used for both purposes. They are rarely made explicit in more philosophical treatments of logic.
Semantic tripleA semantic triple, or RDF triple or simply triple, is the atomic data entity in the Resource Description Framework (RDF) data model. As its name indicates, a triple is a set of three entities that codifies a statement about semantic data in the form of subject–predicate–object expressions (e.g., "Bob is 35", or "Bob knows John"). This format enables knowledge to be represented in a machine-readable way. Particularly, every part of an RDF triple is individually addressable via unique URIs — for example, the statement "Bob knows John" might be represented in RDF as: http://example. Data miningData mining is the process of extracting and discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal of extracting information (with intelligent methods) from a data set and transforming the information into a comprehensible structure for further use. Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD.