In computing, the utility diff is a data comparison tool that computes and displays the differences between the contents of files. Unlike edit distance notions used for other purposes, diff is line-oriented rather than character-oriented, but it is like Levenshtein distance in that it tries to determine the smallest set of deletions and insertions to create one file from the other. The utility displays the changes in one of several standard formats, such that both humans or computers can parse the changes, and use them for patching.
Typically, diff is used to show the changes between two versions of the same file. Modern implementations also support s. The output is called a "diff", or a patch, since the output can be applied with the Unix program . The output of similar file comparison utilities is also called a "diff"; like the use of the word "grep" for describing the act of searching, the word diff became a generic term for calculating data difference and the results thereof. The POSIX standard specifies the behavior of the "diff" and "patch" utilities and their file formats.
diff was developed in the early 1970s on the Unix operating system, which was emerging from Bell Labs in Murray Hill, New Jersey. The first released version shipped with the 5th Edition of Unix in 1974, and was written by Douglas McIlroy, and James Hunt. This research was published in a 1976 paper co-written with James W. Hunt, who developed an initial prototype of . The algorithm this paper described became known as the Hunt–Szymanski algorithm.
McIlroy's work was preceded and influenced by Steve Johnson's comparison program on GECOS and Mike Lesk's program. also originated on Unix and, like , produced line-by-line changes and even used angle-brackets (">" and "
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Git (gɪt) is a distributed version control system that tracks changes in any set of s, usually used for coordinating work among programmers who are collaboratively developing source code during software development. Its goals include speed, data integrity, and support for distributed, non-linear workflows (thousands of parallel branches running on different computers). Git was originally authored by Linus Torvalds in 2005 for development of the Linux kernel, with other kernel developers contributing to its initial development.
Apache Subversion (often abbreviated SVN, after its command name svn) is a software versioning and revision control system distributed as open source under the Apache License. Software developers use Subversion to maintain current and historical versions of files such as source code, web pages, and documentation. Its goal is to be a mostly compatible successor to the widely used Concurrent Versions System (CVS). The open source community has used Subversion widely: for example, in projects such as Apache Software Foundation, FreeBSD, SourceForge, and from 2006 to 2019, GCC.
A longest common subsequence (LCS) is the longest subsequence common to all sequences in a set of sequences (often just two sequences). It differs from the longest common substring: unlike substrings, subsequences are not required to occupy consecutive positions within the original sequences. The problem of computing longest common subsequences is a classic computer science problem, the basis of data comparison programs such as the diff utility, and has applications in computational linguistics and bioinformatics.