In formal language theory, the empty string, or empty word, is the unique string of length zero.
Formally, a string is a finite, ordered sequence of characters such as letters, digits or spaces. The empty string is the special case where the sequence has length zero, so there are no symbols in the string.
There is only one empty string, because two strings differ only if they have different lengths or different sequences of symbols.
In formal treatments, the empty string is denoted with ε or sometimes Λ or λ.
The empty string should not be confused with the empty language ∅, which is a formal language (i.e. a set of strings) that contains no strings, not even the empty string.
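The distinction can be made concrete with a minimal sketch that models a formal language as a Python set of strings. The set-based modelling and the variable names are illustrative assumptions, not part of the formal definition.

```python
# Model a formal language as a set of strings (a modelling assumption).
EPSILON = ""                      # the empty string ε

empty_language = set()            # ∅: a language containing no strings at all
epsilon_language = {EPSILON}      # {ε}: a language containing exactly one string

print(len(empty_language))            # 0
print(len(epsilon_language))          # 1
print(EPSILON in empty_language)      # False
print(EPSILON in epsilon_language)    # True
```

The empty language has no members, whereas the language {ε} has one member whose length happens to be zero.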
The empty string has several properties:
|ε| = 0. Its string length is zero.
ε ⋅ s = s ⋅ ε = s. The empty string is the identity element of the concatenation operation. The set of all strings forms a free monoid with respect to ⋅ and ε.
ε^R = ε. Reversal of the empty string produces the empty string.
The empty string precedes any other string under lexicographical order, because it is the shortest of all strings.
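The properties above can be spot-checked with Python's built-in `str` type, as a sketch. The helper `check_properties` and the sample strings are hypothetical names chosen for illustration, and Python compares strings lexicographically by code point, which is one particular choice of lexicographical order.

```python
EPSILON = ""  # the empty string ε

def check_properties(s: str) -> bool:
    """Return True if the listed properties of ε hold with respect to s."""
    return (
        len(EPSILON) == 0                    # |ε| = 0
        and EPSILON + s == s                 # ε ⋅ s = s
        and s + EPSILON == s                 # s ⋅ ε = s
        and EPSILON[::-1] == EPSILON         # reversing ε yields ε
        and (s == EPSILON or EPSILON < s)    # ε precedes every other string
    )

samples = ["a", "ab", "hello world", " ", "0"]
print(all(check_properties(s) for s in samples))  # True
```

A finite set of samples does not prove the properties in general; it only illustrates them.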
In context-free grammars, a production rule that allows a symbol to produce the empty string is known as an ε-production, and the symbol is said to be "nullable".
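As a sketch, nullable symbols can be computed by fixed-point iteration. This assumes the usual broader reading in which a symbol is also nullable when it derives ε indirectly, through a production whose right-hand side consists only of nullable symbols. The grammar encoding (a dict from nonterminal to a list of right-hand sides, with `[]` standing for an ε-production), the function name `nullable_symbols`, and the sample grammar are all hypothetical.

```python
def nullable_symbols(productions):
    """Return the set of nonterminals that can derive the empty string.

    productions maps each nonterminal to a list of right-hand sides;
    each right-hand side is a list of symbols, and [] is an ε-production.
    """
    nullable = set()
    changed = True
    while changed:                      # repeat until no new symbol is added
        changed = False
        for head, bodies in productions.items():
            if head in nullable:
                continue
            # all() over an empty body is True, which covers ε-productions.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable

# Hypothetical grammar: uppercase keys are nonterminals, lowercase are terminals.
grammar = {
    "S": [["A", "B"], ["a"]],
    "A": [[], ["a", "A"]],      # A → ε | a A
    "B": [["A"], ["b"]],        # B → A | b
    "C": [["c"]],               # C → c (not nullable)
}

print(sorted(nullable_symbols(grammar)))  # ['A', 'B', 'S']
```

Here A is nullable directly through its ε-production, B through A, and S through the production whose right-hand side is A B.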
In most programming languages, strings are a data type. Each string value is typically stored at its own memory address (location). Thus, the same string (for example, the empty string) may be stored in two or more places in memory.
In this way, there could be multiple empty strings in memory, in contrast with the formal theory definition, for which there is only one possible empty string. However, a string comparison function would indicate that all of these empty strings are equal to each other.
Even a string of length zero can require memory to store it, depending on the format being used. In most programming languages, the empty string is distinct from a null reference (or null pointer) because a null reference points to no string at all, not even the empty string.
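A minimal Python sketch of these points follows, with `None` playing the role of the null reference. The function `describe` and the variable names are hypothetical. Whether two empty strings share one object in memory is an implementation detail that the sketch does not rely on, and `sys.getsizeof` reports an implementation-specific size.

```python
import sys

def describe(value):
    """Classify a value as a null reference, an empty string, or a non-empty string."""
    if value is None:
        return "null"
    if value == "":
        return "empty"
    return "non-empty"

empty = ""                # empty string written as a literal
also_empty = "abc"[:0]    # empty string obtained by slicing
missing = None            # no string at all

print(empty == also_empty)        # True: comparison is by content
print(describe(missing))          # null
print(describe(empty))            # empty
print(describe("text"))           # non-empty
print(sys.getsizeof(empty) > 0)   # True: the string object still occupies memory
```

The empty string supports ordinary string operations such as `len(empty)`, while calling `len(missing)` raises a `TypeError`, because `None` is not a string.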