In computer science, a parsing expression grammar (PEG) is a type of analytic formal grammar, i.e. it describes a formal language in terms of a set of rules for recognizing strings in the language. The formalism was introduced by Bryan Ford in 2004 and is closely related to the family of top-down parsing languages introduced in the early 1970s.
Syntactically, PEGs also look similar to context-free grammars (CFGs), but they have a different interpretation: the choice operator selects the first match in PEG, while it is ambiguous in CFG. This is closer to how string recognition tends to be done in practice, e.g. by a recursive descent parser.
Unlike CFGs, PEGs cannot be ambiguous; a string has exactly one valid parse tree or none. It is conjectured that there exist context-free languages that cannot be recognized by a PEG, but this is not yet proven. PEGs are well-suited to parsing computer languages (and artificial human languages such as Lojban), but not natural languages where the performance of PEG algorithms is comparable to general CFG algorithms such as the Earley algorithm.
Formally, a parsing expression grammar consists of:
A finite set N of nonterminal symbols.
A finite set Σ of terminal symbols that is disjoint from N.
A finite set P of parsing rules.
An expression eS termed the starting expression.
Each parsing rule in P has the form A ← e, where A is a nonterminal symbol and e is a parsing expression. A parsing expression is a hierarchical expression similar to a regular expression, which is constructed in the following fashion:
An atomic parsing expression consists of:
any terminal symbol,
any nonterminal symbol, or
the empty string ε.
Given any existing parsing expressions e, e1, and e2, a new parsing expression can be constructed using the following operators:
Sequence: e1 e2
Ordered choice: e1 / e2
Zero-or-more: e*
One-or-more: e+
Optional: e?
And-predicate: &e
Not-predicate: !e
Group: (e)
Operator priorities are as follows, based on Table 1 in:
The fundamental difference between context-free grammars and parsing expression grammars is that the PEG's choice operator is ordered.
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
We teach the fundamental aspects of analyzing and interpreting computer languages, including the techniques to build compilers. You will build a working compiler from an elegant functional language in
A decentralized system is one that works when no single party is in charge or fully trusted. This course teaches decentralized systems principles while guiding students through the engineering of thei
This course will offer students a broad but hands-on introduction to technologies of human self-organization.
In formal language theory, a grammar (when the context is not given, often called a formal grammar for clarity) describes how to form strings from a language's alphabet that are valid according to the language's syntax. A grammar does not describe the meaning of the strings or what can be done with them in whatever context—only their form. A formal grammar is defined as a set of production rules for such strings in a formal language. Formal language theory, the discipline that studies formal grammars and languages, is a branch of applied mathematics.
Top-down parsing in computer science is a parsing strategy where one first looks at the highest level of the parse tree and works down the parse tree by using the rewriting rules of a formal grammar. LL parsers are a type of parser that uses a top-down parsing strategy. Top-down parsing is a strategy of analyzing unknown data relationships by hypothesizing general parse tree structures and then considering whether the known fundamental structures are compatible with the hypothesis.
In computer science, parsing reveals the grammatical structure of linear input text, as a first step in working out its meaning. Bottom-up parsing recognizes the text's lowest-level small details first, before its mid-level structures, and leaving the highest-level overall structure to last. The bottom-up name comes from the concept of a parse tree, in which the most detailed parts are at the bottom of the upside-down tree, and larger structures composed from them are in successively higher layers, until at the top or "root" of the tree a single unit describes the entire input stream.
Explores parsing text into trees using parser combinators in Scala, covering filtering, transforming, sequencing, alternatives, recursion, spaces handling, lexing, monadic nature, and for-notation.
Introduces the CYK parsing algorithm for context-free grammars, explaining its handling of ambiguity and importance in parsing general grammars.
Covers object-oriented parser combinators in Scala for building parsers, including JSON parsing and testing.
As a universal expression of human creativity, music is capable of conveying great subtlety and complexity. Crucially, this complexity is not encoded in the score or in the sounds, but is rather construed in the mind of the listener in the form of nuanced ...
It is well known that hyperedge-replacement grammars can generate NP-complete graph languages even under seemingly harsh restrictions. This means that the parsing problem is difficult even in the non-uniform setting, in which the grammar is considered to b ...
ACADEMIC PRESS INC ELSEVIER SCIENCE2021
, ,
While theoretical and empirical insights suggest that the capacity to represent and process complex syntax is crucial in language as well as other domains, it is still unclear whether specific parsing mechanisms are also shared across domains. Focusing on ...