In computer science, a parsing expression grammar (PEG) is a type of analytic formal grammar, i.e. it describes a formal language in terms of a set of rules for recognizing strings in the language. The formalism was introduced by Bryan Ford in 2004 and is closely related to the family of top-down parsing languages introduced in the early 1970s.
Syntactically, PEGs also look similar to context-free grammars (CFGs), but they have a different interpretation: the choice operator selects the first match in PEG, while it is ambiguous in CFG. This is closer to how string recognition tends to be done in practice, e.g. by a recursive descent parser.
Unlike CFGs, PEGs cannot be ambiguous; a string has exactly one valid parse tree or none. It is conjectured that there exist context-free languages that cannot be recognized by a PEG, but this is not yet proven. PEGs are well-suited to parsing computer languages (and artificial human languages such as Lojban), but not natural languages where the performance of PEG algorithms is comparable to general CFG algorithms such as the Earley algorithm.
Formally, a parsing expression grammar consists of:
A finite set N of nonterminal symbols.
A finite set Σ of terminal symbols that is disjoint from N.
A finite set P of parsing rules.
An expression eS termed the starting expression.
Each parsing rule in P has the form A ← e, where A is a nonterminal symbol and e is a parsing expression. A parsing expression is a hierarchical expression similar to a regular expression, which is constructed in the following fashion:
An atomic parsing expression consists of:
any terminal symbol,
any nonterminal symbol, or
the empty string ε.
Given any existing parsing expressions e, e1, and e2, a new parsing expression can be constructed using the following operators:
Sequence: e1 e2
Ordered choice: e1 / e2
Zero-or-more: e*
One-or-more: e+
Optional: e?
And-predicate: &e
Not-predicate: !e
Group: (e)
Operator priorities are as follows, based on Table 1 in:
The fundamental difference between context-free grammars and parsing expression grammars is that the PEG's choice operator is ordered.
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
In computer science, parsing reveals the grammatical structure of linear input text, as a first step in working out its meaning. Bottom-up parsing recognizes the text's lowest-level small details first, before its mid-level structures, and leaving the highest-level overall structure to last. The bottom-up name comes from the concept of a parse tree, in which the most detailed parts are at the bottom of the upside-down tree, and larger structures composed from them are in successively higher layers, until at the top or "root" of the tree a single unit describes the entire input stream.
In computer-based language recognition, ANTLR (pronounced antler), or ANother Tool for Language Recognition, is a parser generator that uses a LL(*) algorithm for parsing. ANTLR is the successor to the Purdue Compiler Construction Tool Set (PCCTS), first developed in 1989, and is under active development. Its maintainer is Professor Terence Parr of the University of San Francisco. PCCTS 1.00 was announced April 10, 1992. ANTLR takes as input a grammar that specifies a language and generates as output source code for a recognizer of that language.
In formal language theory, a grammar (when the context is not given, often called a formal grammar for clarity) describes how to form strings from a language's alphabet that are valid according to the language's syntax. A grammar does not describe the meaning of the strings or what can be done with them in whatever context—only their form. A formal grammar is defined as a set of production rules for such strings in a formal language. Formal language theory, the discipline that studies formal grammars and languages, is a branch of applied mathematics.
We teach the fundamental aspects of analyzing and interpreting computer languages, including the techniques to build compilers. You will build a working compiler from an elegant functional language in
The "Introduction to Applied Data Science" (I2ADS) course is aimed at students of all levels to train them in the core computer science software stack and techniques forming the pillars of open & repr
A decentralized system is one that works when no single party is in charge or fully trusted. This course teaches decentralized systems principles while guiding students through the development and tes
Introduces the CYK parsing algorithm for context-free grammars, explaining its handling of ambiguity and importance in parsing general grammars.
Explores parsing text into trees using parser combinators in Scala, covering filtering, transforming, sequencing, alternatives, recursion, spaces handling, lexing, monadic nature, and for-notation.
Covers object-oriented parser combinators in Scala for building parsers, including JSON parsing and testing.
Dynamic programming is an algorithmic technique to solve problems that follow the Bellman’s principle: optimal solutions depends on optimal sub-problem solutions. The core idea behind dynamic programm