Concept

Lexer hack

Summary
In computer programming, the lexer hack is a common solution to the problems in parsing ANSI C, due to the reference grammar being context-sensitive. In C, classifying a sequence of characters as a variable name or a type name requires contextual information of the phrase structure, which prevents one from having a context-free lexer. The problem is that in the following code, the lexical class of A cannot be determined without further contextual information: (A)*B This code could be multiplication of two variables, in which case A is a variable; written unambiguously: A * B Alternatively, it could be casting the dereferenced value of B to the type A, in which case A is a typedef name; written unambiguously: (A) (B) In more detail, in a compiler, the lexer performs one of the earliest stages of converting the source code to a program. It scans the text to extract meaningful tokens, such as words, numbers, and strings. The parser analyzes sequences of tokens attempting to match them to syntax rules representing language structures, such as loops and variable declarations. A problem occurs here if a single sequence of tokens can ambiguously match more than one syntax rule. This ambiguity can happen in C if the lexer does not distinguish between variable and typedef identifiers. For example, in the C expression: (A) * B the lexer may find these tokens: left parenthesis identifier 'A' right parenthesis operator '' identifier 'B' The problem is precisely that the lexical class of A cannot be determined without further context: the parser can interpret this as variable A multiplied by B or as type A casting the dereferenced value of B. This is known as the "typedef-name: identifier" problem, due to the name of the problematic production rule. The solution generally consists of feeding information from the semantic symbol table back into the lexer. That is, rather than functioning as a pure one-way pipeline from the lexer to the parser, there is a backchannel from semantic analysis back to the lexer.
About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.