In computer science, the syntax of a computer language is the rules that define the combinations of symbols that are considered to be correctly structured statements or expressions in that language. This applies both to programming languages, where the document represents source code, and to markup languages, where the document represents data.
The syntax of a language defines its surface form. Text-based computer languages are based on sequences of characters, while visual programming languages are based on the spatial layout and connections between symbols (which may be textual or graphical). Documents that are syntactically invalid are said to have a syntax error. When designing the syntax of a language, a designer might start by writing down examples of both legal and illegal strings, before trying to figure out the general rules from these examples.
Syntax therefore refers to the form of the code, and is contrasted with semantics – the meaning. In processing computer languages, semantic processing generally comes after syntactic processing; however, in some cases, semantic processing is necessary for complete syntactic analysis, and these are done together or concurrently. In a compiler, the syntactic analysis comprises the frontend, while the semantic analysis comprises the backend (and middle end, if this phase is distinguished).
Computer language syntax is generally distinguished into three levels:
Words – the lexical level, determining how characters form tokens;
Phrases – the grammar level, narrowly speaking, determining how tokens form phrases;
Context – determining what objects or variables names refer to, if types are valid, etc.
Distinguishing in this way yields modularity, allowing each level to be described and processed separately and often independently. First, a lexer turns the linear sequence of characters into a linear sequence of tokens; this is known as "lexical analysis" or "lexing". Second, the parser turns the linear sequence of tokens into a hierarchical syntax tree; this is known as "parsing" narrowly speaking.