Introduction to Lexical Analysis

A Look at the First Phase of Compilation

Jonathan En & Lexical Analysis

  • Jonathan En introduces a screencast on Lexical Analysis
  • The screencast provides a brief introduction to Lexical Analysis
  • Jonathan explains what Lex is and how it works
  • The screencast includes examples and a hands-on demo
  • Lexical Analysis is the first phase of compilation

Compilation Process

  • Source file, read as a character stream
  • Scanner: Lexical Analysis
  • Parser: Syntax Analysis
  • Semantic Analysis & Intermediate Code Generation
  • Target Code Generation

Lexical Analysis in Detail

  • A Lex-generated scanner takes source code as input
  • It produces a stream of tokens
  • Tokens are groups of characters with a collective meaning
  • The token stream is fed to the parser for syntax analysis
  • Lex is a scanner generator

Introduction to Lex

  • Lex is a scanner generator
  • Simplifies the creation of a scanner, compared to writing one by hand
  • Input: regular expressions paired with actions
  • Output: a table-driven scanner
  • Alternative: Flex, an open-source implementation

Structure of a Lex Input File

  • Three parts: definitions, rules, user C code
  • Parts are separated by the '%%' token
  • Part 1: definitions and global C code
  • Part 2: regular expressions and their actions
  • Part 3: optional user C code
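A minimal sketch of the three-part layout (the names and patterns here are illustrative, not the screencast's demo file):

```lex
%{
/* Part 1: global C code, copied verbatim into the scanner. */
#include <stdio.h>
%}
DIGIT   [0-9]
%%
    /* Part 2: regular expressions with C actions. */
{DIGIT}+   { printf("INT(%s)\n", yytext); }
[ \t\n]+   { /* skip whitespace */ }
%%
/* Part 3: optional user C code, e.g. a driver. */
int main(void) { return yylex(); }
```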

Example: Lex Input File

  • Simple Lex input file for token recognition
  • Define patterns and associate actions
  • Tokens: colon, keywords, identifiers, integers
  • Run Lex to generate scanner code
  • Compile and run the scanner program
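The screencast's exact demo file is not reproduced here; the following is a plausible reconstruction of a Lex input recognizing those token classes (colon, keywords, identifiers, integers):

```lex
%%
":"                     { printf("COLON\n"); }
"if"|"while"            { printf("KEYWORD(%s)\n", yytext); }
[A-Za-z_][A-Za-z0-9_]*  { printf("IDENT(%s)\n", yytext); }
[0-9]+                  { printf("INT(%s)\n", yytext); }
[ \t\n]+                ;
.                       { printf("UNKNOWN(%s)\n", yytext); }
%%
int main(void) { yylex(); return 0; }
int yywrap(void) { return 1; }
```

With Flex, a build might look like `flex tokens.l && cc lex.yy.c -o tokens` (file names assumed); the resulting program reads stdin and prints one token per line.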

Sophisticated Regular Expressions

  • Examples of sophisticated regular expressions
  • Match literal strings, character ranges, prefixes, exclusions
  • Dot matches any character (except newline); plus means one or more repetitions
  • Flexibility in defining patterns
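A few patterns of the kinds listed above, in Lex syntax (these examples are illustrative, not taken from the screencast):

```lex
"begin"            /* literal string: matches exactly "begin"        */
[a-zA-Z][a-z]*     /* ranges: a letter, then zero or more lowercase  */
0x[0-9a-fA-F]+     /* prefix: "0x" followed by hex digits            */
[^"\n]+            /* exclusion: anything except a quote or newline  */
.                  /* dot: any single character except newline       */
a+                 /* plus: one or more repetitions of "a"           */
```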

Hands-on Example: Processing a Configuration File

  • Demonstration of integrating Lex into a C program
  • Processing a textual configuration file
  • Parsing name-value pairs
  • Define symbols for the token types
  • The C program calls the yylex() function to obtain tokens
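One way the integration could look: the actions return token symbols instead of printing, and the embedding C program drives the scan by calling yylex() in a loop. This is a sketch under assumed names (T_NAME, T_EQUALS, T_VALUE), not the demo's actual code:

```lex
%{
/* Symbols for the token types; the values are illustrative. */
enum { T_NAME = 1, T_EQUALS, T_VALUE };
%}
%%
[A-Za-z_][A-Za-z0-9_]*  { return T_NAME; }
"="                     { return T_EQUALS; }
[0-9]+                  { return T_VALUE; }
[ \t\n]+                ;
%%
/* yylex() returns the next token symbol, or 0 at end of input;
   yytext holds the matched characters. */
int main(void) {
    int tok;
    while ((tok = yylex()) != 0)
        printf("token %d: %s\n", tok, yytext);
    return 0;
}
int yywrap(void) { return 1; }
```

For input like `retries = 3`, the loop would see a T_NAME, a T_EQUALS, and a T_VALUE in turn, which is all a name-value parser needs.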