36CHAPTER 2. FINITE AUTOMATAExample 2.4.4 Now we will describe the language L represented by the expression (b U (aa)a*)*. This language obviously does not contain the word a, no word w E L can begin with ab, end with ba, or have a subword bab. I End ExampleExample 2.4.5 The regular expression we are going to define specifies all identifiers in a C-like programming language. Any such identifier begins with a letter, which may be followed by a string consisting of letters and numeric digits. Let expression [a - z] stand for the regular expression (a U b U c U... U z), [A - Z] stand for the regular expression (A U B U C... U Z), and [0 - 9] stand for the regular expression (0 U 1 U 2 U ... U 9) (dots stand for missing letters and numbers). Then the regular expression ([a - z] U [A - Z])([a - z] U [A - Z] U [0 - 9])* represents all identifiers. Some programming languages permit the use of the underscore character between letters and numeric digits. For example; Ab_55_c, ab_b_5. Every character must be followed by a letter or a numeric digit. The regular expression generating all such identifiers is ([a - z] U [A - Z])((_([a - z] U [A - Z] U [0 - 9]))* ([a - z] U [A - Z] U [0 - 9])*)* A part of every compiler is a lexical analyzer, which, in particular, finds all identifiers in the text of the source program. More specifically, the lexical analyzer uses a list of regular expressions that are used to determine if strings of characters are identifiers, that is, if they match at least one of the given expressions. This is one of multiple applications of regular expressions in programming. End Example Note that concatenation and union are associative operations. In other words, (LIL 2 )L 3 = LI(L 2 L3 ) and (L1 U L 2) U L3 = L 1 U (L 2 U L 3). This means that we can omit parentheses in subexpressions of the type (aO3), or, say, a U (03 U -y). For example, we can write (abb U ab U a)* instead of (((ab)b) U (ab)) U a)*. Now we are in a position to design an algorithm A that converts any regular expression a into a diagram of a finite automaton A such that L(a) = L(A). Note that any regular expression is built up from ground components, for example, the singletons e, a E E and 0, to which operations U, concatenation, and the Kleene star are being applied. The required algorithm operates as follows. 1. It converts every ground singleton a or e or 0 (if any) to an automaton accepting just this singleton; (see Figure 2.29).