INTRODUCTION
System Programming is a technique for design and implementation of system program or a system software. It deals with the operation of the computer itself.
System Software : System software provides the basic functions for computer usage and helps run the computer hardware and system. It includes a combination of the following: device drivers, operating systems etc. System software is responsible for managing a variety of independent hardware components, so that they can work together harmoniously. Its purpose is to put less burden on the application software programmer.
Different types of System software include :
1) Assembler
2) Loader
3) Linker
5) Operating System
6) Interpreter
7) Emulator
8) Simulator
DESIGN OF THE COMPILER
Lexical Analysis
The first phase of scanner works as a text scanner. This phase scans the source code as a stream of characters and converts it into meaningful lexemes. Lexical analyzer represents these lexemes in the form of tokens as:<token-name, attribute-value>
Syntax Analysis
The next phase is called the syntax analysis or parsing. It takes the token produced by lexical analysis as input and generates a parse tree (or syntax tree). In this phase, token arrangements are checked against the source code grammar, i.e. the parser checks if the expression made by the tokens is syntactically correct.
Semantic Analysis
Semantic analysis checks whether the parse tree constructed follows the rules of language. For example, assignment of values is between compatible data types, and adding string to an integer. Also, the semantic analyzer keeps track of identifiers, their types and expressions; whether identifiers are declared before use or not etc. The semantic analyzer produces an annotated syntax tree as an output.
Intermediate Code Generation
After semantic analysis the compiler generates an intermediate code of the source code for the target machine. It represents a program for some abstract machine. It is in between the high-level language and the machine language. This intermediate code should be generated in such a way that it makes it easier to be translated into the target machine code.
Code Optimization
The next phase does code optimization of the intermediate code. Optimization can be assumed as something that removes unnecessary code lines, and arranges the sequence of statements in order to speed up the program execution without wasting resources (CPU, memory).
Code Generation
In this phase, the code generator takes the optimized representation of the intermediate code and maps it to the target machine language. The code generator translates the intermediate code into a sequence of (generally) re-locatable machine code. Sequence of instructions of machine code performs the task as the intermediate code would do.
PRACTICAL 1 Study About LEX analyzer
Lex is a program generator designed for lexical processing of character input streams. It accepts a high-level, problem oriented specification for character string matching, and produces a program in a general purpose language which recognizes regular expressions. The regular expressions are specified by the user in the source specifications given to Lex. The Lex written code recognizes these expressions in an input stream and partitions the input stream into strings matching the expressions. At the boundaries between strings program sections provided by the user are executed. The Lex source file associates the regular expressions and the program fragments. As each expression appears in the input to the program written by Lex, the corresponding fragment is executed.
Lex turns the user's expressions and actions (called source in this memo) into the host general-purpose language; the generated program is named yylex. The yylex program will recognize expressions in a stream (called input in this memo) and perform the specified actions for each expression as it is detected. See Figure 1.
We use FLEX in ubuntu for running lex programs. Our Input file consists of three sections, separated by a line with just `%%' in it:
%{
definitions
%}
%%
rules
%%
user code
flex is a tool for generating scanners: programs which recognized lexical patterns in text. flex reads the given input files, or its standard input if no file names are given, for a description of a scanner to generate. The description is in the form of pairs of regular expressions and C code, called rules. flex generates as output a C source file, `lex.yy.c', which defines a routine `yylex()'. This file is compiled and linked with the `-ll' library to produce an executable. When the executable is run, it analyzes its input for occurrences of the regular expressions. Whenever it finds one, it executes the corresponding C code.
Running a Lex program, for compiling a lex program
1. write the lex program in a file and save it as file.l (where file is the name of the file).
2. open the terminal and navigate to the directory where you have saved the file.l
3. type - flex file.l
4. then type - gcc lex.yy.c -ll
5. then type - ./a.out
Lex Patterns
This includes all letter, digits, and _, plus several others They are, in effect, taken to be single character regular expressions
[abcde]
A single character regular expression consisting of any one of the characters in between the square backets [ ]
A range of characters may be specified by using a hyphen, like this [A-Za-z]
To include the hyphen put it first or last in the group, like this [-A-Za-z]
To include a ], put it first in the group, like this []abc-]
If the first character is ^, it means any character except those listed.
So the second part of our example
[^ \n<>"]
means "anything but a space, newline, quote, less-than or greater-than" .
\
Any character following the \ loses it's special meaning, and is taken literally.
The ^ and $ characters constrain the regular expression to the start or end of the line, respectively.
+ and *
The + and * imply repetition of the preceding single character regular expression.
+ means "one or more occurrences of".
* means "zero or more occurrences of".
{3,5}
The range expression {n1,n2} also implies repetition of the preceding regular expression.
{3,5} means "3 to 5 occurrences of".
{2,} means "2 or more occurrences of".
{3} means "exactly 3 occurrences of".
?
The ? implies that the preceding single character regular expression is optional.
In effect, it means "zero or one occurances of".
( and )
The round backets imply grouping, such that the regular expression between the brackets is treated as if it was a single character (or nearly enough). See the discussion of precedence in the flex man-page for more information.
. Any single character except a newline (\n)
The | is used to specify a "logical OR" of two regular expressions. The exact text which is OR'd is governed by the precedence rules, so it's best to use brackets.
To perform below programs two packages are required to install in ubuntu
1) flex
2) bison
PRACTICAL 2 Write LEX Program to count the numbers of identifiers
int c ;
%}
%%
[a-zA-Z]+[(0-9)+(a-zA-Z)]* {c++;}
%%
{
yylex();
printf("Below are the count of identifiers\n");
printf("Identifier=%d",c);
return 0;
}
%{
int s;
int c;
int n;
int w;
int l;
int sp;
%}
%%
[a-z] {s++;}
[A-Z] {c++;}
[0-9] {no++;}
[ ]+ {w++;}
[\n] {l++;w++;}
. {sp++;}
%%
int main()
{
printf("WRITE YOUR INPUT:");
yylex();
printf("\nSmall letters: %d",s);
printf("\nCapital letters: %d",c);
printf("\nNumbers: %d",no);
printf("\nNumber of words: %d",w);
printf("\nNumber of lines: %d",l);
printf("\nSpecial Characters: %d",sp);
return 0;
}
PRACTICAL 8 Write a program using lex and yacc to check whether a given string is a correct variable or not
%{
#include"y.tab.h"
%}
%%
[0-9]+ {return D;}
[a-zA-Z]+ {return L;}
%%
%{
#include<stdio.h>
%}
%token L D
%%
v:L|L rest;
rest:L rest|D rest|L|D;
%%
int main()
{
yyparse();
printf("Variable is valid");
}
int yyerror(char *s)
{
printf("Variable is not valid ");
}
Compilation step
PRACTICAL 9 Write a program using lex and yacc to check given arithmetic expression is valid or not
Compilation step
PRACTICAL 3 Write a LEX program to count number of small letters, capital letters, numbers, word, lines and special symbols using LEX program without taking input from file.
int s;
int c;
int n;
int w;
int l;
int sp;
%}
%%
[a-z] {s++;}
[A-Z] {c++;}
[0-9] {no++;}
[ ]+ {w++;}
[\n] {l++;w++;}
. {sp++;}
%%
int main()
{
printf("WRITE YOUR INPUT:");
yylex();
printf("\nSmall letters: %d",s);
printf("\nCapital letters: %d",c);
printf("\nNumbers: %d",no);
printf("\nNumber of words: %d",w);
printf("\nNumber of lines: %d",l);
printf("\nSpecial Characters: %d",sp);
return 0;
}
PRACTICAL 4 Write a LEX program to count number of small letters, capital letters, numbers, word, lines and special symbols using LEX program with taking input from file.
%{
int s;
int c;
int no;
int w;
int l;
int sp;
%}
%%
[a-z] {s++;}
[A-Z] {c++;}
[0-9] {no++;}
[ ]+ {w++;}
[\n] {l++;w++;}
. {sp++;}
%%
int main()
{
yyin=fopen("2.txt","r");
yylex();
printf("\nSmall letters: %d",s);
printf("\nCapital letters: %d",c);
printf("\nNumbers: %d",no);
printf("\nNumber of words: %d",w);
printf("\nNumber of lines: %d",l);
printf("\nSpecial Characters: %d",sp);
return 0;
}
PRACTICAL 5
Write a LEX program to count number comment lines - single or multiple
%{
int s;
int m;
%}
%%
"//"(.*)\n {s++;}
"/*"(.*)\n(.*)"*/" {m++;}
%%
int main()
printf("Number of multiple line comment %d \n",m);
return 0;
}
%{
int s;
int m;
%}
%%
"//"(.*)\n {s++;}
"/*"(.*)\n(.*)"*/" {m++;}
%%
int main()
{
yyin=fopen("2.txt","r");
yylex();
printf("Number of Single line comment %d \n",s);printf("Number of multiple line comment %d \n",m);
return 0;
}
PRACTICAL 6 Write a LEX program to count number of vowels and consonants
%{
int v;
int c;
%}
%%
[a,e,i,o,u,A,E,I,O,U] {v++;}
[a-zA-Z] {c++;}
%%
int main()
{
yylex();
printf("Number of Vowels: %d",v);
printf("Number of Consonants: %d",c);
return 0;
}
PRACTICAL 7 Write any two program using MACRO in C language
%{
int v;
int c;
%}
%%
[a,e,i,o,u,A,E,I,O,U] {v++;}
[a-zA-Z] {c++;}
%%
int main()
{
yylex();
printf("Number of Vowels: %d",v);
printf("Number of Consonants: %d",c);
return 0;
}
PRACTICAL 7 Write any two program using MACRO in C language
PRACTICAL 8 Write a program using lex and yacc to check whether a given string is a correct variable or not
file1.l
%{
#include"y.tab.h"
%}
%%
[0-9]+ {return D;}
[a-zA-Z]+ {return L;}
%%
file2.y
%{
#include<stdio.h>
%}
%token L D
%%
v:L|L rest;
rest:L rest|D rest|L|D;
%%
int main()
{
yyparse();
printf("Variable is valid");
}
int yyerror(char *s)
{
printf("Variable is not valid ");
}
Compilation step
yacc -d file2.y
lex file1.l
gcc lex.yy.c y.tab.c -ll
./a.out
PRACTICAL 9 Write a program using lex and yacc to check given arithmetic expression is valid or not
file1.l
%{
#include "y.tab.h"
%}
%%
[0-9]+ {return digit;}
[a-zA-Z]+ {return id;}
@,. {return yytext[0];}
%%
#include "y.tab.h"
%}
%%
[0-9]+ {return digit;}
[a-zA-Z]+ {return id;}
@,. {return yytext[0];}
%%
file2.y
%{
#include <stdio.h>
%}
%token id digit;
%left '+''-''*''/'
%%
expr:expr'+'expr|expr'-'expr|expr'*'expr|expr'/'expr|digit|id;
%%
void main()
{
printf("Enter any arithmetic expression:");
yyparse();
printf("Expression is valid");
}
int yyerror(char *s)
{
printf("Expression is not valid");
exit(0);
}
#include <stdio.h>
%}
%token id digit;
%left '+''-''*''/'
%%
expr:expr'+'expr|expr'-'expr|expr'*'expr|expr'/'expr|digit|id;
%%
void main()
{
printf("Enter any arithmetic expression:");
yyparse();
printf("Expression is valid");
}
int yyerror(char *s)
{
printf("Expression is not valid");
exit(0);
}
Compilation step
yacc -d file2.y
lex file1.l
gcc lex.yy.c y.tab.c -ll
./a.out
PRACTICAL 9 Write a program using lex to count the positive and negative number in the input
%{
int postiveno=0;
int negtiveno=0;
%}
%%
[+]?[0-9]+ postiveno++;
[-][0-9]+ negtiveno++;
. ;
%%
void main()
{
yylex();
printf("\nNo. of positive numbers: %d",postiveno);
printf("\nNo. of Negative numbers: %d",negtiveno);
}
PRACTICAL 10 Write a program using lex to count the no of printf and scanf and replace them with other string
%{
int p=0,s=0;
%}
%%
"printf" {p++;fprintf(yyout,"%s","cout");}
"scanf" {s++;fprintf(yyout,"%s","cin");}
%%
void main()
{
yyin=fopen("1.c","r");
yyout=fopen("2.c","w");
yylex();
printf("total scanf in a prg is=%d\n",s);
printf("total printf in a prg is=%d\n",p);
}
PRACTICAL 9 Write a program using lex to count the positive and negative number in the input
%{
int postiveno=0;
int negtiveno=0;
%}
%%
[+]?[0-9]+ postiveno++;
[-][0-9]+ negtiveno++;
. ;
%%
void main()
{
yylex();
printf("\nNo. of positive numbers: %d",postiveno);
printf("\nNo. of Negative numbers: %d",negtiveno);
}
PRACTICAL 10 Write a program using lex to count the no of printf and scanf and replace them with other string
%{
int p=0,s=0;
%}
%%
"printf" {p++;fprintf(yyout,"%s","cout");}
"scanf" {s++;fprintf(yyout,"%s","cin");}
%%
void main()
{
yyin=fopen("1.c","r");
yyout=fopen("2.c","w");
yylex();
printf("total scanf in a prg is=%d\n",s);
printf("total printf in a prg is=%d\n",p);
}
1.c needs to have a c program. It has to be created before running above code