Description College of Computing and Informatics Project Deadline: Sunday 12/05/2024 @ 23:59 [Total Mark is 14] Student

Description

College of Computing and Informatics
Project
Deadline: Sunday 12/05/2024 @ 23:59
[Total Mark is 14]
Student Details:
CRN:
Name:
Name:
Name:
ID:
ID:
ID:
Instructions:
• You must submit two separate copies (one Word file and one PDF file) using the Assignment Template on
Blackboard via the allocated folder. These files must not be in compressed format.
• It is your responsibility to check and make sure that you have uploaded both the correct files.
• Zero mark will be given if you try to bypass the SafeAssign (e.g. misspell words, remove spaces between
words, hide characters, use different character sets, convert text into image or languages other than English
or any kind of manipulation).
• Email submission will not be accepted.
• You are advised to make your work clear and well-presented. This includes filling your information on the cover
page.
• You must use this template, failing which will result in zero mark.
• You MUST show all your work, and text must not be converted into an image, unless specified otherwise by
the question.
• Late submission will result in ZERO mark.
• The work should be your own, copying from students or other resources will result in ZERO mark.
• Use Times New Roman font for all your answers.
General Instructions
Pg. 01
General Instructions







You are allowed to work in a group of 3 students for this assessment activity.
Each submitted part of this Project should contain two components:
o A brief report describing the development progress
o Source code of implementation
The source code should be separated into modules using an understandable coding style
and naming rules.
The report should contain instructions about how to run your code. If any extra environment
needs to be setup for your code, please give full instruction in your report.
The report should list all your test cases, including the expected result for each test case.
If we cannot run your code, a demo will be required and you will be notified through email.
You can use any programming language for implementation in the project e.g. Java, C etc.
The submission must contain a .zip file:


of all your code
a set of input files to be used for testing purpose, as well as a printout of the resulting output
of the program for each input file
Grading Scheme:
Work
Component
Part 1
Part 2
Part 3
Part 4
Deliverables
1.
2.
1.
2.
1.
2.
1.
2.
Documentation
Implementation
Documentation
Implementation
Documentation
Implementation
Documentation
Implementation
Marks / Comments
3.5
3.5
3.5
3.5
If the deliverables are complete and the
scanner is working as expected.
If the deliverables are complete and syntax
analysis phase is implemented correctly
If the deliverables are complete and semantic
analysis phase is implemented correctly
If the deliverables are complete and code
generation phase is implemented correctly
Part 1: Scanner
Pg. 02
Part 1: Scanner
Design and implement a scanner for a programming language whose lexical specifications are
given below. The scanner identifies and outputs tokens (valid words and punctuation) in the
source program. Its output is a token that can thereafter be used by the syntax analyzer to verify
that the program is syntactically correct.
When invoked, the scanner should extract the next token from the source program and should
be able to output a token even if the input does not form a correct program. The syntax of the
language will be specified later in Part 2.
Atomic lexical elements of the language:
id ::= letter alphanum*
alphanum ::= letter | digit | _
integer ::= nonzero digit* | 0
float ::= integer fraction [e[+|−] integer]
fraction ::= .digit* nonzero | .0
letter ::= a..z |A..Z
digit ::= 0..9
nonzero ::= 1..9
Operators, punctuation and reserved words:
==

=
;
,
.
:
::
+
*
/
=
&&
!
||
(
)
{
}
[
]
/*
*/
//
if
then
else
for
class
integer
float
read
write
return
main
Notes:



You are responsible for providing appropriate test cases that test for a wide variety of valid
and invalid cases.
It is up to you to analyze this lexical definition and figure out if there are ambiguities or other
errors. Any changes to this definition, if any, must be justified and should not result in
diminishing the expressive power of the language. Also, you have to design the lexical
analyzer in a flexible way, so that it can easily be adapted if changes are needed during the
designing of syntax analyzer.
The tokens // and /* followed */ by denote comments in the source code
Part 1: Scanner
Pg. 03
Deliverables:
Document
• Lexical specifications: Identify the lexical specification you used in the design of the
scanner, as well as any changes that you may have applied to original lexical specifications.
• Finite state automaton: Finite state machine describing the operation of your scanner.
• Error reporting and recovery: Identify all the possible lexical errors that the scanner can
encounter. Identify and explain the error recovery technique that you implement.
• Design: Give a brief overview of the overall structure of your solution, as well as a brief
description of the role of each component of your implementation.
• Implementation Tools: Identify all the tools/libraries/techniques that you have used in your
analysis or implementation and justify why you have used these particular ones as opposed
to others.
Implementation
1. Scanner:
1.1. Develop a scanner that recognizes the above-mentioned tokens.
1.2. It should be a function that returns a data structure containing the information about the
next token identified in the source program file.
1.3. The data structure should contain information such as (1) the token type (2) its value
(or lexeme) when applicable and (3) its location in the source code. Fully describe this
structure in your documentation.
2. Driver:
2.1. Include a driver that repeatedly calls the lexical analysis function and prints the token
type of each token until the end of the source program is reached.
2.2. The scanner should optionally print the token stream to a file in the AtoCC format.
2.3. Another file (for verification purposes) should contain representative error messages
each time an error is encountered in the input program.
2.4. The scanner should not stop after encountering an error. Error messages should be
clear and should identify the location of the error in the source code.
3. Test cases:
3.1. Include many test cases that test a wide variety of valid and invalid cases.
Selected examples:
0123
[Invalid number:0123] OR [integer:0][integer:123]
01.23
12.340
12345.6789e-123
12345
abc
abc_1
_abc1
1abc
[Invalid number:01.23] OR [integer:0][float:1.23]
[Invalid number:12.340] OR [float:12.34][integer:0]
[float:12345.6789e-123]
[integer:12345]
[id:abc]
[id:abc_1]
[Invalid identifier:_abc1]
[Invalid identifier:1abc] OR [integer:1][id:abc]
Part 2: Syntax Analysis
Pg. 04
Part 2: Syntax Analysis
This section is about the design and implement a syntax analyzer for the language specified by
the grammar specified on the next page.
Deliverables:
Documentation
1. Transformed Grammar into LL(1):
1.1. Remove all the EBNF notations and replace them by right-recursive list-generating
productions.
1.2. Analyze the syntactical definition. List in your documentation all the ambiguities and left
recursions or any error you may have found in the grammar.
1.3. Modify the productions so that the left recursions and ambiguities are removed without
modifying the language.
1.4. Include in your documentation the set of productions that can be parsed using the topdown predictive parsing method, i.e. an LL(1) grammar.
2. FIRST and FOLLOW sets:
2.1. Derive the FIRST and FOLLOW sets for each non-terminal in your transformed
grammar.
3. Design:
3.1. Give a brief overview of the overall structure of your solution, as well as a brief
description of the role of each component of your implementation.
4. Implementation Tools:
4.1. Identify all the tools/libraries/techniques that you have used in your analysis or
implementation and justify why you have used these particular ones as opposed to
others.
Implementation
• Parser: Implement a predictive parser (recursive descent or table-driven) for your modified
set of grammar rules. The result of the parser should be the creation of a tree data structure
representing the parse tree as identified by the parsing process. This tree will become the
intermediate representation used by the two following assignments.
• Derivation Output: Your parser should write to a file the derivation that proves that the source
program can be derived from the starting symbol.
• Error-Reporting: Parser should properly identify all errors in the input program and print a
meaningful message to the user for each error encountered. The error messages should be
informative on the nature of the errors, as well as the location of errors in the input file.
• Error Recovery: The parser should implement an error recovery method that permits to
report all errors present in the source code.
• Test Cases: Write a set of source files that enable to test the parser for all syntactical
structures involved in the language. Include cases testing for a variety of different errors to
demonstrate the accuracy of your error reporting and recovery.
• Note that the grammar analysis/transformation process can be greatly simplified by the use
of tools such as the kfgEdit AtoCC tool.
Part 2: Syntax Analysis
Pg. 05
Grammar:
The syntactical definition is using the following conventions:
• Terminals (lexical elements, or tokens) are represented in single quotes ‘thisWay’.
• Non-terminals are represented in italics, thisWay.
• The empty phrase is represented by EPSILON.
• EBNF-style repetition notation is represented using curly brackets {This way}. It
represents zero or more occurrence of the sentential form enclosed in the brackets.
• EBNF-style optionality notation is represented using square brackets [This way]. It
represents zero or one occurrence of the sentential form enclosed in the brackets.
• The non-terminal is the starting symbol of the grammar.
Except from the EBNF constructs, the grammar is expressed using the syntax used by the
kfgEdit AtoCC toolkit application.
prog -> {classDecl} {funcDef} ‘main’ funcBody ‘;’
classDecl -> ‘class’ ‘id’ [‘:’ ‘id’ {‘,’ ‘id’}] ‘{‘ {varDecl} {funcDecl} ‘}’ ‘;’
funcDecl -> type ‘id’ ‘(‘ fParams ‘)’ ‘;’
funcHead -> type [‘id’ ‘sr’] ‘id’ ‘(‘ fParams ‘)’
funcDef -> funcHead funcBody ‘;’
funcBody -> ‘{‘ {varDecl} {statement} ‘}’
varDecl -> type ‘id’ {arraySize} ‘;’
statement -> assignStat ‘;’
| ‘if’ ‘(‘ expr ‘)’ ‘then’ statBlock ‘else’ statBlock ‘;’
| ‘for’ ‘(‘ type ‘id’ assignOp expr ‘;’ relExpr ‘;’ assignStat ‘)’ statBlock ‘;’
| ‘read’ ‘(‘ variable ‘)’ ‘;’
| ‘write’ ‘(‘ expr ‘)’ ‘;’
| ‘return’ ‘(‘ expr ‘)’ ‘;’
assignStat -> variable assignOp expr
statBlock -> ‘{‘ {statement} ‘}’ | statement | EPSILON
expr -> arithExpr | relExpr
relExpr -> arithExpr relOp arithExpr
arithExpr -> arithExpr addOp term | term
sign -> ‘+’ | ‘-‘
term -> term multOp factor | factor
factor -> variable
| functionCall
| ‘intNum’ | ‘floatNum’
| ‘(‘ arithExpr ‘)’
| ‘not’ factor
| sign factor
variable -> {idnest} ‘id’ {indice}
functionCall -> {idnest} ‘id’ ‘(‘ aParams ‘)’
idnest -> ‘id’ {indice} ‘.’
| ‘id’ ‘(‘ aParams ‘)’ ‘.’
indice -> ‘[‘ arithExpr ‘]’
arraySize -> ‘[‘ ‘intNum’ ‘]’
type -> ‘integer’ | ‘float’ | ‘id’
fParams -> type ‘id’ {arraySize} {fParamsTail} | EPSILON
aParams -> expr {aParamsTail} | EPSILON
fParamsTail -> ‘,’ type ‘id’ {arraySize}
aParamsTail -> ‘,’ expr
assignOp -> ‘=’
relOp -> ‘eq’ | ‘neq’ | ‘lt’ | ‘gt’ | ‘leq’ | ‘geq’
addOp -> ‘+’ | ‘-‘ | ‘or’
multOp -> ‘*’ | ” | ‘and’
Part 2: Syntax Analysis
Pg. 06
Tokens
Keywords:
Operators:
main | class |
if | then | else | for | read | write | return |
integer | float
eq (==) | neq () | lt ()
| leq (=)
+
||*
|/
not (!)
| and (&&)
| or (||)
=
sr (::)
Punctuation:
: | , | . | ; | [ | ] | { | } | ( | )
id:
floatNum:
intNum:
follows specification for program identifiers found in Part 1
follows specification for float literals found in Part 1
follows specification for integer literals found in Part 1
Part 3: Semantic Analysis
Pg. 07
Part 3: Semantic Analysis
In this section, you have to implement a semantic analyzer for the language described in Part 2.
This in turn implies implementation of two interrelated sub-phases:
1. Implementing semantic actions for the generation of symbol tables
2. Implementing semantic checking and type-checking semantic actions.
Some of the characteristics of the language semantics and the tasks required for their
implementation/documentation are as follows:
• The symbol table helps resolving the typing and cross-referencing of the identifiers used in
the program, taking into consideration the typing and scoping of the identifiers as expressed
in the analyzed program.
• The symbol table contains an entry for all identifiers (variables, functions, classes) defined
in its own scope. There are scopes for each class definition, free or member function
definition, and a global scope for the whole program.
• A program includes a set of free functions including only one main function. Information
about the free functions must be available in the symbol table before function calls can be
semantically verified. In order to allow functions to be called before they are defined, you
need to implement at least two passes:
o Pass 1: Constructs the symbol tables
o Pass 2: Uses the information generated in the first pass.
• Classes represent the encapsulation of a user-defined data type and declarations for its
member functions. The member functions are all defined before the free functions. As with
free functions, two passes (as described above) have to be used to allow a class to refer to
another class that is declared after it. As member function are only declared in the class
declaration and defined later, two passes are necessary to link the function declaration’s
symbol table entry with its corresponding local symbol table.
• If a class has an inheritance list, the symbol table of its directly inherited class(es) should be
linked in this class, so that inherited members are effectively considered as members of the
class, even though they are part of a separate scope. Inherited members with the same
name and kind (i.e. variable or function) as a class member should be shadowed by the
class member. In any case, circular class dependencies must be reported a semantic error.
• It is allowed to have two members with the same name in two different classes or functions,
as they are not defined in the same scope.
• There is no point in checking for indexes out of bounds, as indexes can be expressions,
whose value cannot be determined at compile time.
• Variables declared inside functions (local variables) or classes (data members) are
considered local and thus can only be used in the current function or class scope. Data
members can be used in all member functions of their respective class. This requires a
nested symbol table structure:
o The global symbol table, representing all the symbols defined in the global scope,
exists until the end of the compilation process.
o All local symbol tables represent sub-scopes and should be bound to their respective
elements in the current symbol table.
o A local symbol table is created at the beginning of the compilation of any function or
class. Therefore, you have to associate with each variable and function identifier a
symbol table record that contains its properties, and include it in the symbol table
representing the scope in which this identifier is declared. You have to keep in mind
that you might have to change the structure of these records later in the design of
Part 3: Semantic Analysis
Pg. 08

code generation phase of the compiler. Make sure you can change the record
structure with minimal changes to your symbol table manipulating functions.
When using operators in expressions, attribute migration must be done to determine the type
of sub-expressions. For simplicity of code generation later, it is suggested that it should be
semantically invalid to have operands of arithmetic operators to be of different types. For the
same reason, both operands of an assignment must be of the same type.
Deliverables:
Implementation
1. Symbol table creation phase:
1.1. Implement the data structures and functions for the symbol tables so that they can
support the type checking semantic actions.
1.2. Using syntax-directed translation, implement AST traversal that activates semantic
actions so that:
1.2.1.A new table is created at the beginning of the program for the global scope.
1.2.2.A new entry is created in the global table for each class declared in the program.
These entries should contain links to local tables for these classes.
1.2.3.An entry in the appropriate table is created for each variable defined in the
program, i.e. class data members or function’s local variables.
1.2.4.An entry in the appropriate table is created for each function definition (free
functions and member functions). These entries should be links to local tables for
these functions.
1.2.5.During symbol table creation, there are some semantic errors that are detected
and reported, such as multiply declared identifiers in the same scope, as well
warnings for shadowed inherited members.
1.2.6.All declared member functions should have a corresponding function definition.
1.2.7.The content of the symbol tables should be output into a file in order to
demonstrate their correctness/completeness.
1.2.8.In functions, (including member functions in classes) variables should be defined
before being used in the statements. If not, an “undeclared variable” error
message is issued.
1.2.9.An identifier cannot be declared twice in the same scope. In such a case, a
“multiply declared identifier” message is issued.
2. Semantic/type checking phase:
2.1. Using syntax-directed translation, implement AST tree traversal that triggers semantic
actions so that the following semantic checks are done, which should result in error
reporting if the check fails:
2.1.1. Type checking is applied on expressions, as well as on assignment and return
statements.
2.1.2. Any id referred to is defined in the scope where it is used (failure results in the
following error messages: undefined local variable, undefined free function,
undefined member, undefined class)
2.1.3.Function calls are made with the right number and type of parameters.
Expressions passed as parameters in a function call must be of the same type as
declared in the function declaration.
Part 3: Semantic Analysis
Pg. 09
2.1.4. Referring to an array variable should be made using the same number of
dimensions as declared in the variable declaration. Expressions used as an
index must be of integer type.
2.1.5.Circular class dependencies (through data members\inheritance) are reported as
semantic error.
2.1.6.The “.” operator should be used only on variables of a class type. If so, its right
operand must be a member of that class. If not, a “undefined member” should be
issued.
2.1.7. The type of the operand of a return statement must be the same as declared in
its function’s return type, as declared in the function’s declaration.
3. Error reporting:
3.1. All semantic errors/warnings present in the entire program should be reported
properly, mentioning the location in the program source file where the error was
located.
3.2. Only existing errors should be reported.
3.3. Errors should be reported in synchronized order, even if different phases are
implemented and errors are found in different phases.
4. Output of symbol tables data structure: After the symbol table structure has been
completely created, the program should output it to a file, allowing to easily verify if the
symbol table correctly corresponds to the input program.
Documentation
1. Analysis:
1.1. Description of the semantic rules used in the implementation (see above).
2. Design:
2.1. Description of the purpose of each phase involved in the implementation. For each
phase, mapping of semantic actions to AST nodes, along with a description of the
effect/role of each semantic action.
2.2. Justification of the overall structure of the solution and the roles of the individual
components used in the applied solution.
3. Implementation Tools:
3.1. Description of tools/libraries/techniques used in the analysis/implementation.
3.2. Successful use of tools/libraries/techniques used in the analysis/implementation.
Example Symbol Table:
Part 4: Code Generation
Pg. 10
Part 4: Code Generation
In this section, you have to implement a code generation phase for the language as described
in sections 2 and 3. Below is a list of different specific constructs/concepts for which code
generation needs to be implemented for all the aspects of the language to become executable:
1. Memory allocation: As variables are declared, memory space must be allocated that will
eventually be used to store their values. Memory cells are either identified using a unique
Moon assembly code label, or an offset relative to the stack frame on which the variable will
reside during execution of the scope in which it is declared. The size of the allocated memory
block should depend on the declared type/kind of variable (integer, float, array, object, array
of objects, etc.)
1.1. Allocate memory for basic types (integer, float).
1.2. Allocate memory for arrays of basic types.
1.3. Allocate memory for objects.
1.4. Allocate memory for objects with inheritance.
1.5. Allocate memory for objects having object members.
1.6. Allocate memory for arrays of objects.
2. Functions: The code generated for functions should be associated with a Moon assembly
code subroutine, i.e. a block of code that can be jumped to using a label, then when the
function resolves, jump back to the instruction following the initial jump. Parameters should
be passed/received during the function call. The return value should be passed to the calling
function when the function resolves. In order to allow recursive function calls, a function call
stack mechanism needs to be implemented, which implies that memory allocation for
variables must all be done using offsets relative to the bottom of the current function’s stack
frame. If a call to a member function is made, the member function should have access to
the data members of the object from which it was called.
2.1. Branch to a function’s code block, execute the code, branch back to the calling function.
2.2. Branch to a function that has been branched upon.
2.3. Pass parameters as local values to the function’s code block.
2.4. Upon function resolution, pass the return value back to the calling function.
2.5. Function call stack mechanism and recursive function calls.
2.6. Call to member functions.
2.7. Call to deeply nested member function.
3. Statements: Implementation of Moon code for every kind of statement as defined in the
grammar: assignment conditional statement, loop statement, input/output statement, return
statement. Correct implementation the specific branching mechanisms for control flow
statements.
3.1. Assignment statement: correct assignment of the resulting value of an expression to a
variable, independently of what is the expression.
3.2. Conditional statement: correct implementation of branching mechanism, including for
imbricated conditional statements.
3.3. Loop statement: correct implementation of branching mechanism, including for
imbricated loop statements.
3.4. Input/output statement: execution of a read() statement should result in the user being
prompted for a value from the keyboard by the Moon program, and assign this value to
the parameter passed to the read() statement. Execution of a write() statement should
print to the Moon console the result of evaluating the expression passed as a parameter
to the write() statement.
Part 4: Code Generation
Pg. 11
3.5. Return statement: execution of the return statement should result in passing the value
of the expression passed to the return statement back to the calling function.
4. Aggregate data members access: Aggregate data types such as arrays and objects contain
a group of data values. Code must be generated so that contained member values in such
an aggregated value can be accessed when referred to as factors in an expression, or the
left-hand side of an assignment statement. This includes access to array elements and
object members.
4.1. For arrays of basic types (integer and float), access to an array’s elements, single or
multidimensional.
4.2. For arrays of objects, access to an array’s elements, single or multidimensional.
4.3. For objects, access to members of basic types.
4.4. For objects, access to members of array types, as well as the elements of the array.
4.5. For objects, access to members of object types, as well as the elements of this object.
4.6. For objects, access to deeply nested objects. 4.7 For objects, access to the members
of a superclass.
5. Expressions: Computing of the resulting value of an entire expression, including the simple
case when an expression is either a variable name of even a single literal value, up to
complex expressions involving a kind of operators, array indexed with expressions, deeply
nested object members, etc. This involves memory allocation for temporary results, register
allocation/deallocation
5.1. Computing the value of an entire expression.
5.2. Expression is a single variable or literal value.
5.3. Expressions involving all of: arithmetic, relational and logic operators in one expression
5.4. Expression involving an array factor whose indexes are themselves expressions.
5.5. Expression involving an object factor referring to deeply nested object members.
5.6. Memory allocation for temporary results.
5.7. Register allocation/deallocation scheme
Deliverables:
Documentation
1. Analysis
1.1. Description of the register allocation/deallocation scheme used in the implementation.
1.2. Description of the memory usage scheme used in the implementation (tag-based,
stack-based), including for temporary variables, function calls (free functions or member
functions, data members, and calculation of variables’ allocated memory).
1.3. Description of the purpose of each phase involved in the implemented code generation.
For each phase, mapping of semantic actions to AST nodes, along with a description
of the effect/role of each semantic action.
2. Design
2.1. Description/rationale of the overall structure of the solution and the roles of the
individual components used in the applied solution.
3. Implementation Tools
3.1. Description of tools/libraries/techniques used in the analysis/implementation.
3.2. Successful use of tools/libraries/techniques used in the analysis/implementation.

Purchase answer to see full
attachment

Share This Post

Email
WhatsApp
Facebook
Twitter
LinkedIn
Pinterest
Reddit

Order a Similar Paper and get 15% Discount on your First Order

Related Questions

Description Writing the final training report according to the attached instructions  Major: E-commerce,  Below you will find the directed tasks that

Description Writing the final training report according to the attached instructions  Major: E-commerce,  Below you will find the directed tasks that I perform in this training, and you can add some that are related to the Major. Collaborating with sellers to gather accurate and comprehensive product information, including descriptions, specifications,