Tutorial
From a user's perspective, the interesting modules are parse_asl_file, which exports functionality to parse the different kinds of ASL files, and asl2c, which exports a function (asl_to_c()) that converts pseudocode ASL to C code. In this short tutorial we will look at how to integrate both modules in order to produce a C interpreter for the instruction set discussed in the introduction section.
For reference, the decoder file is:

    __decode arithmetic
        case (30 +: 2) of
            when ('00') => __UNUSED
            when ('01') =>
                __field op 29 +: 2
                __field operand1 8 +: 8
                __field operand2 0 +: 8
                case (op, operand1, operand2) of
                    when ('10', _, _) => __encoding subtract
                    when ('11', _, _) => __encoding add
            when ('10') =>
                __field op 24 +: 8
                __field operand 16 +: 8
                case () of
                    when () => __encoding increment
            when ('11') => __UNUSED

and the instruction file is:

    __instruction addition
        __encoding add
            __decode
                integer op1 = UInt(operand1);
                integer op2 = UInt(operand2);
        __encoding increment
            __decode
                integer op1 = UInt(operand);
                integer op2 = 1;
        __execute
            integer result = op1 + op2;
    __instruction subtraction
        __encoding subtract
            __decode
                integer op1 = UInt(operand1);
                integer op2 = UInt(operand2);
        __execute
            integer result = op1 - op2;

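Both files slice the 32-bit instruction word with the notation `start +: run`, which the decoder listener later in this tutorial turns into a shift and a mask. The following is a minimal Python sketch of that extraction, assuming `start` is the lowest bit of the slice (as the listener's `word >> start` implies). The helper name `extract_field` is made up for illustration and is not part of the library; the word is the one used at the end of the tutorial.

```python
def extract_field(word, start, run):
    """Return the `run` bits of `word` whose lowest bit is `start`."""
    return (word >> start) & ((1 << run) - 1)


word = 0x80FE0000
top = extract_field(word, 30, 2)      # selector of the outer case (30 +: 2)
operand = extract_field(word, 16, 8)  # __field operand 16 +: 8

print(format(top, "02b"), hex(operand))  # prints: 10 0xfe
```

A selector of `10` picks the third `when` branch of the decoder file, so this word is decoded as the increment encoding.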
First we parse the instruction file to get the decode and execute code associated with each instruction. All we have to do here is fill two maps: one from encoding name to decode code, the other from encoding name to execute code. Usually you would just save the ASL code and transform it later, once the context and the available variable names are known, but the example here is so simple that we can do the transformation right in the listener of the instruction file.
To get the C code from the ASL, we will use the asl_to_c() function.
The instruction listener would probably look something like this:

    class InstrListener(NopInstrsListener):
        def __init__(self):
            self.encodings = []    # encodings seen in the current instruction
            self.decode_map = {}   # encoding name -> C lines of the decode code
            self.execute_map = {}  # encoding name -> C lines of the execute code

        def listen_instruction(self, name):
            # A new instruction starts: forget the previous encodings.
            self.encodings = []
            return True

        def listen_encoding(self, name):
            self.encodings.append(name)
            return True

        def listen_decode(self, code):
            vars, c_code = asl_to_c(code, [])
            decls = []
            for varname, var in vars.items():
                # Variables initialized to a constant are missing from c_code,
                # so declare them by hand (shortcut: integers only).
                if not var[0] and var[1] is not None and var[2] is not None:
                    if var[1].type == ASLType.Kind.int:
                        decls.append("int64_t {0} = {1};"
                                     .format(varname, var[2]))
            c_code = decls + c_code
            # The decode block belongs to the most recently seen encoding.
            self.decode_map[self.encodings[-1]] = c_code

        def listen_execute(self, code):
            # Here we don't expect any declarations
            c_code = asl_to_c(code, [("result", 64)])[1]
            # The execute block is shared by all encodings of the instruction.
            for encoding in self.encodings:
                self.execute_map[encoding] = c_code

As documented, asl_to_c() returns a list of C code lines (alongside the variables it found), which we store in the corresponding map. In the case of listen_decode(), some of the variables might be initialized to constants and therefore don't appear in the C code. Those variables need to be added to the code manually. Here we take a shortcut and assume all variables are of type integer.
Now all that is left is to parse the decoder file with another listener, this time generating code as we go.
The decoder listener looks something like this:

    class DecoderListener(NopDecodeListener):
        def __init__(self, decodemap, executemap):
            self.decode_map = decodemap
            self.execute_map = executemap
            self.fields_stack = []  # one list of C expressions per open case

        def listen_case(self, fields):
            # Named fields are referenced by name; anonymous slices are
            # extracted from the instruction word with a shift and a mask.
            self.fields_stack.append([f.name if f.name
                                      else "((word >> {0}) & ((1 << {1}) - 1))"
                                      .format(f.start, f.run)
                                      for f in fields])
            return True

        def after_listen_case(self, fields):
            # Closes the trailing "else" of the last when: nothing matched.
            print('{\nassert(0);\n}')
            self.fields_stack.pop()

        def listen_when(self, values):
            components = []
            for i, value in enumerate(values):
                # Values without a bit pattern (such as _) add no condition.
                if value.value is not None:
                    components.append("{0} == {1}"
                                      .format(self.fields_stack[-1][i],
                                              int(value.value, 2)))
            if not components:
                print("if (1) {")
            else:
                print("if (({0})) {{".format(") && (".join(components)))
            return True

        def after_listen_when(self, values):
            print("} else")

        def listen_field(self, name, start, run):
            print("u_int64_t {0} = (word >> {1}) & ((1 << {2}) - 1);"
                  .format(name, start, run))

        def listen_encoding(self, name):
            # Emit the decode code first, then the execute code.
            print("\n".join(self.decode_map[name]))
            print("\n".join(self.execute_map[name]))

        def listen_unused(self):
            print('assert(0);')

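To see what the `when` handling produces, the string-building step can be tried in isolation. The sketch below repeats the logic of listen_when() as a standalone function. It assumes plain binary strings (or `None` for a wildcard) in place of the parser's value objects, and `when_condition` is a hypothetical name that is not part of the library.

```python
def when_condition(field_exprs, patterns):
    """Build the C `if` line for one `when` clause.

    field_exprs: C expressions for the fields of the enclosing case.
    patterns: one binary string per field, or None for a wildcard (_).
    """
    components = [
        "{0} == {1}".format(expr, int(bits, 2))
        for expr, bits in zip(field_exprs, patterns)
        if bits is not None
    ]
    if not components:
        # A clause without constraints, such as `when ()`, always matches.
        return "if (1) {"
    return "if (({0})) {{".format(") && (".join(components))


# when ('10', _, _) inside case (op, operand1, operand2)
print(when_condition(["op", "operand1", "operand2"], ["10", None, None]))
# when () inside case ()
print(when_condition([], []))
```

The first call prints `if ((op == 2)) {` and the second prints `if (1) {`, matching the two shapes the listener can emit.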
This leaves us with the code that puts all the pieces together. In essence it prints a static prefix and a suffix, with the generated code emitted in between. Note that the generated code is hard to read because it is not indented; the appropriate amount of indentation could be inferred from the depth of the field stack.
The driver code would probably look something like this:

    # Static C prologue: includes, the UInt() helper and the word to decode.
    prefix = """#include <stdio.h>
    #include <sys/types.h>
    #include <assert.h>
    #define UInt(num) ((u_int64_t)(num))
    int main() {
    int64_t result;
    u_int32_t word = 0x80FE0000;
    """
    print(prefix)

    # First pass: collect the C code of every encoding.
    l1 = InstrListener()
    parse_asl_instructions_file('arithmetic-instrs.asl', l1)

    # Second pass: walk the decoder and print the if/else tree.
    l2 = DecoderListener(l1.decode_map, l1.execute_map)
    parse_asl_decoder_file('arithmetic-decoder.asl', l2)

    suffix = """printf("0x%lx\\n", result);
    return 0;
    }
    """
    print(suffix)

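The missing indentation could be added with a small helper that wraps the bare print() calls of the decoder listener. This is only a sketch: `emit` is a hypothetical helper, not part of the library, and it assumes the listener would pass something like `len(self.fields_stack)` as the depth.

```python
def emit(text, depth, width=4):
    """Return `text` with every non-empty line indented by `depth` levels."""
    pad = " " * (width * depth)
    return "\n".join(pad + line if line else line for line in text.split("\n"))


# Example: three generated lines at nesting depths 1, 2 and 1.
print(emit("if ((op == 2)) {", 1))
print(emit("int64_t result = op1 - op2;", 2))
print(emit("} else", 1))
```

Returning the string instead of printing it keeps the helper easy to check and leaves the choice of output stream to the caller.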
Despite taking some shortcuts here and there, we have seen a complete example of generating an interpreter for the small instruction set. For the word 0x80FE0000, the generated program prints 0xff.
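The expected result can be cross-checked without compiling anything. The following Python sketch is a hand translation of the decode tree and the three encodings from the two ASL files above. It is not output of asl_to_c(), and the function names `bits` and `interpret` are invented for this illustration.

```python
def bits(word, start, run):
    """Extract `run` bits of `word`, starting at bit `start`."""
    return (word >> start) & ((1 << run) - 1)


def interpret(word):
    """Decode one 32-bit word following the decoder file and execute it."""
    top = bits(word, 30, 2)
    if top == 0b01:
        op = bits(word, 29, 2)
        op1, op2 = bits(word, 8, 8), bits(word, 0, 8)
        if op == 0b10:
            return op1 - op2  # __encoding subtract
        if op == 0b11:
            return op1 + op2  # __encoding add
    elif top == 0b10:
        return bits(word, 16, 8) + 1  # __encoding increment
    raise ValueError("unused encoding")  # the __UNUSED branches


print(hex(interpret(0x80FE0000)))  # prints: 0xff
```

The top two bits of 0x80FE0000 are `10`, so the word is an increment whose operand is 0xFE, which gives 0xFF.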