Tutorial

From a user's perspective, the interesting modules are parse_asl_file, which exports functionality to parse the different kinds of ASL files, and asl2c, which exports a function (asl_to_c()) that converts ASL pseudocode to C code. In this short tutorial we will look at how to combine both modules to produce a C interpreter for the instruction set discussed in the introduction section.

For reference, the decoder file is:

__decode arithmetic
    case (30 +: 2) of
        when ('00') => __UNUSED
        when ('01') =>
            __field op 29 +: 2
            __field operand1 8 +: 8
            __field operand2 0 +: 8
            case (op, operand1, operand2) of
                when ('10', _, _) => __encoding subtract
                when ('11', _, _) => __encoding add
        when ('10') =>
            __field op 24 +: 8
            __field operand 16 +: 8
            case () of
                when () => __encoding increment
        when ('11') => __UNUSED

and the instruction file is:

__instruction addition
    __encoding add
        __decode
            integer op1 = UInt(operand1);
            integer op2 = UInt(operand2);

    __encoding increment
        __decode
            integer op1 = UInt(operand);
            integer op2 = 1;

    __execute
        integer result = op1 + op2;

__instruction subtraction
    __encoding subtract
        __decode
            integer op1 = UInt(operand1);
            integer op2 = UInt(operand2);

    __execute
        integer result = op1 - op2;

First we parse the instruction file to get the decode and execute code associated with each instruction. All we have to do here is fill two maps: one from encoding name to decode code, the other from encoding name to execute code. Usually you would just save the ASL code and transform it later, once the context and the available variable names are known, but this example is so simple that we can do the transformation right in the listener for the instruction file.

To get the C code from the ASL, we use the asl_to_c() function.

The instruction listener would probably look something like this:

class InstrListener(NopInstrsListener):

    def __init__(self):
        self.encodings = []
        self.decode_map = {}
        self.execute_map = {}

    def listen_instruction(self, name):
        self.encodings = []
        return True

    def listen_encoding(self, name):
        self.encodings.append(name)
        return True

    def listen_decode(self, code):
        variables, c_code = asl_to_c(code, [])
        # Variables initialized to constants do not appear in c_code,
        # so declare them by hand. Only integers are handled here.
        decls = []
        for varname, var in variables.items():
            if not var[0] and var[1] is not None and var[2] is not None:
                if var[1].type == ASLType.Kind.int:
                    decls.append("int64_t {0} = {1};"
                                 .format(varname, var[2]))
        c_code = decls + c_code
        self.decode_map[self.encodings[-1]] = c_code

    def listen_execute(self, code):
        # Here we don't expect any declarations
        c_code = asl_to_c(code, [("result", 64)])[1]
        for encoding in self.encodings:
            self.execute_map[encoding] = c_code

As documented, asl_to_c() returns the variables it found together with a list of C code lines, which we store in the corresponding map. In listen_decode(), some of the variables may be initialized to constants and therefore don't appear in the C code. Those variables need to be added to the code manually. Here we take a shortcut and assume that all variables are integers.

Now all that is left is to parse the decoder file with another listener, this time generating code as we go.

The decoder listener looks something like this:

class DecoderListener(NopDecodeListener):

    def __init__(self, decodemap, executemap):
        self.decode_map = decodemap
        self.execute_map = executemap
        self.fields_stack = []

    def listen_case(self, fields):
        self.fields_stack.append([f.name if f.name
                                  else "((word >> {0}) & ((1 << {1}) - 1))"
                                  .format(f.start, f.run)
                                  for f in fields])
        return True

    def after_listen_case(self, fields):
        print('{\nassert(0);\n}')
        self.fields_stack.pop()

    def listen_when(self, values):
        components = []
        for i, value in enumerate(values):
            if value.value is not None:
                components.append("{0} == {1}"
                                  .format(self.fields_stack[-1][i],
                                          int(value.value, 2)))

        if not components:
            print("if (1) {")
        else:
            print("if (({0})) {{".format(") && (".join(components)))
        return True

    def after_listen_when(self, values):
        print("} else")

    def listen_field(self, name, start, run):
        print("u_int64_t {0} = (word >> {1}) & ((1 << {2}) - 1);"
              .format(name, start, run))

    def listen_encoding(self, name):
        print("\n".join(self.decode_map[name]))
        print("\n".join(self.execute_map[name]))

    def listen_unused(self):
        print('assert(0);')
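
To see what listen_when() emits without running the parser, the condition-building logic can be pulled out into a standalone function. This is only a sketch: when_condition() and field_expr() are hypothetical helper names, and plain bit strings (or None for the wildcard) stand in for the parser's value objects.

```python
def field_expr(start, run):
    """C expression extracting `run` bits of `word`, starting at bit `start`."""
    return "((word >> {0}) & ((1 << {1}) - 1))".format(start, run)


def when_condition(fields, patterns):
    """Build the C `if` line for one `when` clause.

    fields   -- C expressions or variable names, one per case field
    patterns -- bit strings such as '10', or None for the wildcard `_`
    """
    components = ["{0} == {1}".format(field, int(pattern, 2))
                  for field, pattern in zip(fields, patterns)
                  if pattern is not None]
    if not components:
        # Nothing to compare, so the branch is always taken.
        return "if (1) {"
    return "if (({0})) {{".format(") && (".join(components))


# Top-level case of the decoder: `case (30 +: 2) of ... when ('10')`
print(when_condition([field_expr(30, 2)], ["10"]))
# Nested case on named fields: `when ('11', _, _)`
print(when_condition(["op", "operand1", "operand2"], ["11", None, None]))
# `case () of when ()` has nothing to compare
print(when_condition([], []))
```

Wildcards simply drop out of the condition, which is why the `when ()` clause of the increment encoding degenerates to `if (1) {`.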

This leaves us with the code that puts all the pieces together. In essence, it prints a static prefix and a static suffix, between which the generated code is emitted. Note that the generated code is quite unreadable because it isn't indented. The appropriate amount of indentation can be inferred from the depth of the field stack.
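
One way to get indented output is to route every print() through a small helper that tracks the nesting depth. The sketch below is an assumption about how this could be wired up, not part of the library: Emitter and its methods are hypothetical names. Note that it counts braces in the emitted lines rather than reading the field-stack depth, which is a swapped-in but closely related way to follow the nesting.

```python
class Emitter:
    """Collects generated C lines and indents them by brace nesting depth."""

    def __init__(self, indent="    "):
        self.indent = indent
        self.depth = 0
        self.lines = []

    def emit(self, text):
        # Multi-line snippets are indented line by line.
        for line in text.split("\n"):
            # A line that starts with '}' closes a block, so dedent first.
            if line.startswith("}"):
                self.depth = max(self.depth - 1, 0)
            # Leave empty lines empty instead of padding them with spaces.
            self.lines.append(self.indent * self.depth + line if line else "")
            # A line that ends with '{' opens a block for what follows.
            if line.endswith("{"):
                self.depth += 1

    def result(self):
        return "\n".join(self.lines)


out = Emitter()
out.emit("if ((op == 2)) {")
out.emit("int64_t op2 = 1;")
out.emit("} else")
out.emit("{\nassert(0);\n}")
print(out.result())
```

In the listeners, each print(...) call would become a call to emit(...) on a shared Emitter, and the collected text would be printed once at the end.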

The driver code would probably look something like this:

prefix = """#include <stdio.h>
#include <sys/types.h>
#include <assert.h>

#define UInt(num) ((u_int64_t)(num))

int main() {
int64_t result;
u_int32_t word = 0x80FE0000;
"""
print(prefix)

l1 = InstrListener()
parse_asl_instructions_file('arithmetic-instrs.asl', l1)

l2 = DecoderListener(l1.decode_map, l1.execute_map)
parse_asl_decoder_file('arithmetic-decoder.asl', l2)

suffix = """printf("0x%lx\\n", result);
return 0;
}
"""
print(suffix)

Despite taking some shortcuts here and there, we have seen a complete example of generating an interpreter for this small instruction set. Once compiled and run, the generated code prints 0xff.
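
As a cross-check on that value, the decoder and instruction files can be followed by hand in a few lines of Python. This is a hand-written model of the two ASL files above, not output of the library; bits() and interpret() are hypothetical names.

```python
def bits(word, start, run):
    """Return `run` bits of `word` starting at bit `start` (ASL `start +: run`)."""
    return (word >> start) & ((1 << run) - 1)


def interpret(word):
    """Decode and execute one 32-bit instruction word, returning `result`."""
    top = bits(word, 30, 2)
    if top == 0b01:
        op = bits(word, 29, 2)
        op1 = bits(word, 8, 8)   # operand1
        op2 = bits(word, 0, 8)   # operand2
        if op == 0b10:           # __encoding subtract
            return op1 - op2
        if op == 0b11:           # __encoding add
            return op1 + op2
    elif top == 0b10:            # __encoding increment
        op1 = bits(word, 16, 8)  # operand
        return op1 + 1
    # '00' and '11' are __UNUSED; the generated C code asserts here.
    raise ValueError("undefined instruction: 0x{0:08x}".format(word))


print(hex(interpret(0x80FE0000)))  # the word hard-coded in the prefix
```

For the word 0x80FE0000, the top two bits are '10', so the increment encoding is selected; operand is 0xFE, and the result is 0xFE + 1 = 0xff.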