Introduction

What is ASL?

ASL stands for ARM Specification Language. When the ARM instruction set specification was released in machine-readable format, tools were released alongside it that parse the specification and generate files in an intermediate language (think XML, but more readable) that is succinct and easy to understand. This language, called ASL, happens to be a superset of the language already used for the pseudocode that describes what the individual instructions do. Confusingly, the pseudocode language is also referred to as ASL (sometimes as “pseudocode ASL” or simply pseudocode).

The structure of ASL is quite similar to Python in that it uses indentation to structure the code.

An ASL file can have one of several structures, depending on what it describes.

Decoder files

One kind is the decoder ASL file, which describes the decoding tree used to identify an instruction. The top-level directive of a decoding tree is __decode, which specifies the “instruction set”. Inside it are nested case statements (each with a couple of when clauses) that together cover all possible encodings. The innermost when clauses, the leaves of the tree, either state that the encoding is invalid or name the encoding.

Instruction files

Another type of ASL file is the instruction file, in which all the encodings are described in more detail. For instance, each encoding can have associated pseudocode for encoding and decoding. Encodings can also be grouped into instructions, which share pseudocode for encoding/decoding as well as the pseudocode that describes the execution of the instruction.

For examples of how these files look, please take a look at the next section.

Example files

Let's look at a simple example of an encoding for three instructions:

  • add (opcode: 01100000)
  • increment (opcode: 10000000)
  • subtract (opcode: 01000000)

We are decoding a four-byte word whose first (most significant) byte holds the opcode. Increment takes the second byte as its operand. Add and subtract take the last two bytes as their operands.

The example decoder file would look like this:

__decode arithmetic
    case (30 +: 2) of
        when ('00') => __UNUSED
        when ('01') =>
            __field op 29 +: 2
            __field operand1 8 +: 8
            __field operand2 0 +: 8
            case (op, operand1, operand2) of
                when ('10', _, _) => __encoding subtract
                when ('11', _, _) => __encoding add
        when ('10') =>
            __field op 24 +: 8
            __field operand 16 +: 8
            case () of
                when () => __encoding increment
        when ('11') => __UNUSED
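
To make the bit ranges concrete, the decoding tree above can be mirrored in a few lines of Python. This is only an illustrative sketch, not part of the ASL tooling: the helper names bits and decode are hypothetical, and the sketch assumes that `lo +: width` selects `width` bits starting at bit `lo`, with bit 0 being the least significant bit of the 32-bit word.

```python
def bits(word, lo, width):
    """Return `width` bits of `word` starting at bit `lo` (assumed meaning of `lo +: width`)."""
    return (word >> lo) & ((1 << width) - 1)


def decode(word):
    """Walk the example decoding tree; return (encoding name, fields) or None if unused."""
    top = bits(word, 30, 2)                  # case (30 +: 2) of
    if top == 0b01:
        op = bits(word, 29, 2)               # __field op 29 +: 2
        fields = {
            "operand1": bits(word, 8, 8),    # __field operand1 8 +: 8
            "operand2": bits(word, 0, 8),    # __field operand2 0 +: 8
        }
        if op == 0b10:
            return "subtract", fields
        if op == 0b11:
            return "add", fields
        return None  # not reachable here: bit 30 is already known to be 1
    if top == 0b10:
        return "increment", {"operand": bits(word, 16, 8)}  # __field operand 16 +: 8
    return None      # '00' and '11' are __UNUSED


print(decode(0x60000203))  # add with operands 2 and 3
print(decode(0x80070000))  # increment with operand 7
```

Note that the op field (bits 29 and 30) overlaps the two bits already tested by the outer case, which is why only the values '10' and '11' can occur in the inner case.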

Since increment and add share some functionality, they can be grouped together in the instruction file.

For our hypothetical instruction set, the instruction file would look something like this:

__instruction addition
    __encoding add
        __decode
            integer op1 = UInt(operand1);
            integer op2 = UInt(operand2);

    __encoding increment
        __decode
            integer op1 = UInt(operand);
            integer op2 = 1;

    __execute
        integer result = op1 + op2;

__instruction subtraction
    __encoding subtract
        __decode
            integer op1 = UInt(operand1);
            integer op2 = UInt(operand2);

    __execute
        integer result = op1 - op2;
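
The grouping in the instruction file can be read as follows: each encoding has its own decode step that produces op1 and op2, while the execute step is shared by every encoding of the same instruction. The Python sketch below illustrates that reading. All names in it (decode_add, execute_addition, ENCODINGS, run) are hypothetical, the fields are assumed to arrive as non-negative integers (so UInt is a no-op), and the result is returned rather than stored, since the example ASL does not say where it goes.

```python
# Per-encoding decode steps: turn raw fields into the operands op1 and op2.
def decode_add(fields):
    return fields["operand1"], fields["operand2"]


def decode_increment(fields):
    return fields["operand"], 1      # increment is an addition with op2 fixed to 1


def decode_subtract(fields):
    return fields["operand1"], fields["operand2"]


# Per-instruction execute steps, shared by all encodings of that instruction.
def execute_addition(op1, op2):
    return op1 + op2


def execute_subtraction(op1, op2):
    return op1 - op2


# Encoding name -> (decode step of the encoding, execute step of its instruction).
ENCODINGS = {
    "add": (decode_add, execute_addition),
    "increment": (decode_increment, execute_addition),
    "subtract": (decode_subtract, execute_subtraction),
}


def run(encoding, fields):
    """Run the decode step of `encoding`, then the execute step of its instruction."""
    decode_step, execute_step = ENCODINGS[encoding]
    op1, op2 = decode_step(fields)
    return execute_step(op1, op2)


print(run("increment", {"operand": 7}))
```

Both add and increment map to execute_addition, which is the sharing that the __instruction addition block expresses.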

Processed ASL

Python-like indentation cannot be described by context-free grammars. This seems to be a problem, since context-free grammars are the standard tool for describing programming languages. Fortunately, a pre-processing step can turn Python-like indentation into something that a context-free grammar can describe. This step inserts an extra “BEGIN” token every time the indentation increases and an “END” token every time it decreases. Additionally, for easier handling, newlines are replaced with a NEWLINE token. ASL that has been pre-processed in this way is referred to as “processed ASL” in the documentation.
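
The pre-processing step can be sketched in Python as follows. This is a minimal illustration of the idea, not the actual tool: the function name preprocess is hypothetical, each line is kept as a single token instead of being split further, and the sketch assumes indentation uses spaces only, in consistent multiples of four.

```python
def preprocess(source, indent_width=4):
    """Turn indentation into explicit BEGIN/END tokens and line ends into NEWLINE tokens."""
    tokens = []
    depth = 0                                    # current nesting level
    for line in source.splitlines():
        if not line.strip():
            continue                             # blank lines do not affect nesting
        stripped = line.lstrip(" ")
        level = (len(line) - len(stripped)) // indent_width
        while depth < level:                     # indentation increased
            tokens.append("BEGIN")
            depth += 1
        while depth > level:                     # indentation decreased
            tokens.append("END")
            depth -= 1
        tokens.append(stripped.rstrip())         # the line content itself
        tokens.append("NEWLINE")
    tokens.extend(["END"] * depth)               # close blocks still open at end of input
    return tokens


example = (
    "__instruction addition\n"
    "    __encoding add\n"
    "        __decode\n"
    "    __execute\n"
)
print(preprocess(example))
```

With explicit BEGIN and END tokens, a block can be described by an ordinary grammar rule of the form BEGIN statements END, in the same way that braces delimit blocks in other languages.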