Justin Fletcher's DCPU-16 assembler
===================================
Introduction
------------
'dasm.pl' is a DCPU-16 assembler and disassembler kit. It implements the
DCPU-16 instruction set v1.1, with a few extensions. It is not yet a macro
assembler. The assembler supports the syntax used by the examples in the
specification.
Usage
-----
To assemble a file, use:
    ./dasm.pl <file>.dasm <file>.dump
The dump format produced is a linked textual dump, in the same format as that
given in the DCPU-16 specification.
To disassemble a file, use:
    ./dasm.pl <file>.dump <file>.dasm
It should be possible to reassemble the output disassembly file.
The internal structures produced for a file can be displayed with:
    ./dasm.pl <file>.dasm
Source assembly file format
---------------------------
The source assembly format follows the example given in the DCPU-16
specification. Operands (called 'values' in the specification) take
the forms defined there, with the exception that the [SP]-style
notation is not supported (PUSH, PEEK and POP are far clearer).
Each line is expected to take one of the following forms:
; <comment>
    - comment lines, which will be ignored.
      Comments may also be appended to any other line.
.<directive>
    - directives to the assembler, outside of the specification of the
      CPU. See below.
:<label>
    - address labels may appear at the start of any line, or on their
      own line, and refer to the current assembly location.
<opcode> <operand>, <operand>
    - the basic opcodes take two operands, in line with the specification.
      Extended opcodes ('non-basic' in the specification) at present take
      only a single operand, although most of the extended instruction
      space is unused, so this may change.
    - special opcodes are supported by this assembler, which allow literal
      values to be written to the code. These are not defined by the DCPU
      specification; see below.
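
The difference between basic and non-basic opcodes comes from how the v1.1 specification packs an instruction word: basic instructions use the layout bbbbbbaaaaaaoooo, while non-basic instructions use aaaaaaoooooo0000 and so have room for only one operand. The following is a minimal Python sketch of that packing. It is for illustration only and is not taken from dasm.pl (which is written in Perl); the helper names and the opcode tables are hypothetical and cover only a few opcodes.

```python
# Illustrative sketch only: the helper names are hypothetical, not dasm.pl's API.
# Bit layouts and opcode numbers follow the DCPU-16 v1.1 specification.

BASIC_OPCODES = {"SET": 0x1, "ADD": 0x2, "SUB": 0x3}  # small subset
NONBASIC_OPCODES = {"JSR": 0x01}

REG_A = 0x00              # value code for register A
NEXT_WORD_LITERAL = 0x1F  # value code for "next word (literal)"


def encode_basic(opcode, a, b):
    """Pack a basic instruction as bbbbbbaaaaaaoooo (two 6-bit operands)."""
    return ((b & 0x3F) << 10) | ((a & 0x3F) << 4) | BASIC_OPCODES[opcode]


def encode_nonbasic(opcode, a):
    """Pack a non-basic instruction as aaaaaaoooooo0000 (one 6-bit operand)."""
    return ((a & 0x3F) << 10) | (NONBASIC_OPCODES[opcode] << 4)


# First word of "SET A, 0x30" and of "JSR <address>"
print(f"{encode_basic('SET', REG_A, NEXT_WORD_LITERAL):04x}")   # 7c01
print(f"{encode_nonbasic('JSR', NEXT_WORD_LITERAL):04x}")        # 7c10
```

In both cases the literal itself follows in the next word of the core.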
:Directives:
Directives indicate to the assembler how the code should be processed. The
following directives are supported at present:
.ORIGIN <address>
- change the assembly location to the address given.
.INCLUDE "file"
- include a file at the current location.
The file's format is determined by its extension, so it is possible
to mix source file formats (eg include an address dump within the
source file being assembled, or a second assembler file).
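
Choosing a reader from the file extension can be pictured as a simple lookup table. The sketch below is a hypothetical Python illustration of the idea, not the Perl code used by dasm.pl; the table `READERS` and the function `reader_for` are invented names, and the readers only return a description of what they would do.

```python
import os

# Hypothetical reader table, keyed by file extension. In dasm.pl the real
# handlers are the IO modules; these stand-ins just describe the action.
READERS = {
    ".dasm": lambda path: f"assemble {path}",
    ".dump": lambda path: f"load dump {path}",
}


def reader_for(path):
    """Pick a reader for an included file from its extension."""
    ext = os.path.splitext(path)[1]
    try:
        return READERS[ext]
    except KeyError:
        raise ValueError(f"no reader for extension {ext!r}") from None


print(reader_for("lib/strings.dasm")("lib/strings.dasm"))
print(reader_for("rom.dump")("rom.dump"))
```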
:Special opcodes:
Special 'opcodes' provide shortcuts or necessary additions to the assembly
format. The 'opcodes' supported at present are:
DAT <value>[,<value>]*
- Writes explicit values to the core at the current assembly location.
This 'opcode' is needed so that disassembled code containing words
which do not decode to a valid opcode can still be reassembled.
:Expressions:
Anywhere that a value is required, an expression may be used. Expressions
can include brackets, but there is no operator precedence: only brackets
control the order of evaluation. (Poor code on my part - maybe I should have
thought more carefully about it). Symbols may be used as operands of addition.
For example, the following code fragment is valid:
SET pc, label+1
:label
DAT 0
SET pc, pc ; continues from here, skipping the 0
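
The evaluation rule described above can be sketched as follows. This is a minimal Python illustration rather than the Perl code in dasm.pl. It assumes that operators are applied strictly left to right, that '+', '-' and '*' are available, and that symbols are looked up in a plain dictionary; the names `tokenize` and `evaluate` are hypothetical.

```python
import re

# Numbers (hex or decimal), symbol names, operators and brackets.
TOKEN = re.compile(r"\s*(0x[0-9a-fA-F]+|\d+|[A-Za-z_]\w*|[-+*()])")
OPERATORS = {"+", "-", "*"}


def tokenize(text):
    """Split an expression into tokens, rejecting anything unrecognised."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"bad expression at {text[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text, symbols):
    """Evaluate left to right with no precedence; brackets group."""
    tokens = tokenize(text)
    pos = 0

    def operand():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = sequence()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing ')'")
            pos += 1
            return value
        if tok[0].isdigit():
            return int(tok, 16) if tok.lower().startswith("0x") else int(tok)
        if tok in symbols:
            return symbols[tok]
        raise ValueError(f"unknown symbol or misplaced token {tok!r}")

    def sequence():
        nonlocal pos
        value = operand()
        while pos < len(tokens) and tokens[pos] in OPERATORS:
            op = tokens[pos]
            pos += 1
            rhs = operand()
            if op == "+":
                value += rhs
            elif op == "-":
                value -= rhs
            else:
                value *= rhs
        return value

    result = sequence()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result


print(evaluate("2+3*4", {}))               # 20: no precedence, left to right
print(evaluate("2+(3*4)", {}))             # 14: brackets group
print(evaluate("label+1", {"label": 2}))   # 3
```

Under these assumptions, an expression that relies on the usual precedence of multiplication over addition has to be bracketed explicitly.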
Address file format
-------------------
Address file format is a simple memory dump format which can be used for
input or output. It is based on the format used in the DCPU-16
specification. Any line may start with:
<address>:
    - the address at which the values on the line start.
      Where no address is supplied, the values continue from the end of
      the previous line, or from 0 if no address has been given yet.
Following this, a whitespace separated list of hex values may be given, to
be stored in ascending order from the address specified.
On output, values are written in blocks of at most 8 words, with each block
ending at an 8-word address boundary.
This format is used as part of the regression test to check that values are
assembled correctly.
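
The reading and writing rules above can be sketched in a few lines. This is a hypothetical Python illustration, not the Perl IO module shipped with the assembler. It assumes addresses and values are written as four lowercase hex digits, that the core is held as a dictionary mapping address to word, and that a gap in the addresses also starts a new output line; `parse_dump` and `format_dump` are invented names.

```python
def parse_dump(text):
    """Read address-file text into a {address: word} dictionary."""
    core, address = {}, 0  # values start at 0 until an address is given
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        head, sep, rest = line.partition(":")
        if sep:  # the line starts with "<address>:"
            address = int(head, 16)
            line = rest
        for word in line.split():
            core[address] = int(word, 16)
            address += 1
    return core


def format_dump(core):
    """Write words in blocks of at most 8, ending on 8-word boundaries."""
    lines, previous = [], None
    for address in sorted(core):
        # Start a new line at the beginning, after a gap, or when the
        # address reaches a multiple of 8.
        if previous is None or address != previous + 1 or address % 8 == 0:
            lines.append(f"{address:04x}:")
        lines[-1] += f" {core[address]:04x}"
        previous = address
    return "\n".join(lines)


core = parse_dump("0005: 0001 0002 0003\n0004 0005")
print(format_dump(core))
# 0005: 0001 0002 0003
# 0008: 0004 0005
```

The second input line carries no address, so its values continue from address 8, where the first line stopped.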
Structure
---------
The assembler and disassembler are implemented as IO handlers, provided in the
DASMIO directory. These modules manipulate the core in the DASM objects
directly in order to do their work. There's a lot of collusion in the code and
I've not tidied it up as much as I'd like. Ideally the DASM module should
provide methods to access its innards, rather than the IO modules poking them
directly.
Additional file formats can be supported by adding new modules in the same
directory. In order to read or write a different file format, the module
should store values in (or read them from) the core and update the symbol
table and relocation hashrefs to contain the relevant data. I've only
supported very simple additive relocation, which should be sufficient for
partial linking of relocatable code segments (however this hasn't been started
yet - it might need more thought).
The intention is that symbols can either be relocated (address symbols) or
fixed (constant symbols). However, neither the relocation of symbols nor the
definition of constant symbols has been implemented yet.
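
As an illustration of what simple additive relocation means, the sketch below moves a code segment to a new base address and adds that base to every word recorded as needing relocation. It is a hypothetical Python example, not the Perl data structures used by the DASM module: it assumes the core is a dictionary of address to word and that relocations are held as a set of addresses, and `relocate` is an invented name.

```python
def relocate(core, relocations, base):
    """Return a copy of `core` moved to `base`.

    Every word whose address is listed in `relocations` holds an address
    within the segment, so `base` is added to it (modulo 16 bits).
    """
    moved = {}
    for address, word in core.items():
        if address in relocations:
            word = (word + base) & 0xFFFF
        moved[address + base] = word
    return moved


# "SET PC, label" assembled at 0, with label at address 2:
#   0000: 7dc1 0002 0000
segment = {0: 0x7DC1, 1: 0x0002, 2: 0x0000}
relocations = {1}  # the word at address 1 holds the address of label

moved = relocate(segment, relocations, 0x0100)
print({f"{a:04x}": f"{w:04x}" for a, w in moved.items()})
# {'0100': '7dc1', '0101': '0102', '0102': '0000'}
```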
The nesting of include files is tracked so that errors are reported against
the file that caused them and the files that included it.
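
One way to picture this tracking is as a stack of the files currently being read, each with its current line number. The following is a hypothetical Python sketch, not the mechanism used in dasm.pl; the class `IncludeTracker`, its methods and the message layout are all invented for illustration.

```python
class IncludeTracker:
    """Keep a stack of the files being read so errors can show the chain."""

    def __init__(self):
        self.stack = []  # entries are [filename, current line number]

    def enter(self, filename):
        """Start reading a file (the top-level source or an include)."""
        self.stack.append([filename, 0])

    def leave(self):
        """Finish reading the current file."""
        self.stack.pop()

    def next_line(self):
        """Advance the line counter of the file being read."""
        self.stack[-1][1] += 1

    def error(self, message):
        """Describe an error in the current file and list its includers."""
        filename, line = self.stack[-1]
        report = [f"{filename}:{line}: {message}"]
        for parent, parent_line in reversed(self.stack[:-1]):
            report.append(f"  included from {parent}:{parent_line}")
        return "\n".join(report)


tracker = IncludeTracker()
tracker.enter("main.dasm")
for _ in range(3):
    tracker.next_line()          # line 3 holds the .INCLUDE directive
tracker.enter("lib.dasm")
tracker.next_line()
print(tracker.error("unknown opcode"))
# lib.dasm:1: unknown opcode
#   included from main.dasm:3
```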
Example code
------------
Some small example code is provided in the 'examples' directory, as the
beginnings of a regression test for the assembler (and dumper). In order
to test the code, use:
make test
Future developments
-------------------
A future version of the assembler may provide the following features:
* Register aliasing (eg replacing 'Z' with 'SB' or similar)
* Implicit workspace references (eg 'memlimit' meaning '[24 + Z]',
as might be used to implement a static base or workspace register).
* Macro support (eg making 'MOV a,b' mean 'SET a,b', or 'RTS' mean
'SET PC, POP')
* Conditional assembly directives (eg .IF, .ELSE, etc) (although we
could just use the C pre-processor or m4).
* Constant definitions.
* Support for partially linked content.
* Other IO formats, depending on what other assemblers and tools use.
* VM interpreter.
I have intentionally not looked at any other implementations or information
other than the CPU documentation, so it is possible that other people have
gone in completely different directions.
License
-------
The DASM assembler kit is released under the simplified BSD license because
it provides the greatest degree of freedom in using the source, unlike more
restrictive licenses such as the GPL. Hopefully others will feel the same
way and keep the code free.
Copyright (c) 2012, Justin Fletcher
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.