UASM - Universal Cross Assembler
8051 6805 Z8
Custom Computer Consultants
10 April 1986

1. INTRODUCTION

General Description

UASM is a program which facilitates the writing of assemblers for single chip microcomputers. It has only the rudimentary features of decoding the instruction set and a few pseudo operations. It is intended for programs that can be contained in a single file, which is typical of the short programs written for these types of processors. In particular, there are no provisions for macros or include files. The performance when generating both an object file and a listing is about 700 statements per minute when using a hard disk.

Program Execution

UASM has several different names depending on which processor it has been built for. It is invoked by entering the name of the program (asm05 for the 6805 version) followed by the name of the source file followed by the option switches. The allowable options at the present time are:

    o   creates an object output file from the name of the source file and the extension ".HEX".
    l   creates a listing file from the name of the source file and the extension ".LST".
    n   suppresses the listing function.

If the l and the n options are both absent then the listing goes to the terminal.

If no source file is specified then the input is taken from the terminal and output is written to the terminal. In this mode the assembler enters pass 2 directly, with no symbol table. This feature was mainly used for program debugging and has not been removed.

Program Operation

UASM is a two pass assembler. On the first pass it reads statements from the input file and builds a symbol table. On the second pass it reads the statements from the input file again and creates the object output file and the listing file, using the symbol table information it created in the first pass. The content of the symbol table is appended to the end of the listing file.
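As a rough illustration of the two pass scheme described above, the following Python sketch assembles a four-line toy program. It is not taken from the UASM source (which is written in C): the mnemonics, opcode values and instruction sizes are invented for the example, and the helper names split_statement, pass_one and pass_two are hypothetical.

```python
# Toy two-pass assembler. The instruction set below is invented for this
# example and does not correspond to UASM or to any real processor.
OPCODES = {"NOP": (0x00, 1), "JMP": (0x10, 2), "LDA": (0x20, 2)}  # mnemonic -> (opcode, size)


def split_statement(line):
    """Return (label, mnemonic, operand) for one source line."""
    if line.startswith("*"):                 # whole-line comment
        return None, None, None
    text = line.split(";", 1)[0]             # drop a trailing comment
    label = None
    if text and not text[0].isspace():       # a label starts in the first position
        label, _, text = text.partition(" ")
        label = label.rstrip(":")            # the colon after a label is optional
    fields = text.split()
    mnemonic = fields[0] if fields else None
    operand = fields[1] if len(fields) > 1 else None
    return label, mnemonic, operand


def pass_one(lines):
    """Pass 1: walk the source once and record the address of every label."""
    symbols, location = {}, 0
    for line in lines:
        label, mnemonic, _ = split_statement(line)
        if label:
            symbols[label] = location
        if mnemonic:
            location += OPCODES[mnemonic][1]
    return symbols


def pass_two(lines, symbols):
    """Pass 2: walk the source again and emit code using the symbol table."""
    code = []
    for line in lines:
        _, mnemonic, operand = split_statement(line)
        if not mnemonic:
            continue
        opcode, size = OPCODES[mnemonic]
        code.append(opcode)
        if size == 2:
            value = symbols[operand] if operand in symbols else int(operand)
            code.append(value & 0xFF)
    return bytes(code)


SOURCE = [
    "start:  LDA 5        ; load a constant",
    "        JMP done     ; forward reference, resolved in pass 2",
    "        NOP",
    "done:   JMP start",
]
SYMBOLS = pass_one(SOURCE)
CODE = pass_two(SOURCE, SYMBOLS)
```

The forward reference to done on the second line is the reason for two passes: its address is only known after the whole file has been read once.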
The object output is in Intel Hex format with a maximum of 34 bytes to a record. When displayed on an 80 column screen the records will exactly fill one line. The listing file is formatted with 55 statement lines and four header lines per page and assumes 11 inch pages and 6 lines per inch. This leaves a combined top and bottom margin of seven lines.

Files

The UASM system was written in the C language and consists of the following set of files:

    STDIO.H       Standard I/O header file.
    UASM.H        UASM header file; same for all versions.
    UASM.C        Source code for the part common to all the versions.
    51SYM.C       Symbols and processing routines for the 8051.
    6805SYM.C     Symbols and processing routines for the 6805.
    Z8SYM.C       Symbols and processing routines for the Z8.
    ASM51.EXE     8051 assembler execution file.
    ASM05.EXE     6805 assembler execution file.
    ASMZ8.EXE     Z8 assembler execution file.
    51VAL.MAC     8051 validation/example file.
    6805VAL.MAC   6805 validation/example file.
    Z8VAL.MAC     Z8 validation/example file.
    UASM.DOC      This file.

2. Assembler Conventions

Statement Format

The primary component in an assembly language program is the statement, which consists of an instruction and its operands. The instruction specifies an action to be taken and the operands specify the data to be acted upon. An assembly language statement can include four fields:

    Statement Labels
    Instructions or Pseudo Operations
    Operands
    Comments

The statement label and comments are optional. The statement may have zero or more operands depending on the instruction.

The fields in a statement are separated from each other by one or more delimiters. Delimiters include spaces, tabs, commas, semi-colons, colons and newlines. To accommodate various programming styles the rules on delimiters are very relaxed. In general blanks separate fields and commas separate operands. Colons after labels are optional and delimiters for the comment field are also optional.
This can lead to problems with evaluating expressions if the first word in a comment happens to be a recognizable symbol. Therefore use of a comment delimiter (";") is strongly recommended. Comments may occupy an entire line when the first character is either a semi-colon (";") or an asterisk ("*").

Program Labels and Identifiers

A program label is any identifier that begins in the first position of a line. An identifier is any string of alphanumeric characters whose first character is alphabetic. A program label may be followed by an optional colon. A character is alphabetic if it is either upper case or lower case. A character is alphanumeric if it is either alphabetic or numeric or underscore.

    upper case    = { A B C D E F G H I J K L M N O P Q R S T U V W X Y Z }
    lower case    = { a b c d e f g h i j k l m n o p q r s t u v w x y z }
    numeric       = { 0 1 2 3 4 5 6 7 8 9 }
    alphabetic    = { upper case } or { lower case }
    alphanumeric  = { alphabetic } or { numeric } or { _ }
    identifier    = { alphabetic } followed by zero or more { alphanumeric }
    program label = identifier in first position of a line

Identifiers are case sensitive: each time an identifier is used it must be written in exactly the same way or it will be considered a different identifier. Identifiers may be any length but only the first 31 characters are significant. The following are valid and unique identifiers:

    begin     BEGIN     bEgIn
    label12   LABEL12   lAbEl12
    group1:   GROUP1:   gRoUp1:

In addition to their statement labeling function, identifiers also represent constants and variable names. They may even define a value for the symbols which represent instructions and pseudo operations, and no conflict will result.

Instructions

The instruction field contains a mnemonic symbol which specifies one of the processor's instructions or a pseudo operation. A pseudo operation is an instruction to the assembler to perform some action. Instructions must be preceded and followed by one or more spaces to separate them from the label and operand fields.
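The identifier rules given under Program Labels and Identifiers can be sketched in Python as follows. This is an illustration only, not the UASM implementation: the function names are hypothetical, and the sketch assumes that a single letter is a valid identifier (as the prose definition implies) and that "significant" means the first 31 characters.

```python
import string

ALPHABETIC = set(string.ascii_letters)                  # upper case or lower case
ALPHANUMERIC = ALPHABETIC | set(string.digits) | {"_"}  # plus numeric and underscore
SIGNIFICANT = 31                                        # characters that are significant


def is_identifier(text):
    """True if text starts with a letter and continues with alphanumerics."""
    return (bool(text)
            and text[0] in ALPHABETIC
            and all(ch in ALPHANUMERIC for ch in text[1:]))


def symbol_key(text):
    """Key used to compare identifiers: case is kept, length is capped."""
    if not is_identifier(text):
        raise ValueError(f"not an identifier: {text!r}")
    return text[:SIGNIFICANT]


def program_label(line):
    """Return the identifier starting in the first position of line, or None."""
    end = 0
    while end < len(line) and line[end] in ALPHANUMERIC:
        end += 1
    candidate = line[:end]
    return candidate if is_identifier(candidate) else None
```

Under these assumptions begin and BEGIN produce different keys, while two names that differ only after the 31st character produce the same key.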
Pseudo operations are described in the appendices.

Operands

Depending on the instruction there can be zero, one or more operands. Multiple operands are separated from each other by commas. Operands supply the information the instruction needs to carry out its operation. An operand can be:

    Immediate data.
    The address of a location from which data is to be taken (source address).
    The address of a location where data is to be put (destination address).
    The address of a program location to which program control is to be passed.
    A condition code, used to direct the flow of program control.

Each of the different assemblers has specific rules on the ordering of the operands. See the specific appendix for details.

Comments

Comments may occur by themselves on a line or be placed after the operands field. When they occur by themselves on a line the first character of the line must be a semi-colon (";") or an asterisk ("*"). When they are part of a line they may follow the operand field with or without a specific delimiter other than blanks. The use of a delimiter is recommended if there exists the possibility of evaluating part of the comment as an expression.

Constants

Numeric constants may be expressed in binary, octal, decimal and hexadecimal. Decimal is the default radix for evaluation of numeric constants. The default radix may be overridden by the use of a suffix on the end of the number. The suffix is always a lower case letter. The following are valid suffixes:

    b        binary
    q or o   octal
    d        decimal
    h or x   hexadecimal

The convention of using lower case letters for the radix specifier requires that the hexadecimal digits for 10 through 15 be designated by the upper case letters A through F. A character constant is specified by enclosing the character in single quotes.

Data Variables

A data variable is a memory location which contains a variable piece of data. There can be separate memory areas for different kinds of data.
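The radix suffix rules from the Constants section above can be sketched in Python as follows. This is a hedged illustration, not the UASM code: parse_constant is a hypothetical name, and the sketch assumes that a character constant evaluates to the character's ASCII code.

```python
RADIX_BY_SUFFIX = {"b": 2, "q": 8, "o": 8, "d": 10, "h": 16, "x": 16}
DIGITS = "0123456789ABCDEF"   # hexadecimal digits 10 through 15 must be upper case


def parse_constant(text):
    """Evaluate a numeric constant or a single-quoted character constant."""
    if len(text) == 3 and text[0] == "'" and text[2] == "'":
        return ord(text[1])                  # assumption: value is the ASCII code
    radix = 10                               # decimal is the default radix
    if text and text[-1] in RADIX_BY_SUFFIX:
        radix = RADIX_BY_SUFFIX[text[-1]]    # a lower case suffix overrides it
        text = text[:-1]
    if not text:
        raise ValueError("no digits in constant")
    value = 0
    for ch in text:
        digit = DIGITS.find(ch)              # lower case a-f are not accepted as digits
        if digit < 0 or digit >= radix:
            raise ValueError(f"bad digit {ch!r} for radix {radix}")
        value = value * radix + digit
    return value
```

Because b and d are radix suffixes, a constant such as 0FFh is only unambiguous when its hexadecimal digits are written in upper case; the sketch rejects 0ffh for that reason.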
The data variable name can be associated with registers, bits, internal memory, and external memory.

Location Counter

The value of the location counter is represented by the dollar sign ("$"). The dollar sign may be freely used in expressions.

Memory Segments

In the single chip microprocessors served by this assembler there are multiple address spaces. These separate address spaces are called segments and are indicated to the assembler by appropriate pseudo operations. The available segments and pseudo operations are shown in the following table:

    Pseudo Operation   8051   6805   Z8
    CSEG               Yes    Yes    Yes
    DSEG               Yes    Yes    Yes
    BSEG               Yes    No     No
    RSEG               No     No     Yes
    XSEG               Yes    No     Yes

NOTE: Object output will be placed only in the code and external segments.

Expressions and Operators

Expressions are formed by combining constants and data variable names with arithmetic and logical operators. Evaluation is left to right with precedence. The operators are grouped according to the following table:

    HIGHEST ( ) + - ~ * / % . + - < > & ^ LOWEST |

Parentheses ( )

Parentheses are used to force the evaluation of the sub-expression within. They alter the normal left to right evaluation of an expression. In the following expression:

    A - ( B + C )

the sub-expression B + C is evaluated first and the result is then subtracted from A. Without the parent...