A little ASM: the basics, for you to have phun.
By PC_THE_GREAT

Usage of ASM: decompilation, cracking, following compiled programs, understanding virii, for fun and of course for understanding programming better.

(((Suppose)))

I suppose you have a basic knowledge of binary, the fetch-execute cycle and registers. If not, well, you'll need to learn that first.

(((The assembly process)))

What the hell is Assembly? Assembly language was developed in order to make it easier to write, debug and maintain programs. Using mnemonic codes for each instruction rather than binary machine code is one obvious way of making instructions more comprehensible.

Another way to make programs easier to write and understand is to use symbols instead of actual machine addresses to refer to locations storing data or instructions. These symbols are also referred to as symbolic addresses. Thus, instead of having to remember that location 134 is being used to hold, say, the radius of a circle, the coder uses an assembly language directive to reserve a byte for 'radius', and the name 'radius' is then used in all the assembly language instructions in the program. e.g:

_________________________________________________________
        LDA RADIUS      ; load the contents of 'RADIUS' into the accumulator
        :
        :
        HALT
RADIUS: .RB 1           ; directive to reserve 1 byte of store for radius
__________________________________________________________

(((Labels)))

Using labels in an assembler program also makes the program easier to understand. Instead of writing

        JMP 1000        ; jump to the instruction held at location 1000

we would write, for e.g.

        JMP LOOP1       ; jump to the instruction labelled LOOP1
        :
        :
LOOP1:  (another inst)

(((Assembler directives)))

Directives are instructions to the assembler program itself, which don't have a machine language counterpart. They are recognized by the assembler program and acted on at translation time, not at run time.
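To see what "acted on at translation time" means for labels and a reserve directive, here is a rough sketch in Python. It is not how any particular assembler works: the `.RB n` spelling, the helper names `parse`, `build_symbol_table` and `resolve`, and the rule that every instruction takes exactly one location are all assumptions made up for the illustration.

```python
# Toy illustration only: the one-location-per-instruction rule and the
# ".RB n" spelling are assumptions, not the syntax of any real assembler.

def parse(line):
    """Split a source line into (label or None, list of fields), dropping the comment."""
    code = line.split(";")[0].strip()
    label = None
    if ":" in code:
        label, code = code.split(":", 1)
        label = label.strip()
    return label, code.split()

def build_symbol_table(lines, origin=0):
    """Give every label the address of the location it marks."""
    symbols, address = {}, origin
    for line in lines:
        label, fields = parse(line)
        if label:
            symbols[label] = address
        if not fields:
            continue
        if fields[0] == ".RB":
            address += int(fields[1])   # directive: reserves store, emits no instruction
        else:
            address += 1                # assumed: one location per instruction
    return symbols

def resolve(lines, symbols):
    """Swap every symbolic address for its number; directives leave no instruction behind."""
    out = []
    for line in lines:
        _, fields = parse(line)
        if not fields or fields[0].startswith("."):
            continue
        operands = [str(symbols.get(f, f)) for f in fields[1:]]
        out.append(" ".join([fields[0]] + operands))
    return out

program = [
    "LOOP1:  LDA RADIUS ; load the contents of RADIUS",
    "        JMP LOOP1  ; jump back to the labelled instruction",
    "        HALT",
    "RADIUS: .RB 1      ; reserve 1 byte for RADIUS",
]
symbols = build_symbol_table(program, origin=1000)
print(symbols)                    # {'LOOP1': 1000, 'RADIUS': 1003}
print(resolve(program, symbols))  # ['LDA 1003', 'JMP 1000', 'HALT']
```

The coder only ever writes RADIUS and LOOP1; the numbers 1003 and 1000 are worked out by the translator, and they change by themselves if you change `origin`.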
Different assemblers have different ways of distinguishing directives from mnemonic operation codes; preceding them with a full stop (.) or an asterisk (*) are 2 common ways.

(((relocatable code)))

On some smaller systems, the user has to specify exactly where in memory the program is to be loaded. On larger multiprogramming systems, however, this would be very inconvenient (trust me...), as the OS would always have to ensure that the program was loaded into the same place in memory. Therefore, most programs are written to be relocatable: the OS decides at run time where the program will be put in memory, and all addresses are worked out relative to the start of the program. As a particular program may be swapped in and out of memory several times before it is completed, it could be relocated several times during one run.

(((Macro instruction)))

A macro is a single instruction representing a group of instructions, written either by the manufacturer to perform some common task such as input and output, or by the user. The instructions comprising the macro may be written at the start of the program, and whenever the macro-instruction is subsequently used in the program, the corresponding instructions are inserted in its place by the assembler.

The way I like to think of a macro is as a procedure name: when you use that single name you get all of its code. Those who have programmed in Pascal will understand this more clearly. The difference is that a macro's instructions are copied in at every place it is used, so a macro is an example of an open subroutine, as opposed to a closed subroutine, which is entered by using a call statement or equivalent instruction, thus causing a branch to the routine.
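The "open subroutine" idea is really just text substitution, and a few lines of Python can sketch it. Everything here is made up for the illustration: the function name `expand_macros`, the macro `COPY` and the way macros are stored in a dictionary are assumptions, not the behaviour of any real assembler.

```python
# Hypothetical macro expander: a sketch of substitution, nothing more.

def expand_macros(lines, macros):
    """Replace each macro use with a fresh copy of its body (an open subroutine)."""
    out = []
    for line in lines:
        if not line.strip():
            continue
        parts = line.split(None, 1)
        name = parts[0]
        if name not in macros:
            out.append(line.strip())      # ordinary instruction: keep it
            continue
        params, body = macros[name]
        args = [a.strip() for a in parts[1].split(",")] if len(parts) > 1 else []
        binding = dict(zip(params, args)) # formal parameter -> actual argument
        for template in body:
            mnemonic, *operands = template.split()
            out.append(" ".join([mnemonic] + [binding.get(op, op) for op in operands]))
    return out

# COPY is an invented macro: load from SRC, store to DST.
macros = {"COPY": (["SRC", "DST"], ["LDA SRC", "STA DST"])}
program = ["COPY P, Q", "COPY Q, R", "HALT"]
print(expand_macros(program, macros))
# ['LDA P', 'STA Q', 'LDA Q', 'STA R', 'HALT']
```

Two uses of the macro give two full copies of its body. A closed subroutine would exist once in memory and be branched to both times.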
E.g. of a macro to add the contents of 2 locations and store the result in a 3rd location:

        .MACRO ADDUP NUM1, NUM2, RES
        LDA NUM1
        ADD NUM2
        STA RES
        .ENDMACRO

In that program, if the coder wishes to add the contents of P and Q and store the result in R, he'll write

        ADDUP P,Q,R

At assembly time, this instruction is replaced by the 3 instructions

        LDA P
        ADD Q
        STA R
_______________________________________________

TWO-PASS Assemblers

In a two-pass assembler, the source code, containing directives, assembler mnemonics and comments, is scanned twice.

On the first pass, the following tasks are performed:

> Comments are removed.
> All symbols are put in a symbol table, giving the name and memory address (either absolute or relative to the start of the program) of each symbol.
> Directives are translated and acted on.
> Macros are replaced by the actual instructions which the macro represents.
> Any errors found are put in an error table.

On the second pass:

> Each mnemonic code is replaced with its machine code equivalent, by referring to a table of op codes in memory.
> Each symbolic address is replaced with its equivalent machine code address, by referring to the symbol table created in the 1st pass.
> Decimal or character items are converted into machine code and inserted into the instructions using them.
> Any errors detected during this stage are reported.

(((linking loaders)))

A linking loader (linker) is a program which loads and joins together pieces of code which may have been assembled separately. E.g., an assembly language program may call a subroutine which performs a sort. The linking loader therefore has to load the sort program into memory at a particular address, and insert this address into the CALL instruction in the object code produced by the assembler.

A relocating loader can load the object code anywhere in memory, provided the programmer has used no absolute addresses and the object code is relocatable.
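The two jobs described for the loader (shifting relative addresses to wherever the code lands, and filling in the address of a separately assembled routine) can be sketched in Python. The object-code format below is invented for the illustration: each instruction is a `(mnemonic, kind, value)` tuple, where the kinds `"rel"`, `"ext"`, `"abs"`, `"imm"` and `"none"` and the functions `load` and `link` are assumptions, not a real object file layout. One instruction per location is assumed too.

```python
# Hypothetical object-code format, made up to illustrate linking and relocation.

def load(module, base):
    """Relocating step: turn program-relative addresses into absolute ones."""
    loaded = []
    for mnemonic, kind, value in module:
        if kind == "rel":                 # address relative to start of this module
            kind, value = "abs", base + value
        loaded.append((mnemonic, kind, value))
    return loaded

def link(main, routines, base):
    """Load main at base, load each routine after it, then patch external CALLs."""
    memory = load(main, base)
    addresses = {}
    next_free = base + len(main)
    for name, module in routines.items():
        addresses[name] = next_free       # where this routine ended up
        memory += load(module, next_free)
        next_free += len(module)
    linked = [(m, "abs", addresses[v]) if k == "ext" else (m, k, v)
              for m, k, v in memory]
    return linked, addresses

main = [
    ("LDA", "rel", 3),        # load the data word 3 locations from the start
    ("CALL", "ext", "SORT"),  # address unknown until link time
    ("HALT", "none", None),
    ("DATA", "imm", 7),
]
sort_routine = [
    ("LDA", "rel", 1),
    ("RET", "none", None),
]

linked, addresses = link(main, {"SORT": sort_routine}, base=500)
print(addresses)   # {'SORT': 504}
print(linked[1])   # ('CALL', 'abs', 504)
```

Calling `link` again with `base=900` gives a CALL to 904 without touching the source, which is the whole point of keeping the object code relocatable.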
(((data transfer instructions)))

Examples of data transfer instructions include:

> Moving data from memory to a register.
> Moving data from a register to memory or to an output unit.
> Moving data from an input unit to a register.

Typically 2, 3 or 4 character mnemonics are used for each machine code instruction, such as

        MOV R1, R2      ; move content of register R2 to R1
        LD  R1, #32     ; load the number 32 into R1
        STO X, R1       ; store the contents of R1 in memory location X

(((Arithmetic instructions)))

Some microprocessors offer only addition and subtraction as the basic operations. Others offer a more comprehensive set, such as:

        ADD     addition
        SUB     subtraction
        MPY     multiplication
        DIV     division
        INC     increment
        DEC     decrement
        NEG     sign change
        ABS     absolute value

After an arithmetic operation has been carried out, it is often useful to be able to test the result to see whether it was, say, 0 or negative, or whether a carry or overflow occurred. The status register in some processors includes 4 bits referred to as N, Z, V and C, which are set to 1 or 0 depending on the result of the previous operation, as follows:

        If result is negative,  N = 1
        If result is zero,      Z = 1
        If overflow occurred,   V = 1
        If carry occurred,      C = 1

These bits are called status bits or condition codes. Conditional branch instructions such as BZ (branch if zero) check the relevant status bit (often called flags... lol, I myself prefer to call them flags) and branch accordingly.

e.g. to add 2 numbers (written in 6502 assembly language):

        CLC             ; clear the carry bit
        CLD             ; clear the decimal bit
        LDA LOC1        ; load the first operand into register A
        ADC LOC2        ; add the second operand
        STA LOC3        ; store result in LOC3

Note that this particular asm language only has an 'add with carry' instruction, which automatically adds the contents of the carry bit into the register as well. We do not want that to happen here, so we first make sure the carry bit is 0.
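To see how the four status bits come out of an add with carry, here is a rough Python model of an 8-bit binary-mode add. The function name `adc` and the dictionary of flags are assumptions for the sketch; decimal (BCD) mode is not modelled at all.

```python
# Sketch of an 8-bit add-with-carry in binary mode; not an emulator.

def adc(a, operand, carry_in=0):
    """Return (8-bit result, flags) for a + operand + carry_in."""
    total = a + operand + carry_in
    result = total & 0xFF                       # keep the low 8 bits
    flags = {
        "N": (result >> 7) & 1,                 # top bit set: negative in two's complement
        "Z": int(result == 0),                  # result is zero
        # overflow: both inputs have the same sign and the result's sign differs
        "V": int(((a ^ result) & (operand ^ result) & 0x80) != 0),
        "C": int(total > 0xFF),                 # carry out of bit 7
    }
    return result, flags

print(adc(2, 3))              # (5, {'N': 0, 'Z': 0, 'V': 0, 'C': 0})
print(adc(2, 3, carry_in=1))  # (6, ...): a stale carry gives the wrong sum
print(adc(0xFF, 0x01))        # (0, {'N': 0, 'Z': 1, 'V': 0, 'C': 1})
print(adc(0x50, 0x50))        # (160, {'N': 1, 'Z': 0, 'V': 1, 'C': 0})
```

The second call shows why the carry bit has to be cleared before a plain addition: with a leftover carry of 1, 2 + 3 comes out as 6.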
Also, there are 2 possible modes of addition: decimal mode for BCD arithmetic and binary mode for binary arithmetic, indicated by a 'decimal bit'. We want binary arithmetic here, so the decimal bit needs to be set to zero before the operation.

(((Adding larger numbers)))

Using only 8 bits restricts us to signed numbers in the range -128 to 127, which is clearly not enough for many apps. Frequently 2 or more registers or memory bytes will be used to hold each number, and it is up to the programmer to keep track of where the 'high half' and 'low half' of a 2-byte number are stored. e.g:

        CLC             ; clear the carry bit
        CLD             ; clear the decimal bit
        LDA LOC1 + 1    ; load low half of the 1st operand into register A
        ADC LOC2 + 1    ; add low half of second operand
        STA RES + 1     ; store low half of result
        LDA LOC1        ; load high half of the first operand
        ADC LOC2        ; add high half of second operand, plus any carry from the low halves
        STA RES         ; store high half of result

The above is just the basics, so you'll have to learn more; it's only meant to make you start loving ASM.

-PC_THE_GREAT
pc_the_Great@huo.i-p.com
pc@hackers.com
PC_THE_GREAT1@servihoo.com