Conventions for both 16- and 32-bit compilations are covered. When describing register usage and contents, the name of the corresponding 32-bit register appears in parentheses after the name of a 16-bit register.
For information about the advantages of writing assembly language code inline, instead of assembling it separately, see Using the Inline Assembler.
extern "C"
{
int assembler_routine(int x);
}
This method tells the C++ compiler through its function prototype
that your assembly language routine uses C linkage. This is the
easiest method of specifying C linkage and does not involve any
change to the naming of the assembly language routine.
For complete information, see the section "Creating Routines With C++ Linkage" in this chapter.
The best method of specifying C linkage is to enclose the prototypes of your assembly language functions in the braces of an extern "C" {} statement, as shown in the section "The Easy Way to Call Assembly Code from C++" above. The advantage is that you can use the same routine with both C and C++ modules.
Provided you include the header file containing the function prototypes in all the source files that use the assembly language routines, you will not even have to reassemble their code.
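One common way to let a single prototype serve both C and C++ modules is to guard the extern "C" braces with the standard __cplusplus macro. The sketch below reuses the name assembler_routine from the earlier example. The C++ body is only a stand-in so that the fragment links without an assembly module, and its behavior (doubling the argument) is invented for illustration, as is the helper call_from_cpp.

```cpp
#include <cassert>

/* ---- would normally live in a header shared by C and C++ sources ---- */
#ifdef __cplusplus
extern "C" {            /* C++ compilers see C linkage ...               */
#endif

int assembler_routine(int x);

#ifdef __cplusplus
}                       /* ... C compilers see a plain prototype         */
#endif
/* ---------------------------------------------------------------------- */

// Stand-in for the assembly module (assumption: the real routine would be
// written in assembly language and assembled separately).
extern "C" int assembler_routine(int x)
{
    return x * 2;       // hypothetical behavior, for illustration only
}

// A C++ caller (hypothetical) uses the routine like any other function.
int call_from_cpp(int value)
{
    return assembler_routine(value) + 1;
}
```

Because the guard disappears when a C compiler reads the header, the same file can be included from both kinds of source file without change.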
Similarly, when calling a C++ function from an assembly language routine, declare that function as having C linkage in your C++ program. The only exception to this rule is member functions, which cannot be given C linkage.
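Since member functions cannot be given C linkage, one possible workaround is a free function with C linkage that receives the object pointer explicitly and forwards the call. The following is a minimal sketch of that idea. The class Counter and the functions counter_add and demo_wrapper are hypothetical names chosen for illustration, not part of the compiler or its libraries.

```cpp
#include <cassert>

// Hypothetical class whose member function the assembly code needs.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    int add(int n) { value_ += n; return value_; }
private:
    int value_;
};

// A free function can have C linkage; it takes the object pointer as an
// ordinary parameter and forwards to the member function.
extern "C" int counter_add(Counter *c, int n)
{
    return c->add(n);
}

// Stands in for the assembly routine that would make the call.
int demo_wrapper()
{
    Counter c(10);
    return counter_add(&c, 5);
}
```

The assembly routine then only needs to know the wrapper's C name and to push the object pointer as the first parameter.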
In all but the Tiny model, the STACK segment is set to 128 bytes in length. This is enough to allow the operating system to start up the program. Code in the C++ startup module, c.asm, then allocates a full stack elsewhere. The 128 bytes are subsequently used to store the program command line so that it is addressable using the DS register. In the Tiny model, the STACK segment is zero bytes in length.
All BSS segments are cleared to 0 by the startup module, regardless of the memory model in use.
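From the C++ side, this clearing is what makes uninitialized objects with static storage duration start out as zero. The sketch below assumes that the compiler places such objects in a BSS segment, which is typical but not stated in the text. The names counters, scratch, and statics_start_zeroed are illustrative only.

```cpp
#include <cassert>
#include <cstddef>

// Uninitialized objects with static storage duration; compilers typically
// place these in a BSS segment (assumption).
static int  counters[64];
static char scratch[256];

// Returns true if every element is zero at the time of the call.
bool statics_start_zeroed()
{
    for (std::size_t i = 0; i < sizeof counters / sizeof counters[0]; ++i)
        if (counters[i] != 0) return false;
    for (std::size_t i = 0; i < sizeof scratch; ++i)
        if (scratch[i] != 0) return false;
    return true;
}
```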
For the Tiny, Small, and Medium models, there are two schemes for allocation of the near heap. These schemes are selected by the value of the global variable _okbigbuf. For more information on memory allocation, see "Choosing a memory model" in Chapter 2, "Compiling Code."
The pseudo-ops for defining the code and data segments for each memory model are different. Therefore, use the macros begcode, endcode, begdata, and enddata defined in macros.asm for each memory model. The general layout for an .asm source file is:

        INCLUDE MACROS.ASM      ;define memory model macros

                                ;EXTRN statements for C/C++
                                ;functions to call go here

        begdata                 ;define start of data
                                ;EXTRN statements for
                                ;external data globals go here
        enddata                 ;define end of data segment

        begcode modulename      ;define start of code
                                ;executable code goes here
        endcode modulename      ;define end of code segment

        END                     ;define end of module
When C linkage is in effect, floats are returned in DX, AX, and doubles are returned in AX, BX, CX, DX, where AX contains the most significant 16 bits and DX contains the least significant. When C++ linkage is in effect, the compiler creates a temporary copy of the variable on the stack and returns a pointer to it. Both these techniques are reentrant.
1-byte structs are returned in AL, 2-byte structs in AX, and 4-byte structs in DX, AX. With larger structures, the method used depends on the linkage in use for the function. For C linkage, a function that returns a structure actually returns a pointer to the structure, which is stored in the static data segment. This means that C functions that return structures are not reentrant. With C++ linkage, the compiler creates a temporary copy on the stack and returns a pointer to it, which is reentrant.
When C linkage is in effect, floats are returned in EDX, EAX for doubles and in EAX for floats: floats are returned in EAX, and doubles in EDX, EAX, where EDX contains the most significant 32 bits and EAX contains the least significant 32 bits. When C++ linkage is in effect, the compiler creates a temporary copy of the variable on the stack and returns a pointer to it. Both these techniques are reentrant.
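To make the EDX, EAX split concrete, the sketch below models the register pair in plain C++ and divides the bit pattern of a double into its upper and lower 32 bits. It assumes a 64-bit IEEE 754 double. RegPair, split_double, and join_double are illustrative names, and no real registers are touched.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct RegPair {          // models the EDX, EAX pair, not real registers
    std::uint32_t edx;    // most significant 32 bits
    std::uint32_t eax;    // least significant 32 bits
};

RegPair split_double(double d)
{
    static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);          // copy the raw bit pattern
    RegPair r;
    r.edx = static_cast<std::uint32_t>(bits >> 32);
    r.eax = static_cast<std::uint32_t>(bits & 0xFFFFFFFFu);
    return r;
}

double join_double(RegPair r)
{
    std::uint64_t bits = (static_cast<std::uint64_t>(r.edx) << 32) | r.eax;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

An assembly routine that returns a double under C linkage would load the two halves into EDX and EAX in the same way before returning.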
1-byte structs are returned in AL, 2-byte structs in AX, and 4-byte structs in EAX. With larger structures, the compiler creates a temporary copy of the variable on the stack and returns a (reentrant) pointer to it.
For 32-bit C++ code, where a struct has no constructors or destructors declared for it, 1-byte structs are returned in AL, 2-byte structs in AX, 4-byte structs in EAX, and 8-byte structs in EDX:EAX.
Warning: In previous versions of DMC++, small structs without constructors in 32-bit C++ code were passed through a hidden pointer to the return value. The change described above was made for compatibility with Microsoft. Due to this change, if you build part of an application with the current version of DMC++, you need to rebuild all of the application; otherwise, crash bugs could be introduced.
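The size rule for 32-bit C++ code can be summarized as a small lookup. This is only a sketch of the rule as stated above for structs without constructors or destructors. The function struct_return_location and the sample structs are hypothetical, and the treatment of sizes the text does not list (anything other than 1, 2, 4, or 8 bytes) as "pointer to temporary" is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Where a struct of the given size is returned in 32-bit C++ code.
std::string struct_return_location(std::size_t size)
{
    switch (size) {
    case 1:  return "AL";
    case 2:  return "AX";
    case 4:  return "EAX";
    case 8:  return "EDX:EAX";
    default: return "pointer to temporary";   // assumption for other sizes
    }
}

// Sample structs with fixed-width members so their sizes are predictable.
struct Rgb16  { std::uint16_t value; };          // 2 bytes
struct Point  { std::int16_t x, y; };            // 4 bytes
struct Extent { std::int32_t width, height; };   // 8 bytes
struct Matrix { std::int32_t m[4]; };            // 16 bytes
```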
Data should be aligned along 16-bit boundaries to maximize speed on 16-bit buses.
Data should be aligned along 32-bit boundaries to maximize speed on 32-bit buses.
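In C++ source, alignment can be requested and checked explicitly. The sketch below uses alignas and alignof, which are standard C++11 features and therefore assume a compiler newer than the one this chapter describes. Packet, buffer, and is_aligned are illustrative names.

```cpp
#include <cassert>
#include <cstdint>

// Request 4-byte (32-bit) alignment explicitly; alignas is standard C++11,
// not a feature described in this chapter.
struct alignas(4) Packet {
    std::uint8_t  tag;
    std::uint8_t  flags;
    std::uint16_t length;
};

alignas(4) unsigned char buffer[16];
Packet packet;

// True if the address is a multiple of the given boundary in bytes.
bool is_aligned(const void *p, std::uintptr_t boundary)
{
    return reinterpret_cast<std::uintptr_t>(p) % boundary == 0;
}
```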
Table 5-1 Macros defined in macros.asm

Macro     Description
begcode   Define start of code segment
endcode   Define end of code segment
begdata   Define start of initialized data segment
enddata   Define end of initialized data segment
SIZEPTR   Default pointer size in bytes (2 for Tiny, Small, Medium models, 4 for Compact, Large, Phar Lap, and DOSX models)
P         Offset of first parameter from BP (EBP)
SPTR      Non-zero if pointers are near by default (Tiny, Small, Medium, Phar Lap, and DOSX memory models)
LPTR      Non-zero if pointers are far by default (Compact and Large memory models)
LCODE     Non-zero if large code (Medium or Large memory models)
SSeqDS    Non-zero if SS == DS
ESeqDS    Non-zero if ES == DS
uses      Pushes registers that must be saved
unuse     Pops saved registers
The subroutine returns by popping EBX (32-bit code only), DI (EDI) and SI (ESI), deallocating space on the stack for the local variables, popping off the old value of BP (EBP), and returning. The calling code then removes the parameters from the stack.
Table 5-2 Normal organization of stack frame

               High memory
               Previous stack frame
               Parameters
               Return address
BP (EBP) ->    Old value of BP (EBP)
               Local variables and temporaries
               SI (ESI)
SP (ESP) ->    DI (EDI)
               Low memory

The stack grows downward (toward lower addresses).
Here is the C++ program:
extern "C" void gotoxy(int x, int y); // essential!
// normally in a header file
int main()
{
gotoxy(10, 20); // set cursor position at ROW 10, COL 20
return 1;
}
After compiling the C++ program to an object file, use the utility
OBJ2ASM to produce the assembly language equivalent below:
_TEXT   segment
_main:
        mov     AX,014h         ; move 20 into AX
        push    AX              ; push on stack (2 BYTES)
        mov     AX,0Ah          ; move 10 into AX
        push    AX              ; push on stack (2 BYTES)
        callm   _gotoxy         ; call gotoxy() function
        add     SP,4            ; adjust stack ptr. (4 BYTES)
        ret
_TEXT   ends

or for a 32-bit memory model:

_TEXT   segment
_main:
        push    014h            ; push 20 on stack (4 BYTES)
        push    0Ah             ; push 10 on stack (4 BYTES)
        callm   _gotoxy         ; call gotoxy() function
        add     ESP,8           ; adjust stack ptr. (8 BYTES)
        ret
_TEXT   ends

Since the function gotoxy has been defined as using C linkage, the variables are pushed on the stack from right to left.
First, the column (20) is pushed on the stack. Next, the row (10) is pushed on the stack. Finally, the call to gotoxy is made, pushing the instruction pointer (IP) on the stack. Note that the 32-bit version pushes the parameters directly onto the stack, whereas the 16-bit version first moves them into AX. The table below shows some of the advantages of generating 32-bit code.
The compiler prepends an underscore to the function names, giving _main and _gotoxy. The _TEXT segment is the CODE segment. Table 5-3 shows how the stack looks after the call to _gotoxy(10,20):

Table 5-3 Stack frame generating 32-bit code

                 High memory
BP+4 (EBP+8)     20
BP+2 (EBP+4)     10
BP+0 (EBP+0)     IP (EIP) return address
                 Low memory
The assembly language function below uses a set of utility macros for MASM 5.0 and above that is supplied with the compiler and normally installed in the INCLUDE directory. All macros are defined in macros.asm. The 32-bit version is selected by whether the symbol DOS386 is defined.

        include macros.asm      ; pull in defs of macros

        begcode gotoxy          ; define start of code seg called gotoxy

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; C++ interface routine, C linkage.
; Puts cursor at row, col.
; Usage:
;       void gotoxy(int row, int col);

    IFNDEF DOS386
        public  _gotoxy         ; make gotoxy global
_gotoxy proc    near            ; define start of func
        push    BP              ; save old stack frame
        mov     BP,SP           ; set BP to point to old BP
        mov     DH,P[BP]        ; DH = row
        mov     DL,P[BP+2]      ; DL = col
        mov     AH,2            ; BIOS function set cursor pos.
        xor     BX,BX           ; page 0
        int     10h             ; BIOS video interrupt
        pop     BP              ; restore old BP
        ret                     ; return to caller
_gotoxy endp                    ; define end of func
    ELSE
        public  _gotoxy         ; make gotoxy global
_gotoxy proc    near            ; define start of func
        push    EBP             ; save old stack frame
        mov     EBP,ESP         ; set EBP to point to old EBP
        uses                    ; saves registers that are used
        mov     DH,P[EBP]       ; DH = row
        mov     DL,P[EBP+4]     ; DL = col
        mov     AH,2            ; BIOS function set cursor pos.
        xor     EBX,EBX         ; page 0
        int     10h             ; BIOS video interrupt
        unuse                   ; note reverse order
        pop     EBP             ; restore old EBP
        ret                     ; return to caller
_gotoxy endp                    ; define end of func
    ENDIF

        endcode gotoxy          ; define end of code seg
        end

The assembly language function begins by first pushing BP (EBP) onto the stack and moving the stack pointer into BP (EBP). This permits access to the variables pushed onto the stack by the calling function.
This is done by using BP (EBP) to point to offsets within the stack. In the example above, MOV DH, P[BP] (MOV DH, P[EBP]) obtains the row number from the stack and places it in DH. The next diagram shows the variables and their positions on the stack. P expands to 4 for the Tiny, Small, and Compact models; 6 for the Medium and Large models; and 8 for the Phar Lap and DOSX models. It is the offset from BP (EBP) to the first parameter on the stack.
Table 5-4 Stack for tiny, small, and compact memory models

                 High memory
BP+6 (EBP+12)    20
BP+4 (EBP+8)     10
BP+2 (EBP+4)     IP (EIP) return address
BP+0 (EBP+0)     Previous BP (EBP)
                 Low memory

After completing the function, restore BP (EBP) and return to the calling function. The ret will pop IP (EIP) off the stack and begin execution at the instruction following the call to _gotoxy. The next instruction in the calling function is ADD SP, 4 (ADD ESP, 8). This instruction resets the stack pointer to the position it occupied before the parameters were pushed.
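The whole 16-bit near-call sequence can be traced with a toy model. The sketch below simulates the stack as an array of 16-bit words and follows the steps described above: the caller pushes the arguments right to left, the call pushes the return address, the callee sets up BP and reads its parameters at P[BP] and P[BP+2], and the caller finally adds 4 to SP. Stack16, Observed, and simulate_gotoxy_call are illustrative names. The starting SP value and the return address 0x1234 are arbitrary placeholders.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of a 16-bit stack: words[addr / 2] holds the word at addr.
struct Stack16 {
    std::vector<std::uint16_t> words;
    std::uint16_t sp;
    std::uint16_t bp;

    Stack16() : words(32, 0), sp(64), bp(0) {}

    void push(std::uint16_t v) { sp -= 2; words[sp / 2] = v; }
    std::uint16_t pop() { std::uint16_t v = words[sp / 2]; sp += 2; return v; }
    std::uint16_t at_bp(int offset) const { return words[(bp + offset) / 2]; }
};

struct Observed { std::uint16_t row, col, sp_after; };

Observed simulate_gotoxy_call()
{
    Stack16 s;
    const int P = 4;            // near call: saved BP (2) + return IP (2)

    // Caller: push arguments right to left, then the call pushes IP.
    s.push(20);                 // col
    s.push(10);                 // row
    s.push(0x1234);             // return address (placeholder value)

    // Callee prologue: push BP / mov BP,SP.
    s.push(s.bp);
    s.bp = s.sp;

    Observed o;
    o.row = s.at_bp(P);         // mov DH,P[BP]
    o.col = s.at_bp(P + 2);     // mov DL,P[BP+2]

    // Callee epilogue: pop BP / ret (ret pops the return address).
    s.bp = s.pop();
    s.pop();

    // Caller cleanup: add SP,4 removes the two parameters.
    s.sp += 4;
    o.sp_after = s.sp;
    return o;
}
```

After the simulated call, SP is back at its starting value of 64, which mirrors the statement that ADD SP, 4 restores the stack pointer to its position before the parameters were pushed.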
The above example pertains to the Tiny, Small, Compact, and the 32-bit models. If the Large or Medium memory models are used, the far call also pushes CS onto the stack. This changes the position of the variables on the stack to those shown below:
Table 5-5 Stack for large and medium memory models

          High memory
BP+8      20
BP+6      10
BP+4      CS return segment
BP+2      IP return address
BP+0      Previous BP
          Low memory

Using the P macro (defined in macros.asm) compensates for these differences.
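The values that P and SIZEPTR take can be derived from two properties of the memory model: whether calls are far and whether the code is 32-bit. The sketch below reproduces the values quoted in this chapter from those properties. It is a model for checking one's understanding, not the contents of macros.asm, and the enum Model and the function names are hypothetical.

```cpp
#include <cassert>

enum class Model { Tiny, Small, Medium, Compact, Large, PharLap, DOSX };

bool is_32bit(Model m)   { return m == Model::PharLap || m == Model::DOSX; }
bool large_code(Model m) { return m == Model::Medium  || m == Model::Large; } // LCODE
bool far_data(Model m)   { return m == Model::Compact || m == Model::Large; } // LPTR

// Offset from BP (EBP) to the first parameter: the saved frame pointer plus
// the return address (offset only for near calls, segment:offset for far).
int first_param_offset(Model m)
{
    if (is_32bit(m)) return 4 + 4;          // saved EBP + EIP
    return 2 + (large_code(m) ? 4 : 2);     // saved BP + IP, or CS:IP
}

// Default pointer size in bytes, mirroring SIZEPTR.
int pointer_size(Model m)
{
    if (is_32bit(m)) return 4;
    return far_data(m) ? 4 : 2;
}
```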
// C++ MODULE
extern int var1;
int var2;
extern "C" int func1(int *p, int a); // essential!
int func2(int *pa, int a)
{
int b;
*pa = b;
var2 = b + var1 + func1(&b, a);
return a - var2;
}
Here is the corresponding assembly language module:
; Assembler MODULE
        include MACROS.ASM

    IFNDEF DOS386
        begdata                         ; define start of data seg
        extrn   _var1:word
_var2   dw      0                       ; allocate var2
        enddata                         ; end of data segment

    IF LCODE                            ; if large code model
        extrn   _func1:far              ; then far function
    ELSE
        extrn   _func1:near             ; else near function
    ENDIF

        begcode func2
        public  _func2                  ; make func2 global
    IF LCODE
_func2  proc    far                     ; define function func2
    ELSE
_func2  proc    near                    ; define function func2
    ENDIF
        push    BP                      ; save old frame pointer
        mov     BP,SP                   ; set new frame pointer
        sub     SP,2                    ; create room for b
        mov     AX,-2[BP]               ; AX = b
    IF SPTR                             ; if small memory model
        mov     BX,P[BP]                ; BX = pa
        mov     [BX],AX                 ; *pa = b
    ELSE                                ; else large memory model
        les     BX,P[BP]                ; ES:BX = pa
        mov     ES:[BX],AX              ; *pa = b
    ENDIF
        push    P+SIZEPTR[BP]           ; push a onto stack
    IF LPTR                             ; if far pointers
        push    SS                      ; push segment of b
    ENDIF
        lea     AX,-2[BP]               ; AX = offset of b
        push    AX
        call    _func1                  ; call func1(&b, a)
        add     SP,SIZEPTR+2            ; restore the stack
        add     AX,_var1                ; func1 returned result in AX
        add     AX,-2[BP]               ; AX = b + var1 + func1(&b, a)
        mov     _var2,AX
        mov     AX,P+SIZEPTR[BP]        ; AX = a
        sub     AX,_var2                ; AX = a - var2
        mov     SP,BP                   ; dump local variables
        pop     BP                      ; restore old frame pointer
        ret                             ; AX has return value
_func2  endp                            ; end of function func2
        endcode func2                   ; end of code segment
    ELSE
        begdata                         ; start of data seg
        extrn   _var1:dword
_var2   dd      0                       ; allocate var2
        enddata                         ; end of data segment

        extrn   _func1:near             ; near function

        begcode func2
        public  _func2                  ; make func2 global
_func2  proc    near                    ; define function func2
        push    EBP                     ; save old frame pointer
        mov     EBP,ESP                 ; set new frame pointer
        sub     ESP,4                   ; create room for b
        uses                            ; preserve EBX
        mov     EAX,-4[EBP]             ; EAX = b
        mov     EBX,P[EBP]              ; EBX = pa
        mov     [EBX],EAX               ; *pa = b
        push    P+SIZEPTR[EBP]          ; push a onto stack
        lea     EAX,-4[EBP]             ; EAX = offset of b
        push    EAX
        call    near ptr _func1         ; call func1(&b, a)
        add     ESP,SIZEPTR+4           ; restore the stack
        add     EAX,_var1               ; func1 returned result in EAX
        add     EAX,-4[EBP]             ; EAX = b + var1 + func1(&b, a)
        mov     _var2,EAX
        mov     EAX,P+SIZEPTR[EBP]      ; EAX = a
        sub     EAX,_var2               ; EAX = a - var2
        unuse                           ; restore EBX
        mov     ESP,EBP                 ; dump local variables
        pop     EBP                     ; restore old frame ptr.
        ret                             ; EAX has return value
_func2  endp                            ; end of function func2
        endcode func2                   ; end of code segment
    ENDIF
        END                             ; end of module

EXTRN statements for code should be outside the begcode/endcode pairs; otherwise, a message about fixup errors from the linker can be generated when using the Medium or Large models.
Digital Mars recommends that you write your assembly language functions inline or use C linkage (that is, declare them as extern "C"), rather than use C++ linkage. If you must use C++ linkage, see the book Microsoft Object Mapping Specification for implementation details.
MASM /MX /DI8086? module;

where ? is one of S, M, C, L, or V, corresponding to the appropriate memory model. The Small model is the default. You can see this by looking at the file macros.asm. Do not define I8086T for Tiny model programs; use I8086S instead. (Remember that the only difference between Tiny and Small programs is how they are linked, not how they are compiled or assembled.) For 32-bit programs, define the symbol as /DDOS386.
The /MX switch is necessary so that all global names are case sensitive. Do not use the /ML switch; it causes some versions of MASM to assemble 8087 opcodes incorrectly.
The /R switch enables the assembling of 8087 opcodes.
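The rules for choosing the define can be captured in a small helper, for example in a build script. The sketch below only assembles the command string from the switches named above and does not run MASM. The function masm_command is hypothetical, and the letter 'X' used to request the 32-bit define is invented for this sketch.

```cpp
#include <cassert>
#include <string>

// Builds a MASM command line following the rules in the text. 'model' is
// one of 'T', 'S', 'M', 'C', 'L', 'V' for 16-bit code, or 'X' (a letter
// invented for this sketch) to request the 32-bit define.
std::string masm_command(char model, const std::string &module)
{
    std::string define;
    if (model == 'X')
        define = "/DDOS386";
    else if (model == 'T')
        define = "/DI8086S";            // Tiny assembles exactly like Small
    else
        define = std::string("/DI8086") + model;
    return "MASM /MX " + define + " " + module + ";";
}
```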
DMC++ offers built-in support for MASM (Versions 5.0 and higher; Version 5.1 is recommended). If a file argument to the compiler ends in .asm, the compiler tries to assemble it with MASM. If you specify a memory model, the compiler passes the appropriate define to MASM. The compiler passes -g, -D, -v, and -I options to MASM as the corresponding MASM switches.
sc -mp test
Table 5-6 Register variables

_EAX    _AX    _AH    _AL
_EBX    _BX    _BH    _BL
_ECX    _CX    _CH    _CL
_EDX    _DX    _DH    _DL
_ESI    _SI
_EDI    _DI
_EBP    _BP
_ESP    _SP

The extended registers are not available in 16-bit compilations.
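Each row of the table names overlapping views of one register: _AX is the low 16 bits of _EAX, and _AH and _AL are the high and low bytes of _AX. The sketch below models that overlap with shifts and masks in portable C++. It does not use the register variables themselves, and Reg32 and demo_reg are illustrative names.

```cpp
#include <cassert>
#include <cstdint>

// Software model of one general register and its overlapping parts.
struct Reg32 {
    std::uint32_t e;   // plays the role of _EAX

    std::uint16_t x() const { return static_cast<std::uint16_t>(e & 0xFFFFu); }     // _AX
    std::uint8_t  h() const { return static_cast<std::uint8_t>((e >> 8) & 0xFFu); } // _AH
    std::uint8_t  l() const { return static_cast<std::uint8_t>(e & 0xFFu); }        // _AL

    // Writing one byte leaves the other 24 bits untouched.
    void set_h(std::uint8_t v) { e = (e & 0xFFFF00FFu) | (static_cast<std::uint32_t>(v) << 8); }
    void set_l(std::uint8_t v) { e = (e & 0xFFFFFF00u) | v; }
};

Reg32 demo_reg()
{
    Reg32 a = { 0x12345678u };
    a.set_h(0x02);     // comparable to the "mov AH,2" in the gotoxy example
    return a;
}
```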
The register variables have the following types:
Registers             Type
Byte registers        unsigned char
Word registers        unsigned short
Extended registers    unsigned long

Keep the following limitations in mind when you use register variables:
Note: The __emit__ function replaces the asm() function supported in Zortech 3.1.
Calls to __emit__ have the form:
__emit__(arg1, arg2, ...);

The type of each argument determines the number of bytes stored, with this exception: If the argument is of type int and has a value in the range 0 to 255, only one byte is stored. Therefore, to store sizeof(int) bytes, cast the argument to unsigned:

__emit__(1,(unsigned) 23,6);

or use the u postfix:
__emit__(1,23u, 6);
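The byte-count rule can be modeled with overloads, which makes the effect of the cast or the u postfix easy to see. The sketch below is a model of the rule as stated above, not the compiler's implementation of __emit__. The functions emitted_size and total_emitted are hypothetical, only int and unsigned arguments are covered, and the variadic template assumes a C++11 compiler.

```cpp
#include <cassert>
#include <cstddef>

// Bytes one argument would contribute under the rule described above.
std::size_t emitted_size(int v)
{
    return (v >= 0 && v <= 255) ? 1 : sizeof(int);  // small ints shrink to a byte
}
std::size_t emitted_size(unsigned) { return sizeof(unsigned); }

// Total for a whole argument list.
inline std::size_t total_emitted() { return 0; }

template <typename T, typename... Rest>
std::size_t total_emitted(T first, Rest... rest)
{
    return emitted_size(first) + total_emitted(rest...);
}
```

With this model, total_emitted(1, 23, 6) counts three bytes, while total_emitted(1, 23u, 6) counts two bytes plus sizeof(unsigned), matching the two forms shown above.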