Using Assembly Language Functions

This chapter describes how to call assembly language functions from both C and C++ and how to create an interface to assembly language modules. It explains conventions for function return values, register usage, and data alignment at the assembly language level.

Conventions for both 16- and 32-bit compilations are covered. When describing register usage and contents, the name of the corresponding 32-bit register appears in parentheses after the name of a 16-bit register.

For information about the advantages of writing assembly language code inline, instead of assembling it separately, see Using the Inline Assembler.

Implications of Type-Safe Linkage

Type-safe linkage affects how you call assembly language functions from a C++ program: you cannot use the standard C-to-assembly language interface for C++ functions. For more information, see "Type-Safe Linkage" in Mixing Languages.

The Easy Way to Call Assembly Code from C++

In many cases in which you want to call an assembly language routine from a C++ function, you can use the following method, which does not require you to worry about function-naming or parameter-passing conventions:
	extern "C"
	{
	    int assembler_routine(int x); 
	} 
This method tells the C++ compiler through its function prototype that your assembly language routine uses C linkage. This is the easiest method of specifying C linkage and does not involve any change to the naming of the assembly language routine.

For complete information, see the section "Creating Routines With C++ Linkage" in this chapter.

Using existing assembly language modules

If you already have some assembly language routines written for use with DMC or Microsoft C, you can almost certainly use them with DMC++. However, you will need an ANSI C standard header file containing the function prototypes for these routines, and you will need to modify it to declare the functions as taking C linkage.

The best method of specifying C linkage is to enclose the prototypes of your assembly language functions in an extern "C" { } block, as shown in the section "The Easy Way to Call Assembly Code from C++" above. The advantage is that you can use the same routine with both C and C++ modules.

Provided you include the header file containing the function prototypes in all the source files that use the assembly language routines, you will not even have to reassemble their code.
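
A header shared by C and C++ modules is commonly written with a guard on the predefined __cplusplus macro, so that only the C++ compiler sees the extern "C" wrapper. The sketch below assumes a hypothetical header name, asmfuncs.h, and reuses the assembler_routine prototype from above. The C++ body at the end is only a stand-in for the assembled module so that the fragment links on its own.

```cpp
// asmfuncs.h (hypothetical name): prototypes for assembly language routines
#ifndef ASMFUNCS_H
#define ASMFUNCS_H

#ifdef __cplusplus
extern "C" {            // seen only by a C++ compiler
#endif

int assembler_routine(int x);

#ifdef __cplusplus
}
#endif

#endif // ASMFUNCS_H

// Stand-in for the assembled module, for illustration only.
// In a real build this definition comes from the .asm object file.
extern "C" int assembler_routine(int x)
{
    return x + 1;       // arbitrary placeholder behavior
}
```

A C module that includes the same header sees a plain prototype, which is why the assembly code does not need to be reassembled.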

Similarly, when calling a C++ function from an assembly language routine, declare that function as having C linkage in your C++ program. The only exception to this rule is member functions, which cannot be given C linkage.

Organization of Object Files

Digital Mars .com files are not the same as those produced by other compilers. In most other compilers, CS==SS==DS for .com files, and the entire size of the program, plus stack and heap, must be less than 64KB. In Digital Mars .com files, only the size of the code plus DGROUP areas must be less than 64KB. Considerably larger .com programs can thus be created. Also, the only difference between a Digital Mars Tiny model program and a Small model program is how it is linked.

In all but the Tiny model, the STACK segment is set to 128 bytes in length. This is enough to allow the operating system to start up the program. Code in the C++ startup module, c.asm, then allocates a full stack elsewhere. The 128 bytes are subsequently used to store the program command line so that it is addressable using the DS register. In the Tiny model, the STACK segment is zero bytes in length.

All BSS segments are cleared to 0 by the startup module, regardless of the memory model in use.

For the Tiny, Small, and Medium models, there are two schemes for allocation of the near heap. These schemes are selected by the value of the global variable _okbigbuf. For more information on memory allocation, see "Choosing a memory model" in Chapter 2, "Compiling Code."

Layout of Assembly Language Modules

To work with DMC++, assembly language code must be divided into code and data segments. Executable code and functions callable from C or C++ go into the code segment. Static and global data declarations go into the data segment.

The pseudo-ops for defining the code and data segments for each memory model are different. Therefore, use the macros begcode, endcode, begdata, and enddata defined in macros.asm for each memory model. The general layout for an .asm source file is:

	INCLUDE MACROS.ASM	;define memory model macros
				;EXTRN statements for C/C++ 
				;functions to call go here 

	begdata			;define start of data
				;EXTRN statements for 
				;external data globals go here 

	enddata			;define end of data segment 

	begcode modulename	;define start of code
				;executable code goes here 

	endcode modulename	;define end of code segment 

	END			;define end of module 

Function Return Values for 16-Bit Models

For the 16-bit memory models (Tiny, Small, Medium, Compact, and Large), near pointers, ints, unsigned ints, and shorts are returned in AX. Chars are returned in AL. Longs and unsigned longs are returned in DX, AX, where DX contains the most significant 16 bits and AX contains the least significant 16 bits. Far pointers are returned in DX, AX, where DX has the segment portion and AX has the offset.
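
As a rough illustration of the DX, AX convention for longs, the following sketch splits a 32-bit value into the two 16-bit halves a 16-bit function would return and joins them again as the caller would. The struct and function names are made up for this example and are not part of DMC++.

```cpp
#include <cstdint>

// Model of the 16-bit register pair used for long return values.
struct RegPairDXAX {
    std::uint16_t dx;   // most significant 16 bits
    std::uint16_t ax;   // least significant 16 bits
};

// Split a 32-bit value the way a 16-bit function returns a long.
constexpr RegPairDXAX splitLong(std::uint32_t value)
{
    return RegPairDXAX{ static_cast<std::uint16_t>(value >> 16),
                        static_cast<std::uint16_t>(value & 0xFFFFu) };
}

// Reassemble the value as the caller would read it from DX, AX.
constexpr std::uint32_t joinLong(RegPairDXAX r)
{
    return (static_cast<std::uint32_t>(r.dx) << 16) | r.ax;
}
```

For example, splitLong(0x12345678) yields 0x1234 for DX and 0x5678 for AX.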

When C linkage is in effect, floats are returned in DX, AX, and doubles are returned in AX, BX, CX, DX, where AX contains the most significant 16 bits and DX contains the least significant. When C++ linkage is in effect, the compiler creates a temporary copy of the variable on the stack and returns a pointer to it. Both these techniques are reentrant.

1-byte structs are returned in AL, 2-byte structs in AX, and 4-byte structs in DX, AX. With larger structures, the method used depends on the linkage system in use for the function.

For C linkage, when a function returns a structure, it actually returns a pointer to the structure, which is in the static data segment. This means that C functions that return structures are not reentrant. C++ linkage creates a temporary copy on the stack and returns a pointer to it, which is reentrant.

Function Return Values for 32-Bit Models

Near pointers, ints, unsigned ints, longs, and unsigned longs are returned in EAX. Chars are returned in AL; shorts are returned in AX. Far pointers are returned in DX, EAX, where DX contains the segment and EAX contains the offset. long longs are returned in EDX, EAX.

When C linkage is in effect, floats are returned in EAX and doubles in EDX, EAX, where EDX contains the most significant 32 bits and EAX contains the least significant 32 bits. When C++ linkage is in effect, the compiler creates a temporary copy of the variable on the stack and returns a pointer to it. Both these techniques are reentrant.
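
To make the C linkage case concrete, the sketch below computes the bit patterns that would sit in EAX for a float and in EDX, EAX for a double. It assumes IEEE 754 representations (a 32-bit float and a 64-bit double), which the text above does not state explicitly; the helper names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Bit pattern of a float as it would appear in EAX.
std::uint32_t floatBitsInEAX(float f)
{
    static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

struct RegPairEDXEAX {
    std::uint32_t edx;  // most significant 32 bits
    std::uint32_t eax;  // least significant 32 bits
};

// Bit pattern of a double split across EDX (high) and EAX (low).
RegPairEDXEAX doubleBitsInEDXEAX(double d)
{
    static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return RegPairEDXEAX{ static_cast<std::uint32_t>(bits >> 32),
                          static_cast<std::uint32_t>(bits & 0xFFFFFFFFu) };
}
```

An assembly language routine that returns a double under C linkage would load these two halves into EDX and EAX before returning.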

1-byte structs are returned in AL, 2-byte structs in AX, and 4-byte structs in EAX. With larger structures, the compiler creates a temporary copy of the variable on the stack and returns a (reentrant) pointer to it.

For 32-bit C++ code, where a struct has no constructors or destructors declared for it, 1-byte structs are returned in AL, 2-byte structs in AX, 4-byte structs in EAX, and 8-byte structs in EDX, EAX.

Warning: In previous versions of DMC++, small structs without constructors in 32-bit C++ code were passed through a hidden pointer to the return value. The change described above was made for compatibility with Microsoft. Due to this change, if you build part of an application with the current version of DMC++, you need to rebuild all of the application; otherwise, crash bugs could be introduced.

Register usage and data alignment for 16-bit models

When interfacing to 16-bit memory models, assembly language functions can change the values in AX, BX, CX, DX, or ES. Functions must preserve the values in SI, DI, BP, SP, SS, CS, and DS. The direction flag must always be set to forward.

Data should be aligned along 16-bit boundaries to maximize speed on 16-bit buses.

Register usage and data alignment for 32-bit models

When interfacing to 32-bit memory models, assembly language functions can change the values in EAX, ECX, EDX, or ES. Functions must preserve the values in EBX, ESI, EDI, EBP, ESP, SS, FS, CS, and DS. The direction flag must always be set to forward.

Data should be aligned along 32-bit boundaries to maximize speed on 32-bit buses.

Macros in macros.asm

There are macros defined in macros.asm that aid in the development of memory model-independent assembly language files. The macros are:
Table 5-1 Macros defined in macros.asm

	begcode		Define start of code segment
	endcode		Define end of code segment
	begdata		Define start of initialized data segment
	enddata		Define end of initialized data segment
	SIZEPTR		Default pointer size in bytes (2 for Tiny, Small, and Medium models; 4 for Compact, Large, Phar Lap, and DOSX models)
	P		Offset of first parameter from BP (EBP)
	SPTR		Non-zero if pointers are near by default (Tiny, Small, Medium, Phar Lap, and DOSX memory models)
	LPTR		Non-zero if pointers are far by default (Compact and Large memory models)
	LCODE		Non-zero if large code (Medium or Large memory models)
	SSeqDS		Non-zero if SS == DS
	ESeqDS		Non-zero if ES == DS
	uses		Pushes registers that must be saved
	unuse		Pops saved registers
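
The model-dependent values in Table 5-1 can be summarized in a few lines of C++. This is only a model of what the table says, not the contents of macros.asm; the enum and function names are invented for the sketch.

```cpp
// Memory models named in Table 5-1.
enum class Model { Tiny, Small, Medium, Compact, Large, PharLap, DOSX };

// LPTR: non-zero if pointers are far by default.
constexpr bool lptr(Model m)
{
    return m == Model::Compact || m == Model::Large;
}

// SPTR: non-zero if pointers are near by default.
constexpr bool sptr(Model m) { return !lptr(m); }

// LCODE: non-zero for large-code models.
constexpr bool lcode(Model m)
{
    return m == Model::Medium || m == Model::Large;
}

// SIZEPTR: default pointer size in bytes.
constexpr int sizeptr(Model m)
{
    return (m == Model::Tiny || m == Model::Small || m == Model::Medium) ? 2 : 4;
}
```

Note that the Phar Lap and DOSX models have near pointers that are 4 bytes wide, so SPTR is non-zero even though SIZEPTR is 4.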

Creating Routines with C Linkage

Calling an assembly language routine directly from a C function is much easier than calling an assembly language routine from C++.

Subroutine linkage

The BP register (EBP for 32-bit compilations) is dedicated to pointing to the current stack frame. A subroutine with C linkage is called by pushing the arguments onto the stack from right to left; then the subroutine is called. The called subroutine saves the old BP (EBP) on the stack, sets BP (EBP) to point to it, allocates space on the stack for all local variables, and pushes SI (ESI) and DI (EDI) if they are needed by the function. 32-bit code must also save the EBX register. The body of the subroutine is then executed.

The subroutine returns by popping EBX (32-bit code only), DI (EDI) and SI (ESI), deallocating space on the stack for the local variables, popping off the old value of BP (EBP), and returning. The calling code then removes the parameters from the stack.

Organization of the stack frame

The stack frame of a function is the current state of the stack and variables in it at a given point in the execution of the function. The table below shows the normal organization of the stack frame.
Table 5-2 Normal organization of stack frame 

			High memory 
			Previous stack frame 
			Parameters 
			Return address 
	BP (EBP) ->	Old value of BP (EBP) 
			Local variables and temporaries 
			SI (ESI) 
	SP (ESP) ->	DI (EDI) 
			Low memory 
The stack grows downward (toward lower addresses).

Small model example

The example below shows a short C++ program that calls an assembly language function using C linkage. This function sets the cursor position to the coordinates x, y. All macros are expanded, and the calling function is translated to assembly language to further show how the compiler translates a function with C linkage. The utility to translate the function is obj2asm.exe.

Here is the C++ program:

	extern "C" void gotoxy(int x, int y); // essential! 
	// normally in a header file 

	int main()
	{ 
	    gotoxy(10, 20); // set cursor position at ROW 10, COL 20 
	    return 1;
	} 
After compiling the C++ program to an object file, use the utility OBJ2ASM to produce the assembly language equivalent below:
	_TEXT segment
	_main: 
		mov AX,014h	; move 20 into AX
		push AX		; push on stack (2 BYTES) 
		mov AX,0Ah	; move 10 into AX
		push AX		; push on stack (2 BYTES) 
		callm _gotoxy	; call gotoxy() function
		add SP,4	; adjust stack ptr. (4 BYTES) 
		ret
	_TEXT ends 
or for a 32-bit memory model:
	_TEXT segment
	_main: 
		push 014h	; push 20 on stack (4 BYTES)
		push 0Ah	; push 10 on stack (4 BYTES) 
		callm _gotoxy	; call gotoxy() function
		add ESP,8	; adjust stack ptr. (8 BYTES) 
		ret
	_TEXT ends 
Since the function gotoxy has been defined as using C linkage, the variables are pushed on the stack from right to left.

First, the column (20) is pushed on the stack. Next, the row (10) is pushed on the stack. Finally, the call to gotoxy is made, pushing the instruction pointer (IP) on the stack. Note that the 32-bit version pushes the parameters directly onto the stack, whereas the 16-bit version first moves them into AX, which illustrates one advantage of generating 32-bit code.

The compiler prepends an underscore to the function names, giving _main and _gotoxy. The _TEXT segment is the CODE segment. Table 5-3 shows how the stack looks after the call to _gotoxy(10,20):

Table 5-3 Stack after the call to _gotoxy(10,20)
			High memory 
	BP+4 (EBP+8)	20 
	BP+2 (EBP+4)	10 
	BP+0 (EBP+0)	IP (EIP) return address 
			Low memory 

The assembly language function below relies on macros.asm, a set of utility macros for MASM 5.0 and above that is supplied with the compiler and normally installed in the INCLUDE directory. All macros are defined in macros.asm. The 32-bit version is selected when the symbol DOS386 is defined.

	include macros.asm	; pull in defs of macros
	begcode gotoxy		; define start of code seg called gotoxy

	;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; 
	; C++ interface routine, C linkage.
	; Puts cursor at row, col. 
	; Usage:
	; void gotoxy(int row, int col); 

	IFNDEF DOS386 
	public _gotoxy		; make gotoxy global
	_gotoxy proc near	; define start of func 
		push BP		; save old stack frame
		mov BP,SP	; set BP to point to old BP 
		mov DH,P[BP]	; DH = row
		mov DL,P[BP+2]	; DL = col 
		mov AH,2	; BIOS function set cursor pos.
		xor BX,BX	; page 0 
		int 10h		; BIOS video interrupt
		pop BP		; restore old BP 
		ret		; return to caller
	_gotoxy endp		; define end of func 
	ELSE 
	public _gotoxy		; make gotoxy global
	_gotoxy proc near	; define start of func 
		push EBP	; save old stack frame
		mov EBP,ESP	; set EBP to point to old BP 
		uses 	; saves registers that are used
		mov DH,P[EBP]	; DH = row 
		mov DL,P[EBP+4]	; DL = col
		mov AH,2	; BIOS function set cursor pos. 
		xor EBX,EBX	; page 0
		int 10h		; BIOS video interrupt 
		unuse 	; note reverse order
		pop EBP		; restore old EBP 
		ret		; return to caller
	_gotoxy endp		; define end of func 
	ENDIF 
	endcode gotoxy		; define end of code seg
	end			; end of module
The assembly language function begins by first pushing BP (EBP) onto the stack and moving the stack pointer into BP (EBP). This permits access to the variables pushed onto the stack by the calling function.

This is done by using BP (EBP) to point to offsets within the stack. In the example above, MOV DH,P[BP] (MOV DH,P[EBP]) obtains the row number from the stack and places it in DH. The next diagram shows the variables and their positions on the stack. P is the offset from BP (EBP) to the first parameter on the stack; it expands to 4 for the Tiny, Small, and Compact models, 6 for the Medium and Large models, and 8 for the Phar Lap and DOSX models.

Table 5-4 Stack for tiny, small, and compact memory models 

			High memory 
	BP+6 (EBP+12)	20 
	BP+4 (EBP+8)	10 
	BP+2 (EBP+4)	IP (EIP) return address 
	BP+0 (EBP+0)	Previous BP (EBP) 
			Low memory 
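
The 16-bit column of Table 5-4 can be reproduced with a toy model of the stack that replays the caller's pushes, the near call, and the callee's prologue for gotoxy(10, 20). The starting SP and BP values and the return address are arbitrary placeholders, and the Stack16 type is invented for this sketch.

```cpp
#include <cstdint>
#include <map>

// Toy model of a 16-bit stack: address -> 16-bit word.
struct Stack16 {
    std::map<std::uint16_t, std::uint16_t> mem;
    std::uint16_t sp;
    std::uint16_t bp;

    // The stack grows downward: decrement SP, then store.
    void push(std::uint16_t value) { sp -= 2; mem[sp] = value; }
    std::uint16_t wordAt(std::uint16_t addr) const { return mem.at(addr); }
};

// State of the stack just after "mov BP,SP" inside _gotoxy.
Stack16 enterGotoxy()
{
    Stack16 s{ {}, 0x0100, 0x0200 };    // placeholder SP and BP
    s.push(20);         // rightmost argument first
    s.push(10);         // then the leftmost argument
    s.push(0x1234);     // near call pushes IP (placeholder value)
    s.push(s.bp);       // callee: push BP
    s.bp = s.sp;        // callee: mov BP,SP
    return s;
}
```

Reading the word at BP+4 gives 10 and the word at BP+6 gives 20, matching the table.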
After completing the function, restore BP (EBP) and return to the calling function. The ret pops IP (EIP) off the stack, and execution resumes at the instruction following the call to _gotoxy. That instruction is ADD SP,4 (ADD ESP,8), which resets the stack pointer to the position it occupied before the parameters were pushed.

The above example pertains to the Tiny, Small, Compact, and the 32-bit models. If the Large or Medium memory models are used, the far call also pushes CS onto the stack. This changes the position of the variables on the stack to those shown below:

Table 5-5 Stack for large and medium memory models 

		High memory 
	BP+8	20 
	BP+6	10 
	BP+4	CS return segment 
	BP+2	IP return address 
	BP+0	Previous BP 
		Low Memory 
Using the P macro (defined in macros.asm) compensates for these differences.
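
As a cross-check on Tables 5-4 and 5-5, the sketch below computes where each int parameter lands relative to BP (EBP) from the values of P given earlier. The enum and function names are hypothetical and only model the documented offsets; they are not taken from macros.asm.

```cpp
enum class Model { Tiny, Small, Medium, Compact, Large, PharLap, DOSX };

// Value of P: offset from BP (EBP) to the first parameter.
constexpr int paramBase(Model m)
{
    return (m == Model::PharLap || m == Model::DOSX) ? 8
         : (m == Model::Medium || m == Model::Large) ? 6
         : 4;
}

// Size in bytes of one int slot on the stack.
constexpr int intSlot(Model m)
{
    return (m == Model::PharLap || m == Model::DOSX) ? 4 : 2;
}

// Offset from BP (EBP) of the n-th int parameter, counting from 0.
constexpr int intParamOffset(Model m, int n)
{
    return paramBase(m) + n * intSlot(m);
}
```

For the Small model this gives BP+4 and BP+6, for the Large model BP+6 and BP+8, and for the 32-bit models EBP+8 and EBP+12.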

Model-independent example

This example illustrates an assembly language routine to implement the following C function. The routine is written to make it assemble correctly for any memory model:
	// C++ MODULE
	extern int var1; 
	int var2;
	extern "C" int func1(int *p, int a);// essential! 

	int func2(int *pa, int a)
	{   
	    int b;
	    *pa = b; 
	    var2 = b + var1 + func1(&b, a);
	    return a - var2; 
	} 
Here is the corresponding assembly language module:
	; Assembler MODULE
	include MACROS.ASM 

	IFNDEF DOS386 
	begdata			; define start of data seg
	extrn	_var1:word 
	_var2	dw	0	; allocate var2
	enddata			; end of data segment 

	IF LCODE		; if large code model
	extrn	_func1:far	; then far function 
	ELSE
	extrn	_func1:near	; else near function 
	ENDIF 

	begcode	func2
	public	_func2		; make func2 global 
	IF LCODE
	_func2	proc far	; define function func2 
	ELSE
	_func2	proc near	; define function func2 
	ENDIF
	push	BP		; save old frame pointer 
	mov	BP,SP		; set new frame pointer
	sub	SP,2		; create room for b 
	mov	AX,-2[BP]	; AX = b
	IF SPTR			; if small memory model 
	mov	BX,P[BP]	; BX = pa
	mov	[BX],AX		; *pa = b 
	ELSE			; else large memory model
	les	BX,P[BP]	; ES:BX = pa 
	mov	ES:[BX],AX	; *pa = b
	ENDIF 
	push	P+SIZEPTR[BP]	; push a onto stack
	IF LPTR			; if far pointers 
	push	SS		; push segment of b
	ENDIF 
	lea	AX,-2[BP]	; AX = offset of b
	push	AX 
	call	_func1		; call func1(&b, a)
	add	SP,SIZEPTR+2	; restore the stack 
	add	AX,_var1	; func1 returned result in AX 
	add	AX,-2[BP]	; AX = b + var1 + func1(&b, a)
	mov	_var2,AX 
	mov	AX,P+SIZEPTR[BP]; AX = a
	sub	AX,_var2	; AX = a -var2 
	mov	SP,BP		; dump local variables
	pop	BP		; restore old frame pointer
	ret			; AX has return value 
	_func2	endp		; end of function func2
	endcode	func2		; end of code segment 

	ELSE 
	begdata			; start of data seg
	extrn _var1:dword 
	_var2 dd 0		; allocate var2
	enddata			; end of data segment 

	extrn	_func1:near	; near function 
	begcode	func2
	public	_func2		; make func2 global 
	_func2	proc near	; define function func2
	push	EBP		; save old frame pointer 
	mov	EBP,ESP		; set new frame pointer
	sub	ESP,4		; create room for b 
	uses	<EBX>		; preserve EBX
	mov	EAX,-4[EBP]	; EAX = b 
	mov	EBX,P[EBP]	; EBX = pa
	mov	[EBX],EAX	; *pa = b 
	push	P+SIZEPTR[EBP]	; push a onto stack
	lea	EAX,-4[EBP]	; EAX = offset of b 
	push	EAX
	call	near ptr _func1	; call func1(&b, a) 
	add	ESP,SIZEPTR+4	; restore the stack
	add	EAX,_var1	; func1 returned result in EAX
	add	EAX,-4[EBP]	; EAX = b + var1 + func1(&b, a) 
	mov	_var2,EAX
	mov	EAX,P+SIZEPTR[EBP]	; EAX = a 
	sub	EAX,_var2	; EAX = a - var2
	unuse	<EBX>		; restore EBX 
	mov	ESP,EBP		; dump local variables
	pop	EBP		; restore old frame ptr. 
	ret			; EAX has return value
	_func2 endp		; end of function func2 
	endcode func2		; end of code segment 
	endif 
	END			; end of module 
EXTRN statements for code should be outside the begcode/endcode pairs; otherwise, the linker can report fixup errors when using the Medium or Large models.

Creating Routines With C++ Linkage

In almost all cases, it is better to use C linkage for assembly language functions that will be called from C++ code. This ensures compatibility with future versions of DMC++ and other compilers and avoids the problems associated with subtle differences in C++ calling conventions in different situations.

Digital Mars recommends that you write your assembly language functions inline or use C linkage (that is, declare them as extern "C"), rather than use C++ linkage. If you must use C++ linkage, see the book Microsoft Object Mapping Specification for implementation details.

Running MASM

When you call MASM, the include file macros.asm sets up macros, depending on which memory model you desire. You indicate the memory model by defining a symbol on the command line:
	MASM /MX /DI8086? module; 
where ? is one of S, M, C, L, or V, corresponding to the appropriate memory model. The Small model is the default. You can see this by looking at the file macros.asm. Do not define I8086T for Tiny model programs; use I8086S instead. (Remember that the only difference between Tiny and Small programs is how they are linked, not how they are compiled or assembled.) For 32-bit programs, define the symbol as /DDOS386.
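
The selection rule above could be expressed in a build script helper along the following lines. The function name, the is32bit flag, and the use of 'T' as shorthand for the Tiny model are all illustrative assumptions; only the resulting switches come from the text.

```cpp
#include <string>

// Builds the /D switch described above. `model` is one of
// 'T', 'S', 'M', 'C', 'L', or 'V'; `is32bit` selects DOS386.
std::string masmModelDefine(char model, bool is32bit)
{
    if (is32bit)
        return "/DDOS386";
    if (model == 'T')       // Tiny assembles exactly like Small
        model = 'S';
    return std::string("/DI8086") + model;
}
```

For the Large model this yields /DI8086L, and a Tiny model program gets /DI8086S, the same switch as a Small model program.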

The /MX switch is necessary so that all global names are case sensitive. Do not use the /ML switch; it causes some versions of MASM to assemble 8087 opcodes incorrectly.

The /R switch enables the assembling of 8087 opcodes.

DMC++ offers built-in support for MASM (Versions 5.0 and higher; Version 5.1 is recommended). If a file argument to the compiler ends in .asm, the compiler tries to assemble it with MASM. If you specify a memory model, the compiler passes the appropriate define to MASM. The compiler passes -g, -D, -v, and -I options to MASM as the corresponding MASM switches.

Support for 386ASM

SC also supports the Phar Lap assembler, 386ASM. To assemble test.asm using 386ASM, use:
	sc -mp test 

Using Register Variables

DMC++ defines the following register variables:
	Table 5-6 Register variables
	_EAX _AX _AH _AL 
	_EBX _BX _BH _BL 
	_ECX _CX _CH _CL 
	_EDX _DX _DH _DL 
	_ESI _SI 
	_EDI _DI 
	_EBP _BP 
	_ESP _SP 
The extended registers are not available in 16-bit compilations.

The register variables have the following types:

	Registers		Type
	Byte registers		unsigned char 
	Word registers		unsigned short 
	Extended registers	unsigned long 
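
Each row of Table 5-6 names overlapping views of one register. The sketch below models that overlap with plain integer arithmetic, using fixed-width types in place of unsigned char, unsigned short, and unsigned long because those sizes vary between hosts. It does not use the DMC++ register variables themselves, and the function names are hypothetical.

```cpp
#include <cstdint>

// Sub-register views of a 32-bit value, following the rows of Table 5-6.
constexpr std::uint16_t lowWord(std::uint32_t eax)      // _AX from _EAX
{
    return static_cast<std::uint16_t>(eax & 0xFFFFu);
}

constexpr std::uint8_t highByte(std::uint16_t ax)       // _AH from _AX
{
    return static_cast<std::uint8_t>(ax >> 8);
}

constexpr std::uint8_t lowByte(std::uint16_t ax)        // _AL from _AX
{
    return static_cast<std::uint8_t>(ax & 0xFFu);
}
```

With 0x12345678 in _EAX, _AX reads as 0x5678, _AH as 0x56, and _AL as 0x78.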
Keep the following limitations in mind when you use register variables:

Using the __emit__ Function

The __emit__ function lets you insert inline machine instructions into your program in byte pairs. Although of limited usefulness in writing large routines (use the inline assembler instead), the __emit__ function is comparable to the inline assembler for implementing simple functions.

Note: The __emit__ function replaces the asm() function supported in Zortech 3.1.

Calls to __emit__ have the form:

	__emit__(arg1, arg2, . . .); 
The type of each argument determines the number of bytes stored, with this exception: If the argument is of type int and has a value in the range 0 to 255, only one byte is stored. Therefore, to store sizeof(int) bytes, cast the argument to unsigned:
	__emit__(1,(unsigned) 23,6); 
or use the u postfix:
	__emit__(1,23u, 6);
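
The byte-count rule can be modeled with two overloads, as a way of checking how many bytes a given call would store. This is a sketch of the rule as stated above, not of the compiler's implementation; emitSize is a made-up name, and sizeof(int) here is that of the host compiler (it is 2 in 16-bit DMC++ compilations and 4 in 32-bit ones).

```cpp
#include <cstddef>

// Bytes stored for an int argument: one byte if the value fits in 0..255.
constexpr std::size_t emitSize(int value)
{
    return (value >= 0 && value <= 255) ? 1 : sizeof(int);
}

// Bytes stored for an unsigned argument: always sizeof(unsigned).
constexpr std::size_t emitSize(unsigned)
{
    return sizeof(unsigned);
}
```

Under this model, __emit__(1, 23u, 6) stores one byte, then sizeof(unsigned) bytes, then one more byte.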