LONG: Calling C Routines from Basic -- Part 1 of 2

ID: Q104511


The information in this article applies to:


SUMMARY

Microsoft Basic supports calls to routines written in Microsoft C. This application note describes the necessary syntax for calling Microsoft C routines and contains a series of examples demonstrating the interlanguage calling capabilities between Basic and C.

For more information about interlanguage calling, refer to the Microsoft Mixed-Language Programming Guide for the MS-DOS Operating System.


MORE INFORMATION

WARNING: One or more of the following functions are discussed in this article: VarPtr, VarPtrArray, VarPtrStringArray, StrPtr, ObjPtr. These functions are not supported by Microsoft Technical Support. They are not documented in the Visual Basic documentation and are provided in this Knowledge Base article "as is." Microsoft does not guarantee that they will be available in future releases of Visual Basic.

This two-page document contains the following sections:

Version Compatibility
Making Mixed-Language Calls
Naming Convention Requirements
Calling Convention Requirements
Parameter-Passing Requirements
Basic Arguments
C Arguments
Restrictions on Calls from Basic
Memory Allocation
Incompatible Functions
The Basic Interface to C
Using the Alias Feature
Using the Parameter List
Alternative Basic Interfaces
C Calls to Basic
Compiling and Linking
Data Types: Numerical Formats
Data Types: String Formats
Data Types: Arrays
Data Types: Structures, Records, and User-Defined Types
Common Blocks
Calling MS-DOS I/O Routines Does Not Affect Basic Cursor Position
Debugging Mixed-Language Programs
Compiling and Linking the Sample Programs
Appendix: Common Pitfalls
Passing Numeric Variables from Basic to C by Near Reference
Passing Numeric Variables Between Basic and C by Far Reference
Passing Numeric Variables from Basic to C by Value
Passing a Character from Basic to C
Passing a Basic Variable-Length String to C by Far Reference
Passing a Basic Fixed-Length String to C by Near Reference
Passing a Basic Fixed-Length String to C by Far Reference
Passing a String from C to Basic
Passing a Basic User-Defined Type to C by Near Reference
Passing a Basic User-Defined Type to C by Far Reference
Passing a Basic Integer Array to C by Far Reference
Passing a Basic Array Of Long Integers to C by Far Reference
Passing a Basic Single-Precision Array to C by Far Reference
Passing a Basic Double-Precision Array to C by Far Reference
Passing a Basic Array Of Fixed-Length Strings to C
Passing a Basic Array Of User-Defined Type to C
Passing a Basic Two-Dimensional Integer Array to C by Far Reference
Passing a Common Block from Basic to C by Far Reference
Passing a Fixed-Length String from C to Basic by Far Reference
C Functions Returning Numerics to Basic

VERSION COMPATIBILITY

The following table specifies which versions of Microsoft Basic can be linked with specific versions of Microsoft C or Quick C:

Basic                          C/C++    Quick C
QuickBasic 4.5                 5.1      2.0
Basic PDS 7.1                  6.0      2.5
Visual Basic for MS-DOS 1.0    7.0

MAKING MIXED-LANGUAGE CALLS

Mixed-language programming always involves a call; specifically, it involves a function or subprogram call. For example, a Basic main module may need to execute a specific task that you would like to program separately. In addition to calling a Basic subprogram or function, you can call a C function.

Mixed-language calls necessarily involve multiple modules. Instead of compiling all of your source modules with the same compiler, you use different compilers. In the example mentioned above, you would compile the main-module source file with the Basic compiler, compile another source file (written in C) with the C compiler, and then link the two object files.

There are two types of routines that can be called. The principal difference is that some kinds of routines return values, and others do not. (NOTE: In this article, "routine" refers to any function or subprogram procedure that can be called from another module.)

The following table compares the types of routine calls in C and Basic:

Language    Returned Value    No Returned Value
Basic       FUNCTION          subprogram (SUB ... END SUB)
C           function          void function

Basic DEF FN functions and GOSUB subroutines cannot be called from another language.

NAMING CONVENTION REQUIREMENTS

The term "naming convention" refers to the way that a compiler alters the name of the routine before placing it into an object file.

It is important that you adopt a compatible naming convention when you issue a mixed-language call. If the name of the called routine is stored differently in each object file, the linker will not be able to find a match. Instead, it will report an unresolved external reference.

When Microsoft compilers place machine code into object files, they also include the names of all routines and variables that need to be accessed publicly. That way, the linker can compare the name of a routine called in one module to the name of a routine defined in another module and can recognize a match.

Basic and C use different naming conventions. Basic translates each letter to uppercase and drops the declaration character (%, &, !, #, @, $). Basic recognizes the first 40 characters of a routine name.

C uses a different convention; the C compiler does not translate any letters to uppercase but inserts a leading underscore (_) in front of the name of each routine. C recognizes the first 31 characters of a name.
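
The two conventions can be mimicked in portable C. The sketch below is illustrative only: the helper names basic_public_name and c_public_name are hypothetical, and real compilers apply these rules internally when they write the object file. It assumes that the type-declaration character is dropped before the 40-character limit is applied and that the 31-character limit does not count the leading underscore; the article does not specify either detail.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: mimic how Basic stores a routine name.
   A trailing type-declaration character is dropped, at most 40
   characters are kept, and every letter is converted to uppercase. */
void basic_public_name(const char *src, char *dst, size_t dst_size)
{
    size_t len = strlen(src);
    size_t i;

    if (dst_size == 0)
        return;
    if (len > 0 && strchr("%&!#@$", src[len - 1]) != NULL)
        len--;                      /* drop the declaration character */
    if (len > 40)
        len = 40;                   /* Basic's name length limit */
    if (len > dst_size - 1)
        len = dst_size - 1;         /* never overflow the caller's buffer */
    for (i = 0; i < len; i++)
        dst[i] = (char)toupper((unsigned char)src[i]);
    dst[len] = '\0';
}

/* Hypothetical helper: mimic how C stores a routine name.
   Case is preserved, at most 31 characters are kept, and a leading
   underscore is inserted. */
void c_public_name(const char *src, char *dst, size_t dst_size)
{
    size_t len = strlen(src);

    if (dst_size == 0)
        return;
    if (dst_size == 1) {
        dst[0] = '\0';
        return;
    }
    if (len > 31)
        len = 31;                   /* C's name length limit */
    if (len > dst_size - 2)
        len = dst_size - 2;         /* room for '_' and the terminator */
    dst[0] = '_';
    memcpy(dst + 1, src, len);
    dst[len + 1] = '\0';
}
```

With these helpers, the Basic name Fact% becomes FACT and the C name fact becomes _fact, which shows why the two object files do not match unless a mixed-language keyword such as CDECL reconciles them.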

When linking, it is important not to use the /NOIGNORE linker option. Differences in naming conventions are taken care of for you automatically by mixed-language keywords as long as you do not use the /NOIGNORE linker option. Using this option causes the linker to distinguish among routines with different capitalization; for example, routines named "Prn" and "prn" would cause problems when linking Basic and C programs.

The CL driver automatically uses the /NOIGNORE option when linking. To solve the problems created by this behavior, either link separately with the LINK utility or use all uppercase letters in your C modules (when not using CDECL, Basic translates all routine names to uppercase).

CALLING CONVENTION REQUIREMENTS

The term "calling convention" refers to the way that a language implements a call. The choice of calling convention affects the actual machine instructions that a compiler generates to execute (and return from) a function or subprogram call.

The use of a calling convention affects programming in two ways:
  1. The calling routine uses a calling convention to determine the order in which to pass arguments (parameters) to another routine. The calling convention can usually be specified in a mixed-language interface.


  2. The called routine uses a calling convention to determine the order in which to receive the parameters that were passed to it. In most languages, this convention can be specified in the routine's heading. Basic, however, always uses its own convention to receive parameters.


Basic and C use different calling conventions. Basic's calling convention pushes parameters onto the stack in the order in which they appear in the source code. For example, the Basic statement CALL Calc(A, B) pushes argument A onto the stack before it pushes B. This convention also specifies that the stack is restored by the called routine, just before returning control to the caller. (The stack is restored by removing the parameters.)

The C calling convention pushes parameters onto the stack in the reverse of the order in which they appear in the source code. For example, the C function call calc(a, b); pushes b onto the stack before it pushes a. In contrast with Basic, the C calling convention specifies that the calling routine always restores the stack immediately after the called routine returns control.

When declaring a function in C, the PASCAL keyword can be used to indicate the calling convention used by Basic (Basic and Pascal both use the same calling convention). For example:
extern pascal int function1(int, int);
When declaring a function in Basic, the CDECL keyword can be used to declare the function as using the C calling convention. For example:
DECLARE FUNCTION Function1% CDECL (BYVAL N AS INTEGER)
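
The difference in push order can be modeled in portable C. The following is a simulation only, not real calling-convention code: SimStack, push_args_pascal, and push_args_cdecl are hypothetical names, and the "stack" is an ordinary array that records the order in which arguments would be pushed.

```c
#include <stddef.h>

/* Simulated stack: slots[0] holds the first value pushed. */
typedef struct {
    int slots[8];
    size_t count;
} SimStack;

/* Record one pushed value; extra pushes beyond capacity are ignored. */
static void push(SimStack *s, int value)
{
    if (s->count < sizeof s->slots / sizeof s->slots[0])
        s->slots[s->count++] = value;
}

/* Basic (Pascal) convention: push arguments in source order. */
void push_args_pascal(SimStack *s, const int *args, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        push(s, args[i]);
}

/* C convention: push arguments in reverse source order. */
void push_args_cdecl(SimStack *s, const int *args, size_t n)
{
    size_t i;
    for (i = n; i > 0; i--)
        push(s, args[i - 1]);
}
```

For the argument list (A, B), the Pascal-style routine records A first and B second, while the C-style routine records B first and A second. A called routine that expects one order but is given the other reads its parameters swapped, which is why the PASCAL or CDECL keyword is needed on one side of the call.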

PARAMETER-PASSING REQUIREMENTS

Microsoft compilers support three methods for passing a parameter, as explained in the following table:

Near reference
    Passes a variable's near (offset) address. This method gives the called routine direct access to the variable itself. Any change the routine makes to the parameter will be reflected in the calling routine. Basic assumes that all variables are passed by near (offset) address off of DGROUP. This means that all C data passed to Basic must be in the near data segment.
Far reference
    Passes a variable's far (segmented) address. This method is similar to passing by near reference, except that the segment as well as the offset is passed. This allows the variable to reside anywhere in memory.
By value
    Passes only the variable's value, not its address. With this method, the called routine knows the value of the parameter but has no access to the original variable. Changes to the value parameter have no effect on the value of the parameter in the calling routine once the called routine terminates.


The fact that there are different parameter-passing methods has two implications for mixed-language programming:
  1. You need to make sure that the called routine and the calling routine use the same method for passing each parameter (argument). In most cases, you must check the parameter-passing defaults used by each language and possibly make adjustments. Each language has keywords or language features that allow you to change the parameter-passing method.


  2. You may want to use a particular parameter-passing method rather than using the default for the language.


Each of these default methods can be overridden, as shown in the following sections.

BASIC ARGUMENTS

The default for Basic is to pass all arguments by near reference. This can be overridden by using the SEG directive or by using CALLS instead of CALL. Both of these methods cause Basic to pass both the segment and offset. These methods can be used only to call non-Basic routines because Basic receives all parameters by near reference.

Although Basic can pass parameters to other languages by far reference using either the SEG directive or CALLS, Basic routines can be called only from other languages when parameters are passed by near reference. You cannot DECLARE or CALL a Basic routine with parameters that have SEG attributes. SEG is used only for parameters of non-Basic routines.

Passing Basic Arguments by Value

In Basic, the BYVAL keyword is used to pass arguments by value. An argument is passed by value when the called routine is first declared with a DECLARE statement, and the BYVAL keyword is applied to the argument. For example:
DECLARE SUB CRoutine CDECL (BYVAL a AS INTEGER)

Passing Basic Arguments by Near Reference

The Basic default is to pass by near reference. The use of SEG, BYVAL, or CALLS changes this default.

Passing Basic Arguments by Far Reference

Basic passes each argument in a call by far reference when CALLS is used to invoke a routine. Using SEG to modify a parameter in a preceding DECLARE statement also causes a Basic CALL to pass parameters by far reference.

CALLS cannot be used to call a routine that is named in a DECLARE statement. Because of this, CDECL (an option of the DECLARE statement) cannot be specified when using CALLS. This means that CALLS always passes parameters using the Pascal calling convention.

C ARGUMENTS

The default for C is to pass all arrays by reference (near or far, depending on the memory model) and all other data types by value. C uses far data pointers for compact, large, and huge models, and near data pointers for small and medium models.

Passing C Arguments by Reference (Near or Far)

In C, passing a pointer to an object is equivalent to passing the object itself by reference. After control is passed to the called function, each reference to the parameter itself is prefixed by * (an asterisk).

To pass a pointer to an object, prefix the argument in the function call with &. To receive a pointer to an object, prefix the parameter's declaration with *. In the latter case, this may mean adding a second * to a parameter that already has an *. For example, to receive a pointer by value, declare it as:
int *ptr;
but to receive the same pointer by reference, declare it as follows:
int **ptr;
The default for arrays is to pass by reference.
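
The three cases can be seen side by side in a short sketch. The function names are hypothetical, and the near and far keywords are left out so that the code compiles with a current compiler.

```c
#include <stddef.h>

/* Receives an int by value: the assignment changes only the local copy. */
void set_by_value(int n)
{
    n = 99;
    (void)n;    /* silence "set but not used" warnings */
}

/* Receives an int by reference: the caller's variable is changed. */
void set_by_reference(int *n)
{
    *n = 99;
}

/* Receives a pointer by reference: the caller's pointer itself is
   redirected to a different object. */
void redirect_pointer(int **ptr, int *new_target)
{
    *ptr = new_target;
}
```

A caller would write set_by_value(a), set_by_reference(&a), and redirect_pointer(&p, &b). Only the last two can alter anything that belongs to the caller.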

Effect of C Memory Models on Size of Reference

Near reference is the default for passing pointers in small and medium model C. Far reference is the default in the compact, large, and huge models.

All C programs that are called from Basic must be compiled with the medium or large memory models.

Near pointers can be specified with the near keyword, which overrides the default pointer size. However, if you are going to override the default pointer size of a parameter, you must explicitly declare the parameter type in function declarations as well as function definitions. Far pointers can be specified with the far keyword.

RESTRICTIONS ON CALLS FROM BASIC

Basic has a much more complex environment and initialization procedure than C. Interlanguage calling between Basic and C is possible only because Basic intercepts a number of C library function calls and handles them in its own way. Because of this, Basic must be the initial environment that the program starts in, and from there, C routines can be called (which can, in turn, call Basic routines). This means that a program cannot start up with C main-module code and then call Basic routines.

Basic creates a host environment in which the C routines can function. However, Basic is limited in its ability to handle some C function calls. These limitations are as follows:

MEMORY ALLOCATION

If your C module is a medium model and you do dynamic memory allocation with malloc(), or if you execute explicit calls to _nmalloc() with any memory model, you must include the following lines in your Basic source code before you call C:
DIM mallocbuf%(2048)
COMMON SHARED /NMALLOC/ mallocbuf%()
The array can have any name; only the size of the array is significant. However, the name of the COMMON block must be NMALLOC. In the interpreter environment, you must put this declaration in a module that you load as a Quick library.

The example above has the effect of reserving 4K in the COMMON block NMALLOC (integers take 2 bytes each, and there are 2048 integers allocated). When Basic intercepts C malloc calls, Basic allocates space out of this COMMON block.

WARNING: When you use the Basic statement CLEAR, all space allocated with near malloc calls will be lost. If you use CLEAR at all, use it only before any calls to malloc.

When you make far-memory requests in mixed-language programs, you may find it useful to first call the Basic function SETMEM. This function can be used to reduce the amount of memory Basic is using, thus freeing up memory for far allocations.

INCOMPATIBLE FUNCTIONS

The following C functions are incompatible with Basic and should be avoided:

  1. All forms of spawn() and exec()


  2. system()


  3. getenv()


  4. putenv()


Do not link with the VARSTCK.OBJ module. C provides this module to allocate memory from the stack.

The C graphics libraries GRAPHICS.LIB and PGCHART.LIB are not compatible with Basic. Many of the C graphics routines conflict with the Basic graphics routines. If graphics need to be done, they should be done in Basic. Linking with Microsoft C graphics routines may give "Duplicate Definition" errors, even if you LINK with the /NOE option.

THE BASIC INTERFACE TO C

The Basic DECLARE statement provides a flexible and convenient interface to C. When you call a function, the DECLARE statement syntax is as follows:
DECLARE FUNCTION name [CDECL][ALIAS "aliasname"][(parameter-list)]
The name field is the name of the function or subprogram that you want to call, as it appears in the Basic source file. The following are the recommended steps for using the DECLARE statement when calling C:
  1. For each distinct C routine you plan to call, put a DECLARE statement in your Basic source file before the routine is called.


  2. Use CDECL in the DECLARE statement (unless the C routine is declared with the PASCAL or FORTRAN keywords or the /Gc compile switch).


  3. If you are calling a C routine with a name longer than 31 characters, use the ALIAS feature. The use of ALIAS is explained in the following section.


  4. Use the parameter list to determine how each parameter is to be passed. The use of the parameter list is explained in the section, "Using the Parameter List."


  5. Once the routine is properly declared, call it just as you would a Basic subprogram or function.


USING THE ALIAS FEATURE

You may need to use the ALIAS feature because C places the first 31 characters of a name into an object file, whereas Basic places up to 40 characters of a name into an object file.

You do not need the ALIAS feature to remove type declaration characters (%, &, !, #, @, $). Basic automatically removes these characters when it generates object code. Thus, Fact% in Basic matches FACT in C.

The ALIAS keyword directs Basic to place aliasname into the object file, instead of name. The Basic source file still contains calls to name. However, these calls are interpreted as if they were actually calls to aliasname. This is used when a Basic name is longer than 31 characters and must be called from C. For example:
DECLARE FUNCTION QuadraticPolynomialFunctionLeastSquares% _
   ALIAS "QUADRATI" (a, b, c)
QUADRATI, the alias name, contains the first eight characters of the name QuadraticPolynomialFunctionLeastSquares%. This causes Basic to place QUADRATI into the object file, thereby mimicking C's behavior. (NOTE: If the CDECL keyword was used in the DECLARE statement, Basic would take care of this automatically.)

USING THE PARAMETER LIST

The parameter list syntax is as follows:
[BYVAL | SEG] variable [AS type]...,
You can use BYVAL or SEG, but not both. Explanations of each field are as follows:
  1. Use the BYVAL keyword to declare a value parameter. When such a function is called, the corresponding argument will be passed by value (the default method for C modules).

    Basic provides two ways of "passing by value." The usual method of passing by value is to use an extra set of parentheses, as in:
    CALL HOLM((A))
    This extra-parentheses method actually creates a temporary value whose address is passed. The BYVAL keyword method provides a true method of passing by value because the value itself, not an address, is passed. Only by using BYVAL will a Basic program be compatible with a C routine that expects a value parameter.


  2. Use the SEG keyword to declare a far-reference parameter. When the function is called, the far (segmented) address of the corresponding argument will be passed.


  3. You can choose any legal name for the variable, but only the type associated with the name has any significance to Basic. As with other variables, the type can be indicated with a type declaration character (%, &, !, #, @, $) or the implicit declaration.


  4. You can use the "AS type" clause to override the type declaration of variable. The type can be INTEGER, LONG, SINGLE, DOUBLE, STRING, CURRENCY, a user-defined type, or ANY (which directs Basic to permit any type of data to be passed as the argument). For example:
    DECLARE FUNCTION Calc2! CDECL (BYVAL a%, BYVAL b%, BYVAL c!)


In the example above, Calc2 is declared as a C routine that takes three arguments: the first two are integers passed by value, and the last is a SINGLE-precision real number passed by value.

ALTERNATIVE BASIC INTERFACES

Instead of modifying the behavior of Basic with CDECL, you can modify the behavior of C by applying the PASCAL or FORTRAN keyword to the function definition heading. (These two keywords are functionally equivalent.) Or, you can compile the C module with the /Gc option, which specifies that all C functions, calls, and public symbols use the conventions of Basic.

For example, the following C function uses the Basic conventions to receive an integer parameter:
int pascal fun1(int n)
You can specify parameter-passing methods without using a DECLARE statement or by using a DECLARE statement and omitting the parameter list.
  1. You can make the call with the CALLS statement. The CALLS statement causes each parameter to be passed by far reference.


  2. You can use the BYVAL and SEG keywords in the actual parameter list when you make the call. For example:
    CALL Fun2(BYVAL Term1, BYVAL Term2, SEG Sum)


In the example above, BYVAL and SEG have the same meaning that they have in a Basic DECLARE statement. When you use BYVAL and SEG this way, however, you must be careful because neither the type nor the number of parameters will be checked as they would be in a DECLARE statement.

C CALLS TO BASIC

No Basic routine can be executed unless the main program is in Basic because a Basic routine requires an initialization environment that is unique to Basic. C will not perform this special initialization.

However, it is possible for a program to start up in Basic, call a C function that does most of the work of the program, and then call Basic subprograms and functions as needed.

The following rules are recommended when you call Basic from C:
  1. Start up in a Basic main module. You must use the DECLARE statement to provide an interface to the C module.


  2. In the C module, declare the Basic routine as extern and include type information for parameters. Use either the FORTRAN or PASCAL keyword to modify the routine itself or compile the C routine with the /Gc switch.


  3. Make sure that all parameters passed by reference are passed using near pointers. Basic cannot receive parameters by far reference. With near pointers, the program assumes that the data is in the default data segment. If you want to pass data that is not in the default data segment, then first copy the data to a variable that is in the default data segment.


  4. Compile the C module in medium or large model.


COMPILING AND LINKING

After you have written your source files and resolved the issues raised in the above sections, you are ready to compile individual modules and link them together.

Before linking, each program module must be compiled with the appropriate compiler. The C modules must be compiled in medium or large model.

In many cases, linking modules compiled with C and Basic can be done easily. Any of the following measures will ensure that all of the required libraries are linked in the correct order:
  1. Put all language libraries in the same directory as the source files.


  2. List directories containing all needed libraries in the LIB environment variable.


  3. Let the linker prompt you for libraries.


In each of the above cases, the linker finds the libraries in the order that it requires them. If you enter the libraries on the command line, the Basic libraries must precede all others.

When linking, the /NOE switch should be used. This prevents "Duplicate Definition" errors.

/NOE is for NO Extended library search. Normally, if a module in a library uses routines in another module, then both of the modules are automatically pulled in without searching the library for the second module. /NOE makes the linker search for the second module.

When the linker automatically pulls in these secondary modules, it can pull in duplicate modules from two libraries. /NOE causes the linker to search for the secondary modules. This means the linker always pulls in the module from the first library the module resides in. Since two duplicate modules won't be brought in, this solves most duplicate definition errors.

If the linker is still generating duplicate definition errors, then use the /NOD switch as well as the /NOE switch. /NOD is for NO Default library search. An object file contains references to libraries that it will need. The linker automatically pulls in these libraries unless /NOD is used. If the /NOD switch is used, the Basic and C libraries must be explicitly defined on the link command line. For example, to link a medium model Microsoft C program with a stand-alone Microsoft Visual Basic version 1.0 program using the /NOD switch, the following link line could be used:
LINK /NOE /NOD basicprg cprog,,, VBDCL10E.LIB mlibce.lib;

DATA TYPES: NUMERICAL FORMATS

Numerical data formats are the simplest kinds of data to pass between C and Basic. The following chart shows the equivalent data types in each language:

Basic                        C
x%, INTEGER                  short, int
...                          unsigned short, unsigned (not available in Basic)
x&, LONG                     long
x!, SINGLE                   float
x#, DOUBLE                   double
STRING * 1                   char (inside a structure)
char (inside a structure)    char (outside a structure)


DATA TYPES: STRING FORMATS

Basic String Format

Variable-length strings in Basic use "string descriptors" that contain the length and address of the string value. The format of the string descriptor is proprietary and subject to change.

The Standard and Professional Editions of Microsoft Visual Basic version 1.0 for MS-DOS use far strings. Information on using far strings with other languages is covered in "Microsoft Visual Basic Professional Edition Features," Chapter 3, "Mixed-Language Programming."

NOTE: Fixed-length strings do not have string descriptors.

C String Format

C stores strings as simple arrays of bytes and uses a null character (numerical 0, ASCII NUL) as the delimiter. For example, consider the string declared as follows:
char str[] = "hello";
The string is stored in 6 bytes of memory as:

+-+-+-+-+-+--+
|h|e|l|l|o|\0|
+-+-+-+-+-+--+

Since str is an array like any other, it is passed by reference, just as other C arrays are.
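
The storage size can be checked directly in C. In this small sketch, the helper names are hypothetical. sizeof reports the bytes the array occupies, including the terminating null character, while strlen counts only the characters before it.

```c
#include <stddef.h>
#include <string.h>

/* The array size comes from the initializer, terminator included. */
static const char str[] = "hello";

/* Bytes occupied by the array: five letters plus the null character. */
size_t hello_storage_bytes(void)
{
    return sizeof str;
}

/* Characters before the null character. */
size_t hello_length(void)
{
    return strlen(str);
}
```

For "hello", hello_storage_bytes() returns 6 and hello_length() returns 5.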

Passing Strings from Basic

When a Basic string (such as A$) appears in an argument list, Basic passes a string descriptor rather than the string data itself. The Basic string descriptor is not compatible with the string format of C.

The routine that receives the string must be aware that if any Basic routine is called, Basic's string-space management routines may change the location of the string data without warning. In this case, the calling routine must note that the values in the string descriptor may change.

The SADD, SSEGADD, and LEN functions extract parts of the string descriptor. SADD or SSEGADD extracts the address of the actual string data, and LEN extracts the length. The results of these functions can then be passed.

Basic should pass the result of the SADD or SSEGADD function by value. Bear in mind that the string's address, not the string itself, is passed by value. This amounts to passing the string itself by reference. The Basic module passes the string's address, and the other module receives the string's address. The address returned by SSEGADD is declared as type LONG, but is actually a far pointer.

Before attempting to pass a Basic string to C, you may want to first append a null byte to the end, with an instruction such as the following:
A$ = A$ + CHR$(0)
The string now conforms to the C string format.
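
On the C side, a routine that receives the address and length separately does not need the appended null byte at all. The following is a minimal sketch, assuming Basic passes the result of SADD or SSEGADD and the result of LEN by value. The name copy_basic_string is hypothetical, and the far keyword that a 16-bit compiler would need for an SSEGADD address is left out so that the code compiles with a current compiler.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical receiver for a Basic string.
   data     : address of the string data (from SADD or SSEGADD)
   len      : number of characters (from LEN)
   out      : caller-supplied buffer that receives a C string
   out_size : size of that buffer in bytes
   Returns the number of characters copied. The data is copied at once
   because Basic may move the original string later. */
size_t copy_basic_string(const char *data, size_t len,
                         char *out, size_t out_size)
{
    if (out_size == 0)
        return 0;
    if (len > out_size - 1)
        len = out_size - 1;     /* leave room for the terminator */
    memcpy(out, data, len);
    out[len] = '\0';
    return len;
}
```

Copying the data into storage that C owns also avoids relying on an address that Basic's string-space management could invalidate.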

Fixed-Length Strings

Fixed-length strings in Basic are stored simply as arrays of contiguous bytes of characters, with no terminating character. There is no string descriptor for a fixed-length string.

To pass a fixed-length string to a routine, the string must be put into a user-defined type. For example:
TYPE FixType
   A AS STRING * 10
END TYPE
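
A matching C declaration for this type could look like the following sketch. The struct mirrors the Basic type, and the helper fixtype_to_cstring is hypothetical. It assumes that the string was assigned in Basic, so that unused positions hold blanks, and that the output buffer has room for at least 11 bytes.

```c
#include <stddef.h>
#include <string.h>

/* C view of the Basic type: ten characters, no terminator. */
struct FixType {
    char a[10];
};

/* Hypothetical helper: copy the fixed-length field into a C string,
   dropping trailing blanks. out must hold at least 11 bytes. */
void fixtype_to_cstring(const struct FixType *rec, char *out)
{
    size_t len = sizeof rec->a;

    while (len > 0 && rec->a[len - 1] == ' ')
        len--;                  /* ignore blank padding */
    memcpy(out, rec->a, len);
    out[len] = '\0';
}
```

Because the field has no terminating character, C library functions such as strlen must not be applied to it directly.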

DATA TYPES: ARRAYS

There are three special problems that you must be aware of when passing arrays between Basic and C:
  1. Arrays are implemented differently in Basic, so you must take special precautions when passing an array from Basic to C.


  2. Arrays are declared differently in C and Basic.


  3. Passed arrays must be created in Basic.


Passing Arrays from Basic



Basic uses an array descriptor, which is similar in some respects to a string descriptor. The array descriptor is necessary because Basic may shift the location of array data in memory. Therefore, you can safely pass arrays from Basic only if you follow three rules:
  1. Pass the array's address by applying the VARPTR function to the first element of the array and passing the result by value (with BYVAL). To pass the far address of the array, apply both the VARPTR and VARSEG functions and pass each result by value (with BYVAL). C gets the address of the first element and considers it the address of the entire array.


  2. The routine that receives the array must not, under any circumstances, make a call back to Basic. If it does, then the location of the array may change, and the address that was passed to the routine becomes meaningless.


  3. Basic can pass any member of an array by value. With this method, the above precautions do not apply.
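
The receiving side of the first rule can be sketched as follows. The name sum_integers is hypothetical. The routine assumes a Basic INTEGER array, whose elements correspond to C short, and it takes the element count as a separate argument because C cannot read Basic's array descriptor. The near and far keywords are omitted so that the code compiles with a current compiler.

```c
#include <stddef.h>

/* Hypothetical receiver for a Basic INTEGER array.
   first : address of the first element, as obtained with VARPTR
   count : number of elements, supplied by the caller
   The routine makes no call back to Basic, so the address stays valid
   for the duration of the call. */
long sum_integers(const short *first, size_t count)
{
    long total = 0;
    size_t i;

    for (i = 0; i < count; i++)
        total += first[i];
    return total;
}
```

The C routine treats the address of the first element as the address of the whole array, which works because the elements are contiguous.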


Array Ordering

Basic and C differ in the way that arrays are ordered (or indexed). This issue affects only arrays with more than one dimension. There are two types of ordering: row-major and column-major.

Basic uses column-major ordering, in which the leftmost dimension changes fastest. C uses row-major ordering, in which the rightmost dimension changes fastest.

When you compile a Basic program with the BC command line, you can select the /R compile option, which specifies that row-major order is to be used, rather than column-major order.
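
The effect of the two orderings on a two-dimensional array can be expressed as offset formulas. This sketch assumes zero-based indexes; the function names are hypothetical. A C routine that receives a Basic array compiled without /R would use the column-major formula to locate an element.

```c
#include <stddef.h>

/* Offset (in elements) of element (row, col) in a column-major array,
   the Basic default: the leftmost index varies fastest. */
size_t column_major_offset(size_t row, size_t col, size_t nrows)
{
    return col * nrows + row;
}

/* Offset (in elements) of element (row, col) in a row-major array,
   the C layout: the rightmost index varies fastest. */
size_t row_major_offset(size_t row, size_t col, size_t ncols)
{
    return row * ncols + col;
}
```

For an array with 2 rows and 3 columns, element (0, 1) lies at offset 2 in column-major order but at offset 1 in row-major order, so indexing a Basic array with C's native subscripts reads the wrong element unless the indexes are transposed.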

Passed Arrays Must Be Created in Basic

Basic keeps track of all arrays by using a special structure called an array descriptor. The array descriptor is unique to Basic and is not available in any other language. Because of this, to pass an array from C to Basic, the array must first be created in Basic and then passed to the C routine. The C routine can then alter the values in the array, but it cannot change the length of the array.

DATA TYPES: STRUCTURES, RECORDS, AND USER-DEFINED TYPES

The C struct type and the Basic user-defined type are equivalent. However, these types may be affected by the storage method. By default, C uses word alignment (unpacked storage) for all data except byte-sized objects and arrays of byte-sized objects. This storage method specifies that occasional bytes can be added as padding, so that word and double-word objects start on an even boundary. (In addition, all nested structures and records start on a word boundary.)

When passing structures, the C routine should be compiled with packing turned on to make it compatible with Basic. This is done by specifying the /Zp switch on the CL compile line.
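
The effect of packing can be observed with a current compiler as well. In the sketch below, #pragma pack(push, 1) is used as a stand-in for the /Zp switch, which is an assumption made so the example is self-contained; the struct and function names are hypothetical. The amount of padding in the unpacked struct depends on the compiler and target, so only the packed layout is stated exactly.

```c
#include <stddef.h>

/* Default alignment: the compiler may insert padding after tag. */
struct Unpacked {
    char tag;
    long value;
};

/* Byte packing: members follow one another with no padding, which is
   the layout a Basic user-defined type uses. */
#pragma pack(push, 1)
struct Packed {
    char tag;
    long value;
};
#pragma pack(pop)

/* Offset of value in the packed layout: directly after the one-byte tag. */
size_t packed_value_offset(void)
{
    return offsetof(struct Packed, value);
}

/* Offset of value in the default layout: depends on the compiler. */
size_t unpacked_value_offset(void)
{
    return offsetof(struct Unpacked, value);
}
```

If the two offsets differ, a C routine compiled without packing would read the value member from the wrong position in a record passed from Basic.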

Version           : MS-DOS:1.0,7.1
Platform          : MS-DOS 

Last Reviewed: February 16, 1999