Foreign Language Code

From wiki.visual-prolog.com

Revision as of 23:24, 3 February 2008 by Thomas Linder Puls (Unicode link)

"Foreign Language Code" refers to code written in other programming languages than Visual Prolog. Visual Prolog can make direct calls to code written in foreign languages. This tutorial will explain some of the concepts and details. Calling foreign code directly requires interaction with this code on a very low binary level. In the simple case this is rather simple, but it can also be incredible complex. One thing is certain: handling complex situations requires a lot of insight in both the Visual Prolog and the foreign world. But don't let this scare you; in many cases the necessary interaction is actually rather simple.

The principles described in this tutorial will help you to make calls to the Microsoft Win32 Platform API, which will open a large world of possibilities to you.

Key Concepts

Foreign compilers do a number of things differently from Visual Prolog, both because they were designed by different people and because they have to support the different natures of different languages. It would be impossible for Visual Prolog to interact with all foreign code, simply because it is not possible to know the principles used by every other compiler. So, for Visual Prolog to interact with foreign code, this code has to behave in certain ways.

Visual Prolog can (among other things) call code that is written in C and compiled with Microsoft's C compiler. But it cannot call just any such code; the code has to be "compatible". I cannot set up strict limits, because it is very often possible to handle "impossible" cases if you are creative.

In order to use foreign code at all you must of course have access to the code. In this tutorial we are dealing with code that is either linked directly into your own program or located in a Dynamic Link Library (DLL).

Next you will have to locate the code. This is done by means of a name. If the code is linked directly into your program, the name you have to use is a "link" name. If the code is located in a DLL, you have to use the "export" name. Whether the name is called a link name or an export name makes no difference in the Visual Prolog code, but it may make a difference when you are trying to find the name in the foreign code/system. I write system because sometimes the name you have to use does not appear in the code at all. In the following I will just use the term link name for this concept.

Now we have the code and we know where it is. Next we have to pass input parameters and invoke the code and, when the code has executed, we have to retrieve the output, and so forth. There exist many different ways to perform this process. Obviously the caller and the callee must agree on how this has to be done. In other words, the caller and the callee must agree on the calling convention.

But not only must the caller and the callee agree on how parameters etc. are transferred, it is also important that the caller and the callee both interpret the transferred bytes in the same way. In other words, the data representation must be the same.

The last thing that must be dealt with is memory management. Both the caller and the callee must agree on who allocates memory and who deallocates it, and on when this should happen. And if the deallocator is not the same as the allocator, the deallocator will also have to know how to deallocate the memory.

All in all there are four key issues to care about when calling foreign code:

  • link name
  • calling convention
  • data representation
  • memory management

Calling Convention & Link Name

I will discuss calling convention and link names together because these are related in the sense that traditionally certain compilers use some calling convention together with some scheme for link names.

The link name (or export name) is the name used to identify the foreign code that you want to call. Different compilers use different link names by default, and many compilers have ways of specifying specific link names. In Visual Prolog you can state the link name in an "as" qualification to a predicate declaration:

predicates
    pppp : (integer) as "LinkName".

In the Visual Prolog program the predicate above is named "pppp", but its link name is "LinkName".

Notice that only class predicates have a link name, which means that the predicate must be declared either in a class or in a class predicates section in an implementation.

Visual Prolog supports a number of different calling conventions. The calling convention is also stated in the predicate declaration, but using the "language" qualification:

predicates
    qqqq : (integer) language c.

By convention, C compilers create link names by adding an underscore in front of the name that is used in the C program. If you use the c calling convention and do not supply a link name, Visual Prolog also uses this convention (notice that up to build 6107 the compiler actually uses another naming strategy, which means that you have to use "as" to obtain such link names). This means that the link name of qqqq above becomes "_qqqq". If you state a link name, it is used literally:

predicates
    rrrr : (integer) language c as "LinkName".

rrrr will use the C calling convention and have the link name LinkName (i.e. without an underscore).

C++ compilers traditionally use the C calling convention, but they cannot rely on the C link names, because C++ allows overloading. That is, in C++ the same name can be used for different functions as long as these have a different number of arguments and/or arguments with "distinguishable" types. These different variants must have different link names. Therefore C++ compilers create "sophisticated" names based on the C++ name, the number of arguments and the types of these arguments. This process is known as name mangling.

Different compilers use different name mangling algorithms, so these are (typically) not compatible with each other. Therefore it is quite common either to use explicit link names in C++ routines that should be accessed from foreign code, or to enclose the declaration inside an extern "C" section, which makes the compiler use the C naming convention.
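On the C/C++ side, a header that asks for the C naming convention is usually written so that it works for both C and C++ compilers. The following is only a minimal sketch; add_numbers is a hypothetical routine that is not part of any real library:

```c
/* Sketch: a declaration that gets a C link name whether the file is
   compiled as C or as C++. add_numbers is a hypothetical routine. */
#ifdef __cplusplus
extern "C" {            /* C++ compilers: do not mangle these names */
#endif

int add_numbers(int a, int b);

#ifdef __cplusplus
}
#endif

/* The definition. Its link name is built from "add_numbers" using the
   C naming convention of the compiler. */
int add_numbers(int a, int b)
{
    return a + b;
}
```

A C compiler never sees the extern "C" lines, while a C++ compiler sees them and skips name mangling for the enclosed declarations.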

Visual Prolog also supports a calling convention known as stdcall. This calling convention is used by Microsoft Visual Basic, for Microsoft Win32 Platform API calls, and by many Pascal-family compilers, including Borland Delphi. In fact, C and C++ programs very often also use this calling convention for routines that have to be accessed from foreign code. Visual Prolog uses the same naming convention for stdcall as for c calls (notice that up to build 6107 the compiler actually uses another naming strategy, which means that you have to use "as" to obtain such link names). That means that by default an underscore is added in front of the Prolog name to create the link name, but if you specify a link name, it is used literally.

Microsoft Win32 Platform API uses the stdcall calling convention, but it uses another naming convention. The names are decorated a little more than in the c calling convention. You can find the details in the Visual Prolog language reference manual. Visual Prolog has a special calling convention, apicall, in order to support this naming convention. apicall is actually the same calling convention as stdcall, but the names are decorated differently. With the apicall calling convention, explicit link names stated in an "as" qualification are also decorated. If you need a name that is decorated differently, you will have to use the stdcall calling convention and specify the decorated name yourself.
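If you have to write a decorated name yourself, it helps to know what such a name typically looks like. The sketch below assumes the decoration that 32-bit Microsoft compilers apply to stdcall routines (an underscore, the name, an "@" and the number of argument bytes in decimal). This scheme is my assumption about the foreign compiler, not something defined by Visual Prolog, and stdcall_decorated_name is a hypothetical helper:

```c
#include <stdio.h>

/* Sketch, assuming 32-bit Microsoft-style decoration of stdcall names:
   "_" + name + "@" + number of argument bytes (in decimal).
   Returns 0 on success and -1 if the output buffer is too small. */
int stdcall_decorated_name(char *out, size_t out_size,
                           const char *name, unsigned arg_bytes)
{
    int n = snprintf(out, out_size, "_%s@%u", name, arg_bytes);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

For a routine called myRoutine that takes three 32-bit arguments, this produces _myRoutine@12, which could then be stated literally in an "as" qualification together with the stdcall calling convention.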

Data Representation

It is beyond the scope of this tutorial to deal with data representation in detail, because there are many details. But I will give a brief description of how Visual Prolog represents various data.

Input number parameters are passed by value, meaning that the value is pushed directly on the call stack. Output number parameters are passed by reference, meaning that a pointer to where the result should be stored is pushed on the call stack. On the call stack all integral numbers occupy 32 bits, whereas reals occupy 64 bits.

Characters are represented as numbers.

All other data is represented by a pointer to the real data. Input parameters are passed by pushing the pointer directly on the call stack. Output parameters are passed by pushing a pointer to where the resulting pointer should be stored (i.e. a pointer to a pointer).
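Seen from the C side, these rules could look as follows. This is only a sketch under the assumptions described above; half_of, count_chars and get_greeting are hypothetical routines invented for illustration:

```c
#include <stdint.h>
#include <wchar.h>

/* Input number: the value itself arrives on the call stack.
   Output number: the caller passes a pointer to where the result
   should be stored. */
void half_of(int32_t value, int32_t *result)
{
    *result = value / 2;
}

/* Input string: a pointer to the characters is passed. */
int32_t count_chars(const wchar_t *text)
{
    return (int32_t)wcslen(text);
}

/* Output string: the caller passes a pointer to a pointer, and the
   callee stores the resulting pointer there. */
static const wchar_t greeting[] = L"hello";

void get_greeting(const wchar_t **result)
{
    *result = greeting;   /* static storage, so nothing has to be freed */
}
```

get_greeting hands out a pointer to static storage, which sidesteps the question of who should release the memory.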

The values of a functor domain are (thus) represented as a pointer to a piece of memory, which first holds the functor (represented by a number in 32 bits) followed by each of the sub-components. The sub-components are represented as described above, i.e. either as a number directly in the functor value, or as a pointer to the actual data.

If a functor domain only has one alternative, then the functor is skipped, as this would always have the same value anyway. (Notice that in Visual Prolog 5 the functor is present unless the domain is declared using the struct keyword, but in the current version of Visual Prolog it is never present).

The functor representation can be changed by using the align qualification; please refer to the language reference regarding this.

Strings are represented as a pointer to a zero terminated sequence of characters, like in C.

Binaries are represented as a pointer to the binary data. A value equal to the length of the data plus the size of an unsigned is stored immediately before the data.
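The layouts described above can be mirrored in C roughly like this. It is a sketch only: functor_value describes a hypothetical domain alternative with one number and one string, the exact layout depends on alignment, and binary_data_length assumes that the size field is a 32-bit unsigned:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical functor value with two sub-components. */
struct functor_value {
    uint32_t       functor;  /* tells which alternative of the domain this is */
    int32_t        number;   /* a number is stored directly */
    const wchar_t *text;     /* other data is stored as a pointer */
};

/* Length in bytes of the data of a binary, given the pointer to the data.
   Assumption: the 32-bit field just before the data holds the data length
   plus the size of the field itself. */
uint32_t binary_data_length(const void *binary)
{
    uint32_t stored;
    memcpy(&stored, (const unsigned char *)binary - sizeof stored, sizeof stored);
    return stored - (uint32_t)sizeof stored;
}
```

Using memcpy to read the size field avoids assumptions about how the pointer is aligned.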

Example

Assume that you want to call a C routine from your Visual Prolog program. The declaration in C looks like this:

int myRoutine(wchar_t * TheString, int BufferLength, wchar_t * TheResult);

"wchar_t *" means Unicode string in C. So the routine takes three arguments:

  • TheString which is a string
  • BufferLength which is an integer
  • TheResult which is a string

And it returns an integer. The corresponding Visual Prolog declaration could therefore look like this (assuming that it is declared in an implementation):

class predicates
    myRoutine : (string TheString, integer BufferLength, string TheResult) -> integer language c.

If the predicate is declared in a class declaration, then the "class" keyword should be removed.

"externally" & Libraries

If you declare a predicate like the one above, the compiler will give an error stating that it cannot find the clauses for the declared predicate. Since the predicate is not supposed to be implemented in Visual Prolog at all, this is not surprising. What you have to do is inform the compiler that the code for this predicate is to be found somewhere externally. I use the word externally because this is exactly what you have to say in a so-called resolve qualification in the implementation of the class that declared the predicate. If the class is called xxx, then it looks like this:

implement xxx
    resolve
        myRoutine externally
    ...

When you have done this, the compiler will accept the declaration, but you may find that the linker now gives an error, stating that _myRoutine is undefined. This is because the library that contains _myRoutine is not part of your project.

If _myRoutine is situated in a static library (a LIB file), then that file should simply be added to the project and the linker will then extract relevant code from the LIB and add it to the program.

If _myRoutine is situated in a DLL, then you should still have a LIB file that describes the DLL, and this LIB file should be added to the project. In this case the linker will place information in your program that describes which DLL the routine should be found in.

If _myRoutine is situated in a DLL, but you do not have a LIB file corresponding to that DLL, you can still call the routine. You can simply write the name of the DLL in the resolve qualification like this:

implement xxx
    resolve
        myRoutine externally from "myDLL.dll"
    ...

In this case the compiler will actually add code, which loads the DLL dynamically, when myRoutine is invoked the first time. The DLL is not unloaded again (i.e. not before the program exits).

The package pfc/application/useDll can also be used to load and unload DLLs dynamically. How (and why) to use that package is outside the scope of this tutorial.

Memory Management

Numbers and characters are passed directly on the call stack or written directly in the needed piece of memory, so the memory handling of these types is trivial. It simply works.

Data that is not transferred directly on the call stack is more problematic. The problem is that the memory in which the data resides must live as long as one of the routines (the caller or the callee) needs the data. But once neither of them needs it, the memory should be reclaimed to avoid memory leaks.

It is normally the case that only the one that allocates some memory can reclaim it again, simply because the other actor cannot know how to reclaim it. It is, of course, possible to make sure that both sides use the same mechanism. One side could for example expose its mechanism to the other side (for example as exported routines).

The memory allocation routines used in Visual Prolog are exposed by the runtime system (in Vip6kernel.dll) and can be called from foreign code.

To deal with the problem of knowing when the memory should be reclaimed, the program and the library code normally use some agreed method.

The Typical Solution

In this section I will describe what I believe is the most common way to solve the memory handling problems (as long as we are not involving COM and .Net).

The principle is simple:

  • Input parameters only survive during the call, so if the callee needs to store some of the input for later use, it must copy this input to some memory, which it controls itself.
  • Output is copied to memory buffers, which are provided by the caller, so no memory, which is allocated by the callee, is passed to the caller.

The myRoutine above is a typical example of such a routine. The output is written in TheResult, which is actually a string that is allocated by the caller. That is why it appears to be input to the routine and that is also why the length of the "buffer" is passed (i.e. BufferLength).

The Visual Prolog 6 code that would call the code above could look like this:

clauses
    p(TheString) = TheResult :-
        TheResult = string::create(bufferLength),
        RetCode = myRoutine(TheString, bufferLength, TheResult),
        checkRetCode(RetCode).

string::create is used to allocate the memory for the string. myRoutine copies its result into TheResult (something a Visual Prolog program would never do); it does not return any memory it has allocated itself.
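To make the foreign side of this agreement concrete, here is what an implementation of myRoutine could look like in C. It is a hypothetical sketch, since the tutorial does not say what the routine really does: I assume that it simply copies TheString, that BufferLength counts the characters available in TheResult including the terminating zero, and that 0 means success.

```c
#include <wchar.h>

/* Hypothetical implementation of myRoutine: copies TheString into the
   buffer supplied by the caller.
   Assumptions: BufferLength is the number of wchar_t elements available
   in TheResult (terminating zero included); 0 means success. */
int myRoutine(wchar_t *TheString, int BufferLength, wchar_t *TheResult)
{
    size_t needed;

    if (TheString == NULL || TheResult == NULL || BufferLength <= 0)
        return 1;                       /* invalid arguments */

    needed = wcslen(TheString) + 1;     /* characters plus terminating zero */
    if (needed > (size_t)BufferLength)
        return 2;                       /* the caller's buffer is too small */

    wmemcpy(TheResult, TheString, needed);
    return 0;
}
```

The routine never allocates memory and never keeps the pointers it receives, so all memory stays under the control of the caller.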

Garbage Collection

If you do not rely on the "typical" solution, then take the following into account: Visual Prolog uses a garbage collector for managing its heap.

You should notice that the garbage collector in a way turns the reclaiming problem around: instead of reclaiming storage when it is no longer needed, you have to keep it alive for as long as it is needed in the foreign context. Normally this is not a problem, because it is handled completely automatically for you. But the garbage collector cannot (necessarily) know when data is alive in the foreign part of the code. Therefore you should make certain that the data stays alive for longer in the Visual Prolog part of the code than in the foreign part.

A typical way to keep memory alive is to assert it in a fact, and then retract it again, when it is not needed by the foreign code anymore.

Visual Prolog 6 used some memory called a G-stack. However, beginning with Visual Prolog 7.0, G-stack is not used.

Microsoft Win32 Platform API

You might think: "I will never write any foreign code, so this is not for me". You might be right about the first part of the sentence, but the conclusion might be wrong anyway. The reason is that the complete operating system is "foreign".

The operating system provides thousands of interesting foreign routines, which you can call from Prolog programs using the principles from this tutorial. These routines are known as the Microsoft Windows XXX Platform API. The routines vary from platform to platform; it is especially true that newer platforms (e.g. XP) have more routines than older platforms (e.g. NT 4.0).

The Platform API is documented online in Microsoft's MSDN library.

The documentation has one problem, though (for our purpose): it mentions a lot of constants by name, but nowhere does it say what values these constants have. For example, the documentation of the routine PlaySound says that you can use a flag called SND_ASYNC if you want the sound to be played asynchronously with the execution of your application (meaning that the sound plays while your application continues to do other things). SND_ASYNC is actually a number, but the documentation does not say what this number is.

To find the value of such constants you can download the Platform SDK for C/C++ from Platform SDK Update. In this you can find the constants (as C/C++ macros) in the C/C++ header files (*.h).
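In the header files such constants typically appear as macros, and C code combines them with bitwise OR. The sketch below repeats a few of the PlaySound values used in this tutorial as local macros, so it does not include the real SDK headers; sound_flags_for_file is a hypothetical helper:

```c
/* A few PlaySound flags, with the values used in this tutorial.
   They are defined locally here; in a real program they would come
   from the SDK header files. */
#define SND_ASYNC      0x0001
#define SND_NODEFAULT  0x0002
#define SND_MEMORY     0x0004
#define SND_FILENAME   0x00020000L
#define SND_RESOURCE   0x00040004L

/* Flags for playing a sound file without a default sound as fallback. */
unsigned long sound_flags_for_file(void)
{
    return SND_NODEFAULT | SND_FILENAME;
}
```

Adding flags with + gives the same result as bitwise OR only as long as the flags have no bits in common. SND_RESOURCE, for example, already contains the SND_MEMORY bit, so adding those two would produce a different value than combining them with OR.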

Based on the information you have received here, you should be able to create code like this yourself:

implement playSoundFile
    open core
 
domains
    soundFlag = unsigned32.
 
class predicates
    playSound : (string Sound, pointer HModule, soundFlag SoundFlag) -> booleanInt Success
        language apicall.
resolve playSound externally
 
constants
    snd_sync : soundFlag = 0x0000.
         /* play synchronously (default) */
    snd_async : soundFlag = 0x0001.
         /* play asynchronously */
    snd_nodefault : soundFlag = 0x0002.
         /* silence (!default) if sound not found */
    snd_memory : soundFlag = 0x0004.
         /* Sound points to a memory file */
    snd_loop : soundFlag = 0x0008.
         /* loop the sound until next sndplaysound */
    snd_nostop : soundFlag = 0x0010.
         /* don't stop any currently playing sound */
    snd_nowait : soundFlag = 0x00002000.
         /* don't wait if the driver is busy */
    snd_alias : soundFlag = 0x00010000.
         /* name is a registry alias */
    snd_alias_id : soundFlag = 0x00110000.
         /* alias is a predefined id */
    snd_filename : soundFlag = 0x00020000.
         /* name is file name */
    snd_resource : soundFlag = 0x00040004.
         /* name is resource name or atom */
    snd_purge : soundFlag = 0x0040.
         /* purge non-static events for task */
    snd_application : soundFlag = 0x0080.
         /* look for application specific association */
 
clauses
    run() :-
        console::init(),
        _ = playSound("tada.wav", null, snd_nodefault+snd_filename).
 
end implement playSoundFile
 
goal
    mainExe::run( playSoundFile::run ).
