Functor Domain Layout

From wiki.visual-prolog.com

Revision as of 17:42, 21 January 2015 by Thomas Linder Puls (talk | contribs) (→‎References: remove header)

The tutorial describes the memory layout of functor domain in Visual Prolog. A functor term is allocated in the (memory) heap, and predicates etc works with a pointer to the allocated memory.

One alternative

Functor domains with (just) one alternative (here called a struct) is laid out in a compact way. The layout is fully compatible with Microsoft C/C++'s struct layout.

All data have an "alignment" size:

  • For simple data (numbers, etc) it is the size of the data.
  • For pointers it is the size of a pointer.
  • For a struct it is the largest of the components alignment sizes (unless an explicit alignment is stated).


Example
  • unsigned8 -> 1 byte
  • integer -> 4 bytes
  • real -> 8 bytes
  • string -> 4 bytes in a 32bit program; 8 bytes in a 64bit program
  • fff(integer8 A, unsigned16 B, real C) -> 8 bytes because real have alignment 8 bytes

When laying out a struct term all components are placed at the next available byte that match their alignment.

Example So fff(integer8 A, unsigned16 B, real C) is laid out as follows:
OffsetContents
0A
1(empty)
2B (low)
3B (high)
4(empty)
5(empty)
6(empty)
7(empty)
8C (byte 1)
9C (byte 2)
10C (byte 3)
11C (byte 4)
12C (byte 5)
13C (byte 6)
14C (byte 7)
15C (byte 8)

Besides being compatible with C/C++/Windows API the purpose of the alignment is to ensure that memory operations are performed efficient: When fetching unaligned data the processor will have to do it as two separate memory fetches and perform internal rearrangement of the data before it gets the right shape).

The built-in predicate sizeOf and sizeOfDomain predicates returns the total size of the struct i.e. 16 (bytes) for the fff struct in the example.

Multiple alternatives

If there are more alternatives in the functor domain the layout is much less compact, because in that case each field is laid out in a "pointer" size piece of memory. Future releases of Visual Prolog will (almost definitely) adopt the compact layout for such domains as well.

When there are several alternatives it is necessary to distinguish between the different alternatives. The default mechanism for this is to place an extra integer first in the struct representing the functor. Visual Prolog (version 7.5) do however avoid this extra field in two ways.

Alternatives without data

On the Windows platform a pointer (pointing to the heap) will never be a small number. Visual Prolog exploit this property for the representation of functor alternatives that doesn't carry any data. Instead of allocating any data at all and pointing to it, the functor will be represented directly in as a small pointer value.

Example {{{1}}}

If a domain only contains functor alternatives that doesn't carry data, it effectively becomes an enumeration domain where numbers represents the different enumerants.

Example
domains
    boolean = false; true.

Both false and true are functors alternative that doesn't carry any data. false will be represented as 0 and true as 1.

Alternatives with data

If an alternative carries data it will be represented as a pointer to the memory that carries the data. The memory is always allocated on a pointer size boundary, here we will only consider 64 bit programs so alignment is on 8 byte boundaries (the reader will have to "translate" to 32 bit if necessary). The 3 least significant bits of a pointer that points to an 8 byte boundary will be zero, so they are in a sense unused. Visual Prolog exploit these 3 unused bits in the following way. 3 bits gives 8 values. 7 of these are used to represent a functor value the last value means the "remaning" cases.

So for the first 7 functor alternatives that carry data, the allocated data will not contain a functor value, instead the functor number is encoded in the three least significant bits of the pointer to the data. (This encoding can be treated rather efficiently, but the details is outside the scope of this article).

  • If there are no more alternatives (after the data-less ones and up to 7 data carrying ones) we are done.
  • If there is exactly one more alternative this doesn't need a functor and it treated in the same way as the 7 first ones.
  • If there are more than one alternative left these alternatives will be represented in the "default" way with a functor-number as the first piece of data in the allocated memory.

Many domains have less than 8 data carrying alternatives, so the "default" representation is not used very often.

In a 32 bit program there are only unused 2 bits giving 4 values. And thus 3 or 4 functor free alternatives.

Example
domains
    example = enum1; enum2, enum3; ...
        functorFree1(...); functorFree2(...); functorFree3(...); functorFree4(...); 
        functorFree5(...); functorFree6(...); functorFree7(...);
        dataCarry1(...); dataCarry2(...); ...
  • The data free functors enum1, enum2, enum3, ... are represented as small numbers 0, 1, 2, ...
  • The first 7 alternatives (functorFree1 - functorFree7) will be represented as a pointer + a functor number (0-6), where the pointer points to the allocated data (which does not contain a functor).
  • The remaining alternatives (dataCarry1, dataCarry2, ...) )will be represented as pointer+7, where the pointer points to the allocated data (which does contain a functor)