Arc Forum | This approach is the best. In fact this has to be done, because not all compute...

Arc Forum

3 points by almkglor 6643 days ago | link | parent

This approach is the best. In fact this has to be done, because not all computers have sizeof(int) == sizeof(void* ); certainly my AMD64 running on a 64-bit kernel doesn't.

Note that we can actually abstract away the representation of objects from the actual binary bits that represent it. For instance I would recommend that objects use the following representation (as noted by binx):

  typedef struct s_obj{
    enum e_type {
      other = 0,
      number = 1,
      symbol = 2,
      character = 3
    }
    type;
    union u_data{
      void* other;
      int number;
      void* symbol;
      Uchar character; //unicode character
    } data;
  } obj;
  #define OBJ_TYPE(o) ((o).type)
  #define OBJ2FIXNUM(o) ((o).data.number)
  #define FIXNUM_MIN INT_MIN //requires glibc
  #define FIXNUM_MAX INT_MAX
  //others defined similarly

However, for machines with the following characteristics:

1. 32-bit pointers and 32-bit int's

2. malloc() always returns an address at a 32-bit boundary (i.e. every four bytes, or with the lowest two bits zero)

We can then use:

  typedef int obj;
  #define OBJ_TYPE(o)  ((o) & 0x03)
  #define OBJ2FIXNUM(o)  ((o) >> 2)
  #define FIXNUM_MIN (-(1 << 30))
  #define FIXNUM_MAX ((1 << 30) - 1)

Thus we are able to have portability while still gaining optimizations for some architectures.

1 point by sacado 6642 days ago | link

In fact, maybe it's not a problem even on machines where ints are not as big as pointers. You can treat each value as an int pointer, and, when you need a fixnum, do

  (int) my_int_ptr << 1

If ints are bigger than pointers, that's okay : you loose possible bits, but at least it's working (for example, if ints are on 6 bytes and addresses on 4 bytes, only the 4 least significant bytes are useful, the 2 other ones are lost). If pointers are bigger than ints, it's still working, but this time lost space is in their pointer representation.

The only thing is that fixnums don't have an architecture-independant size, but that's not a problem in practice anyway. The only constraint is that the biggest fixnum is

  2**(max(sizeof(int), sizeof(int*)) - 2)

The last-bit-is-1-for-fixnum trick works on every architecture except machines where addresses are on 8 bits. Well, maybe we can ignore these ?

-----

3 points by almkglor 6642 days ago | link

Actually it is, arc2c output crashes on my AMD64. I'm trying to hack out a replacement, but it's not working yet.

The problem is that apparently the bits of pointers that happen to be beyond the bits of ints are not zero, so assigning a pointer to an int chops off the significant bits.

-----

1 point by sacado 6641 days ago | link

Oh... Maybe a simple fix would be to change int to long. I know this is not standard behavior, but in practice I think long and pointers are always the same size. mzscheme obviously works that way : every object is a pointer that you can cast to a long if it is a fixnum.

-----

2 points by almkglor 6641 days ago | link

This seems to be correct on my AMD64:

  #include<stdio.h>
  
  int main(void){
        printf("sizeof(int) = %d\n", sizeof(int));
        printf("sizeof(long) = %d\n", sizeof(long));
        printf("sizeof(void*) = %d\n", sizeof(void*));
        return 0;
  }

  sizeof(int) = 4
  sizeof(long) = 8
  sizeof(void*) = 8

However I think there may be processors/archs where sizeof(long) != sizeof(void* ). My attempt at fixing was to use a union, but something's wrong with the way closures are handled in my fix attempt - closures don't seem to have a type associated with them, so I'm not exactly sure how they're supposed to be done.

Edit: I'll probably need to search through C language specs, though - I'm not sure, but it might be standardized that sizeof(long) >= sizeof(void* )

-----

2 points by absz 6641 days ago | link

What you want are the two typedefs intptr_t and uintptr_t . Each one is defined to be an integer big enough to hold a pointer (the former being signed, and the latter being unsigned); they aren't mandatory, though, so you might have

  #if defined(__COMPILER1__) || defined(__COMPILER2__)
    typedef intptr_t fixnum;
  #else
    typedef long fixnum;
  #endif

This would guarantee you correct behaviour if intptr_t were defined, and probably give you the correct behaviour even if it weren't.

-----

1 point by almkglor 6641 days ago | link

From this?

  #include<stdint.h>

Hmm. Interesting. How well supported is it?

-----

1 point by sacado 6641 days ago | link

Thanks for the information ! I'll try it.

-----