any variant?

Arrays in C and std::vector in C++ store elements of the same type (and therefore the same size) in contiguous memory. These two properties are extremely important, because they are what make it possible to index into an array/vector like this:

arr[4] = 10;

That means:
a. look up the element at index 4 (the fifth element, since indexing starts at 0): its address is the start of arr plus 4 * size_of_the_element bytes
b. set the found location to the new value 10

As such, the size of the element needs to be the same for all elements for this to work.
Needless to say, this indexing is an important optimization: finding an element takes the same small amount of work no matter how large the array is.
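
To make the address arithmetic concrete, here is a minimal sketch that computes the element location by hand. The helper names element_address and set_element are made up for this illustration; in real code you would simply write arr[4] = 10.

```cpp
#include <cstddef>

// Hypothetical helper: computes the address of element `index` by hand,
// as base address + index * sizeof(element), measured in bytes.
int* element_address(int* arr, std::size_t index)
{
    char* bytes = reinterpret_cast<char*>(arr);
    return reinterpret_cast<int*>(bytes + index * sizeof(int));
}

// Hypothetical helper: writes through the hand-computed address.
// For an int array this has the same effect as arr[index] = value.
void set_element(int* arr, std::size_t index, int value)
{
    *element_address(arr, index) = value;
}
```

Calling set_element(arr, 4, 10) on an int array of at least five elements ends up at the same location as arr[4] = 10. The arithmetic only works because every element occupies exactly sizeof(int) bytes.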

However, the need to work with heterogeneous elements (different type/size) in the same array/vector comes up often. While the array/vector doesn’t support that natively, two solutions exist that allow us to do just that.

The first is using unions. Unions are C/C++ data structures that can hold one value at a time out of several data types, with the size of the union being at least the size of its largest member. Example:

union U
{
    char c;
    int  i;
    void* ptr;
};

The size of the union will be max(sizeof(char), sizeof(int), sizeof(void*)), possibly rounded up for alignment. On typical 32-bit platforms that will be 4 bytes.
This approach allows you to set any of the components of the union individually but internally you will be referring to the same memory location.
Unions are used to implement the VARIANT type in Windows and other variant types. I remember I implemented a CVariant class about 6-7 years ago when I wrote a GUI library/framework for WinCE.
The catch here is that a union doesn’t tell you which component is ‘valid’, so type information needs to be stored in the Variant class too (which loses some of that static-type compiler ‘intelligence’). You will have to programmatically check the type and do stuff depending on the type.
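
As a rough sketch of that idea (not the actual Windows VARIANT and not my old CVariant class), a union can be paired with a tag that records which member is currently valid. The class name Variant, the Type enum and all member names below are hypothetical.

```cpp
#include <cassert>

// Hypothetical minimal variant: a union plus a tag saying which member is valid.
class Variant
{
public:
    enum Type { Empty, Char, Int, Pointer };

    Variant() : type_(Empty) { data_.ptr = 0; }

    // Each setter stores the value and updates the tag together.
    void set_char(char c)     { data_.c = c;   type_ = Char; }
    void set_int(int i)       { data_.i = i;   type_ = Int; }
    void set_pointer(void* p) { data_.ptr = p; type_ = Pointer; }

    Type type() const { return type_; }

    // The accessors check the tag at run time; the compiler cannot do it for us.
    char  as_char() const    { assert(type_ == Char);    return data_.c; }
    int   as_int() const     { assert(type_ == Int);     return data_.i; }
    void* as_pointer() const { assert(type_ == Pointer); return data_.ptr; }

private:
    union U
    {
        char  c;
        int   i;
        void* ptr;
    };

    U    data_;  // storage shared by all three members
    Type type_;  // which member of data_ is currently valid
};
```

Every Variant has the same size, so a std::vector<Variant> works without any extra effort. The price is that callers have to check type() before calling an accessor, which is exactly the programmatic type check mentioned above.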

The other approach is to use pointers. On mainstream platforms, pointers to objects have the same size irrespective of the type of the object. However, these pointers still need to have the same type in order to use them in containers. A standard C approach would be to use void*, but that’s just too error-prone and very much not-C++.
In C++, an approach would be to have a base class and then a templated class that derives from it. The base class cannot be templated, as that would generate different types which would make it impossible to use in an array/vector. For example:

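class base
{
public:
    virtual ~base() {}                 // lets a deriv<T> be deleted through a base*
    virtual base* clone() const = 0;   // polymorphic copy
};

template <class T>
class deriv : public base
{
public:
    deriv(const T& t) : value(t) {}
    // Overrides base::clone() with a covariant return type.
    // It must not be pure virtual here, or deriv<T> could never be instantiated.
    virtual deriv<T>* clone() const { return new deriv<T>(value); }

    T value;                           // the stored object
};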

When you construct such a pointer, you use the deriv class with the actual type; however, when you place it in the container, you place it as a base pointer. As such, your container will look like this:

std::vector<base*> container;
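
To show how such a container might be filled and read back, here is a self-contained sketch of the same scheme. To keep it independent of the snippet above it uses its own hypothetical names: holder_base and holder<T> play the roles of base and deriv<T>, and get_as and destroy_all are helper functions invented for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Plays the role of `base`: a common, non-templated type for the container.
class holder_base
{
public:
    virtual ~holder_base() {}
    virtual holder_base* clone() const = 0;
};

// Plays the role of `deriv<T>`: wraps a value of the actual type.
template <class T>
class holder : public holder_base
{
public:
    explicit holder(const T& t) : value(t) {}
    virtual holder<T>* clone() const { return new holder<T>(value); }
    T value;
};

// Returns a pointer to the stored T, or a null pointer if the element
// actually holds some other type.
template <class T>
T* get_as(holder_base* p)
{
    holder<T>* h = dynamic_cast<holder<T>*>(p);
    return h ? &h->value : 0;
}

// The container holds raw owning pointers, so they are deleted by hand.
void destroy_all(std::vector<holder_base*>& container)
{
    for (std::size_t i = 0; i < container.size(); ++i)
        delete container[i];
    container.clear();
}
```

With these pieces, container.push_back(new holder<int>(10)) and container.push_back(new holder<std::string>("ten")) can sit side by side in one std::vector<holder_base*>, and get_as<int>(container[0]) recovers the value. Asking for the wrong type gives a null pointer rather than garbage, which is the main advantage over void*.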

Alternatively, you can just use boost::any, which implements this scheme to give you a container for any type. Unlike the union solution, this second solution allocates instances on the heap/free store, whereas the union can store instances on the stack or on the heap. The limitation that comes with the union is that it can only hold the types that are declared in it (in our example: char, int and void* only). boost::any, on the other hand, can host any copy-constructible type.
