C++ and memory management

This article is part 0 in a series: Memory management

I thought it would be a good idea to write a post about memory management and the different kinds of data in C++. C++ is a fairly low-level programming language, at least the lowest-level language I have experience with, so it works well for explaining the basics of memory management before moving on to higher-level languages.

So what makes C++ a good language for describing this? Mainly the fact that developers have to clean up heap memory themselves! As described in the earlier post about the GC, there are two parts of memory that the language makes use of: the stack and the heap. All dynamic data is allocated on the heap.

Hopefully you have read the previous post about memory management, where I briefly describe the difference between the heap and the stack. I’ll skip the “normal” variables that are stored on the stack, because I think the previous post covers them well enough.

Dynamic data - Pointers

Dynamic data in C++ is allocated when the new keyword is used, for example to create a new instance of a class. The variable itself is a pointer. As a local variable it is stored on the stack and takes little space (typically 4 bytes in a 32-bit program and 8 bytes in a 64-bit one), while the actual object data is stored on the heap and can be of pretty much any size (depending on the class it is created from). When declaring a pointer, the type is followed by a * character.

ClassName* variableName = new ClassName();

In the above snippet, a pointer named variableName is created. Its data type is pointer to ClassName, and it points to a new ClassName object on the heap.

To make sure the data is removed from the heap when the object is no longer needed, the delete keyword is used. Deleting a pointer does not remove the pointer itself (the data on the stack) but frees the memory it points to (the data on the heap). The pointer exists until it goes out of scope and is cleared from the stack. After the delete, the pointer still holds the old address, which is no longer valid. Using it after that is undefined behavior and will probably crash the application (if you are lucky). I usually prefer to set the pointer to NULL (or nullptr) once it has been deleted, and to not use it again until it has been pointed at another object or at a newly created object on the heap.
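To make the new/delete life cycle concrete, here is a minimal sketch. The Widget struct and the UseAndFreeWidget function are names I made up for this illustration; they are not part of any library.

```cpp
#include <cassert>

// Hypothetical class, used only for this illustration.
struct Widget {
    int value = 42;
};

// Allocates a Widget on the heap, reads from it, frees it,
// and returns the value that was read while the object was alive.
int UseAndFreeWidget() {
    Widget* widget = new Widget();    // The pointer is on the stack, the object on the heap.
    int copyOfValue = widget->value;  // Read the data while the object still exists.

    delete widget;     // Frees the heap object; the pointer variable itself still exists.
    widget = nullptr;  // Reset it so accidental reuse is easy to detect.

    assert(widget == nullptr);
    return copyOfValue;
}
```

Note that the value is copied out before the delete. Reading widget->value after the delete would be exactly the kind of bug described above.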

Referencing objects on the stack

Objects on the stack are not created with the new keyword. Sometimes we want a pointer to point to data on the stack, to a non-pointer object. This type of pointer is not created with the new keyword.

void PointToLocal() {
  int localInt = 5;
  int* intPtr = &localInt; // The & sign takes the address of localInt and stores it in the pointer.
}

In this example, a new object is created on the stack (localInt), and then a pointer is created that points to the piece of memory localInt is using. This means that if localInt is removed (goes out of scope), the intPtr variable will point to bad data. A pointer is pretty much just a memory address, so if the variable changes (say you increase the variable the pointer points to by one), the value you read through the pointer changes too. You can, however, change what the pointer points to.

void PointToLocal2() {
  int localIntA = 5;
  int* intPtr = &localIntA;
  localIntA++;
  // *intPtr is now 6.
  int localIntB = 100;
  intPtr = &localIntB; // intPtr now points to the memory address of localIntB.
  localIntB += 10; // *intPtr and localIntB are now 110, and localIntA is still 6.
}

The fact that a pointer is just a small piece of data, no matter how large the object it points to is, makes it a very good thing to pass to functions as a parameter. But be aware: if the object is changed through the pointer inside the function, it is changed outside too. This is why it can be a good idea to pass it as a pointer to const. When the pointer in the above example goes out of scope it is gone, and nothing has to be deleted because nothing was allocated on the heap.
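The difference between a plain pointer parameter and a pointer to const can be sketched like this. The function names ReadOnly, Increment and PointerParameterDemo are hypothetical, chosen just for the example.

```cpp
#include <cassert>

// Reads through the pointer. Because it points to const,
// writing through it (for example *aValue = 0;) would not compile.
int ReadOnly(const int* aValue) {
    return *aValue + 1;
}

// Writes through the pointer, so the caller's variable changes.
void Increment(int* aValue) {
    ++(*aValue);
}

int PointerParameterDemo() {
    int local = 5;

    int plusOne = ReadOnly(&local);  // local is untouched.
    assert(local == 5);

    Increment(&local);               // local is changed in this scope too.
    assert(local == 6);

    return plusOne + local;          // 6 + 6
}
```

The const version documents the intent in the signature, and the compiler enforces it.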

Pointer as a class member

A class can contain a pointer as a member. One thing you have to be careful with, though, is that any object the class allocates with new has to be deleted too. If you fail to delete it in the destructor, you will create a memory leak when the instance is destroyed!

#ifndef EXAMPLECLASS_H
#define EXAMPLECLASS_H

class ExampleClass
{
  public:
  // Constructor.
  ExampleClass(TestClass* aTestObject)
  {
    myTestObject = new TestClass();
    myTestObject2 = aTestObject;
  }

  // Destructor.
  ~ExampleClass()
  {
    // Delete the object!
    delete myTestObject;
    // I prefer to make it null after deletion.
    myTestObject = nullptr; // NULL | 0
    myTestObject2 = nullptr; // NULL | 0
  }

  private:
  // The member that is a pointer.
  TestClass* myTestObject;
  TestClass* myTestObject2;
};

#endif // EXAMPLECLASS_H

As you can see in the example above, the constructor takes a pointer as a parameter, which is stored in the myTestObject2 member. Another pointer, myTestObject, is created and memory for a TestClass instance is allocated on the heap. In the destructor, the test object allocated on the heap is deleted and its pointer set to null. The other pointer doesn’t have to be deleted. Why is that? Because the object it points to was created outside of the class and passed in. What would happen if it was deleted? The memory it points to would be freed, and the pointer that was passed in would point to bad data. This would create quite annoying issues!

Now, the destructor is called when an object on the stack goes out of scope and is destroyed, but also when a pointer to an object is deleted. So it is in the destructor that all memory cleanup is supposed to happen: deallocate all memory allocated within the class.
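Both ways of triggering a destructor can be seen in a small sketch. The Tracked struct and the global counter are assumptions made for the illustration. Real code would of course do actual cleanup in the destructor instead of counting.

```cpp
#include <cassert>

// Hypothetical counter, only here to make the destructor calls visible.
static int gDestructorCalls = 0;

struct Tracked {
    ~Tracked() { ++gDestructorCalls; }
};

int CountDestructorCalls() {
    gDestructorCalls = 0;

    {
        Tracked onStack;  // Destroyed automatically at the closing brace.
    }
    assert(gDestructorCalls == 1);

    Tracked* onHeap = new Tracked();
    delete onHeap;        // The destructor runs here, then the memory is freed.
    onHeap = nullptr;
    assert(gDestructorCalls == 2);

    return gDestructorCalls;
}
```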

References (Aliases)

A reference is not that much different from a pointer. It can be used in much the same way, but it is a bit safer and not quite as powerful. When I was new to C++, I was once recommended to use references whenever I could and to only use pointers when they are really needed. I would like to pass this recommendation on, because it’s a good one.

A reference is a data type which refers to a given variable. “Isn’t this the same as a pointer then?” you might wonder. Well, not really. A pointer holds a memory address, while a reference is bound to a variable. Once a reference is bound to something, it cannot be changed to refer to something else, and it always has to be created by referencing something. This means that the reference pretty much IS the variable it is referencing, an alias of the original object. A reference does not create any additional data for the object it refers to, because the data already exists (typically it takes up at most the space of an address). That means the reference doesn’t have to be deleted. In this way, references are a lot safer to use than pointers.

References are also perfectly fine to pass to functions. Just like with pointers, it can be preferable to pass them as const if the object is not supposed to be modified (because it will be modified in the outer scope too!). Passing an object to a function without making it a reference or a pointer creates a new object (using the copy constructor), a copy. A copy of an object uses more memory than a pointer or a reference, because new memory has to be allocated for the object and all the data copied from the old variable to the new one.
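The alias behavior, and the fact that a reference cannot be pointed at something else, can be sketched as follows. ReferenceAliasDemo is a hypothetical name used only here.

```cpp
#include <cassert>

int ReferenceAliasDemo() {
    int a = 1;
    int b = 50;

    int& alias = a;          // A reference must be bound when it is created.
    alias = 10;              // Writes to a through the alias.
    assert(a == 10);

    alias = b;               // Does NOT rebind the reference; it copies b's value into a.
    assert(a == 50);

    b = 7;                   // a is unaffected, because alias still refers to a.
    assert(a == 50);
    assert(&alias == &a);    // The alias and the variable share the same address.

    return a;
}
```

The line alias = b; is the one that tends to surprise people coming from pointers: it assigns a value, it does not change what the reference refers to.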
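To see when a copy is actually made, here is a rough sketch that counts copy constructor calls. The Counted struct and the three functions are made up for the illustration.

```cpp
#include <cassert>

// Hypothetical type that counts how many times it has been copied.
struct Counted {
    static int copies;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

void ByValue(Counted) {}            // Receives a copy.
void ByConstRef(const Counted&) {}  // Receives an alias, no copy.
void ByPointer(const Counted*) {}   // Receives an address, no copy.

int CountCopies() {
    Counted::copies = 0;
    Counted original;

    ByValue(original);     // The copy constructor runs once.
    ByConstRef(original);  // No copy.
    ByPointer(&original);  // No copy.

    assert(Counted::copies == 1);
    return Counted::copies;
}
```

For a small type like int the copy is cheap, but for a large object the difference can matter.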

The following example shows what happens when a pointer, a reference and a copy are passed into a function and changed:

void ParameterExample(int& aIntRef, int* aIntPtr, int aInt)
{
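  aIntRef = 99;   // Changes the caller's variable through the reference.
  *aIntPtr = 100; // Dereference the pointer to change the value it points to.
  aInt = 1001;    // Only changes the local copy.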
}

int main()
{
  int willBeIntRef = 10;
  int willBeIntPtr = 10;
  int willBeInt = 10;

  ParameterExample(willBeIntRef, &willBeIntPtr, willBeInt);

  // willBeIntRef is now 99.
  // willBeIntPtr is now 100.
  // willBeInt is still 10.
  return 0;
}

As you can see, the declaration of the function uses the * sign for the pointer and the & sign for the reference, but at the call site the reference uses no sign at all while the pointer uses &. This might feel odd, but it is correct. The reference just binds to the variable that is passed. It does not make a copy, it is simply an alias for the variable. The pointer holds a memory address, and the ampersand in front of the variable takes the address of that variable. Inside the function, the pointer has to be dereferenced with * to reach the value it points to. The last parameter is just an int; it becomes a copy, and a new int is added to the stack during the execution of the function. The first two values are changed inside the function and also change in the outer scope (the main function), because they are passed as a reference and a pointer. The last one does not, because what changes inside the function is just a copy of the variable outside.

Reference as class member

Using a reference as a member of a class is okay, but as said earlier, a reference cannot be created without something to refer to. So when creating a class where a member is a reference, the reference needs to be set in the constructor initialization list, like this:

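class Example {
  public:
    // A non-const reference cannot bind to a literal such as 5,
    // so the variable to reference is passed in and must outlive the object.
    Example(int& aInt) : intRef(aInt)
    { }

  private:
    int& intRef;
};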

Other than that, it’s not much different from any other member, so there is not much more to say. If it references another object, it behaves like it is that object, and the class will be “cheaper” in terms of memory, as it only contains the address of a variable rather than a copy of it.

Why can’t the reference be set in the constructor body? Well, a reference has to be bound at the moment it is created, and members are created before the constructor body runs, so the compiler will give an error if a reference member is left uninitialized. And if you create a variable in the constructor and bind the reference to it, that variable only has the lifetime of the constructor scope, which leaves the reference dangling. There is no use in this.
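As a rough sketch of a reference member behaving like the object it refers to, consider the class below. Counter, Bump and ReferenceMemberDemo are hypothetical names for this example.

```cpp
#include <cassert>

// Hypothetical class that holds a reference to an int owned by someone else.
class Counter {
  public:
    // The reference is bound in the initialization list.
    explicit Counter(int& aTarget) : myTarget(aTarget)
    { }

    // Changes the referenced variable, not a copy of it.
    void Bump() { ++myTarget; }

  private:
    int& myTarget;
};

int ReferenceMemberDemo() {
    int total = 0;           // Must outlive the Counter that references it.
    Counter counter(total);

    counter.Bump();
    counter.Bump();

    assert(total == 2);      // The outer variable was changed through the member.
    return total;
}
```

The class never deletes anything, because it does not own the int. It only has to make sure not to outlive it.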

Johannes Tegnér
