How are virtual functions implemented at a deep level?
From "Virtual Functions in C++":
Whenever a program has a virtual function declared, a v - table is constructed for the class. The v-table consists of addresses to the virtual functions for classes that contain one or more virtual functions. The object of the class containing the virtual function contains a virtual pointer that points to the base address of the virtual table in memory. Whenever there is a virtual function call, the v-table is used to resolve to the function address. An object of the class that contains one or more virtual functions contains a virtual pointer called the vptr at the very beginning of the object in the memory. Hence the size of the object in this case increases by the size of the pointer. This vptr contains the base address of the virtual table in memory. Note that virtual tables are class specific, i.e., there is only one virtual table for a class irrespective of the number of virtual functions it contains. This virtual table in turn contains the base addresses of one or more virtual functions of the class. At the time when a virtual function is called on an object, the vptr of that object provides the base address of the virtual table for that class in memory. This table is used to resolve the function call as it contains the addresses of all the virtual functions of that class. This is how dynamic binding is resolved during a virtual function call.
Can the vtable be modified or even directly accessed at runtime?
Universally, I believe the answer is "no". You could do some memory mangling to find the vtable but you still wouldn't know what the function signature looks like to call it. Anything that you would want to achieve with this ability (that the language supports) should be possible without access to the vtable directly or modifying it at runtime. Also note, the C++ language spec does not specify that vtables are required - however that is how most compilers implement virtual functions.
Does the vtable exist for all objects, or only those that have at least one virtual function?
I believe the answer here is "it depends on the implementation" since the spec doesn't require vtables in the first place. However, in practice, I believe all modern compilers only create a vtable if a class has at least 1 virtual function. There is a space overhead associated with the vtable and a time overhead associated with calling a virtual function vs a non-virtual function.
Do abstract classes simply have a NULL for the function pointer of at least one entry?
The answer is it is unspecified by the language spec so it depends on the implementation. Calling the pure virtual function results in undefined behavior if it is not defined (which it usually isn't) (ISO/IEC 14882:2003 10.4-2). In practice it does allocate a slot in the vtable for the function but does not assign an address to it. This leaves the vtable incomplete which requires the derived classes to implement the function and complete the vtable. Some implementations do simply place a NULL pointer in the vtable entry; other implementations place a pointer to a dummy method that does something similar to an assertion.
Note that an abstract class can define an implementation for a pure virtual function, but that function can only be called with a qualified-id syntax (ie., fully specifying the class in the method name, similar to calling a base class method from a derived class). This is done to provide an easy to use default implementation, while still requiring that a derived class provide an override.
Does having a single virtual function slow down the whole class or only the call to the function that is virtual?
This is getting to the edge of my knowledge, so someone please help me out here if I'm wrong!
I believe that only the functions that are virtual in the class experience the time performance hit related to calling a virtual function vs. a non-virtual function. The space overhead for the class is there either way. Note that if there is a vtable, there is only 1 per class, not one per object.
Does the speed get affected if the virtual function is actually overridden or not, or does this have no effect so long as it is virtual?
I don't believe the execution time of a virtual function that is overridden decreases compared to calling the base virtual function. However, there is an additional space overhead for the class associated with defining another vtable for the derived class vs the base class.
Additional Resources:
http://www.codersource.net/published/view/325/virtual_functions_in.aspx (via way back machine)
http://en.wikipedia.org/wiki/Virtual_table
http://www.codesourcery.com/public/cxx-abi/abi.html#vtable
Your question made me curious, so I went ahead and ran some timings on the 3GHz in-order PowerPC CPU we work with. The test I ran was to make a simple 4d vector class with get/set functions
class TestVec
{
float x,y,z,w;
public:
float GetX() { return x; }
float SetX(float to) { return x=to; } // and so on for the other three
}
Then I set up three arrays each containing 1024 of these vectors (small enough to fit in L1) and ran a loop that added them to one another (A.x = B.x + C.x) 1000 times. I ran this with the functions defined as inline
, virtual
, and regular function calls. Here are the results:
- inline: 8ms (0.65ns per call)
- direct: 68ms (5.53ns per call)
- virtual: 160ms (13ns per call)
So, in this case (where everything fits in cache) the virtual function calls were about 20x slower than the inline calls. But what does this really mean? Each trip through the loop caused exactly 3 * 4 * 1024 = 12,288
function calls (1024 vectors times four components times three calls per add), so these times represent 1000 * 12,288 = 12,288,000
function calls. The virtual loop took 92ms longer than the direct loop, so the additional overhead per call was 7 nanoseconds per function.
From this I conclude: yes, virtual functions are much slower than direct functions, and no, unless you're planning on calling them ten million times per second, it doesn't matter.
See also: comparison of the generated assembly.
Best Answer
Derived classes do not have to implement all virtual functions themselves. They only need to implement the pure ones.1 That means the
Derived
class in the question is correct. It inherits thebar
implementation from its ancestor class,Abstract
. (This assumes thatAbstract::bar
is implemented somewhere. The code in the question declares the method, but doesn't define it. You can define it inline as Trenki's answer shows, or you can define it separately.)1 And even then, only if the derived class is going to be instantiated. If a derived class is not instantiated directly, but only exists as a base class of more derived classes, then it's those classes that are responsible for having all their pure virtual methods implemented. The "middle" class in the hierarchy is allowed to leave some pure virtual methods unimplemented, just like the base class. If the "middle" class does implement a pure virtual method, then its descendants will inherit that implementation, so they don't have to re-implement it themselves.