C++: Why would a regular function run faster than an inline one?


I was doing a speed experiment in which I timed an inline function against a regular function, and I repeatedly get better timings for the regular function. Can you please take a look at the code and help me figure out why this happens, given that inlining is supposed to improve speed?


#include "stdafx.h"
#include <iostream>
#include "time.h"

inline int getMaxInline( int x, int y )
{
    return ( x > y ) ? x : y;
}

int getMaxRegular( int, int );

int _tmain(int argc, _TCHAR* argv[])
{
    clock_t inlineStart;
    clock_t inlineFinish;
    clock_t regularStart;
    clock_t regularFinish;

    inlineStart = clock();
    std::cout<<"inline max of 20 and 10 = "<<getMaxInline( 10, 20 )<<std::endl;
    inlineFinish = clock();

    std::cout<<"Time elapsed for inline = "<<(double(inlineFinish - inlineStart)/CLOCKS_PER_SEC)<<std::endl;

    regularStart = clock();
    std::cout<<"regular max of 20 and 10 = "<<getMaxRegular( 20, 10 )<<std::endl;
    regularFinish = clock();

    std::cout<<"Time elapsed for regular = "<<(double(regularFinish - regularStart)/CLOCKS_PER_SEC)<<std::endl;

    return 0;
}

int getMaxRegular( int x, int y )
{
    return ( x > y ) ? x : y;
}

My last 3 tests ran:

inline = 0.042 regular = 0.003

inline = 0.004 regular = 0.002

inline = 0.006 regular = 0.002

Any insights?


Bluntly, your program isn't constructed even remotely like programs that accurately measure the performance of code. Each timed region contains a single function call plus a std::cout statement, so what you are really timing is the output code. When the inline function is called, none of that output code is hot yet in the CPU's instruction and branch-prediction caches. By the time the non-inline function is called, it is. That difference completely dominates any microscopic saving from avoiding a single call/return operation.
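For illustration, here is a rough sketch of how a more meaningful measurement could be set up: keep I/O out of the timed region, run an untimed warm-up pass, and average over many calls. The helper name `secondsPerCall`, the warm-up count, and the use of `std::chrono::steady_clock` are my own assumptions and not part of the question's code. Treat it as a starting point, not a rigorous benchmark.

```cpp
#include <chrono>
#include <cstdint>

// Same bodies as in the question; only the `inline` keyword differs.
inline int getMaxInline(int x, int y) { return (x > y) ? x : y; }
int getMaxRegular(int x, int y) { return (x > y) ? x : y; }

// Hypothetical helper (not from the question): returns the average
// number of seconds per call of `f` over `iterations` calls.
template <typename F>
double secondsPerCall(F f, std::int64_t iterations) {
    // Writing to a volatile keeps the compiler from discarding the calls.
    volatile int sink = 0;

    // Untimed warm-up pass so caches and branch predictors are already
    // primed before the measurement starts.
    for (std::int64_t i = 0; i < 1000; ++i) {
        sink = f(static_cast<int>(i & 0xFF), 10);
    }

    // Timed region: no I/O, just the calls and the loop overhead.
    auto start = std::chrono::steady_clock::now();
    for (std::int64_t i = 0; i < iterations; ++i) {
        sink = f(static_cast<int>(i & 0xFF), 10);
    }
    auto finish = std::chrono::steady_clock::now();
    (void)sink;

    return std::chrono::duration<double>(finish - start).count() / iterations;
}
```

One way to use it is to wrap each function in a lambda, for example `secondsPerCall([](int a, int b) { return getMaxInline(a, b); }, 100000000)`, do the same for `getMaxRegular`, and print both results only after both measurements have finished. Even then, expect the numbers to be close or identical: in an optimized build the compiler is free to inline both functions regardless of the `inline` keyword, and the loop overhead and the volatile store are part of what gets measured.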