Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c++ - std::unique_ptr operator[] vs. raw ptr dynamic array operator[]

I have the following code sample:

void foo(int size)
{
    std::unique_ptr<uint8_t[]> data = std::make_unique<uint8_t[]>(size);
    for(int i = 0; i < size; i += 2)
    {
        data[i] = 1;
        data[i + 1] = 2;
    }
}

The actual code does actual calculations, but that does not matter for this question and was removed for simplicity's sake.

When compiling and running this code with optimizations turned on, everything works great and runs fast. However, when running without any optimizations, this code is much slower than:

void foo(int size)
{
    std::unique_ptr<uint8_t[]> data = std::make_unique<uint8_t[]>(size);
    uint8_t* dataPtr = data.get();

    for(int i = 0; i < size; i += 2)
    {
        dataPtr[i] = 1;
        dataPtr[i + 1] = 2;
    }
}

To investigate this a bit more, I ran multiple variations of the indexing operator with this dynamic array through Compiler Explorer (godbolt.org). Compiled with clang and -O3, all variations produce the same assembly. Compiled without any optimizations, however, the unique_ptr-only sample contains a call to unique_ptr's operator[], which seems to be causing the slowdown.

Why is the operator[] of the unique_ptr much slower without optimizations? From the documentation I see that the operator[] should be equivalent to unique_ptr.get()[]. Is it doing some safety checks without optimizations? If so, which ones?

Question from: https://stackoverflow.com/questions/65643041/stdunique-ptr-operator-vs-raw-ptr-dynamic-array-operator

1 Answer

Without optimisations, every data[i] is compiled as a real call to unique_ptr's operator[], with the index passed as an argument, and that call overhead is paid on each access. With optimisations, the compiler can inline the whole function and remove the overhead (and since both versions compile to identical assembly, you already know the performance will be the same).

This is one of the many reasons that profiling/benchmarking without optimisations gives misleading results: zero-cost abstractions carry a cost that they would not have in your optimised production build.
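To make the comparison concrete, here is a minimal sketch of the two access styles from the question, written as standalone functions. The names fill_via_operator, fill_via_raw_pointer and fills_match are hypothetical and only for illustration, and the even-size assertion is an assumption added here because the loop writes data[i + 1]. The sketch only shows that both styles produce the same contents; it does not measure timing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical helper: indexes through unique_ptr<T[]>::operator[].
// In an unoptimised build each data[i] is typically an actual function call.
void fill_via_operator(std::unique_ptr<std::uint8_t[]>& data, std::size_t size)
{
    assert(size % 2 == 0);  // assumption: the loop writes data[i + 1], so size must be even
    for (std::size_t i = 0; i < size; i += 2)
    {
        data[i] = 1;
        data[i + 1] = 2;
    }
}

// Hypothetical helper: fetches the raw pointer once with get(),
// then uses built-in pointer indexing inside the loop.
void fill_via_raw_pointer(std::unique_ptr<std::uint8_t[]>& data, std::size_t size)
{
    assert(size % 2 == 0);  // same assumption as above
    std::uint8_t* ptr = data.get();
    for (std::size_t i = 0; i < size; i += 2)
    {
        ptr[i] = 1;
        ptr[i + 1] = 2;
    }
}

// Hypothetical check: returns true when both access styles
// leave identical bytes in two freshly allocated arrays.
bool fills_match(std::size_t size)
{
    auto a = std::make_unique<std::uint8_t[]>(size);
    auto b = std::make_unique<std::uint8_t[]>(size);
    fill_via_operator(a, size);
    fill_via_raw_pointer(b, size);
    return std::equal(a.get(), a.get() + size, b.get());
}
```

If the unoptimised build really has to be fast, hoisting get() out of the loop as in the second function avoids the per-access call; in an optimised build the two should compile to the same code, as the question already observed.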

