Discussion:
reverse iteration is slower !
(too old to reply)
Bonita Montero
2024-05-04 12:16:47 UTC
Permalink
With a vector a reverse iterator is derived from the normal iterator
and accesses with an offset of -1. With today's CPUs this isn't slower.
I wondered if the performance is still the same with ordered maps or
sets, where this would mean a real penalty since offsetting is done
twice, once for the access and once for iteration to the next element.
So I wrote a small benchmark for that:

#include <iostream>
#include <set>
#include <chrono>
#include <atomic>

using namespace std;
using namespace chrono;

atomic_int aSum;

int main()
{
set<int> st;
for( int i = 0; i != 100; ++i )
st.emplace( i );
auto timeLoop = []( char const *what, auto fn )
{
auto start = high_resolution_clock::now();
int sum = 0;
for( size_t i = 1'000'000; i; --i )
fn( sum );
cout << what << duration_cast<nanoseconds>(
high_resolution_clock::now() - start ).count() / 100e6 << endl;
::aSum.store( sum, memory_order_relaxed );
};
timeLoop( "forward: ",
[&]( int &sum )
{
for( auto it = st.begin(); it != st.end(); ++it )
sum += *it;
} );
timeLoop( "reverse: ",
[&]( int &sum )
{
for( auto it = st.rbegin(); it != st.rend(); ++it )
sum += *it;
} );
timeLoop( "manual reverse: ",
[&]( int &sum )
{
for( auto it = st.end(); it != st.begin(); )
sum += *--it;
} );
}

This is the result with MSVC++ 18 on a 5.7GHz Zen4-CPU.

forward: 1.70247
reverse: 2.05701
manual reverse: 1.77813

The first result is from the bidirectional iterator, the second from the
reverse-iterator and the third from a bidirectional iterator iterating
reverse.
So iterate reverse only if necessary.
Chris M. Thomasson
2024-05-04 19:56:02 UTC
Permalink
Post by Bonita Montero
With a vector a reverse iterator is derived from the normal iterator
and accesses with an offset of -1. With today's CPUs this isn't slower.
I wondered if the performance is still the same with ordered maps or
sets, where this would mean a real penalty since offsetting is done
twice, once for the access and once for iteration to the next element.
[...]

Cache lines like to be read, going forward, perhaps a prefetch is in
order... ;^)
Bonita Montero
2024-05-04 20:19:16 UTC
Permalink
Post by Chris M. Thomasson
Post by Bonita Montero
With a vector a reverse iterator is derived from the normal iterator
and accesses with an offset of -1. With today's CPUs this isn't slower.
I wondered if the performance is still the same with ordered maps or
sets, where this would mean a real penalty since offsetting is done
twice, once for the access and once for iteration to the next element.
[...]
Cache lines like to be read, going forward, perhaps a prefetch is in
order... ;^)
Theres's no specific phyisal access order with a set or map.
Chris M. Thomasson
2024-05-04 20:32:42 UTC
Permalink
Post by Bonita Montero
Post by Chris M. Thomasson
Post by Bonita Montero
With a vector a reverse iterator is derived from the normal iterator
and accesses with an offset of -1. With today's CPUs this isn't slower.
I wondered if the performance is still the same with ordered maps or
sets, where this would mean a real penalty since offsetting is done
twice, once for the access and once for iteration to the next element.
[...]
Cache lines like to be read, going forward, perhaps a prefetch is in
order... ;^)
Theres's no specific phyisal access order with a set or map.
It depends on the implementation. Is this a generic set or map, or a
very highly specialized one that only works well with certain inputs...
Bonita Montero
2024-05-05 03:31:01 UTC
Permalink
It depends on the implementation. ...
There's nothing to prefetch in a tree.
Chris M. Thomasson
2024-05-05 05:20:40 UTC
Permalink
Post by Bonita Montero
It depends on the implementation. ...
There's nothing to prefetch in a tree.
Nevermind.

Loading...