Discussion:
thread about the pros and cons of lambdas, but more about cons
(too old to reply)
Michael S
2024-09-25 16:54:23 UTC
Permalink
On Wed, 25 Sep 2024 17:14:09 +0200
On Wed, 25 Sep 2024 15:04:27 +0200
That's reasonable if you have something useful to say.  Telling
people that they can use lambdas here is /not/ useful.  We all
know we could use lambdas, but that is totally missing the point
of the discussion.
I guess Michael S doesn't know this, otherwrite he would have used
a lambda himself. That's pre-C++11-style and maybe his knowledge
is from then.
Seriously? Sometimes I wonder if you ever bother reading other
people's posts.
Of /course/ Michael knows he could use a lambda there.
[O.T.]
I know that they can be used, but I never use lambdas myself.
Because of that my knowledge is theoretical and I am not fluent with
syntax.
Why I am not using lambdas myself? Because I think that lambda with
captures makes code harder to follow and to understand (that's my
1st hand experience from reading big corpus of Ada83 code. Ada83
does not have lambdas, but it has nested procedures that can access
parent's variable, similarly to lambda with [&] default capture).
And lambda without captures does not provide enough of advantage
over named functions to bother with mastering new concept.
It's maybe worth having another thread about the pros and cons of
lambdas, but that really should be a new thread.
Some time ago on comp.lang.c we had very long thread about floodfill4
algorithm (that both myself and TimR took more seriously than an
issue deserves, but that's off topic).

Today I coded two implementations of original brute-force recursive
algorithm.

// floodfill_recursive_nested.
// Implementation is in none-tricky C++
// Very similar to what can be done in C

#include <cstddef>

int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;

size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;

struct {
unsigned char *grey;
size_t width, height;
unsigned char target, dest;
void core(size_t x, size_t y) const
{
if (x < width && y < height) {
auto idx = y*width+x;
if (grey[idx] == target) {
grey[idx] = dest;
core(x - 1, y);
core(x + 1, y);
core(x, y - 1);
core(x, y + 1);
}
}
}
} context = {
.grey = grey,
.width = w, .height = h,
.target = target, .dest = dest,
};
context.core(x, y);
return 1;
}

// end of floodfill_recursive_nested.



// floodfill_recursive_lambda.
// Implementation in tricky C++
// C can not do it

#include <cstddef>

int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;

size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;

auto core = [=] (auto& a_ref, size_t x, size_t y) {
if (x >= w || y >= h)
return;
auto idx = y*w+x;
if (grey[idx] == target) {
grey[idx] = dest;
a_ref(a_ref, x - 1, y);
a_ref(a_ref, x + 1, y);
a_ref(a_ref, x, y - 1);
a_ref(a_ref, x, y + 1);
}
};
core(core, x, y);
return 1;
}

// end of floodfill_recursive_lambda.


In the second variant in order to make it compile at all I had to uses
very dirty trick with lambda passed as parameter to itself. I copied it
from Stack Overflow, but don't pretend to understand why it works and
why it needed in the first place.

But that's only part of the story.
The other part is that the first variant is 1.2x faster.
Bonita Montero
2024-09-25 17:17:35 UTC
Permalink
I've only pros about the functional programming which is possible since
C++11. I've already shown a function in another thread from which you
can see how I use functional programming in C++:

void fileList( wchar_t const *path, std::function<bool (
WIN32_FIND_DATAW const & )> const &fn, DWORD *pdwErr )
{
using namespace std;
if( pdwErr )
*pdwErr = NO_ERROR;
WIN32_FIND_DATAW fd;
HANDLE hFind = FindFirstFileW( path, &fd );
auto err = [&]<bool Initiate>( bool_constant<Initiate> )
{
DWORD dwErr = GetLastError();
if( dwErr == (Initiate ? ERROR_FILE_NOT_FOUND : ERROR_NO_MORE_FILES) )
return;
if( !pdwErr )
{
constexpr char const *ERR = Initiate ? "FindFirstFile() failed: \"" :
"FindNextFile() failed: \"";
throw system_error( dwErr, system_category(), (ostringstream() << ERR
<< wideCharToMultiByte( path ) << "\"").str().c_str() );
}
*pdwErr = dwErr;
};
if( hFind == INVALID_HANDLE_VALUE )
return (void)err( true_type() );
invoke_on_destruct closeFind( [&]() { FindClose( hFind ); } );
do
if( wcscmp( fd.cFileName, L"." ) == 0 || wcscmp( fd.cFileName, L".." )
== 0 )
continue;
else if( !fn( fd ) )
return;
while( FindNextFileW( hFind, &fd ) );
err( false_type() );
}

The function lists files in a directory with the Win32-API. It has
a runtime polymophic callback thar gets the WIN32_FIND_DATAW-ojects.
It has two places where it deals with the error-handling, either
by trowing a system_error exception or if *pdwErr is non-null, by
setting an error-code. This error handling is encapsulated in a
lambda to prevent mostly redundant code. This lambda is generic
because there are two different error codes which aren't actually
errors, but valid states.
And I'm using sth. like experimental::scope_exit to close the find
handle. But this scope_exit-like class is more flexible since you
can chain multiple of them and if an exception is throws multiple
changes to a data structure are "rolled back" and if you call dis-
able() on the last in the chain everything is "committed".
Bonita Montero
2024-09-25 17:55:50 UTC
Permalink
I just noticed who started this thread. Maybe you don't trust lambdas
because lambdas implement a lot of functionality with very little
syntax. This may seem much more understandable in a hand-written class
with a call operator, but once you get into the topic and really work
with it a lot, you really learn to appreciate this minimalist syntax.
At this point I can recommend the book "C++ Lambda Story" by Bartłomiej
Filipek.
Paavo Helde
2024-09-25 19:53:47 UTC
Permalink
Post by Michael S
On Wed, 25 Sep 2024 17:14:09 +0200
On Wed, 25 Sep 2024 15:04:27 +0200
That's reasonable if you have something useful to say.  Telling
people that they can use lambdas here is /not/ useful.  We all
know we could use lambdas, but that is totally missing the point
of the discussion.
I guess Michael S doesn't know this, otherwrite he would have used
a lambda himself. That's pre-C++11-style and maybe his knowledge
is from then.
Seriously? Sometimes I wonder if you ever bother reading other
people's posts.
Of /course/ Michael knows he could use a lambda there.
[O.T.]
I know that they can be used, but I never use lambdas myself.
Because of that my knowledge is theoretical and I am not fluent with
syntax.
Why I am not using lambdas myself? Because I think that lambda with
captures makes code harder to follow and to understand (that's my
1st hand experience from reading big corpus of Ada83 code. Ada83
does not have lambdas, but it has nested procedures that can access
parent's variable, similarly to lambda with [&] default capture).
And lambda without captures does not provide enough of advantage
over named functions to bother with mastering new concept.
It's maybe worth having another thread about the pros and cons of
lambdas, but that really should be a new thread.
Some time ago on comp.lang.c we had very long thread about floodfill4
algorithm (that both myself and TimR took more seriously than an
issue deserves, but that's off topic).
Today I coded two implementations of original brute-force recursive
algorithm.
// floodfill_recursive_nested.
// Implementation is in none-tricky C++
// Very similar to what can be done in C
#include <cstddef>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
struct {
unsigned char *grey;
size_t width, height;
unsigned char target, dest;
void core(size_t x, size_t y) const
{
if (x < width && y < height) {
auto idx = y*width+x;
if (grey[idx] == target) {
grey[idx] = dest;
core(x - 1, y);
core(x + 1, y);
core(x, y - 1);
core(x, y + 1);
}
}
}
} context = {
.grey = grey,
.width = w, .height = h,
.target = target, .dest = dest,
};
context.core(x, y);
return 1;
}
// end of floodfill_recursive_nested.
// floodfill_recursive_lambda.
// Implementation in tricky C++
// C can not do it
#include <cstddef>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
auto core = [=] (auto& a_ref, size_t x, size_t y) {
if (x >= w || y >= h)
return;
auto idx = y*w+x;
if (grey[idx] == target) {
grey[idx] = dest;
a_ref(a_ref, x - 1, y);
a_ref(a_ref, x + 1, y);
a_ref(a_ref, x, y - 1);
a_ref(a_ref, x, y + 1);
}
};
core(core, x, y);
return 1;
}
// end of floodfill_recursive_lambda.
In the second variant in order to make it compile at all I had to uses
very dirty trick with lambda passed as parameter to itself. I copied it
from Stack Overflow, but don't pretend to understand why it works and
why it needed in the first place.
But that's only part of the story.
The other part is that the first variant is 1.2x faster.
This is called a strawman argument. You invent an example where the
lambda does not really suit and is slower than an alternative, then
attack it.

Solution is simple: use lambdas where they fit.
Michael S
2024-09-25 20:30:08 UTC
Permalink
On Wed, 25 Sep 2024 22:53:47 +0300
Post by Paavo Helde
This is called a strawman argument. You invent an example where the
lambda does not really suit and is slower than an alternative, then
attack it.
Solution is simple: use lambdas where they fit.
Can you give me one reason why writing recursive lambdas in C++ can not
be streight-forward?
That's how lambda-base recursive code that is 100% equivalent of above
C++ looks in Go. Simpler, isn't it?

var ff4 func(x, y uintptr)
ff4 = func(x, y uintptr) {
if x >= w || y >= h {
return
}
if img[w*y+x] != target {
return
}
img[w*y+x] = dest
ff4(x-1, y)
ff4(x+1, y)
ff4(x, y-1)
ff4(x, y+1)
}
Paavo Helde
2024-09-25 21:04:03 UTC
Permalink
Post by Michael S
On Wed, 25 Sep 2024 22:53:47 +0300
Post by Paavo Helde
This is called a strawman argument. You invent an example where the
lambda does not really suit and is slower than an alternative, then
attack it.
Solution is simple: use lambdas where they fit.
Can you give me one reason why writing recursive lambdas in C++ can not
be streight-forward?
That's how lambda-base recursive code that is 100% equivalent of above
C++ looks in Go. Simpler, isn't it?
var ff4 func(x, y uintptr)
ff4 = func(x, y uintptr) {
if x >= w || y >= h {
return
}
if img[w*y+x] != target {
return
}
img[w*y+x] = dest
ff4(x-1, y)
ff4(x+1, y)
ff4(x, y-1)
ff4(x, y+1)
}
Looks simpler, but not sure it actually is, given that access to the
"outer" variables seems to be not encapsulated or controlled in any way.
Probably this is an equivalent to a C++ free function with global or
thread-local variables, not to a C++ lambda.


There are many things in C++ which I do not like and many things which
I'm sure could work better. However, my task as a software engineer is
to make the best use of the tools which I have got. With C++ this means
avoiding use or misuse of very many of its features.
Michael S
2024-09-26 08:17:05 UTC
Permalink
On Thu, 26 Sep 2024 00:04:03 +0300
Post by Paavo Helde
Post by Michael S
On Wed, 25 Sep 2024 22:53:47 +0300
Post by Paavo Helde
This is called a strawman argument. You invent an example where the
lambda does not really suit and is slower than an alternative, then
attack it.
Solution is simple: use lambdas where they fit.
Can you give me one reason why writing recursive lambdas in C++ can
not be streight-forward?
That's how lambda-base recursive code that is 100% equivalent of
above C++ looks in Go. Simpler, isn't it?
var ff4 func(x, y uintptr)
ff4 = func(x, y uintptr) {
if x >= w || y >= h {
return
}
if img[w*y+x] != target {
return
}
img[w*y+x] = dest
ff4(x-1, y)
ff4(x+1, y)
ff4(x, y-1)
ff4(x, y+1)
}
Looks simpler, but not sure it actually is, given that access to the
"outer" variables seems to be not encapsulated or controlled in any way.
Correct. Go does has no fine access control that is available in C++.
Any lambda in Go has access to all variables at scope of its
declaration/definition.
Post by Paavo Helde
Probably this is an equivalent to a C++ free function with
global or thread-local variables, not to a C++ lambda.
That's incorrect.
Lambda in Go is less controlled than in C++, but it is not less
powerful. More like more powerful.
According to my understanding, in Go one can pass lambda to goroutine
(which is more similar to C++ thread than to C++ coroutine) then exit
the calling function and things will still work as naively expected.
I don't believe that you can do it in C++, or, at least you can't do it
if your lambda has captures accessed by reference.
Pay attention, I didn't try it, just skimmed docs, so I could be wrong.
Post by Paavo Helde
There are many things in C++ which I do not like and many things
which I'm sure could work better. However, my task as a software
engineer is to make the best use of the tools which I have got. With
C++ this means avoiding use or misuse of very many of its features.
Of course, I agree with that. Where I disagree is that IMO in case of
lambdas almost every use carries at least a seed of misuse So, on
balance, C++ language would have been better without them.
Paavo Helde
2024-09-26 08:25:05 UTC
Permalink
Post by Paavo Helde
Post by Michael S
On Wed, 25 Sep 2024 22:53:47 +0300
Post by Paavo Helde
This is called a strawman argument. You invent an example where the
lambda does not really suit and is slower than an alternative, then
attack it.
Solution is simple: use lambdas where they fit.
Can you give me one reason why writing recursive lambdas in C++ can not
be streight-forward?
That's how lambda-base recursive code that is 100% equivalent of above
C++ looks in Go. Simpler, isn't it?
   var ff4 func(x, y uintptr)
   ff4 = func(x, y uintptr) {
     if x >= w || y >= h {
       return
     }
     if img[w*y+x] != target {
       return
     }
     img[w*y+x] = dest
     ff4(x-1, y)
     ff4(x+1, y)
     ff4(x, y-1)
     ff4(x, y+1)
   }
Looks simpler, but not sure it actually is, given that access to the
"outer" variables seems to be not encapsulated or controlled in any way.
Probably this is an equivalent to a C++ free function with global or
thread-local variables, not to a C++ lambda.
There are many things in C++ which I do not like and many things which
I'm sure could work better. However, my task as a software engineer is
to make the best use of the tools which I have got. With C++ this means
avoiding use or misuse of very many of its features.
PS. One of things I have learned to avoid in C++ is heavy recursion,
which is known to be slow (on x86 at least) and can easily cause stack
overflows. I just tested this simplistic fully recursive floodfill4 and
both versions run out of stack already with so small as 200x200
uniformly filled images, in a VS2022 x86_64 program with default settings.

So this solution is totally unacceptable for production code, and all
talk about its simplicity (or not) is purely academic.
Bonita Montero
2024-09-26 08:28:32 UTC
Permalink
Post by Paavo Helde
PS. One of things I have learned to avoid in C++ is heavy recursion,
which is known to be slow (on x86 at least) and can easily cause stack
overflows. ...
Recursion ain't slow and stack you never have so much recursion levels
that a stack overlow ist realistic. Stack overflows are usually program-
ming errors like a missing termination condition with a recursion or
you do too much alloca().
Paavo Helde
2024-09-26 08:49:20 UTC
Permalink
Post by Bonita Montero
Post by Paavo Helde
PS. One of things I have learned to avoid in C++ is heavy recursion,
which is known to be slow (on x86 at least) and can easily cause stack
overflows. ...
Recursion ain't slow and stack you never have so much recursion levels
that a stack overlow ist realistic. Stack overflows are usually program-
ming errors like a missing termination condition with a recursion or
you do too much alloca().
You are funny, thanks for the laughs!

But seriously, the recursion stack frame size in the original floodfill4
examples in top of this thread is 64 and 80 bytes, respectively (MSVC
adds its own security checks in each frame by default. that's why these
are larger than expected). The default stack size in MSVC++ is 1MB. This
means ca 17,000 or 13,000 recursions, which means these algorithms only
cope with contiguous objects of so many pixels. Do you really think this
is an unrealistically large number? It would be about 0.1 % of your
average phone click.
Michael S
2024-09-26 08:38:37 UTC
Permalink
On Thu, 26 Sep 2024 11:25:05 +0300
Post by Paavo Helde
Post by Paavo Helde
Post by Michael S
On Wed, 25 Sep 2024 22:53:47 +0300
Post by Paavo Helde
This is called a strawman argument. You invent an example where
the lambda does not really suit and is slower than an
alternative, then attack it.
Solution is simple: use lambdas where they fit.
Can you give me one reason why writing recursive lambdas in C++
can not be streight-forward?
That's how lambda-base recursive code that is 100% equivalent of
above C++ looks in Go. Simpler, isn't it?
   var ff4 func(x, y uintptr)
   ff4 = func(x, y uintptr) {
     if x >= w || y >= h {
       return
     }
     if img[w*y+x] != target {
       return
     }
     img[w*y+x] = dest
     ff4(x-1, y)
     ff4(x+1, y)
     ff4(x, y-1)
     ff4(x, y+1)
   }
Looks simpler, but not sure it actually is, given that access to
the "outer" variables seems to be not encapsulated or controlled in
any way. Probably this is an equivalent to a C++ free function with
global or thread-local variables, not to a C++ lambda.
There are many things in C++ which I do not like and many things
which I'm sure could work better. However, my task as a software
engineer is to make the best use of the tools which I have got.
With C++ this means avoiding use or misuse of very many of its
features.
PS. One of things I have learned to avoid in C++ is heavy recursion,
which is known to be slow (on x86 at least) and can easily cause
stack overflows. I just tested this simplistic fully recursive
floodfill4 and both versions run out of stack already with so small
as 200x200 uniformly filled images, in a VS2022 x86_64 program with
default settings.
When I test it on Windows the stack size is set to 8MB.
That's enough for 200x200.
It seems, on many x86-64 Linuxes 8MB is a default.
Post by Paavo Helde
So this solution is totally unacceptable for production code, and all
talk about its simplicity (or not) is purely academic.
It depends.
Sometimes one knows up front that the images are small.
Yet sometimes one knows that in his environment stack is allowed to be
huge.
For example, Go on x86-64 sets default stack limit to 1 GB.
But more often what you said is correct.
Ben Bacarisse
2024-09-25 22:13:00 UTC
Permalink
Post by Michael S
On Wed, 25 Sep 2024 17:14:09 +0200
On Wed, 25 Sep 2024 15:04:27 +0200
That's reasonable if you have something useful to say.  Telling
people that they can use lambdas here is /not/ useful.  We all
know we could use lambdas, but that is totally missing the point
of the discussion.
I guess Michael S doesn't know this, otherwrite he would have used
a lambda himself. That's pre-C++11-style and maybe his knowledge
is from then.
Seriously? Sometimes I wonder if you ever bother reading other
people's posts.
Of /course/ Michael knows he could use a lambda there.
[O.T.]
I know that they can be used, but I never use lambdas myself.
Because of that my knowledge is theoretical and I am not fluent with
syntax.
Why I am not using lambdas myself? Because I think that lambda with
captures makes code harder to follow and to understand (that's my
1st hand experience from reading big corpus of Ada83 code. Ada83
does not have lambdas, but it has nested procedures that can access
parent's variable, similarly to lambda with [&] default capture).
And lambda without captures does not provide enough of advantage
over named functions to bother with mastering new concept.
It's maybe worth having another thread about the pros and cons of
lambdas, but that really should be a new thread.
Some time ago on comp.lang.c we had very long thread about floodfill4
algorithm (that both myself and TimR took more seriously than an
issue deserves, but that's off topic).
Today I coded two implementations of original brute-force recursive
algorithm.
// floodfill_recursive_nested.
// Implementation is in none-tricky C++
// Very similar to what can be done in C
#include <cstddef>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
struct {
unsigned char *grey;
size_t width, height;
unsigned char target, dest;
void core(size_t x, size_t y) const
{
if (x < width && y < height) {
auto idx = y*width+x;
if (grey[idx] == target) {
grey[idx] = dest;
core(x - 1, y);
core(x + 1, y);
core(x, y - 1);
core(x, y + 1);
}
}
}
} context = {
.grey = grey,
.width = w, .height = h,
.target = target, .dest = dest,
};
context.core(x, y);
return 1;
}
// end of floodfill_recursive_nested.
// floodfill_recursive_lambda.
// Implementation in tricky C++
// C can not do it
#include <cstddef>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
auto core = [=] (auto& a_ref, size_t x, size_t y) {
if (x >= w || y >= h)
return;
auto idx = y*w+x;
if (grey[idx] == target) {
grey[idx] = dest;
a_ref(a_ref, x - 1, y);
a_ref(a_ref, x + 1, y);
a_ref(a_ref, x, y - 1);
a_ref(a_ref, x, y + 1);
}
};
core(core, x, y);
return 1;
}
// end of floodfill_recursive_lambda.
In the second variant in order to make it compile at all I had to uses
very dirty trick with lambda passed as parameter to itself. I copied it
from Stack Overflow, but don't pretend to understand why it works and
why it needed in the first place.
For this part the answer is simple. The lambda only captures names that
are defined at its point of definition, and core is not defined until
the end of the declarator which include the initialisation -- the lambda
you are defining. The lambda stored in 'core', can't therefore refer to
the name core because core was not defined before the lambda. You can,
instead, do this (untested):

#include <functional>

int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;

size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;

std::function<void(size_t, size_t)> core;
core = [=] (size_t x, size_t y) {
if (x >= w || y >= h)
return;
auto idx = y*w+x;
if (grey[idx] == target) {
grey[idx] = dest;
core(x - 1, y);
core(x + 1, y);
core(x, y - 1);
core(x, y + 1);
}
};
core(x, y);
return 1;
}
Post by Michael S
But that's only part of the story.
The other part is that the first variant is 1.2x faster.
Interesting, though this is not really what a lambda is for. What you
want here is a plain lexical nested function -- it's purpose being just
an auxiliary function that can refer to the outer scope so as to require
fewer parameters.

It would be interesting to see is gcc's nested function extension
produced something that was faster than a lambda.

Note: this is a very old problem and actually pre-dates the computing
era. Combinatory logic had to come up with the Y combinator to make
recursive "functions", and (slightly more recently) some Lisps have two
forms of 'let' to deal with this issue.
--
Ben.
Ben Bacarisse
2024-09-25 22:28:57 UTC
Permalink
... You can,
I should have tested! Of course you have to change the capture (at
#include <functional>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
std::function<void(size_t, size_t)> core;
core = [&] (size_t x, size_t y) {
Corrected-----^
if (x >= w || y >= h)
return;
auto idx = y*w+x;
if (grey[idx] == target) {
grey[idx] = dest;
core(x - 1, y);
core(x + 1, y);
core(x, y - 1);
core(x, y + 1);
}
};
core(x, y);
return 1;
}
--
Ben.
Michael S
2024-09-26 07:41:24 UTC
Permalink
On Wed, 25 Sep 2024 23:28:57 +0100
Post by Ben Bacarisse
... You can,
I should have tested! Of course you have to change the capture (at
#include <functional>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
std::function<void(size_t, size_t)> core;
core = [&] (size_t x, size_t y) {
Corrected-----^
if (x >= w || y >= h)
return;
auto idx = y*w+x;
if (grey[idx] == target) {
grey[idx] = dest;
core(x - 1, y);
core(x + 1, y);
core(x, y - 1);
core(x, y + 1);
}
};
core(x, y);
return 1;
}
That's where I started.
It is approximately twice slower than tricky version with [&].
And tricky [&] almost 20% slower than tricky [=].
I didn't try to understand what std::function thing is, but fast it
isn't.
David Brown
2024-09-26 08:29:46 UTC
Permalink
Post by Michael S
On Wed, 25 Sep 2024 23:28:57 +0100
Post by Ben Bacarisse
... You can,
I should have tested! Of course you have to change the capture (at
#include <functional>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
std::function<void(size_t, size_t)> core;
core = [&] (size_t x, size_t y) {
Corrected-----^
if (x >= w || y >= h)
return;
auto idx = y*w+x;
if (grey[idx] == target) {
grey[idx] = dest;
core(x - 1, y);
core(x + 1, y);
core(x, y - 1);
core(x, y + 1);
}
};
core(x, y);
return 1;
}
That's where I started.
It is approximately twice slower than tricky version with [&].
And tricky [&] almost 20% slower than tricky [=].
I didn't try to understand what std::function thing is, but fast it
isn't.
std::function<> can be very flexible, but it is often costly at
run-time. There are situations where the compiler can optimise away the
overheads, but often at least some remains.
Bonita Montero
2024-09-26 09:35:55 UTC
Permalink
std::function<> can be very flexible, but it is often costly at run-
time.  There are situations where the compiler can optimise away the
overheads, but often at least some remains.
I posted here some code that compares a std::function<> against a
virtual method call and some other alternatives. A function<>-object
adds only two ints and returns the result is about one nanosecond
on my Zen4-PC. That's not much.
David Brown
2024-09-26 11:27:21 UTC
Permalink
Post by Bonita Montero
std::function<> can be very flexible, but it is often costly at run-
time.  There are situations where the compiler can optimise away the
overheads, but often at least some remains.
I posted here some code that compares a std::function<> against a
virtual method call and some other alternatives. A function<>-object
adds only two ints and returns the result is about one nanosecond
on my Zen4-PC. That's not much.
Please don't bother replying to posts unless you first read them, then
think about them. Otherwise you are just wasting everyone's time.

Nothing you write there contradicts what I wrote, or adds any new
information.
Bonita Montero
2024-09-26 11:31:22 UTC
Permalink
Am 26.09.2024 um 13:27 schrieb David Brown:
^> Please don't bother replying to posts unless you first read them, then
think about them.  Otherwise you are just wasting everyone's time.
I've read it and I'm thinking the overvhead of std::function<> ist
neglectable.
Michael S
2024-09-26 12:25:11 UTC
Permalink
On Thu, 26 Sep 2024 13:31:22 +0200
Post by Bonita Montero
^> Please don't bother replying to posts unless you first read them, then
think about them.  Otherwise you are just wasting everyone's time.
I've read it and I'm thinking the overvhead of std::function<> ist
neglectable.
As majority of things, it depends.
On CPU with deep OoO capabilities it likely depends on being on or off
critical latency path.
I'd guess that in your case std::function calls were off critical
latency path. In the other words, they were independent of each other.
In my measurements the calls were partially dependent which led to
higher cost - approximately 2ns both on Intel Skylake at 4.25 GHz and
on AMD Zen3 at 3.6 GHz.
Since in this algorithm call/return overhead is the main component
of the run time, 2ns addition was enough to make the whole test run
twice slower on Skylake and more than twice slower on Zen3.
Bonita Montero
2024-09-26 12:58:26 UTC
Permalink
Post by Michael S
As majority of things, it depends.
I can also imagine code where the call-overhead might be relevant,
but the virtual dispatch on the function<>-object is that fast that
it is not relevant in 95% of all cases.
Michael S
2024-09-26 13:53:04 UTC
Permalink
On Thu, 26 Sep 2024 14:58:26 +0200
Post by Bonita Montero
Post by Michael S
As majority of things, it depends.
I can also imagine code where the call-overhead might be relevant,
but the virtual dispatch on the function<>-object is that fast that
it is not relevant in 95% of all cases.
Actually, I am not at all sure that slowdown in my case caused by
overhead of virtual dispatch.
It appears more likely that virtuality of dispatch prevents from
happening some sort of compiler optimization, probably partial inlining,
that in case of direct call is able to eliminating approximately half
of the calls.
But exact reason does not matter. What matters is a slowdown.
Bonita Montero
2024-09-26 13:54:44 UTC
Permalink
Post by Michael S
Actually, I am not at all sure that slowdown in my case caused by
overhead of virtual dispatch.
It appears more likely that virtuality of dispatch prevents from
happening some sort of compiler optimization, probably partial inlining,
that in case of direct call is able to eliminating approximately half
of the calls.
But exact reason does not matter. What matters is a slowdown.
Use function<>-dispatch only for coarse-grained virtuality, then
you don't make anything worse.
Michael S
2024-09-27 13:55:07 UTC
Permalink
On Thu, 26 Sep 2024 10:41:24 +0300
Post by Michael S
On Wed, 25 Sep 2024 23:28:57 +0100
Post by Ben Bacarisse
... You can,
I should have tested! Of course you have to change the capture (at
#include <functional>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
std::function<void(size_t, size_t)> core;
core = [&] (size_t x, size_t y) {
Corrected-----^
if (x >= w || y >= h)
return;
auto idx = y*w+x;
if (grey[idx] == target) {
grey[idx] = dest;
core(x - 1, y);
core(x + 1, y);
core(x, y - 1);
core(x, y + 1);
}
};
core(x, y);
return 1;
}
That's where I started.
It is approximately twice slower than tricky version with [&].
And tricky [&] almost 20% slower than tricky [=].
I didn't try to understand what std::function thing is, but fast it
isn't.
After digging a bit deeper I realized that slowness is only a symptom
of the bigger problem. This variant of lambda defeats the whole purpose
of having lambda/context, i.e. reduction of the size of call frame.
This variant of lambda generates call frame of exactly the same size as
straightforward code below.
I'd guess, it means that this variant of lambda tracks its captures
separately, instead of tracking them together via single pointer to
parent's stack frame.

#include <cstddef>

static void floodfill4u(
unsigned char *grey,
size_t width, size_t height,
size_t x, size_t y,
unsigned char target, unsigned char dest)
{
if (x >= width || y >= height)
return;
auto idx = y*width+x;
if (grey[idx] == target) {
grey[idx] = dest;
floodfill4u(grey, width, height, x - 1, y, target, dest);
floodfill4u(grey, width, height, x + 1, y, target, dest);
floodfill4u(grey, width, height, x, y - 1, target, dest);
floodfill4u(grey, width, height, x, y + 1, target, dest);
}
}


int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
if (grey[(size_t)y*width+x] != target)
return 0;
floodfill4u(grey, width, height, x, y, target, dest);
return 1;
}
Tim Rentsch
2024-09-26 17:01:11 UTC
Permalink
Post by Ben Bacarisse
... You can,
I should have tested! Of course you have to change the capture (at
#include <functional>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
std::function<void(size_t, size_t)> core;
core = [&] (size_t x, size_t y) {
Corrected-----^
Why do you think [&] should be used instead of [=]?
Bonita Montero
2024-09-26 17:04:07 UTC
Permalink
Post by Tim Rentsch
Post by Ben Bacarisse
... You can,
I should have tested! Of course you have to change the capture (at
#include <functional>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
std::function<void(size_t, size_t)> core;
core = [&] (size_t x, size_t y) {
Corrected-----^
Why do you think [&] should be used instead of [=]?
If you're capturing only variables of a fundamental type the whole
copying is optimized away. If you need a lambda in a foreign thread
context you have to copy always.
Michael S
2024-09-26 17:20:50 UTC
Permalink
On Thu, 26 Sep 2024 10:01:11 -0700
Post by Tim Rentsch
Post by Ben Bacarisse
... You can,
I should have tested! Of course you have to change the capture (at
#include <functional>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
std::function<void(size_t, size_t)> core;
core = [&] (size_t x, size_t y) {
Corrected-----^
Why do you think [&] should be used instead of [=]?
Compiler says that otherwise 'core' captures itself before it properly
initialize.
And indeed, it does not work (Segmentation fault).
Tim Rentsch
2024-09-26 17:38:02 UTC
Permalink
Post by Michael S
On Thu, 26 Sep 2024 10:01:11 -0700
Post by Tim Rentsch
Post by Ben Bacarisse
... You can,
I should have tested! Of course you have to change the capture (at
#include <functional>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
std::function<void(size_t, size_t)> core;
core = [&] (size_t x, size_t y) {
Corrected-----^
Why do you think [&] should be used instead of [=]?
Compiler says that otherwise 'core' captures itself before it
properly initialize.
And indeed, it does not work (Segmentation fault).
Ahh. A chicken and egg problem. I sidestepped that by initializing
'core' as part of its declaration (as my other response to Ben
shows).

No compiler warning, but I didn't try running it.
Michael S
2024-09-27 13:27:13 UTC
Permalink
On Thu, 26 Sep 2024 10:38:02 -0700
Post by Tim Rentsch
Post by Michael S
On Thu, 26 Sep 2024 10:01:11 -0700
Post by Tim Rentsch
Post by Ben Bacarisse
... You can,
I should have tested! Of course you have to change the capture
#include <functional>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
std::function<void(size_t, size_t)> core;
core = [&] (size_t x, size_t y) {
Corrected-----^
Why do you think [&] should be used instead of [=]?
Compiler says that otherwise 'core' captures itself before it
properly initialize.
And indeed, it does not work (Segmentation fault).
Ahh. A chicken and egg problem. I sidestepped that by initializing
'core' as part of its declaration (as my other response to Ben
shows).
No compiler warning, but I didn't try running it.
It does not make any difference. The same crash as before.
BTW, I am surprised that your compiler issues no warning.
Tim Rentsch
2024-09-28 11:06:40 UTC
Permalink
Post by Michael S
On Thu, 26 Sep 2024 10:38:02 -0700
Post by Tim Rentsch
Post by Michael S
On Thu, 26 Sep 2024 10:01:11 -0700
Post by Tim Rentsch
Post by Ben Bacarisse
... You can,
I should have tested! Of course you have to change the capture
#include <functional>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
std::function<void(size_t, size_t)> core;
core = [&] (size_t x, size_t y) {
Corrected-----^
Why do you think [&] should be used instead of [=]?
Compiler says that otherwise 'core' captures itself before it
properly initialize.
And indeed, it does not work (Segmentation fault).
Ahh. A chicken and egg problem. I sidestepped that by initializing
'core' as part of its declaration (as my other response to Ben
shows).
No compiler warning, but I didn't try running it.
It does not make any difference. The same crash as before.
Yes, not surprising.
Post by Michael S
BTW, I am surprised that your compiler issues no warning.
Probably just an old compiler. I'm not up-to-date with what's
going on with C++ so there isn't much incentive to upgrade.
Trying again with clang I do get a warning.
Michael S
2024-09-26 07:58:46 UTC
Permalink
On Wed, 25 Sep 2024 23:13:00 +0100
Post by Ben Bacarisse
Post by Michael S
// end of floodfill_recursive_lambda.
In the second variant in order to make it compile at all I had to
uses very dirty trick with lambda passed as parameter to itself. I
copied it from Stack Overflow, but don't pretend to understand why
it works and why it needed in the first place.
For this part the answer is simple. The lambda only captures names
that are defined at its point of definition, and core is not defined
until the end of the declarator which include the initialisation --
the lambda you are defining. The lambda stored in 'core', can't
therefore refer to the name core because core was not defined before
the lambda.
It does not sound too different from struct that holds pointer to
itself. If forward declaration works for later I don't see why it can
not work for the former. Isn't lambda a sort of struct when we look
under the hood?
May be, the syntax for forward declaration would need a new keyword, but
that does not sound like exaggerated price for convenience and for
improved readability.
<snip>
Post by Ben Bacarisse
Post by Michael S
But that's only part of the story.
The other part is that the first variant is 1.2x faster.
Interesting, though this is not really what a lambda is for. What you
want here is a plain lexical nested function -- it's purpose being
just an auxiliary function that can refer to the outer scope so as to
require fewer parameters.
Yes, that's a fair definition of my needs.
The main and almost only reason for having core() instead of calling
recursively to floodfill4() itself is to reduce the size of stack frame
of recursive call.
Post by Ben Bacarisse
It would be interesting to see is gcc's nested function extension
produced something that was faster than a lambda.
Note: this is a very old problem and actually pre-dates the computing
era. Combinatory logic had to come up with the Y combinator to make
recursive "functions", and (slightly more recently) some Lisps have
two forms of 'let' to deal with this issue.
I don't know what is Y combinator. I do know that in Go forward
declaration works fine.
Tim Rentsch
2024-09-26 17:27:13 UTC
Permalink
[...]
Post by Ben Bacarisse
Post by Michael S
Some time ago on comp.lang.c we had very long thread about
floodfill4 algorithm (that both myself and TimR took more
seriously than an issue deserves, but that's off topic).
Today I coded two implementations of original brute-force
recursive algorithm.
// floodfill_recursive_nested.
// Implementation is in none-tricky C++
// Very similar to what can be done in C
#include <cstddef>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
struct {
unsigned char *grey;
size_t width, height;
unsigned char target, dest;
void core(size_t x, size_t y) const
{
if (x < width && y < height) {
auto idx = y*width+x;
if (grey[idx] == target) {
grey[idx] = dest;
core(x - 1, y);
core(x + 1, y);
core(x, y - 1);
core(x, y + 1);
}
}
}
} context = {
.grey = grey,
.width = w, .height = h,
.target = target, .dest = dest,
};
context.core(x, y);
return 1;
}
// end of floodfill_recursive_nested.
// floodfill_recursive_lambda.
// Implementation in tricky C++
// C can not do it
#include <cstddef>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
auto core = [=] (auto& a_ref, size_t x, size_t y) {
if (x >= w || y >= h)
return;
auto idx = y*w+x;
if (grey[idx] == target) {
grey[idx] = dest;
a_ref(a_ref, x - 1, y);
a_ref(a_ref, x + 1, y);
a_ref(a_ref, x, y - 1);
a_ref(a_ref, x, y + 1);
}
};
core(core, x, y);
return 1;
}
// end of floodfill_recursive_lambda.
In the second variant in order to make it compile at all I had to
uses very dirty trick with lambda passed as parameter to itself.
I copied it from Stack Overflow, but don't pretend to understand
why it works and why it needed in the first place.
For this part the answer is simple. The lambda only captures
names that are defined at its point of definition, and core is not
defined until the end of the declarator which include the
initialisation -- the lambda you are defining. The lambda stored
in 'core', can't therefore refer to the name core because core was
not defined before the lambda.
This explanation isn't right. In both C and C++ a declared name is
available as soon as its declarator is complete, and declarators are
complete before the following initializer (if any). The following code
compiles just fine:

#include <functional>

int revised_floodfill4(
unsigned char *const grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;

const size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;

std::function<void(size_t, size_t)> core = [=]
(size_t x, size_t y) {
if (x >= w || y >= h)
return;
auto idx = y*w+x;
if (grey[idx] == target) {
grey[idx] = dest;
core(x - 1, y);
core(x + 1, y);
core(x, y - 1);
core(x, y + 1);
}
};
core(x, y);
return 1;
}

The problem with the earlier definition is not the use of 'core' in
the body of the lambda, but the 'auto' part of the declaration.
Because the type of 'core' is not yet known, it can't be used in the
body. Once that problem is fixed by giving 'core' a specific type,
there is no problem calling it recursively in the body of the
lambda.
Bonita Montero
2024-09-26 17:32:39 UTC
Permalink
Why do you use a function<>-object where you don't need any
runtime-polymorphism. Use a variable which gets a lambda
assigned and the call is likely to be inlined.
Tim Rentsch
2024-09-26 21:09:09 UTC
Permalink
Michael S <***@yahoo.com> writes:

[.. when or why to use lambdas? ..]
Post by Michael S
Some time ago on comp.lang.c we had very long thread about
floodfill4 algorithm (that both myself and TimR took more
seriously than an issue deserves, but that's off topic).
Thank goodness for that. :)
Post by Michael S
Today I coded two implementations of original brute-force
recursive algorithm.
// floodfill_recursive_nested.
// Implementation is in none-tricky C++
// Very similar to what can be done in C
#include <cstddef>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
struct {
unsigned char *grey;
size_t width, height;
unsigned char target, dest;
void core(size_t x, size_t y) const
{
if (x < width && y < height) {
auto idx = y*width+x;
if (grey[idx] == target) {
grey[idx] = dest;
core(x - 1, y);
core(x + 1, y);
core(x, y - 1);
core(x, y + 1);
}
}
}
} context = {
.grey = grey,
.width = w, .height = h,
.target = target, .dest = dest,
};
context.core(x, y);
return 1;
}
// end of floodfill_recursive_nested.
// floodfill_recursive_lambda.
// Implementation in tricky C++
// C can not do it
#include <cstddef>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
auto core = [=] (auto& a_ref, size_t x, size_t y) {
if (x >= w || y >= h)
return;
auto idx = y*w+x;
if (grey[idx] == target) {
grey[idx] = dest;
a_ref(a_ref, x - 1, y);
a_ref(a_ref, x + 1, y);
a_ref(a_ref, x, y - 1);
a_ref(a_ref, x, y + 1);
}
};
core(core, x, y);
return 1;
}
// end of floodfill_recursive_lambda.
In the second variant in order to make it compile at all I had to
uses very dirty trick with lambda passed as parameter to itself.
I copied it from Stack Overflow, but don't pretend to understand
why it works and why it needed in the first place.
But that's only part of the story.
The other part is that the first variant is 1.2x faster.
The examples show two different ways of achieving the same goal.
However these two ways aren't that much different from each
other. To me it seems like they are both making the same
mistake, which is using a language feature to hide some context
that is better left unhidden. Here is a sketch to make that
comment more concrete:

int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
/* possible early return */

typedef struct {
unsigned char *grey;
// ... etc
} Stuff;

Stuff stuff = { grey, width, height, /*...*/ };

void
core( Stuff *it, size_t x, size_t y ){
auto k = y*it->width + x;
/* possible early return */
it->grey[k] = it->dest;
core( it, x-1, y );
core( it, x+1, y );
core( it, x, y-1 );
core( it, x, y+1 );
}

core( &stuff, x, y )
return 1;
}

The code for core() looks basically the same except that in a few
places we need to say it->width instead of width, etc. There is
no particular meaning to the contents of the struct; all that's
been done is to disguise where the variables are coming from. I
don't see any compelling reason to do that in this situation.

Getting back to lambdas, I would say that there are two primary
uses for lambdas. One use is as a convenience function, local to
an outer function definition, where some small-scale processing
step is encapsulated rather than being replicated. The second use
is as a call-back function given as an argument to some outside
function, where there is state that the outside function doesn't
know about. The same kind of reasoning applies to member functions
in structs defined locally in the outer function. Neither of those
scenarios applies in the earlier example functions.

In situations where an interface needs a callback function, usually
specifying a lambda parameter means less work for the client of
the interface, and so that scenario would be a good one to explore.
Michael S
2024-09-27 13:15:41 UTC
Permalink
On Thu, 26 Sep 2024 14:09:09 -0700
Post by Tim Rentsch
[.. when or why to use lambdas? ..]
Post by Michael S
Some time ago on comp.lang.c we had very long thread about
floodfill4 algorithm (that both myself and TimR took more
seriously than an issue deserves, but that's off topic).
Thank goodness for that. :)
Post by Michael S
Today I coded two implementations of original brute-force
recursive algorithm.
// floodfill_recursive_nested.
// Implementation is in none-tricky C++
// Very similar to what can be done in C
#include <cstddef>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
struct {
unsigned char *grey;
size_t width, height;
unsigned char target, dest;
void core(size_t x, size_t y) const
{
if (x < width && y < height) {
auto idx = y*width+x;
if (grey[idx] == target) {
grey[idx] = dest;
core(x - 1, y);
core(x + 1, y);
core(x, y - 1);
core(x, y + 1);
}
}
}
} context = {
.grey = grey,
.width = w, .height = h,
.target = target, .dest = dest,
};
context.core(x, y);
return 1;
}
// end of floodfill_recursive_nested.
// floodfill_recursive_lambda.
// Implementation in tricky C++
// C can not do it
#include <cstddef>
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
if (width < 1 || height < 1)
return 0;
if (x < 0 || x >= width || y < 0 || y >= height)
return 0;
size_t w = width, h = height;
if (grey[y*w+x] != target)
return 0;
auto core = [=] (auto& a_ref, size_t x, size_t y) {
if (x >= w || y >= h)
return;
auto idx = y*w+x;
if (grey[idx] == target) {
grey[idx] = dest;
a_ref(a_ref, x - 1, y);
a_ref(a_ref, x + 1, y);
a_ref(a_ref, x, y - 1);
a_ref(a_ref, x, y + 1);
}
};
core(core, x, y);
return 1;
}
// end of floodfill_recursive_lambda.
In the second variant in order to make it compile at all I had to
uses very dirty trick with lambda passed as parameter to itself.
I copied it from Stack Overflow, but don't pretend to understand
why it works and why it needed in the first place.
But that's only part of the story.
The other part is that the first variant is 1.2x faster.
The examples show two different ways of achieving the same goal.
However these two ways aren't that much different from each
other. To me it seems like they are both making the same
mistake, which is using a language feature to hide some context
that is better left unhidden. Here is a sketch to make that
int floodfill4(
unsigned char *grey,
int width, int height,
int x, int y,
unsigned char target, unsigned char dest)
{
/* possible early return */
typedef struct {
unsigned char *grey;
// ... etc
} Stuff;
Stuff stuff = { grey, width, height, /*...*/ };
void
core( Stuff *it, size_t x, size_t y ){
auto k = y*it->width + x;
/* possible early return */
it->grey[k] = it->dest;
core( it, x-1, y );
core( it, x+1, y );
core( it, x, y-1 );
core( it, x, y+1 );
}
core( &stuff, x, y )
return 1;
}
The code for core() looks basically the same except that in a few
places we need to say it->width instead of width, etc. There is
no particular meaning to the contents of the struct; all that's
been done is to disguise where the variables are coming from. I
don't see any compelling reason to do that in this situation.
Well, the code above is neither legal standard C++ nor legal standard C.
But I understand what you mean and how to make it legal by extracting
core() out of floodfill4() body. More so, that's where I started.
In theory, it's should be exactly the same as variant with core() as
member function. In practice, compiler (gcc) optimizes recursion of
member function much better.
Post by Tim Rentsch
Getting back to lambdas, I would say that there are two primary
uses for lambdas. One use is as a convenience function, local to
an outer function definition, where some small-scale processing
step is encapsulated rather than being replicated. The second use
is as a call-back function given as an argument to some outside
function, where there is state that the outside function doesn't
know about. The same kind of reasoning applies to member functions
in structs defined locally in the outer function. Neither of those
scenarios applies in the earlier example functions.
I don't disagree in theory. Practice is something else.
Post by Tim Rentsch
In situations where an interface needs a callback function, usually
specifying a lambda parameter means less work for the client of
the interface, and so that scenario would be a good one to explore.
Tim Rentsch
2024-09-28 11:25:42 UTC
Permalink
Post by Michael S
On Thu, 26 Sep 2024 14:09:09 -0700
[...]
Post by Michael S
Post by Tim Rentsch
Getting back to lambdas, I would say that there are two primary
uses for lambdas. One use is as a convenience function, local to
an outer function definition, where some small-scale processing
step is encapsulated rather than being replicated. The second use
is as a call-back function given as an argument to some outside
function, where there is state that the outside function doesn't
know about. The same kind of reasoning applies to member functions
in structs defined locally in the outer function. Neither of those
scenarios applies in the earlier example functions.
I don't disagree in theory. Practice is something else.
Could I ask you to expand on that answer? I'm not sure what
it is you're agreeing with (or not disagreeing with), and
also not sure what the reasons are for not agreeing.

I confess to being completely lost about what you're
getting at with the practice/theory distinction here.

Loading...