Discussion:
smrproxy v2
jseigh
2024-10-17 12:10:20 UTC
I replaced the hazard pointer logic in smrproxy. It's now wait-free
instead of mostly wait-free. The reader lock logic, after loading
the address of the reader lock object into a register, is now 2
instructions: a load followed by a store. The unlock is the same
as before, just a store.

It's way faster now.

It's on the feature/003 branch as a POC. I'm working on porting
it to c++ and don't want to waste any more time on the c version.

No idea if it's a new algorithm. I suspect that since I use
the term epoch, it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
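For illustration, here's a minimal sketch of what a load-then-store reader lock like that could look like in C++. The names are hypothetical, not the actual smrproxy code, and the relaxed stores are only safe because the reclaim side issues an asymmetric memory barrier before inspecting reader slots:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch, not the actual smrproxy code.
std::atomic<uint64_t> global_epoch{1};

struct reader_lock {
    std::atomic<uint64_t> epoch{0};   // 0 == not in a critical region
};

thread_local reader_lock my_lock;

inline void rl_lock() {
    // Once &my_lock is in a register this is 2 instructions:
    // load the global epoch, then store it into the reader slot.
    uint64_t e = global_epoch.load(std::memory_order_acquire);
    my_lock.epoch.store(e, std::memory_order_relaxed);
}

inline void rl_unlock() {
    // Just a store, as described above.
    my_lock.epoch.store(0, std::memory_order_release);
}
```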

Joe Seigh
Chris M. Thomasson
2024-10-17 20:10:43 UTC
Post by jseigh
[...]
It's on the feature/003 branch as a POC.
I have to take a look at it! Been really busy lately. Shit happens.
jseigh
2024-10-17 21:08:04 UTC
Post by Chris M. Thomasson
[...]
I have to take a look at it! Been really busy lately. Shit happens.
There's a quick and dirty explanation at
http://threadnought.wordpress.com/

repo at https://github.com/jseigh/smrproxy

I'll need to create some memory access diagrams that
visualize how it works at some point.

Anyway, if it's new, it's another algorithm for people to use
without attribution.

Joe Seigh
Chris M. Thomasson
2024-10-17 23:40:00 UTC
Post by jseigh
[...]
There's a quick and dirty explanation at
http://threadnought.wordpress.com/
repo at https://github.com/jseigh/smrproxy
I'll need to create some memory access diagrams that
visualize how it works at some point.
Anyway if it's new, another algorithm to use without
attribution.
Interesting. From a quick view, it kind of reminds me of a distributed
seqlock for some reason. Are you using an asymmetric membar in here,
in smr_poll?
Chris M. Thomasson
2024-10-18 00:16:55 UTC
Post by Chris M. Thomasson
[...]
Interesting. From a quick view, it kind of reminds me of a distributed
seqlock for some reason. Are you using an asymmetric membar in here,
in smr_poll?
I remember a long time ago I was messing around with a scheme where
each thread had two version counters:

pseudo code:

per_thread
{
    word m_version[2];

    word acquire()
    {
        word ver = load(global_version);
        m_version[ver % 2] = ver;
        return ver;
    }

    void release(word ver)
    {
        m_version[ver % 2] = 0;
    }
}


The global_version would only be incremented by the polling thread. This
was WAY back. I think I might have posted about it on cpt.

So, when a node was made unreachable, it would be included in the
polling logic. The poller would increment the version counter, then wait
for all the threads' prior m_versions to be zero, and collect the current
generation of objects in a defer list. Then on the next cycle it would
increment the version counter, wait until all the threads' prior versions
were zero, delete the defer list, and transfer the current gen to
the defer list.

It went something like that.
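Fleshed out, that two-slot scheme might look roughly like this in C++. This is a sketch with hypothetical names; the asymmetric barrier and the deferred-deletion bookkeeping are omitted:

```cpp
#include <atomic>
#include <cstdint>

// Sketch of the two-version-counter scheme described above; names
// are hypothetical and the polling thread's barrier is omitted.
std::atomic<uint64_t> global_version{1};   // bumped only by the poller

struct per_thread {
    std::atomic<uint64_t> m_version[2];

    per_thread() { m_version[0] = 0; m_version[1] = 0; }

    uint64_t acquire() {
        uint64_t ver = global_version.load(std::memory_order_acquire);
        m_version[ver % 2].store(ver, std::memory_order_relaxed);
        return ver;
    }

    void release(uint64_t ver) {
        m_version[ver % 2].store(0, std::memory_order_release);
    }

    // Poller side: has this thread released the generation with the
    // given parity?
    bool quiescent(uint64_t parity) const {
        return m_version[parity % 2].load(std::memory_order_acquire) == 0;
    }
};
```

The poller would bump global_version, then spin until quiescent() held for the prior parity on every registered thread before freeing that generation's defer list.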
Chris M. Thomasson
2024-10-18 00:24:08 UTC
Post by Chris M. Thomasson
[...]
It went something like that.
IIRC, I was using FlushProcessWriteBuffers back then as an asymmetric
barrier for my experiment. The polling thread would execute one after it
incremented the global version. Actually, I can't remember exactly where
I placed it, before or after. The defer list made things work.
jseigh
2024-10-18 12:07:11 UTC
Post by Chris M. Thomasson
[...]
Interesting. From a quick view, it kind of reminds me of a distributed
seqlock for some reason. Are you using an asymmetric membar in here,
in smr_poll?
Yes, linux membarrier() in smr_poll.

Not a seqlock, not least because exiting the critical region
would take 3 instructions, unless you use atomics, which are expensive
and usually have memory barriers.
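For contrast, a typical seqlock read side (a generic sketch, not smrproxy code) has to re-check the sequence number on exit and retry if a writer intervened:

```cpp
#include <atomic>

// Generic seqlock read side, for contrast with the proxy's
// store-only unlock; not smrproxy code.
std::atomic<unsigned> seq{0};   // even = no writer active
int shared_value = 42;          // the protected data

int seqlock_read() {
    unsigned s0, s1;
    int v;
    do {
        s0 = seq.load(std::memory_order_acquire);
        v = shared_value;                               // speculative read
        std::atomic_thread_fence(std::memory_order_acquire);
        s1 = seq.load(std::memory_order_relaxed);
    } while ((s0 & 1) || s0 != s1);                     // writer active or raced
    return v;
}
```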

A lot of the qsbr and ebr reader lock/unlock code is going to look
somewhat similar so you have to know how the reclaim logic uses it.
In this case I am slingshotting off of the asymmetric memory barrier.

Earlier at one point I was going to have smrproxy use hazard pointer
logic or qsbr logic as a config option, but the extra code complexity
and the fact that qsbr required 2 grace periods kind of made that
unfeasible. The qsbr logic was mostly ripped out but there were still
some pieces there.

Anyway I'm working on a c++ version, which involves a lot of extra work
besides just rewriting smrproxy. There's coming up with an api for
proxies, and testcases, which tend to be more work than the code
they're testing.

Joe Seigh
Chris M. Thomasson
2024-10-25 22:00:15 UTC
Post by jseigh
[...]
Damn! I almost missed this post! Fucking Thunderbird... Will get back to
you. Working on something else right now Joe, thanks.

https://www.facebook.com/share/p/ydGSuPLDxjkY9TAQ/
jseigh
2024-10-25 22:56:16 UTC
Post by Chris M. Thomasson
[...]
Damn! I almost missed this post! Fucking Thunderbird... Will get back to
you. Working on something else right now Joe, thanks.
https://www.facebook.com/share/p/ydGSuPLDxjkY9TAQ/
No problem. The c++ work is progressing pretty slowly, not least in
part because the documentation is not always clear as to what
something does or even what problem it is supposed to solve.
To think I took a pass on Rust because I thought it was
more complicated than it needed to be.

Joe Seigh
Chris M. Thomasson
2024-10-27 19:33:46 UTC
Post by jseigh
[...]
No problem.  The c++ work is progressing pretty slowly, not least in
part because the documentation is not always clear as to what
something does or even what problem it is supposed to solve.
To think I took a pass on Rust because I thought it was
more complicated than it needed to be.
Never even tried Rust, shit, I am behind the times. ;^)

Humm... I don't think we can get 100% C++ because of the damn asymmetric
membar for these rather "specialized" algorithms?

Is C++ thinking about creating a standard way to gain an asymmetric membar?
jseigh
2024-10-27 22:29:43 UTC
Post by Chris M. Thomasson
[...]
Never even tried Rust, shit, I am behind the times. ;^)
Humm... I don't think we can get 100% C++ because of the damn asymmetric
membar for these rather "specialized" algorithms?
Is C++ thinking about creating a standard way to gain an asymmetric membar?
I don't think so.  It's platform dependent.  Apart from Linux, mostly
it's done with a call to some virtual memory function that flushes
the TLBs (translation lookaside buffers), which involves IPI calls
to all the processors, and those have memory barriers.  This is
old; patent 3,947,823, from 1973, is cited by the patent I did.

Anyway, I version the code so there's an asymmetric memory barrier
version and an explicit memory barrier version, the latter
being much slower.
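On Linux the heavy side of such a pair can be sketched with membarrier(2). This is a hedged example, not the smrproxy source; it falls back to an ordinary seq_cst fence when the syscall is unavailable:

```cpp
#include <atomic>

#if defined(__linux__)
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>
#endif

// Heavy side of an asymmetric barrier pair, as a reclaimer like
// smr_poll might use it: force a barrier on every running thread of
// the process, so readers need only compiler-level ordering.
// Not the actual smrproxy code.
static void asymmetric_heavy_barrier() {
#if defined(__linux__)
    static const bool registered =
        syscall(SYS_membarrier,
                MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0, 0) == 0;
    if (registered &&
        syscall(SYS_membarrier,
                MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0, 0) == 0)
        return;
#endif
    // Fallback: an explicit full fence on this thread only (the slow,
    // symmetric scheme mentioned above).
    std::atomic_thread_fence(std::memory_order_seq_cst);
}
```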

Joe Seigh
Chris M. Thomasson
2024-10-27 22:32:33 UTC
Post by jseigh
[...]
Anyway, I version the code so there's an asymmetric memory barrier
version and an explicit memory barrier version, the latter
being much slower.
Ahh, nice! acquire/release, no seq_cst, right? ;^)
jseigh
2024-10-28 00:35:54 UTC
Post by Chris M. Thomasson
[...]
Ahh, nice! acquire/release, no seq_cst, right? ;^)
The membar version? That's a store/load membar so it is expensive.
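The store/load point can be made concrete: acquire/release cannot order a reader's publish store before a later load, so the explicit-membar version needs a full fence. A hypothetical sketch (names invented, not the smrproxy source):

```cpp
#include <atomic>
#include <cstdint>

// Why the explicit-membar version needs a store/load barrier: the
// reader publishes its epoch and must then re-load the global epoch
// to validate it. Hypothetical names, not the smrproxy source.
std::atomic<uint64_t> global_epoch{1};
std::atomic<uint64_t> local_epoch{0};

uint64_t lock_with_explicit_membar() {
    uint64_t e = global_epoch.load(std::memory_order_relaxed);
    local_epoch.store(e, std::memory_order_relaxed);
    // acquire/release cannot order the store above before the load
    // below; only a full (store/load) fence can, and that full fence
    // is what makes this version expensive.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return global_epoch.load(std::memory_order_relaxed);  // validate
}
```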
Chris M. Thomasson
2024-10-28 04:02:58 UTC
Post by Chris M. Thomasson
Post by Chris M. Thomasson
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
I replaced the hazard pointer logic in smrproxy.  It's now
wait- free
instead of mostly wait-free.  The reader lock logic after loading
the address of the reader lock object into a register is now 2
instructions a load followed by a store.  The unlock is same
as before, just a store.
It's way faster now.
It's on the feature/003 branch as a POC.   I'm working on porting
it to c++ and don't want to waste any more time on c version.
No idea of it's a new algorithm.  I suspect that since I use
the term epoch that it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
I have to take a look at it! Been really busy lately. Shit happens.
There's a quick and dirty explanation at
http://threadnought.wordpress.com/
repo at https://github.com/jseigh/smrproxy
I'll need to create some memory access diagrams that
visualize how it works at some point.
Anyway if it's new, another algorithm to use without
attribution.
Interesting. From a quick view, it kind of reminds me of a
distributed seqlock for some reason. Are you using an asymmetric
membar in here? in smr_poll ?
Yes, linux membarrier() in smr_poll.
Not seqlock, not least for the reason that exiting the critical region
is 3 instructions unless you use atomics which are expensive and have
memory barriers usually.
A lot of the qsbr and ebr reader lock/unlock code is going to look
somewhat similar so you have to know how the reclaim logic uses it.
In this case I am slingshotting off of the asymmetric memory barrier.
Earlier at one point I was going to have smrproxy use hazard pointer
logic or qsbr logic as a config option, but the extra code complexity
and the fact that qsbr required 2 grace periods kind of made that
unfeasible.  The qsbr logic was mostly ripped out but there were still
some pieces there.
Anyway I'm working on a c++ version, which involves a lot of extra work
besides just rewriting smrproxy.  There's coming up with an api for
proxies, and testcases, which tend to be more work than the code that
they are testing.
Damn! I almost missed this post! Fucking Thunderbird... Will get
back to you. Working on something else right now Joe, thanks.
https://www.facebook.com/share/p/ydGSuPLDxjkY9TAQ/
No problem.  The c++ work is progressing pretty slowly, not least
because the documentation is not always clear as to what
something does or even what problem it is supposed to solve.
To think I took a pass on Rust because I thought it was
more complicated than it needed to be.
Never even tried Rust, shit, I am behind the times. ;^)
Humm... I don't think we can get 100% C++ because of the damn
asymmetric membar for these rather "specialized" algorithms?
Is C++ thinking about creating a standard way to gain an asymmetric membar?
I don't think so.  It's platform dependent.  Apart from linux, mostly
it's done with a call to some virtual memory function that flushes
the TLBs (translation lookaside buffers) which involves IPI calls
to all the processors and those have memory barriers.  This is
old, 1973, patent 3,947,823 cited by the patent I did.
Anyway, I version the code so there's an asymmetric memory barrier
version and an explicit memory barrier version, the latter
being much slower.
Ahh, nice! acquire/release, no seq_cst, right? ;^)
The membar version?  That's a store/load membar so it is expensive.
I was wondering in your c++ version if you had to use any seq_cst
barriers. I think acquire/release should be good enough. Now, when I say
C++, I mean pure C++, no calls to FlushProcessWriteBuffers and things
like that.

I take it that your pure C++ version has no atomic RMW, right? Just
loads and stores?
jseigh
2024-10-28 11:45:15 UTC
Reply
Permalink
Post by Chris M. Thomasson
The membar version?  That's a store/load membar so it is expensive.
I was wondering in your c++ version if you had to use any seq_cst
barriers. I think acquire/release should be good enough. Now, when I say
C++, I mean pure C++, no calls to FlushProcessWriteBuffers and things
like that.
I take it that your pure C++ version has no atomic RMW, right? Just
loads and stores?
While a lock action has acquire memory order semantics, if the
implementation has internal stores, you have to ensure those stores
are complete before any access from the critical section.
So you may need a store/load memory barrier.

For cmpxchg, it has full seq_cst. For other rmw atomics I don't
know. I have to ask on c.a. I think some data dependency and/or
control dependency might factor in.

Joe Seigh
Chris M. Thomasson
2024-10-28 21:57:23 UTC
Reply
Permalink
Post by jseigh
Post by Chris M. Thomasson
The membar version?  That's a store/load membar so it is expensive.
I was wondering in your c++ version if you had to use any seq_cst
barriers. I think acquire/release should be good enough. Now, when I
say C++, I mean pure C++, no calls to FlushProcessWriteBuffers and
things like that.
I take it that your pure C++ version has no atomic RMW, right? Just
loads and stores?
While a lock action has acquire memory order semantics, if the
implementation has internal stores, you have to those stores
are complete before any access from the critical section.
So you may need a store/load memory barrier.
Wrt acquiring a lock, the only class of mutex logic that comes to mind
that requires an explicit storeload-style membar is Peterson's, and some
others along those lines, so to speak. This is for the store-and-load
version. Now, RMW on x86 basically implies a StoreLoad wrt the LOCK
prefix, XCHG aside, for it has an implied LOCK prefix. For instance the
original SMR algo requires a storeload as-is on x86/x64: MFENCE or LOCK
prefix.

Fwiw, my experimental pure C++ proxy works fine with XADD, or atomic
fetch-add. It needs explicit membars (though no #StoreLoad) on SPARC in
RMO mode. On x86, the LOCK prefix handles that wrt the RMW's themselves.
This is a lot different than using stores and loads. The original SMR
and Peterson's algos need that "store followed by a load to a different
location" action to hold true, aka, storeload...

Now, I don't think that a data-dependent load can act like a storeload.
I thought that they act sort of like an acquire, aka #LoadStore |
#LoadLoad wrt SPARC. SPARC in RMO mode honors data dependencies. Now,
the DEC Alpha is a different story... ;^)
Post by jseigh
For cmpxchg, it has full cst_seq.  For other rmw atomics I don't
know.  I have to ask on c.a.  I think some data dependency and/or
control dependency might factor in.
Joe Seigh
Chris M. Thomasson
2024-10-28 22:09:39 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
The membar version?  That's a store/load membar so it is expensive.
I was wondering in your c++ version if you had to use any seq_cst
barriers. I think acquire/release should be good enough. Now, when I
say C++, I mean pure C++, no calls to FlushProcessWriteBuffers and
things like that.
I take it that your pure C++ version has no atomic RMW, right? Just
loads and stores?
While a lock action has acquire memory order semantics, if the
implementation has internal stores, you have to those stores
are complete before any access from the critical section.
So you may need a store/load memory barrier.
Wrt acquiring a lock the only class of mutex logic that comes to mind
that requires an explicit storeload style membar is Petersons, and some
others along those lines, so to speak. This is for the store and load
version. Now, RMW on x86 basically implies a StoreLoad wrt the LOCK
prefix, XCHG aside for it has an implied LOCK prefix. For instance the
original SMR algo requires a storeload as is on x86/x64. MFENCE or LOCK
prefix.
Fwiw, my experimental pure C++ proxy works fine with XADD, or atomic
fetch-add. It needs an explicit membars (no #StoreLoad) on SPARC in RMO
mode. On x86, the LOCK prefix handles that wrt the RMW's themselves.
The fun part is that since it's pure C++, those membars are going to be
emitted for me no matter what arch it's compiling for.
Post by Chris M. Thomasson
This is a lot different than using stores and loads. The original SMR
and Peterson's algo needs that "store followed by a load to a different
location" action to hold true, aka, storeload...
Now, I don't think that a data-dependant load can act like a storeload.
I thought that they act sort of like an acquire, aka #LoadStore |
#LoadLoad wrt SPARC. SPARC in RMO mode honors data-dependencies. Now,
the DEC Alpha is a different story... ;^)
Post by jseigh
For cmpxchg, it has full cst_seq.  For other rmw atomics I don't
know.  I have to ask on c.a.  I think some data dependency and/or
control dependency might factor in.
Joe Seigh
jseigh
2024-10-29 01:17:55 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
The membar version?  That's a store/load membar so it is expensive.
I was wondering in your c++ version if you had to use any seq_cst
barriers. I think acquire/release should be good enough. Now, when I
say C++, I mean pure C++, no calls to FlushProcessWriteBuffers and
things like that.
I take it that your pure C++ version has no atomic RMW, right? Just
loads and stores?
While a lock action has acquire memory order semantics, if the
implementation has internal stores, you have to those stores
are complete before any access from the critical section.
So you may need a store/load memory barrier.
Wrt acquiring a lock the only class of mutex logic that comes to mind
that requires an explicit storeload style membar is Petersons, and some
others along those lines, so to speak. This is for the store and load
version. Now, RMW on x86 basically implies a StoreLoad wrt the LOCK
prefix, XCHG aside for it has an implied LOCK prefix. For instance the
original SMR algo requires a storeload as is on x86/x64. MFENCE or LOCK
prefix.
Fwiw, my experimental pure C++ proxy works fine with XADD, or atomic
fetch-add. It needs an explicit membars (no #StoreLoad) on SPARC in RMO
mode. On x86, the LOCK prefix handles that wrt the RMW's themselves.
This is a lot different than using stores and loads. The original SMR
and Peterson's algo needs that "store followed by a load to a different
location" action to hold true, aka, storeload...
Now, I don't think that a data-dependant load can act like a storeload.
I thought that they act sort of like an acquire, aka #LoadStore |
#LoadLoad wrt SPARC. SPARC in RMO mode honors data-dependencies. Now,
the DEC Alpha is a different story... ;^)
fwiw, here's the lock and unlock logic from smrproxy rewrite

inline void lock()
{
    epoch_t _epoch = shadow_epoch.load(std::memory_order_relaxed);
    _ref_epoch.store(_epoch, std::memory_order_relaxed);
    std::atomic_signal_fence(std::memory_order_acquire);
}

inline void unlock()
{
    _ref_epoch.store(0, std::memory_order_release);
}

epoch_t is interesting.  It's uint64_t but handles wrapped
compares, i.e. for an epoch_t x1 and uint64_t n

x1 < (x1 + n)

for any value of x1 and any value of n from 0 to 2**63;
eg.
0xfffffffffffffff0 < 0x0000000000000001


The rewrite is almost complete except for some thread_local
stuff. I think I might break off there. Most of the
additional work is writing the test code. I'm considering
rewriting it in Rust.

Joe Seigh
Chris M. Thomasson
2024-10-29 04:35:33 UTC
Reply
Permalink
Post by jseigh
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
The membar version?  That's a store/load membar so it is expensive.
I was wondering in your c++ version if you had to use any seq_cst
barriers. I think acquire/release should be good enough. Now, when I
say C++, I mean pure C++, no calls to FlushProcessWriteBuffers and
things like that.
I take it that your pure C++ version has no atomic RMW, right? Just
loads and stores?
While a lock action has acquire memory order semantics, if the
implementation has internal stores, you have to those stores
are complete before any access from the critical section.
So you may need a store/load memory barrier.
Wrt acquiring a lock the only class of mutex logic that comes to mind
that requires an explicit storeload style membar is Petersons, and
some others along those lines, so to speak. This is for the store and
load version. Now, RMW on x86 basically implies a StoreLoad wrt the
LOCK prefix, XCHG aside for it has an implied LOCK prefix. For
instance the original SMR algo requires a storeload as is on x86/x64.
MFENCE or LOCK prefix.
Fwiw, my experimental pure C++ proxy works fine with XADD, or atomic
fetch-add. It needs an explicit membars (no #StoreLoad) on SPARC in
RMO mode. On x86, the LOCK prefix handles that wrt the RMW's
themselves. This is a lot different than using stores and loads. The
original SMR and Peterson's algo needs that "store followed by a load
to a different location" action to hold true, aka, storeload...
Now, I don't think that a data-dependant load can act like a
storeload. I thought that they act sort of like an acquire, aka
#LoadStore | #LoadLoad wrt SPARC. SPARC in RMO mode honors data-
dependencies. Now, the DEC Alpha is a different story... ;^)
fwiw, here's the lock and unlock logic from smrproxy rewrite
    inline void lock()
    {
        epoch_t _epoch = shadow_epoch.load(std::memory_order_relaxed);
        _ref_epoch.store(_epoch, std::memory_order_relaxed);
        std::atomic_signal_fence(std::memory_order_acquire);
^^^^^^^^^^^^^^^^^^^^^^
Post by jseigh
    }
Still don't know how your pure C++ write up can handle this without an
std::atomic_thread_fence(std::memory_order_acquire).
Post by jseigh
    inline void unlock()
    {
        _ref_epoch.store(0, std::memory_order_release);
    }
epoch_t is interesting.  It's uint64_t but handles wrapped
compares, ie. for an epoch_t x1 and uint64_t n
    x1 < (x1 + n)
for any value of x1 and any value of n from 0 to 2**63;
eg.
   0xfffffffffffffff0 < 0x0000000000000001
For some reason it reminds me of your atomic 63 bit counter from a while
back. If your smr_poll thread is the only thread that can increment the
epoch, then it will be able to detect rolling to zero, right? For one of
my older proxys the thread would have two version counters:

per_thread
{
    word m_ver[2];

    lock()
    {
        word ver = g_version;
        m_ver[ver % 2] = ver;
    }

    unlock(word ver)
    {
        m_ver[ver % 2] = 0;
    }
}

So, it's different logic. The only way I got around using memory
barriers (compiler barriers aside for a moment), here was for the
polling thread to use an asymmetric membar. For a pure C++ version, I
think it would require:


per_thread
{
    word m_ver[2];

    lock()
    {
        word ver = g_version;
        m_ver[ver % 2] = ver;
        #LoadStore | #LoadLoad;
    }

    unlock(word ver)
    {
        #LoadStore | #StoreStore;
        m_ver[ver % 2] = 0;
    }
}

I think.... Humm...
Post by jseigh
The rewrite is almost complete except for some thread_local
stuff.  I think I might break off there.  Most of the
additional work is writing the test code.  I'm considering
rewriting it in Rust.
Joe Seigh
jseigh
2024-10-29 11:23:56 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by jseigh
fwiw, here's the lock and unlock logic from smrproxy rewrite
     inline void lock()
     {
         epoch_t _epoch = shadow_epoch.load(std::memory_order_relaxed);
         _ref_epoch.store(_epoch, std::memory_order_relaxed);
         std::atomic_signal_fence(std::memory_order_acquire);
^^^^^^^^^^^^^^^^^^^^^^
Post by jseigh
     }
Still don't know how your pure C++ write up can handle this without an
std::atomic_thread_fence(std::memory_order_acquire).
No thread fence is necessary. The loads can move before
the store. They just can't move before the async
membar. After that membar any previously retired
objects are no longer reachable.
Post by Chris M. Thomasson
Post by jseigh
     inline void unlock()
     {
         _ref_epoch.store(0, std::memory_order_release);
     }
Chris M. Thomasson
2024-10-29 21:51:58 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by jseigh
fwiw, here's the lock and unlock logic from smrproxy rewrite
     inline void lock()
     {
         epoch_t _epoch = shadow_epoch.load(std::memory_order_relaxed);
         _ref_epoch.store(_epoch, std::memory_order_relaxed);
         std::atomic_signal_fence(std::memory_order_acquire);
^^^^^^^^^^^^^^^^^^^^^^
Post by jseigh
     }
Still don't know how your pure C++ write up can handle this without an
std::atomic_thread_fence(std::memory_order_acquire).
No thread fence is necessary.  The loads can move before
the store.  They just can't move before the async
membar.  After that membar any previously retired
objects are no longer reachable.
Ahhhh. I thought your C++ version is going to be one that did not use an
asymmetric membar in order to make it so called "100%" pure... Not sure
about C++ including one in the actual standard. If they did I think it
would have some info about it, perhaps akin to:

https://en.cppreference.com/w/cpp/atomic/atomic/is_lock_free

Well, even then, if an asymmetric membar was not available on the arch
the code is being compiled for, it can say, well, shit happens! Then
allow you to fall back to another version...?

Also, the C++ wording of this is interesting:

https://en.cppreference.com/w/cpp/atomic/atomic/is_always_lock_free

wrt:
_____________
​0​ for the built-in atomic types that are never lock-free,
1 for the built-in atomic types that are sometimes lock-free,
2 for the built-in atomic types that are always lock-free.
_____________

That's fun... ;^)
Post by Chris M. Thomasson
Post by jseigh
     inline void unlock()
     {
         _ref_epoch.store(0, std::memory_order_release);
     }
Chris M. Thomasson
2024-10-29 04:38:09 UTC
Reply
Permalink
Post by jseigh
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
The membar version?  That's a store/load membar so it is expensive.
I was wondering in your c++ version if you had to use any seq_cst
barriers. I think acquire/release should be good enough. Now, when I
say C++, I mean pure C++, no calls to FlushProcessWriteBuffers and
things like that.
I take it that your pure C++ version has no atomic RMW, right? Just
loads and stores?
While a lock action has acquire memory order semantics, if the
implementation has internal stores, you have to those stores
are complete before any access from the critical section.
So you may need a store/load memory barrier.
Wrt acquiring a lock the only class of mutex logic that comes to mind
that requires an explicit storeload style membar is Petersons, and
some others along those lines, so to speak. This is for the store and
load version. Now, RMW on x86 basically implies a StoreLoad wrt the
LOCK prefix, XCHG aside for it has an implied LOCK prefix. For
instance the original SMR algo requires a storeload as is on x86/x64.
MFENCE or LOCK prefix.
Fwiw, my experimental pure C++ proxy works fine with XADD, or atomic
fetch-add. It needs an explicit membars (no #StoreLoad) on SPARC in
RMO mode. On x86, the LOCK prefix handles that wrt the RMW's
themselves. This is a lot different than using stores and loads. The
original SMR and Peterson's algo needs that "store followed by a load
to a different location" action to hold true, aka, storeload...
Now, I don't think that a data-dependant load can act like a
storeload. I thought that they act sort of like an acquire, aka
#LoadStore | #LoadLoad wrt SPARC. SPARC in RMO mode honors data-
dependencies. Now, the DEC Alpha is a different story... ;^)
fwiw, here's the lock and unlock logic from smrproxy rewrite
    inline void lock()
    {
        epoch_t _epoch = shadow_epoch.load(std::memory_order_relaxed);
        _ref_epoch.store(_epoch, std::memory_order_relaxed);
        std::atomic_signal_fence(std::memory_order_acquire);
    }
    inline void unlock()
    {
        _ref_epoch.store(0, std::memory_order_release);
    }
epoch_t is interesting.  It's uint64_t but handles wrapped
compares, ie. for an epoch_t x1 and uint64_t n
Only your single polling thread can mutate the shadow_epoch, right?
Post by jseigh
    x1 < (x1 + n)
for any value of x1 and any value of n from 0 to 2**63;
eg.
   0xfffffffffffffff0 < 0x0000000000000001
The rewrite is almost complete except for some thread_local
stuff.  I think I might break off there.  Most of the
additional work is writing the test code.  I'm considering
rewriting it in Rust.
jseigh
2024-10-29 11:27:40 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by jseigh
fwiw, here's the lock and unlock logic from smrproxy rewrite
     inline void lock()
     {
         epoch_t _epoch = shadow_epoch.load(std::memory_order_relaxed);
         _ref_epoch.store(_epoch, std::memory_order_relaxed);
         std::atomic_signal_fence(std::memory_order_acquire);
     }
     inline void unlock()
     {
         _ref_epoch.store(0, std::memory_order_release);
     }
epoch_t is interesting.  It's uint64_t but handles wrapped
compares, ie. for an epoch_t x1 and uint64_t n
Only your single polling thread can mutate the shadow_epoch, right?
Yes. It's just an optimization. The reader threads could read
from the global epoch but it would be in a separate cache line
and be an extra dependent load. So one dependent load and
same cache line.
Chris M. Thomasson
2024-10-29 21:56:16 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by jseigh
fwiw, here's the lock and unlock logic from smrproxy rewrite
     inline void lock()
     {
         epoch_t _epoch = shadow_epoch.load(std::memory_order_relaxed);
         _ref_epoch.store(_epoch, std::memory_order_relaxed);
         std::atomic_signal_fence(std::memory_order_acquire);
     }
     inline void unlock()
     {
         _ref_epoch.store(0, std::memory_order_release);
     }
epoch_t is interesting.  It's uint64_t but handles wrapped
compares, ie. for an epoch_t x1 and uint64_t n
Only your single polling thread can mutate the shadow_epoch, right?
Yes.  It's just an optimization. The reader threads could read
from the global epoch but it would be in a separate cache line
and be an extra dependent load.  So one dependent load and
same cache line.
Are you taking advantage of the fancy alignment capabilities of C++?

https://en.cppreference.com/w/cpp/language/alignas

and friends? They seem to work fine wrt the last time I checked them.

It's nice to have a standard way to pad and align on cache line
boundaries. :^)
jseigh
2024-10-30 16:36:45 UTC
Reply
Permalink
Post by Chris M. Thomasson
Yes.  It's just an optimization. The reader threads could read
from the global epoch but it would be in a separate cache line
and be an extra dependent load.  So one dependent load and
same cache line.
Are you taking advantage of the fancy alignment capabilities of C++?
https://en.cppreference.com/w/cpp/language/alignas
and friends? They seem to work fine wrt the last time I checked them.
It's nice to have a standard way to pad and align on cache line
boundaries. :^)
It's target processor dependent.  You need to query the cache line size
and pass it to the compile as a define variable.

There's supposed to be a built-in define that would be target-system
dependent, but it's not implemented.
Chris M. Thomasson
2024-11-01 19:20:39 UTC
Reply
Permalink
Post by Chris M. Thomasson
Yes.  It's just an optimization. The reader threads could read
from the global epoch but it would be in a separate cache line
and be an extra dependent load.  So one dependent load and
same cache line.
Are you taking advantage of the fancy alignment capabilities of C++?
https://en.cppreference.com/w/cpp/language/alignas
and friends? They seem to work fine wrt the last time I checked them.
It's nice to have a standard way to pad and align on cache line
boundaries. :^)
It's target processor dependent.  You need to query the cache line size
and pass to compile as a define variable.
There's supposed to be built in define that would be target system
dependent but it's not implemented.
What about:

https://en.cppreference.com/w/cpp/thread/hardware_destructive_interference_size

?
Chris M. Thomasson
2024-10-29 04:41:12 UTC
Reply
Permalink
Post by jseigh
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
The membar version?  That's a store/load membar so it is expensive.
I was wondering in your c++ version if you had to use any seq_cst
barriers. I think acquire/release should be good enough. Now, when I
say C++, I mean pure C++, no calls to FlushProcessWriteBuffers and
things like that.
I take it that your pure C++ version has no atomic RMW, right? Just
loads and stores?
While a lock action has acquire memory order semantics, if the
implementation has internal stores, you have to those stores
are complete before any access from the critical section.
So you may need a store/load memory barrier.
Wrt acquiring a lock the only class of mutex logic that comes to mind
that requires an explicit storeload style membar is Petersons, and
some others along those lines, so to speak. This is for the store and
load version. Now, RMW on x86 basically implies a StoreLoad wrt the
LOCK prefix, XCHG aside for it has an implied LOCK prefix. For
instance the original SMR algo requires a storeload as is on x86/x64.
MFENCE or LOCK prefix.
Fwiw, my experimental pure C++ proxy works fine with XADD, or atomic
fetch-add. It needs an explicit membars (no #StoreLoad) on SPARC in
RMO mode. On x86, the LOCK prefix handles that wrt the RMW's
themselves. This is a lot different than using stores and loads. The
original SMR and Peterson's algo needs that "store followed by a load
to a different location" action to hold true, aka, storeload...
Now, I don't think that a data-dependant load can act like a
storeload. I thought that they act sort of like an acquire, aka
#LoadStore | #LoadLoad wrt SPARC. SPARC in RMO mode honors data-
dependencies. Now, the DEC Alpha is a different story... ;^)
fwiw, here's the lock and unlock logic from smrproxy rewrite
    inline void lock()
    {
        epoch_t _epoch = shadow_epoch.load(std::memory_order_relaxed);
        _ref_epoch.store(_epoch, std::memory_order_relaxed);
        std::atomic_signal_fence(std::memory_order_acquire);
    }
    inline void unlock()
    {
        _ref_epoch.store(0, std::memory_order_release);
    }
epoch_t is interesting.  It's uint64_t but handles wrapped
compares, ie. for an epoch_t x1 and uint64_t n
[...]

Humm... I am not sure if it would work with just the release. The
polling thread would read from these per thread epochs, _ref_epoch,
using an acquire barrier? Still. Not sure if that would work. Need to
put my thinking cap on. ;^)
Chris M. Thomasson
2024-10-29 22:05:47 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
The membar version?  That's a store/load membar so it is expensive.
I was wondering in your c++ version if you had to use any seq_cst
barriers. I think acquire/release should be good enough. Now, when
I say C++, I mean pure C++, no calls to FlushProcessWriteBuffers
and things like that.
I take it that your pure C++ version has no atomic RMW, right? Just
loads and stores?
While a lock action has acquire memory order semantics, if the
implementation has internal stores, you have to those stores
are complete before any access from the critical section.
So you may need a store/load memory barrier.
Wrt acquiring a lock the only class of mutex logic that comes to mind
that requires an explicit storeload style membar is Petersons, and
some others along those lines, so to speak. This is for the store and
load version. Now, RMW on x86 basically implies a StoreLoad wrt the
LOCK prefix, XCHG aside for it has an implied LOCK prefix. For
instance the original SMR algo requires a storeload as is on x86/x64.
MFENCE or LOCK prefix.
Fwiw, my experimental pure C++ proxy works fine with XADD, or atomic
fetch-add. It needs an explicit membars (no #StoreLoad) on SPARC in
RMO mode. On x86, the LOCK prefix handles that wrt the RMW's
themselves. This is a lot different than using stores and loads. The
original SMR and Peterson's algo needs that "store followed by a load
to a different location" action to hold true, aka, storeload...
Now, I don't think that a data-dependant load can act like a
storeload. I thought that they act sort of like an acquire, aka
#LoadStore | #LoadLoad wrt SPARC. SPARC in RMO mode honors data-
dependencies. Now, the DEC Alpha is a different story... ;^)
fwiw, here's the lock and unlock logic from smrproxy rewrite
     inline void lock()
     {
         epoch_t _epoch = shadow_epoch.load(std::memory_order_relaxed);
         _ref_epoch.store(_epoch, std::memory_order_relaxed);
         std::atomic_signal_fence(std::memory_order_acquire);
     }
     inline void unlock()
     {
         _ref_epoch.store(0, std::memory_order_release);
     }
epoch_t is interesting.  It's uint64_t but handles wrapped
compares, ie. for an epoch_t x1 and uint64_t n
[...]
Humm... I am not sure if it would work with just the release. The
polling thread would read from these per thread epochs, _ref_epoch,
using an acquire barrier? Still. Not sure if that would work. Need to
put my thinking cap on. ;^)
Ahhh, if you are using an async membar in your upcoming C++ version,
then it would be fine. No problem. A compiler fence a la
atomic_signal_fence, and the explicit release, well, it will work. I
don't see why it would not work.

For some reason, I thought you were going to not use an async membar in
your C++ version. Sorry. However, it still would be fun to test
against... ;^)
jseigh
2024-10-30 16:39:38 UTC
Reply
Permalink
Post by Chris M. Thomasson
Ahhh, if you are using an async membar in your upcoming C++ version,
then it would be fine. No problem. A compiler fence ala
atomic_signal_fence, and the the explicit release, well, it will work. I
don't see why it would not work.
For some reason, I thought you were going to not use an async membar in
your C++ version. Sorry. However, it still would be fun to test
against... ;^)
The C version has both versions.  The C++ version does only the
async membar version.  But I'm not publishing that code so it's
a moot point.
Chris M. Thomasson
2024-11-04 05:14:37 UTC
Reply
Permalink
Post by Chris M. Thomasson
Ahhh, if you are using an async membar in your upcoming C++ version,
then it would be fine. No problem. A compiler fence ala
atomic_signal_fence, and the the explicit release, well, it will work.
I don't see why it would not work.
For some reason, I thought you were going to not use an async membar
in your C++ version. Sorry. However, it still would be fun to test
against... ;^)
The C version has both versions.  The C++ version does only the
async member version.  But I'm not publishing that code so it's
a moot point.
I got side tracked with more heavy math. The problem with C++ code that
uses an async memory barrier is that it's automatically rendered into a
non-portable state... Yikes! Imvvvvvho, C/C++ should think about
including them in some future standard. It would be nice. Well, for us
at least! ;^)
jseigh
2024-11-04 12:46:37 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by Chris M. Thomasson
Ahhh, if you are using an async membar in your upcoming C++ version,
then it would be fine. No problem. A compiler fence ala
atomic_signal_fence, and the the explicit release, well, it will
work. I don't see why it would not work.
For some reason, I thought you were going to not use an async membar
in your C++ version. Sorry. However, it still would be fun to test
against... ;^)
The C version has both versions.  The C++ version does only the
async member version.  But I'm not publishing that code so it's
a moot point.
I got side tracked with more heavy math. The problem with C++ code that
uses an async memory barrier is that its automatically rendered into a
non-portable state... Yikes! Imvvvvvho, C/C++ should think about
including them in some future standard. It would be nice. Well, for us
at least! ;^)
That's never going to happen.  DWCAS has been around for more than
50 years and c++ doesn't support it and probably never will.
You can't write lock-free queues that are ABA-free and
performant without it.  So async memory barriers won't
happen any time soon either.

Long term I think c++ will fade into irrelevance along with
all the other programming languages based on an imperfect
knowledge of concurrency, which is basically all of them
right now.
M***@DastartdlyHQ.org
2024-11-04 14:09:14 UTC
Reply
Permalink
On Mon, 4 Nov 2024 07:46:37 -0500
Post by jseigh
Post by Chris M. Thomasson
Post by Chris M. Thomasson
Ahhh, if you are using an async membar in your upcoming C++ version,
then it would be fine. No problem. A compiler fence ala
atomic_signal_fence, and the the explicit release, well, it will
work. I don't see why it would not work.
For some reason, I thought you were going to not use an async membar
in your C++ version. Sorry. However, it still would be fun to test
against... ;^)
The C version has both versions.  The C++ version does only the
async member version.  But I'm not publishing that code so it's
a moot point.
I got side tracked with more heavy math. The problem with C++ code that
uses an async memory barrier is that its automatically rendered into a
non-portable state... Yikes! Imvvvvvho, C/C++ should think about
including them in some future standard. It would be nice. Well, for us
at least! ;^)
That's never going to happen. DWCAS has been around for more than
50 years and c++ doesn't support that and probably never will.
You can't write lock-free queues that are ABA free and
are performant without that. So async memory barriers won't
happen any time soon either.
Long term I think c++ will fade into irrelevance along with
all the other programming languages based on an imperfect
knowledge of concurrency, which is basically all of them
right now.
Given that most concurrent operating systems are written in these "imperfect"
languages, how does that square with your definition? And how would your
perfect language run on them?

Anyway, concurrency is the job of the OS, not the language. C++ threading is
just a wrapper around pthreads on *nix and Windows threads on Windows. The
language just needs an interface to the underlying OS functionality; it should
not try to implement the functionality itself, as it'll always be a hack.
Chris M. Thomasson
2024-11-04 20:42:53 UTC
Reply
Permalink
Post by M***@DastartdlyHQ.org
On Mon, 4 Nov 2024 07:46:37 -0500
Post by jseigh
Post by Chris M. Thomasson
Post by Chris M. Thomasson
Ahhh, if you are using an async membar in your upcoming C++ version,
then it would be fine. No problem. A compiler fence ala
atomic_signal_fence, and the the explicit release, well, it will
work. I don't see why it would not work.
For some reason, I thought you were going to not use an async membar
in your C++ version. Sorry. However, it still would be fun to test
against... ;^)
The C version has both versions.  The C++ version does only the
async member version.  But I'm not publishing that code so it's
a moot point.
I got side tracked with more heavy math. The problem with C++ code that
uses an async memory barrier is that its automatically rendered into a
non-portable state... Yikes! Imvvvvvho, C/C++ should think about
including them in some future standard. It would be nice. Well, for us
at least! ;^)
That's never going to happen. DWCAS has been around for more than
50 years and c++ doesn't support that and probably never will.
You can't write lock-free queues that are ABA free and
are performant without that. So async memory barriers won't
happen any time soon either.
Long term I think c++ will fade into irrelevance along with
all the other programming languages based on an imperfect
knowledge of concurrency, which is basically all of them
right now.
Given most concurrent operating systems are written in these "imperfect"
languages how does that square with your definition? And how would your
perfect language run on them?
Anyway, concurrency is the job of the OS, not the language. C++ threading is
just a wrapper around pthreads on *nix and windows threads on Windows. The
language just needs an interface to the underlying OS functionality, it should
not try and implement the functionality itself as it'll always be a hack.
A start would be C++ having an "always lock free" CAS for two contiguous
words on systems that support it, yes, even 64 bit. ala:

struct anchor {
word a;
word b;
};

cmpxchg8b for x86, cmpxchg16b for x64, etc...

https://www.felixcloutier.com/x86/cmpxchg8b:cmpxchg16b
Chris M. Thomasson
2024-11-09 21:51:26 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by M***@DastartdlyHQ.org
On Mon, 4 Nov 2024 07:46:37 -0500
Post by Chris M. Thomasson
Post by Chris M. Thomasson
Ahhh, if you are using an async membar in your upcoming C++ version,
then it would be fine. No problem. A compiler fence ala
atomic_signal_fence, and the the explicit release, well, it will
work. I don't see why it would not work.
For some reason, I thought you were going to not use an async membar
in your C++ version. Sorry. However, it still would be fun to test
against... ;^)
The C version has both versions.  The C++ version does only the
async member version.  But I'm not publishing that code so it's
a moot point.
I got side tracked with more heavy math. The problem with C++ code that
uses an async memory barrier is that its automatically rendered into a
non-portable state... Yikes! Imvvvvvho, C/C++ should think about
including them in some future standard. It would be nice. Well, for us
at least! ;^)
That's never going to happen.  DWCAS has been around for more than
50 years and c++ doesn't support that and probably never will.
You can't write lock-free queues that are ABA free and
are performant without that.  So async memory barriers won't
happen any time soon either.
Long term I think c++ will fade into irrelevance along with
all the other programming languages based on an imperfect
knowledge of concurrency, which is basically all of them
right now.
Given most concurrent operating systems are written in these "imperfect"
languages how does that square with your definition? And how would your
perfect language run on them?
Anyway, concurrency is the job of the OS, not the language. C++ threading is
just a wrapper around pthreads on *nix and windows threads on Windows. The
language just needs an interface to the underlying OS functionality, it should
not try and implement the functionality itself as it'll always be a hack.
A start would be C++ having an "always lock free" CAS for two contiguous
struct anchor {
    word a;
    word b;
};
Even better:

struct anchor {
void* a;
word b;
};

where sizeof(void*) = sizeof(word)... ;^)
Post by Chris M. Thomasson
cmpxchg8b for x86, cmpxchg16b for x64, ect...
https://www.felixcloutier.com/x86/cmpxchg8b:cmpxchg16b
Chris M. Thomasson
2024-12-12 08:43:23 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by Chris M. Thomasson
Ahhh, if you are using an async membar in your upcoming C++ version,
then it would be fine. No problem. A compiler fence ala
atomic_signal_fence, and the the explicit release, well, it will
work. I don't see why it would not work.
For some reason, I thought you were going to not use an async membar
in your C++ version. Sorry. However, it still would be fun to test
against... ;^)
The C version has both versions.  The C++ version does only the
async member version.  But I'm not publishing that code so it's
a moot point.
I got side tracked with more heavy math. The problem with C++ code
that uses an async memory barrier is that its automatically rendered
into a non-portable state... Yikes! Imvvvvvho, C/C++ should think
about including them in some future standard. It would be nice. Well,
for us at least! ;^)
That's never going to happen.  DWCAS has been around for more than
50 years and c++ doesn't support that and probably never will.
You can't write lock-free queues that are ABA free and
are performant without that.
Hummm... If I remember correctly, you said something about using a
simple atomic exchange to pop a whole list (lock-free stack), then
simple reversing the list to get a fifo order? Do you remember any of
that way back on c.p.t?
So async memory barriers won't
happen any time soon either.
Long term I think c++ will fade into irrelevance along with
all the other programming languages based on an imperfect
knowledge of concurrency, which is basically all of them
right now.
jseigh
2024-12-12 12:29:11 UTC
Reply
Permalink
Post by Chris M. Thomasson
Hummm... If I remember correctly, you said something about using a
simple atomic exchange to pop a whole list (lock-free stack), then
simple reversing the list to get a fifo order? Do you remember any of
that way back on c.p.t?
That kind of stuff pre-dates c.p.t. even.
Chris M. Thomasson
2024-12-12 21:55:38 UTC
Reply
Permalink
Post by jseigh
Post by Chris M. Thomasson
Hummm... If I remember correctly, you said something about using a
simple atomic exchange to pop a whole list (lock-free stack), then
simple reversing the list to get a fifo order? Do you remember any of
that way back on c.p.t?
That kind of stuff pre-dates c.p.t. even.
Has to. Although, was it not in the IBM Principles of Operation where
they describe their lock-free stack in an appendix, iirc... I cannot
remember the exact one right now. Iirc, it was under free pool
manipulation?
jseigh
2024-12-13 00:48:12 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
Hummm... If I remember correctly, you said something about using a
simple atomic exchange to pop a whole list (lock-free stack), then
simple reversing the list to get a fifo order? Do you remember any of
that way back on c.p.t?
That kind of stuff pre-dates c.p.t. even.
Has to. Although, it was not in the IBM principles of operation where
the describe their lock-free stack in an appendix, iirc... I cannot
remember the exact one right now. Iirc, it was under free pool
manipulation?
If you pop off a LIFO stack onto another LIFO, everything becomes FIFO.

Also, there's a doubly linked stack where the back links are lazily
initialized. That seems to get reinvented every so often.
Chris M. Thomasson
2024-12-26 04:36:12 UTC
Reply
Permalink
Post by jseigh
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
Hummm... If I remember correctly, you said something about using a
simple atomic exchange to pop a whole list (lock-free stack), then
simple reversing the list to get a fifo order? Do you remember any
of that way back on c.p.t?
That kind of stuff pre-dates c.p.t. even.
Has to. Although, it was not in the IBM principles of operation where
the describe their lock-free stack in an appendix, iirc... I cannot
remember the exact one right now. Iirc, it was under free pool
manipulation?
If you pop off a LIFO stack onto another LIFO, everything becomes FIFO.
Humm.... Merry Christmas!
Post by jseigh
Also, there's a doubly linked stack where the back links are lazily
initialized.  That seems to get reinvented every so often.
ahhhhh!

Chris M. Thomasson
2024-10-28 04:08:23 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
I replaced the hazard pointer logic in smrproxy.  It's now
wait- free
instead of mostly wait-free.  The reader lock logic after loading
the address of the reader lock object into a register is now 2
instructions a load followed by a store.  The unlock is same
as before, just a store.
It's way faster now.
It's on the feature/003 branch as a POC.   I'm working on porting
it to c++ and don't want to waste any more time on c version.
No idea of it's a new algorithm.  I suspect that since I use
the term epoch that it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
I have to take a look at it! Been really busy lately. Shit happens.
There's a quick and dirty explanation at
http://threadnought.wordpress.com/
repo at https://github.com/jseigh/smrproxy
I'll need to create some memory access diagrams that
visualize how it works at some point.
Anyway if it's new, another algorithm to use without
attribution.
Interesting. From a quick view, it kind of reminds me of a
distributed seqlock for some reason. Are you using an asymmetric
membar in here? in smr_poll ?
Yes, linux membarrier() in smr_poll.
Not seqlock, not least for the reason that exiting the critical region
is 3 instructions unless you use atomics which are expensive and have
memory barriers usually.
A lot of the qsbr and ebr reader lock/unlock code is going to look
somewhat similar so you have to know how the reclaim logic uses it.
In this case I am slingshotting off of the asymmetric memory barrier.
Earlier at one point I was going to have smrproxy use hazard pointer
logic or qsbr logic as a config option, but the extra code complexity
and the fact that qsbr required 2 grace periods kind of made that
unfeasible.  The qsbr logic was mostly ripped out but there were still
some pieces there.
Anyway I'm working a c++ version which involves a lot of extra work
besides just rewriting smrproxy.  There coming up with an api for
proxies and testcases which tend to be more work than the code that
they are testing.
Damn! I almost missed this post! Fucking Thunderbird... Will get
back to you. Working on something else right now Joe, thanks.
https://www.facebook.com/share/p/ydGSuPLDxjkY9TAQ/
No problem.  The c++ work is progressing pretty slowly, not least in
part because the documentation is not always clear as to what
something does or even what problem it is supposed to solve.
To think I took a pass on Rust because I thought it was
more complicated than it needed to be.
Never even tried Rust, shit, I am behind the times. ;^)
Humm... I don't think we can get 100% C++ because of the damn
asymmetric membar for these rather "specialized" algorithms?
Is C++ thinking about creating a standard way to gain an asymmetric membar?
I don't think so.  It's platform dependent.  Apart from linux, mostly
it's done with a call to some virtual memory function that flushes
the TLBs (translation lookaside buffers) which involves IPI calls
to all the processors and those have memory barriers.  This is
old, 1973, patent 3,947,823 cited by the patent I did.
Anyway, I version the code so there's a asymmetric memory barrier
version and an explicit memory barrier version, the latter
being much slower.
I should get back into one of my older proxy algorithms. Things are
mostly wait-free, if XADD can be wait-free itself. No CAS in sight. I
just found an older version I posted... Almost forgot I made this post:

https://groups.google.com/g/comp.lang.c++/c/FBqOMvqWpR4/m/bDZZLUmAAgAJ

https://pastebin.com/raw/nPVYXbWM
(raw text, no ad bullshit)
Chris M. Thomasson
2024-10-28 04:13:05 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by Chris M. Thomasson
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
Post by jseigh
Post by Chris M. Thomasson
I replaced the hazard pointer logic in smrproxy.  It's now
wait- free
instead of mostly wait-free.  The reader lock logic after loading
the address of the reader lock object into a register is now 2
instructions a load followed by a store.  The unlock is same
as before, just a store.
It's way faster now.
It's on the feature/003 branch as a POC.   I'm working on porting
it to c++ and don't want to waste any more time on c version.
No idea of it's a new algorithm.  I suspect that since I use
the term epoch that it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
I have to take a look at it! Been really busy lately. Shit happens.
There's a quick and dirty explanation at
http://threadnought.wordpress.com/
repo at https://github.com/jseigh/smrproxy
I'll need to create some memory access diagrams that
visualize how it works at some point.
Anyway if it's new, another algorithm to use without
attribution.
Interesting. From a quick view, it kind of reminds me of a
distributed seqlock for some reason. Are you using an asymmetric
membar in here? in smr_poll ?
Yes, linux membarrier() in smr_poll.
Not seqlock, not least for the reason that exiting the critical region
is 3 instructions unless you use atomics which are expensive and have
memory barriers usually.
A lot of the qsbr and ebr reader lock/unlock code is going to look
somewhat similar so you have to know how the reclaim logic uses it.
In this case I am slingshotting off of the asymmetric memory barrier.
Earlier at one point I was going to have smrproxy use hazard pointer
logic or qsbr logic as a config option, but the extra code complexity
and the fact that qsbr required 2 grace periods kind of made that
unfeasible.  The qsbr logic was mostly ripped out but there were still
some pieces there.
Anyway I'm working a c++ version which involves a lot of extra work
besides just rewriting smrproxy.  There coming up with an api for
proxies and testcases which tend to be more work than the code that
they are testing.
Damn! I almost missed this post! Fucking Thunderbird... Will get
back to you. Working on something else right now Joe, thanks.
https://www.facebook.com/share/p/ydGSuPLDxjkY9TAQ/
No problem.  The c++ work is progressing pretty slowly, not least in
part because the documentation is not always clear as to what
something does or even what problem it is supposed to solve.
To think I took a pass on on rust because I though it was
more complicated than it needed to be.
Never even tried Rust, shit, I am behind the times. ;^)
Humm... I don't think we can get 100% C++ because of the damn
asymmetric membar for these rather "specialized" algorithms?
Is C++ thinking about creating a standard way to gain an asymmetric membar?
I don't think so.  It's platform dependent.  Apart from linux, mostly
it's done with a call to some virtual memory function that flushes
the TLBs (translation lookaside buffers) which involves IPI calls
to all the processors and those have memory barriers.  This is
old, 1973, patent 3,947,823 cited by the patent I did.
Anyway, I version the code so there's a asymmetric memory barrier
version and an explicit memory barrier version, the latter
being much slower.
I should get back into one of my older proxy algorithms. Things are
mostly wait free, if XADD can be wait free itself. No CAS in sight. I
https://groups.google.com/g/comp.lang.c++/c/FBqOMvqWpR4/m/bDZZLUmAAgAJ
https://pastebin.com/raw/nPVYXbWM
(raw text, no ad bullshit)
Then we can do something like each reader thread incrementing its own
version counter every, say, 1000 reads, doing a store followed by a
release membar so the polling thread can pick it up via an acquire
load. This is pure C++, no asymmetric magic here at all.
Chris M. Thomasson
2024-10-29 05:02:23 UTC
Reply
Permalink
I replaced the hazard pointer logic in smrproxy.  It's now wait-free
instead of mostly wait-free.  The reader lock logic after loading
the address of the reader lock object into a register is now 2
instructions a load followed by a store.  The unlock is same
as before, just a store.
It's way faster now.
It's on the feature/003 branch as a POC.   I'm working on porting
it to c++ and don't want to waste any more time on c version.
No idea of it's a new algorithm.  I suspect that since I use
the term epoch that it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
For some reason you made me think of another very simple proxy technique
using per thread mutexes. It was an experiment a while back:
___________________
per_thread
{
std::mutex m_locks[2];

lock()
{
word ver = g_version;
m_locks[ver % 2].lock();
}

unlock(word ver)
{
m_locks[ver % 2].unlock();
}
}
___________________

The polling thread would increase the g_version counter then lock and
unlock all of the threads previous locks. Iirc, it worked way better
than a read write lock for sure. Basically:
___________________
word ver = g_version.inc(); // ver is the previous version

for all threads as t
{
t.m_locks[ver % 2].lock();
t.m_locks[ver % 2].unlock();
}
___________________

After that, it knew the previous generation was completed.

It was just a way for using a mutex to get distributed proxy like behavior.
Chris M. Thomasson
2024-10-30 06:40:12 UTC
Reply
Permalink
Post by Chris M. Thomasson
I replaced the hazard pointer logic in smrproxy.  It's now wait-free
instead of mostly wait-free.  The reader lock logic after loading
the address of the reader lock object into a register is now 2
instructions a load followed by a store.  The unlock is same
as before, just a store.
It's way faster now.
It's on the feature/003 branch as a POC.   I'm working on porting
it to c++ and don't want to waste any more time on c version.
No idea of it's a new algorithm.  I suspect that since I use
the term epoch that it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
For some reason you made me think of another very simple proxy technique
___________________
per_thread
{
    std::mutex m_locks[2];
    lock()
    {
        word ver = g_version;
        m_locks[ver % 2].lock();
    }
    unlock(word ver)
    {
        m_locks[ver % 2].unlock();
    }
}
___________________
The polling thread would increase the g_version counter then lock and
unlock all of the threads previous locks. Iirc, it worked way better
___________________
word ver = g_version.inc(); // ver is the previous version
for all threads as t
{
   t.m_locks[ver % 2].lock();
   t.m_locks[ver % 2].unlock();
}
___________________
After that, it knew the previous generation was completed.
It was just a way for using a mutex to get distributed proxy like behavior.
There are fun things to do here. A thread can do an unlock-lock cycle
every so often, say every 1000 iterations. The fun part is that this can
beat a read-write lock.
Chris M. Thomasson
2024-10-30 06:42:03 UTC
Reply
Permalink
Post by Chris M. Thomasson
I replaced the hazard pointer logic in smrproxy.  It's now wait-free
instead of mostly wait-free.  The reader lock logic after loading
the address of the reader lock object into a register is now 2
instructions a load followed by a store.  The unlock is same
as before, just a store.
It's way faster now.
It's on the feature/003 branch as a POC.   I'm working on porting
it to c++ and don't want to waste any more time on c version.
No idea of it's a new algorithm.  I suspect that since I use
the term epoch that it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
For some reason you made me think of another very simple proxy technique
___________________
per_thread
{
    std::mutex m_locks[2];
    lock()
    {
        word ver = g_version;
        m_locks[ver % 2].lock();
    }
    unlock(word ver)
    {
        m_locks[ver % 2].unlock();
    }
Oops! lock() should return the version then pass it on to unlock. Sorry
for missing that in my pseudo-code. ;^o
Post by Chris M. Thomasson
}
___________________
The polling thread would increase the g_version counter then lock and
unlock all of the threads previous locks. Iirc, it worked way better
___________________
word ver = g_version.inc(); // ver is the previous version
for all threads as t
{
   t.m_locks[ver % 2].lock();
   t.m_locks[ver % 2].unlock();
}
___________________
After that, it knew the previous generation was completed.
It was just a way for using a mutex to get distributed proxy like behavior.
Chris M. Thomasson
2024-10-30 06:53:19 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by Chris M. Thomasson
I replaced the hazard pointer logic in smrproxy.  It's now wait-free
instead of mostly wait-free.  The reader lock logic after loading
the address of the reader lock object into a register is now 2
instructions a load followed by a store.  The unlock is same
as before, just a store.
It's way faster now.
It's on the feature/003 branch as a POC.   I'm working on porting
it to c++ and don't want to waste any more time on c version.
No idea of it's a new algorithm.  I suspect that since I use
the term epoch that it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
For some reason you made me think of another very simple proxy
___________________
per_thread
{
     std::mutex m_locks[2];
     lock()
     {
         word ver = g_version;
         m_locks[ver % 2].lock();
     }
     unlock(word ver)
     {
         m_locks[ver % 2].unlock();
     }
Oops! lock() should return the version then pass it on to unlock. Sorry
for missing that in my pseudo-code. ;^o
Post by Chris M. Thomasson
}
___________________
The polling thread would increase the g_version counter then lock and
unlock all of the threads previous locks. Iirc, it worked way better
___________________
word ver = g_version.inc(); // ver is the previous version
for all threads as t
{
    t.m_locks[ver % 2].lock();
    t.m_locks[ver % 2].unlock();
}
___________________
After that, it knew the previous generation was completed.
It was just a way for using a mutex to get distributed proxy like behavior.
Iirc, I still had to use a defer list where the dtors were called for
the nodes.
Chris M. Thomasson
2024-10-30 06:54:33 UTC
Reply
Permalink
I replaced the hazard pointer logic in smrproxy.  It's now wait-free
instead of mostly wait-free.  The reader lock logic after loading
the address of the reader lock object into a register is now 2
instructions a load followed by a store.  The unlock is same
as before, just a store.
It's way faster now.
It's on the feature/003 branch as a POC.   I'm working on porting
it to c++ and don't want to waste any more time on c version.
No idea of it's a new algorithm.  I suspect that since I use
the term epoch that it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
Joe, can you call dtors for nodes after a single epoch?
jseigh
2024-10-30 16:43:09 UTC
Reply
Permalink
Post by Chris M. Thomasson
I replaced the hazard pointer logic in smrproxy.  It's now wait-free
instead of mostly wait-free.  The reader lock logic after loading
the address of the reader lock object into a register is now 2
instructions a load followed by a store.  The unlock is same
as before, just a store.
It's way faster now.
It's on the feature/003 branch as a POC.   I'm working on porting
it to c++ and don't want to waste any more time on c version.
No idea of it's a new algorithm.  I suspect that since I use
the term epoch that it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
Joe, can you call dtors for nodes after a single epoch?
Yes since I can check that epochs are being referenced
directly instead of indirectly via quiescent states.
It's the inferring from events vs. conditions thing.
Chris M. Thomasson
2024-11-02 20:50:51 UTC
Reply
Permalink
I replaced the hazard pointer logic in smrproxy.  It's now wait-free
instead of mostly wait-free.  The reader lock logic after loading
the address of the reader lock object into a register is now 2
instructions a load followed by a store.  The unlock is same
as before, just a store.
It's way faster now.
It's on the feature/003 branch as a POC.   I'm working on porting
it to c++ and don't want to waste any more time on c version.
No idea of it's a new algorithm.  I suspect that since I use
the term epoch that it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
Joe Seigh
Another crude experiment I was testing out back in the day. Heart beat
for per threads, wrt a version that does not use asymmetric membars to
test against:

per_thread
{
word cur = 0;

void beat()
{
word global = g_version;
if (cur == global) return;
store_seq_cst(&cur, global);
}
}


Well, this scheme requires worker threads to beat every now and then when
they are outside of critical sections. Just a test case against my
asymmetric versions. The _single_ poll thread would increment g_version,
then check for threads whose counters equaled the new version. Once all
threads went through it, a quiescent period had passed.

Of course the asymmetric version beat it, but it was fun to test against
anyway.
jseigh
2024-11-21 20:17:32 UTC
Reply
Permalink
I replaced the hazard pointer logic in smrproxy.  It's now wait-free
instead of mostly wait-free.  The reader lock logic after loading
the address of the reader lock object into a register is now 2
instructions a load followed by a store.  The unlock is same
as before, just a store.
It's way faster now.
It's on the feature/003 branch as a POC.   I'm working on porting
it to c++ and don't want to waste any more time on c version.
No idea of it's a new algorithm.  I suspect that since I use
the term epoch that it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
I got a port to c++ working now. There are 5 proxy implementations
1) smrproxy v2
2) arcproxy - reference counted proxy
3) rwlock based proxy
4) mutex based proxy
5) an unsafe proxy with no locking

The testcase is templated so you can use any of the
5 proxy implementations without rewriting for each proxy
type.  You can do apples-to-apples comparisons.  I
realize that's the complete antithesis of current
programming practices, but there you have it.  :)

A bit of clean up and performance tuning now.

Joe Seigh
jseigh
2024-11-23 16:10:46 UTC
Reply
Permalink
Post by jseigh
I replaced the hazard pointer logic in smrproxy.  It's now wait-free
instead of mostly wait-free.  The reader lock logic after loading
the address of the reader lock object into a register is now 2
instructions a load followed by a store.  The unlock is same
as before, just a store.
It's way faster now.
It's on the feature/003 branch as a POC.   I'm working on porting
it to c++ and don't want to waste any more time on c version.
No idea of it's a new algorithm.  I suspect that since I use
the term epoch that it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
I got a port to c++ working now. There are 5 proxy implementations
1) smrproxy v2
2) arcproxy - reference counted proxy
3) rwlock based proxy
4) mutex based proxy
5) an unsafe proxy with no locking
The testcase is templated so you can use any of the
5 proxy implementations without rewriting for each proxy
type.  You can do apple to apple comparisons.  I
realize that's the complete antithesis of current
programming practices but there you have it.  :)
A bit of clean up and performance tuning now.
Ok, smrproxy lock/unlock is down to 0.6 nanoseconds now,
about what the C version was.

Joe Seigh
Chris M. Thomasson
2024-11-23 21:31:24 UTC
Reply
Permalink
Post by jseigh
Post by jseigh
I replaced the hazard pointer logic in smrproxy.  It's now wait-free
instead of mostly wait-free.  The reader lock logic after loading
the address of the reader lock object into a register is now 2
instructions a load followed by a store.  The unlock is same
as before, just a store.
It's way faster now.
It's on the feature/003 branch as a POC.   I'm working on porting
it to c++ and don't want to waste any more time on c version.
No idea of it's a new algorithm.  I suspect that since I use
the term epoch that it will be claimed that it's ebr, epoch
based reclamation, and that all ebr algorithms are equivalent.
Though I suppose you could argue it's qsbr if I point out what
the quiescent states are.
I got a port to c++ working now. There are 5 proxy implementations
1) smrproxy v2
2) arcproxy - reference counted proxy
3) rwlock based proxy
4) mutex based proxy
5) an unsafe proxy with no locking
The testcase is templated so you can use any of the
5 proxy implementations without rewriting for each proxy
type.  You can do apple to apple comparisons.  I
realize that's the complete antithesis of current
programming practices but there you have it.  :)
A bit of clean up and performance tuning now.
Ok, smrproxy lock/unlock is down to 0.6 nanoseconds now,
about what the C version was.
Nice! Are you using pthread_getspecific or tss_get in your C version?
jseigh
2024-11-23 22:12:56 UTC
Reply
Permalink
Post by Chris M. Thomasson
Post by jseigh
[...]
Ok, smrproxy lock/unlock is down to 0.6 nanoseconds now,
about what the C version was.
Nice! Are you using pthread_getspecific or tss_get in your C version?
Just thread_local.
Chris M. Thomasson
2024-11-24 01:46:32 UTC
[...]
Post by jseigh
Post by Chris M. Thomasson
Post by jseigh
Ok, smrproxy lock/unlock is down to 0.6 nanoseconds now,
about what the C version was.
Nice! Are you using pthread_getspecific or tss_get in your C version?
Just thread_local.
Nice!
jseigh
2024-11-24 20:14:58 UTC
Post by jseigh
[...]
Ok, smrproxy lock/unlock is down to 0.6 nanoseconds now,
about what the C version was.
I've been using cpu time to measure performance. That's ok
for lock-free/wait-free locking. For normal mutexes and
shared locks, it doesn't measure wait time so those didn't
look as bad as they really were. You can add logic
to measure how long it takes to acquire a lock but that
adds significant overhead.

Joe Seigh
Chris M. Thomasson
2024-11-24 23:19:49 UTC
Post by jseigh
[...]
I've been using cpu time to measure performance. That's ok
for lock-free/wait-free locking.  For normal mutexes and
shared locks, it doesn't measure wait time so those didn't
look as bad as they really were.  You can add logic
to measure how long it takes to acquire a lock but that
adds significant overhead.
I remember back in the day when I was comparing and contrasting various
lock/wait-free algorithms with their 100% lock-based counterparts. Some
of the lock-based tests took so long that I just terminated the damn
program. Iirc, a lock-free test would take around 5 minutes. The
lock-based test would be around 30+ minutes. This was way back on c.p.t.
jseigh
2024-11-25 00:09:05 UTC
Post by Chris M. Thomasson
[...]
I remember back in the day when I was comparing and contrasting various
lock/wait-free algorithms with their 100% lock-based counterparts. Some
of the lock-based tests took so long that I just terminated the damn
program. Iirc, a lock-free test would take around 5 minutes. The
lock-based test would be around 30+ minutes. This was way back on c.p.t.
I set the iteration count as a parameter. Mutex can be particularly
slow with a lot of reader threads. I usually see about 1000 - 10000
times slower than smrproxy. rwlocks aren't as bad, about 200 x
slower.

Mutex, rwlock, and arcproxy use interlocked instructions so you
can get a really wide performance range based on cache geometry
and processor sets you run on.

Joe Seigh
Chris M. Thomasson
2024-11-25 00:37:36 UTC
Post by Chris M. Thomasson
Post by jseigh
Post by jseigh
Post by jseigh
[...]
I set the iteration count as a parameter.  Mutex can be particularly
slow with a lot of reader threads.  I usually see about 1000 - 10000
times slower than smrproxy.   rwlocks aren't as bad, about 200 x
slower.
Mutex, rwlock, and arcproxy use interlocked instructions so you
can get a really wide performance range based on cache geometry
and processor sets you run on.
Big time. My older proxy uses interlocked instructions as well. Except,
it does not use any CAS instructions... :^)

https://pastebin.com/raw/f71480694

Wow, back in 2010! How time goes by. Shit... ;^o
Chris M. Thomasson
2024-11-25 00:48:07 UTC
Post by Chris M. Thomasson
Post by Chris M. Thomasson
Post by jseigh
Post by jseigh
Post by jseigh
[...]
Big time. My older proxy uses interlocked instructions as well. Except,
it does not use any CAS instructions... :^)
https://pastebin.com/raw/f71480694
Wow, back in 2010! How time goes by. Shit... ;^o
Btw, my code for that link is using Relacy Race Detector. :^)
Chris M. Thomasson
2024-11-25 00:47:06 UTC
Post by Chris M. Thomasson
Post by jseigh
Post by jseigh
Post by jseigh
[...]
I set the iteration count as a parameter.  Mutex can be particularly
slow with a lot of reader threads.  I usually see about 1000 - 10000
times slower than smrproxy.   rwlocks aren't as bad, about 200 x
slower.
Mutex, rwlock, and arcproxy use interlocked instructions so you
can get a really wide performance range based on cache geometry
and processor sets you run on.
Actually, I remember way back where a scenario that had a lot of writes
would start to mess with a deferred reclamation wrt a polling thread
type of “scheme”. Too many deferred nodes would start to "pile up".
Basically, the single polling thread was having trouble keeping up with
all of them. The interlocked versions seemed to perform sort of "better"
in a sense during periods that had a lot of frequent “writes”. Of course
clever use of node caches helps heavy write periods. Anyway, some of the
tests just used a mutex for writes, others used lock-free and would
generate high loads of them that would push and pop nodes and defer them
to the poll thread to test how much load it (poll thread) could take.
jseigh
2024-11-25 11:31:32 UTC
...
Post by Chris M. Thomasson
Actually, I remember way back where a scenario that had a lot of writes
would start to mess with a deferred reclamation wrt a polling thread
type of “scheme”. Too many deferred nodes would start to "pile up".
Basically, the single polling thread was having trouble keeping up with
all of them. The interlocked versions seemed to perform sort of "better"
in a sense during periods that had a lot of frequent “writes”. Of course
clever use of node caches helps heavy write periods. Anyway, some of the
tests just used a mutex for writes, others used lock-free and would
generate high loads of them that would push and pop nodes and defer them
to the poll thread to test how much load it (poll thread) could take.
Using deferred reclamation to implement a lock-free queue will slow
things down. Also cache line thrashing will limit how fast queue
operations will go no matter what.

Better to parallelize. You can take a really bad lock-free queue
implementation, parallelize it, and then tell everyone how fast
your queue is.

Single queue at 1 op/sec.
Parallel queue on 1000000 processors, 1000000 op/sec.


Joe Seigh
Chris M. Thomasson
2024-11-25 22:12:37 UTC
Post by jseigh
...
Post by Chris M. Thomasson
[...]
Using deferred reclamation to implement a lock-free queue will slow
things down.  Also cache line thrashing will limit how fast queue
operations will go no matter what.
Better to parallelize.  You can take a really bad lock-free queue
implementation, parallelize it, and then tell everyone how fast
your queue is.
LOL! Yeah. In layers. Per thread queues, per cpu queues and some hashed
global ones. Then we get to play with single consumer/producer, multi
consumer single producer, and all that jazz... ;^)

Actually it was an RCU algorithm wrt deferred reclamation that worked
great with read-mostly loads. Then exposing it to a lot of writes sort of
messed around with it, in a bad way. ;^)

The writes were able to swamp things under heavy artificial load.
Post by jseigh
Single queue at 1 op/sec.
Parallel queue on 1000000 processors, 1000000 op/sec.
jseigh
2024-11-27 15:29:05 UTC
Post by jseigh
Post by jseigh
Ok, smrproxy lock/unlock is down to 0.6 nanoseconds now,
about what the C version was.
I've been using cpu time to measure performance. That's ok
for lock-free/wait-free locking.  For normal mutexes and
shared locks, it doesn't measure wait time so those didn't
look as bad as they really were.  You can add logic
to measure how long it takes to acquire a lock but that
adds significant overhead.
Some timings with 128 reader threads

unsafe        52.983 nsecs (      0.000)     860.576 nsecs (      0.000)
smr           54.714 nsecs (      1.732)     882.356 nsecs (     21.780)
smrlite       53.149 nsecs (      0.166)     870.066 nsecs (      9.490)
arc          739.833 nsecs (    686.850)  11,988.289 nsecs ( 11,127.713)
rwlock     1,078.306 nsecs (  1,025.323)  17,309.882 nsecs ( 16,449.306)
mutex      3,203.034 nsecs (  3,150.052)  51,479.407 nsecs ( 50,618.831)

The first column is cpu time, third column is elapsed time.
unsafe is without any synchronized reader access. The
value in parentheses is the unsafe access time subtracted
out to separate out the synchronization overheads. smrlite is
smr proxy with thread_local overhead. So smrproxy lock/unlock
by itself is about 0.1 - 0.2 nanoseconds.

I'm going to drop working on the whole proxy interface thing. The
application can decide if it wants to hardcode a dependency on a
particular 3rd party library implementation or abstract it out
to a more portable api.

Joe Seigh
jseigh
2024-12-09 18:34:11 UTC
Post by jseigh
Some timings with 128 reader threads
unsafe        52.983 nsecs (      0.000)     860.576 nsecs (      0.000)
smr           54.714 nsecs (      1.732)     882.356 nsecs (     21.780)
smrlite       53.149 nsecs (      0.166)     870.066 nsecs (      9.490)
arc          739.833 nsecs (    686.850)  11,988.289 nsecs ( 11,127.713)
rwlock     1,078.306 nsecs (  1,025.323)  17,309.882 nsecs ( 16,449.306)
mutex      3,203.034 nsecs (  3,150.052)  51,479.407 nsecs ( 50,618.831)
The first column is cpu time, third column is elapsed time.
unsafe is without any synchronized reader access.   The
value in parentheses is the unsafe access time subtracted
out to separate out the synchronization overheads.  smrlite is
smr proxy with thread_local overhead.  So smrproxy lock/unlock
by itself is about 0.1 - 0.2 nanoseconds.
I'm going to drop working on the whole proxy interface thing.  The
application can decide if it wants to hardcode a dependency on a
particular 3rd party library implementation or abstract it out
to a more portable api.
I figured out where the smr vs smrlite overhead is likely coming from.

1) thread_local load about .3 nsecs, 2 for lock/unlock so .6 nsecs.
2) overhead from lazy initialization, about .6 nsecs.

smrlite most of the time doesn't show any measurable overhead,
0 nsecs.

Theoretically, you could do lazy initialization with zero
runtime overhead, but for most c++ apps, 1 millisecond is
considered fast, so I don't think there would be much interest
in it.

Joe Seigh
jseigh
2024-12-11 12:13:30 UTC
Post by jseigh
Post by jseigh
I'm going to drop working on the whole proxy interface thing.  The
application can decide if it wants to hardcode a dependency on a
particular 3rd party library implementation or abstract it out
to a more portable api.
I figured out where the smr vs smrlite overhead is likely coming from.
1) thread_local load about .3 nsecs, 2 for lock/unlock so .6 nsecs.
2) overhead from lazy initialization, about .6 nsecs.
smrlite most of the time doesn't show any measurable overhead,
0 nsecs.
Fixed the whole thread_local issue. Just don't use it. I
went back to the api I was using in C and left it up to
the app to keep track of thread local objects. Fixes the
problem of multiple proxies needing multiple per-thread
objects. This is a language issue, not an api issue.
Post by jseigh
Theoretically, you could do lazy initialization with zero
runtime overhead, but for most c++ apps, 1 millisecond is
considered fast, so I don't think there would be much interest
in it.
And apparently I am right. :)

Joe Seigh
Chris M. Thomasson
2024-12-12 00:22:20 UTC
Post by jseigh
Post by jseigh
I'm going to drop working on the whole proxy interface thing.  The
application can decide if it wants to hardcode a dependency on a
particular 3rd party libarary implementation or abstract it out
to a more portable api.
I figured out where the smr vs smrlite overhead is likely coming from.
1) thread_local load about .3 nsecs, 2 for lock/unlock so .6 nsecs.
2) overhead from lazy initialization, about .6 nsecs.
smrlite most of the time doesn't show any measurable overhead,
0 nsecs.
Fixed the whole thread_local issue.  Just don't use it.  I
went back to the api I was using in C and left it up to
the app to keep track of thread local objects. Fixes the
problem of multiple proxies needing multiple per-thread
objects.  This is a language issue, not an api issue.
Well, that works for sure. Humm. If a user wants to register a
per-thread object with the polling thread, well... Let them do it!
Actually, it's a little more flexible, so to speak... Wrt the api:

void proxy_register(per_thread&)
void proxy_unregister(per_thread&)

lol. ;^)
Post by jseigh
Theoretically, you could do lazy initialization with zero
runtime overhead, but for most c++ apps, 1 millisecond is
considered fast, so I don't think there would be much interest
in it.
And apparently I am right. :)
;^D

Unfortunately, I just can't seem to find the time to work on
this. Been working on shaders and OpenGL a lot lately.