Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make init_mechlist thread-safe #584

Open
elliefm opened this issue Nov 15, 2019 · 9 comments
Open

make init_mechlist thread-safe #584

elliefm opened this issue Nov 15, 2019 · 9 comments

Comments

@elliefm
Copy link
Contributor

elliefm commented Nov 15, 2019

Originally opened as cyrusimap/cyrus-imapd#2813 by @vlmarek

Hi,

We are doing some housekeeping in the changes we use in cyrus-sasl in Solaris.

https://github.com/oracle/solaris-userland/tree/master/components/cyrus-sasl/patches

And I have noticed that we have in the pipeline a fix which never got to be used. It seems simple enough, but I can't judge if this is something you would consider having. If yes, I'm happy to try to port it to current version and create a pull request. If no I will just drop it because we don't need it internally anymore.

That said I don't even see lib/client.c in your code so probably this is pointless ...

Thank you
__
Vlad

With the following attached patch:

https://github.com/cyrusimap/cyrus-imapd/files/3331156/116-race-condition.txt

@quanah
Copy link
Contributor

quanah commented Feb 24, 2020

@vlmarek The proposed fix is in and of itself, not thread safe, as one thread could allocate and start using the mutex, and a second thread could allocate a new mutex, overwriting the pointer to the first one. So while the problem description is correct, it needs a different solution.

@quanah quanah added the bug label Feb 27, 2020
@simo5
Copy link
Contributor

simo5 commented Mar 19, 2020

@quanah wouldn't it be on the mutexes implementation to make sure that sasl_MUTEX_ALLOC() is itself thread safe and always returns the same pointer if multiple threads are racing to call it ?

@hyc
Copy link
Contributor

hyc commented Mar 20, 2020

@simo5 the API isn't designed to allow callers to pass in a unique handle, so sasl_MUTEX_ALLOC() can't know it's being called multiple times for the same purpose. It's functionally like malloc, always returning a new object on every call.

@simo5
Copy link
Contributor

simo5 commented Mar 20, 2020

@hyc duh right! I was looking at the tree (the specific place) and not the forest ...
I guess some trick could be done to figure out if the same handle needs to be returned based on the stack, but yeah, guess a new API would need to be provided here to do the right thing.

@quanah quanah added this to the 2.2.0 milestone Mar 25, 2022
@mistotebe
Copy link
Contributor

libsasl2 isn't thread-safe until someone supplies viable callbacks via sasl_set_mutex().

To get there, we might need to provide and document a single API entry point that users should call before libsasl2 is considered initialised. Probably a single one that does it for both client and server side. Getting this into 2.2 would give us a good opportunity to adjust our API and docs accordingly.

@simo5
Copy link
Contributor

simo5 commented Aug 21, 2023

Is there a reason why sasl2 can't use pthread_mutex by default instead of being simply thread unsafe?

@mistotebe
Copy link
Contributor

We should have a single documented entry point that initialises the library's mutexes, that would be the only thing people would have to make sure not to call concurrently.

In pthread we have PTHREAD_MUTEX_INITIALIZER, so the init above might even be a noop but officially, we also support systems (Windows) where AFAIK we can't bootstrap a mutex ourselves? (outside some __atomic_ + Sleep(0) trickery)

@simo5
Copy link
Contributor

simo5 commented Aug 30, 2023

Right, some platforms may need that, but some platforms can be made safer by default, perhaps they should be.

mistotebe added a commit to mistotebe/cyrus-sasl that referenced this issue Sep 5, 2023
Fixes cyrusimap#584

Signed-off-by: Ondřej Kuzník <okuznik@symas.com>
mistotebe added a commit to mistotebe/cyrus-sasl that referenced this issue Sep 18, 2023
Fixes cyrusimap#584

Signed-off-by: Ondřej Kuzník <okuznik@symas.com>
@quanah quanah closed this as completed in f6e7dc6 Sep 18, 2023
@adamyi
Copy link
Contributor

adamyi commented Mar 26, 2024

@mistotebe I don't think #803 completely fixed this for sasl_client_init unless I'm missing something.

I think the patch only made it such that _sasl_common_init is thread-safe, but even before entering that, sasl_client_init's _sasl_client_active check is not thread-safe and does mutate global_callbacks_client and more importantly cmechlist without any locks. So it's possible that one thread turns cmechlist to NULL and triggers a segfault.

Here's a simple repro:

cat test.c
#include <stdio.h>
#include <pthread.h>
#include <sasl/sasl.h>

void *work(void *param) {
  sasl_client_init(param);
  return NULL;
}

int main() {
  pthread_t t;
  pthread_create(&t, NULL, work, NULL);
  work(NULL);
  pthread_join(t, NULL);
  return 0;
}

And I'm getting the following segfault:

                Stack trace of thread 1996385:
                #0  0x00007fc0e3dd27eb init_mechlist (/usr/lib64/libsasl2.so.3.0.0)
                #1  0x000000000040115e n/a (/home/ayi/test)
                #2  0x00007fc0e39bb1ca start_thread (libpthread.so.0)
                #3  0x00007fc0e3627e73 __clone (libc.so.6)
                
                Stack trace of thread 1996384:
                #0  0x00007fc0e39bc6cd __GI___pthread_timedjoin_ex (libpthread.so.0)
                #1  0x00000000004011be n/a (/home/ayi/test)
                #2  0x00007fc0e3628d85 __libc_start_main (libc.so.6)
                #3  0x000000000040108e n/a (/home/ayi/test)
Core was generated by `./test'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fc0e3dd27eb in init_mechlist () at client.c:69
69	 if (cmechlist->utils==NULL)
[Current thread is 1 (Thread 0x7fc0e1d75700 (LWP 1996385))]
(gdb) p cmechlist
$1 = (cmech_list_t *) 0x0
(gdb) 

In another more complicated code, I'm getting a similar segfault also due to cmechlist turning 0x00:

#0  0x00007f49ca5e6fe7 in sasl_client_add_plugin (plugname=plugname@entry=0x7f49ca5f590d "EXTERNAL", entry_point=<optimized out>) at client.c:225
#1  0x00007f49ca5e7817 in sasl_client_init (callbacks=0x0) at client.c:311

(gdb) p cmechlist
$1 = (cmech_list_t *) 0x0

I think we might still need something similar to static_mutex to guard client.c in addition to #803 ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants