-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FR: Remove producer/consumer; Confusing output #491
Comments
More on this. When using
|
It's normal there are a few thread/process that must end before it does shut down. This is a message from the main thread/process telling you the shutdown command is received and processing. It will close once the child process are ended, do not kill it. It probably either finishing it's batch payments or waiting for payment confirmation. This is correct, not a bug. |
Yes, I completely understand what is happening. The feature request (FR) is to remove the threaded nature as it is completely unnecessary and outputs confusing information. |
You want a redesign that is not threaded because the log is too hard to understand? Is that correct? |
Maybe improve the log messages and see if there are actual race conditions. Have a second pair of eyes on the threading. Rebuilding the whole thing? .... mäh... IMO it is working as intended with just some confusing output |
I agree with jdiska, the output is only confusing if you do not understand the threaded nature, maybe a section to explain it. Also some better output message in the log. |
I never said hard to understand. I said it was confusing to state the program was exiting/shutting down, when in fact it could be 10m until it stops. I'm also saying that the threaded nature provides absolutely zero benefit to the application's performance as a whole and only adds unnecessary code complexity. I'd estimate a good 500-900 lines of code could be removed and the performance/behavior would be exactly the same. Changing the messaging would be a good band-aid so that it's no longer confusing. |
Actually if more than one ops occurs during that span, it does make sense. From what I saw of the code, it would be a fair bit of redesign. There are IMO benefits to the multi threaded design in the way it works. All because the shutdown occurs 10 minutes after it says it is shutting down. Is it overly complex, yeah I agree. But a redesign would be as complex and doing sequential with no thread will bring problems of it's own. |
👍 I never understood the benefits of producer/consumer. It's like we are forcing ourselves to do it async processing in what is fundamentally a synchronous path. Step 1: calculate the rewards. Step 2: pay them. Why do we have this in the first place? I could see paying several cycles in a row, in this case you calculate all cycles in a sequence and you pay all batches in a sequence, in parallel. So you can save time here. But what's the point? Paying several cycles in a row is dangerous and unusual, not something we want to make faster. |
Maybe it would male sense, if you ran payment for more than one baker on one system (it does not) It could be the reason it was designed that way. |
@habanoz can you please provide historical context on this? Why do we have a producer and a consumer? |
It was designed this way from the very beginning. TRD is not designed solely for single baker single cycle payments. Multiple bakers support was on the road map but eventually had to be dropped. (see #15 ) P/C is preferred due to its extensible and scalable nature. Another benefit is concern separation: Producers
Consumer Regards |
We added code in #653 to have TRD send a non-zero exit status when anything wrong happens - including payout failures, or (relevant here) keyboard interrupts. However, it turns out that when using TRD in runmode -2, -3, -4, the regular course of operations is to have the producer thread send a KeyboardInterrupt when done... which is then treated as error! This is done with `_thread.interrupt_main()`. It's bad: humans send interrupts, programs should not. Instead, I'm suggesting that the producer use `SIGUSR1` to trigger a normal interruption. This triggers the same code path as today, just with an exit status of SUCCESS. So I expect that the main thread will still wait for the consumer, and it won't actually exit before payouts are done, if there are any. I'll deploy in a small baker and see if that's true. I have already verified that when no payouts are due, exit code is now zero. For the record, I still believe this entire producer/consumer logic should be thrown away and we should simply do things in order (calculate, then pay). I've ranted about this before (#491).
* fix #678 - Fix single-shot mode always exiting in error (#679) * We added code in #653 to have TRD send a non-zero exit status when anything wrong happens - including payout failures, or (relevant here) keyboard interrupts. * However, it turns out that when using TRD in runmode -2, -3, -4, the regular course of operations is to have the producer thread send a KeyboardInterrupt when done... which is then treated as error! * This is done with `_thread.interrupt_main()`. It's bad: humans send interrupts, programs should not. Instead, I'm suggesting that the producer use `SIGUSR1` to trigger a normal interruption. * This triggers the same code path as today, just with an exit status of SUCCESS. So I expect that the main thread will still wait for the consumer, and it won't actually exit before payouts are done, if there are any. * For the record, I still believe this entire producer/consumer logic should be thrown away and we should simply do things in order (calculate, then pay). I've ranted about this before (#491). * catch the consumer failures and fail the main process accordingly * Also propagate error for producer failures (e.g. tzkt unresponsive) * Contributor: nicolasochem, Effort=3h * Reviewer: jdsika, Effort=0.5h --------- Co-authored-by: Carlo van Driesten <carlo.van-driesten@vdl.digital>
There is no real purpose behind the P/C operation of TRD. It does not make TRD process any faster due to parallelism nor does it improve concurrency. What it does do is present confusion:
TRD remained like this for a full minute. Is TRD stuck? Looks like it is shutting down/stopping. But no, not really. In the background is another thread injecting the multiple batches of payouts.
IMO, we should remove the P/C in favor of a simple payout loop. The P/C is just unnecessary overhead/complexity.
The text was updated successfully, but these errors were encountered: