-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added logs rfc #4
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a quick read through and left some minor comments -- I will try to write up something more substantial.
- fork join graph | ||
|
||
I'll start by the raw events. Since logging modify application behaviors we need to be able to log at a very low (and deterministic) cost. | ||
Basically each thread records in memory a few bytes tag for every event happening to him. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically each thread records in memory a few bytes tag for every event happening to him. | |
Basically each thread records in memory a few bytes tag for every event happening to it. |
|
||
## Saving logs. | ||
|
||
It is the user's responsibility to not save logs while using threads. It should not crash but the graphs obtained |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are logs saved? I guess there is a function that gets called by the user to save logs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, there is a save_logs function (global pool) or method on the pool.
Currently I depend on the following crates. Should I remove these dependencies ? | ||
- time (easy to remove) | ||
- itertools (harder to remove but ok) | ||
- serde (also easy to remove since I output logs in json) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we probably want to remove dependencies where we can
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, that's not a big deal. that's fine.
|
||
## Svg | ||
|
||
All graph algorithms and svg display are not included. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will want to create an external crate that is able to do the postprocessing and imagine manipulation
As I mentioned in #5, I've also created a logging system that I was using there to analyze performance. However, the focus of those logs was pretty different. They were focused on logging the internals of the scheduler, like how many sleeping threads there are at a given time, how many pending jobs, and so forth. Still, there are of course similarities. I think overall it'd be great to have a mechanism (such as the one described in this RFC) that lets users log and visualize the parallelism they are getting. I think one thing that wasn't described very much in this RFC but which I would want to think about is how this mechanism is enabled/disabled and interacted with. A few thoughts, in no particular order: If possible, I think we should avoid requiring a My reasoning is mostly that testing around features and cfgs is a pain. You have to test with the enabled, disabled, and every which way, and they multiply. Moreover, we can tune the execution for logs being disabled (e.g., have an if that guards a function call) and it is unlikely to have an impact on performance. What I didn't get from this RFC was some idea how users interact with logs. How are logs enabled and saved? It seemed to me that there was probably some method that they should call, and that they have to do so while parallelism is not happening. Presumably this is a method on the thread-pool? This seems "ok" but it may be inconvenient -- particularly in larger applications, the use of Rayon may not be readily accessible. An alternative might be to use an environment variable, but it raises the question of "when do we dump the events?". (The approach I was trying in my internal event logger was to have events get flushed automatically, each time we exit a root event.) I guess a final thought is -- who do we envision consuming these logs? What kind of "semver guarantees" do we want to provide around them? Are these mostly going to be used by us, in tuning parallel iterator implementations? Do we think end users will want them? I'm somewhat inclined to try and avoid making guarantees. I'd prefer to say something like "the logging facility is provided for debugging and performance tuning purposes only; it may change radically or even be removed in the future". To that end, the more that we can avoid adding methods or other publicly visible bits of API related to logging, the better. |
Oh, one final question that I was wondering about: After logs are saved, is the data cleared? I had the impression from the RFC that data is never cleared. I guess this assumes then that we are working with short-lived benchmarks? |
here are a few more answers to your questions:
|
hi, i'm back to work. let's start the rfc on logs.