Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added logs rfc #4

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

added logs rfc #4

wants to merge 1 commit into from

Conversation

wagnerf42
Copy link

hi, i'm back to work. let's start the rfc on logs.

Copy link
Member

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a quick read through and left some minor comments -- I will try to write up something more substantial.

- fork join graph

I'll start by the raw events. Since logging modify application behaviors we need to be able to log at a very low (and deterministic) cost.
Basically each thread records in memory a few bytes tag for every event happening to him.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Basically each thread records in memory a few bytes tag for every event happening to him.
Basically each thread records in memory a few bytes tag for every event happening to it.


## Saving logs.

It is the user's responsibility to not save logs while using threads. It should not crash but the graphs obtained
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are logs saved? I guess there is a function that gets called by the user to save logs?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, there is a save_logs function (global pool) or method on the pool.

Currently I depend on the following crates. Should I remove these dependencies ?
- time (easy to remove)
- itertools (harder to remove but ok)
- serde (also easy to remove since I output logs in json)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we probably want to remove dependencies where we can

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, that's not a big deal. that's fine.


## Svg

All graph algorithms and svg display are not included.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we will want to create an external crate that is able to do the postprocessing and imagine manipulation

@nikomatsakis
Copy link
Member

As I mentioned in #5, I've also created a logging system that I was using there to analyze performance. However, the focus of those logs was pretty different. They were focused on logging the internals of the scheduler, like how many sleeping threads there are at a given time, how many pending jobs, and so forth. Still, there are of course similarities.

I think overall it'd be great to have a mechanism (such as the one described in this RFC) that lets users log and visualize the parallelism they are getting. I think one thing that wasn't described very much in this RFC but which I would want to think about is how this mechanism is enabled/disabled and interacted with.

A few thoughts, in no particular order:

If possible, I think we should avoid requiring a #[cfg] or a cargo feature, but I'm not sure on this point. I've been leaning this way for the other logging mechanism as well. In principle, though, that requires a certain cost for every execution -- the branches to check if logging is enabled will be compiled in -- though hopefully a very minimal one.

My reasoning is mostly that testing around features and cfgs is a pain. You have to test with the enabled, disabled, and every which way, and they multiply. Moreover, we can tune the execution for logs being disabled (e.g., have an if that guards a function call) and it is unlikely to have an impact on performance.

What I didn't get from this RFC was some idea how users interact with logs. How are logs enabled and saved? It seemed to me that there was probably some method that they should call, and that they have to do so while parallelism is not happening. Presumably this is a method on the thread-pool? This seems "ok" but it may be inconvenient -- particularly in larger applications, the use of Rayon may not be readily accessible. An alternative might be to use an environment variable, but it raises the question of "when do we dump the events?".

(The approach I was trying in my internal event logger was to have events get flushed automatically, each time we exit a root event.)

I guess a final thought is -- who do we envision consuming these logs? What kind of "semver guarantees" do we want to provide around them? Are these mostly going to be used by us, in tuning parallel iterator implementations? Do we think end users will want them?

I'm somewhat inclined to try and avoid making guarantees. I'd prefer to say something like "the logging facility is provided for debugging and performance tuning purposes only; it may change radically or even be removed in the future". To that end, the more that we can avoid adding methods or other publicly visible bits of API related to logging, the better.

@nikomatsakis
Copy link
Member

Oh, one final question that I was wondering about:

After logs are saved, is the data cleared? I had the impression from the RFC that data is never cleared. I guess this assumes then that we are working with short-lived benchmarks?

@wagnerf42
Copy link
Author

here are a few more answers to your questions:

  • about #cfg
    you are right that it is painful (especially when chaining crates dependencies).
    I really wanted to be able to remove all costs if unneeded while keeping the same source
    code for both logged and unlogged runs.
    there might be a possibility to do that without features : have a ThreadPool trait with one logging threadpool and one non-logging both answering the same api (join, join_context, scope, ...)`
    the other possibility is to add conditionals everywhere which would be not so nice for me.
  • logs interactions:
    logs are auto enabled when the feature is turned on.
    then you use the save_logs function to get them out of the pool.
    we could also autosave when destroying the pool to keep an identical api.
    if we switch to different threadpools we could have logging threadpools and unlogging threadpools.
  • logs consumption:
    I think the logs are great for understanding what goes on in your code
    (especially with the extended features from rayon logs like hardware events or work computation).
    For sure it is useful for rayon developers and teachers
    (oh by the way I've got a course started now : http://wagnerf.pages.ensimag.fr/parallel-systems/) but I do hope it would be good for apps developers also.
    the semver question is interesting because I've already been confronted to that when changing data file format. My take was to just revert using an older library.
  • data clearing:
    currently nothing is cleared.
    it would be possible to partially clear data though.
    we need a mutex on save_logs, a counter indicating which task has not yet been saved.
    data is appended at the end of a buffers list. when saving we could clear all buffers except the working one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants