Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trace aggregator decouple #90

Merged
merged 4 commits into from
Jul 25, 2024
Merged

Trace aggregator decouple #90

merged 4 commits into from
Jul 25, 2024

Conversation

shuhaowu
Copy link
Contributor

Previously we were leaking ThreadTracer objects in the
TraceAggregator as creating new threads means ThreadTracer gets
pushed into TraceAggregator but it is never removed. This causes a
memory leak and also makes the TraceAggregator slower.

This refactors the entire code to make this work. Some of the highlights
are:

  • Removed the feature of dynamically adding in trace Sinks. Sink can
    now only be specified when the trace session is started. Since we
    don't really have any use case of dynamically hooking in sinks, this
    feature is a lot of complexity for no reason.
    • This also removed the sticky packet feature within the
      TraceAggregator which saved additional complexity.
  • Remove the feature of the TraceAggregator mirroring data to multiple
    Sinks. This is also unnecessary. Could create a MultiSink feature
    if necessary to emulate this. So now, when you start a trace session,
    you must give a single sink for the data to be pushed to.
  • TraceAggregator is now a permanent object (as a shared_ptr) on the
    App instead of being dynamically created and deleted when the trace
    session is started and stopped. This skips the need of having App
    cache the list of ThreadTracers and pass it to the TraceAggregator
    during its construction when the trace session starts.
    • Instead, the TraceAggregator internally has a SessionData object
      (session_). This object is recreated and deleted when the trace
      session starts and stops.
    • Since the TraceAggregator is permanent now, the Thread directly
      register the ThreadTracer with the TraceAggregator during its
      start up procedure. This replaces registering through the App. The
      Thread are also now holding a weak_ptr to the TraceAggregator.
  • When a thread shuts down, the ThreadTracer is marked as "done". The
    TraceAggregator will check for this "done" status if no more events
    are available from the queue. If it is done, the ThreadTracer will
    be removed from the TraceAggregator.
  • TraceAggregator no longer supports RequestStop and Join. This is
    replaced with a single Stop call which will wait for the
    TraceAggregator to fully stop and reset the state of the
    TraceAggregator (and any registered ThreadTracer's string interner).
    • Right now, there is a potential data race during the
      TraceAggregator.Stop, as we access session_ without a lock. This
      is most likely OK as we don't expect TraceAggregator.Stop to be
      called from multiple threads or rapidly recreated for now. Should
      probably fix it in the future.
  • String interner states are now reset when a trace session stops. This
    means if another trace session is started, the strings it remembers
    are reset. The id starting positions are also reset.
    • Since we no longer have sticky packets, we also no longer emit a
      trace event packet that contains interned data from a previous trace
      session.

shuhaowu added 2 commits July 23, 2024 21:44
This reverts commit 793f79e.
@shuhaowu shuhaowu requested a review from stephanie-eng July 24, 2024 04:23
@shuhaowu shuhaowu force-pushed the trace-aggregator-decouple branch from 000f6ee to 86cf4b6 Compare July 24, 2024 04:27
Previously we were leaking `ThreadTracer` objects in the
`TraceAggregator` as creating new threads means `ThreadTracer` gets
pushed into `TraceAggregator` but it is never removed. This causes a
memory leak and also makes the `TraceAggregator` slower.

This refactors the entire code to make this work. Some of the highlights
are:

- Removed the feature of dynamically adding in trace `Sink`s. `Sink` can
  now only be specified when the trace session is started. Since we
  don't really have any use case of dynamically hooking in sinks, this
  feature is a lot of complexity for no reason.
  - This also removed the sticky packet feature within the
    `TraceAggregator` which saved additional complexity.
- Remove the feature of the `TraceAggregator` mirroring data to multiple
  `Sink`s. This is also unnecessary. Could create a `MultiSink` feature
  if necessary to emulate this. So now, when you start a trace session,
  you must give a single sink for the data to be pushed to.
- `TraceAggregator` is now a permanent object (as a `shared_ptr`) on the
  `App` instead of being dynamically created and deleted when the trace
  session is started and stopped. This skips the need of having App
  cache the list of `ThreadTracer`s and pass it to the `TraceAggregator`
  during its construction when the trace session starts.
  - Instead, the `TraceAggregator` internally has a `SessionData` object
    (`session_`). This object is recreated and deleted when the trace
    session starts and stops.
  - Since the `TraceAggregator` is permanent now, the `Thread` directly
    register the `ThreadTracer` with the `TraceAggregator` during its
    start up procedure. This replaces registering through the `App`. The
    `Thread` are also now holding a `weak_ptr` to the `TraceAggregator`.
- When a thread shuts down, the `ThreadTracer` is marked as "done". The
  `TraceAggregator` will check for this "done" status if no more events
  are available from the queue. If it is done, the `ThreadTracer` will
  be removed from the `TraceAggregator`.
- `TraceAggregator` no longer supports `RequestStop` and `Join`. This is
  replaced with a single `Stop` call which will wait for the
  `TraceAggregator` to fully stop and reset the state of the
  TraceAggregator (and any registered `ThreadTracer`'s string interner).
  - Right now, there is a potential data race during the
    `TraceAggregator.Stop`, as we access `session_` without a lock. This
    is most likely OK as we don't expect `TraceAggregator.Stop` to be
    called from multiple threads or rapidly recreated for now. **Should
    probably fix it in the future.**
- String interner states are now reset when a trace session stops. This
  means if another trace session is started, the strings it remembers
  are reset. The id starting positions are also reset.
  - Since we no longer have sticky packets, we also no longer emit a
    trace event packet that contains interned data from a previous trace
    session.
@shuhaowu shuhaowu force-pushed the trace-aggregator-decouple branch from 86cf4b6 to 04969a8 Compare July 24, 2024 04:28
docs/tracing.md Outdated Show resolved Hide resolved
docs/tracing.md Outdated Show resolved Hide resolved
@@ -1,6 +1,7 @@
#ifndef CACTUS_TRACING_THREAD_TRACER_H_
#define CACTUS_TRACING_THREAD_TRACER_H_

#include <atomic>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be with the other includes at line 11

Co-authored-by: Stephanie Eng <stephanie-eng@users.noreply.github.com>
@shuhaowu shuhaowu force-pushed the trace-aggregator-decouple branch from 2f970e0 to 089cbbb Compare July 25, 2024 23:30
@shuhaowu shuhaowu merged commit 9dbda99 into master Jul 25, 2024
2 checks passed
@shuhaowu shuhaowu deleted the trace-aggregator-decouple branch July 25, 2024 23:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants