-
Notifications
You must be signed in to change notification settings - Fork 267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document compatibility requirements for profiles #559
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -12,3 +12,16 @@ for general information about the project. | |||||
|
||||||
After making any changes to .proto files make sure to generate all | ||||||
implementation by running `make gen-all`. | ||||||
|
||||||
## Profiles Compatibility Requirements | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In the Java SIG we've been discussing whether our in-memory representation of the profiling data model should include the deprecated pprof fields. I argued that they should not, since doing so is confusing and error prone for users who wish to implement the data model: Which field do I need to provide when two represent the same information? What happens if there's a conflict - are exporters expected to perform validation? Its simpler to say that our in-memory data model needs to be guaranteed to be able to encode information to the pprof format. In other words, its compatible with pprof rather than a perfect apples to apples reflection of it. This is a more relaxed compatibility requirement that what has been proposed so far with the profiling in this repo. If applied here, it would allow the profiling proto representation to completely diverge from pprof so long as it could maintain compatibility - as long as it could be translated losslessly to pprof. This wouldn't weaken otel from a tooling standpoint, since as long as we guaranteed that compatiblity and maintained OTLP exporters as well as pprof serializers, users could plug otel profiling data into all the tools that support pprof. However, it would mean that we have yet another standard, since pprof and the new pprof compatible otel standard would exist where previously only pprof existed. So the value in maintaining a stricter compatibility requirement where we have a profiling message type which is a strict extension of the equivalent pprof proto message type is that it provides a route to having only a single standard. I think this is valuable. But what would this actually mean in practice? All opentelemetry OTLP profile exporters would export data which is wrapped in additional messages such that it can't be directly plugged into pprof tools. It seems like we need a way for the pprof |
||||||
|
||||||
The Otel profile format in | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
(nit as per https://github.com/open-telemetry/opentelemetry-specification/blob/v1.33.0/specification/README.md?plain=1#L72) |
||||||
[profiles/v1experimental](opentelemetry/proto/profiles/v1experimental) | ||||||
is derived from | ||||||
[pprof](https://github.com/google/pprof/tree/main/proto) and is currently | ||||||
in experimental state. We would like to continue maintaining compatibility with pprof. | ||||||
All changes to profile proto should meet the following compatibility criteria: | ||||||
|
||||||
- Every valid pprof profile is also a valid Otel profile AND, | ||||||
- Every Otel profile is a valid pprof profile, after any new fields introduced in | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can expand if this is not clear enough. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That second requirement is interesting. What does "valid" mean here? That it doesn't cause a decoding error? Or that it allows the pprof CLI tool to render flame graph? Arguably there isn't much value in being "valid" if it doesn't enable interoperability. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "doesn't cause a decoding error" is not good enough for me. All we would need to accomplish this is to just assign all new field numbers for Otel pprofextended messages and there would be no decoding errors since Protobuf decoder happily ignores unknown fields. This is obviously of no practical value. What I am aiming for here is the following. If we extend profiles in Otel by new capabilities then those new capabilities obviously can't be understood by pprof. However ignoring those new additional capabilities should still result in a meaningful profile from pprof perspective. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should discuss the meaning of 'meaningful' in the next SIG meeting. Taking into account that the scope of pprofextended is larger than pprof, having every OTel profile be a valid and meaningful pprof profile may not be workable. Quick examples that come to mind include missing symbols due to delayed symbolization (see the agenda for recent discussions around standardizing a symbol upload protocol) and deprecation / non-usage of fields that pprof expects to be present (e.g. Sample.location_index). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree, let's discuss in the SIG. This criterion came up in a meeting last week (suggested by @rsc). This is essentially a forward compatibility requirement. Due to how unknown fields are by default handled by Protobuf deserializers existing pprof tooling can read pprofextended with no code changes. There is certain value in this, although I don't know what practical use cases would benefit from this. @aalexand also suggested an alternate criteria: "All Otel pprofextended profiles must be convertible to valid pprof profiles" (the precise meaning of "convertible" TBD). This is a less restrictive criteria that gives more flexibility to Otel profiles, but means if we want existing pprof tools to work with Otel profiles we either need a "converter" tool or need to teach pprof to understand the new Otel profiles directly. There is probably other options that we can discuss. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @aalexand it would be great if you can also join the Profiling SIG on May 30 to discuss the options. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I'll try to join. The main question is basically whether OTel's pprofextended can become a source of truth for existing pprof clients without requiring them to migrate all at once (which is infeasible). For that to happen there needs to be an arbitrarily long period of time where the pprof's subset of pprofextended functionality is semantically equivalent to what pprof has. Today this is not the case - the most prominent example is the change from ID referencing to index referencing, since it's wire and semantics breaking. An alternative is to simply consider pprofextended a fork of pprof where some sort of converter would exist from pprofextended to pprof format for pprof to be able to read pprofextended format. The converter does not have to be a part of pprof in this case, it could be anywhere. I suspect that Go runtime would probably keep using the current pprof format as well. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. pprofextended already uses different names than pprof. Is this considered as conflict? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
If we do this I think we loose the benefits of having one standard. In that case we may as well give Otel a complete freedom to designs its own format. I would like to avoid this.
We should consider 2 aspects of compatibility: wire compatibility and symbolic compatibility. Renaming fields does not impact wire encoding, so if we aim just for wire compatibility renames are fine. However I would argue that we should probably aim for symbolic compatibility and avoid renames, so that we make pprof tooling evolution to newer pprofextended versions easier. In this particular case it may be nice to rename from There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The most important change in this particular case is not just the name change, but the change in meaning: mapping_id in pprof means "the positive value of the Mapping.id field of the referenced mapping or zero if this is a nil reference". mapping_index in pprofextended means "the index of the mapping in the repeated Profile.mappings field" (plus rather undefined currently way to express nil references) This semantics change is going to be fairly hard to deal with in terms of switching existing pprof clients to pprofextended. |
||||||
Otel profile are discarded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this section should also align with
Compatibility With Original pprof
in the data model.The proposed changes would conflict as the original statement is:
I think, it is fine to change these original compatibility requirements, but it should be aligned and adapted correctly.