-
Notifications
You must be signed in to change notification settings - Fork 267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consistently use int64 type for string refs in pprofextended. #560
Conversation
Fixing two fields that used a different type, I don't think there is a good reason to be inconsistent.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but @petethepig should confirm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blocking temporarily until #559 is resolved.
@@ -235,7 +235,7 @@ message Sample { | |||
// Supersedes location_index. | |||
uint64 locations_length = 8; | |||
// A 128bit id that uniquely identifies this stacktrace, globally. Index into string table. [optional] | |||
uint32 stacktrace_id_index = 9; | |||
int64 stacktrace_id_index = 9; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Semi off-topic, but was wondering if it would make sense to change this field. I feel that the space savings of referencing the string table might not be that big and it's not clear what encoding this would have in the string table (base64? ascii? would the string have to be unicode? etc):
- two
uint64
, storing the high and low bits for the stack ID; - a
bytes
field; - if we want to keep the indexed approach, maybe we could move this to a new repeated field in the
Profile
message;
curious on your thoughts here! cc @florianl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using bytes or string would mean another indirection / dynamic allocation for the in-memory representation so I would be careful with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I personally prefer having this as two uint64
s but not sure if everyone would agree with the slightly increase in memory. I am not a protobuf expert, I am assuming that adding the extra uint64
field would increase the size of the message by 10 Bytes if I am understanding the docs right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The concept of a stacktrace_id
is new and google/pprof doesn't know this element. It originates from the original optimyze stateful protocol and helped two communicate how often a trace was seen, while not sending the full stack trace every time. Back then, we went with 128bit as we wanted the stacktrace IDs to be globally unique and reduce collisions. I'm not sure how this field is used by other implementations of the protocol. So having stacktrace_id_index
as a index into the string table is the most flexible way for the moment, I think.
It is just the type of the filed, that needs better alignment. For every index int64
is used (also in google/pprof) and so the current type of uint32
should change and align.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, in that case, should we update the comment and leave the number of bits for the stack id unspecified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how relevant this PR is anymore, after the discussion around the google/pprof donation in the last meeting. There are some cases where the profiling protocol can and should evolve and maybe this is one of them.
@@ -355,7 +355,7 @@ message Location { | |||
bool is_folded = 5; | |||
|
|||
// Type of frame (e.g. kernel, native, python, hotspot, php). Index into string table. | |||
uint32 type_index = 6; | |||
int64 type_index = 6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate of #557
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As #557 got merged, this should no longer be relevant.
I removed my block, the PR can progress. |
@@ -235,7 +235,7 @@ message Sample { | |||
// Supersedes location_index. | |||
uint64 locations_length = 8; | |||
// A 128bit id that uniquely identifies this stacktrace, globally. Index into string table. [optional] | |||
uint32 stacktrace_id_index = 9; | |||
int64 stacktrace_id_index = 9; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This field got removed with #575. Maybe we can close the PR now?
Fixing two fields that used a different type, I don't think there is a good reason to be inconsistent.