Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tracing for SplitManager.getSplits #20194

Conversation

findepi
Copy link
Member

@findepi findepi commented Dec 21, 2023

Sometimes, ConnectorSplitManager.getSplits can take long to construct ConnectorSplitSource. For example, in Delta, there is IO work being done before ConnectorSplitSource is returned. This work would better be delayed until ConnectorSplitSource.getNextBatch is invoked, but currently this is not the case. Let's add tracing so that time spent in ConnectorSplitManager.getSplits is attributable.

Sometimes, `ConnectorSplitManager.getSplits` can take long to construct
`ConnectorSplitSource`. For example, in Delta, there is IO work being
done before `ConnectorSplitSource` is returned. This work would better
be delayed until `ConnectorSplitSource.getNextBatch` is invoked, but
currently this is not the case. Let's add tracing so that time spent in
`ConnectorSplitManager.getSplits` is attributable.
@findepi findepi added the no-release-notes This pull request does not require release notes entry label Dec 21, 2023
@cla-bot cla-bot bot added the cla-signed label Dec 21, 2023
The parameter refers to "a stage span", not to the top level "query
span".
Copy link
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hashhar
Copy link
Member

hashhar commented Dec 21, 2023

cc: @lukasz-stec were we thinking about similar for PageSourceProvider for system tables?

@findepi
Copy link
Member Author

findepi commented Dec 21, 2023

CI failed at TestAccessControl, and #20198 and something more.
#20196 should fix TAC.

@findepi
Copy link
Member Author

findepi commented Dec 21, 2023

thanks @wendigo for restarting the failed builds

@findepi findepi merged commit 506da60 into trinodb:master Dec 21, 2023
87 of 88 checks passed
@findepi findepi deleted the findepi/add-tracing-for-splitmanager-getsplits-35e094 branch December 21, 2023 22:31
@findepi findepi restored the findepi/add-tracing-for-splitmanager-getsplits-35e094 branch December 21, 2023 22:31
@findepi findepi deleted the findepi/add-tracing-for-splitmanager-getsplits-35e094 branch December 21, 2023 22:31
@github-actions github-actions bot added this to the 436 milestone Dec 21, 2023
@lukasz-stec
Copy link
Member

cc: @lukasz-stec were we thinking about similar for PageSourceProvider for system tables?

@hashhar I wasn't thinking about that but yes, it looks useful. That said tracing is already there for most of stuff via TracingMetadata, it just not linked to the parent span correctly I believe.
One interesting thing, not strictly related to tracing, is it looks like in some cases, ConnectorPageSource can be created not during Operator.addInput so we will miss Operator CPU metrics from this operation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed no-release-notes This pull request does not require release notes entry
Development

Successfully merging this pull request may close these issues.

6 participants