Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HRQB 29 - Improve Quickbase upsert reporting #100

Merged
merged 2 commits into from
Jul 12, 2024
Merged

Conversation

ghukill
Copy link
Collaborator

@ghukill ghukill commented Jul 11, 2024

Purpose and background context

Formerly, each QuickbaseUpsertTask (tasks that actually upsert data to Quickbase) would individually log metrics about modified records. This made it difficult to quickly get a holistic picture of the run; how many tables had how many updates. The methods were also logging oriented vs data oriented, which made working with that data in other contexts tricky.

How this addresses that need:

  • Utilize luigi Events and Callbacks to pinpoint tasks only from the current run
  • Shift parsing up upsert results to QBClient
  • Add method HRQBPipelineTask.aggregate_upsert_results() to aggregate all upsert results, across all Load tasks that fired during the run

By utilizing luigi Events, this also paves the way for better task data cleanup, if we want to only pinpoint tasks that ran during the last run (though not yet implemented).

Includes new or updated dependencies?

NO

Changes expectations for external applications?

YES: updated logging structure

What are the relevant tickets?

Developer

  • All new ENV is documented in README
  • All new ENV has been added to staging and production environments
  • All related Jira tickets are linked in commit message(s)
  • Stakeholder approval has been confirmed (or is not needed)

Code Reviewer(s)

  • The commit message is clear and follows our guidelines (not just this PR message)
  • There are appropriate tests covering any new functionality
  • The provided documentation is sufficient for understanding any new functionality introduced
  • Any manual tests have been performed or provided examples verified
  • New dependencies are appropriate or there were no changes

Why these changes are being introduced:

Formerly, each QuickbaseUpsertTask would individually log how metrics
about modified records.  This made it difficult to quickly get a holistic
picture of the run.  The methods were also logging oriented vs data oriented,
which made working with that data tricky.

How this addresses that need:
* Utilize luigi Events to pinpoint tasks only from the current run
* Shift parsing up upsert results to QBClient
* Add method HRQBPipelineTask.aggregate_upsert_results() to aggregate
all upsert results, across all Load tasks that fired during the run

By utilizing luigi Events, this also paves the way for better task
data cleanup, if we want to only pinpoint tasks that ran during the
last run (though not yet implemented).

Side effects of this change:
* All upsert results are written to logs in an aggregate form

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/HRQB-20
for task, target in list(zip(self.deps(), self.input(), strict=True))
}


@HRQBTask.event_handler(luigi.Event.SUCCESS)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an example of the Luigi Events/Callbacks. They allow registering on a particular task class, and then subscribing to particular events, in this case a successful run.

Copy link

@ehanson8 ehanson8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, one suggestion

Comment on lines 216 to 221
@property
def parse_upsert_counts(self) -> dict | None:
if self.target.exists():
return QBClient.parse_upsert_results(self.target.read())
return None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A docstring might help to clarify this property

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added! Good call.

@ghukill ghukill merged commit ed392aa into main Jul 12, 2024
3 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants