write up design principles for hoad #249

maxheld83 · 2020-07-15T13:20:53Z

It just occurred to me during the call with @kjgarza that it might be a good idea to write down the draft design principles for hoad that we've been talking about.

There are three levels of user/target segmentation, which correspond to three levels of our code.

1. Distributed in-memory database.
   This database should be as generic as possible, in the extreme case just duplicating the Crossref coverage, but with much better performance and support for arbitrary SQL/dplyr queries.
   - Target: Analysts (us).
   - Code:
     - setup of the database (currently Google BigQuery, maybe Azure Synapse)
     - batch jobs to seed the db with dumps and incremental updates
     - example queries
2. Domain-specific APIs.
   Opinionated queries against 1 to yield domain-specific data objects (that fit into laptop memory).
   A set of (multiple!) tidy data frames that make sense for hybrid open access uptake analysis, i.e. make it possible to run the plots/analyses in 3.
   - Target: R users interested in hybrid OA.
   - Code:
     - dplyr/SQL queries against 1
     - additional on-client data wrangling
     - assertions and tests
3. Dashboard.
   Views on the data in 2 to answer our business questions.
   - Target: HOAD project stakeholders.
   - Code:
     - plots (those are also part of the package proper)
     - dashboard (maybe modules are also part of the package)
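
To make the layering concrete, here is a rough sketch of how the three levels could depend on each other. It is only an illustration under loud assumptions: it is written in Python with the standard-library `sqlite3` module standing in for the real warehouse (the actual code would be R/dplyr against BigQuery), and the table `works`, its columns, and the functions `seed_database`, `oa_uptake` and `render_summary` are all hypothetical names, not anything that exists in hoad.

```python
import sqlite3


# Level 1: a generic store. sqlite3 is only a stand-in for the real
# warehouse; the table and column names are hypothetical.
def seed_database(rows):
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE works (doi TEXT, journal TEXT, year INTEGER, is_oa INTEGER)"
    )
    con.executemany("INSERT INTO works VALUES (?, ?, ?, ?)", rows)
    return con


# Level 2: an opinionated query against level 1 that returns a small,
# tidy result (one row per journal and year) that fits in client memory.
def oa_uptake(con):
    cur = con.execute(
        """
        SELECT journal, year, COUNT(*) AS n_works, SUM(is_oa) AS n_oa
        FROM works
        GROUP BY journal, year
        ORDER BY journal, year
        """
    )
    result = [
        {"journal": j, "year": y, "n_works": n, "n_oa": oa, "oa_share": oa / n}
        for j, y, n, oa in cur
    ]
    # Assertion on the domain object, in the spirit of "assertions and tests".
    assert all(0.0 <= r["oa_share"] <= 1.0 for r in result)
    return result


# Level 3: a view on the level 2 data; a text line stands in for a plot.
def render_summary(uptake):
    return [
        f"{r['journal']} {r['year']}: {r['n_oa']}/{r['n_works']} OA"
        for r in uptake
    ]


# Made-up toy rows, for illustration only (not real Crossref data).
toy_rows = [
    ("10.0000/a", "Journal A", 2019, 1),
    ("10.0000/b", "Journal A", 2019, 0),
    ("10.0000/c", "Journal B", 2020, 1),
]
con = seed_database(toy_rows)
uptake = oa_uptake(con)
summary = render_summary(uptake)
```

The point of the sketch is the direction of the dependencies: level 3 only ever sees the tidy objects from level 2, and level 2 is the only place that knows how to query level 1.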

maxheld83 · 2020-07-15T13:21:28Z

This is just quickly jotted down; it should live in the repo somewhere.
