Add `pb_read` and `pb_write` functions #115

tanho63 · 2023-12-27T23:37:07Z

Closes #97.

I thought briefly about making this a wrapper around `pb_download_url` plus a read function that accepts URLs, but I don't think that had the flexibility I wanted. I also ran into issues downloading from private repositories, which I later learned stemmed from not being able to pass an auth token to it.

I think this is the most flexible approach to the problem, but would love to hear any thoughts.

cboettig

This strategy (specifically `guess_read_function()`) always makes me nervous. While all abstractions are a bit leaky, I've always found attempts to write a generic "read" function as an abstract method particularly leaky. There's just too much going on in `read_` methods to abstract away (what about other data serializations, like spatial formats? what about lazy reads / remote reads, etc.?).

I'm fine with a convenience wrapper that makes common use patterns more concise. Maybe all that's needed is a bit more documentation saying something like 'for common formats such as ...', along with advice about how to bypass this convenience if a user prefers their own read function (say, from the `readr` package, or for some other format).
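To make the trade-off concrete, here is a minimal sketch of extension-based dispatch with an explicit escape hatch. It is not piggyback's actual `guess_read_function()`; the names `guess_reader` and `read_any` are hypothetical, and only base R readers are covered. It reads a local temporary file rather than a release asset.

```r
# Hypothetical helper, NOT piggyback's guess_read_function():
# map a file extension to a base-R reader, and fail loudly otherwise.
guess_reader <- function(file) {
  ext <- tolower(tools::file_ext(file))
  readers <- list(
    csv = utils::read.csv,
    tsv = utils::read.delim,
    rds = readRDS
  )
  reader <- readers[[ext]]
  if (is.null(reader)) {
    stop("No default reader for extension '", ext,
         "'; supply your own read function.")
  }
  reader
}

# Convenience path with an escape hatch: callers can always pass
# their own reader instead of relying on the guess.
read_any <- function(file, read_function = guess_reader(file), ...) {
  read_function(file, ...)
}

# Usage with a local temporary file.
tmp <- tempfile(fileext = ".csv")
utils::write.csv(data.frame(x = 1:3, y = c("a", "b", "c")), tmp,
                 row.names = FALSE)

guessed <- read_any(tmp)                             # dispatches to read.csv
custom  <- read_any(tmp, read_function = readLines)  # bypasses the guess
```

An unknown extension (a spatial format, say) stops with an error instead of guessing, so the user is pushed toward supplying their own reader.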

tanho63 · 2023-12-29T19:31:00Z

> too much going on in read_ methods to abstract away (what about other data serializations, like spatial formats?

I believe this will fail on spatial formats, since they would not be one of csv, tsv, rds, or parquet. (I'm unfamiliar with how geoparquet works and whether `arrow::read_parquet` will process geoparquet, but I assume yes?)

> what about lazy reads / remote reads etc?

Yep, this will read eagerly by design/default, and maybe that's a bad thing for folks who should be thinking about optimizing. However, uninformed users would currently use `pb_download` anyway, so it's not necessarily much different from that?

I agree with improving the docs, e.g.

- explaining that cloud native is best performance (more clearly linking to vignette)
- demonstrating ways to read from URL (maybe adding to getting started vignette?)
- better demonstrating examples of passing in a different read function and explaining what is supported?
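As a rough illustration of the last point, passing a different read function might look like the sketch below. The wrapper name `read_release_csv` is hypothetical. The `read_function` argument is the one this PR documents, while the `file`, `repo`, and `tag` argument names are assumed from piggyback's existing conventions. The choice of `readr::read_csv` is only an example, and nothing is downloaded until the wrapper is called.

```r
# Sketch only: a thin wrapper that overrides the guessed reader.
# `file`, `repo`, and `tag` are supplied by the caller; the argument
# names passed to pb_read() are assumptions, not verified here.
read_release_csv <- function(file, repo, tag = "latest") {
  piggyback::pb_read(
    file = file,
    repo = repo,
    tag = tag,
    read_function = readr::read_csv  # swap in any reader you prefer
  )
}
```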

tanho63 · 2023-12-30T20:33:54Z

Flow state hit me like a bus, many apologies for this PR running away from me. Since your last review (diff), I:

- gitignored + got rid of the stored docs/ folder, because ropensci builds pkgdown externally + I've been using it to test out how the vignette looks after generating it
- updated DESCRIPTION with the newly shortened description from the readme (because I realized it hadn't been updated)
- fleshed out man files for the `read_function` and `write_function` args, and added man files for `guess_read_function` and `guess_write_function` that fully explain how the mapping works
- added various `\dontshow` blocks to hide the `interactive()` and `try()` blocks, inspired by `@examplesIf`
- updated README to not evaluate any chunks except for regenerating codemeta (improves syntax highlighting in RStudio to use {r} instead of just r)
- rewrote vignettes/piggyback.Rmd again to try and address your various feedback points - this is probably the main thing that needs to be looked at

cboettig

looks good!

tanho63 requested a review from cboettig December 27, 2023 23:37

cboettig requested changes Dec 29, 2023

tanho63 mentioned this pull request Dec 29, 2023

Updates to README and vignettes #112

Merged

tanho63 force-pushed the tan/pb-rw/97 branch from 3edf4e8 to e3c07b8 Compare December 29, 2023 19:43

tanho63 added 5 commits December 29, 2023 23:31

add pb_read/pb_write and tests

7b6faa7

bump news.md

8448e4e

typos

3d997b1

bugfix example for pb_write

9fee352

stash work on vignette

6ff7768

tanho63 force-pushed the tan/pb-rw/97 branch from cd41835 to 6ff7768 Compare December 30, 2023 04:43

improve pb_read and pb_write documentation

21a4737

tanho63 force-pushed the tan/pb-rw/97 branch from 0042f61 to 21a4737 Compare December 30, 2023 05:02

tanho63 commented Dec 30, 2023

R/pb_read.R

tanho63 commented Dec 30, 2023

R/pb_read.R

tanho63 commented Dec 30, 2023

R/pb_write.R

tanho63 commented Dec 30, 2023

vignettes/piggyback.Rmd

tanho63 added 7 commits December 30, 2023 11:13

update pkgdown and ignore future local pkgdown builds

f0c5c8c

remove git-tracking of pkgdown since ropensci builds it separately

cf4c45e

switch to standard rmarkdown chunks with eval=FALSE

a0e3f1b

shorten DESCRIPTION description based on new readme

38193e1

reknit readme

de822e3

start adding section on reading in URLs

6164702

relocate dots to before named args

b595bbd

tanho63 commented Dec 30, 2023

vignettes/piggyback.Rmd (outdated)

tanho63 commented Dec 30, 2023

vignettes/piggyback.Rmd

try to align diffs

d9c7bec

tanho63 requested a review from cboettig December 30, 2023 18:47

tanho63 added 2 commits December 30, 2023 14:29

add pb_read section

ddb60d7

add pb_write and edits to pb_upload

8f8d08d

cboettig approved these changes Dec 30, 2023

tanho63 merged commit 4589222 into master Dec 30, 2023
6 checks passed

tanho63 deleted the tan/pb-rw/97 branch December 30, 2023 23:26
