add_dataset_file error #116

Closed
Danny-dK opened this issue Mar 24, 2022 · 10 comments

Comments

@Danny-dK
Contributor

Danny-dK commented Mar 24, 2022

In a previous issue #82 (comment) I indicated issues with adding dataset files. I posted the code that was working #82 (comment)

After many months I'm now trying to run the exact same code, but I'm getting an error at the last step, when adding a dataset file with add_dataset_file(). The error thrown is Bad Request (HTTP 400). Failed to Error in parsing provided json. Everything up to that point works, including creating the dataset and retrieving the DOI; only uploading files fails. Does anyone know what may have changed between March 2021 and now?

And again, the equivalent curl-style request in R (via httr) works:


headers = c(
  `X-Dataverse-key` = 'xxxxxxxxxxxxxxxxxxxxxxxxxxx'
)

params = list(
  `persistentId` = 'doi:10.80227/test-YDCZ1J',
  `version` = 'DRAFT'
)

files = list(
  `file` = httr::upload_file('D:/parttwo.txt')
)

res <- httr::POST(url = 'https://demo.dataverse.nl/api/datasets/:persistentId/add', httr::add_headers(.headers=headers), 
                  query = params, body = files)


Windows 10
R 4.1.2
RStudio 2021.9.1.372
dataverse (R package) 0.3.10

@Danny-dK
Contributor Author

Danny-dK commented Mar 24, 2022

It seems to be an error in how the description is handled? If I pass a description to add_dataset_file(), it works.

So this does not work:

f <- add_dataset_file(dataset = 'doi:10.80227/test-YDCZ1J&version=DRAFT', file = 'D:/partone.txt')

But this does work:

f <- add_dataset_file(dataset = 'doi:10.80227/test-YDCZ1J&version=DRAFT', file = 'D:/partone.txt', description = 'text')

Within the function add_dataset_file() I see:

function (file, dataset, description = NULL, key = Sys.getenv("DATAVERSE_KEY"), 
  server = Sys.getenv("DATAVERSE_SERVER"), ...) 
{
  dataset <- dataset_id(dataset, key = key, server = server, 
    ...)
  bod2 <- list()
  if (!is.null(description)) {
    bod2$description <- description
  }
  jsondata <- as.character(jsonlite::toJSON(bod2, auto_unbox = TRUE))
  u <- paste0(api_url(server), "datasets/", dataset, "/add")
  r <- httr::POST(u, httr::add_headers(`X-Dataverse-key` = key), 
    ..., body = list(file = httr::upload_file(file), jsonData = jsondata), 
    encode = "multipart")
  httr::stop_for_status(r, task = httr::content(r)$message)
  out <- jsonlite::fromJSON(httr::content(r, "text", encoding = "UTF-8"))
  out$data$files$dataFile$id[1L]
}

Could it be that when description is not supplied (so bod2 stays empty), jsondata ends up as an empty [], which might not be accepted when it is sent in the httr::POST() call?
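One way to check this guess without touching the server is to run just the JSON-building lines from the function above. This is a minimal sketch that reuses the same jsonlite call; the variable names empty_json and described_json are only for illustration.

```r
library(jsonlite)

# Same construction as in add_dataset_file() when description is NULL
bod2 <- list()
empty_json <- as.character(jsonlite::toJSON(bod2, auto_unbox = TRUE))

# Same construction when a description is supplied
bod2$description <- "text"
described_json <- as.character(jsonlite::toJSON(bod2, auto_unbox = TRUE))

print(empty_json)      # the empty, unnamed list is serialized as a JSON array
print(described_json)  # the named list is serialized as a JSON object
```

If the first string is an array and the second an object, the two calls really do send differently shaped jsonData, which would fit the behaviour seen with and without a description.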

@kuriwaki
Member

Does it work on CRAN version 0.3.9? devtools::install_version("dataverse", version = "0.3.9")

@Danny-dK
Contributor Author

Nope. I also tried version 0.3.0, which worked last year. So I would assume something changed in Dataverse itself, and that it may no longer accept empty JSON? The curl-style script also doesn't seem to need any JSON in order to upload.

@pdurbin
Member

pdurbin commented Mar 28, 2022

I assume that so far we're talking about if anything has changed in the dataverse package on CRAN.

Is there a chance that something changed on the server side? Have you been running the same version of Dataverse on the server this whole time? If not, do you know what version you were running and which version you upgraded to?

@Danny-dK
Contributor Author

That was my question as well, since the 0.3.0 package also shows this issue (and as far as I can tell the code is not that different in that specific part). It happens on both demo.dataverse.nl (v5.6) and demo.dataverse.org (v5.10). To reproduce the problem, run the example in your README at https://github.com/IQSS/dataverse-client-r#data-archiving. If you're not seeing the same thing happen, then it must be something different.

@Danny-dK
Contributor Author

Not sure if this has anything to do with it (I'm not that great in R, so sorry if it does not help). The code for add_dataset_file() (also shown above) uses:

bod2 <- list()
if (!is.null(description)) {
  bod2$description <- description
}
jsondata <- as.character(jsonlite::toJSON(bod2, auto_unbox = TRUE))

If description is left as NULL, the jsondata object from jsondata <- as.character(jsonlite::toJSON(bod2, auto_unbox = TRUE)) comes out as '[]'. If I use that in the curl-style script, the upload fails:

library(curl)
library(httr)

headers = c(
  `X-Dataverse-key` = 'xxxxxxxxxxxxxxxxxxxxx'
)

params = list(
  `persistentId` = 'doi:10.80227/test-PXCGQL',
  `version` = 'DRAFT'
)

files = list(
  `file` = upload_file('D:/part4.txt'),
  `jsonData` = '[]'
)

res <- httr::POST(url = 'https://demo.dataverse.nl/api/datasets/:persistentId/add', httr::add_headers(.headers=headers), 
                  query = params, body = files)

But if I use '{}' it uploads without problem. So it would seem the Dataverse server does allow empty jsonData, but not as '[]'. Maybe that is the issue?

@pdurbin
Member

pdurbin commented Mar 30, 2022

@Danny-dK hmm, if you can reproduce a bug with using [] in jsonData using command line curl, please open an issue at https://github.com/IQSS/dataverse/issues . You are welcome to test against https://demo.dataverse.org

@Danny-dK
Contributor Author

@pdurbin I did, but I was too hasty. I posted the R version, but then started experimenting with curl in the Windows command prompt. There the curl command does accept empty JSON as either '[]' or '{}'. So now I'm thinking it is an R thing, maybe the R curl package or httr. I'll try installing httr and curl as they were in March 2021 and see if that is where the issue lies.

In cmd prompt it works:

curl -H X-Dataverse-key:xxxxxxxxxxxxxxxxxxxxxxxxxxxxx -X POST -F file=@"D:\part4.txt" -F 'jsonData="[]"' "https://demo.dataverse.org/api/datasets/:persistentId/add?persistentId=doi:10.70122/FK2/S0LXD9&version=DRAFT"

@Danny-dK
Contributor Author

Danny-dK commented Mar 31, 2022

I tried various versions and none seemed to work. I don't know why there is a difference. Just to confirm: are you seeing the same error when you run your own example at https://github.com/IQSS/dataverse-client-r#data-archiving ?

But this adjustment to add_dataset_file() might fix it.

Instead of:

bod2 <- list()
if (!is.null(description)) {
  bod2$description <- description
}

jsondata <- as.character(jsonlite::toJSON(bod2, auto_unbox = TRUE))

this might work:

bod2 <- NULL
if (!is.null(description)) {
  bod2$description <- description
}

jsondata <- as.character(jsonlite::toJSON(bod2, auto_unbox = TRUE))

For both versions, the result when description <- 'test' is "{\"description\":\"test\"}". But when description is NULL, the original code produces '[]', while the suggested change produces '{}', which should upload without problem. I tested it and it works. I'll create a pull request (if I can figure out how).
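To make the suggested change easy to try in isolation, here is a minimal sketch that wraps it in a small helper. The name build_json_data is hypothetical and not part of the dataverse package; the body is just the adjusted lines from above.

```r
library(jsonlite)

# Hypothetical helper (not part of the dataverse package): builds the
# jsonData string the way the suggested change to add_dataset_file() would.
build_json_data <- function(description = NULL) {
  # Starting from NULL instead of list() is the suggested change
  bod2 <- NULL
  if (!is.null(description)) {
    # Assigning into NULL with $<- turns bod2 into a named list
    bod2$description <- description
  }
  as.character(jsonlite::toJSON(bod2, auto_unbox = TRUE))
}

no_description <- build_json_data()
with_description <- build_json_data("test")

print(no_description)    # an empty JSON object rather than an empty array
print(with_description)  # same output as the original code for this case
```

The string returned by the helper is what would be passed as jsonData in the multipart body of httr::POST(), so only the no-description case changes.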

@kuriwaki kuriwaki linked a pull request Apr 7, 2022 that will close this issue
@kuriwaki kuriwaki removed a link to a pull request Apr 9, 2022
kuriwaki added a commit that referenced this issue Apr 9, 2022
Fixes #116 when merged in master, add_dataset_file() to in 0.3.10
@Danny-dK
Contributor Author

Danny-dK commented Apr 9, 2022

issue resolved in pull request (for now).
(feel free to reopen if required)

@Danny-dK Danny-dK closed this as completed Apr 9, 2022
This was referenced Apr 9, 2022
3 participants