Skip to content

Commit

Permalink
Fix checksum verification
Browse files Browse the repository at this point in the history
To investigate the intermittent checksum errors, I added some code to print out stats on the downloaded package index file when it encounters a checksum error. What stood out was that, whenever the checksum failed, the byte size written would always be slightly off from the byte size expected. E.g.;

```
release_url: http://ports.ubuntu.com/ubuntu-ports/dists/noble-security/InRelease
url: http://ports.ubuntu.com/ubuntu-ports/dists/noble-security/universe/binary-arm64/by-hash/SHA256/184473504b3e97aec550d1bb687b0871e8113f60ab7e8ff136f60f4ecdca9b20
uncompressed size: 1461506
compressed size:   365359
expected size:     365376
```

These errors also often go away when the automation is re-run. This would indicate that there are bytes not being consumed during the `async_copy(reader, writer)` process`. After several different attempts at forcing the remaining bytes to write, there are two changes that seemed to provide stability:

- Removing the `AsyncBufWriter`

    Since we need an `AsyncBufReader` in front of the `GzipDecoder`, we already are buffering the reads into large chunks before sending them to the writer so why not just write directly to the file instead of passing it through a buffered writer?

- Setting `reader.multiple_members(true)`

    You can gzip compress multiple files into a single stream and many gzip readers decode these automatically. Here though, the gzip content is not comprised of multiple files so why would setting this flag cause all the bytes to write? I suspect that, due to how the read process works, when the end of the stream is reached, an extra read is performed to check if there are more entries in the stream and that somehow triggers a flush.

Surprisingly, trying to explicitly `flush` or `shutdown` the `AsyncBufWriter` did not provide the expected stability for correcting these checksum issues. And I'm still not 100% convinced the instability is completely fixed but, so far, this change seems like a reasonable thing to try and observe.
  • Loading branch information
colincasey committed Oct 10, 2024
1 parent 85c8fe6 commit f61de3f
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/create_package_index.rs
Original file line number Diff line number Diff line change
Expand Up @@ -320,7 +320,7 @@ async fn get_package_list(
CreatePackageIndexError::WritePackagesLayer(package_index_url_path, e)
})?;

let mut response = client
let response = client
.get(&package_release_url)
.send()
.await
Expand Down

0 comments on commit f61de3f

Please sign in to comment.