
FTheoryTools: Use Zenodo data as artifact #4423

Status: Open. Wants to merge 7 commits into base: master.
Conversation

@HereAround (Member) commented Jan 7, 2025

The tarball provided at https://zenodo.org/records/14611045 (version 2) can be downloaded. It contains two .mrdi files, which are the OSCAR serializations of two F-theory models. Creating these two models by loading the .mrdi files takes about 5 to 10 minutes each, whereas computing them from scratch takes roughly half an hour and 3 to 4 hours, respectively. That is a stark improvement, which we should use.

@benlorenz The tarball at Zenodo is about 400 MB in size. I would love to add a test for this artifact and for creating the model from it. However, it seems to me that this would burn too many resources. Any ideas or suggestions?

cc @emikelsons @apturner
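
For context, Julia's Pkg artifact system binds such a download to content hashes in an `Artifacts.toml` file, and lazy artifacts are fetched only on first use. A minimal sketch of such a binding (the artifact name, both hashes, and the file name below are placeholders for illustration, not the actual entries of this PR):

```toml
# Artifacts.toml -- illustrative sketch; name and all hashes are placeholders
[ftheory_models]
git-tree-sha1 = "0000000000000000000000000000000000000000"
lazy = true  # download only when the artifact is first accessed

    [[ftheory_models.download]]
    url = "https://zenodo.org/records/14611045/files/data.tar.gz"
    sha256 = "0000000000000000000000000000000000000000000000000000000000000000"
```

In package code, the directory of the unpacked tarball is then available via the `artifact"ftheory_models"` string macro from the Artifacts standard library.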

@HereAround HereAround marked this pull request as draft January 7, 2025 17:33
@HereAround HereAround added WIP NOT ready for merging topic: FTheoryTools labels Jan 7, 2025

codecov bot commented Jan 7, 2025

Codecov Report

Attention: Patch coverage is 20.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 84.36%. Comparing base (264490b) to head (bd335c9).
Report is 3 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| .../FTheoryTools/src/AbstractFTheoryModels/methods.jl | 33.33% | 2 Missing ⚠️ |
| .../FTheoryTools/src/LiteratureModels/constructors.jl | 0.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4423      +/-   ##
==========================================
+ Coverage   84.34%   84.36%   +0.02%     
==========================================
  Files         663      663              
  Lines       87788    87804      +16     
==========================================
+ Hits        74042    74075      +33     
+ Misses      13746    13729      -17     
| Files with missing lines | Coverage Δ |
| --- | --- |
| .../FTheoryTools/src/AbstractFTheoryModels/methods.jl | 77.94% <33.33%> (-0.40%) ⬇️ |
| .../FTheoryTools/src/LiteratureModels/constructors.jl | 95.08% <0.00%> (+1.23%) ⬆️ |

... and 5 files with indirect coverage changes

@HereAround HereAround force-pushed the AutomaticallyDownloadBigModelFile branch from 4ed293e to f718b36 Compare January 7, 2025 19:57
@HereAround HereAround marked this pull request as ready for review January 7, 2025 20:05
Co-authored-by: Lars Göttgens <lars.goettgens@gmail.com>
@fingolfin fingolfin changed the title FTheoryTools: Use Zenodo data as artifcat FTheoryTools: Use Zenodo data as artifact Jan 8, 2025
Review thread on Artifacts.toml (outdated, resolved)
@fingolfin fingolfin removed the triage label Jan 8, 2025
aaruni96 and others added 2 commits January 8, 2025 14:10
Co-authored-by: Lars Göttgens <lars.goettgens@gmail.com>
@aaruni96 (Member) commented Jan 8, 2025

@benlorenz can you check whether the new job step I added (extra-large-test, line 123) makes sense for trying to implement what you suggested during triage?

@benlorenz (Member) commented

> @benlorenz can you check whether the new job step I added (extra-large-test, line 123) makes sense for trying to implement what you suggested during triage ?

Yes, that is what I had in mind. But I see a small issue: this step will still run in the default group when the tests are run locally without a testset. I will do a small refactoring of these subsets and push another change in a few minutes.
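
The general pattern under discussion here (the step name, matrix key, and environment variable below are hypothetical sketches, not the PR's actual configuration) is to fence the expensive artifact test into a dedicated CI job step that only runs when selected explicitly:

```yaml
# .github/workflows/CI.yml -- illustrative sketch only
- name: extra-large-test
  if: matrix.group == 'extra-large'
  env:
    OSCAR_TEST_SUBSET: "extra-large"
  run: julia --project -e 'using Pkg; Pkg.test(test_args=["extra-large"])'
```

The test harness would then include the long-running testset only when the corresponding test argument or environment variable is present, which also keeps it out of the default group for local runs.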

Review thread on .github/workflows/CI.yml (outdated, resolved)
@fingolfin (Member) commented
A note on the serialization performance (repeating some of what I said during triage): I got the following for pure JSON parsing of the data:

julia> @time JSON.parsefile("1511-03209.mrdi");
  3.564484 seconds (77.98 M allocations: 3.459 GiB, 47.63% gc time, 0.09% compilation time)

julia> @time JSON.parsefile("1511-03209-resolved.mrdi");
 14.119739 seconds (188.79 M allocations: 9.096 GiB, 46.01% gc time)

@HereAround what are the timings on your computer?

I hope that means we indeed have a chance to get acceptable loading performance for these files. I would hope for something below a minute.

@fingolfin (Member) commented

ping @antonydellavecchia

@HereAround (Member, Author) commented

> A note on the serialization performance (repeating some of what I said during triage): I got the following for pure JSON parsing of the data:
>
> julia> @time JSON.parsefile("1511-03209.mrdi");
>   3.564484 seconds (77.98 M allocations: 3.459 GiB, 47.63% gc time, 0.09% compilation time)
>
> julia> @time JSON.parsefile("1511-03209-resolved.mrdi");
>  14.119739 seconds (188.79 M allocations: 9.096 GiB, 46.01% gc time)
>
> @HereAround what are the timings on your computer?
>
> I hope that means we have indeed a chance to get the performance for loading these -- I'd be hoping for something below a minute.

I got the following:

julia> @time JSON.parsefile("1511-03209.mrdi");
  8.901444 seconds (77.98 M allocations: 3.434 GiB, 65.55% gc time, 0.16% compilation time)

julia> @time JSON.parsefile("1511-03209-resolved.mrdi");
 65.112687 seconds (188.39 M allocations: 9.041 GiB, 88.51% gc time)
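
One observation on these numbers (my own remark, not taken from the thread): in all four measurements the GC fraction dominates (65% and 88% here, 47% and 46% above), so the cost appears to be mostly allocation pressure rather than raw parsing work. If that hypothesis holds, giving the process a larger heap target via Julia's `--heap-size-hint` option (available since Julia 1.9) might already narrow the gap:

```
julia --heap-size-hint=8G -e 'using JSON; @time JSON.parsefile("1511-03209-resolved.mrdi")'
```

The 8G value is a guess for illustration; the right setting depends on the available RAM of the machine.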

benlorenz and others added 2 commits January 8, 2025 17:09
Co-authored-by: Lars Göttgens <lars.goettgens@gmail.com>
@HereAround (Member, Author) commented
Generally speaking, I am happy with this PR. Ready for review.

@HereAround (Member, Author) commented
Thank you for your help @benlorenz and @aaruni96 !

@lgoettgens lgoettgens removed the WIP NOT ready for merging label Jan 8, 2025