fix for turbo and bumper not being used in TemplateExpression #399

Merged
3 commits merged into master from pass-eval-through-template on Jan 3, 2025

Conversation

MilesCranmer (Owner) commented:

cc @gm89uk - should fix the speed issue you noted over in the pysr discussions page

github-actions bot (Contributor) commented Jan 1, 2025

Benchmark Results

Benchmark                               master              0609848...          master/0609848d9a8691...
search/multithreading                   17.3 ± 0.94 s       16.2 ± 1.4 s        1.07
search/serial                           29.9 ± 1.2 s        29.1 ± 0.4 s        1.03
utils/best_of_sample                    2.1 ± 1.4 μs        1.68 ± 1.4 μs       1.25
utils/check_constraints_x10             11.1 ± 2.8 μs       11 ± 2.8 μs         1
utils/compute_complexity_x10/Float64    2.08 ± 0.12 μs      2.07 ± 0.11 μs      1
utils/compute_complexity_x10/Int64      2.04 ± 0.11 μs      2.01 ± 0.11 μs      1.02
utils/compute_complexity_x10/nothing    1.48 ± 0.1 μs       1.45 ± 0.13 μs      1.02
utils/insert_random_op_x10              5.86 ± 1.8 μs       5.89 ± 1.8 μs       0.995
utils/next_generation_x100              0.426 ± 0.095 ms    0.425 ± 0.061 ms    1
utils/optimize_constants_x10            0.0368 ± 0.0082 s   0.0348 ± 0.0081 s   1.06
utils/randomly_rotate_tree_x10          5.32 ± 0.6 μs       5.23 ± 0.58 μs      1.02
time_to_load                            1.89 ± 0.072 s      1.89 ± 0.0099 s     0.997

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions" -> "Benchmark a pull request" -> [the most recent run] -> "Artifacts" (at the bottom).

gm89uk commented Jan 1, 2025

Thanks @MilesCranmer for fixing this so quickly. I will do a long run overnight with this and report back.
I'm looking forward to the next merge to run parametric template expressions with bumper and turbo!

Presumably no change in code is required here, other than setting bumper and turbo to true in the SRRegressor?
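For example, just something like this, if I understand right (a sketch only; the rest of my setup unchanged):

# Sketch: just turning on the two flags on the existing SRRegressor setup.
using SymbolicRegression

model = SRRegressor(
    binary_operators=[+, -, *, /],
    expression_type=TemplateExpression,
    expression_options=(; structure),  # my existing TemplateStructure
    turbo=true,    # LoopVectorization-based fast evaluation kernels
    bumper=true,   # Bumper.jl bump-allocated evaluation buffers
)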

Thanks again

gm89uk commented Jan 3, 2025

Thanks @MilesCranmer, I tested it out and it is quite a bit quicker than before! However, over a long run, TemplateExpression remains considerably slower than a custom loss function.

Further thoughts:
With the monotonicity check, TemplateExpression is definitely quicker than loss_fnc to begin with, but then slows down (although not quite as much as before).

What I think is going on:
Custom loss function: eval_grad_tree_array calculates the gradient for all variables (plus the prediction) in a single tree traversal (I think), using Zygote for the forward-pass derivative.
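i.e. roughly this (a sketch; tree, X, and options as in the loss function below):

# One call, one traversal: prediction together with the derivative
# with respect to every input variable.
prediction, grad, complete = eval_grad_tree_array(tree, X, options; variable=true)
# size(grad) == (n_features, n_rows); grad[i, :] is d(prediction)/d(x_i)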

TemplateExpression code: o = f(...) performs an eval_tree_array, then, after a very fast symbolic differentiation, repeats eval_tree_array for the first derivative with respect to each variable.
So initially the code might be doing one eval_tree_array, quickly failing the monotonicity check and exiting early for a faster runtime. Then, as the expressions evolve toward ones that are monotonic in variables 1-3, the full structure code is processed and eval_tree_array loops a further 3 times.

Therefore, at later stages of the run, you are comparing the overhead of calling eval_grad_tree_array once in the custom loss function vs. eval_tree_array four times in the TemplateExpression (once for o = f(…) and then once per variable in the monotonicity check loop with d(...).x).
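To make the counting concrete (illustrative accounting only, not measured; the 3 comes from the three variables in my monotonicity check):

# Rough per-candidate cost, once candidates stop failing the check early:
traversals_custom_loss = 1        # one eval_grad_tree_array call
traversals_template    = 1 + 3    # o = f(...) plus one derivative pass per checked variable
traversals_template / traversals_custom_loss   # => 4.0, ~4x the evaluation work per candidate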

In summary, for first derivatives over multiple variables, eval_grad_tree_array is probably more efficient than my structure code, and I cannot think of a faster way to do this within a TemplateExpression, which is fine for most cases. It's just that my dataset is relatively large at 7000+ rows (with batching set to 50).

However, interestingly, I ran stripped-down code without any monotonicity checks and found the custom loss function to run about twice as fast as the TemplateExpression:

Running the stripped-down code:
Template Expression

using SymbolicRegression  # for SRRegressor, TemplateExpression, TemplateStructure
using Statistics  # for mean (used by the loss function version below)

# Trivial structure: a single sub-expression f over all six features.
structure = TemplateStructure{(:f,)}(
  ((; f), (x1, x2, x3, x4, x5, x6)) -> begin
    o = f(x1, x2, x3, x4, x5, x6)
    return o
  end
)
model = SRRegressor(
    niterations=1000000,
    binary_operators=[+, -, *, /],
    maxsize=60,
    bumper=true,
    turbo=true,
    populations=18,
    expression_type=TemplateExpression,
    expression_options=(; structure),
    population_size=100,
    parsimony=0.01,
    batching=true,
)

Custom Loss Function

using SymbolicRegression  # for SRRegressor, Dataset, eval_grad_tree_array
using Statistics  # for mean

function loss_fnc(tree, dataset::Dataset{T,L}, options, idx) where {T,L}
    # Extract data for the given batch indices (idx === nothing means the full dataset)
    X = idx === nothing ? dataset.X : view(dataset.X, :, idx)
    y = idx === nothing ? dataset.y : view(dataset.y, idx)
    # Single traversal: prediction plus derivatives w.r.t. every input variable
    prediction, grad, complete = eval_grad_tree_array(tree, X, options; variable=true)
    if !complete
        return L(Inf)
    end
    return mean((prediction .- y) .^ 2)
end
model = SRRegressor(
    niterations=1000000,
    binary_operators=[+, -, *, /],
    maxsize=60,
    bumper=true,
    turbo=true,
    populations=18,
    population_size=100,
    parsimony=0.01,
    batching=true,
    loss_function=loss_fnc,
)
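
In both cases I fit the model the same way (roughly, via the usual MLJ machine interface; X and y here stand in for my actual 7000+ row dataset):

using MLJ

mach = machine(model, X, y)  # X: table with columns x1..x6, y: target vector
fit!(mach)
report(mach)  # inspect the discovered expressions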

So there is understandably some overhead from setting up a TemplateExpression structure there as well.

coveralls commented Jan 3, 2025

Pull Request Test Coverage Report for Build 12597444445

Details

  • 8 of 8 (100.0%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.008%) to 94.754%

Totals Coverage Status
  • Change from base Build 12595020950: 0.008%
  • Covered Lines: 3161
  • Relevant Lines: 3336

💛 - Coveralls

MilesCranmer force-pushed the pass-eval-through-template branch from ef8ece4 to 0609848 on January 3, 2025 12:07
MilesCranmer merged commit ef0248a into master on Jan 3, 2025
17 checks passed
MilesCranmer deleted the pass-eval-through-template branch on January 3, 2025 13:08