fix for turbo and bumper not being used in TemplateExpression #399
Conversation
Benchmark Results
Benchmark Plots: A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Thanks @MilesCranmer for fixing this so quickly. I will do a long run overnight with this and report back. Presumably no change in code is required here, just to set bumper and turbo to true in the SRRegressor. Thanks again!
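If it helps, a minimal sketch of where those two flags go (my own illustrative values, mirroring the structure definition from the comments below; not the exact configuration used for the runs):

```julia
using SymbolicRegression  # provides SRRegressor, TemplateExpression, TemplateStructure

# Placeholder structure with two features, just so the sketch is self-contained.
structure = TemplateStructure{(:f,)}(((; f), (x1, x2)) -> f(x1, x2))

model = SRRegressor(
    binary_operators=[+, -, *, /],
    bumper=true,  # bump-allocator evaluation buffers (Bumper.jl)
    turbo=true,   # SIMD evaluation kernels (LoopVectorization.jl)
    expression_type=TemplateExpression,
    expression_options=(; structure),
)
```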
Thanks @MilesCranmer, I tested it out and it is quite a bit quicker than before! However, TemplateExpression remains quite a bit slower after a long run than a custom loss function.

Further thoughts on what I think is going on: the TemplateExpression code `o = f(..)` is performing an […]; therefore, you are comparing the overhead of […]. In summary, for the first diff with multiple variables, […].

However, interestingly, I ran a stripped-down code without any monotonicity checks and found the custom_loss_func to run about twice as fast as TemplateExpression.

Running the stripped-down code:

```julia
using SymbolicRegression  # SRRegressor, TemplateStructure, TemplateExpression
using Statistics          # for mean

structure = TemplateStructure{(:f,)}(
((; f), (x1, x2, x3, x4, x5, x6)) -> begin
o = f(x1, x2, x3, x4, x5, x6)
return o
end
)
model = SRRegressor(
niterations=1000000,
binary_operators=[+,-,*,/],
maxsize=60,
bumper=true,
turbo=true,
populations=18,
expression_type = TemplateExpression,
expression_options = (; structure),
population_size=100,
parsimony = 0.01,
batching=true,
)
```

Custom Loss Function:

```julia
using SymbolicRegression  # SRRegressor, Dataset, eval_grad_tree_array
using Statistics          # for mean

function loss_fnc(tree, dataset::Dataset{T,L}, options, idx) where {T,L}
# Extract data for the given indices
X = idx === nothing ? dataset.X : view(dataset.X, : , idx)
y = idx === nothing ? dataset.y : view(dataset.y, idx)
prediction, grad, complete = eval_grad_tree_array(tree, X, options; variable=true)
if !complete
return L(Inf)
end
return mean((prediction .- y).^2)
end
model = SRRegressor(
niterations=1000000,
binary_operators=[+,-,*,/],
maxsize=60,
bumper=true,
turbo=true,
populations=18,
population_size=100,
parsimony = 0.01,
batching=true,
loss_function = loss_fnc,
)
```

So there is understandably some overhead from setting up a TemplateExpression structure there as well.
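For completeness, a rough sketch of how either `model` above could be fitted and timed through the standard MLJ workflow; the toy data and the `@elapsed` timing are my own illustration, not the overnight runs reported here:

```julia
using MLJ  # machine, fit!, report

# Hypothetical toy data with six features to match the template; the real runs
# used the author's own dataset. Lower `niterations` first, or this will run
# for a very long time.
n = 1_000
X = (x1=rand(n), x2=rand(n), x3=rand(n), x4=rand(n), x5=rand(n), x6=rand(n))
y = 2 .* X.x1 .- X.x2 .* X.x3 .+ X.x4 ./ (X.x5 .+ 0.1) .- X.x6

mach = machine(model, X, y)    # `model` is either configuration defined above
elapsed = @elapsed fit!(mach)  # rough wall-clock comparison between the two setups
r = report(mach)
println("search took $(round(elapsed; digits=1)) s; best: ", r.equations[r.best_idx])
```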
Pull Request Test Coverage Report for Build 12597444445
💛 - Coveralls
Force-pushed from ef8ece4 to 0609848
cc @gm89uk - should fix the speed issue you noted over in the pysr discussions page