-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New filter for orphan afterglows #403
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @marinamasson ! I have done a first pass on the code, and left some comments. I will perform a profiling of the code, and paste the results here later this week.
fink_science/orphans/filter.py
Outdated
import joblib | ||
from sklearn import preprocessing | ||
|
||
from filter_utils import compute_duration_between_first_and_peak, compute_rate, compute_color, fit_light_curve |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid name conflicts, the import should always be absolute:
from fink_science.orphans.filter_utils import ...
fink_science/orphans/filter.py
Outdated
@@ -0,0 +1,120 @@ | |||
from pyspark.sql.functions import pandas_udf, PandasUDFType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All files should start with the header containing the license statement & authorship (see other modules for example)
fink_science/orphans/filter.py
Outdated
clf = joblib.load(model_path + 'model_orphans.pkl') | ||
proba = clf.predict_proba(features_norm) | ||
|
||
# `True` for the objects that have a probability > 0.999999 to be an orphan, else `False` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For more flexibility, we usually decouple the computation of a quantity (fink science module), and the relevant threshold(s) to be applied on this quantity (fink filter). Hence I would suggest to return the probability values here, and then design a filter in fink-filters.
fink_science/orphans/filter_utils.py
Outdated
@@ -0,0 +1,457 @@ | |||
import numpy as np |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All files should start with the header containing the license statement & authorship (see other modules for example)
fink_science/orphans/filter_utils.py
Outdated
|
||
|
||
# TOOLS | ||
# ======================================================================================================================= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two commented lines are unnecessary here.
fink_science/orphans/filter_utils.py
Outdated
|
||
def compute_duration_between_first_and_peak(times, mags): | ||
""" | ||
Save the number of days between the first detection and the peak, and the date of the first detection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This description is not clear to me :-)
fink_science/orphans/filter_utils.py
Outdated
|
||
def compute_rates(times, mags, filts): | ||
""" | ||
Save the number of days between the first detection and the peak, and the date of the first detection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description is not correct?
fink_science/orphans/filter_utils.py
Outdated
|
||
|
||
# LIGHT CURVE FIT | ||
# ======================================================================================================================= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two commented lines are unnecessary here.
fink_science/orphans/filter_utils.py
Outdated
|
||
# colors and mean frequency of the band u, g, r, i, z, y | ||
filters = ['u', 'g', 'r', 'i', 'z', 'Y'] | ||
all_mean_nu = [840336134453781.4, 629326620516047.8, 482703137570394.2, 397614314115308.1, 344530577088716.56, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where this all_mean_nu
comes from? Is it specific to a dataset? Does it need to be updated if the input dataset change (elasticc --> lsst)?
fink_science/orphans/filter_utils.py
Outdated
flux_r = mag_to_flux(mag_r) | ||
flux_filtmax = mag_to_flux(mag_filtmax) | ||
|
||
# choose values of -beta between -(p-1)/2 and -p/2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is p
?
Linked to issue(s): Closes #402
If this is a new release, did you issue the corresponding schema in fink-client?
This is a new module, not a new release.
What changes were proposed in this pull request?
A new science module.
How is the issue this PR is referenced against solved with this PR?
The PR solves the issue.
How was this patch tested?
Not tested within Fink.