component: prometheus: enable/disable http jobs like rules #316
base: main
Jira: https://issues.redhat.com/browse/RHOAIENG-87 Some jobs scrape the http endpoint /probe (versus those which use kubernetes_sd_configs, for example) and they start collecting metrics as soon as they are configured. Although an enabled component's rules are deployed only when its deployment is available, that does not help, since the rules fetch stale metrics from those jobs. Configure them in the prometheus.yml file (a field in the mounted ConfigMap) in a similar way to how `rules_files` are configured: put them into separate data fields and substitute the `scrape_configs` array in the unmarshaled prometheusContent map. Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
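A minimal sketch of the idea described above, assuming the jobs have already been unmarshaled from their separate data fields. The function name `applyJobs`, the field names, and the `enabled` map are hypothetical and are not taken from the PR; only the `scrape_configs` key and the `map[any]any` shape come from the description.

```go
package main

import "fmt"

// applyJobs replaces content["scrape_configs"] with the jobs whose
// component is enabled. jobFields maps a (hypothetical) ConfigMap data
// field name to its already-unmarshaled job; order fixes the output order.
func applyJobs(content map[any]any, order []string, jobFields map[string]map[any]any, enabled map[string]bool) {
	scrapeConfigs := make([]any, 0, len(order))
	for _, field := range order {
		job, ok := jobFields[field]
		if ok && enabled[field] {
			scrapeConfigs = append(scrapeConfigs, job)
		}
	}
	// Substitute the whole array in one place.
	content["scrape_configs"] = scrapeConfigs
}

func main() {
	// Stand-in for the unmarshaled prometheus.yml content.
	content := map[any]any{"scrape_configs": []any{}}
	jobFields := map[string]map[any]any{
		"a.job": {"job_name": "a", "metrics_path": "/probe"},
		"b.job": {"job_name": "b", "metrics_path": "/probe"},
	}
	applyJobs(content, []string{"a.job", "b.job"}, jobFields, map[string]bool{"a.job": true})
	fmt.Println(len(content["scrape_configs"].([]any)))
}
```

With only `a.job` enabled, the resulting array holds a single job, so a disabled component's /probe job is never scraped.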
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ykaliuta The full list of commands accepted by this bot can be found here. The pull request process is described here
@ykaliuta: The following test failed, say
@@ -97,6 +97,73 @@ func (c *Component) ConfigComponentLogger(logger logr.Logger, component string,
	return logger.WithName("DSC.Components." + component)
}

func getJobName(job map[any]any) (string, bool) {
I have not looked at the details of this PR, but shouldn't the component.go change go into ODH first?
> have not touched the detail of this PR, but should component.go change get into ODH first?
100%
It's a branch to test it first.
I was thinking that downstream they should come as one patch anyway (the upstream changes do not break anything there, no problem), but it looks like having the code without the prometheus changes, even if it does not work as expected, does not break the existing setup.
yeah, this is similar situation to MR's PR.
we can use this one to test first and backport to ODH as long as we have the logic applied to both upstream and downstream.
I'll split it for the real submission, it's fine.
	return name, true
}

func getJobIdx(scrapeConfigs *[]any, jobName string) (int, bool) {
I am thinking about this function:
is the "bool" really needed? If it is only there to get the index so that the "removal" in updateJob() works,
then it could return either 0 (the job is not found in the job map) or a non-0 index.
E.g. "scrape_configs element is not array" and the final return are treated the same.
So

    idx, exists := getJobIdx(&scrapeConfigs, name)
    switch {
    case enable && !exists:
        scrapeConfigs = append(scrapeConfigs, job)
    case !enable && exists:
        scrapeConfigs = append(scrapeConfigs[:idx], scrapeConfigs[idx+1:]...)
    default:
        return
    }

can be

    idx := getJobIdx(&scrapeConfigs, name)
    if idx != 0 {
        if enable {
            (*prometheusContent)["scrape_configs"] = append(scrapeConfigs, job)
        } else {
            (*prometheusContent)["scrape_configs"] = append(scrapeConfigs[:idx], scrapeConfigs[idx+1:]...)
        }
    }

right?
Let's split it into 2 parts.
1. Special idx value for "not found" vs. a flag.
0 is a valid index, but it's possible to use a negative value, for example. But I find that very C-style (I use it in C all the time), whereas idiomatic Go uses a flag and/or an error value.
2. Code refactoring.
2.1) It should not append if the job already exists, so that must still be a separate branch. At least for rules, that check is what makes enabling work when it's already enabled.
2.2) Although it can be written as

    if enable && idx < 0 {
        (*prometheusContent)["scrape_configs"] = append(scrapeConfigs, job)
    } else if !enable && idx >= 0 {
        (*prometheusContent)["scrape_configs"] = append(scrapeConfigs[:idx], scrapeConfigs[idx+1:]...)
    }

and it's a matter of taste. I find preparing the list and then updating it in one place nicer.
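For readers following the thread, here is a self-contained sketch of the flag-returning variant under discussion. It is not the PR's code: the function bodies are reconstructed from the names and signatures quoted above, `getJobIdx` takes the slice by value rather than by pointer for brevity, and `updateJob`'s parameter list is an assumption.

```go
package main

import "fmt"

// getJobName extracts the "job_name" field from one scrape_configs entry.
func getJobName(job map[any]any) (string, bool) {
	name, ok := job["job_name"].(string)
	return name, ok
}

// getJobIdx reports the position of the job named jobName and whether it
// was found. The flag is needed because 0 is itself a valid index.
func getJobIdx(scrapeConfigs []any, jobName string) (int, bool) {
	for i, entry := range scrapeConfigs {
		job, ok := entry.(map[any]any)
		if !ok {
			continue
		}
		if name, ok := getJobName(job); ok && name == jobName {
			return i, true
		}
	}
	return 0, false
}

// updateJob prepares the new list first and writes it back in one place.
// Enabling an existing job or disabling a missing one is a no-op.
func updateJob(content map[any]any, job map[any]any, enable bool) {
	name, ok := getJobName(job)
	if !ok {
		return
	}
	// A missing or non-array scrape_configs is treated as an empty list.
	scrapeConfigs, _ := content["scrape_configs"].([]any)
	idx, exists := getJobIdx(scrapeConfigs, name)
	switch {
	case enable && !exists:
		scrapeConfigs = append(scrapeConfigs, job)
	case !enable && exists:
		scrapeConfigs = append(scrapeConfigs[:idx], scrapeConfigs[idx+1:]...)
	default:
		return
	}
	content["scrape_configs"] = scrapeConfigs
}

func main() {
	first := map[any]any{"job_name": "first"}
	probe := map[any]any{"job_name": "probe", "metrics_path": "/probe"}
	content := map[any]any{"scrape_configs": []any{first}}

	updateJob(content, probe, true)
	updateJob(content, probe, true) // already enabled: no duplicate is appended
	fmt.Println(len(content["scrape_configs"].([]any)))

	updateJob(content, first, false) // removes the entry at index 0
	fmt.Println(len(content["scrape_configs"].([]any)))
}
```

The last call is the case that an `idx != 0` test would miss: the job to remove sits at index 0, so only a separate flag (or a negative sentinel) can tell it apart from "not found".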
do we still need this PR?
yes. Nothing has changed on the topic. The problem should persist and the solution is still not verified.