-
-
Notifications
You must be signed in to change notification settings - Fork 446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/Severity status #487
base: master
Are you sure you want to change the base?
Conversation
8f3471c
to
97e431f
Compare
5fe383a
to
161bf83
Compare
This PR covers "Support severity for PagerDuty" feature: #412 |
ad00df2
to
9cf7ac1
Compare
I am interested in this feature so I will make a review and propose some changes. Please note that I am not a maintainer and my views may different than TwiN's. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't finished my review, I still have some things to try out.
Here are additional things I couldn't add to my review (if you know how to prepare comments in a review, let me know how, I'm more used to GitLab on that):
issue: Can you rebase you PR? Both commits require a frontend rebuild (those are the only conflicts).
issue: The GitLab alerting provider was added recently, is it possible to add support for sending the severity on this provider? Currently, there is a alerting.gitlab.severity
parameter, but it should be removed and replaced by the severity from this feature. (Need more input from @TwiN about this)
can be one of
critical, high, medium, low, info, unknown
suggestion: It could be possible, for each alerting provider, to disable alerts if severity doesn't reach a minimal threshold (by default, the least critical severity). This can be done in a future issue.
I think I will work on this.
question/suggestion: I don't understand why both severity
and severity_status
exist in results.
I think the endpoint_result_conditions
table should have a new column severity
with the level used when success
is false. Then the endpoint_results
table can have a single severity
column.
The Severity
struct can then be removed.
I will try to make this improvement.
thought: I'm wondering if the type for a condition could be something else than a string (while keeping it possibly a string), e.g.:
conditions:
- condition: "[RESPONSE_TIME] < 1s"
severity: high
- "[CERTIFICATE_EXPIRATION] > 72h" # critical severity (the default)
I don't know if Go permits this, so I will have to try.
thought(non-blocking) : It doesn't seem to be possible to add multiple conditions that consume the same variables, with different severity. The second time a variable is used, it returns 0
. This can be addressed in another issue.
(It was an issue on my end)
type SeverityStatus int | ||
|
||
// Severity status enums | ||
const ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The names for severity levels were suggested previously and seem to be what GitLab uses for alert severity.
suggestion: What about using levels that are more often used by other software, such as PagerDuty (as used here in PagerDuty alerting) or Python (with logging levels):
critical
error
(instead ofhigh
)warning
(instead ofmedium
)info
(instead oflow
)
(Note that Python also has a debug
level but it doesn't have any meaning here.)
_, _ = s.db.Exec(` | ||
ALTER TABLE endpoint_results | ||
ADD IF NOT EXISTS domain_expiration BIGINT NOT NULL DEFAULT 0, | ||
ADD IF NOT EXISTS severity_status INTEGER NOT NULL DEFAULT 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought: I know this is also the case for domain_expiration
, but with this command you end up with a schema that is different than from creating the table with the CREATE TABLE IF NOT EXISTS
statement present above, since the rows don't have a DEFAULT
attribute in that statement. It is also the case in sqlite. (cc @TwiN)
|
||
// Severity status enums | ||
const ( | ||
None SeverityStatus = iota |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion(non-blocking): The default severity (used for database migrations) could be named unknown
instead of none
.
_, _ = s.db.Exec(`ALTER TABLE endpoint_results ADD domain_expiration INTEGER NOT NULL DEFAULT 0`) | ||
_, _ = s.db.Exec(` | ||
ALTER TABLE endpoint_results ADD domain_expiration INTEGER NOT NULL DEFAULT 0; | ||
ALTER TABLE endpoint_results ADD severity_status INTEGER NOT NULL DEFAULT 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: If the first ADD
statement fails (and it is likely to), the second will not be run (Parse error: duplicate column domain_expiration
).
// Returns separated SeverityStatus and condition | ||
func (c Condition) sanitizeSeverityCondition() (SeverityStatus, string) { | ||
severityStatus, complexCondition := Critical, string(c) | ||
regex, _ := regexp.Compile(`(Low|Medium|High|Critical) :: (.+)`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: The regexp could be case-insensitive.
regex, _ := regexp.Compile(`(Low|Medium|High|Critical) :: (.+)`) | |
regex, _ := regexp.Compile(`(?i)(Low|Medium|High|Critical) :: (.+)`) |
9cf7ac1
to
9d04469
Compare
@bestwebua Can we work together? I didn't know you were active and I made most of changes from my review here: https://github.com/Exagone313/gatus/tree/severity-status A notable change is that I simplified how the severity is computed from the results, there is a field I also made various improvements in the code (for example, the I think commits could be squashed, as long as |
@Exagone313 Of course I'm active. I have received your comments just today. Thanks, let it be. So what the next actions? |
Are you up for a pair programming session on Jitsi (or something like that)? I think it would be best to be able to share some code in real time. |
@Exagone313 Sorry, today I'm not able to pair session. But we can combine our commits in the different branch and make a review. |
In my branch there are still some issues or improvements to be made:
|
9d04469
to
a6668b2
Compare
* Added ConditionResult.Severity * Added Result.Severity * Added Result.SeverityStatus * Added Result#determineSeverityStatus, tests * Added SeverityStatus type * Added Condition#sanitizeSeverityCondition, tests * Updated Condition#evaluate, tests * Updated Endpoint#EvaluateHealth, tests * Updated storage adapters * Updated Endpoint component (vue) * Recompiled assets * Updated readme * Added critical severity, add result severity to pagerduty
a6668b2
to
945bbe7
Compare
is there any chance of progress of this pull request or it's abandoned? This feature is essential for real world monitoring system. |
I was thinking about taking over the PR but I have been working on other things, I may tackle it again in a few weeks though. |
Summary
This PR has been inspired by: #227, #412
ConditionResult.Severity
Result.Severity
Result.SeverityStatus
Result#determineSeverityStatus
, testsSeverityStatus
typeCondition#sanitizeSeverityCondition
, testsCondition#evaluate
, testsEndpoint#EvaluateHealth
, testsAlertProvider#buildRequestBody
, testsScreenshorts
Checklist
README.md
, if applicable.