TRD ignore provider error - do not send alert (#21)
* TRD ignore provider error - do not send alert

The TRD chart runs the Tezos reward distribution software for delegations.
We support sending Slack alerts when reward distribution fails. Here, we
address a common false positive:

Provider errors usually occur because tzkt is rate-limited or busy.

Example:

2024-09-29 21:02:16,534 - MainThread - INFO - --------------------------------------------
2024-09-29 21:02:16,535 - MainThread - INFO - BAKING ADDRESS is
2024-09-29 21:02:16,535 - MainThread - INFO - PAYMENT ADDRESS is
2024-09-29 21:02:16,535 - MainThread - INFO - --------------------------------------------
2024-09-29 21:02:16,537 - MainThread - INFO - [Plugins] No plugins enabled
2024-09-29 21:02:16,539 - MainThread - INFO - Initial cycle set to -1
2024-09-29 21:02:16,542 - MainThread - INFO - Application is READY!
2024-09-29 21:02:16,544 - producer  - INFO - No failed payment files found under directory '/trd/reports/xxx/payments/failed' on or after cycle '-1'
2024-09-29 21:02:16,545 - MainThread - INFO - --------------------------------------------
2024-09-29 21:02:16,624 - producer  - ERROR - Unable to fetch current cycle from provider tzkt, Not synced. Exiting.
2024-09-29 21:02:16,626 - consumer0 - WARNING - Exit signal received. Terminating...
2024-09-29 21:02:16,626 - MainThread - INFO - Application stop handler called: 12
2024-09-29 21:02:16,628 - producer  - INFO - TRD Exit triggered by producer, exit code: 8
2024-09-29 21:02:16,629 - MainThread - INFO - TRD is shutting down...
2024-09-29 21:02:16,630 - MainThread - INFO - --------------------------------------------------------
2024-09-29 21:02:16,631 - MainThread - INFO - Sensitive operations are in progress!
2024-09-29 21:02:16,631 - MainThread - INFO - Please wait while the application is being shut down!
2024-09-29 21:02:16,632 - MainThread - INFO - --------------------------------------------------------
2024-09-29 21:02:16,632 - MainThread - INFO - Lock file removed!
2024-09-29 21:02:16,633 - MainThread - INFO - Shutdown due to error!, exit code: 1
Tezos Reward Distributor (TRD) is Starting

We also modify TRD to add a dedicated exit code for this benign case:
tezos-reward-distributor-organization/tezos-reward-distributor#713

* add link to list of exit codes
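
The alerting rule described in this commit message can be sketched in isolation as follows. This is a minimal illustration, not the chart's actual code: the names `BENIGN_EXIT_CODES` and `should_alert` are hypothetical, and the only value taken from this commit is exit code 9 (PROVIDER_BUSY).

```python
# Hypothetical helper for illustration; only the value 9 (PROVIDER_BUSY)
# comes from this commit. Other benign codes could be added to the set.
BENIGN_EXIT_CODES = {9}


def should_alert(exit_code: int) -> bool:
    """Return True when a TRD exit code warrants a Slack alert."""
    if exit_code == 0:
        # Successful run: nothing to report.
        return False
    # Any non-zero code alerts unless it is explicitly listed as benign.
    return exit_code not in BENIGN_EXIT_CODES


if __name__ == "__main__":
    for code in (0, 1, 9):
        print(code, should_alert(code))
```

Keeping the benign codes in a set would make it straightforward to exclude further exit codes later without touching the decision logic.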
nicolasochem authored Oct 4, 2024
1 parent 618c2f1 commit 1935aa0
Showing 1 changed file with 15 additions and 3 deletions.
18 changes: 15 additions & 3 deletions charts/tezos-reward-distributor/scripts/run.sh
@@ -23,20 +23,32 @@ python src/main.py \
   ${dry_run_arg}
 
 # if TRD fails, send a slack alert
-if [ $? -ne 0 ]; then
+# Some exit codes are excluded. List of exit codes:
+# https://github.com/tezos-reward-distributor-organization/tezos-reward-distributor/blob/cdf7d3884bdf880c5e13267c6d6ad3af470b2e4e/src/util/exit_program.py#L6
+exit_code=$?
+if [ $exit_code -ne 0 ]; then
   # check if bot token and channel are set
   if [ -z "${SLACK_BOT_TOKEN}" ] || [ -z "${SLACK_CHANNEL}" ]; then
     echo "TRD failed, but SLACK_BOT_TOKEN or SLACK_CHANNEL is not set, failing job"
     exit 1
   fi
-  python -c "
+  echo "TRD exited in error, exit code is ${exit_code}, maybe send slack alert"
+  EXIT_CODE=${exit_code} python -c "
 import os
+import sys
 import requests
 import json
 slack_bot_token = os.getenv('SLACK_BOT_TOKEN')
 slack_channel = os.getenv('SLACK_CHANNEL')
 baker_alias = os.getenv('BAKER_ALIAS')
+exit_code = os.getenv('EXIT_CODE')
+if exit_code == '9':
+    print(f'TRD returned exit code 9 (PROVIDER_BUSY) for Tezos baker {baker_alias}. Not alerting.')
+    sys.exit(0)
+else:
+    message = f'TRD Payout failed for Tezos baker {baker_alias}, exit code {exit_code}.'
 response = requests.post(
     'https://slack.com/api/chat.postMessage',
@@ -46,7 +58,7 @@ response = requests.post(
     },
     data=json.dumps({
         'channel': slack_channel,
-        'text': f'TRD Payout failed for Tezos baker {baker_alias}'
+        'text': message
     })
 )
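
As a rough companion to the change above, the alert message and request body can be assembled by a pure function, which makes the new text (including the exit code) easy to check without calling Slack. This is a sketch under assumptions: `build_slack_request` is a hypothetical name, and the headers shown follow Slack's documented bearer-token convention rather than the lines elided from this diff.

```python
import json

SLACK_POST_MESSAGE_URL = "https://slack.com/api/chat.postMessage"


def build_slack_request(baker_alias, exit_code, channel, token):
    """Assemble URL, headers and JSON body for an alert, without sending it.

    Hypothetical helper: the message wording mirrors the diff, while the
    headers are an assumption based on Slack's Web API conventions.
    """
    message = (
        f"TRD Payout failed for Tezos baker {baker_alias}, "
        f"exit code {exit_code}."
    )
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json; charset=utf-8",
    }
    body = json.dumps({"channel": channel, "text": message})
    return SLACK_POST_MESSAGE_URL, headers, body


if __name__ == "__main__":
    url, headers, body = build_slack_request("mybaker", 1, "#alerts", "xoxb-test")
    print(url)
    print(body)
```

The returned triple could then be handed to `requests.post(url, headers=headers, data=body)`, matching the call shape used in the script.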