Currently, when you use the CLI to make calls to an agent, it checks the existing status file in the node path and uses that for the connection information. While the agent is running, this file contains useful information about the running agent. When the agent stops gracefully, the file reports that it is stopped.
However, when the agent crashes and fails to shut down gracefully, the status file is left as it was at the moment of the crash and is never updated. The status file then reports a running agent that we can't connect to, and the CLI fails with:
ERROR:polykey.PolykeyClient.WebSocketClient:ErrorWebSocketConnectionLocal: WebSocket Connection local error - WebSocket could not open due to internal error
ERROR:polykey.PolykeyClient.WebSocketClient.WebSocketConnection 0:ErrorWebSocketConnectionLocal: WebSocket Connection local error - WebSocket could not open due to internal error
ErrorPolykeyCLIUnexpectedError: An unexpected error occured - Thrown 'ErrorWebSocketConnectionLocal'
cause: ErrorWebSocketConnectionLocal: WebSocket could not open due to internal error
Until now this has been the expected behaviour. But this feedback looks worse than the actual problem, which is simply that the node is not running. We need better feedback for this scenario.
So we need the following changes.
If a WebSocket client fails to connect, we need to return a nicer error without all the error logging from the logger.
If we take the connection info from the status file but fail to connect with those details, we need the nicer connection failure message, AND we need to report that the status file was incorrect and attempt to correct it.
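As a rough illustration of the two changes above, here is a minimal sketch of catching the connection failure and turning it into a single, friendlier error that also says whether the status file looks stale. Everything named here is an assumption for illustration: `StatusInfo`, `ErrorAgentUnreachable`, `explainConnectionFailure`, and `connectOrExplain` are hypothetical, and the real status file format and Polykey error classes may differ.

```typescript
// Hypothetical shape of the status file contents; the real format may differ.
type StatusInfo =
  | { status: 'LIVE'; host: string; port: number }
  | { status: 'DEAD' };

// Hypothetical CLI-facing error; not an existing Polykey class.
class ErrorAgentUnreachable extends Error {
  constructor(
    message: string,
    // True when the status file claimed a running agent that we could not reach.
    public readonly staleStatus: boolean,
    // The low-level failure (e.g. the WebSocket error), kept for debugging.
    public readonly originalError?: unknown,
  ) {
    super(message);
    this.name = 'ErrorAgentUnreachable';
  }
}

// Builds one readable error instead of several ERROR-level log lines.
function explainConnectionFailure(
  status: StatusInfo,
  originalError?: unknown,
): ErrorAgentUnreachable {
  if (status.status === 'LIVE') {
    return new ErrorAgentUnreachable(
      `The status file reports a running agent at ${status.host}:${status.port}, ` +
        'but the connection failed. The agent probably exited ungracefully ' +
        'and the status file is stale.',
      true,
      originalError,
    );
  }
  return new ErrorAgentUnreachable('The agent is not running.', false, originalError);
}

// Wraps any connect function (assumed to reject on failure) with the nicer error.
async function connectOrExplain<T>(
  status: StatusInfo,
  connect: (host: string, port: number) => Promise<T>,
): Promise<T> {
  if (status.status !== 'LIVE') throw explainConnectionFailure(status);
  try {
    return await connect(status.host, status.port);
  } catch (e) {
    throw explainConnectionFailure(status, e);
  }
}
```

The CLI could then print `message` on its own and use `staleStatus` to decide whether to attempt a status file correction, while the raw WebSocket error stays available on `originalError` for verbose output.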
Clean up the error reporting when we fail to connect with a WebSocket. We shouldn't get a bunch of ERROR level logs; we should catch the connection failure and report it directly with a nicer formatted error.
Report a more specific error if we fail to connect with details taken from a status file with the --node-path option.
Clean up the status file if we determine it to be stale and orphaned.
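The status file cleanup task could look roughly like the sketch below. It is deliberately written against a small hypothetical `StatusStore` interface rather than any real Polykey API, and it assumes the file records a `LIVE` or `DEAD` status; both are assumptions made only for illustration.

```typescript
// Hypothetical status shape; the real status file may carry other fields.
type AgentStatus = { status: 'LIVE' | 'DEAD'; [key: string]: unknown };

// Hypothetical abstraction over however the status file is read and written.
interface StatusStore {
  read(): AgentStatus | undefined;
  write(status: AgentStatus): void;
}

// Marks the status as DEAD only when the file claims LIVE and the
// connection attempt with its details has already failed.
// Returns true when a correction was written.
function cleanStaleStatus(store: StatusStore, connectionFailed: boolean): boolean {
  const current = store.read();
  if (current === undefined) return false; // No status file, nothing to fix.
  if (current.status !== 'LIVE') return false; // Already reports stopped.
  if (!connectionFailed) return false; // Agent is reachable, file is correct.
  store.write({ status: 'DEAD' });
  return true;
}
```

A failed connection on its own may not prove that the agent is gone, so a real implementation would probably want an additional check before rewriting the file, for example confirming that no agent process still holds the status file.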
aryanjassal changed the title from "Better feedback when agent isn't running after ungraceful exit." to "Better feedback when agent isn't running after ungraceful exit" on Dec 10, 2024
Additional context
Related: #198 (comment)
Related: #198