Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keepalive missed / reconnecting #291

Open
phifuh opened this issue Feb 4, 2024 · 4 comments
Open

Keepalive missed / reconnecting #291

phifuh opened this issue Feb 4, 2024 · 4 comments
Milestone

Comments

@phifuh
Copy link

phifuh commented Feb 4, 2024

TL;DR I assuming that this issue has something to do with socket connection.

I have an Issue which is hard to explain. Every ~30/45 minutes I have some network ananomly. When watching a stream/youtube video it just buffers for a second and when playing games, you dont notice any lag or anything, I am assuming that there might be a short lag spike burst or some sort of socket reconnect issue.

When this network issue accours i get the message
"keepalive missed, connection lost. reconnecting..."

In rare cases I get another traceback:

"exception while resubscribing
Traceback (most recent call last):
  File "x\twitchAPI\eventsub\websocket.py", line 376, in _resubscribe     
    await self._subscribe(**sub)
  File "x\twitchAPI\eventsub\websocket.py", line 221, in _subscribe       
    raise EventSubSubscriptionError(result.get('message'))
twitchAPI.type.EventSubSubscriptionError: websocket transport session does not exist or has already disconnected
exception while resubscribing

I looked into the Chat class :

the check_connection() is always True

I tried changing some numbers on:

  • join_timeout
  • ping_frequency
  • ping_jitter
  • reconnect_delay_steps

but this has no effect.

I tried upgrading from v 4.0.1 to 4.2.0 but its still the same issue, the only difference I noticed is that there is a new message

Reached timeout for websocket receive, will attempt a reconnect

In rare situation it automaticly reconnects after 5-10 minutes.

I assuming that has something to do with socket connection & not with the twitch implementaiton

@kon72
Copy link

kon72 commented Jun 10, 2024

Faced the same issue with current HEAD revision (b608a57)

Here is the verbose logging:

I0610 17:27:56.068628 6128431104 websocket.py:472] [twitchAPI.eventsub.websocket] got session keep alive
I0610 17:28:06.068538 6128431104 websocket.py:472] [twitchAPI.eventsub.websocket] got session keep alive
W0610 17:28:26.154411 6128431104 websocket.py:311] [twitchAPI.eventsub.websocket] keepalive missed, connection lost. reconnecting...
I0610 17:28:26.154987 6128431104 websocket.py:256] [twitchAPI.eventsub.websocket] reconnecting using wss://eventsub.wss.twitch.tv/ws...
I0610 17:28:36.157303 6128431104 websocket.py:360] [twitchAPI.eventsub.websocket] websocket is closing
I0610 17:28:36.158023 6128431104 websocket.py:464] [twitchAPI.eventsub.websocket] new session id: AgoQKLpyiSJiRmiA0htJeKxdmRIGY2VsbC1h
I0610 17:28:36.158257 6128431104 websocket.py:392] [twitchAPI.eventsub.websocket] resubscribe to all active subscriptions of this websocket...
I0610 17:28:36.158655 6128431104 websocket.py:218] [twitchAPI.eventsub.websocket] subscribe to channel.update version 2 with condition {'broadcaster_user_id': '79948010'}
I0610 17:28:36.602202 6128431104 websocket.py:472] [twitchAPI.eventsub.websocket] got session keep alive
I0610 17:28:36.738714 6128431104 websocket.py:351] [twitchAPI.eventsub.websocket] Websocket closing: 4003 - Connection unused, you have to create a subscription within 10 seconds
E0610 17:28:37.402554 6128431104 websocket.py:399] [twitchAPI.eventsub.websocket] exception while resubscribing
Traceback (most recent call last):
  File "/Users/kon/Library/Caches/pypoetry/virtualenvs/twitchalert-aUT8sfyF-py3.11/lib/python3.11/site-packages/twitchAPI/eventsub/websocket.py", line 397, in _resubscribe
    await self._subscribe(**sub)
  File "/Users/kon/Library/Caches/pypoetry/virtualenvs/twitchalert-aUT8sfyF-py3.11/lib/python3.11/site-packages/twitchAPI/eventsub/websocket.py", line 237, in _subscribe
    raise EventSubSubscriptionError(result.get('message'))
twitchAPI.type.EventSubSubscriptionError: websocket transport session does not exist or has already disconnected

It seems in line

message: WSMessage = await self._connection.receive()

self._connection.receive() keeps listening to the stale connection even after closing it and reconnecting to the new connection.
It will receive aiohttp.WSMsgType.CLOSED message after the 10 seconds timeout, and then start listening to the new connection, but at that time Twitch refuses the subscription with a 4003 error.

4003 | Connection unused | When you connect to the server, you must create a subscription within 10 seconds or the connection is closed. The time limit is subject to change.

I think the easiest solution is to set the <10s timeout for self._connection.receive() like:

...
try:
    message: WSMessage = await self._connection.receive(timeout=1)
except TimeoutError:
    continue
...

@Latent-Logic
Copy link
Contributor

I think I found one edge case causing the Websocket closing: 4003 - Connection unused, you have to create a subscription within 10 seconds where an error while re-subscribing could cause the library to forget the subscriptions it had and then forever loop. I've opened #319 as an attempt to address that case.

@Julian-O
Copy link

I came here to report: I had the same problem - about once every 2 or 3 hours. Yesterday, I tried @kon72's solution and left it running for 12 hours overnight, and I came back here to confirm it worked (without commenting on whether it is the ideal solution - I haven't studied the code).

I now see @Latent-Logic has a patch that takes a slightly different approach. I haven't tried that, but seems carefully considered, so I am hopeful.

@Latent-Logic
Copy link
Contributor

So my PR is only addressing one failure case when you've already hit a reconnect case and a 2nd issue can cause the library to forget the subscriptions that have been registered while trying to re-subscribe. I can see how the above receive timeout would much better handle the case when the keepalive missed, connection lost. reconnecting... message from _task_reconnect_handler is replacing the underlying self._connection instance. Currently investigating how I could integrate additional protections for that.

@Teekeks Teekeks added this to the v5.0.0 milestone Dec 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants