Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Observe any exception on HTTP connection scavenge. #110982

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

salvois
Copy link

@salvois salvois commented Dec 29, 2024

Fixes #110951

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Dec 29, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Copy link
Member

@MihaZupan MihaZupan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for filing the issue and looking to fix it.

We don't want to unconditionally suppress the error here as it's meaningful for the request.
Instead, we'll have to figure out in what case it's possible that no thread would observe the exception.

It's not immediately obvious to me how that could happen with .NET 9.0.
Do you mind double-checking that you were indeed running on 9.0 and not 8.0?

@@ -267,8 +267,9 @@ async ValueTask<int> ReadAheadWithZeroByteReadAsync()

return read;
}
catch (Exception error) when (TransitionToCompletedAndTryOwnCompletion())
catch (Exception error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TransitionToCompletedAndTryOwnCompletion controls who is responsible for observing the exception.
If a thread has already flipped _readAheadTaskStatus to CompletionReserved, it's also responsible for observing any exceptions thrown by this method.
The exact exception is meaningful as the request may fail with it as the inner exception.

If it has not (task status is still Started), the method itself will observe and suppress the exception. In that case we don't necessarily care that/which exception occured, just that the read completed before a request attempted to use the connection.

@salvois
Copy link
Author

salvois commented Dec 30, 2024

Thank you for filing the issue and looking to fix it.

Thank you for taking time to review and being so fast in it! I drafted the PR mainly to understand how much I went off road with my interpretation. I'm still trying to create a repro under Functional.Tests but have been unsuccessful so far.

We don't want to unconditionally suppress the error here as it's meaningful for the request. Instead, we'll have to figure out in what case it's possible that no thread would observe the exception.

Long shot here: what if CheckUsabilityOnScavenge got an exception while the read-ahead task is CompletionReserved, because PrepareForReuse ran just at the "right" time, and then the connection is disposed? It looks to me nobody may be observing the read-ahead task in this case.
But I'm still biased by the hope of a "false positive" error here.

It's not immediately obvious to me how that could happen with .NET 9.0. Do you mind double-checking that you were indeed running on 9.0 and not 8.0?

I'm afraid we are: I double checked the content of our self-contained deployed package, and it contains, among the others, System.Net.Http.dll 9.0.24.52809 (2024-10-29) and runtimeconfig.json specifying Microsoft.NETCore.App and Microsoft.AspNetCore.App 9.0.0.

@MihaZupan
Copy link
Member

MihaZupan commented Dec 30, 2024

If PrepareForReuse returns true, the caller is expected to always follow up with a call into HttpConnection.SendAsync, which should always observe the _readAheadTask result.

/// The caller MUST call SendAsync afterwards if this method returns true, or dispose the connection if it returns false.</summary>

Either when doing the initial read here
ValueTask<int> vt = _readAheadTask;
_readAheadTask = default;
int bytesRead;
if (vt.IsCompleted)
{
bytesRead = vt.Result;
}
else
{
if (NetEventSource.Log.IsEnabled() && !async)
{
Trace($"Pre-emptive read completed asynchronously for a synchronous request.");
}
bytesRead = await vt.ConfigureAwait(false);
}

or if an exception occurs here
if (_readAheadTask != default)
{
Debug.Assert(_readAheadTaskStatus is ReadAheadTask_CompletionReserved or ReadAheadTask_Completed);
LogExceptions(_readAheadTask.AsTask());
}

If there's a case where we could throw between PrepareForReuse and the try block in HttpConnection.SendAsync, we would indeed leak unobserved faulted tasks, but I don't know how that'd happen.

But I'm still biased by the hope of a "false positive" error here.

It is most likely a harmless issue, but we should still track down the root cause.

@salvois
Copy link
Author

salvois commented Dec 30, 2024

If PrepareForReuse returns true, the caller is expected to always follow up with a call into HttpConnection.SendAsync

Makes sense to me, and confirms my understanding, thanks.
I was looking at the code path where PrepareForReuse returns false but the Exchange should indeed protect us.

What about the code paths that check for a read-ahead task not yet started, though?

In my understanding, there could be a case where both PrepareForReuse and CheckUsabilityOnScavenge see ReadAheadTaskHasStarted == false, because they set _readAheadTaskStatus later, not atomically.
If this is the case, they'd both start a read-ahead task (which doesn't sound good to me), moreover TransitionToCompletedAndTryOwnCompletion could well see ReadAheadTask_CompletionReserved as set here:

_readAheadTaskStatus = ReadAheadTask_CompletionReserved;

Of course, apologies if I misunderstood the implementation!

@MihaZupan
Copy link
Member

a case where both PrepareForReuse and CheckUsabilityOnScavenge see ReadAheadTaskHasStarted == false

While not obvious from the methods themselves, for both of those, the caller first removes the connection from the pool such that only 1 thread has access to the connection at any given point, hence why they use non-synchronized checks.

Note how both are called after popping the connection from the stack:

while (connections.TryPop(out HttpConnection? connection))
{
if (connection.IsUsable(nowTicks, pooledConnectionLifetime, pooledConnectionIdleTimeout))

while (_http11Connections.TryPop(out connection))
{
if (CheckExpirationOnGet(connection))
{
if (NetEventSource.Log.IsEnabled()) connection.Trace("Found expired HTTP/1.1 connection in pool.");
connection.Dispose();
continue;
}
if (!connection.PrepareForReuse(async))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Net.Http community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HttpClient may leak UnobservedTaskException in CheckUsabilityOnScavenge
2 participants