
feat(network): introduce relay reservation score #2634

Open
wants to merge 17 commits into main from relay_healthh

Conversation

RolandSherwin (Member)

No description provided.

ant-networking/src/relay_manager.rs
}

#[derive(Debug, Default, Clone)]
struct ReservationStat {
Member

Shall this be the score of the most recent X attempts, so that it reflects the real-time health?
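
A minimal sketch of what that suggestion could look like, assuming a fixed window over the last N attempts; the struct name, window size, and fields are illustrative and not the PR's actual ReservationStat:

```rust
use std::collections::VecDeque;

/// Illustrative window size, not a value from the PR.
const WINDOW: usize = 20;

/// Hypothetical variant of `ReservationStat` that only remembers the
/// outcomes of the most recent `WINDOW` reservation attempts, so the
/// score reflects current relay health rather than the whole history.
#[derive(Debug, Default, Clone)]
struct RecentReservationStat {
    /// `true` = successful reservation/renewal, `false` = failure.
    outcomes: VecDeque<bool>,
}

impl RecentReservationStat {
    fn record(&mut self, success: bool) {
        if self.outcomes.len() == WINDOW {
            self.outcomes.pop_front();
        }
        self.outcomes.push_back(success);
    }

    /// Success ratio over the window, in `[0.0, 1.0]`.
    fn score(&self) -> f64 {
        if self.outcomes.is_empty() {
            return 0.0;
        }
        let successes = self.outcomes.iter().filter(|ok| **ok).count();
        successes as f64 / self.outcomes.len() as f64
    }
}
```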

self.log_reservation_score();
}

/// Clean up the stats for relay servers that we are no longer connected to.
Member

Will this make us re-connect to a faulty server and go through another round of tracking and then dropping?

Member Author

Yeah, we could add a block list here.
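
A rough sketch of what such a block list might look like, assuming an expiry so a faulty relay eventually gets another chance; the type name and duration are illustrative, not part of this PR:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

use libp2p::PeerId;

/// Illustrative expiry, not a value from the PR.
const BLOCK_DURATION: Duration = Duration::from_secs(60 * 60);

/// Hypothetical block list: relays dropped for a poor reservation score are
/// remembered for a while, so we don't immediately re-connect and go through
/// another round of tracking and dropping.
#[derive(Debug, Default)]
struct RelayBlockList {
    blocked_at: HashMap<PeerId, Instant>,
}

impl RelayBlockList {
    fn block(&mut self, peer: PeerId) {
        self.blocked_at.insert(peer, Instant::now());
    }

    /// Returns `true` if the peer is still blocked; expired entries are pruned.
    fn is_blocked(&mut self, peer: &PeerId) -> bool {
        match self.blocked_at.get(peer) {
            Some(when) if when.elapsed() < BLOCK_DURATION => true,
            Some(_) => {
                self.blocked_at.remove(peer);
                false
            }
            None => false,
        }
    }
}
```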

continue;
};

if elapsed < MAX_DURATION_TO_TRACK_INCOMING_CONNECTIONS_PER_PEER {
Member

Not quite sure about this: MAX_DURATION_TO_TRACK_INCOMING_CONNECTIONS_PER_PEER is a fixed duration.
However, according to the log message, waiting to be established is an event-driven status change, not a time-driven one.
I don't get why "not timed out yet" means it is still waiting to be established?

@RolandSherwin (Member Author), Jan 15, 2025

A random peer could make multiple incoming connections through several of our relays, so we can't just act on individual events; instead we have to give ourselves some time to collect all of these incoming connections.

Sometimes we get 1 incoming connection, sometimes 4; we can't be sure.
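
A minimal sketch of that time-window idea, assuming incoming connections are grouped per remote peer and only evaluated once nothing new has arrived within the tracking window; the field names and the constant's value are illustrative, not the ones in relay_manager.rs:

```rust
use std::collections::HashMap;
use std::time::{Duration, SystemTime};

use libp2p::PeerId;

/// Illustrative value; the real constant lives in relay_manager.rs.
const MAX_DURATION_TO_TRACK_INCOMING_CONNECTIONS_PER_PEER: Duration = Duration::from_secs(30);

/// Hypothetical tracker: incoming connections from one remote peer may arrive
/// via several of our relays, so they are collected per peer and only
/// evaluated once nothing new has arrived within the tracking window.
#[derive(Debug, Default)]
struct IncomingTracker {
    connections: HashMap<PeerId, Vec<SystemTime>>,
}

impl IncomingTracker {
    fn record(&mut self, from_peer: PeerId) {
        self.connections
            .entry(from_peer)
            .or_default()
            .push(SystemTime::now());
    }

    /// Peers whose tracking window has elapsed and can now be scored and pruned.
    fn ready_to_evaluate(&self) -> Vec<PeerId> {
        self.connections
            .iter()
            .filter_map(|(peer, times)| {
                let latest = times.iter().max()?;
                let elapsed = SystemTime::now().duration_since(*latest).ok()?;
                (elapsed >= MAX_DURATION_TO_TRACK_INCOMING_CONNECTIONS_PER_PEER)
                    .then(|| peer.clone())
            })
            .collect()
    }
}
```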

}

for from_peer in to_remove {
debug!("Removing {from_peer:?} from the incoming_connections_from_remote_peer");
Member

incoming_connections_from_remote_peer only gets pruned when it is updated within the block of code above.
What if entries remain in the container due to some bug or error, making it grow indefinitely?

Member Author

The only scenario where this can happen is in the code below:

    let Some(latest_time) = connections.iter().map(|(_, _, time, _)| time).max() else {
        debug!("The incoming connections from {from_peer:?} are empty. Skipping.");
        continue;
    };

    let Ok(elapsed) = SystemTime::now().duration_since(*latest_time) else {
        debug!("Could not obtain elapsed time.");
        continue;
    };

This can only happen if there is a bug in our code. The cleanup does not depend on the events; it depends only on MAX_DURATION_TO_TRACK_INCOMING_CONNECTIONS_PER_PEER.

Depending just on a timer is fine, because a connection cannot take more than 15s before a ConnectionEstablished or an IncomingConnectionError is emitted.

Member

OK, fine then.

It's just that "this can only happen if there is a bug in our code" could still happen.

@RolandSherwin force-pushed the relay_healthh branch 2 times, most recently from e09a5aa to 1ce8fc6 on January 16, 2025 at 16:26