-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Protocol vs Endpoint #89
Comments
I quite like the Endpoint abstraction as the component that knows how to send messages to nodes. I think it could be refined and brought down to a minimal SendMessage method plus a FindNode method, which I think is such a fundamental operation that it warrants promotion to a method. The current
The query only needs an opaque I think to switch to using Possibly that could be solved by passing the query a function that encapsulates the call to Another thought is that, as shown, the |
I am not in favor of replacing the
I agree that the In addition to The benefit of a minimal This last method ( |
Not at all. The endpoint simply translates FindNode into an appropriate message and calls SendMessage. The advantage is the caller doesn't need to know any specifics of the protocol/wire format for finding closer nodes. Currently if we want to perform a connectivity check on a peer as part of a background process we have to inject a way to construct the appropriate find node message. Moving that responsibility to the endpoint instead seems appropriate for such a fundamental operation. |
IMO the message format should be a system parameter, and not part of the Otherwise, it means that a different
In If providing the request message to every query isn't an option (i.e because all queries use the same format), the best ways seems to go with a Another argument against having a |
@guillaumemichel I think we are talking past one another. The implementation of an Endpoint belongs elsewhere but the Endpoint interface belongs in go-kademlia and deals only in types like The libp2p implementation of Endpoint does that by using protobuf messages with (for kad/1.0.0) the correct message type set in the protobuf. Those messages conforms to the kad.Request and kad.Response interfaces. go-kademlia knows nothing about the specific details of the message implementation. However, I am saying that the find node operation is of fundamental utility for building any dht. So, rather than polluting go-kademlia with the details of how to construct a find node request we should let the endpoint implementation do it. The libp2p endpoint implementation of the FindNode method (assuming kad/1.0.0 libp2p protocol) would construct a protobuf with the message type of FIND_NODE, send it to the node, and return the response to go-kademlia as a kad.Response.
This is exactly how the coordinator and query state machine work. They are not dependent on message implementations. See https://github.com/plprobelab/go-kademlia/blob/main/coord/coordinator.go#L443 But, for a process inside go-kademlia that is not user-initiated and wants to explore the network, e.g. during refresh, we currently have to inject a way to create a suitable kad.Request for the query the process wants to run. I want to eliminate that injection and allow the endpoint to create the message for us and send it. |
👍 IMO we should separate in distinct modules (1) the message format (
If the message format is packed with the transport endpoint, it means that the day that someone wants to use another protocol than kad/1.0.0 along with libp2p, they have to create a new
IMO this isn't the role of the
I now understand better why you feel the need of having a It think that it would make more sense to have the The |
This is already the case. We don't have refresh/probe yet but the same design is already there for including candidates in the table https://github.com/plprobelab/go-kademlia/blob/main/routing/include.go#L85 and it responds with a general event that instructs the coordinator to send a message (via the endpoint) to issue a find node request (https://github.com/plprobelab/go-kademlia/blob/main/routing/include.go#L176) |
To be more precise, I am arguing for a GetClosestNodes function that sends a message to a node asking it for its list of closest nodes to a specified kademlia key.The return value would be a list of NodeInfos (NodeID + Addresses) |
func TableRefresh(rt routing.Table, endpoint Endpoint) {
// ...
for _, bucket := rt.buckets {
// ...
// gets the key of the peer to refresh
nodeIds, addresses := endpoint.GetClosestNodes(node, key)
// ...
// updates routing table
}
}
func (endpoint *EndpointImpl) GetClosestNodes(node NodeID, key kad.Key) ([]NodeID, []Address) {
req := endpoint.FindNodeRequest(key) // wire format dependent on the endpoint implementation
resp := endpoint.SendReqGetResp(node, req) // response would come back in the only wire format supported by the endpoint
return resp.GetNodeIDs, resp.GetAddresses
} I don't like IMO the following would be more adapted as it allows func TableRefresh(rt routing.Table, endpoint Endpoint, wireFormat FormatType) {
// ...
for _, bucket := rt.buckets {
// ...
// gets the key of the peer to refresh
nodeIds, addresses := endpoint.GetCloestNodes(node, key, wireFormat)
// ...
// updates routing table
}
}
func (endpoint *EndpointImpl) GetClosestNodes(node NodeID, key kad.Key, wireFormat FormatType) ([]NodeID, []Address) {
req := wireFormat.FindNodeReq(key) // wire format not dependent on the endpoint implementation
response := endpoint.SendReqGetResp(node, req)
resp := response.(wireFormat.FindNodeResp)
return resp.GetNodeIDs, resp.GetAddresses
} Then, we can even move the message generation to the caller to keep the endpoint minimal. func TableRefresh(rt routing.Table, endpoint Endpoint, wireFormat FormatType) {
// ...
for _, bucket := rt.buckets {
// ...
// gets the key of the peer to refresh
req := wireFormat.FindNodeReq(key)
response := endpoint.SendReqGetResp(node, req)
resp := response.(wireFormat.FindNodeResp)
nodeIds := resp.GetNodeIDs
addresses := resp.GetAddresses
// ...
// updates routing table
}
} A method on the func (wireFormat FormatType) FindNodeReq(endpoint Endpoint, node NodeID, key kad.Key) FindNodeResp {
req := wireFormat.FindNodeReq(key)
response := endpoint.SendReqGetResp(node, req)
return response.(wireFormat.FindNodeResp)
}
func TableRefresh(rt routing.Table, endpoint Endpoint, wireFormat FormatType) {
// ...
for _, bucket := rt.buckets {
// ...
// gets the key of the peer to refresh
nodeIds, addresses := wireFormat.FindNodeReq(endpoint, node, key)
// ...
// updates routing table
}
} It feels better to me to have wire format handling parsing in the I find that 4. is the most elegant way to proceed, but I guess that 2. would also be fine. Let me know what you think @iand |
I think I must be misunderstanding something. The IPFS Endpoint implementation in go-kademlia/ipfs absolutely knows about the wire format. It creates protobuf messages that are sent on streams. |
Alright! I think that a reorganization created confusion about it. Currently there is a massive However, I originally designed them to be distinct modules (see how the repo was before it got reorganized). The libp2pendpoint module is an implementation of the I agree that mixing multiple module implementations that aren't directly related to each other in the same package doesn't help to make it clear. We should probably (1) create subfolders within the libp2p package for each of the modules implementations, or (2) put back the modules where they were, until we migrate go-libp2p-kad-dht components to go-libp2p-kad-dht. You can read more here about the initial idea behind the
Not exactly. It takes as input a So the I am happy to discuss it over a call if it can help. |
Right now, we're working with a Endpoint interfaces to abstract away the concrete communication between two nodes in the network. Here are the interfaces
As an alternative approach, I'm proposing a
Protocol
interface with the following signature:The assumption here is that in a Kademlia network any record type (provider, ipns, peer) has the two primitives
Get
andPut
. AGet
request to a remote node will always return a list of results (Records() []R
) plus a list of nodes that know more (CloserNodes() N
).In the libp2p/IPFS case a concrete implementation of the Protocol interface (only GET) could look like this:
When I start a query in go-kademlia I need to differentiate between a GET or PUT query and then pass it the specific protocol I want to use. E.g., there should be another
Protocol
interface implementation for provider records. If I want to get the provider record for a certain key, I'd start a query by enqueuing aneventAddQuery
event and pass it the protocol it should use. When go-kademlia starts the query it callsGet
on the protocol and uses the resultCloserNodes
result to guide further queries. The genericRecord
type allows type safety. A concreteProtocol
implementation would return the correct record type to the caller.A Put operation would use some kind of base-protocol that implements node discovery Get/Put operations to find the closest node to a certain key. Only the last step of storing the record would use the concrete record-specific protocol.
There are a few issues with this approach:
Record
interfaces and then when passing the record out to the user back toR
.Get
takes a Kademlia key at the moment. In IPFS land we're querying for the preimage of the Kademlia key, so there must be a way to combine both and pass it into theGet
method.I formulated the above in a hurry and have some proposals for the issues that I'll post here next week.
The text was updated successfully, but these errors were encountered: