Docs: add a design document for the Client #78

CuriousGeorgiy · 2023-12-11T07:52:16Z

This patch adds a document that states the requirements and use cases we have in mind for the Client, and then proposes a design that fulfills them.

docs/client-design.md

drewdzzz

Will we have combinators (any, all, waitFor)?
We've given up on template sockets?

docs/client-design.md

drewdzzz · 2024-01-23T17:20:04Z

docs/client-design.md

+	 * 3. The response has not already been retrieved previously.
+	 */
+	std::optional<Response> get_response();
+};


We could create extract-once class:

add isReady method

set_callback should be rvalue method

get_response -> get_result (result is actually variant<Error, Response>) - rvalue method

reset_callback will be dropped. Anyway, I don't see any use cases for it.

Then, user cannot get response after callback is set and vice versa.

add isReady method

I don't understand, why we do need this method, if we have get_status, which we could check for request readiness?

set_callback should be rvalue method

This sounds like a cool solution. I suggest we keep the request object and pass it to the callback later on.

get_response -> get_result (result is actually variant<Error, Response>) - rvalue method

I don't like this, since it would be nice to share the ConnectionError object between the connection and the requests (a variant does not allow to store references).

reset_callback will be dropped. Anyway, I don't see any use cases for it.

Initially, I thought there might be some conditions when an application would want to remove the callback. But now I come to think that it is indeed redundant.

You could store a pointer to an error in variant instead of reference - it is private anyway.

Yeah, isReady is not needed.

Now, you cannot do anything to the Request when callback is set, but you can access it when response is taken.

I won't resist to current design, but we could do the same for get_response (Result is needed for this purpose) to make Request completely safe. I'd leave it up to @alyapunov.

CuriousGeorgiy · 2024-01-24T11:47:55Z

@drewdzzz

Will we have combinators (any, all, waitFor)?

I don't see how we can implement them without access to the event loop. As far as I understand, @alyapunov just wanted something like the FanOut class I proposed.

We've given up on template sockets?

Yeah, I see no value in this abstraction. We can add a typedef later on if we want to support other platforms like Windows.

drewdzzz · 2024-01-24T15:34:15Z

docs/client-design.md

+	 * 3. The response has not already been retrieved previously.
+	 */
+	std::optional<Response> get_response();
+};


Now, you cannot do anything to the Request when callback is set, but you can access it when response is taken.

I won't resist to current design, but we could do the same for get_response (Result is needed for this purpose) to make Request completely safe. I'd leave it up to @alyapunov.

drewdzzz · 2024-01-24T15:43:53Z

docs/client-design.md

+ * Tarantool instance.
+ */
+template<class IOEventProvider>
+class Connection {


Will we have different classes for a plain connection and SSL connection?

Sorry, forgot about this detail. I guess we could either add a StreamProvider template parameter or we could use dynamic polymorphism. In the first case the user won't be able to store different connections in one container. Not sure what's best here. @alyapunov

I think we can do both if we add a StreamProvider. In general overview:

UnixPlainStream and UnixSSLStream for users who doesn't need both SSL and Plain streams at once.

New UnixDynamicStream for others - there is already transport field in ConnectOpts, we will just check it in connect, send and recv.

Thus, only users who need both connection types pay the performance price.

drewdzzz · 2024-01-24T15:47:15Z

docs/client-design.md

+	 */
+	tnt::List<Request> collect();
+};
+```


About any, all, waitFor.

I don't see how we can implement them without access to the event loop.

When IOEventProvider fires the connector callback, it reads the data from socket and then places it somewhere (creating a Response).
At this moment, we could fire combinator any, associated with the request, or bump and check counter for combinator all.

What's about waitFor, we could just check deadlines for all combinators on every callback from IOEventProvider.

Our combinators will be faster then user-created ones, since we can manipulate with intrusive lists of Requests under the hood, and user will have to use callbacks.

What's about waitFor, we could just check deadlines for all combinators on every callback from IOEventProvider.

I believe we discussed that we don't want to implement timers, because the user has more precise time information than we do.

At this moment, we could fire combinator any, associated with the request, or bump and check counter for combinator all.

What interface do you propose for these combinators? It seems like this will create a lot of additional overhead (i.e., the connection objects will have to manage the combinator objects).

I don't see many overhead. There will be a snippet somewhere in the connector:

if (req.hasCallback()) req.cb(); else req.setResponse();

With combinator, it will turn into switch case:

switch (req.ret_type) { case COMBINATOR: /** Can be one function - is separated for clarity. */ combinators[req.combinator_id]->list.insert(req); combinators[req.combinator_id]->checkReady(); case CALLBACK: req.cb(); case RAW: req.setResponse(); }

Combinator is just a list of ready requests with methods setCallback, isReady, getResponse - just like an usual request, but Response can be an array of responses (for all combinator, for example).

There will be no overhead for users who don't use combinators.

What's about waitFor, let's drop it - it can be implemented later if it will be needed - we will need to check time on every response.

As for me, we could design efficient combinators while we are here (they are efficient because they use intrusive lists instead of callbacks). But it's just my opinion, so I can leave this decision to @alyapunov.

drewdzzz · 2024-01-24T16:09:15Z

docs/client-design.md

+	 * Return a list of ready request, and cancel the requests for which a 
+	 * response is not available. 
+	 */
+	tnt::List<Request> collect();


Is this solution of myTarget's problem? If so:

You mustn't cancel requests here.

Passed requests must be dynamic: you poll for ready requests, then you create a bunch of new requests and poll them along with the previous ones.

From the previous point - you must allow to populate the list of requests.

Need to come up with a better name - it is a tool polling for ready request, not for "FanOut" requests.

P.S. The idea was to create something like epoll - you ask for ready requests and get only them (no need to scan for all requests and find ready ones), but all not ready requests must be still available.

Perhaps I still don't understand the problem statement. AFAIC, there is no dynamic scenario, we just want to create a bunch of synchronization points (i.e., collect points).

In the scenario you propose the mapping between the business logic and the responses is not clear. Let's say I have requests about a VK and Mail.ru profile mixed up (which correspond to 2 different Tarantool instances and 2 different connections). After collecting the mixed up responses, how I am going to understand what the sources of the requests were?

I'll explain my understanding from the beginning.

We have myTarget's problem - collect as much information as we can in a limited amount of time, for example, 1 second. We have only anonymous user id, so, firstly we send a bunch of requests to Tarantool to get all actions with this id (email authorization, visiting book page on litres and so on). When we have received data from mail.ru email, we can send a new bunch of requests to Tarantool storing mail.ru data (name, surname, age, etc). When we have received data from litres with user's books ids, we can send another bunch to Tarantool storing litres data (we want to find book's genre by its id). After 1 second all the requests are canceled, and myTarget knows that the guy with session_id = 143 is actually Ivan Ivanov who is keen on Steven King books.

Firstly, I just can't implement such service with this FanOut. For example, I call collect after 0.1s, and no responses are ready - the job has died even before it started. So, we must not cancel requests that are not ready.

Secondly, I have many different request types (search by anonymous id, mail.ru, litres and so on). I see only two solutions:

We will have a lot of FanOuts - one per request type for every job (collecting info by id): N*M combinators.

We will have something like epoll - the combinator will associate a value with every request. It can be void * or a template parameter of the combinator. Then, every job will wake up every 0.05s, for example, and call single collect method for all request types. Here we will use only N combinators.

I would prefer the second variant.

Thirdly, let's imagine that we have access to Tarantool storing Delivery Club order history. Now myTarget knows my user id in Delivery Club and wants to fetch all my orders. By user id it fetches a lot of reponses with order id. Then, by order id, it fetches information about food to find out about my food preferences. If we decide to have a FanOut per request type, we will need an opportunity to append new requests resolving order_id into its contents - collection of requests of this type is filled dynamically.

Add a document that states the requirements for the Client, and then proposes a design that fulfills them.

CuriousGeorgiy added the do not merge Not ready to be merged label Dec 11, 2023

CuriousGeorgiy requested a review from drewdzzz December 11, 2023 07:52

CuriousGeorgiy assigned drewdzzz Dec 11, 2023

drewdzzz removed their assignment Dec 15, 2023

CuriousGeorgiy force-pushed the client-design-doc branch from b0126a2 to 7e4ac2d Compare December 28, 2023 09:18

CuriousGeorgiy assigned drewdzzz Dec 28, 2023

drewdzzz reviewed Jan 11, 2024

View reviewed changes