Primary and secondary client indexes and client-side transaction validation #314
base: main
Conversation
Added some commits with further optimizations:
Before:
After:
The performance difference is still 20%-30%, although the absolute numbers are much more digestible now: since little more than transaction code runs on transact, the difference persists, as both the client side and the server side benefit from the same improvements. The server side, though, benefits from a deep copy implementation of models that would normally not be available to it.
Force-pushed from 582489f to 9640378
Some more performance improvements:
Results show a 10% to 25% difference, but there is IO involved; looking at the profile, the numbers are closer to 5%.
Note: we are using the bridge model, which has a 4096-element int array that relatively hurts performance and is not particularly representative of what we would find in ovn-k.
Force-pushed from 266ce31 to 4158e94
Index duplication is a commit constraint and should be reported as an additional operation result. Verification should be deferred until all operations have been processed and can be taken into account.

From the RFC:

If "indexes" is specified, it must be an array of zero or more <column-set>s. A <column-set> is an array of one or more strings, each of which names a column. Each <column-set> is a set of columns whose values, taken together within any given row, must be unique within the table. This is a "deferred" constraint, enforced only at transaction commit time, after unreferenced rows are deleted and dangling weak references are removed. Ephemeral columns may not be part of indexes.

and:

if all of the operations succeed, but the results cannot be committed, then "result" will have one more element than "params", with the additional element being an <error>.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
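A minimal sketch of that deferred check, assuming illustrative `Row`, `OperationResult` and `checkIndexes` names rather than the types actually used in this PR: all operations are applied first, and only then is each <column-set> checked, with a violation reported as the single extra element of the result array.

```go
package main

import "fmt"

// Row is a minimal stand-in for an OVSDB row.
type Row map[string]interface{}

// OperationResult mirrors the shape of an ovsdb operation result.
type OperationResult struct {
	Error   string
	Details string
}

// checkIndexes runs only after every operation has been applied to the staged
// rows. If any <column-set> is duplicated, a single extra error result is
// appended, i.e. "result" ends up with one more element than "params".
func checkIndexes(rows []Row, indexes [][]string, results []OperationResult) []OperationResult {
	for _, columnSet := range indexes {
		seen := map[string]bool{}
		for _, row := range rows {
			key := ""
			for _, column := range columnSet {
				key += fmt.Sprintf("%v|", row[column])
			}
			if seen[key] {
				return append(results, OperationResult{
					Error:   "constraint violation",
					Details: fmt.Sprintf("duplicate value for index %v", columnSet),
				})
			}
			seen[key] = true
		}
	}
	return results
}

func main() {
	rows := []Row{{"name": "br-int"}, {"name": "br-int"}}
	fmt.Println(checkIndexes(rows, [][]string{{"name"}}, nil))
}
```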
To be able to reuse it in the client without introducing circular dependencies. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Adds a client option that enables transaction verification. When transaction verification is enabled, transactions are serialized in the client. This avoids verifying a transaction while other transactions are in flight, ensuring the consistency of such verification. The verification is done by running the transaction in a similar way to how it is done server-side, but on a throw-away database built on top of the client cache. The primary use case for this verification is to check for duplicate indexes client-side instead of server-side, and specifically client indexes, which cannot be checked server-side, although support for this specific case is introduced in a later commit. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
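A rough sketch of the flow this commit describes, under assumed names (`snapshot`, `applyOps`, `transact`); the actual implementation reuses the server-side transaction code rather than the toy `applyOps` below.

```go
package sketch

import "sync"

// operation, database and applyOps are illustrative stand-ins only.
type operation struct{}

type database struct {
	// table -> row UUID -> row
	tables map[string]map[string]map[string]interface{}
}

// snapshot builds the throw-away database on top of the client cache.
func snapshot(cache *database) *database {
	cp := &database{tables: map[string]map[string]map[string]interface{}{}}
	for table, rows := range cache.tables {
		cp.tables[table] = map[string]map[string]interface{}{}
		for uuid, row := range rows {
			cp.tables[table][uuid] = row // rows are never mutated in place
		}
	}
	return cp
}

func applyOps(db *database, ops []operation) error {
	// ... server-like processing of the operations against db ...
	return nil
}

type client struct {
	mu    sync.Mutex // serializes transactions while verification is enabled
	cache *database
}

func (c *client) transact(ops ...operation) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if err := applyOps(snapshot(c.cache), ops); err != nil {
		return err // e.g. a duplicate index, caught before the RPC is sent
	}
	// ... send the operations to the server as usual ...
	return nil
}
```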
Introduces two types of client indexes: primary and secondary. Primary indexes are verified not to be duplicated at transaction verification, when enabled; otherwise there is no difference between them. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
So that we have DeepCopy of models available and the benchmarks produce more realistic numbers. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Creating the event channel has a performance cost because it is buffered with a high capacity. There is no use for it in some situations, like a transaction or database cache, where the event processor won't be run. Delay initialization of the event channel until the event processor actually runs. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
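A minimal sketch of the lazy-initialization idea, with an illustrative `eventProcessor` type and buffer size; the real event processor's fields and capacity may differ.

```go
package sketch

import "sync"

// bufferSize stands in for the event channel's high capacity; the value is
// illustrative.
const bufferSize = 16384

type eventProcessor struct {
	once   sync.Once
	events chan interface{}
}

// eventChannel allocates the buffered channel only on first use, so callers
// that never run the processor (e.g. a transaction or database cache) do not
// pay for the allocation at all.
func (e *eventProcessor) eventChannel() chan interface{} {
	e.once.Do(func() {
		e.events = make(chan interface{}, bufferSize)
	})
	return e.events
}
```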
We can run the transaction on shallow model copies of its own cache, as it never modifies them directly; it uses tableUpdates instead. We can do the same for the temporary database of the validating client transaction. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
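A small sketch of why a shallow copy is enough here, using a hypothetical `shallowTable` helper: only the map is duplicated, while the model values are shared, which is safe as long as the transaction never mutates them in place.

```go
package sketch

// Model stands in for a generated database model.
type Model interface{ GetUUID() string }

// shallowTable copies the UUID-to-model map while sharing the model values
// themselves. This is safe only because the transaction treats the models as
// read-only and records changes in tableUpdates instead of mutating rows.
func shallowTable(src map[string]Model) map[string]Model {
	dst := make(map[string]Model, len(src))
	for uuid, m := range src {
		dst[uuid] = m
	}
	return dst
}
```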
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Force-pushed from 5f6ce60 to bd57ef7
/hold
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
For updates and mutates, the difference for each column is calculated and set in the Modify row of an update2 notification. Calculating this difference has a performance cost when the column is a set with thousands of unordered values. While calculating the Modify row is a spec requirement over the wire, internally updates can be handled through the Old & New rows, and a transaction that is specific to validation purposes can take advantage of this. Make the cache capable of processing an update2 with a nil Modify and non-nil Old & New as if it were a normal update notification, and avoid setting Modify for updates on validating transactions. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
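A simplified sketch of the described cache behaviour, with stand-in `Row` and `RowUpdate2` types: a nil Modify with non-nil Old & New is treated like a plain update, while the Modify branch below is deliberately simplified (the real cache applies set differences rather than overwriting columns wholesale).

```go
package sketch

// Row and RowUpdate2 are simplified stand-ins for the OVSDB types.
type Row map[string]interface{}

type RowUpdate2 struct {
	Old    *Row
	New    *Row
	Modify *Row
}

// applyUpdate shows the branching only: with a nil Modify and non-nil Old and
// New, the update is applied like a plain update notification, skipping the
// per-column difference entirely (the expensive part for large sets).
func applyUpdate(current Row, u RowUpdate2) Row {
	if u.Modify == nil && u.New != nil {
		return *u.New
	}
	for column, value := range *u.Modify {
		current[column] = value // simplified merge of the Modify columns
	}
	return current
}
```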
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
This PR introduces two different types of client indexes, primary and secondary,
and an option to enable transaction validation.
Primary indexes are verified to be unique at transaction validation, in the context
of that specific client instance (cache), when validation is enabled; otherwise there
is no difference between the two types of indexes.
When transaction validation is enabled, transactions are serialized in
the client. This avoids validating a transaction while other
transactions are in flight, ensuring the consistency of the
validation.
The validation is done by running the transaction in a similar way to how it
is done server-side, but on a throw-away database built on top of the
client cache.
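A hypothetical usage sketch only; the `IndexKind`, `ClientIndex` and option names below are placeholders and may not match the API this PR actually introduces. It is meant to show the intended difference between the two index kinds when validation is enabled.

```go
package sketch

// IndexKind and ClientIndex are placeholder types for illustration.
type IndexKind int

const (
	Primary   IndexKind = iota // uniqueness checked at client-side validation
	Secondary                  // lookup only, never checked for duplicates
)

type ClientIndex struct {
	Columns []string
	Kind    IndexKind
}

// Example: declare a unique "name" index and a non-unique "datapath_type"
// index for the Bridge table. With something like a WithTransactionValidation
// client option enabled, a transaction inserting a second Bridge named
// "br-int" would fail client-side before ever reaching the server.
var bridgeIndexes = map[string][]ClientIndex{
	"Bridge": {
		{Columns: []string{"name"}, Kind: Primary},
		{Columns: []string{"datapath_type"}, Kind: Secondary},
	},
}
```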
Topics for discussion:
Transaction validation has a performance cost. Specifically, checking
indexes needs to account for them being updated throughout the transaction, via
updates, mutates, or deletes, and the check needs to be deferred until all operations are processed.
Processing the transaction as it is done in the server, and reusing that code, is the most natural approach.
Measuring with one of the added benchmarks shows a 5% to 20% performance loss.
These benchmarks do incur IO, which makes drawing conclusions difficult. Profiling consistently shows an extra 4%-5% of total time spent:
Note: we are using the bridge model, which has a 4096-element int array that relatively hurts performance and is not particularly representative of what we would find in ovn-k.
It remains to be checked how this really affects specific use cases like ovn-k.
Ways to go forward:
We currently don't support garbage collecting strongly referenced rows that are no
longer referenced, and we don't correctly consider that on validation. Regarding indexes
and client-side validation, the outcome is not too bad: validation may error when it should not, but it
will not lead to any undesired index duplicates. The workarounds would be not to use validation, to explicitly
remove strongly referenced rows in the transaction, or to create a row with the same index in a separate
transaction from the one that removes the strongly referenced row.