Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creating More Accuracy in NS_TO_S #460

Open
wants to merge 6 commits into
base: rolling
Choose a base branch
from

Conversation

CursedRock17
Copy link

This Pull Request is done in preparation for this geometry2 issue which reutilizes logic increasing accuracy in the conversion from nanoseconds to seconds. I assumed we wanted to keep these as Macros opposed to creating functions, but this would mean the Macros continue to take multiple "types" when we just want nanoseconds (int64_t). So should they stay as macros and we just remove all tests which don't take an int64 type and begin enforcing this behavior?

Signed-off-by: CursedRock17 <mtglucas1@gmail.com>
Signed-off-by: CursedRock17 <mtglucas1@gmail.com>
Signed-off-by: CursedRock17 <mtglucas1@gmail.com>
@CursedRock17 CursedRock17 marked this pull request as ready for review April 18, 2024 00:47
Copy link
Contributor

@ahcorde ahcorde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me with green CI.

Let's put this on the queue to merge after the freeze is lifted.

Copy link
Collaborator

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with green CI

@clalancette
Copy link
Contributor

I think that before we merge this, we should have a companion PR in geometry2 (or somewhere else) that actually uses it. Otherwise, we can run into the situation where we have an API that nothing uses, which isn't great.

@ahcorde
Copy link
Contributor

ahcorde commented Apr 26, 2024

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

Copy link
Contributor

@ahcorde ahcorde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI is failing

@CursedRock17
Copy link
Author

CursedRock17 commented Apr 27, 2024

I'm going to roll out a fix soon, I assume the MACRO is returning long instead of a double, but it's weird that it's not throwing an error on my local tests, which require doubles as results. Also, is there something I missing with the CI having semi-random K's in the error message?

[K/home/jenkins-agent/workspace/ci_linux/ws/src/ros2/rmw_fastrtps/rmw_fastrtps_shared_cpp/src/qos.cpp:145:��[Kerro�[Kinvalid operands of type�[Kdou�[K’ an�[Klong �[K’ to binar�[Koperat�[K’

Edit:
It seems like the error is not on our part, but rather rmw_fastrtps. This call to RCUTILS_NS_TO_S is causing the problem where it calls incorrect types between double to long, because the nanoseconds that are being passed, are in the form of a double already. The line before is:

double period_in_ns = entity_qos.liveliness().lease_duration.to_ns() * 2.0 / 3.0;

Which converts lease_duration to a nanosecond which is, throughout the entire ecosystem, given to be an int64_t, seen here. I guess now it can't handle an operation between double and long, but if the user uses a long long (int64_t), the compiler can handle the implicit conversion. So should this be brought up to rmw_fastrtps, because they may be doing excess conversion on their qos values? Also, this shouldn't cause any other issues on a widespread basis, because the user is returning a double from this function, converting a double to another double doesn't make logistical sense IMO.

@clalancette
Copy link
Contributor

It seems like the error is not on our part, but rather rmw_fastrtps. This call to RCUTILS_NS_TO_S is causing the problem where it calls incorrect types between double to long, because the nanoseconds that are being passed, are in the form of a double already. The line before is:

double period_in_ns = entity_qos.liveliness().lease_duration.to_ns() * 2.0 / 3.0;

Agreed with your analysis; that line of code should probably be:

int64_t period_in_ns = entity_qos.liveliness().lease_duration.to_ns() * 2 / 3;

(though there is some risk of losing a bit of precision there).

If you'd like to open a PR over at rmw_fastrtps to change that, we can review it over there and unblock this.

Comment on lines 39 to 41
#define RCUTILS_NS_TO_S(nanoseconds) \
((1e-9 * (double)((int32_t)(nanoseconds % 1000000000l))) + \
((double)((int32_t)((nanoseconds - ((int32_t)(nanoseconds % 1000000000l))) / 1000000000l))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment in ros2/rmw_fastrtps#755 on why I'm very nervous about making this change. Basically things that were expecting integer division before are now going to get FPU division. While it may be more accurate, it also changes the expectations.

This actually leads to another observation, which is that the type of nanoseconds is ill-defined because this is a macro.

All of that leads me to another thought. Maybe what we should do here is to make a new function called double rcutils_nanoseconds_to_seconds(double). And then we (slowly) convert uses of the macro over to that new function. That way we aren't making this change for things that don't expect it, and we can get the types right on the new function.

Thoughts?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it may be more accurate, it also changes the expectations.

Yeah as we found in this review, changing the nature of the macro induces errors in testing. To be fair, it's a beneficial change, but if it causes errors in testing there's a high chance it could lead to breakages on the users' end. Though I don't think going the route of new typed function is a better option as it just adds more to the file and alters the way code is written, when I would assume users just want to use a macro, design-wise. Unless most of the macros are rebranded to typed functions, I would feel that it's just an awkward change when it comes to DX.

This actually leads to another observation, which is that the type of nanoseconds is ill-defined because this is a macro.

Yeah while rcutils and dds implementations use int64_t it's not necessarily common knowledge for users.

I would like to create more accuracy, but since there's not a effective way to deprecate macros, I doubt it's worth breaking user behavior to do so, nor is it worth expanding on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to create more accuracy, but since there's not a effective way to deprecate macros, I doubt it's worth breaking user behavior to do so, nor is it worth expanding on.

I think that adding more accurate function calls (especially if they are constexpr) and migrating where it makes sense would work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that adding more accurate function calls (especially if they are constexpr) and migrating where it makes sense would work?

Yeah, that's what I'm thinking we should do.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(especially if they are constexpr) and migrating where it makes sense would work?

Seems fair enough to make more calls, but since rcutils is essentially C repository, did we need these split functions in a different repo?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fair enough to make more calls, but since rcutils is essentially C repository, did we need these split functions in a different repo?

Good point. It can't be constexpr here, but it can at least be a function, be const, and use types.

Alternatively, if we wanted to use C++ and constexpr, we could move it to rcpputils, but then there are certain places it can't be used (like rcutils, rcl, rcl_action, and rmw).

Signed-off-by: CursedRock17 <mtglucas1@gmail.com>
@CursedRock17
Copy link
Author

In response to this comment I went with the first option of keeping everything within rcutils as to broaden these new functions to as many repositories as possible as and continue to solve the original issue. I went ahead and made functions of all the macros that magnify the time as that's where most of our lost precision comes from, I felt the other 3 macros don't necessarily need functions of the their own (more bloat in the file).

As I was breaking down the tests I would run into oddball tests in which a value would be too large for the original uint32_t conversion, but after removing the casts to said type, all of my tests were passing with the correct values to pass. After digging the original logic comes from here with no way to find out why said conversions were used. Just wanted to know if my analysis is wrong here

Signed-off-by: CursedRock17 <mtglucas1@gmail.com>
src/time.c Outdated
{
// scale the nanoseconds separately for improved accuracy
int64_t sec, nsec;
nsec = (nanoseconds % 1000000000l);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a minor suggestion. can we have like RCUTILS_NANOSECS_PER_SEC for 1000000000 and use it else where? e.g RCUTILS_S_TO_NS, RCUTILS_NS_TO_S. the same thing can go to miliseconds and microseconds.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah seems like a great metric for users to have, especially if they design their own time-based algorithms and want standardization. I would figure it'll get used enough to be worth having 3 definitions.

Signed-off-by: CursedRock17 <mtglucas1@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants