opcua: nsu in nodeId (fixing #1334) #1335

Open
wants to merge 2 commits into
base: master

erossignon
Contributor

No description provided.

@sebastiankb
Contributor

sebastiankb commented Dec 2, 2024

@erossignon many thanks!

Besides the "ns=..." nodeId, the TD also needs to allow "nsu=..." as a node ID. We still need to clarify whether the nodeId can be part of the href or whether a dedicated ua:nodeId term is needed.

@danielpeintner
Member

The idea is that we use href instead of the own term opcua:nodeId (we use href in other bindings too).

For example, we now have

{
    "base": "opc.tcp://192.168.120.237:4840/",
    "properties": {
        "foo": {
            "forms": [
                {
                    "opcua:nodeId": "nsu=http://example.org/SpecialNamespace/;s=\"LED\".\"State\""
                }
            ]
        }
    }
}

which would become the following

{
    "base": "opc.tcp://192.168.120.237:4840/",
    "properties": {
        "foo": {
            "forms": [
                {
                    "href": "nsu=http://example.org/SpecialNamespace/;s=\"LED\".\"State\""
                }
            ]
        }
    }
}

The problem I see is with href parsers, since the absolute URL would become the following

opc.tcp://192.168.120.237:4840/nsu=http://example.org/SpecialNamespace/;s=\"LED\".\"State\"

A more Web-friendly format would be to use query parameters

opc.tcp://192.168.120.237:4840/?nsu=http://example.org/SpecialNamespace/&s=\"LED\".\"State\"

Note1: the difference is the ? after the host part and that arguments get concatenated via & and not ;

Note2: it has the downside that one needs to reconstruct the format string at OPC UA side.

Question: I wonder whether people share the concern that, if we purely use the OPC UA format string, href parsers will destroy/corrupt the information.

Note3: we also need to be careful with some specific characters like #, which gets interpreted as the fragment identifier. That could be the case in the nsu part. Try to parse the following string
opc.tcp://192.168.120.237/nsu=http://example.org/#foo;s=SpecialVar
on https://www.freeformatter.com/url-parser-query-string-splitter.html
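
The fragment concern in Note3 can also be checked locally. This is a minimal sketch, assuming a runtime that provides the WHATWG `URL` class (Node.js or a browser); the variable name is illustrative only.

```javascript
// Parse the string from Note3 with the standard WHATWG URL parser.
const parsed = new URL("opc.tcp://192.168.120.237/nsu=http://example.org/#foo;s=SpecialVar");

// Everything after the first '#' is treated as the fragment,
// so the node ID ends up split between the path and the fragment.
console.log(parsed.pathname); // "/nsu=http://example.org/"
console.log(parsed.hash);     // "#foo;s=SpecialVar"
```

The ";s=SpecialVar" part is no longer attached to the namespace, which is the kind of corruption the question above is about.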

@erossignon
Contributor Author

erossignon commented Dec 3, 2024

Maybe one way is to encode the href portion using encodeURIComponent / decodeURIComponent:

encodeURIComponent("opc.tcp://192.168.120.237:4840/nsu=http://example.org/SpecialNamespace/;s=\"LED\".\"State\"")

'opc.tcp%3A%2F%2F192.168.120.237%3A4840%2Fnsu%3Dhttp%3A%2F%2Fexample.org%2FSpecialNamespace%2F%3Bs%3D%22LED%22.%22State%22'

to prevent any clashes
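
A possible round trip is sketched below. It assumes that only the node ID portion is encoded and appended to the server endpoint, whereas the call above encodes the whole string. The endpoint and node ID are the example values used earlier in this thread.

```javascript
// Endpoint and node ID taken from the examples in this thread.
const endpoint = "opc.tcp://192.168.120.237:4840/";
const nodeId = 'nsu=http://example.org/SpecialNamespace/;s="LED"."State"';

// Encode only the node ID so the endpoint part stays readable.
const href = endpoint + encodeURIComponent(nodeId);
console.log(href);

// A consumer strips the endpoint and decodes to recover the original node ID.
const recovered = decodeURIComponent(href.slice(endpoint.length));
console.log(recovered === nodeId); // true
```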

@egekorkan
Member

Going with URL encoding was also the preliminary decision taken for MQTT at w3c/wot-binding-templates#292 (comment)

@danielpeintner
Member

Going with URL encoding was also the preliminary decision taken for MQTT at w3c/wot-binding-templates#292 (comment)

The problem I see with doing so is that for humans, it is not easy to read the "encoded" string.
(Not sure if this is an actual problem)

Anyhow, it seems we need a general decision that can be taken across all/other bindings...

@egekorkan
Member

We had a chat about this together with @sebastiankb @danielpeintner @wiresio and @Kaz040. Here are our points, which need to be discussed with the OPC UA community as well.

  • The TDs should still be human-readable: encoding the entire URI breaks this. This can put off those who are used to OPC UA.
  • It should be easy to copy-paste node ids from a tool like UA Expert to TDs
  • We have evaluated some aspects:
    • Using single quotes inside the double quotes: It is valid JSON, but the UA server can still have double quotes that need to be converted when copy-pasted. Surrounding a string with single quotes is not valid JSON.
    • The double quote seems to be the most annoying part, since it is a special character in JSON even when we do not use href. We are not sure how common it is in general, but in Siemens S7 it is quite common.
    • Escaping the quotes would be tolerable for someone copy-pasting. It can be done by humans easily as opposed to URL-encoding.
    • Best would be to use query parameters if people can get used to it.
    • If we want to do encoding, we can simply encode the part after nsu
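
The readability trade-off between escaping quotes and URL-encoding can be compared directly. This is a sketch using the node ID from the earlier examples; the variable names are illustrative only.

```javascript
// Node ID as it might be copied from a tool (contains double quotes).
const nodeId = 'nsu=http://example.org/SpecialNamespace/;s="LED"."State"';

// JSON serialization only escapes the double quotes with backslashes.
const jsonEscaped = JSON.stringify(nodeId);
console.log(jsonEscaped);
// "nsu=http://example.org/SpecialNamespace/;s=\"LED\".\"State\""

// Percent-encoding rewrites many more characters (':', '/', '=', ';', '"').
const percentEncoded = encodeURIComponent(nodeId);
console.log(percentEncoded);
// nsu%3Dhttp%3A%2F%2Fexample.org%2FSpecialNamespace%2F%3Bs%3D%22LED%22.%22State%22

// Both forms are reversible.
console.log(JSON.parse(jsonEscaped) === nodeId);            // true
console.log(decodeURIComponent(percentEncoded) === nodeId); // true
```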

@sebastiankb
Contributor

sebastiankb commented Jan 8, 2025

In the OPC UA WoT Binding WG we discussed 2 relevant options to provide the nodeId information:


Option I:
The value of href can be used with the following patterns:

opc.tcp:// <address>:<port>/?nsu=<namespace>&s=urlEncoded(<stringId>) 
                                            &b=urlEncoded(<base64string>) 
                                            &i=<integer> 
                                            &g=<guid>

Note: The UA server endpoint information opc.tcp:// <address>:<port> can be specified in the base instead.
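
As a rough illustration of Option I, the sketch below builds such an href for a string-based ID and reconstructs the OPC UA node ID from it. The function names are hypothetical, only the `s=` identifier type is covered, and the namespace is left unencoded as in the pattern above.

```javascript
// Build an Option I href from a namespace URI and a string identifier.
// Only the string identifier is percent-encoded, as the pattern describes.
function buildOptionOneHref(endpoint, namespaceUri, stringId) {
  return `${endpoint}?nsu=${namespaceUri}&s=${encodeURIComponent(stringId)}`;
}

// Reconstruct the OPC UA node ID string from such an href.
function nodeIdFromOptionOneHref(href) {
  const query = href.slice(href.indexOf("?") + 1);
  const params = {};
  for (const pair of query.split("&")) {
    const eq = pair.indexOf("=");
    params[pair.slice(0, eq)] = pair.slice(eq + 1);
  }
  return `nsu=${params.nsu};s=${decodeURIComponent(params.s)}`;
}

const href = buildOptionOneHref(
  "opc.tcp://192.168.120.237:4840/",
  "http://example.org/SpecialNamespace/",
  '"LED"."State"'
);
console.log(href);
// opc.tcp://192.168.120.237:4840/?nsu=http://example.org/SpecialNamespace/&s=%22LED%22.%22State%22
console.log(nodeIdFromOptionOneHref(href));
// nsu=http://example.org/SpecialNamespace/;s="LED"."State"
```

A namespace URI containing & or # would break this parsing, since it is not encoded here.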

Pro:

  • compliant to RFC3986
  • compliant to WoT datapoint addressing assumption
  • URL encoding is required only for string-based IDs; readability improves if not everything is URL encoded

Con:

  • nodeID cannot be used as is (e.g., nsu=http://example.org/SpecialNamespace/;s=\"LED\".\"State\"); must be separated into the query parameters and URL encoding is needed for string-based IDs

Option II:

  1. The value of href is empty or only used for UA server endpoint information (e.g., if base is not used).
  2. A separate term, ua:nodeID, is defined that takes the UA nodeID as value, e.g.:
...
"href":"",
"ua:nodeID":"nsu=http://example.org/SpecialNamespace/;s=\"LED\".\"State\""
...

Pro:

  • compliant to RFC3986
  • nodeID can be used as is (e.g., no URL encoding is needed)

Con:

  • not compliant to WoT datapoint addressing assumption
  • href is most likely always empty

What do you think?

Ping @barnstee @randy-armstrong @erossignon @danielpeintner @egekorkan @relu91

@egekorkan
Member

W3C WoT TD Task Force call today:

  • Why is the " escaped and then encoded again
  • @relu91, @danielpeintner, and @lu-zero value consistency, so they favor Option I.
  • namespace can need url encoding too since it can contain # too
  • @sebastiankb why are there multiple query parameters and why not everything is encoded?
  • We should verify that we can always decode, whether it was encoded first or not.

@sebastiankb
Contributor

Many thanks for your feedback!

Why is the " escaped and then encoded again

This example comes from a real sample implementation that uses nested " in the string-based nodeID.

namespace can need url encoding too since it can contain # too

OK, good point. It was hoped that the values of queries would be excluded...

@sebastiankb why are there multiple query parameters and why not everything is encoded?

Actually, the answer is already given above.
The idea was to keep the readability of nodeIDs in TDs as high as possible and to avoid the noise of URL encoding as much as possible. But if we also need to encode the namespace, then I see no advantage in defining separate query parameters. In that context, the proposal from @erossignon above makes most sense:

opc.tcp:// <address>:<port>/urlEncoded(<nodeID>)

It seems that a more readable approach is not possible...

@sebastiankb
Contributor

sebastiankb commented Jan 9, 2025

There also seems to be a nice alternative where the UA nodeID remains as is (=readability) and keeps the WoT consistency:

opc.tcp:// <address>:<port>/#<nodeID>

The fragment identifier # takes any character, an URL encoding is not needed.

(btw: this approach was also discussed for MQTT in the past)
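
A consumer could recover the node ID from such an href roughly as follows. This is only a sketch: the function name is hypothetical, and it splits at the first # by plain string handling rather than relying on a URL parser.

```javascript
// Split an href of the form opc.tcp://<address>:<port>/#<nodeID> at the first '#'.
function splitFragmentHref(href) {
  const i = href.indexOf("#");
  if (i < 0) return { endpoint: href, nodeId: null };
  return { endpoint: href.slice(0, i), nodeId: href.slice(i + 1) };
}

// Inside a TD the double quotes still have to be escaped for JSON.
const td = JSON.parse(
  '{"href": "opc.tcp://192.168.120.237:4840/#nsu=http://example.org/SpecialNamespace/;s=\\"LED\\".\\"State\\""}'
);
const { endpoint, nodeId } = splitFragmentHref(td.href);
console.log(endpoint); // opc.tcp://192.168.120.237:4840/
console.log(nodeId);   // nsu=http://example.org/SpecialNamespace/;s="LED"."State"
```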

@barnstee

barnstee commented Jan 9, 2025

There also seems to be a nice alternative where the UA nodeID remains as is (=readability) and keeps the WoT consistency:

opc.tcp:// <address>:<port>/#<nodeID>

The fragment identifier # takes any character, an URL encoding is not needed.

(btw: this approach was also discussed for MQTT in the past)

This is perfect!

@relu91
Member

relu91 commented Jan 9, 2025

There also seems to be a nice alternative where the UA nodeID remains as is (=readability) and keeps the WoT consistency:

opc.tcp:// <address>:<port>/#<nodeID>

The fragment identifier # takes any character, an URL encoding is not needed.

(btw: this approach was also discussed for MQTT in the past)

It might be a clean solution from the readability point of view, but it still breaks the URI semantics explained in RFC3986 (see my old comment in the thread cited by @sebastiankb).

We can of course still bend the rules a little bit... but it doesn't convince me 100%. Encoding the string does not seem to be a big burden from my point of view, and it has the benefit of working with JSON too... (even with the fragment solution you would still need to escape all the quotes as \")

@sebastiankb
Contributor

sebastiankb commented Jan 10, 2025

I have just found an interesting piece of information from the OPC UA specification about NodeIds:

The URI portion of NodeIds are escaped with URI percent encoding as defined in RFC 3986. Semicolons are added to the list of reserved characters of all URI schemes.

I'm not sure I understand the examples presented, e.g.

nsu=tag:acme.com,2023:schemas:data#off%3B;b=M/RbKBsRVkePCePcx24oRA==

Why only # character is encoded here?

I will discuss this in the next UA WG meeting.

@relu91 do you have time to join us on Tuesday at 4 pm CET?

@sebastiankb
Contributor

sebastiankb commented Jan 15, 2025

At yesterday's working group meeting, we achieved a breakthrough with the following solution for OPC UA:

opc.tcp://<address>:<port>/?id=<nodeId>

Where:

    <address> OPC UA server (IP) address
    <port> OPC UA server port number
    <nodeId> OPC UA NodeId with the following expectations:
            1) any hash character (#) must be URL encoded (%23)
            2) any ampersand character (&) must be URL encoded (%26)  
    
    Examples:
      - "href":"opc.tcp://192.168.120.237:4840/?id=ns=10;i=12345"   
      - "href":"opc.tcp://192.168.120.237:4840/?id=nsu=http://widgets.com/schemas/hello;s=水 World"  
      - "href":"/?id=nsu=http://example.com/hello%23;s=temperature"          
      - "href":"ns=10;i=12345" (Note: the corresponding "base" value ends with "/?id=")                           
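
The proposal above could be implemented along these lines. This is a minimal sketch with hypothetical function names, assuming an absolute href and an endpoint that ends in "/". It uses plain string slicing instead of `URLSearchParams`, because the latter turns "+" into a space, which could corrupt base64 identifiers.

```javascript
// Encode only the characters the proposal requires: '#' and '&'.
function encodeNodeIdForHref(nodeId) {
  return nodeId.replace(/#/g, "%23").replace(/&/g, "%26");
}

// Build the href from a server endpoint ending in '/' and a node ID.
function buildHref(endpoint, nodeId) {
  return `${endpoint}?id=${encodeNodeIdForHref(nodeId)}`;
}

// Recover the node ID: take everything after "?id=" and percent-decode it.
function nodeIdFromHref(href) {
  const marker = "?id=";
  const i = href.indexOf(marker);
  if (i < 0) throw new Error("href has no ?id= part");
  return decodeURIComponent(href.slice(i + marker.length));
}

const href = buildHref(
  "opc.tcp://192.168.120.237:4840/",
  "nsu=http://example.com/hello#;s=temperature"
);
console.log(href);
// opc.tcp://192.168.120.237:4840/?id=nsu=http://example.com/hello%23;s=temperature
console.log(nodeIdFromHref(href));
// nsu=http://example.com/hello#;s=temperature
```

A node ID containing a literal % is not handled by this sketch.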
 

@egekorkan
Member

W3C WoT TD TF call today:

  • We should add something saying that "Encoder SHOULD be special and only encode # and &" but this is not a MUST since nothing would break if someone encodes everything, other than humans trying to read it.
  • For the implementers, we should tell them that they can just decode it using any url decoder.
  • Following both points above: most compression algorithms specify only the decoder, and the encoder can be implemented as desired (smarter or not).
  • We should probably escape % as well. We are not aware of any node id containing that but it is theoretically possible.
  • The spec text should reference both "URL Encoding/Decoding" and "Percent Encoding/Decoding". The latter is apparently the correct wording. See https://en.wikipedia.org/wiki/Percent-encoding
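
The encoder/decoder asymmetry discussed in these points can be sketched as follows. The function names and the sample node ID are made up for illustration, and the minimal encoder also escapes % as suggested above.

```javascript
// "Smart" encoder: escapes only '%', '#' and '&'. '%' must be replaced first
// so that the escapes produced for '#' and '&' are not escaped again.
function encodeMinimal(nodeId) {
  return nodeId.replace(/%/g, "%25").replace(/#/g, "%23").replace(/&/g, "%26");
}

// "Naive" encoder: percent-encodes everything encodeURIComponent touches.
function encodeFull(nodeId) {
  return encodeURIComponent(nodeId);
}

// One generic percent-decoder serves both encoders.
function decode(encoded) {
  return decodeURIComponent(encoded);
}

// Made-up node ID containing all three special characters.
const nodeId = "nsu=http://example.com/a#b&c;s=100% 水";
console.log(encodeMinimal(nodeId));
// nsu=http://example.com/a%23b%26c;s=100%25 水
console.log(decode(encodeMinimal(nodeId)) === nodeId); // true
console.log(decode(encodeFull(nodeId)) === nodeId);    // true
```

Without the % escape, a literal % in a node ID could be misread as the start of an escape sequence by the decoder.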
