QCFilterTransform should have richer syntax for describing tests and outputs #398

Open
davidpablocohn opened this issue Oct 1, 2024 · 5 comments

Comments

@davidpablocohn
Collaborator

And maybe shouldn't be called QCFilterTransform

Args are currently limited to a string of comma-separated var:lower:upper specs. Would be much more powerful if it took a dict such that you could specify:

kwargs:
  fields:
    RTMP:
      higher_than:
        value: -40
        message: 'Temps too low: {RTMP}'
      lower_than:
        value: 40
        message: 'Temps too high: {RTMP}'
      does_not_contain:
        value: 'NaN'
        message: 'Received NaN for RTMP'
        max_times: 5

Could be called TestValueTransform (?) and could have tests like

  • lower_than
  • higher_than
  • lower_than_or_equal
  • higher_than_or_equal
  • contains (with a 'negate' arg?, or the following)
  • does_not_contain
  • matches_regex

and...?
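To make the proposal concrete, here is a minimal Python sketch of how the test names above might map onto predicates. It is not the OpenRVDAS API: `TESTS` and `check_fields` are hypothetical names, records are plain dicts, and it assumes the message is issued when the named assertion does not hold (which is how the RTMP example reads). `max_times` is left out.

```python
import operator
import re

# Hypothetical mapping from the proposed test names to predicates.
# Each predicate takes (field_value, configured_value).
TESTS = {
    'lower_than': operator.lt,
    'higher_than': operator.gt,
    'lower_than_or_equal': operator.le,
    'higher_than_or_equal': operator.ge,
    'contains': lambda value, target: str(target) in str(value),
    'does_not_contain': lambda value, target: str(target) not in str(value),
    'matches_regex': lambda value, pattern: re.search(pattern, str(value)) is not None,
}


def check_fields(fields, spec):
    """Return the messages for every configured test that does NOT hold.

    Assumption: 'higher_than: -40' asserts the value should be above -40,
    so its message fires when the value is at or below -40.
    """
    messages = []
    for name, tests in spec.items():
        if name not in fields:
            continue  # nothing to test for this record
        for test_name, params in tests.items():
            if not TESTS[test_name](fields[name], params['value']):
                messages.append(params['message'].format(**fields))
    return messages


spec = {
    'RTMP': {
        'higher_than': {'value': -40, 'message': 'Temps too low: {RTMP}'},
        'lower_than': {'value': 40, 'message': 'Temps too high: {RTMP}'},
    }
}
print(check_fields({'RTMP': 55.0}, spec))
```

Numeric comparisons here assume the field has already been parsed to a number; a string value such as 'NaN' would need to be handled before `higher_than` or `lower_than` is applied.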

@davidpablocohn
Collaborator Author

davidpablocohn commented Oct 4, 2024

Now thinking that rather than assuming that a canned message should be issued if the test passes/fails, there should be a few different actions available, like 'skip', 'replace', 'log_message', or ...? Could, in theory, merge with Lewis Wilkie's ValueFilterIgnoreTransform (OceanDataTools#4).

    TWNC:
      lower_than:
        value: 0.1
        action: skip    # mutually exclusive with 'message'

If we wanted to go all Swiss army knife on this, we could even have the transform take an optional writer to divert/send messages to, but maybe that's over the top?
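A rough sketch of what per-test actions could look like, under several assumptions not settled in this thread: the action fires when the condition is true (as the TWNC example reads), 'skip' drops the whole record, and 'replace' takes its new value from a hypothetical `replacement` key. `apply_actions` is an illustrative name, not an existing function.

```python
import logging
import operator

# Only two tests are included to keep the sketch short.
TESTS = {'lower_than': operator.lt, 'higher_than': operator.gt}


def apply_actions(fields, spec):
    """Return the (possibly modified) fields, or None if the record is skipped."""
    result = dict(fields)
    for name, tests in spec.items():
        if name not in result:
            continue
        for test_name, params in tests.items():
            if not TESTS[test_name](result[name], params['value']):
                continue  # condition not met, so no action
            action = params.get('action', 'log_message')
            if action == 'skip':
                return None  # assumption: skip drops the entire record
            if action == 'replace':
                # 'replacement' is a hypothetical key, not from the issue
                result[name] = params.get('replacement')
                break  # stop testing a value we just replaced
            if action == 'log_message':
                logging.warning(params.get('message', '').format(**result))
    return result


skip_spec = {'TWNC': {'lower_than': {'value': 0.1, 'action': 'skip'}}}
print(apply_actions({'TWNC': 0.05}, skip_spec))
print(apply_actions({'TWNC': 0.5}, skip_spec))
```

Returning None for a skipped record keeps the sketch easy to chain; whether 'skip' should drop the record or only the field is an open design question.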

@webbpinner
Contributor

Interestingly, the Sealog/InfluxDB integration does something similar: a value can be modified automatically (e.g. multiplied by -1 so that GGA altitude always yields a positive depth) or only when a test condition is met. I used a shorthand notation for the tests: lt (<), lte (<=), gt (>), gte (>=), eq (==), ne (!=),

so the resulting YAML syntax looks like:

<field_name>:
  modify:
    - operation:
        - multiply: -1
<field_name>:
  modify:
    - test:
        - eq: 0
      operation:
        - setTo: null

It can also apply a modification based on the value of a different field:

<field_name>:
  modify:
    - test:
        - field: <other_field_name>
          lt: 0
      operation:
        - multiply: -1
        - (action 2)

I'm not suggesting this exact schema be used, just providing for context/discussion.
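For discussion, here is one way the modify/test/operation structure above could be interpreted in Python. This is a sketch written from the YAML shown in this comment, not Sealog's actual code; `COMPARATORS`, `apply_modify`, and `_test_holds` are hypothetical names, and only the `multiply` and `setTo` operations are covered.

```python
import operator

# Shorthand test names from the comment above, mapped to Python operators.
COMPARATORS = {
    'lt': operator.lt, 'lte': operator.le,
    'gt': operator.gt, 'gte': operator.ge,
    'eq': operator.eq, 'ne': operator.ne,
}


def _test_holds(fields, name, test):
    """Evaluate one test dict against its own field or a named other field."""
    subject = fields.get(test.get('field', name))
    if subject is None:
        return False  # assumption: a missing value never satisfies a test
    return all(COMPARATORS[key](subject, target)
               for key, target in test.items() if key != 'field')


def apply_modify(fields, rules):
    """Apply each field's 'modify' steps; a step with no 'test' always runs."""
    result = dict(fields)
    for name, rule in rules.items():
        for step in rule.get('modify', []):
            if result.get(name) is None:
                break  # nothing left to modify for this field
            if not all(_test_holds(result, name, t) for t in step.get('test', [])):
                continue
            for op in step.get('operation', []):
                for op_name, arg in op.items():
                    if op_name == 'multiply' and result[name] is not None:
                        result[name] = result[name] * arg
                    elif op_name == 'setTo':
                        result[name] = arg
    return result


rules = {'Depth': {'modify': [{'operation': [{'multiply': -1}]}]}}
print(apply_modify({'Depth': -12.0}, rules))
```

The dicts passed to `apply_modify` mirror what a YAML loader would produce from the blocks above, with `null` becoming `None`.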

@davidpablocohn
Collaborator Author

davidpablocohn commented Oct 4, 2024 via email

@webbpinner
Contributor

In the case of Sealog, the other field has to be included in the DB query... but I also include an option to leave queried fields out of the output to Sealog. At first glance this might not make sense, but this is how I'm able to translate NMEA GGA lat/lng to ddegs. I simply set up the modify block to look at the NorS, EorW fields and perform sign flips on the latitude, longitude fields, but then exclude the NorS, EorW field values from the output.
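The sign-flip-then-exclude idea can be sketched in a few lines of Python. This is only an illustration of the pattern described above, not Sealog code: `signed_position` is a hypothetical helper, and the `Latitude` and `Longitude` field names are assumptions (only `NorS` and `EorW` come from the comment).

```python
def signed_position(fields, exclude=('NorS', 'EorW')):
    """Flip lat/lon signs from hemisphere flags, then drop the flag fields."""
    result = dict(fields)
    if result.get('NorS') == 'S' and 'Latitude' in result:
        result['Latitude'] = -abs(result['Latitude'])    # south is negative
    if result.get('EorW') == 'W' and 'Longitude' in result:
        result['Longitude'] = -abs(result['Longitude'])  # west is negative
    # Helper fields were needed for the test but are not wanted in the output.
    return {key: value for key, value in result.items() if key not in exclude}


record = {'Latitude': 41.5, 'NorS': 'S', 'Longitude': 70.6, 'EorW': 'W'}
print(signed_position(record))
```

The point for the transform design is that a test may need fields that should not survive into the output, so an explicit exclude list (or similar) is worth considering.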

@davidpablocohn
Collaborator Author

davidpablocohn commented Oct 4, 2024 via email


2 participants