-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade validation methods #1911
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## release-3.0.0 #1911 +/- ##
=================================================
- Coverage 92.69% 92.69% -0.01%
=================================================
Files 27 28 +1
Lines 2684 2710 +26
Branches 706 709 +3
=================================================
+ Hits 2488 2512 +24
- Misses 127 128 +1
- Partials 69 70 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
@rly this should be ready for review There are a couple remaining TODO items related to validation support for NWBZarr files using the file path that could potentially be bumped to a separate PR:
|
test.py
Outdated
@@ -169,7 +169,7 @@ def validate_nwbs(): | |||
is_family_nwb_file = False | |||
try: | |||
with pynwb.NWBHDF5IO(nwb, mode='r') as io: | |||
errors = pynwb.validate(io) | |||
errors = validate(io, use_cached_namespaces=False) # previously io did not validate against cached namespaces |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errors = validate(io, use_cached_namespaces=False) # previously io did not validate against cached namespaces | |
errors = validate(io, use_cached_namespaces=False), | |
errors.append(validate(io, use_cached_namespaces=True)) |
- I think the original comment here might not make sense in the future
- The example NWB files are all generated using the version of NWB being tested, and because pynwb caches the spec by default, there should be no difference between using cached namespaces and not.
- Since the
pynwb.validate
andpynwb-validate
should be the same now, we don't really need this test anymore since we have thepynwb-validate
test below. But since thisvalidate_nwbs()
function is super conservative in its testing of every combination, then for consistency, I suggest we validate with bothuse_cached_namespaces=True
anduse_cached_namespaces=False
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I set use_cached_namespaces=True
here, I get several errors when running the validate_nwb()
section of test.py
. I had added use_cached_namespaces=False
so that it matched the previous test behavior, but if it is expected that there should be no difference between using the cached namespaces and not in this particular case, then maybe these errors are indicative of another issue?
You can replicate by running test.py
, but the errors look like this below. Maybe the mylab extension generation needs to be updated?:
2024-12-23 10:12:02,969 - INFO - Validating with pynwb.validate method.
Traceback (most recent call last):
File "/Users/smprince/anaconda3/envs/pynwb/lib/python3.11/site-packages/hdmf/validate/validator.py", line 280, in get_validator
return self.__validators[dt]
~~~~~~~~~~~~~~~~~^^^^
KeyError: 'NWBFile'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/smprince/Documents/code/pynwb/test.py", line 172, in validate_nwbs
errors = validate(io, use_cached_namespaces=True) # previously io did not validate against cached namespaces
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/smprince/anaconda3/envs/pynwb/lib/python3.11/site-packages/hdmf/utils.py", line 672, in func_call
return func(**pargs)
^^^^^^^^^^^^^
File "/Users/smprince/Documents/code/pynwb/src/pynwb/validation.py", line 191, in validate
validation_errors += _validate_helper(io=io, namespace=validation_namespace)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/smprince/Documents/code/pynwb/src/pynwb/validation.py", line 19, in _validate_helper
return validator.validate(builder)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/smprince/anaconda3/envs/pynwb/lib/python3.11/site-packages/hdmf/utils.py", line 668, in func_call
return func(args[0], **pargs)
^^^^^^^^^^^^^^^^^^^^^^
File "/Users/smprince/anaconda3/envs/pynwb/lib/python3.11/site-packages/hdmf/validate/validator.py", line 298, in validate
validator = self.get_validator(dt)
^^^^^^^^^^^^^^^^^^^^^^
File "/Users/smprince/anaconda3/envs/pynwb/lib/python3.11/site-packages/hdmf/utils.py", line 668, in func_call
return func(args[0], **pargs)
^^^^^^^^^^^^^^^^^^^^^^
File "/Users/smprince/anaconda3/envs/pynwb/lib/python3.11/site-packages/hdmf/validate/validator.py", line 283, in get_validator
raise ValueError(msg)
ValueError: data type 'NWBFile' not found in namespace mylab
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these lines:
pynwb/docs/gallery/general/extensions.py
Line 41 in 80fc386
ns_builder.include_type("ElectricalSeries", namespace="core") |
pynwb/docs/gallery/general/extensions.py
Line 267 in 80fc386
ns_builder.include_type("NWBDataInterface", namespace="core") |
could be updated to:
ns_builder.include_namespace("core")
to fix these errors. I think these changes more closely match the latest version of ndx-template create_extension_spec.py
file. However, should ns_builder.include_type
still work without including the entire namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. For now, let's update that to ns_builder.include_namespace("core")
.
We need to think through what it means to run the pynwb validator on a file, validating against a particular non-core namespace. Should the core namespace always be included during validation, regardless of whether the extension includes the core namespace, and the choices are either use the core namespace cached in the file or the one installed with pynwb? I think so...
I think we eventually want to move toward not even allowing validation against a particular namespace. Either the file is valid given its cached namespaces (or the namespaces installed by pynwb and loaded by the user), or the file is not. Otherwise, we run into weird issues with such as:
hdmf-dev/hdmf#608 and hdmf-dev/hdmf#525
I would say let's make the above change for now, and iterate on these other ideas in a separate PR which does not need to make it in pynwb 3.0.0rc1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the core namespace always be included during validation, regardless of whether the extension includes the core namespace, and the choices are either use the core namespace cached in the file or the one installed with pynwb? I think so...
I think so too? But agreed it would definitely be helpful to compile and discuss all these different scenarios to get a better idea of the end goal validation behavior
I would say let's make the above change for now, and iterate on these other ideas in a separate PR which does not need to make it in pynwb 3.0.0rc1
Sounds good!
Co-authored-by: Ryan Ly <rly@lbl.gov>
Motivation
Addresses several issues summarized in #1808. This is a breaking change for the next major release.
validate
returns the same output whetherpaths
orio
is providedpynwb-validate
entry pointpython -m pynwb.validation_cli "test.nwb"
as an alternative to the entry pointvalidate
andvalidation cli
docstrings to point users to NWB inspectorpynwb.validate
intopynwb.validation
module so thatvalidate
is the function name andvalidation
is the module nameThis PR also modifies the validate method so that it
TODO
finish testing with(bump to later PR)ZarrIO
with file path validationChecklist
flake8
from the source directory.