Fix HDF5 dataset #1545

Open
wants to merge 1 commit into
base: master


amikholap

Hi.

I tried to use an .h5 file as a data source and it didn't go smoothly.
Here are some fixes that allowed me to iterate through a file created with pandas.HDFStore.

  1. Don't pass parameters to the superclass __new__ method (object.__new__).
    It seems that HDF5Dataset was once based on DenseDesignMatrix, but now it's just a Dataset, so this code path breaks.
  2. Add function calls where required.
  3. Fix interaction with iterators.
    I'm not sure how the interaction is designed to work.
    One problem was that pytables returns a numpy array of lists instead of tuples, which breaks https://github.com/lisa-lab/pylearn2/blob/master/pylearn2/space/__init__.py#L1481 since the array is 1-dimensional.
    The second problem is that the iterator calls np_format_as on a tuple of tuples, which fails the is_numeric_batch check.
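
To illustrate items 1 and 3, here is a minimal standalone sketch. It is not the code from this PR and does not use pylearn2 or pytables. `BrokenDataset`, `FixedDataset` and `rows_to_2d` are hypothetical names made up for the example, and the behaviour shown for `object.__new__` is that of Python 3. The second half only assumes that the rows arrive as a 1-dimensional object array of equal-length lists, as described above.

```python
import numpy as np


class BrokenDataset(object):
    """Hypothetical stand-in: forwards constructor arguments to object.__new__."""

    def __new__(cls, filename):
        # object.__new__ accepts only the class, so on Python 3 this raises TypeError.
        return super(BrokenDataset, cls).__new__(cls, filename)

    def __init__(self, filename):
        self.filename = filename


class FixedDataset(object):
    """Hypothetical stand-in: passes only the class to object.__new__."""

    def __new__(cls, filename):
        return super(FixedDataset, cls).__new__(cls)

    def __init__(self, filename):
        self.filename = filename


def rows_to_2d(rows):
    """Turn a 1-D object array of equal-length row lists into a 2-D float array."""
    return np.array(rows.tolist(), dtype=float)


# Build a 1-D object array whose elements are Python lists (one list per row).
rows = np.empty(3, dtype=object)
for i, row in enumerate([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]):
    rows[i] = row

# rows has a single dimension; batch has one axis for rows and one for columns.
batch = rows_to_2d(rows)
```

Code that expects a batch with separate row and column axes can index `batch`, whereas `rows` only exposes the row axis.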

I'm a total novice with pylearn2. Please tell me if this can be implemented in a more canonical way.
