Error when calling fit_transform for Vset with is_async=True #51

Open
jpdunc23 opened this issue Nov 15, 2022 · 0 comments

In the example below, when a Vset is built with is_async=True, the transform method expects to receive a ray.ObjectRef and calls ray.get on it, but it receives an array instead:

from vflow import build_vset, init_args

import numpy as np

from sklearn.decomposition import PCA
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

import ray

ray.init(num_cpus=4)

X, y = make_regression(n_samples=1000, n_features=100, n_informative=1)

X_trainval, X_test, y_trainval, y_test = train_test_split(X, y)
X_train, X_val, y_train, y_val = train_test_split(X_trainval, y_trainval)

X_train, y_train = init_args([X_train, y_train], names=['X_train', 'y_train'])
X_val, y_val = init_args([X_val, y_val], names=['X_val', 'y_val'])

# create a Vset for bootstrapping from data 10 times
# we use lazy=True so that the data will not be resampled until needed
boot_set = build_vset('boot', resample, reps=10, lazy=True)

# bootstrap from training data by calling boot_set
X_trains, y_trains = boot_set(X_train, y_train)

# hyperparameters to try
pca_params = {
    'n_components': [10, 20, 50],
    'svd_solver': ['randomized', 'full', 'auto']
}

# we could instead pass a list of distinct models and corresponding param dicts
pca_set = build_vset('PCA', PCA, pca_params, is_async=True)

X_trains_pca = pca_set.fit_transform(X_trains)

The last call raises:

TypeError: Attempting to call `get` on the value [[-0.73763296 -1.64044139 -0.74793088 ... -0.1085027  -0.25652127
   0.11583096]
...

See #50 for a possible workaround until this is fixed.
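
The mismatch described above (code that unconditionally resolves a deferred handle but is handed an already-materialized value) can be sketched without Ray, using the standard library's `concurrent.futures.Future` as a stand-in for `ray.ObjectRef`. This is only an analogy and not vflow's code or the eventual fix: `resolve` and `double` are hypothetical names, and with Ray the corresponding check would presumably be `isinstance(value, ray.ObjectRef)` before calling `ray.get`.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def resolve(value):
    """Wait on `value` only if it is a future; pass plain values through.

    Hypothetical helper, not part of vflow. A Future stands in for
    ray.ObjectRef, and .result() stands in for ray.get.
    """
    if isinstance(value, Future):
        return value.result()
    return value


def double(values):
    # Toy computation standing in for a fitted transform.
    return [2 * v for v in values]


with ThreadPoolExecutor(max_workers=1) as pool:
    pending = pool.submit(double, [1, 2, 3])  # deferred result (a Future)
    from_future = resolve(pending)            # resolved via .result()

from_plain = resolve([2, 4, 6])               # already materialized, returned as-is
```

A guard like this tolerates both kinds of input, at the cost of hiding whether the upstream step actually ran asynchronously.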

@jpdunc23 jpdunc23 added the bug Something isn't working label Nov 15, 2022
@jpdunc23 jpdunc23 self-assigned this Nov 15, 2022