From 3fccd65416e1cecc5e354cb3a37933a06efb037c Mon Sep 17 00:00:00 2001
From: Shaun
Date: Mon, 9 Dec 2024 14:32:53 +0530
Subject: [PATCH 1/8] improve wording
---
 air-gapped_installations/README.md | 100 ++++++++++++++---------------
 1 file changed, 49 insertions(+), 51 deletions(-)
diff --git a/air-gapped_installations/README.md b/air-gapped_installations/README.md
index b1b76e22..f7943128 100644
--- a/air-gapped_installations/README.md
+++ b/air-gapped_installations/README.md
@@ -1,79 +1,77 @@
 # Air-gapped Installation of Custom Recipes
-Air gapping is a network security measure employed on one or more computers
-to ensure that a secure computer network is physically isolated from unsecured networks,
-such as the public Internet or an unsecured local area network.
+Air gapping is a network security measure used to isolate one or more computers from unsecured networks, such as the public Internet or an unsecured local area network. This ensures that the secure computer network remains physically separated from external, potentially unsafe networks.
-This documentation will guide you through the installation of Custom Recipes in such air-gapped environment.
+This guide walks you through the installation process of Custom Recipes in an air-gapped environment.
 ## Prerequisite
-- There are two DAI installations involved here. One is in an **Air-Gapped Machine** and second is in an **Internet-Facing Machine**. They will be referred this way here to avoid confusion.
-- First you need to install DAI in an air gapped environment *ie. in a computer isolated from internet*.
-This DAI can be installed in any Package Type (TAR SH, Docker, DEB etc.) available in https://www.h2o.ai/download/
-- While in the Internet Facing Machine, clone this repository for now and checkout the branch of your DAI `VERSION`
+- Two DAI installations are required: one on an **Air-Gapped Machine** and the other on an **Internet-Facing Machine**. These terms will be used throughout the document for clarity.
+- First, install DAI on the air-gapped machine (a machine isolated from the Internet). DAI can be installed using any available package type (e.g., TAR SH, Docker, DEB). You can download the installation packages from [H2O.ai Downloads](https://h2o.ai/resources/download/).
+- On the Internet-facing machine, clone the repository and check out the appropriate branch for your DAI `VERSION`
 ```
 git clone https://github.com/h2oai/driverlessai-recipes.git
 cd driverlessai-recipes
 git checkout rel-VERSION # eg. git checkout rel-1.9.0 for DAI 1.9.0
 ```
-## Installation Guide
-Follow along these steps to use custom recipes for DAI in an air-gapped environment:
+## Installation Guide
+Follow the steps below to use custom recipes for DAI in an air-gapped environment:
-*Note: following steps need to be performed in Internet Facing Machine*
+**Note**: *These steps should be performed on the Internet-Facing Machine.*
-- Download the required version of Driverless AI TAR SH installer in a Internet facing machine from https://www.h2o.ai/download/
+1. Download the required version of the Driverless AI TAR SH installer on the Internet-Facing Machine from [H2O.ai Downloads](https://www.h2o.ai/download/).
-- Run the following commands to install the Driverless AI TAR SH. Replace `VERSION` with your specific version.
+2. Run the following commands to install the Driverless AI TAR SH. Replace `VERSION` with the specific version you need.
+ ```
+ chmod 755 dai-VERSION.sh
+ ./dai-VERSION.sh
+ ```
-```
- chmod 755 dai-VERSION.sh
- ./dai-VERSION.sh
-```
-- Now cd to the unpacked directory.
-```
- cd dai-VERSION
-```
-- Copy the load_custom_recipe.py script from `driverlessai-recipes/air-gapped_installations` to `dai-VERSION`
+3. Next, `cd` to the unpacked directory:
+ ```
+ cd dai-VERSION
+ ```
-- Run the following python script, either in one of the following ways:
+4. Copy the `load_custom_recipe.py` script from `driverlessai-recipes/air-gapped_installations` to `dai-VERSION`.
- a) To load custom recipes from a local file.
+5. Run the following Python script, either in one of the following ways:
+
+ - **To load custom recipes from a local file:**
 ```
 ./dai-env.sh python load_custom_recipe.py -username -p `` >> load_custom_recipe.log
 ```
- where `` is the username, e.g. jon and `` is the path to a recipe you want to upload to DAI.
+ - ``: The username (e.g., `jon`).
+ - ``: The path to the recipe you want to upload to DAI.
- For example to load [daal_trees recipe](https://github.com/h2oai/driverlessai-recipes/blob/rel-1.8.8/models/algorithms/daal_trees.py) from the cloned driverlessai-recipes repo we do:
- ```
- ./dai-env.sh python load_custom_recipe.py -username jon -p /home/ubuntu/driverlessai-recipes/models/algorithms/daal_trees.py >> load_custom_recipe.log
- ```
+ For example, to load the [daal_trees recipe](https://github.com/h2oai/driverlessai-recipes/blob/rel-1.8.8/models/algorithms/daal_trees.py) from the cloned `driverlessai-recipes` repo:
+ ```
+ ./dai-env.sh python load_custom_recipe.py -username jon -p /home/ubuntu/driverlessai-recipes/models/algorithms/daal_trees.py >> load_custom_recipe.log
+ ```
- b) To load custom recipes from a URL.
+ - **To load custom recipes from a URL:**
 ```
 ./dai-env.sh python load_custom_recipe.py -username -u >> load_custom_recipe.logg
 ```
- where `` is an http link for a url.
+ - ``: The URL to the custom recipe.
- For example to load [catboost recipe](https://github.com/h2oai/driverlessai-recipes/blob/rel-1.8.8/models/algorithms/catboost.py) from url we do:
- ```
- ./dai-env.sh python load_custom_recipe.py -username jon -u https://github.com/h2oai/driverlessai-recipes/blob/rel-1.8.8/models/algorithms/catboost.py >> load_custom_recipe.log
- ```
- **Note:** you can check the `load_custom_recipe.log` file to see if the operation was successful.
+ For example, to load the [catboost recipe](https://github.com/h2oai/driverlessai-recipes/blob/rel-1.8.8/models/algorithms/catboost.py) from a URL:
+ ```
+ ./dai-env.sh python load_custom_recipe.py -username jon -u https://github.com/h2oai/driverlessai-recipes/blob/rel-1.8.8/models/algorithms/catboost.py >> load_custom_recipe.log
+ ```
+ **Note:** *You can check the `load_custom_recipe.log` file to verify if the operation was successful.*
-- Once the above script was executed successfully, custom recipes and python dependencies will be installed in the
- `dai-VERSION///contrib` directory,
- where `` is `tmp` by default.
+6.Once the script has been executed successfully, custom recipes and Python dependencies will be installed in the `dai-VERSION///contrib` directory, where `` is `tmp` by default.
-- Zip the `dai-VERSION/tmp/contrib` directory and move it to the air-gapped machine and unzip there into the DAI `tmp` directory.
-```
- cd dai-VERSION/``/
- zip -r user_contrib.zip ``/contrib
- scp user_contrib.zip ``@``:``
-```
-- Now in the **Air-gapped Machine**, unzip the file and set permissions if necessary, e.g.
-```
- cd ``
- unzip user_contrib.zip
- chmod -R u+rwx dai:dai ``/contrib
-```
+7. Zip the `dai-VERSION/tmp/contrib` directory and move it to the air-gapped machine. Unzip it into the DAI `tmp` directory:
+ ```
+ cd dai-VERSION/``/
+ zip -r user_contrib.zip ``/contrib
+ scp user_contrib.zip ``@``:``
+ ```
+
+8. Now, on the **Air-Gapped Machine**, unzip the file and set permissions if necessary:
+ ```
+ cd
+ unzip user_contrib.zip
+ chmod -R u+rwx dai:dai /contrib
+ ```
From e827534ebeb84346e7232894569a3dd4e4518a86 Mon Sep 17 00:00:00 2001
From: Shaun
Date: Mon, 9 Dec 2024 14:41:43 +0530
Subject: [PATCH 2/8] minor-changes
---
 air-gapped_installations/README.md | 38 +++++++++++++++---------------
 1 file changed, 19 insertions(+), 19 deletions(-)
diff --git a/air-gapped_installations/README.md b/air-gapped_installations/README.md
index f7943128..19588399 100644
--- a/air-gapped_installations/README.md
+++ b/air-gapped_installations/README.md
@@ -7,12 +7,12 @@ This guide walks you through the installation process of Custom Recipes in an ai
 ## Prerequisite
 - Two DAI installations are required: one on an **Air-Gapped Machine** and the other on an **Internet-Facing Machine**. These terms will be used throughout the document for clarity.
 - First, install DAI on the air-gapped machine (a machine isolated from the Internet). DAI can be installed using any available package type (e.g., TAR SH, Docker, DEB). You can download the installation packages from [H2O.ai Downloads](https://h2o.ai/resources/download/).
-- On the Internet-facing machine, clone the repository and check out the appropriate branch for your DAI `VERSION`
+- On the Internet-facing machine, clone the repository and check out the appropriate branch for your DAI `VERSION`:
-```
+ ```
 git clone https://github.com/h2oai/driverlessai-recipes.git
 cd driverlessai-recipes
 git checkout rel-VERSION # eg. git checkout rel-1.9.0 for DAI 1.9.0
-```
+ ```
 ## Installation Guide
 Follow the steps below to use custom recipes for DAI in an air-gapped environment:
@@ -22,19 +22,19 @@ Follow the steps below to use custom recipes for DAI in an air-gapped environmen
 1. Download the required version of the Driverless AI TAR SH installer on the Internet-Facing Machine from [H2O.ai Downloads](https://www.h2o.ai/download/).
 2. Run the following commands to install the Driverless AI TAR SH. Replace `VERSION` with the specific version you need.
- ```
+ ```
 chmod 755 dai-VERSION.sh
 ./dai-VERSION.sh
- ```
+ ```
 3. Next, `cd` to the unpacked directory:
- ```
+ ```
 cd dai-VERSION
- ```
+ ```
 4. Copy the `load_custom_recipe.py` script from `driverlessai-recipes/air-gapped_installations` to `dai-VERSION`.
-5. Run the following Python script, either in one of the following ways:
+5. Run the following Python script, in either one of the following ways:
 - **To load custom recipes from a local file:**
 ```
@@ -43,10 +43,10 @@ Follow the steps below to use custom recipes for DAI in an air-gapped environmen
 - ``: The username (e.g., `jon`).
 - ``: The path to the recipe you want to upload to DAI.
- For example, to load the [daal_trees recipe](https://github.com/h2oai/driverlessai-recipes/blob/rel-1.8.8/models/algorithms/daal_trees.py) from the cloned `driverlessai-recipes` repo:
- ```
- ./dai-env.sh python load_custom_recipe.py -username jon -p /home/ubuntu/driverlessai-recipes/models/algorithms/daal_trees.py >> load_custom_recipe.log
- ```
+ For example, to load the [daal_trees recipe](https://github.com/h2oai/driverlessai-recipes/blob/rel-1.8.8/models/algorithms/daal_trees.py) from the cloned `driverlessai-recipes` repo:
+ ```
+ ./dai-env.sh python load_custom_recipe.py -username jon -p /home/ubuntu/driverlessai-recipes/models/algorithms/daal_trees.py >> load_custom_recipe.log
+ ```
 - **To load custom recipes from a URL:**
 ```
@@ -54,20 +54,20 @@ Follow the steps below to use custom recipes for DAI in an air-gapped environmen
 ```
 - ``: The URL to the custom recipe.
- For example, to load the [catboost recipe](https://github.com/h2oai/driverlessai-recipes/blob/rel-1.8.8/models/algorithms/catboost.py) from a URL:
- ```
- ./dai-env.sh python load_custom_recipe.py -username jon -u https://github.com/h2oai/driverlessai-recipes/blob/rel-1.8.8/models/algorithms/catboost.py >> load_custom_recipe.log
- ```
+ For example, to load the [catboost recipe](https://github.com/h2oai/driverlessai-recipes/blob/rel-1.8.8/models/algorithms/catboost.py) from a URL:
+ ```
+ ./dai-env.sh python load_custom_recipe.py -username jon -u https://github.com/h2oai/driverlessai-recipes/blob/rel-1.8.8/models/algorithms/catboost.py >> load_custom_recipe.log
+ ```
 **Note:** *You can check the `load_custom_recipe.log` file to verify if the operation was successful.*
-6.Once the script has been executed successfully, custom recipes and Python dependencies will be installed in the `dai-VERSION///contrib` directory, where `` is `tmp` by default.
+6. Once the script has been executed successfully, custom recipes and Python dependencies will be installed in the `dai-VERSION///contrib` directory, where `` is `tmp` by default.
 7. Zip the `dai-VERSION/tmp/contrib` directory and move it to the air-gapped machine. Unzip it into the DAI `tmp` directory:
- ```
+ ```
 cd dai-VERSION/``/
 zip -r user_contrib.zip ``/contrib
 scp user_contrib.zip ``@``:``
- ```
+ ```
 8. Now, on the **Air-Gapped Machine**, unzip the file and set permissions if necessary:
 ```
From d4c90bbce792a2704a85b187fb370910913308fc Mon Sep 17 00:00:00 2001
From: Shaun
Date: Mon, 9 Dec 2024 15:46:17 +0530
Subject: [PATCH 3/8] Update FAQ.md
---
 FAQ.md | 166 +++++++++++++++++++++++++++++++--------------------------
 1 file changed, 90 insertions(+), 76 deletions(-)
diff --git a/FAQ.md b/FAQ.md
index da1c1b28..d4f0102c 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -1,79 +1,93 @@
-# H2O Driverless AI Bring Your Own Recipes
+# H2O Driverless AI: Bring Your Own Recipes
 ## FAQ
- #### Why do I need to bring my own recipes? Isn't Driverless AI smart enough of the box?
- The only way to find out is to try. Most likely you'll be able to improve performance with custom recipes. Domain knowledge and intuition are essential to getting the best possible performance.
-
- #### What are some example recipes?
- * Look at the [examples in this repository](https://github.com/h2oai/driverlessai-recipes/blob/master/README.md#sample-recipes). Some illustrative samples:
- * Transformer:
- * Suppose you have a string column that has values like `"A:B:10:5", "A:C:4:10", ...`. It might make sense to split these values by ":" and create four output columns, potentially all numeric, such as `[0,1,10,5], [0,2,4,10], ...` to encode the information more clearly for the algorithm to learn better from.
- * PyTorch deep learning model for [text similarity analysis](https://github.com/h2oai/driverlessai-recipes/blob/master/transformers/nlp/text_embedding_similarity_transformers.py), computes a similary score for any given two text input columns.
- * ARIMA model for [time-series forecasting](https://github.com/h2oai/driverlessai-recipes/blob/master/transformers/timeseries/auto_arima_forecast.py).
- * Data augmentation, such as replacing a zip code with demographic information, or replacing a date column with a [National holiday flag](https://github.com/h2oai/driverlessai-recipes/blob/master/transformers/augmentation/singapore_public_holidays.py).
- * Model:
- * All [H2O-3 Algorithms including H2O AutoML](https://github.com/h2oai/driverlessai-recipes/blob/master/models/algorithms/h2o-3-models.py)
- * Yandex [CatBoost](https://github.com/h2oai/driverlessai-recipes/blob/master/models/algorithms/catboost.py) gradient boosting
- * A custom loss function for [LightGBM](https://github.com/h2oai/driverlessai-recipes/blob/master/models/custom_loss/lightgbm_with_custom_loss.py) or [XGBoost](https://github.com/h2oai/driverlessai-recipes/blob/master/models/custom_loss/xgboost_with_custom_loss.py)
- * Scorer:
- * Maybe you want to optimize your predictions for the [top decile](https://github.com/h2oai/driverlessai-recipes/blob/master/scorers/regression/top_decile.py) for a regression problem..
- * Maybe you care about the [false discovery rate](https://github.com/h2oai/driverlessai-recipes/blob/master/scorers/classification/binary/false_discovery_rate.py) for a binary classification problem.
- * Explainer:
- * Create custom recipes for model interpretability, fairness, robustness, explanations
- * Create custom plots, charts, markdown reports, etc.
-
- #### Driverless is good enough for me, I don't want to do recipes.
- Perfect. Relax and sit back. We'll keep making Driverless AI better and better with every version, so you don't have to.
- Several of the recipes in this repository will likely be included in future releases of Driverless AI out of the box, after more performance improvements and hardening.
- #### What's in it for me if I write a recipe?
- You will get better at doing data science and you will get better results. Writing code is essential to improving your data science skills. Especially when writing data science code. Recipes are perfect for that.
- #### Who can make recipes?
- Anyone who can or wants to. Mostly data scientists or developers. Some of the best recipes are trivial and make a big difference, like custom scorers.
- #### What do I need to make a recipe?
- A text editor. All you need is to create a `.py` text file containing source code.
- #### How do I start?
- * Examine the [references](https://github.com/h2oai/driverlessai-recipes#reference-guide) below for the API specification and architecture diagrams.
- * Look at the [examples in this repository](https://github.com/h2oai/driverlessai-recipes/blob/master/README.md#sample-recipes).
- * Clone this repository and make modifications to existing recipes.
- * Start an experiment and upload the recipe in the expert settings of an experiment.
- #### What version of Python does Driverless AI use?
- Driverless AI uses Python version 3.6, so all custom recipes will run with Python 3.6 as well.
- #### How do I know whether my recipe works?
- Driverless AI will tell you whether it makes the cut:
- * First, it is subjected to acceptance tests. If it passes, great. If not, Driverless AI provides you with feedback on how to improve it.
- * Then, you can choose to include it in your experiment(s). It will decide which recipes are best suited to solve the problem. At worst, you can cause the experiment to slow down.
- #### How can I debug my recipe?
- * The easiest way (for now) is to keep uploading it to the expert settings in Driverless AI until the recipe is accepted.
- * Another way is to do minimal changes as shown in [this debugging example](./transformers/how_to_debug_transformer.py) and use PyCharm or a similar Python debugger.
- #### What happens if my recipe is rejected during upload?
- * Read the entire error message, it most likely contains the stack trace and helpful information on how to fix the problem.
- * If you can't figure out how to fix the recipe, we suggest you post your questions in the [Driverless AI community Slack channel](https://www.h2o.ai/community/driverless-ai-community/#chat)
- * You can also send us your experiment logs zip file, which will contain the recipe source files.
- #### What happens if my transformer recipe doesn't lead to the highest variable importance for the experiment?
- That's nothing to worry about. It's unlikely that your features have the strongest signal of all features. Even 'magic' Kaggle grandmaster features don't usually make a massive difference, but they still beat most of the competition.
- #### What happens if my recipe is not used at all by the experiment?
- * Don't give up. You learned something.
- * Check the logs for failures if unsure whether the recipe worked at all or not.
- * Driverless AI will ignore recipe failures unless this robustness feature is specifically disabled. Under Expert Settings, disable `skip_transformer_failures` and `skip_model_failures` if you want to fail the experiment on any unexpected errors due to custom recipes.
- * Inside the experiment logs zip file, there's a folder called `details` and if it contains `.stack` files with stacktraces referring to your custom code, then you know it bombed.
- #### Can I write recipes in Go, C++, Java or R?
- If you can hook it up to Python, then yes. We have many recipes that use Java and C++ backends. Most of Driverless AI uses C++ backends.
- #### Is there a difference between a custom recipe and the recipes shipped with Driverless AI?
- No. Same code base. No performance penalty. No calling overhead. Same inputs and outputs.
- #### Why are some models implemented as transformers?
- Separating of work. With the transformer API, we can replace *only* the particular input column(s) with out-of-fold estimates of the target column. All other columns (features) can be processed by other transformers. The combined union of all features is then passed to the model(s) which can yield higher accuracy than a model that only sees the particular input column(s). For more information about the flow of data, see the technical [references](https://github.com/h2oai/driverlessai-recipes#reference-guide) section.
- #### How can I control which custom recipes are active, and how can I disable all custom recipes?
- Recipes are meant to be built by people you trust and each recipe should be code-reviewed before going to production. If you don't want custom code to be executed by Driverless AI, set `enable_custom_recipes=false` in the config.toml, or add the environment variable `DRIVERLESS_AI_ENABLE_CUSTOM_RECIPES=0` at startup of Driverless AI. This will disable all custom transformers, models and scorers. If you want to keep all previously uploaded recipes enabled and disable the upload of any new recipes, set `enable_custom_recipes_upload=false` or `DRIVERLESS_AI_ENABLE_CUSTOM_RECIPES_UPLOAD=0` at startup of Driverless AI.
- #### What if I keep changing the same recipe over and over?
- If you upload a new version of a recipe, it will become the new default version for that recipe. Previously run experiments using older versions of that recipe will continue to work, and use the older version. New experiments will use the new version.
- #### Who can see my recipe?
- Everyone with access to the Driverless AI instance can run all recipes, even if they were uploaded by someone else. Recipes remains on the instance that runs Driverless AI. Experiment logs may contain relevant information about your recipes (such as their source code), so double-check before you share them.
- #### How do I delete all recipes on my instance?
- If you really need to delete all recipes, you can delete the `contrib` folder inside the `data_directory` (usually called `tmp`) and restart Driverless AI. Caution: Previously created experiments using custom recipes will not be able to make predictions any longer, so this is not recommended unless you also delete all related experiments as well.
- #### Are MOJOs supported for experiments that use custom recipes?
- In most cases (especially for complex recipes), MOJOs won’t be available out of the box. But, it is possible to get the MOJO. Contact support@h2o.ai for more information about creating MOJOs for custom recipes. (**Note**: The Python Scoring Pipeline features full support for custom recipes.)
- #### How do I share my recipe with the world?
- We encourage you to share your recipe in this repository. If your recipe works, please make a pull request and improve the experience for everyone!
-
+
+### Why do I need to bring my own recipes?
+Custom recipes can improve performance. Domain knowledge and intuition are essential for achieving optimal results.
+
+### What are some example recipes?
+See the [examples in this repository](https://github.com/h2oai/driverlessai-recipes/blob/master/README.md#sample-recipes). Examples include:
+
+- **Transformer**:
+ - Split string columns into multiple numeric columns (e.g., `"A:B:10:5"` becomes `[0,1,10,5]`).
+ - **PyTorch** deep learning model for [text similarity analysis](https://github.com/h2oai/driverlessai-recipes/blob/master/transformers/nlp/text_embedding_similarity_transformers.py).
+ - **ARIMA** model for [time-series forecasting](https://github.com/h2oai/driverlessai-recipes/blob/master/transformers/timeseries/auto_arima_forecast.py).
+ - **Data augmentation**, like replacing zip codes with demographic info or using a [national holiday flag](https://github.com/h2oai/driverlessai-recipes/blob/master/transformers/augmentation/singapore_public_holidays.py).
+
+- **Model**:
+ - H2O-3 [algorithms](https://github.com/h2oai/driverlessai-recipes/blob/master/models/algorithms/h2o-3-models.py), including H2O AutoML.
+ - **Yandex [CatBoost](https://github.com/h2oai/driverlessai-recipes/blob/master/models/algorithms/catboost.py)** gradient boosting.
+ - Custom loss functions for [LightGBM](https://github.com/h2oai/driverlessai-recipes/blob/master/models/custom_loss/lightgbm_with_custom_loss.py) or [XGBoost](https://github.com/h2oai/driverlessai-recipes/blob/master/models/custom_loss/xgboost_with_custom_loss.py).
+
+- **Scorer**:
+ - Optimize for the [top decile](https://github.com/h2oai/driverlessai-recipes/blob/master/scorers/regression/top_decile.py) in regression tasks.
+ - Improve the [false discovery rate](https://github.com/h2oai/driverlessai-recipes/blob/master/scorers/classification/binary/false_discovery_rate.py) for binary classification.
+
+- **Explainer**:
+ - Create custom recipes for model interpretability, fairness, robustness, and explanations.
+ - Generate custom plots, charts, markdown reports, and more.
+
+### Is Driverless AI sufficient without custom recipes?
+Driverless AI continues to improve with each version, so you may not need custom recipes. However, adding your own recipes can optimize performance for specific use cases.
+
+### What's in it for me if I write a recipe?
+Writing recipes improves your data science skills and helps achieve better results. It is one of the best ways to enhance your expertise.
+
+### Who can write recipes?
+Anyone with the necessary expertise can contribute. Data scientists and developers typically write recipes, though even simple recipes can have a significant impact.
+
+### What do I need to write a recipe?
+A text editor and knowledge of Python. Recipes are written as `.py` files with the source code.
+
+### How do I start?
+- Review the [API specifications](https://github.com/h2oai/driverlessai-recipes#reference-guide) and architecture diagrams.
+- Review the [examples in this repository](https://github.com/h2oai/driverlessai-recipes/blob/master/README.md#sample-recipes).
+- Upload your recipe in the Expert Settings during the experiment setup.
+
+### What version of Python does Driverless AI use?
+Driverless AI uses Python 3.6. Ensure your recipes are compatible with this version.
+
+### How do I know if my recipe works?
+Driverless AI will notify you whether your recipe passes the acceptance tests. If it fails, feedback will guide you on how to fix it.
+
+### How can I debug my recipe?
+Upload your recipe to the Expert Settings and use the experiment log for debugging. Alternatively, make minimal changes as shown in [this debugging example](./transformers/how_to_debug_transformer.py) and debug with a Python debugger, like PyCharm.
+
+### What happens if my recipe is rejected during upload?
+Review the error message, which usually includes a stack trace and hints for fixing the issue. If you need help, ask questions in the [Driverless AI community Slack channel]](https://www.h2o.ai/community/driverless-ai-community/#chat). You can also send your experiment logs zip file, which will contain the recipe source files.
+
+### What if my transformer recipe doesn't lead to the highest variable importance?
+Features created by your transformer might not have the strongest signal, but they can still improve the overall model performance.
+
+### What happens if my recipe is not used in the experiment?
+Driverless AI will use the best-performing recipes. Check the experiment logs for errors related to your recipe. You can also disable recipe failures in Expert Settings.
+
+### Can I write recipes in Go, C++, Java, or R?
+You can use any language as long as you can interface it with Python. Many recipes rely on Java and C++ backends.
+
+### Is there a difference between custom recipes and those shipped with Driverless AI?
+Custom recipes are treated the same as built-in recipes. There is no performance penalty or calling overhead.
+
+### Why are some models implemented as transformers?
+The transformer API allows flexibility. For example, transformers can process specific input columns while leaving others unchanged, resulting in improved accuracy.
+
+### How can I control which custom recipes are active? How can I disable all custom recipes?
+Recipes can be disabled by setting `enable_custom_recipes=false` in the `config.toml` file or using the `DRIVERLESS_AI_ENABLE_CUSTOM_RECIPES=0` environment variable. To disable uploading new recipes, set `enable_custom_recipes_upload=false` or `DRIVERLESS_AI_ENABLE_CUSTOM_RECIPES_UPLOAD=0`.
+
+### What if I keep changing the same recipe?
+When you upload a new version of a recipe, it becomes the default. Older experiments will continue using the previous version.
+
+### Who can see my recipe?
+Anyone with access to the Driverless AI instance can run any uploaded recipe, but recipes are shared only within the instance.
+
+### How do I delete all recipes on my instance?
+To delete all recipes, remove the `contrib` folder from the data directory (usually `tmp`) and restart Driverless AI. This will prevent old experiments from making predictions unless related experiments are also deleted.
+
+### Are MOJOs supported for experiments that use custom recipes?
+In most cases, MOJOs are not available for custom recipes. Contact support@h2o.ai for more details.
+
+### How do I share my recipe with the community?
+Contribute to this repository by making a pull request. If your recipe works, it can help others optimize their experiments.
+
 ## References
 ### Custom Transformers
 * sklearn API
@@ -98,4 +112,4 @@
 ![BYOR Architecture Diagram](reference/DriverlessAI_BYOR.png)
 ### Webinar
 Webinar: [Extending the H2O Driverless AI Platform with Your Recipes](https://www.brighttalk.com/webcast/16463/360533/extending-the-h2o-driverless-ai-platform-with-your-recipes)
-Website: [H2O Driverless AI Recipes](https://www.h2o.ai/products-h2o-driverless-ai-recipes/)
+Website: [H2O Driverless AI Recipes](https://www.h2o.ai/products-h2o-driverless-ai-recipes/)
\ No newline at end of file
From bc8b2a2d5cc86f84b4ba0c73cef41e169c9fcb1d Mon Sep 17 00:00:00 2001
From: Shaun <124687868+shaunyogeshwaran@users.noreply.github.com>
Date: Thu, 12 Dec 2024 23:01:18 +0530
Subject: [PATCH 4/8] Update FAQ.md
Co-authored-by: Oshini Nugapitiya <52423997+oshi98@users.noreply.github.com>
---
 FAQ.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/FAQ.md b/FAQ.md
index d4f0102c..3b5a9505 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -41,7 +41,7 @@ A text editor and knowledge of Python. Recipes are written as `.py` files with t
 ### How do I start?
 - Review the [API specifications](https://github.com/h2oai/driverlessai-recipes#reference-guide) and architecture diagrams.
 - Review the [examples in this repository](https://github.com/h2oai/driverlessai-recipes/blob/master/README.md#sample-recipes).
-- Upload your recipe in the Expert Settings during the experiment setup.
+- Upload your recipe in the Expert Settings section during the experiment setup.
 ### What version of Python does Driverless AI use?
 Driverless AI uses Python 3.6. Ensure your recipes are compatible with this version.
From 4d6eb51ec7e1f3b918468a62368c6e6b01fb9af7 Mon Sep 17 00:00:00 2001
From: Shaun <124687868+shaunyogeshwaran@users.noreply.github.com>
Date: Thu, 12 Dec 2024 23:01:37 +0530
Subject: [PATCH 5/8] Update FAQ.md
Co-authored-by: Oshini Nugapitiya <52423997+oshi98@users.noreply.github.com>
---
 FAQ.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/FAQ.md b/FAQ.md
index 3b5a9505..ec855fbe 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -53,7 +53,7 @@ Driverless AI will notify you whether your recipe passes the acceptance tests. I
 Upload your recipe to the Expert Settings and use the experiment log for debugging. Alternatively, make minimal changes as shown in [this debugging example](./transformers/how_to_debug_transformer.py) and debug with a Python debugger, like PyCharm.
 ### What happens if my recipe is rejected during upload?
-Review the error message, which usually includes a stack trace and hints for fixing the issue. If you need help, ask questions in the [Driverless AI community Slack channel]](https://www.h2o.ai/community/driverless-ai-community/#chat). You can also send your experiment logs zip file, which will contain the recipe source files.
+Review the error message, which usually includes a stack trace and hints for fixing the issue. If you need help, ask questions in the [Driverless AI community Slack channel](https://www.h2o.ai/community/driverless-ai-community/#chat). You can also send your experiment logs zip file, which will contain the recipe source files.
 ### What if my transformer recipe doesn't lead to the highest variable importance?
 Features created by your transformer might not have the strongest signal, but they can still improve the overall model performance.
From e7d82d80a1f064796be88a565c5cf13a21885b9c Mon Sep 17 00:00:00 2001
From: Shaun <124687868+shaunyogeshwaran@users.noreply.github.com>
Date: Mon, 16 Dec 2024 11:48:31 +0530
Subject: [PATCH 6/8] Update FAQ.md

Co-authored-by: Oshini Nugapitiya <52423997+oshi98@users.noreply.github.com>
---
 FAQ.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/FAQ.md b/FAQ.md
index ec855fbe..c263a6e4 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -59,7 +59,7 @@ Review the error message, which usually includes a stack trace and hints for fix
 Features created by your transformer might not have the strongest signal, but they can still improve the overall model performance.
 
 ### What happens if my recipe is not used in the experiment?
-Driverless AI will use the best-performing recipes. Check the experiment logs for errors related to your recipe. You can also disable recipe failures in Expert Settings.
+H2O Driverless AI will use the best-performing recipes. Check the experiment logs for errors related to your recipe. You can also disable recipe failures in Expert Settings.
 
 ### Can I write recipes in Go, C++, Java, or R?
 You can use any language as long as you can interface it with Python. Many recipes rely on Java and C++ backends.

From 8cea9011893c028eae3cfc93af02e6af10778b80 Mon Sep 17 00:00:00 2001
From: Shaun
Date: Mon, 16 Dec 2024 11:55:41 +0530
Subject: [PATCH 7/8] review suggestions

---
 FAQ.md                             | 18 +++++++++---------
 air-gapped_installations/README.md | 14 +++++++-------
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/FAQ.md b/FAQ.md
index c263a6e4..6183ec6f 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -26,8 +26,8 @@ See the [examples in this repository](https://github.com/h2oai/driverlessai-reci
 - Create custom recipes for model interpretability, fairness, robustness, and explanations.
 - Generate custom plots, charts, markdown reports, and more.
 
-### Is Driverless AI sufficient without custom recipes?
-Driverless AI continues to improve with each version, so you may not need custom recipes. However, adding your own recipes can optimize performance for specific use cases.
+### Is H2O Driverless AI sufficient without custom recipes?
+H2O Driverless AI continues to improve with each version, so you may not need custom recipes. However, adding your own recipes can optimize performance for specific use cases.
 
 ### What's in it for me if I write a recipe?
 Writing recipes improves your data science skills and helps achieve better results. It is one of the best ways to enhance your expertise.
@@ -43,17 +43,17 @@ A text editor and knowledge of Python. Recipes are written as `.py` files with t
 - Review the [examples in this repository](https://github.com/h2oai/driverlessai-recipes/blob/master/README.md#sample-recipes).
 - Upload your recipe in the Expert Settings section during the experiment setup.
 
-### What version of Python does Driverless AI use?
-Driverless AI uses Python 3.6. Ensure your recipes are compatible with this version.
+### What version of Python does H2O Driverless AI use?
+H2O Driverless AI uses Python 3.6. Ensure your recipes are compatible with this version.
 
 ### How do I know if my recipe works?
-Driverless AI will notify you whether your recipe passes the acceptance tests. If it fails, feedback will guide you on how to fix it.
+H2O Driverless AI will notify you whether your recipe passes the acceptance tests. If it fails, feedback will guide you on how to fix it.
 
 ### How can I debug my recipe?
 Upload your recipe to the Expert Settings and use the experiment log for debugging. Alternatively, make minimal changes as shown in [this debugging example](./transformers/how_to_debug_transformer.py) and debug with a Python debugger, like PyCharm.
 
 ### What happens if my recipe is rejected during upload?
-Review the error message, which usually includes a stack trace and hints for fixing the issue. If you need help, ask questions in the [Driverless AI community Slack channel](https://www.h2o.ai/community/driverless-ai-community/#chat). You can also send your experiment logs zip file, which will contain the recipe source files.
+Review the error message, which usually includes a stack trace and hints for fixing the issue. If you need help, ask questions in the [H2O Driverless AI community Slack channel](https://www.h2o.ai/community/driverless-ai-community/#chat). You can also send your experiment logs zip file, which will contain the recipe source files.
 
 ### What if my transformer recipe doesn't lead to the highest variable importance?
 Features created by your transformer might not have the strongest signal, but they can still improve the overall model performance.
@@ -64,7 +64,7 @@ H2O Driverless AI will use the best-performing recipes. Check the experiment log
 ### Can I write recipes in Go, C++, Java, or R?
 You can use any language as long as you can interface it with Python. Many recipes rely on Java and C++ backends.
 
-### Is there a difference between custom recipes and those shipped with Driverless AI?
+### Is there a difference between custom recipes and those shipped with H2O Driverless AI?
 Custom recipes are treated the same as built-in recipes. There is no performance penalty or calling overhead.
 
 ### Why are some models implemented as transformers?
@@ -77,10 +77,10 @@ Recipes can be disabled by setting `enable_custom_recipes=false` in the `config.
 When you upload a new version of a recipe, it becomes the default. Older experiments will continue using the previous version.
 
 ### Who can see my recipe?
-Anyone with access to the Driverless AI instance can run any uploaded recipe, but recipes are shared only within the instance.
+Anyone with access to the H2O Driverless AI instance can run any uploaded recipe, but recipes are shared only within the instance.
 
 ### How do I delete all recipes on my instance?
-To delete all recipes, remove the `contrib` folder from the data directory (usually `tmp`) and restart Driverless AI. This will prevent old experiments from making predictions unless related experiments are also deleted.
+To delete all recipes, remove the `contrib` folder from the data directory (usually `tmp`) and restart H2O Driverless AI. This will prevent old experiments from making predictions unless related experiments are also deleted.
 
 ### Are MOJOs supported for experiments that use custom recipes?
 In most cases, MOJOs are not available for custom recipes. Contact support@h2o.ai for more details.
diff --git a/air-gapped_installations/README.md b/air-gapped_installations/README.md
index 19588399..1f6b7991 100644
--- a/air-gapped_installations/README.md
+++ b/air-gapped_installations/README.md
@@ -5,9 +5,9 @@ Air gapping is a network security measure used to isolate one or more computers
 This guide walks you through the installation process of Custom Recipes in an air-gapped environment.
 
 ## Prerequisite
-- Two DAI installations are required: one on an **Air-Gapped Machine** and the other on an **Internet-Facing Machine**. These terms will be used throughout the document for clarity.
-- First, install DAI on the air-gapped machine (a machine isolated from the Internet). DAI can be installed using any available package type (e.g., TAR SH, Docker, DEB). You can download the installation packages from [H2O.ai Downloads](https://h2o.ai/resources/download/).
-- On the Internet-facing machine, clone the repository and check out the appropriate branch for your DAI `VERSION`:
+- Two H2O Driverless AI (DAI) installations are required: one on an **Air-Gapped Machine** and the other on an **Internet-Facing Machine**. These terms will be used throughout the document for clarity.
+- First, install H2O Driverless AI on the air-gapped machine (a machine isolated from the Internet). Driverless AI can be installed using any available package type (e.g., TAR SH, Docker, DEB). You can download the installation packages from [H2O.ai Downloads](https://h2o.ai/resources/download/).
+- On the Internet-facing machine, clone the repository and check out the appropriate branch for your Driverless AI `VERSION`:
 ```
 git clone https://github.com/h2oai/driverlessai-recipes.git
 cd driverlessai-recipes
@@ -15,13 +15,13 @@ This guide walks you through the installation process of Custom Recipes in an ai
 ```
 
 ## Installation Guide
-Follow the steps below to use custom recipes for DAI in an air-gapped environment:
+Follow the steps below to use custom recipes for H2O Driverless AI in an air-gapped environment:
 
 **Note**: *These steps should be performed on the Internet-Facing Machine.*
 
-1. Download the required version of the Driverless AI TAR SH installer on the Internet-Facing Machine from [H2O.ai Downloads](https://www.h2o.ai/download/).
+1. Download the required version of the H2O Driverless AI TAR SH installer on the Internet-Facing Machine from [H2O.ai Downloads](https://www.h2o.ai/download/).
 
-2. Run the following commands to install the Driverless AI TAR SH. Replace `VERSION` with the specific version you need.
+2. Run the following commands to install the H2O Driverless AI TAR SH. Replace `VERSION` with the specific version you need.
 ```
 chmod 755 dai-VERSION.sh
 ./dai-VERSION.sh
@@ -62,7 +62,7 @@ Follow the steps below to use custom recipes for DAI in an air-gapped environmen
 
 6. Once the script has been executed successfully, custom recipes and Python dependencies will be installed in the `dai-VERSION///contrib` directory, where `` is `tmp` by default.
 
-7. Zip the `dai-VERSION/tmp/contrib` directory and move it to the air-gapped machine. Unzip it into the DAI `tmp` directory:
+7. Zip the `dai-VERSION/tmp/contrib` directory and move it to the air-gapped machine. Unzip it into the Driverless AI `tmp` directory:
 ```
 cd dai-VERSION/``/
 zip -r user_contrib.zip ``/contrib

From 18305daa136a0f31408a3ae0e0a38aa44a8777a4 Mon Sep 17 00:00:00 2001
From: Shaun <124687868+shaunyogeshwaran@users.noreply.github.com>
Date: Tue, 17 Dec 2024 11:47:38 +0530
Subject: [PATCH 8/8] Update FAQ.md

Co-authored-by: Sajith Ariyarathna
---
 FAQ.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/FAQ.md b/FAQ.md
index 6183ec6f..cb73245f 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -44,7 +44,7 @@ A text editor and knowledge of Python. Recipes are written as `.py` files with t
 - Upload your recipe in the Expert Settings section during the experiment setup.
 
 ### What version of Python does H2O Driverless AI use?
-H2O Driverless AI uses Python 3.6. Ensure your recipes are compatible with this version.
+H2O Driverless AI uses Python 3.11. Ensure your recipes are compatible with this version.
 
 ### How do I know if my recipe works?
 H2O Driverless AI will notify you whether your recipe passes the acceptance tests. If it fails, feedback will guide you on how to fix it.