From 2b7a2d1527e99381f98dec918fa19bcfce5b44db Mon Sep 17 00:00:00 2001
From: elenasamuylova <67064421+elenasamuylova@users.noreply.github.com>
Date: Thu, 9 Jan 2025 14:46:16 +0000
Subject: [PATCH] Docs: misc updates (#1397)

---
 docs/book/SUMMARY.md                 |  2 +-
 docs/book/examples/examples.md       |  4 +-
 docs/book/examples/tutorial-cloud.md | 20 ++------
 docs/book/examples/tutorial-llm.md   |  2 +-
 docs/book/presets/text-overview.md   | 69 +++++-----------
 5 files changed, 21 insertions(+), 76 deletions(-)

diff --git a/docs/book/SUMMARY.md b/docs/book/SUMMARY.md
index bb78e515c8..06415bc0f1 100644
--- a/docs/book/SUMMARY.md
+++ b/docs/book/SUMMARY.md
@@ -18,7 +18,7 @@
   * [Regression Performance](presets/reg-performance.md)
   * [Classification Performance](presets/class-performance.md)
   * [NoTargetPerformance](presets/no-target-performance.md)
-  * [Text Overview](presets/text-overview.md)
+  * [Text Evals](presets/text-overview.md)
   * [Recommender System](presets/recsys.md)
 * [Tutorials and Examples](examples/README.md)
   * [All Tutorials](examples/examples.md)
diff --git a/docs/book/examples/examples.md b/docs/book/examples/examples.md
index 448bd638f2..cd2ab3662e 100644
--- a/docs/book/examples/examples.md
+++ b/docs/book/examples/examples.md
@@ -45,8 +45,8 @@ To better understand the Evidently use cases, refer to the **detailed tutorials*
 Title | Code example | Blog post
 --- | --- | ---
-Understand ML model decay in production (regression example) | [Jupyter notebook](../../../examples/data_stories/bicycle_demand_monitoring.ipynb) | [How to break a model in 20 days. A tutorial on production model analytics.](https://evidentlyai.com/blog/tutorial-1-model-analytics-in-production)
-Compare two ML models before deployment (classification example) | [Jupyter notebook](../../../examples/data_stories/ibm_hr_attrition_model_validation.ipynb) | [What Is Your Model Hiding? A Tutorial on Evaluating ML Models.](https://evidentlyai.com/blog/tutorial-2-model-evaluation-hr-attrition)
+Understand ML model decay in production (regression example) | [Jupyter notebook](https://github.com/evidentlyai/community-examples/blob/main/tutorials/bicycle_demand_monitoring.ipynb) | [How to break a model in 20 days. A tutorial on production model analytics.](https://evidentlyai.com/blog/tutorial-1-model-analytics-in-production)
+Compare two ML models before deployment (classification example) | [Jupyter notebook](https://github.com/evidentlyai/community-examples/blob/main/tutorials/ibm_hr_attrition_model_validation.ipynb) | [What Is Your Model Hiding? A Tutorial on Evaluating ML Models.](https://evidentlyai.com/blog/tutorial-2-model-evaluation-hr-attrition)
 Evaluate and visualize historical data drift | [Jupyter notebook](../../../examples/integrations/mlflow_logging/historical_drift_visualization.ipynb) | [How to detect, evaluate and visualize historical drifts in the data.](https://evidentlyai.com/blog/tutorial-3-historical-data-drift)
 Monitor NLP models in production | [Colab](https://colab.research.google.com/drive/15ON-Ub_1QUYkDbdLpyt-XyEx34MD28E1) | [Monitoring NLP models in production: a tutorial on detecting drift in text data](https://www.evidentlyai.com/blog/tutorial-detecting-drift-in-text-data)
 Create ML model cards |[Jupyter notebook](https://github.com/evidentlyai/community-examples/tree/main/tutorials/How_to_create_an_ML_model_card.ipynb) | [A simple way to create ML Model Cards in Python](https://www.evidentlyai.com/blog/ml-model-card-tutorial)
diff --git a/docs/book/examples/tutorial-cloud.md b/docs/book/examples/tutorial-cloud.md
index 7ed7e90c4f..d937ed9914 100644
--- a/docs/book/examples/tutorial-cloud.md
+++ b/docs/book/examples/tutorial-cloud.md
@@ -58,19 +58,9 @@ Let's quickly look at an example monitoring Dashboard.
 
 If you do not have one yet, [create an Evidently Cloud account](https://app.evidently.cloud/signup).
 
-## 2. Create a team
+## 2. View a demo project
 
-Go to the main page, click on "plus" sign and create a new Team. For example, "personal" Team.
-
-## 3. View a demo project
-
-Click on "Generate Demo Project" inside your Team. It will create a Project for a toy regression model that forecasts bike demand.
-
-![](../.gitbook/assets/cloud/generate_demo_project.png)
-
-It'll take a few moments to populate the data. In the background, Evidently will run the code to generate Reports and Test Suites for 20 days. Once it's ready, open the Project to see a monitoring Dashboard.
-
-Dashboards Tabs will show data quality, data drift, and model quality over time.
+View an example Demo Project for a regression model that forecasts bike demand. Dashboard Tabs will show data quality, data drift, and model quality over time.
 
 ![](../.gitbook/assets/cloud/demo_dashboard.gif)
@@ -194,7 +184,7 @@ Now, you need to create a new Project. You can do this programmatically or in th
 
 {% tabs %}
 
 {% tab title="UI" %}
-Click on the “plus” sign on the home page. Create a Team if you do not have one yet. Type your Project name and description.
+Click on the “plus” sign on the home page. Type your Project name and description.
 
 ![](../.gitbook/assets/cloud/add_project_wide-min.png)
@@ -209,10 +199,10 @@ project = ws.get_project("PROJECT_ID")
 {% endtab %}
 
 {% tab title="API" %}
-Use the `create_project` command to create a new Project. Add a name and description. Copy the Team ID from the [teams page](https://app.evidently.cloud/teams).
+Use the `create_project` command to create a new Project. Add a name and description. Copy the ID of your organization from the [organizations page](https://app.evidently.cloud/organizations).
 
 ```python
-project = ws.create_project("My test project", team_id="YOUR_TEAM_ID")
+project = ws.create_project("My test project", org_id="YOUR_ORG_ID")
 project.description = "My project description"
 project.save()
 ```
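An editorial aside for readers reproducing the step above outside the diff: the `create_project` call assumes you are already connected to a workspace `ws`. A minimal end-to-end sketch, assuming the `CloudWorkspace` client from Evidently 0.4+ and placeholder token, organization ID, and project name:

```python
from evidently.ui.workspace.cloud import CloudWorkspace

# Connect to Evidently Cloud with the API token generated in the UI
# ("YOUR_API_TOKEN" and "YOUR_ORG_ID" below are placeholders)
ws = CloudWorkspace(token="YOUR_API_TOKEN", url="https://app.evidently.cloud")

# Create the Project under your organization, mirroring the patched docs line
project = ws.create_project("My test project", org_id="YOUR_ORG_ID")
project.description = "My project description"
project.save()
```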
diff --git a/docs/book/examples/tutorial-llm.md b/docs/book/examples/tutorial-llm.md
index a137b3aaff..30d9bf6022 100644
--- a/docs/book/examples/tutorial-llm.md
+++ b/docs/book/examples/tutorial-llm.md
@@ -137,7 +137,7 @@ assistant_logs.head(3)
 To be able to save and share results and get a live monitoring dashboard, create a Project in Evidently Cloud. Here's how to set it up:
 
 * **Sign up**. If you do not have one yet, create a free [Evidently Cloud account](https://app.evidently.cloud/signup) and name your Organization.
-* **Create an Organization** when you log in for the first time. Get an ID of your organization. [Organizations page](https://app.evidently.cloud/organizations).
+* **Get an Organization ID**. Copy your organization ID from the [organizations page](https://app.evidently.cloud/organizations).
 * **Get your API token**. Click the **Key** icon in the left menu to go. Generate and save the token. ([Token page](https://app.evidently.cloud/token)).
 * **Connect to Evidently Cloud**. Pass your API key to connect.
diff --git a/docs/book/presets/text-overview.md b/docs/book/presets/text-overview.md
index 63b2883aaf..d1d79eb7e1 100644
--- a/docs/book/presets/text-overview.md
+++ b/docs/book/presets/text-overview.md
@@ -2,19 +2,11 @@
 
 * **Report**: for visual analysis or metrics export, use the `TextEvals`.
 
-# Use case
+# Text Evals Report
 
-You can evaluate and explore text data:
+To visually explore the descriptive properties of text data, you can create a new Report object and generate the `TextEvals` preset for the column containing the text data. It's best to define your own set of `descriptors` by passing them as a list to the `TextEvals` preset. For more details, see [how descriptors work](../tests-and-reports/text-descriptors.md).
 
-**1. To monitor input data for NLP models.** When you do not have true labels or actuals, you can monitor changes in the input data (data drift) and descriptive text characteristics. You can run batch checks, for example, comparing the latest batch of text data to earlier or training data. You can often combine it with evaluating [Prediction Drift](target-drift.md).
-
-**2. When you are debugging the model decay.** If you observe a drop in the model performance, you can use this report to understand changes in the input data patterns.
-
-**3. Exploratory data analysis.** You can use the visual report to explore the text data you want to use for training. You can also use it to compare any two datasets.
-
-# Text Overview Report
-
-If you want to visually explore the text data, you can create a new Report object and use the `TextEvals`.
+If you don’t specify descriptors, the Preset will use default statistics.
 
 ## Code example
 
 ```
 text_overview_report = Report(metrics=[
     TextEvals(column_name="Review_Text")
 ])
 
 text_overview_report.run(reference_data=ref, current_data=cur)
 text_overview_report
 ```
 
-Note that to calculate text-related metrics, you must also import additional libraries:
+Note that to calculate some text-related metrics, you may also need to import additional libraries:
 
 ```
 import nltk
 nltk.download('words')
 nltk.download('wordnet')
 nltk.download('omw-1.4')
 ```
 
-## How it works
-
-The `TextEvals` provides an overview and comparison of text datasets.
-* Generates a **descriptive summary** of the text columns in the dataset.
-* Performs **data drift detection** to compare the two texts using the domain classifier approach.
-* Shows distributions of the **text descriptors** in two datasets, and their **correlations** with other features.
-* Performs **drift detection for text descriptors**.
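The rewritten intro in this file's first hunk tells readers to pass their own `descriptors` list to `TextEvals`. A sketch of what that call can look like — the descriptor class names are assumed from `evidently.descriptors` in Evidently 0.4+, and `ref`/`cur` are the same pandas DataFrames as in the doc's code example:

```python
from evidently.report import Report
from evidently.metric_preset import TextEvals
from evidently.descriptors import OOV, Sentiment, TextLength

# Evaluate a chosen set of descriptors instead of the default statistics
report = Report(metrics=[
    TextEvals(column_name="Review_Text", descriptors=[
        Sentiment(),   # sentiment score, -1 (negative) to 1 (positive)
        TextLength(),  # length of each text in symbols
        OOV(),         # share of out-of-vocabulary words
    ])
])
report.run(reference_data=ref, current_data=cur)
```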
 
 ## Data Requirements
 
-* You can pass **one or two** datasets. The **reference** dataset serves as a benchmark. Evidently analyzes the change by comparing the **current** production data to the **reference** data. If you pass a single dataset, there will be no comparison.
-
+* You can pass **one or two** datasets. Evidently will compute descriptors both for the **current** production data and the **reference** data. If you pass a single dataset, there will be no comparison.
 * To run this preset, you must have **text columns** in your dataset. Additional features and prediction/target are optional. Pass them if you want to analyze the correlations with text descriptors.
-
-* **Column mapping**. You must explicitly specify the columns that contain text features in [column mapping](../input-data/column-mapping.md) to run this report.
+* **Column mapping**. Specify the columns that contain text features in [column mapping](../input-data/column-mapping.md).
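On the column-mapping requirement just above, a short usage sketch, assuming a dataset whose raw-text column is named "Review_Text" (a placeholder from the doc's own example):

```python
from evidently import ColumnMapping

# Tell Evidently which columns hold raw text
column_mapping = ColumnMapping(text_features=["Review_Text"])

text_overview_report.run(
    reference_data=ref,
    current_data=cur,
    column_mapping=column_mapping,
)
```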
-If you pass two datasets, the report also performs drift detection for text descriptors to show statistical shifts in patterns between test characteristics. +#### Sentence Count -![](<../.gitbook/assets/reports/metric_text_descriptors_drift-min.png>) +Shows the sentence count. ## Metrics output @@ -115,7 +70,7 @@ You can also get the report output as a JSON or a Python dictionary. ## Report customization -* You can [specify a different drift detection threshold](../customization/options-for-statistical-tests.md). +* You can [choose your own descriptors](../tests-and-reports/text-descriptors.md). * You can use a [different color schema for the report](../customization/options-for-color-schema.md). * You can create a different report or test suite from scratch, taking this one as an inspiration.