18 Dec 15:46

jfcalvo

12160d0

v2.6.0 Latest

🔆 Release highlights

Push to hub

Export your dataset to the Hugging Face Hub directly from the Argilla UI:

1️⃣ Go to your dataset
2️⃣ Click on Push to Hub
3️⃣ Make sure you include your username or organization and a Hub Access Token with write permissions

Share your progress

Share your annotation progress on any Argilla dataset with the world!

1️⃣ In your dataset, click on "Share progress"
2️⃣ Open your preferred social media platform
3️⃣ Start a post and paste the copied text
4️⃣ Publish and share with the world!

Update user data

You can now update any of a user's information. Here's an example of how to update a user's role:

import argilla as rg

client = rg.Argilla(api_url="<ARGILLA_API_URL>", api_key="<ARGILLA_API_KEY>")

user = client.users("username")
user.role = "admin"
user.update()
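
If you need to change the role of several users in one go, the same three calls can be wrapped in a loop. This is only a minimal sketch: `set_role` is a hypothetical helper (not part of the SDK), and it assumes that `client.users(name)` returns `None` when the user does not exist.

```python
from typing import Iterable, List


def set_role(client, usernames: Iterable[str], role: str) -> List[str]:
    """Set `role` on each named user and return the usernames that were updated.

    `client` is expected to behave like the `rg.Argilla` client shown above.
    Assumption: looking up an unknown username yields `None`, so it is skipped.
    """
    updated = []
    for username in usernames:
        user = client.users(username)
        if user is None:
            continue  # unknown user: nothing to update
        user.role = role
        user.update()
        updated.append(username)
    return updated
```

With a real client this would be called as `set_role(client, ["alice", "bob"], "admin")`, where the usernames are placeholders.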

Change record fields

You can now update the content of record fields.

import argilla as rg

client = rg.Argilla(api_url="<ARGILLA_API_URL>", api_key="<ARGILLA_API_KEY>")

dataset = client.datasets("my_dataset")
record = next(dataset.records(limit=1))
record.fields["text"] = "this is my updated text"
record.update()

# or several records at once
records = list(dataset.records(...))

for record in records:
    record.fields["text"] = "this is my updated text"

dataset.records.log(records)
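
When only some records need a change, it can help to collect just the modified ones before logging them. The sketch below assumes nothing beyond what is shown above (records expose a dict-like `fields` attribute); `rewrite_field` is a hypothetical helper name.

```python
from typing import Callable, Iterable, List


def rewrite_field(records: Iterable, field: str, transform: Callable[[str], str]) -> List:
    """Apply `transform` to `record.fields[field]` and return only the changed records."""
    changed = []
    for record in records:
        old = record.fields[field]
        new = transform(old)
        if new != old:
            record.fields[field] = new
            changed.append(record)
    return changed
```

The returned list can then be passed to `dataset.records.log(...)` exactly as in the example above, so untouched records are not sent again.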

Changelog v2.6.0

[ENHANCEMENT] argilla server: Return users on dataset progress by @frascuchon in #5701
[CI] Update base docker image by @frascuchon in #5705
🔥 Fix highlight on bulk by @damianpumar in #5698
🔥 Improve plugins loaders by @damianpumar in #5697
fix: 🐛 Send visible_options prop only when the questions has more… by @damianpumar in #5716
[FEATURE] UI - update dataset list by @leiyre in #5684
[Docs] configure issue form by @sdiazlor in #5703
Assign field to a span question by @leiyre in #5717
[FEATURE]: Adding Functionality To Update Users by @sean-hickey-wf in #5615
[CI] Fix argilla-frontend build by adding the package-lock.json file by @frascuchon in #5731
fix: UI - use last_activity_at in the dataset list by @leiyre in #5741
[CI] fix install deps using python.3.13 by @frascuchon in #5745
[BUGFIX] prevent errors when updating user by @frascuchon in #5742
[BUGFIX] argilla: prevent enum literal validation errors by @frascuchon in #5679
🎉 Improve styles file weight by @leiyre in #5724
[FEATURE] Add support to update record fields by @frascuchon in #5685
🚑 feat/check version by @damianpumar in #5738
[BUGFIX] [TESTS] Remove custom isoformat parsing and let pydantic do the work by @frascuchon in #5752
[BUGFIX] Fetch dataset setting when iterate client.datasets by @frascuchon in #5753
[BUGFIX] argilla: review datasest import with new export flow by @frascuchon in #5756
[FEATURE-BRANCH] feat: dataset export to the Hub by @jfcalvo in #5730
Feat/improve export hover by @damianpumar in #5764
[CHORE] Add missing fixed entries by @frascuchon in #5765
✨ Add share component by @damianpumar in #5727
[RELEASES] v2.6.0 by @jfcalvo in #5762

New Contributors

@sean-hickey-wf made their first contribution in #5615

Full Changelog: v2.5.0...v2.6.0

Contributors

jfcalvo, frascuchon, and 4 other contributors

29 Nov 12:28

frascuchon

v2.5.0

2a58a30

🔆 Release highlights

Webhooks

You can now create and manage webhooks to support your workflows!

Webhooks allow you to submit real-time information to other applications whenever a specific event occurs within Argilla. Here's an example of how you can set up a webhook in Argilla:

import argilla as rg

@rg.webhook_listener("record.completed")
async def record_completed(record: rg.Record, **kwargs):
    print(f"Record {record.id} has been completed")

Visit the Argilla documentation for more information.
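
To make the listener idea more concrete, here is a small, library-free sketch of the pattern: handlers are registered per event name and called with the event payload when that event fires. This is not how Argilla implements webhooks; `listener`, `dispatch` and the payload keys are hypothetical names used only for illustration.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable, DefaultDict, List

Handler = Callable[..., Awaitable[None]]
_handlers: DefaultDict[str, List[Handler]] = defaultdict(list)


def listener(event: str) -> Callable[[Handler], Handler]:
    """Register an async handler for `event` (illustrative stand-in decorator)."""
    def register(func: Handler) -> Handler:
        _handlers[event].append(func)
        return func
    return register


async def dispatch(event: str, **payload) -> int:
    """Call every handler registered for `event`; return how many handlers ran."""
    handlers = _handlers.get(event, [])
    for handler in handlers:
        await handler(**payload)
    return len(handlers)


seen = []


@listener("record.completed")
async def on_completed(record_id: str, **kwargs) -> None:
    # Extra payload keys are accepted and ignored via **kwargs.
    seen.append(record_id)


ran = asyncio.run(dispatch("record.completed", record_id="r-1", timestamp="2024-11-29"))
```

Accepting `**kwargs` in the handler, as the example above also does, keeps the handler working if the payload gains extra keys later.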

A redesigned home page

Argilla's home page has been redesigned to provide a better user experience. It now shows a dataset card view, which gives a better overview of your datasets and their annotation progress.

Python 3.13 and Pydantic v2 support

The Argilla server (and SDK) now supports Python 3.13 and Pydantic 2.0.0. This means that you can now install and use both SDK and server with Python 3.13 in the same Python environment!

pip install argilla
pip install argilla-server

python -m argilla_server
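
If you want to confirm what ended up in your environment after installing, a quick check with the standard library could look like this. It is a sketch that only reads package metadata; `installed_versions` is a hypothetical helper name.

```python
import sys
from importlib import metadata
from typing import Dict, Optional


def installed_versions(*packages: str) -> Dict[str, Optional[str]]:
    """Return the installed version of each package, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions("argilla", "argilla-server", "pydantic")
python_version = ".".join(map(str, sys.version_info[:3]))
print(f"Python {python_version}: {report}")
```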

Other improvements

We've added a high contrast theme to help users with visual impairments. To change the theme go to "My settings" and choose your preferred theme. Thanks @paulbauriegel for this! 🎉
You can select the language that you'd like to display in the Argilla UI, also from the "My settings" page. Your language isn't there? Visit the Argilla documentation to learn how you can add yours.

Changelog v2.5.0

[BUGFIX] argilla server: Prevent update dataset.updated_at when updating dataset.last_activity_at column by @frascuchon in #5656
Docs: Typo Fix by @RahulK4102 in #5642
[Docs] : fix typos in docs by @FarukhS52 in #5612
[CONFIG] argilla server: Review and update dependencies by @frascuchon in #5649
Improve German translation and some aria attributes by @paulbauriegel in #5658
Add a high-contrast theme & improvements for the forced-colors mode by @paulbauriegel in #5661
[BUGFIX]: argilla server: install default psycopg2 driver used by alembic by @frascuchon in #5672
(Typo): Update README.md by @kaleaditya779 in #5655
[CONFIG] argilla: Add Python 3.13 support by @frascuchon in #5652
[ENHANCEMENT][REFACTOR] SDK: allow to remove settings by @frascuchon in #5584
fix: improve logic for detecting ChatFields by @leiyre in #5667
[BUGFIX] argilla frontend: Avoid call router.push when opening an external URL by @frascuchon in #5675
[BUGFIX] visualisation of highlighted text by @leiyre in #5678
Dataset Creation UI fixes & Improvements by @leiyre in #5670
[BUGFIX] Show Import data if user is admin or owner by @leiyre in #5688
docs: Add missing server configuration env vars by @frascuchon in #5676
[REFACTOR] argilla server: Remove passlib dependency by @frascuchon in #5674
[FEATURE] UI - Add language selection in user settings by @leiyre in #5690
⚡️ Fix highlight text by @damianpumar in #5693
[FEATURE] Add Webhooks by @jfcalvo in #5467
🚑 Add missing translation by @damianpumar in #5696
Docs - Add docs for adding a language by @paulbauriegel in #5640
[BUGFIX] argilla server: Prevent passing non-string values to text fields by @frascuchon in #5682
[REFACTOR] argilla server: using pydantic v2 by @frascuchon in #5666
fix: Resolve failing tests after pydantic V2 merge by @frascuchon in #5700
[DOCS] Deploy on spaces review by @sdiazlor in #5704
[REFACTOR] argilla: Align questions to Resource API by @frascuchon in #5680
[CHORE] Review changelogs by @frascuchon in #5707
[EXAMPLES][DOCS] review basic webhooks example by @frascuchon in #5710
[BUGFIX] argilla: allow change default distribution values by @frascuchon in #5719
[DOCS] review 2.5.0 docs by @frascuchon in #5723
[RELEASES] v2.5.0 by @frascuchon in #5706

New Contributors

@RahulK4102 made their first contribution in #5642
@FarukhS52 made their first contribution in #5612
@kaleaditya779 made their first contribution in #5655

Full Changelog: v2.4.1...v2.5.0

Contributors

jfcalvo, frascuchon, and 7 other contributors

11 Nov 11:37

frascuchon

v2.4.1

b483560

This release includes some argilla-server fixes:

Fixed redirection problems after users sign-in using HF OAuth. (#5635)
Fixed highlighting of the searched text in text, span and chat fields (#5678)
Fixed validation for rating question when creating a dataset (#5670)
Fixed question name based on question type when creating a dataset (#5670)
Fixed an error so that the _touch_dataset_last_activity_at function no longer updates the dataset's updated_at column. (#5656)

Full Changelog: v2.4.0...v2.4.1

30 Oct 17:05

frascuchon

v2.4.0

86fb672

🔆 Release highlights

Import Hub datasets from the UI

In this release, we’ve focused all of our efforts on bringing you a new feature to import datasets from the Hugging Face Hub directly within our UI, making it easier and faster to get started with your AI projects.

To get started, click on the “Import dataset from Hugging Face” button and paste the repo id of the dataset you want to use. Argilla will process the columns of the dataset and map them to Fields or Questions. Then, you can add more questions or remove any unnecessary fields by selecting the “No mapping” options. All the changes you make will be automatically reflected in the preview.

Once you’re happy with the result you simply need to provide a name for your dataset, select a workspace and (if applicable) a split. Then, Argilla will start importing the dataset.

Note

If your dataset is bigger than 10k records, at this stage Argilla will only import the first 10k. You can import the rest of the dataset using the Argilla SDK: simply click on the “Import data” button in the dataset and use the code snippet provided.
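
The snippet provided in the UI is the recommended route. Purely as an illustration of the idea, the sketch below skips the rows that were already imported and yields the rest in batches. It assumes the UI imported the first 10,000 rows in dataset order; `remaining_batches` and its default batch size are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def remaining_batches(
    rows: Iterable[dict], skip: int = 10_000, batch_size: int = 1_000
) -> Iterator[List[dict]]:
    """Skip the first `skip` rows, then yield the remaining rows in batches."""
    iterator = islice(iter(rows), skip, None)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Each batch could then be logged with `dataset.records.log(batch)`, provided the row keys match the dataset's fields and questions.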

If you want to make extra changes, like customizing the titles of your fields and questions, don’t worry, you can always go back to the Dataset Settings page after the dataset has been created.

Learn more about this new feature in our docs.

Deploy an Argilla Space directly from the SDK

If you're working from the SDK and don't want to leave it to start your Argilla server, you can start an Argilla deployment on Spaces with a single line of code:

import argilla as rg

client = rg.Argilla.deploy_on_spaces(api_key="12345678")

Learn more in our docs.

Changelog v2.4.0

Enhancement/improve-error-messaging-for-role-forbidden by @burtenshaw in #5554
refactor: add DatasetPublishValidator class by @jfcalvo in #5568
feat: set CREATOR_USER_ID to avoid difficulties with creation in orga… by @davidberenstein1957 in #5556
[Refactor] remove name validations for dataset workspaces and usernames by @frascuchon in #5575
fix: SPACES_CREATOR_USER_ID -> SPACE_CREATOR_USER_ID by @davidberenstein1957 in #5590
[FIX] Prevent duplicated field text by @leiyre in #5592
feat: Add basic support to bool features by @frascuchon in #5576
feat: Add support to other than str values for terms metadata properties by @frascuchon in #5594
[BUGFIX] argilla server: parse fields for record schemas by @frascuchon in #5600
correct phrase on docs: "a recod question" -> "a question" by @HeAndres in #5599
docs: update filter_dataset.md by @eltociear in #5571
feat: 5108 feature add method to deploy on spaces through huggingface hub by @davidberenstein1957 in #5547
docs: add quickstart update for deploy on spaces by @davidberenstein1957 in #5550
Typo: missing comma by @ACMCMC in #5565
Typo fix by @ACMCMC in #5566
Fix typo by @ACMCMC in #5567
[REFACTOR] argilla server: moving all record validators by @frascuchon in #5603
[BUGFIX] argilla server: Prevent convert ChatFieldValue objects by @frascuchon in #5605
Introducing Argilla Guru on Gurubase.io by @kursataktas in #5608
[PERF][IMPROVEMENT] argilla server: improve computation for dataset progress and metrics by @frascuchon in #5618
[PERF] argilla server: Reduce general transaction time by @frascuchon in #5609
fix: Prevent compute metrics for draft datasets by @frascuchon in #5624
Refine German translations and update non-localized UI elements by @paulbauriegel in #5632
[BUGFIX] Catch None in image feature columns by @burtenshaw in #5626
feat: added support for with_vectors with query filter in sdk by @bharath97-git in #5638
perf: Using search engine to compute the total number of records for user metrics by @frascuchon in #5641
[IMPROVEMENT] feat(helm): add support for default storage class in PVCs by @dme86 in #5628
Feature - Improve Accessibility for Screenreaders by @paulbauriegel in #5634
[FEATURE-BRANCH] Argilla direct import from Hub by @jfcalvo in #5572
fix: remove unnecesary exposed ports for Argilla Docker compose file by @jfcalvo in #5644
Dataset creation feature final QA by @leiyre in #5646
[CI] argilla frontend: Remove invalid workflow permissions by @frascuchon in #5647
[CI] Configure workflow permissions by @frascuchon in #5648
chore: update changelogs for release 2.4.0 by @jfcalvo in #5650
chore: small improvement installing dependencies for HF Spaces Dockerfile by @jfcalvo in #5651
fix: skip helmlint pre-commit hook on CI because helm command is not available by @jfcalvo in #5654
Import from hub docs by @nataliaElv in #5631
[RELEASE] 2.4.0 by @frascuchon in #5643

New Contributors

@HeAndres made their first contribution in #5599
@ACMCMC made their first contribution in #5565
@kursataktas made their first contribution in #5608
@bharath97-git made their first contribution in #5638
@dme86 made their first contribution in #5628

Full Changelog: v2.3.1...v2.4.0

Contributors

jfcalvo, frascuchon, and 11 other contributors

08 Oct 09:45

jfcalvo

v2.3.1

9b837db

What's Changed

This is a patch release fixing an error listing current user datasets:

Fixed error listing current user datasets and not filtering by current user id. (#5583)

Full Changelog: v2.3.0...v2.3.1

03 Oct 15:11

frascuchon

v2.3.0

1e54a48

🌟 Release highlights

Custom Fields: the most powerful way to build custom annotation tasks

We heard you. This new type of field gives you full control over how data is presented to annotators.

With custom fields, you can use your own CSS, HTML, and even JavaScript (welcome interactive fields!). Moreover, you can populate your fields with custom structures like custom_field={"image_1": ..., "image_2": ..., etc.}.

Here's an example:

Imagine you want to show two images and a prompt to your users.

With a custom field

With the new custom field, you can configure something like this:

And you can set this up with a few lines of code:

import argilla as rg

css_template = """
<style>
#container {
    display: flex;
    flex-direction: column;
    font-family: Arial, sans-serif;
}
.prompt {
    margin-bottom: 10px;
    font-size: 16px;
    line-height: 1.4;
    color: #333;
    background-color: #f8f8f8;
    padding: 10px;
    border-radius: 5px;
    box-shadow: 0 1px 3px rgba(0,0,0,0.1);
}
.image-container {
    display: flex;
    gap: 10px;
}
.column {
    flex: 1;
    position: relative;
}
img {
    max-width: 100%;
    height: auto;
    display: block;
}
.image-label {
    position: absolute;
    top: 10px;
    right: 10px;
    background-color: rgba(255, 255, 255, 0.7);
    color: black;
    padding: 5px 10px;
    border-radius: 5px;
    font-weight: bold;
}
</style>
"""

html_template = """
<div id="container">
    <div class="prompt"><strong>Prompt:</strong> {{record.fields.images.prompt}}</div>
    <div class="image-container">
        <div class="column">
            <img src="{{record.fields.images.image_1}}" />
            <div class="image-label">Image 1</div>
        </div>
        <div class="column">
            <img src="{{record.fields.images.image_2}}" />
            <div class="image-label">Image 2</div>
        </div>
    </div>
</div>
"""

custom_field = rg.CustomField(
    name="images",
    template=css_template + html_template,
)

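# and then log records like this; the values are nested under the
# custom field's name ("images"), matching {{record.fields.images.*}}
rg.Record(
    fields={
        "images": {
            "prompt": prompt,
            "image_1": schnell_uri,
            "image_2": dev_uri,
        }
    }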
)
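
The `{{record.fields.images.prompt}}` placeholders in the template are dotted paths into the record, so the values for a custom field named `images` are looked up under `record.fields.images`. The sketch below mimics that lookup in plain Python to show how a path resolves. It is not Argilla's actual template engine, and `render` is a hypothetical helper.

```python
import re
from typing import Any, Mapping

# Matches {{ dotted.path }} placeholders.
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template: str, context: Mapping[str, Any]) -> str:
    """Replace each {{a.b.c}} placeholder with context["a"]["b"]["c"]."""
    def lookup(match: "re.Match[str]") -> str:
        value: Any = context
        for key in match.group(1).split("."):
            value = value[key]
        return str(value)

    return _PLACEHOLDER.sub(lookup, template)


context = {
    "record": {
        "fields": {
            "images": {"prompt": "A red fox", "image_1": "https://example.com/a.png"}
        }
    }
}
html = render(
    '<p>{{record.fields.images.prompt}}</p><img src="{{record.fields.images.image_1}}" />',
    context,
)
```

A missing key raises `KeyError` here, which is a quick way to spot a mismatch between the template paths and the structure you log.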

Before the custom field

Before this release, you were forced to use two ImageFields and a TextField, which would be displayed sequentially, limiting the ability to compare the images side-by-side, with clear labels, prompt text, etc. It would look like this:

How to get started with custom fields

Here we've shown a basic presentation-oriented custom field, but you can set up anything you can think of, leveraging JS, HTML, and CSS. Imagination is the limit!

To get started check the docs: https://docs.argilla.io/v2.3/how_to_guides/custom_fields/

Other features

Support for similarity search from the SDK and other search and filtering improvements.
New Helm chart deployment configuration.
Support credentials from colab secrets.

Other changes and fixes

Changed

Changed the repr method for SettingsProperties to display the details of all the properties in the Settings object. (#5380)
Changed error messages when creating datasets with insufficient permissions. (#5540)

Fixed

Fixed serialization of ChatField when collecting records from the hub and exporting to datasets. (#5554)
Fixed error when creating default user with existing default workspace. (#5558)
Fixed the deployment yaml used to create a new Argilla server in K8s. Added USERNAME and PASSWORD to the environment variables of pod template. (#5434)
Fixed autofill form on sign-in page. (#5522)
Added support for copying to the clipboard in non-secure contexts. (#5535)

New Contributors

@not-lain made their first contribution in #5541

Thanks to

@bikash119 for Helm chart in #5512

Full Changelog: v2.2.2...v2.3.0

Contributors

bikash119 and not-lain

25 Sep 14:55

frascuchon

v2.2.2

4c2af95

What's Changed

This is a patch release with certain fixes to the SDK:

Fixed

Fixed from_hub with unsupported column names. (#5524)
Fixed from_hub with missing dataset subset configuration value. (#5524)

Changed

Changed from_hub to only generate fields not questions for strings in the dataset. (#5524)

Full Changelog: v2.2.1...v2.2.2

23 Sep 11:52

jfcalvo

v2.2.1

d1eee08

What's Changed

This is a patch release with certain fixes to the SDK:

Fixed from_hub errors when column names contain uppercase letters. (#5523)
Fixed from_hub errors when class feature values contain unlabelled values. (#5523)
Fixed from_hub errors when loading cached datasets. (#5523)

Full Changelog: v2.2.0...v2.2.1

19 Sep 14:59

jfcalvo

v2.2.0

e1b2e6e

🌟 Release highlights

Important

Argilla server 2.2.0 adds support for background jobs. Background jobs let the server run tasks that would take too long to complete within a request. For this reason we now rely on Redis and Python RQ workers.

So to upgrade your Argilla instance to version 2.2.0 you need to have an available Redis server. See the Redis get-started documentation or the Argilla server configuration documentation for more information.

If you have deployed Argilla server using the docker-compose.yaml, you should download the docker-compose.yaml file again to pick up the latest changes, which set up Redis and the Argilla workers.

Workers are needed to process Argilla's background jobs. You can run Argilla workers with the following command:

python -m argilla_server worker
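
To illustrate why a separate worker process is needed, here is a tiny stand-in for the pattern using only the standard library: the caller enqueues work and returns immediately, while a worker picks jobs off the queue and runs them. Argilla uses Redis and Python RQ for this; the in-process queue and thread below are only an assumption-free illustration of the idea, not the server's implementation.

```python
import queue
import threading

jobs = queue.Queue()
results = []


def worker() -> None:
    """Run queued jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        results.append(job())


thread = threading.Thread(target=worker)
thread.start()

# The "request handler" only enqueues the slow work and returns right away.
jobs.put(lambda: sum(range(1_000)))
jobs.put(None)  # sentinel: tell the worker to stop
thread.join()
```

If no worker is running, jobs simply pile up in the queue, which is why the worker command above has to be started alongside the server.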

ChatField: working with text conversations in Argilla

You can now work with text conversations natively in Argilla using the new ChatField. It is especially designed to make it easier to build datasets for conversational Large Language Models (LLMs), displaying conversational data in the form of a chat.

Here's how you can create a dataset with a ChatField:

import argilla as rg

client = rg.Argilla(api_url="<api_url>", api_key="<api_key>")

settings = rg.Settings(
	fields=[rg.ChatField(name="chat")],
	questions=[...]
)

dataset = rg.Dataset(
	name="chat_dataset",
	settings=settings,
	workspace="my_workspace",
	client=client
)

dataset.create()

record = rg.Record(
	fields={
		"chat": [
			{"role": "user", "content": "Hello World, how are you?"},
			{"role": "assistant", "content": "I'm doing great, thank you!"}
		]
	}
)

dataset.records.log([record])
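
Before logging conversational data, it can be handy to check that each conversation has the shape used above: a list of messages, each with a string `role` and a string `content`. The following is a minimal sketch; `chat_errors` is a hypothetical helper and the server may apply additional rules that are not covered here.

```python
from typing import Any, List


def chat_errors(messages: Any) -> List[str]:
    """Return a list of problems found in a chat value (empty if none)."""
    if not isinstance(messages, list):
        return ["chat value must be a list of messages"]
    errors = []
    for index, message in enumerate(messages):
        if not isinstance(message, dict):
            errors.append(f"message {index}: expected a dict")
            continue
        for key in ("role", "content"):
            if not isinstance(message.get(key), str):
                errors.append(f"message {index}: '{key}' must be a string")
    return errors
```

An empty result means the conversation matches the structure shown in the record above.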

Read more about how to use this new field type here and here.

Adjust task distribution settings

You can now modify task distribution settings at any time, and Argilla will automatically recalculate the completed and pending records. When you update this setting, records will be removed from or added to the pending queues of your team accordingly.

You can make this change in the dataset settings page or using the SDK:

import argilla as rg

client = rg.Argilla(api_url="<api_url>", api_key="<api_key>")

dataset = client.datasets("my_dataset")
dataset.settings.distribution.min_submitted = 2
dataset.update()
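
To see what the recalculation means in practice, the sketch below splits records into completed and pending for a given `min_submitted` value. It assumes a record counts as completed once it has at least `min_submitted` submitted responses; `split_by_status` is a hypothetical helper, not an SDK function.

```python
from typing import Dict, List


def split_by_status(
    submitted_per_record: Dict[str, int], min_submitted: int
) -> Dict[str, List[str]]:
    """Group record ids by status for a given `min_submitted` threshold."""
    result: Dict[str, List[str]] = {"completed": [], "pending": []}
    for record_id, submitted in submitted_per_record.items():
        status = "completed" if submitted >= min_submitted else "pending"
        result[status].append(record_id)
    return result


counts = {"a": 1, "b": 2, "c": 0}
before = split_by_status(counts, min_submitted=1)
after = split_by_status(counts, min_submitted=2)
```

Raising the threshold from 1 to 2 moves record `a` back to the pending queue, which is the kind of change Argilla applies automatically when you update the setting.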

Track team progress from the SDK

The Argilla SDK now provides a way to retrieve data on annotation progress. This feature allows you to monitor the number of completed and pending records in a dataset and also the number of responses made by each user:

import argilla as rg

client = rg.Argilla(api_url="<api_url>", api_key="<api_key>")

dataset = client.datasets("my_dataset")

progress = dataset.progress(with_users_distribution=True)

The expected output looks like this:

{
    "total": 100,
    "completed": 50,
    "pending": 50,
    "users": {
        "user1": {
           "completed": { "submitted": 10, "draft": 5, "discarded": 5},
           "pending": { "submitted": 5, "draft": 10, "discarded": 10},
        },
        "user2": {
           "completed": { "submitted": 20, "draft": 10, "discarded": 5},
           "pending": { "submitted": 2, "draft": 25, "discarded": 0},
        },
        ...
    }
}
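
A structure like this is easy to condense into a short report. The sketch below assumes `progress` is a plain dictionary shaped like the output above; `summarize` is a hypothetical helper.

```python
from typing import Any, Dict


def summarize(progress: Dict[str, Any]) -> Dict[str, Any]:
    """Compute the completion percentage and submitted responses per user."""
    total = progress["total"]
    percent = round(100 * progress["completed"] / total, 1) if total else 0.0
    submitted = {
        user: counts["completed"]["submitted"] + counts["pending"]["submitted"]
        for user, counts in progress.get("users", {}).items()
    }
    return {"percent_completed": percent, "submitted_by_user": submitted}


sample = {
    "total": 100,
    "completed": 50,
    "pending": 50,
    "users": {
        "user1": {
            "completed": {"submitted": 10, "draft": 5, "discarded": 5},
            "pending": {"submitted": 5, "draft": 10, "discarded": 10},
        },
        "user2": {
            "completed": {"submitted": 20, "draft": 10, "discarded": 5},
            "pending": {"submitted": 2, "draft": 25, "discarded": 0},
        },
    },
}
summary = summarize(sample)
```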

Automatic settings inference

When you import a dataset using the from_hub method, Argilla will automatically infer the settings, such as the fields and questions, based on the dataset Features. This will save you time and effort when working with datasets from the Hub.

import argilla as rg

client = rg.Argilla(api_url="<api_url>", api_key="<api_key>")

dataset = rg.Dataset.from_hub("yahma/alpaca-cleaned")

Task templates

We've added pre-built templates for common dataset types, including text classification, ranking, and rating tasks. These templates provide a starting point for your dataset creation, with pre-configured settings. You can use these templates to get started quickly, without having to configure everything from scratch.

import argilla as rg

client = rg.Argilla(api_url="<api_url>", api_key="<api_key>")

settings = rg.Settings.for_classification(labels=["positive", "negative"])

dataset = rg.Dataset(
	name="my_dataset",
	settings=settings,
	client=client,
	workspace="my_workspace",
)

dataset.create()

Release 2.1.0

🌟 Release highlights

Image Field

Argilla now supports multimodal datasets with the introduction of a native ImageField. This new type of field allows you to work seamlessly with image data, making it easier to annotate and curate datasets that combine text and images.

Here's an example of a dataset with an image field:

import argilla as rg

client = rg.Argilla(...)

settings = rg.Settings(
	fields = [
		rg.ImageField(name="image"),
		rg.TextField(name="caption")
	],
	questions = [
		rg.LabelQuestion(
			name="good_or_bad", 
			title="Is the caption good or bad",
			labels=["good", "bad"]
		),
		rg.TextQuestion(name="comments")
	]
)

dataset = rg.Dataset(name="image_captions", settings=settings)
dataset.create()

record = rg.Record(
	fields= {
	  "image": "https://docs.argilla.io/dev/assets/logo.svg", 
	  "caption": "This is the Argilla logo"
	}
)
dataset.records.log([record])
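
The example above uses a public URL. If your images are local, one option is to embed them as data URIs. The sketch below builds one with the standard library; it assumes the image field accepts data URIs in place of URLs, and `to_data_uri` is a hypothetical helper.

```python
import base64


def to_data_uri(data: bytes, mime_type: str = "image/png") -> str:
    """Encode raw bytes as a base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

For a file on disk, you would read it in binary mode and pass the bytes along with the matching MIME type, for example `to_data_uri(open("cat.png", "rb").read())`, where `cat.png` is a placeholder path.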

Dark Mode

Argilla seems too bright for you? You can now try our new Dark Mode: a theme designed to reduce eye strain and give a new modern look to the app. You can enable Dark Mode under "My Settings".

Spanish Translation

We're committed to making Argilla accessible to a broader audience. With the addition of Spanish translation, we're taking another step towards breaking language barriers and enabling more teams to collaborate on data curation projects.
There's nothing you need to do to enable it: Argilla will automatically switch to Spanish when your browser's main language is set to Spanish. ¡Disfrutadla!

Import any dataset from the Hugging Face Hub

The from_hub method just got a major boost! You can now input your own settings, allowing you to use this method with almost any dataset from the Hugging Face Hub, not just Argilla datasets.

Here's how easy it is to import a dataset from the Hub:

import argilla as rg

client = rg.Argilla(...)

settings = rg.Settings(
    fields=[
        rg.TextField(name="input"),
    ],
    questions=[
        rg.TextQuestion(name="output"),
    ],
)

dataset = rg.Dataset.from_hub(
    repo_id="yahma/alpaca-cleaned",
    settings=settings,
)

Other Notable Fixes and Improvements

Adaptable text areas for TextQuestions, providing a better user experience in the UI.
Enhanced messaging for empty queues, keeping you informed when no records are available in the UI.

Full Changelog: v2.0.1...v2.1.0

Releases: argilla-io/argilla
