Briefly document the label re-using feature
samsucik committed Apr 22, 2024
1 parent 4428106 commit eb88f9c
Showing 1 changed file with 14 additions and 0 deletions.
14 changes: 14 additions & 0 deletions README.md
@@ -161,6 +161,20 @@ rewriting the `postprocess` function in `prompterator/postprocess_output.py`. Th
receive one raw model-generated text at a time and should output its postprocessed version. Both
the raw and the postprocessed text are kept and saved.

### Reusing labels for repeatedly encountered examples

While iterating on your prompt with a dataset, you may find yourself annotating a model output that you
already annotated in an earlier round. You can choose to automatically reuse such previously
assigned labels by toggling "reuse past labels". To speed up your annotation process even more,
you can toggle "skip past label rows" so that you only go through the rows for which no
previously assigned label was found.

How this feature works:
- Existing labels are looked up in the files currently listed in the sidebar. A match requires
both the `response` and the values of all input columns to be identical.
- If multiple different labels are found for a given input+output combination (a sign of
inconsistent past annotation work), the most recent label is reused.
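
The matching rule above can be sketched roughly as follows. This is an illustrative sketch only, not Prompterator's actual implementation: the function name `find_reusable_label`, the `label` key, the plain list-of-dicts representation of files, and the assumption that files are ordered from oldest to newest are all hypothetical.

```python
from typing import Any, Optional, Sequence


def find_reusable_label(
    row: dict[str, Any],
    past_files: Sequence[Sequence[dict[str, Any]]],
    input_columns: Sequence[str],
) -> Optional[Any]:
    """Return the most recent past label for `row`, or None if there is none.

    Assumptions (hypothetical, for illustration): each file is a list of row
    dicts, `past_files` is ordered oldest to newest, and labels live under
    the "label" key.
    """
    match_columns = [*input_columns, "response"]
    key = tuple(row[c] for c in match_columns)

    # Walk from the newest file to the oldest so that the first hit is the
    # most recent label, even if older files disagree.
    for past_rows in reversed(past_files):
        for past in reversed(past_rows):
            if past.get("label") is None:
                continue  # this row was never annotated
            if any(c not in past for c in match_columns):
                continue  # different set of columns, cannot match
            if tuple(past[c] for c in match_columns) == key:
                return past["label"]
    return None
```

With a helper like this, "skip past label rows" would amount to showing only the rows for which the lookup returns `None`.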

## Paper

You can find more information on Prompterator in the associated paper: https://aclanthology.org/2023.emnlp-demo.43/