generated from iai-group/template-project
Showing 40 changed files with 1,207 additions and 55 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,4 @@ | ||
# Data | ||
|
||
This folder should contain a description of the data used in the project, including datasets, queries, ground truth, run files, etc. Files under 10MB can be stored on GitHub; larger files should be stored on a server (e.g., gustav1). This README should provide a comprehensive overview of all the data used and where it originates (e.g., part of an official test collection, generated using code in this repo or a third-party tool, etc.). | ||
* `llm_prompts`: LLM prompts are stored in this folder. | ||
* `nl_annotations`: Evaluation dataset is contained in this folder. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# NL to API Prompts | ||
|
||
This folder contains the prompts for the NL to API model. The prompts are simple text files with one prompt per file. |
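The prompt files in this commit use `{statement}` and `{object}` placeholders. The following is a minimal sketch of how one prompt file could be read and filled, assuming plain `str.format` substitution. The helpers `fill_prompt` and `load_prompt` are hypothetical and are not the `Prompt` class referenced elsewhere in this commit, whose implementation is not shown here.

```python
from pathlib import Path


def fill_prompt(template: str, **fields: str) -> str:
    """Substitute {placeholders} in a prompt template (hypothetical helper)."""
    # str.format raises KeyError when a placeholder has no matching field.
    return template.format(**fields)


def load_prompt(path: str, **fields: str) -> str:
    """Read one prompt file and fill its placeholders (hypothetical helper)."""
    template = Path(path).read_text(encoding="utf-8")
    return fill_prompt(template, **fields)
```

With this approach a template that contains literal braces would need them doubled (`{{` and `}}`), which is one reason a real implementation might choose a different substitution mechanism.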
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
What is the user's intent with the following statement? The options are: | ||
|
||
ADD - statement of fact or preference, | ||
GET - Asking for information, | ||
DELETE - Request to delete or remove an item, | ||
UNKNOWN - None of the above | ||
|
||
Statement: | ||
------------------------------ | ||
{statement} | ||
------------------------------ | ||
Answer: |
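An LLM may wrap the label in extra words or punctuation (for example `Answer: GET.`), so the answer to this prompt needs some tolerant parsing. The sketch below shows one possible approach: accept the answer only when exactly one of the four labels appears. `parse_intent` is a hypothetical helper and tokenizes on non-letters, which differs slightly from the whitespace split used by `_get_intent` in this commit.

```python
import re

VALID_INTENTS = ("ADD", "GET", "DELETE", "UNKNOWN")


def parse_intent(response: str) -> str:
    """Map a free-text LLM answer to one intent label (hypothetical helper)."""
    # Tokenize on non-letters so "GET." and "Answer: GET" both yield "GET".
    tokens = set(re.findall(r"[A-Za-z]+", response))
    matches = [label for label in VALID_INTENTS if label in tokens]
    # Zero labels or several competing labels are treated as UNKNOWN.
    return matches[0] if len(matches) == 1 else "UNKNOWN"
```

Matching is case-sensitive on purpose, so the ordinary English words "add", "get", and "delete" in an explanation are not mistaken for labels.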
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
What is the sentiment towards "{object}" in the following statement? Answer 1 for positive, -1 for negative, or N/A when sentiment is not applicable. | ||
|
||
Statement: | ||
------------------------------ | ||
{statement} | ||
------------------------------ | ||
Answer: |
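The prompt above allows three answers: `1`, `-1`, or `N/A`. A minimal sketch of turning such an answer into a weight follows. `parse_preference` is a hypothetical helper; it is stricter than taking the first numeric token, since it accepts only the two values the prompt asks for.

```python
import re
from typing import Optional


def parse_preference(response: str) -> Optional[float]:
    """Return 1.0, -1.0, or None from an LLM answer (hypothetical helper)."""
    # Split on whitespace and common punctuation, e.g. "Answer: -1." -> "-1".
    for term in re.split(r"[\s.,;:]+", response):
        if term in {"1", "+1"}:
            return 1.0
        if term == "-1":
            return -1.0
    # "N/A" or any other answer means no preference could be extracted.
    return None
```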
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
Return pipe-separated subject, predicate, and object from the following statement. If a field is not applicable, output N/A. | ||
|
||
Example: | ||
------------------------------ | ||
I like cats. | ||
------------------------------ | ||
Answer: I | like | cats | ||
|
||
Example: | ||
------------------------------ | ||
Hello John. | ||
------------------------------ | ||
Answer: N/A | N/A | John | ||
|
||
Statement: | ||
------------------------------ | ||
{statement} | ||
------------------------------ | ||
Answer: |
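The two examples in this prompt define the expected answer format: three pipe-separated fields, with `N/A` marking a missing field. As a rough sketch, an answer in that format could be parsed as below. `parse_triple` is a hypothetical helper that returns a plain tuple rather than the `Triple` dataclass added in this commit.

```python
from typing import Optional, Tuple

Field = Optional[str]


def parse_triple(response: str) -> Optional[Tuple[Field, Field, Field]]:
    """Parse 'subject | predicate | object' into a tuple (hypothetical helper)."""
    parts = [part.strip() for part in response.split("|")]
    # Anything other than exactly three fields is rejected as malformed.
    if len(parts) != 3:
        return None
    subject, predicate, obj = (None if p == "N/A" else p for p in parts)
    return subject, predicate, obj
```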
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
"Sentence", "Intent", "Subject", "Predicate", "Object", "Preference" | ||
"Bob lives in New York.","ADD", "Bob", "lives in", "New York", | ||
"Diana is a fan of Steven Spielberg's work.", "ADD", "Diana", "fan of", "Steven Spielberg movies", 1 | ||
"Charlie doesn't prefer Steven Spielberg's movies.","ADD", "Charlie", "doesn't prefer", "Steven Spielberg movies", -1 | ||
"Alice admires Robert Downey Jr..","ADD", "Alice", "admires", "Robert Downey Jr.", 1 | ||
"Ethan is a fan of Interstellar.","ADD", "Ethan", "fan of", "Interstellar", 1 | ||
"Diana told me she loves movies directed by Quentin Tarantino.","ADD", "Diana", "loves", "movies directed by Quentin Tarantino", 1 | ||
"Bob's sister is Diana.","ADD", "Bob", "sister of", "Diana", | ||
"Bob admires Emma Watson.","ADD", "Bob", "admires", "Emma Watson", 1 | ||
"Diana likes sci-fi movies.","ADD", "Diana", "likes", "sci-fi movies", 1 | ||
"Charlie admires Tom Hanks.","ADD", "Charlie", "admires", "Tom Hanks", 1 | ||
"Do I like Pulp Fiction?,","GET", "I", "like", "Pulp Fiction", | ||
"Do I like movies directed by Mel Gibson?,","GET", "I", "like", "Mel Gibson movies", | ||
"Do I prefer Pulp Fiction","GET", "I", "prefer", "Pulp Fiction", 1 | ||
"Do I prefer Rambo movies,","GET", "I", "prefer", "Rambo movies", 1 | ||
"Do I like movies featuring actors that have played Macbeth? Which of these movies do I prefer?,","UNKNOWN", , , , | ||
"Do I like action movies?,","GET", "I", "like", "Action movies", | ||
"Do I prefer romantic comedies?,","GET", "I", "prefer", "Romantic comedies", 1 | ||
"Do I hate romantic comedies?,","GET", "I", "hate", "Action movies", -1 | ||
"How many action movies are stored in my PKG?,","UNKNOWN", "", "", "", | ||
"Save The Godfather as my favourite movie,","ADD", "I", "favourite movie", "The Godfather", 1 | ||
"I enjoy watching action movies with Alice,","ADD", "I", "enjoy watching", "action movies with Alice", 1 | ||
"I went to the movie theatre yesterday and loved Oppenheimer,","ADD", "I", "loved", "Oppenheimer", 1 | ||
"Note that I would never watch cheesy romcom unless with my friends,","UNKNOWN", "", "", "", | ||
"I am married to Bob,","ADD", "I", "married to", "Bob", | ||
"My husband doesn't like romantic comedies,","ADD", "My husband", "dislikes", "romantic comedies", -1 | ||
"Emma is my mother,","ADD", "I", "son of", "Emma", | ||
"My husband has seen all the Christopher Nolan movies and a big fan,","ADD", "My husband", "big fan of", "Christopher Nolan movies", 1 | ||
"Remove Forrest Gump from my movie library.","DELETE","", "", "Forrest Gump", | ||
"I am no longer married to Bob", "DELETE", "I", "married to", "Bob", | ||
"Discard everything directed by Steven Spielberg.", "DELETE", "", "directed by", "Steven Spielberg", |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
"""Dataclasses for the annotations used in the PKG API.""" | ||
|
||
from dataclasses import dataclass, field | ||
from typing import List, Optional, Union | ||
|
||
from pkg_api.pkg_types import URI | ||
|
||
|
||
@dataclass | ||
class Concept: | ||
"""Class representing a SKOS concept.""" | ||
|
||
description: str | ||
related_entities: List[URI] = field(default_factory=list) | ||
broader_entities: List[URI] = field(default_factory=list) | ||
narrower_entities: List[URI] = field(default_factory=list) | ||
|
||
|
||
@dataclass | ||
class Triple: | ||
"""Class representing a subject, predicate, object triple.""" | ||
|
||
subject: Union[URI, str, None] = None | ||
predicate: Union[URI, str, None] = None | ||
object: Union[URI, Concept, str, None] = None | ||
|
||
|
||
@dataclass | ||
class Preference: | ||
"""Class representing a preference.""" | ||
|
||
topic: Union[URI, Concept, str] | ||
weight: float | ||
|
||
|
||
@dataclass | ||
class PKGData: | ||
"""Represents a statement annotated with a triple and a preference.""" | ||
|
||
statement: str | ||
triple: Optional[Triple] = None | ||
preference: Optional[Preference] = None |
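To illustrate how these dataclasses fit together, here is a self-contained sketch that re-declares trimmed copies of them and builds one annotated and one unannotated statement. `URI = str` is an assumption standing in for `pkg_api.pkg_types.URI`, which is not shown in this diff, and `Concept` is reduced to two fields for brevity.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional, Union

URI = str  # Assumption: stand-in for pkg_api.pkg_types.URI.


@dataclass
class Concept:
    description: str
    # default_factory gives every instance its own list.
    related_entities: List[URI] = field(default_factory=list)


@dataclass
class Triple:
    subject: Union[URI, str, None] = None
    predicate: Union[URI, str, None] = None
    object: Union[URI, Concept, str, None] = None


@dataclass
class Preference:
    topic: Union[URI, Concept, str]
    weight: float


@dataclass
class PKGData:
    statement: str
    triple: Optional[Triple] = None
    preference: Optional[Preference] = None


annotated = PKGData(
    "I like cats.", Triple("I", "like", "cats"), Preference("cats", 1.0)
)
unannotated = PKGData("Hello.")  # triple and preference default to None
as_dict = asdict(annotated)  # nested dataclasses become nested dicts
```

Using `field(default_factory=list)` rather than a shared `[]` default means that appending to one concept's `related_entities` cannot affect another concept.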
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
"""Intents for the API.""" | ||
|
||
from enum import Enum, auto | ||
|
||
|
||
class Intent(Enum): | ||
"""Enum for intents.""" | ||
|
||
ADD = auto() | ||
GET = auto() | ||
DELETE = auto() | ||
UNKNOWN = auto() |
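Because the members are created with `auto()`, their values are simply 1 to 4 and the member names carry the meaning. A short sketch of converting a text label, such as the `Intent` column of the annotation CSV, into a member by name follows. `intent_from_label` is a hypothetical helper, and the enum is repeated so the example runs on its own.

```python
from enum import Enum, auto


class Intent(Enum):
    """Copy of the enum above, repeated to keep the example self-contained."""

    ADD = auto()
    GET = auto()
    DELETE = auto()
    UNKNOWN = auto()


def intent_from_label(label: str) -> Intent:
    """Look up an Intent by name, falling back to UNKNOWN (hypothetical helper)."""
    try:
        # Enum subscription looks members up by name, not by value.
        return Intent[label.strip().upper()]
    except KeyError:
        return Intent.UNKNOWN
```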
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
"""NL to PKG module.""" | ||
|
||
from .annotators.annotator import StatementAnnotator | ||
from .annotators.three_step_annotator import ThreeStepStatementAnnotator | ||
from .entity_linking.entity_linker import EntityLinker | ||
from .llm.llm_connector import LLMConnector | ||
from .llm.prompt import Prompt | ||
from .nl_to_pkg import NLtoPKG | ||
|
||
__all__ = [ | ||
"StatementAnnotator", | ||
"ThreeStepStatementAnnotator", | ||
"EntityLinker", | ||
"LLMConnector", | ||
"Prompt", | ||
"NLtoPKG", | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
"""Class for annotating a natural language query. | ||
The main purpose is to return the intent (ADD | GET | DELETE | UNKNOWN) and | ||
to annotate the triple (subject, predicate, object) and the preference | ||
(1 | -1) in the query. | ||
""" | ||
|
||
|
||
from abc import ABC, abstractmethod | ||
from typing import Tuple | ||
|
||
from pkg_api.core.annotations import PKGData | ||
from pkg_api.core.intents import Intent | ||
from pkg_api.nl_to_pkg.llm.prompt import Prompt | ||
|
||
|
||
class StatementAnnotator(ABC): | ||
def __init__(self) -> None: | ||
"""Initializes the statement annotator.""" | ||
self._prompt = Prompt() | ||
|
||
@abstractmethod | ||
def get_annotations(self, statement: str) -> Tuple[Intent, PKGData]: | ||
"""Returns a tuple of the intent and the annotated statement. | ||
Args: | ||
statement: The statement to be annotated. | ||
Raises: | ||
NotImplementedError: If the method is not implemented. | ||
Returns: | ||
A tuple of the intent and the annotated statement as PKGData. | ||
""" | ||
raise NotImplementedError |
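To show what the abstract interface asks of an implementer, the sketch below defines a simplified stand-in for `StatementAnnotator` and a toy rule-based subclass. Both are hypothetical: the stand-in returns a label and a plain dict instead of `Intent` and `PKGData`, and `KeywordAnnotator` is not part of this commit.

```python
from abc import ABC, abstractmethod
from typing import Dict, Tuple


class StatementAnnotator(ABC):
    """Simplified stand-in for the abstract class in this commit."""

    @abstractmethod
    def get_annotations(self, statement: str) -> Tuple[str, Dict[str, str]]:
        raise NotImplementedError


class KeywordAnnotator(StatementAnnotator):
    """Hypothetical rule-based annotator, for illustration only."""

    def get_annotations(self, statement: str) -> Tuple[str, Dict[str, str]]:
        lowered = statement.strip().lower()
        if lowered.startswith(("remove", "delete", "discard")):
            intent = "DELETE"
        elif lowered.endswith("?"):
            intent = "GET"
        else:
            intent = "ADD"
        return intent, {"statement": statement}
```

Since `get_annotations` is marked with `@abstractmethod`, instantiating the base class directly raises `TypeError`; only subclasses that override the method can be created.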
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,143 @@ | ||
"""A three-step annotator for annotating a statement. | ||
This module contains a three-step annotator for annotating a statement | ||
with a triple and a preference using LLM. | ||
""" | ||
|
||
|
||
import re | ||
from abc import ABC | ||
from typing import Optional, Tuple | ||
|
||
from pkg_api.core.annotations import PKGData, Preference, Triple | ||
from pkg_api.core.intents import Intent | ||
from pkg_api.nl_to_pkg.llm.llm_connector import LLMConnector | ||
from pkg_api.nl_to_pkg.llm.prompt import Prompt | ||
|
||
_DEFAULT_PROMPT_PATHS = { | ||
"intent": "data/llm_prompts/default/intent.txt", | ||
"triple": "data/llm_prompts/default/triple.txt", | ||
"preference": "data/llm_prompts/default/preference.txt", | ||
} | ||
|
||
|
||
def is_number(value: str) -> bool: | ||
"""Returns True if a value is a number, False otherwise. | ||
Args: | ||
value: The value to be checked. | ||
Returns: | ||
True if the value is a number, False otherwise. | ||
""" | ||
try: | ||
float(value) | ||
return True | ||
except ValueError: | ||
return False | ||
|
||
|
||
class ThreeStepStatementAnnotator(ABC): | ||
def __init__(self) -> None: | ||
"""Initializes the three-step statement annotator.""" | ||
self._prompt_paths = _DEFAULT_PROMPT_PATHS | ||
self._prompt = Prompt() | ||
self._valid_intents = {intent.name for intent in Intent} | ||
self._llm_connector = LLMConnector() | ||
|
||
def get_annotations(self, statement: str) -> Tuple[Intent, PKGData]: | ||
"""Returns a tuple with annotations for a statement. | ||
Args: | ||
statement: The statement to be annotated. | ||
Returns: | ||
The intent and the annotations. | ||
""" | ||
intent = self._get_intent(statement) | ||
triple = self._get_triple(statement) | ||
preference = ( | ||
self._get_preference(statement, triple.object) | ||
if triple and isinstance(triple.object, str) | ||
else None | ||
) | ||
return intent, PKGData(statement, triple, preference) | ||
|
||
def _get_intent(self, statement: str) -> Intent: | ||
"""Returns the intent for a statement. | ||
Args: | ||
statement: The statement to be annotated. | ||
Returns: | ||
The intent. | ||
""" | ||
prompt = self._prompt.get_prompt( | ||
self._prompt_paths["intent"], statement=statement | ||
) | ||
response = self._llm_connector.get_response(prompt) | ||
response_terms = response.split() | ||
if len(self._valid_intents.intersection(response_terms)) == 1: | ||
return next( | ||
intent for intent in Intent if intent.name in response_terms | ||
) | ||
return Intent.UNKNOWN | ||
|
||
def _get_triple(self, statement: str) -> Optional[Triple]: | ||
"""Returns the triple for a statement. | ||
Args: | ||
statement: The statement to be annotated. | ||
Returns: | ||
The triple comprised of subject, predicate, and object or None. | ||
""" | ||
prompt = self._prompt.get_prompt( | ||
self._prompt_paths["triple"], statement=statement | ||
) | ||
response = self._llm_connector.get_response(prompt) | ||
response_terms = [ | ||
None if term.strip() == "N/A" else term.strip() | ||
for term in response.split("|") | ||
] | ||
if len(response_terms) == 3: | ||
return Triple(*response_terms) | ||
return None | ||
|
||
def _get_preference( | ||
self, statement: str, triple_object: str | ||
) -> Optional[Preference]: | ||
"""Returns the preference for a statement. | ||
Args: | ||
statement: The statement to be annotated. | ||
triple_object: The object of the triple. It is only used in string | ||
form. | ||
Raises: | ||
TypeError: If the triple object is not a string. | ||
Returns: | ||
The preference. | ||
""" | ||
if not isinstance(triple_object, str): | ||
raise TypeError( | ||
f"Triple object must be of type str, not {type(triple_object)}." | ||
) | ||
|
||
prompt = self._prompt.get_prompt( | ||
self._prompt_paths["preference"], | ||
statement=statement, | ||
object=triple_object, | ||
) | ||
response = self._llm_connector.get_response(prompt) | ||
response_terms = [ | ||
term.strip() for term in re.split(r"[ .,;]+", response) | ||
] | ||
preference = next( | ||
(term for term in response_terms if is_number(term)), | ||
None, | ||
) | ||
if preference: | ||
return Preference(triple_object, float(preference)) | ||
return None |
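The control flow of `get_annotations` can be tried without a real model by passing in a canned LLM. The sketch below is a simplified, hypothetical rewrite of the three steps: `annotate`, `fake_llm`, and the inline prompt strings are illustrative only and do not use the prompt files or the `LLMConnector` from this commit. It shows that the preference step is skipped when no usable triple object was extracted.

```python
from typing import Callable, Optional, Tuple

LLM = Callable[[str], str]


def annotate(
    statement: str, llm: LLM
) -> Tuple[str, Optional[tuple], Optional[float]]:
    """Run intent, triple, and preference steps in order (simplified sketch)."""
    # Step 1: intent; anything outside the known labels becomes UNKNOWN.
    intent = llm(f"intent: {statement}").strip()
    if intent not in {"ADD", "GET", "DELETE"}:
        intent = "UNKNOWN"
    # Step 2: triple; require exactly three pipe-separated fields.
    parts = [p.strip() for p in llm(f"triple: {statement}").split("|")]
    triple = (
        tuple(None if p == "N/A" else p for p in parts)
        if len(parts) == 3
        else None
    )
    # Step 3: preference, asked only when there is an object to ask about.
    weight = None
    if triple is not None and triple[2] is not None:
        answer = llm(f"preference for {triple[2]}: {statement}").strip()
        weight = float(answer) if answer in {"1", "-1"} else None
    return intent, triple, weight


def fake_llm(prompt: str) -> str:
    """Canned responses standing in for a real LLM call."""
    if prompt.startswith("intent"):
        return "ADD"
    if prompt.startswith("triple"):
        return "I | like | cats"
    return "1"
```

Passing the LLM in as a callable keeps the parsing logic testable offline, which may be useful when evaluating prompts against the annotation CSV.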
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
"""Abstract class for entity linking.""" | ||
|
||
|
||
from abc import ABC, abstractmethod | ||
|
||
from pkg_api.core.annotations import PKGData | ||
|
||
|
||
class EntityLinker(ABC): | ||
"""Entity linker for linking entities to the PKG or available KGs.""" | ||
|
||
@abstractmethod | ||
def link_annotation_entities(self, pkg_data: PKGData) -> PKGData: | ||
"""Resolves the pkg data annotations if possible. | ||
Args: | ||
pkg_data: The PKG data to be resolved. | ||
Raises: | ||
NotImplementedError: If the method is not implemented. | ||
Returns: | ||
The resolved PKG data annotations. | ||
""" | ||
raise NotImplementedError |
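The abstract linker leaves the resolution strategy open. As one possible illustration, the sketch below resolves string mentions against an in-memory lookup table and leaves unknown mentions untouched. `DictionaryLinker`, its `link` method, and the `example.org` URI are all hypothetical; a concrete `EntityLinker` would apply such a lookup to the fields of a `PKGData` instance.

```python
from typing import Dict, Optional


class DictionaryLinker:
    """Hypothetical linker backed by an in-memory lookup table."""

    def __init__(self, table: Dict[str, str]) -> None:
        # Lowercase the keys so that lookups are case-insensitive.
        self._table = {key.lower(): uri for key, uri in table.items()}

    def link(self, mention: Optional[str]) -> Optional[str]:
        """Return the URI for a mention, or the mention itself if unknown."""
        if mention is None:
            return None
        return self._table.get(mention.lower(), mention)


# The URI below is made up for the example.
linker = DictionaryLinker({"Tom Hanks": "http://example.org/Tom_Hanks"})
```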