Merge branch 'main' into feature/add-population-frontend
NoB0 committed Feb 11, 2024
2 parents 13d3b9b + 3489b9a commit 6139bf0
Showing 59 changed files with 3,140 additions and 396 deletions.
12 changes: 11 additions & 1 deletion .github/workflows/ci.yaml
Original file line number Diff line number Diff line change
@@ -33,6 +33,9 @@ jobs:
pip install --upgrade pip
pip install -r requirements.txt
- name: Install Graphviz
uses: tlylt/install-graphviz@v1

- name: Run black
shell: bash
run: pre-commit run black --all-files
@@ -68,6 +71,9 @@ jobs:
pip install --upgrade pip
pip install -r requirements.txt
- name: Install Graphviz
uses: tlylt/install-graphviz@v1

- name: Run mypy
shell: bash
run: pre-commit run mypy --all-files
@@ -92,6 +98,10 @@ jobs:
pip install --upgrade pip
pip install -r requirements.txt
pip install pytest-github-actions-annotate-failures
- name: Install Graphviz
uses: tlylt/install-graphviz@v1


- name: PyTest with code coverage
continue-on-error: true
@@ -191,4 +201,4 @@ jobs:
Current Branch | Main Branch |
| ------ | ------ |
![Coverage Badge](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/NoB0/8446f35dc373966dc971fb9237483cce/raw/coverage.${{ env.REPO_NAME }}.${{ github.event.number }}.json) | ![Coverage Badge](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/NoB0/8446f35dc373966dc971fb9237483cce/raw/coverage.${{ env.REPO_NAME }}.main.json) |
edit-mode: replace
edit-mode: replace
3 changes: 3 additions & 0 deletions .github/workflows/merge.yaml
@@ -24,6 +24,9 @@ jobs:
pip install --upgrade pip
pip install -r requirements.txt
pip install pytest-github-actions-annotate-failures
- name: Install Graphviz
uses: tlylt/install-graphviz@v1

- name: PyTest with code coverage
continue-on-error: true
65 changes: 60 additions & 5 deletions README.md
@@ -1,13 +1,68 @@
# IAI Project Template
# PKG API: A Tool for Personal Knowledge Graph Management

[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
![Coverage Badge](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/NoB0/8446f35dc373966dc971fb9237483cce/raw/coverage.pkg-api.main.json)
![Python version](https://img.shields.io/badge/python-3.9-blue)

This repository serves as a template for software projects.
The PKG API is a tool for managing personal knowledge graphs (PKGs). It provides a simple solution for end users and service providers to administrate and interact with the users' PKGs through natural language statements and simple web forms.
The representation of a statement inside the PKG is defined by the [PKG vocabulary](http://w3id.org/pkg/).
Within the API, two key modules are present: one for processing natural language statements ([NL2PKG](#nl2pkg)), and another for generating and executing SPARQL queries against the PKG ([PKG connector](#pkg-connector)).

# Testing and GitHub actions
The PKG API is served as a RESTful API and we provide a user interface, PKG Client, that allows users to manage their PKG online.

Using `pre-commit` hooks, `flake8`, `black`, `mypy`, `docformatter`, and `pytest` are locally run on every commit. For more details on how to use `pre-commit` hooks see [here](https://github.com/iai-group/guidelines/tree/main/python#install-pre-commit-hooks).
![Overview](docs/source/_static/PKG_API_overview.png)

Similarly, GitHub Actions are used to run `flake8`, `black` and `pytest` on every push and pull request. The `pytest` results are sent to [CodeCov](https://about.codecov.io/) using their API to obtain test coverage analysis. Details on GitHub Actions are [here](https://github.com/iai-group/guidelines/blob/main/github/Actions.md).
## PKG API

### NL2PKG

This module is responsible for processing natural language statements. The processing is divided into two steps: (1) natural language understanding, handled by [`annotators`](pkg_api/nl_to_pkg/annotators), and (2) entity linking, handled by [`entity_linking`](pkg_api/nl_to_pkg/entity_linking).

Available annotators and entity linkers:

* [`StatementAnnotator`](pkg_api/nl_to_pkg/annotators/annotator.py)
  - [`ThreeStepStatementAnnotator`](pkg_api/nl_to_pkg/annotators/three_step_annotator.py): Annotates statements using a three-step approach: (1) intent recognition, (2) subject-predicate-object triple extraction, and (3) preference extraction.
* [`EntityLinker`](pkg_api/nl_to_pkg/entity_linking/entity_linker.py)
  - [`RELEntityLinker`](pkg_api/nl_to_pkg/entity_linking/rel_entity_linking.py): Links entities using the [Radboud Entity Linker](https://rel.readthedocs.io/en/latest/) API.
  - [`SpotlightEntityLinker`](pkg_api/nl_to_pkg/entity_linking/spotlight_entity_linker.py): Links entities using DBpedia Spotlight.

### PKG connector

The PKG connector is responsible for executing SPARQL queries against the PKG.
[Utility functions](pkg_api/utils.py) are responsible for generating SPARQL queries based on the intent of the user. For example, if a user wants to add a statement to the PKG, a tailored INSERT query is generated.
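
As a rough illustration of the query generation described above, the sketch below builds an `INSERT DATA` update for a single triple. The function name `build_insert_query` and the `example.org` IRIs are hypothetical; the actual helpers in `pkg_api/utils.py` are not shown here and presumably produce richer statements based on the PKG vocabulary.

```python
# Hypothetical helper, not part of pkg_api: builds a SPARQL update string.
INVALID_IRI_CHARS = '<>"{}|^`\\ \n\t'


def build_insert_query(subject: str, predicate: str, obj: str) -> str:
    """Builds a SPARQL INSERT DATA update for one triple of IRIs.

    Literals and blank nodes are out of scope for this sketch.
    """
    for term in (subject, predicate, obj):
        # Reject characters that are not allowed inside an IRI reference,
        # which also prevents breaking out of the <...> delimiters.
        if any(ch in term for ch in INVALID_IRI_CHARS):
            raise ValueError(f"Not a valid IRI: {term!r}")
    return f"INSERT DATA {{ <{subject}> <{predicate}> <{obj}> . }}"


query = build_insert_query(
    "http://example.org/pkg/alice",
    "http://example.org/pkg/likes",
    "http://example.org/pkg/cats",
)
print(query)
```

A string of this shape is the kind of update that could be handed to the connector's `execute_sparql_update` method.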

### Server

The backend server is a [Flask](https://flask.palletsprojects.com/en/3.0.x/) server. It is responsible for connecting the users and service providers to PKGs.

#### Starting the server

Before starting the server, make sure that the [requirements](requirements.txt) are installed and that CORS is disabled in your web browser.

To start the server, run the following command:

```bash
flask --app pkg_api/server run --debug
```

Note that the `--debug` flag is optional, but it is recommended during development.

By default, the server runs locally on port 5000. To run the server on a different port, specify it with the `--port` flag.

## PKG Client

The user interface is a React application that communicates with the server to manage the PKG. More details on how to run PKG Client can be found [here](pkg-client/README.md).

:warning: Note that you need to update `PKG_API_BASE_URL` in the [configuration](pkg_client/public/config.json) if the server is not running on the default port.

## Demo

<https://github.com/iai-group/pkg-api/assets/28621493/cf51ab83-b7c9-4441-93c9-abd2bae18a98>

## Conventions

We follow the [IAI Python Style Guide](https://github.com/iai-group/styleguide/tree/main/python).

## Contributors

PKG API is developed and maintained by the [IAI group](https://iai.group/) at the University of Stavanger.
7 changes: 7 additions & 0 deletions config/entity_linking/dbpedia_spotlight.yaml
@@ -0,0 +1,7 @@
url: "https://api.dbpedia-spotlight.org/en/annotate"
headers:
accept: "application/json"
params:
confidence: 0.5
support: 50
types: null
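
To make the role of these settings concrete, here is a minimal sketch of how such a configuration could be turned into a DBpedia Spotlight request. How `SpotlightEntityLinker` actually consumes the file is not shown in this commit, so `build_request` and the inlined `CONFIG` dictionary are assumptions; the sketch only constructs the request and does not send it.

```python
import urllib.parse
import urllib.request

# Values copied from config/entity_linking/dbpedia_spotlight.yaml.
CONFIG = {
    "url": "https://api.dbpedia-spotlight.org/en/annotate",
    "headers": {"accept": "application/json"},
    "params": {"confidence": 0.5, "support": 50, "types": None},
}


def build_request(text: str, config: dict = CONFIG) -> urllib.request.Request:
    """Builds (but does not send) a GET request for the annotate endpoint."""
    # Drop parameters set to null so they are not sent as the string "None".
    params = {k: v for k, v in config["params"].items() if v is not None}
    params["text"] = text
    url = f"{config['url']}?{urllib.parse.urlencode(params)}"
    return urllib.request.Request(url, headers=config["headers"])


request = build_request("I like Pulp Fiction")
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the actual call; that step is left out so the example runs offline.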
3 changes: 2 additions & 1 deletion data/README.md
@@ -1,3 +1,4 @@
# Data

This folder should contain the description of data that was used in the project, including datasets, queries, ground truth, run files, etc. Files under 10MB can be stored on GitHub, larger files should be stored on a server (e.g., gustav1). This README should provide a comprehensive overview of all the data that is used and where it originates from (e.g., part of an official test collection, generated using code in this repo or a third-party tool, etc.).
* `llm_prompts`: LLM prompts are stored in this folder.
* `nl_annotations`: Evaluation dataset is contained in this folder.
12 changes: 12 additions & 0 deletions data/llm_prompts/cot/intent.txt
@@ -0,0 +1,12 @@
What is the user's intent with the following statement? The options are:

ADD - Statement of fact or preference, examples: "I live in Stavanger", "David doesn't like Matrix"
GET - Asking for information, examples: "Do I live in Stavanger?", "Does David like the Matrix?"
DELETE - Request to delete or remove an item, examples: "I dont live in Stavanger anymore", "Remove Matrix from my library"
UNKNOWN - None of the above, examples: "How many movies have I watched?", "How is the weather today?"

Statement:
------------------------------
{statement}
------------------------------
Answer:
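
The prompt above leaves a `{statement}` placeholder to be filled and expects one of four labels back. A minimal sketch of both steps follows; the names `Intent`, `fill_prompt`, and `parse_intent` are hypothetical and may not match the annotator code in this repository, and `PROMPT_TAIL` reproduces only the last lines of the template.

```python
from enum import Enum


class Intent(Enum):
    """The four labels listed in the prompt."""

    ADD = "ADD"
    GET = "GET"
    DELETE = "DELETE"
    UNKNOWN = "UNKNOWN"


PROMPT_TAIL = (
    "Statement:\n"
    "------------------------------\n"
    "{statement}\n"
    "------------------------------\n"
    "Answer:"
)


def fill_prompt(template: str, statement: str) -> str:
    # str.replace is used instead of str.format so that stray braces in the
    # template or in the user's statement cannot raise an error.
    return template.replace("{statement}", statement)


def parse_intent(reply: str) -> Intent:
    """Maps the first token of a model reply to an Intent."""
    tokens = reply.strip().split()
    label = tokens[0].strip(".,:;\"'").upper() if tokens else ""
    try:
        return Intent(label)
    except ValueError:
        # Anything outside the four labels falls back to UNKNOWN.
        return Intent.UNKNOWN


print(fill_prompt(PROMPT_TAIL, "I live in Stavanger"))
print(parse_intent(" add\n"))
```

Falling back to `UNKNOWN` is a design choice in this sketch: it keeps a malformed reply from being treated as a write operation.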
19 changes: 19 additions & 0 deletions data/llm_prompts/cot/preference.txt
@@ -0,0 +1,19 @@
What is the preference towards "{object}" in the following statement? Answer 1 for positive, -1 for negative, or N/A when sentiment is not applicable.

Example:
------------------------------
I like cats.
------------------------------
Answer: 1

Example:
------------------------------
I hate romcom
------------------------------
Answer: -1

Statement:
------------------------------
{statement}
------------------------------
Answer:
19 changes: 19 additions & 0 deletions data/llm_prompts/cot/triple.txt
@@ -0,0 +1,19 @@
Return pipe-separated subject, predicate, and object from the following statement. If a field is not applicable, output N/A. If there are multiple subjects and objects, prefer the one which is about a preference or fact mentioned in the statement.

Example:
------------------------------
I like cats.
------------------------------
Answer: I | like | cats

Example:
------------------------------
Hello John.
------------------------------
Answer: N/A | N/A | John

Statement:
------------------------------
{statement}
------------------------------
Answer:
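
The answer format requested above (three pipe-separated fields, with `N/A` for a missing field) can be parsed along these lines. This is a sketch under the assumption that the model follows the format exactly; `parse_triple` is a hypothetical name, not a function from this repository.

```python
from typing import Optional, Tuple

Triple = Tuple[Optional[str], Optional[str], Optional[str]]


def parse_triple(reply: str) -> Triple:
    """Splits 'subject | predicate | object' and maps N/A to None."""
    parts = [part.strip() for part in reply.split("|")]
    if len(parts) != 3:
        raise ValueError(f"Expected 3 pipe-separated fields, got {len(parts)}")
    subject, predicate, obj = (
        None if not part or part.upper() == "N/A" else part for part in parts
    )
    return subject, predicate, obj


print(parse_triple("I | like | cats"))
print(parse_triple("N/A | N/A | John"))
```

Raising on a wrong field count, rather than guessing, makes format drift in the model's replies visible early.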
12 changes: 12 additions & 0 deletions data/llm_prompts/default/intent.txt
@@ -0,0 +1,12 @@
What is the user's intent with the following statement? The options are:

ADD - Statement of fact or preference, typically contains the verbs "is", "likes", "prefers", "told me", "admires", "went to", "am", "has seen".
GET - Asking for information, typically it starts with "Do I", "Have I" etc.
DELETE - Request to delete or remove an item, typically contains the verbs "remove", "delete" or "discard"
UNKNOWN - None of the above, statement may contain more than one intent.

Statement:
------------------------------
{statement}
------------------------------
Answer:
7 changes: 7 additions & 0 deletions data/llm_prompts/default/preference.txt
@@ -0,0 +1,7 @@
What is the sentiment towards "{object}" in the following statement? Answer 1 for positive, -1 for negative, or N/A when sentiment is not applicable.

Statement:
------------------------------
{statement}
------------------------------
Answer:
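
This template has two placeholders, `{object}` and `{statement}`, and asks for `1`, `-1`, or `N/A`. One possible way to fill it and interpret the reply is sketched below; `fill_preference_prompt` and `parse_preference` are hypothetical names, and `QUESTION` reproduces only the first line of the template.

```python
from typing import Optional

QUESTION = 'What is the sentiment towards "{object}" in the following statement?'


def fill_preference_prompt(template: str, obj: str, statement: str) -> str:
    # Plain replacement avoids str.format errors on braces in user input.
    return template.replace("{object}", obj).replace("{statement}", statement)


def parse_preference(reply: str) -> Optional[int]:
    """Returns 1, -1, or None when the reply is N/A or unrecognized."""
    answer = reply.strip()
    if answer in {"1", "+1"}:
        return 1
    if answer == "-1":
        return -1
    return None


print(fill_preference_prompt(QUESTION, "cats", "I like cats."))
print(parse_preference("-1"))
```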
19 changes: 19 additions & 0 deletions data/llm_prompts/default/triple.txt
@@ -0,0 +1,19 @@
Return pipe-separated subject, predicate, and object from the following statement. If a field is not applicable, output N/A.

Example:
------------------------------
I like cats.
------------------------------
Answer: I | like | cats

Example:
------------------------------
Hello John.
------------------------------
Answer: N/A | N/A | John

Statement:
------------------------------
{statement}
------------------------------
Answer:
31 changes: 31 additions & 0 deletions data/nl_annotations/test.csv
@@ -0,0 +1,31 @@
"Sentence", "Intent", "Subject", "Predicate", "Object", "Preference"
"Bob lives in New York.","ADD", "Bob", "lives in", "New York",
"Diana is a fan of Steven Spielberg's work.", "ADD", "Diana", "is a fan of", "Steven Spielberg work", 1
"Charlie doesn't prefer Steven Spielberg's movies.","ADD", "Charlie", "doesn't prefer", "Steven Spielberg movies", -1
"Alice admires Robert Downey Jr..","ADD", "Alice", "admires", "Robert Downey Jr.", 1
"Ethan is a fan of Interstellar.","ADD", "Ethan", "is a fan of", "Interstellar", 1
"Diana told me she loves movies directed by Quentin Tarantino.","ADD", "Diana", "loves", "movies directed by Quentin Tarantino", 1
"Bob's sister is Diana.","ADD", "Bob", "sister of", "Diana",
"Bob admires Emma Watson.","ADD", "Bob", "admires", "Emma Watson", 1
"Diana likes sci-fi movies.","ADD", "Diana", "likes", "sci-fi movies", 1
"Charlie admires Tom Hanks.","ADD", "Charlie", "admires", "Tom Hanks", 1
"Do I like Pulp Fiction?,","GET", "I", "like", "Pulp Fiction",
"Do I like movies directed by Mel Gibson?,","GET", "I", "like", "Mel Gibson movies",
"Do I prefer Pulp Fiction","GET", "I", "prefer", "Pulp Fiction", 1
"Do I prefer Rambo movies,","GET", "I", "prefer", "Rambo movies", 1
"Do I like movies featuring actors that have played Macbeth? Which of these movies do I prefer?,","UNKNOWN", , , ,
"Do I like action movies?,","GET", "I", "like", "Action movies",
"Do I prefer romantic comedies?,","GET", "I", "prefer", "Romantic comedies", 1
"Do I hate romantic comedies?,","GET", "I", "hate", "Action movies", -1
"How many action movies are stored in my PKG?,","UNKNOWN", "", "", "",
"Save The Godfather as my favourite movie,","ADD", "I", "favourite movie", "The Godfather", 1
"I enjoy watching action movies with Alice,","ADD", "I", "enjoy watching", "action movies with Alice", 1
"I went to the movie theatre yesterday and loved Oppenheimer,","ADD", "I", "loved", "Oppenheimer", 1
"Note that I would never watch cheesy romcom unless with my friends,","UNKNOWN", "", "", "",
"I am married to Bob,","ADD", "I", "married to", "Bob",
"My husband doesn't like romantic comedies,","ADD", "My husband", "dislikes", "romantic comedies", -1
"Emma is my mother,","ADD", "Emma", "is", "my mother",
"My husband has seen all the Christopher Nolan movies and a big fan,","ADD", "My husband", "big fan of", "Christopher Nolan movies", 1
"Remove Forrest Gump from my movie library.","DELETE","", "", "Forrest Gump",
"I am no longer married to Bob", "DELETE", "I", "married to", "Bob",
"Discard everything directed by Steven Spielberg.", "DELETE", "", "is directed by", "Steven Spielberg",
1 change: 1 addition & 0 deletions data/pkg_visualizations/README.md
@@ -0,0 +1 @@
# PKG Visualizations
Binary file added docs/source/_static/PKG_API_overview.png
32 changes: 28 additions & 4 deletions pkg_api/connector.py
@@ -1,10 +1,12 @@
"""Connector to triplestore."""
import os
from enum import Enum

from rdflib import Graph
from rdflib.query import Result

from pkg_api.pkg_types import URI
from pkg_api.core.namespaces import PKGPrefixes
from pkg_api.core.pkg_types import URI

# Method to create/load the RDF graph
# Method to execute the SPARQL query
@@ -34,25 +36,47 @@ def __init__(
rdf_store: Type of RDF store to use.
rdf_store_path: Path to the RDF store.
"""
self._rdf_store_path = f"{rdf_store_path}.ttl"
self._graph = Graph(rdf_store.value, identifier=owner)
self._bind_namespaces()
if os.path.exists(self._rdf_store_path):
self._graph.parse(self._rdf_store_path, format="turtle")
self._graph.open(rdf_store_path, create=True)

def _bind_namespaces(self) -> None:
"""Binds namespaces to the graph."""
for prefix, namespace in PKGPrefixes.__members__.items():
self._graph.bind(prefix.lower(), namespace.value)

def execute_sparql_query(self, query: str) -> Result:
"""Execute SPARQL query.
"""Executes SPARQL query.
Args:
query: SPARQL query.
"""
return self._graph.query(query)

def execute_sparql_update(self, query: str) -> None:
"""Execute SPARQL update.
"""Executes SPARQL update.
Args:
query: SPARQL update.
"""
self._graph.update(query)

def close(self) -> None:
"""Close the connection to the triplestore."""
"""Closes the connection to the triplestore."""
self.save_graph()
self._graph.close()

def save_graph(self) -> None:
"""Saves the graph to a file.
Raises:
FileNotFoundError: If the directory to store the graph does not
exist.
"""
directory = os.path.dirname(self._rdf_store_path)
if not os.path.exists(directory):
raise FileNotFoundError(f"Directory {directory} does not exist.")
self._graph.serialize(self._rdf_store_path, format="turtle")
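
One edge case in the directory check of `save_graph` may be worth noting: for a store path without a directory component, `os.path.dirname` returns an empty string, and `os.path.exists("")` is `False`, so the check would raise even though the current directory exists. The sketch below illustrates this with the standard library only and shows one possible alternative that creates the directory instead; `ensure_parent_directory` is a hypothetical helper, not part of this commit, and it deliberately behaves differently from the raising version above.

```python
import os
import tempfile


def ensure_parent_directory(path: str) -> str:
    """Creates the parent directory of path if needed and returns it."""
    # An empty dirname means "current directory".
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    return directory


# A bare filename has an empty dirname, which os.path.exists rejects.
print(repr(os.path.dirname("store.ttl")), os.path.exists(""))

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "nested", "store.ttl")
    created = ensure_parent_directory(target)
    created_ok = os.path.isdir(created)
    print(created_ok)
```

Whether to raise or to create the directory is a design decision; raising, as the commit does, surfaces misconfigured paths, while creating the directory is more forgiving on first run.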