Implementation of Dense Retrievals #49
Hey @hosseinfani, looking ahead, our next steps involve obtaining the clueweb12, clueweb09, and gov2 datasets. As with robust04, gov2 requires us to sign a contract, after which they will send us a copy of the drive, as explained here.
Hi @DelaramRajaei
@hosseinfani
Hi @hosseinfani, I wanted to give you an update on the indexing process. I downloaded the antique and dbpedia corpora and converted them to the jsonl format required by the documentation. I uploaded the jsonl files to Teams > RePir channel > files > Datasets & indexes > Corpora. Currently, I'm facing an issue when using pyserini for indexing. There seemed to be a conflict with pygaggle, so I removed pygaggle and used other libraries instead. However, I'm still encountering some issues with the library. Hi @yogeswarl, I noticed that you created the dense indexes for
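For anyone repeating the corpus conversion mentioned above, here is a minimal sketch of it. It assumes the raw corpus is a two-column, tab-separated file (document id, then text), which may not match every dataset, and it writes one JSON object per line with the `id` and `contents` fields that pyserini's JSON collections expect. The function name `tsv_to_pyserini_jsonl` and the file paths are hypothetical and not part of the repository.

```python
import json


def tsv_to_pyserini_jsonl(tsv_path, jsonl_path):
    """Convert a two-column TSV (doc id, text) into JSONL with `id` and `contents`.

    Assumption: each input line is "<doc_id>\t<text>"; lines with no text are skipped.
    Returns the number of documents written.
    """
    count = 0
    with open(tsv_path, encoding="utf-8") as src, \
            open(jsonl_path, "w", encoding="utf-8") as dst:
        for line in src:
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip blank lines
            # Split on the first tab only, so tabs inside the text are preserved.
            doc_id, _, text = line.partition("\t")
            if not text.strip():
                continue  # skip documents with an empty body
            record = {"id": doc_id, "contents": text}
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

A call such as `tsv_to_pyserini_jsonl("corpus.tsv", "corpus.jsonl")` (hypothetical paths) would produce a file that can be placed in the input directory passed to the indexer. Corpora in other formats, for example a JSON or XML dump, would need a different reader but the same output fields.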
In this issue, I will keep a record of all my findings as I work on refining all aspects of the retrieval system on different datasets using dense retrieval.