Created artist synonym generator and artist list files #305

Open, wants to merge 5 commits into base: master

adaveinthelife
Collaborator

I wrote this script to address issue #302, which requested a script that generates CSV files for the Assistant and Alexa containing synonyms for artist names.

To run the script, type artist-synonyms-generator.sh artists.txt

The artists.txt file contains the list of artists we currently have in the collections.

@adaveinthelife adaveinthelife requested a review from hyzhak July 30, 2018 16:10
Member

@hyzhak hyzhak left a comment

Good :)
There are a few things that should be fixed.

I also got an error:

./artist-synonyms-generator.sh: line 63: : No such file or directory

any ideas why?
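
A note on the error above: bash prints an empty name before `: No such file or directory` when a quoted redirection target expands to an empty string, so one plausible but unconfirmed cause is that the script was run without the artists file argument. A minimal sketch of a guard, assuming the read loop takes its input from `$1`; `check_input_file` is a hypothetical helper, not part of the PR:

```shell
#!/bin/bash
# Hypothetical helper (not in the PR): validate the input file argument
# before the read loop tries to redirect from it.
check_input_file() {
  local input_file="$1"
  if [[ -z "$input_file" ]]; then
    echo "usage: artist-synonyms-generator.sh <artists-file>" >&2
    return 1
  fi
  if [[ ! -r "$input_file" ]]; then
    echo "cannot read input file: $input_file" >&2
    return 1
  fi
  return 0
}

# Assumed usage inside the script:
#   check_input_file "$1" || exit 1
#   while IFS='' read -r line || [[ -n "$line" ]]; do ...; done < "$1"
```

With a guard like this, a missing argument produces a usage message instead of a failure at the end of the loop.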

And please try to run it locally as well, to check that everything works fine.

Be sure to exclude the CSV files from the repository ;)

trap "kill $!" EXIT #Die with parent if we die prematurely

#Overwrite current synonym files if they exist
echo -n > alexa-artist-list-with-synonyms.csv
Member

It seems we should add both CSV files to .gitignore.

#While loop to read through each line of artists list
while IFS='' read -r line || [[ -n "$line" ]]; do
name="$line"
echo -n $name"," >> alexa-artist-list-with-synonyms.csv
Member

It seems you are mixing spaces and tabs here. Please use spaces only.

echo -n $name | sed 's/and.*//'>> google-artist-list-with-synonyms.csv
echo -n "\"">> google-artist-list-with-synonyms.csv
fi
if [[ $name == "The"* ]]; then
Member

It seems this could also match words like Therapy, Thelda, theft, etc. We don't have such artists now, but we may have them later.
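
One way to address this, sketched under the assumption that the article is always written as `The` followed by a single space: include the space in the quoted prefix so that names such as Therapy no longer match. The function names below are hypothetical and the artist names are made up for illustration.

```shell
#!/bin/bash
# Returns success only when the name begins with the separate word "The".
# The trailing space inside the quotes is what excludes "Therapy" or "Thelda".
starts_with_article_the() {
  local name="$1"
  [[ $name == "The "* ]]
}

# Prints the name without a leading "The ", or unchanged if there is none.
strip_leading_the() {
  local name="$1"
  if starts_with_article_the "$name"; then
    printf '%s' "${name#The }"
  else
    printf '%s' "$name"
  fi
}

strip_leading_the "The Example Band"   # prints: Example Band
echo
strip_leading_the "Therapy"            # prints: Therapy
echo
```

The match stays case-sensitive, as in the original condition.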

@@ -0,0 +1,67 @@
#!/bin/bash
Member

Please make this file executable:

chmod +x bin/artist-synonyms-generator.sh

@hyzhak
Member

hyzhak commented Aug 1, 2018

I have generated the Alexa and Google CSVs and found that the synonyms have an extra newline.
[Screenshot: 2018-08-01 at 15:58:25]
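
The cause is not confirmed in this thread. One candidate is the `echo -n $name | sed 's/and.*//'` step, since some `sed` implementations end their output with a newline even when the input has none. A minimal sketch that avoids the pipe by using bash parameter expansion; `strip_from_and` is a hypothetical name:

```shell
#!/bin/bash
# Prints the part of the name before the first "and", with no trailing newline.
# ${name%%and*} removes the longest suffix that starts with "and", which is the
# same text that sed 's/and.*//' deletes. Like the sed rule, it also matches
# "and" inside a longer word.
strip_from_and() {
  local name="$1"
  printf '%s' "${name%%and*}"
}

strip_from_and "Foo and Bar"   # prints "Foo " (the space before "and" is kept)
echo
```

Because `printf '%s'` never appends a newline, the output can be written straight into the CSV row.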

@hyzhak
Member

hyzhak commented Aug 1, 2018

I also tried to upload the entities to the Google Action (draft copy) and got an error:

Errors in 'google-artist-list-with-synonyms' entity: Name length is greater than 30 chars.
Entry value length is bigger than 512.

Once I renamed the file, I got:

Errors in 'google-artist' entity: Entry value length is bigger than 512.
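
To find the offending entries before uploading, a check along these lines may help. It is a sketch that assumes the 512 limit applies to a whole CSV line; the limit itself is taken from the error message above, and `report_long_entries` is a hypothetical helper.

```shell
#!/bin/bash
# Prints "<line number>:<length>" for every line longer than max_len characters.
# max_len defaults to 512, the value quoted in the upload error.
report_long_entries() {
  local csv_file="$1" max_len="${2:-512}"
  awk -v max="$max_len" 'length($0) > max { printf "%d:%d\n", NR, length($0) }' "$csv_file"
}

# Assumed usage:
#   report_long_entries google-artist-list-with-synonyms.csv 512
```

An empty result means no line exceeds the limit under that assumption.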

We may also want to add a step to the README describing what needs to be done to get the desired result.

@hyzhak
Member

hyzhak commented Aug 2, 2018

@adaveinthelife thanks for the improvement, it looks much better. I only found some extra commas (,) in cases where several different rules apply to the same artist.
[Screenshot: 2018-08-02 at 16:16:34]
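
Stray commas tend to appear when each rule appends its own separator. A possible alternative, sketched here for the plain comma-separated layout only (any quoting needed for the Google file would have to be added on top), is to collect the non-empty synonyms first and join them once. `join_by_comma` and `build_row` are hypothetical names.

```shell
#!/bin/bash
# Joins all arguments with a single comma between them.
join_by_comma() {
  local IFS=','
  printf '%s' "$*"
}

# Builds one CSV row: the artist name followed by its synonyms.
# Empty candidates and candidates equal to the name are skipped, so a rule
# that produced nothing cannot leave an extra comma behind.
build_row() {
  local name="$1"
  shift
  local fields=("$name") candidate
  for candidate in "$@"; do
    if [[ -n "$candidate" && "$candidate" != "$name" ]]; then
      fields+=("$candidate")
    fi
  done
  join_by_comma "${fields[@]}"
}

build_row "The Foo and Bar" "Foo and Bar" "" "The Foo"
echo
# prints: The Foo and Bar,Foo and Bar,The Foo
```

Each rule then only has to produce a candidate string, which may be empty, and the row is written in one step.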

.gitignore Outdated
@@ -8,4 +8,5 @@ node_modules
functions/firebase-debug.log
npm-debug.log
.DS_Store

bin/alexa-artist-list-with-synonyms.csv
Member

@hyzhak hyzhak Aug 2, 2018

Please update the names of the excluded files to the new ones.

@hyzhak
Member

hyzhak commented Aug 2, 2018

@adaveinthelife the Alexa Skill also fails on the new list of entities
[Screenshot: 2018-08-02 at 19:25:42]

in particular for the artist Joe Keyes The Late Bloomer. Please test manually to make sure the script works.

@adaveinthelife
Collaborator Author

adaveinthelife commented Aug 13, 2018

I'm still testing this, but I've realized something that may cause us an issue. As it currently stands, many of the artist names contain parentheses and square brackets. This is a problem because the system will not allow raw bulk entry with these characters, either in the artist name or in the synonyms, most likely because it would have no way of matching these artists: the special characters would never be translated to dialog. If I write a rule to remove these, will it break the music pipeline?
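
For reference, such a rule could look like the sketch below. It removes only the bracket characters and keeps the text inside them, which is an assumption about the intended behavior. `spoken_form` is a hypothetical name and the artist name is made up. Whether the pipeline can map the cleaned name back to the original is the open question above and is not answered here.

```shell
#!/bin/bash
# Prints the name with ( ) [ ] removed, runs of spaces squeezed to one,
# and a single leading or trailing space trimmed.
spoken_form() {
  local name="$1"
  name=$(printf '%s' "$name" | tr -d '()[]' | tr -s ' ')
  name="${name# }"
  name="${name% }"
  printf '%s' "$name"
}

spoken_form "Example Band (Live) [Remastered]"
echo
# prints: Example Band Live Remastered
```

Keeping the original name next to the cleaned one in a lookup file would make the change reversible, if the pipeline needs that.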

2 participants