Could encoding each word, with numeric entries masked, improve the final score over character-level classification? #11

shreeshiv · 2020-05-03T16:23:16Z

I wonder whether we could improve the final score on task 3 by encoding each word and masking numeric entries before classification, rather than classifying at the character level.

patrick22414 · 2020-05-22T11:31:20Z

Thank you @shreeshiv! Constructing a dictionary is indeed a valid approach and, I believe, a common practice in NLP. And yes, there is a solid chance that it would improve performance. However, it also has disadvantages: we would not be able to recognize a word outside the constructed dictionary, and it shifts more of the heavy lifting onto the encoding step.

In our case, we thought it very likely that non-dictionary words, such as abbreviations, shop names, or menu entries, would appear in the test set. Characters, on the other hand, are easy to encode, can handle new words, and have yielded satisfactory results.

However, I do encourage you to explore a word-based approach if you would like!
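For anyone who wants to try it, here is a minimal sketch of the trade-off discussed above. It is not the encoding used in this repository; the `<num>` and `<unk>` tokens, the numeric regex, and all function names are assumptions made up for illustration. It shows how a word dictionary with numeric masking collapses every unseen word into one index, while character codes keep the word intact.

```python
import re

NUM_TOKEN = "<num>"  # hypothetical mask for numeric entries (prices, dates)
UNK_TOKEN = "<unk>"  # hypothetical placeholder for out-of-dictionary words

# Assumed rule: a token is "numeric" if it has at least one digit and
# otherwise only digits or the separators . , / : -
NUMERIC = re.compile(r"[\d.,/:-]*\d[\d.,/:-]*")


def tokenize(text):
    """Lowercase, split on whitespace, and mask numeric-looking tokens."""
    return [
        NUM_TOKEN if NUMERIC.fullmatch(tok) else tok
        for tok in text.lower().split()
    ]


def build_vocab(texts):
    """Assign an integer index to every token seen in the training texts."""
    vocab = {UNK_TOKEN: 0, NUM_TOKEN: 1}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode_words(text, vocab):
    """Word-level encoding: unseen words all map to the <unk> index."""
    return [vocab.get(tok, vocab[UNK_TOKEN]) for tok in tokenize(text)]


def encode_chars(text):
    """Character-level encoding: every string is representable."""
    return [ord(c) for c in text]


train = ["TOTAL 12.50", "DATE 03/05/2020"]
vocab = build_vocab(train)

print(encode_words("TOTAL 9.90", vocab))     # known word + masked number
print(encode_words("KOPITIAM 4.20", vocab))  # shop name falls back to <unk>
print(encode_chars("KOPITIAM"))              # characters preserve the name
```

With the word-level encoder, a shop name that never appeared in training is indistinguishable from any other unknown word, which is the limitation described above. The character-level encoder has no such gap, at the cost of longer sequences.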
