Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Positive/negative highest ranking terms #31

Merged
merged 9 commits into from
Mar 22, 2024

Conversation

x-tabdeveloping
Copy link
Owner

No description provided.

@x-tabdeveloping x-tabdeveloping requested a review from rbroc March 21, 2024 13:18
Copy link
Collaborator

@rbroc rbroc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good! just a couple of comments/questions:

  • more of a matter of taste, but for less repeated code the logic that is now in _topics_table could actually be implemented as part of get_topics (as it was before)? not a biggie tho, can leave as is right now and potentially discuss this for later refactoring;
  • following the same logic as with keywords, should the user be able to see "lowest" ranking docs for a topic -- should there be an option to return negative documents in _highest_ranking_docs?

turftopic/base.py Show resolved Hide resolved
@x-tabdeveloping
Copy link
Owner Author

Regarding get_topics() I'm not sure about that one. I only included that method so that the uninitiated, who come over from e.g. Gensim, and do not know how topic models work in sklearn can get a nice JSON serializable Python object. So it's basically an interface for people coming from the more software-development side of things who don't like NumPy and Pandas magic.
On the other hand I don't personally like it, and I'm not sure it is worth the hustle to implement negative terms in it. What do you think @rbroc ?

@x-tabdeveloping
Copy link
Owner Author

Regarding the other point: Yes, totally! I just haven't implemented it yet. But if you look at the issue's description I already outlined that I wanna do it :D

@rbroc
Copy link
Collaborator

rbroc commented Mar 22, 2024

Regarding get_topics() I'm not sure about that one. I only included that method so that the uninitiated, who come over from e.g. Gensim, and do not know how topic models work in sklearn can get a nice JSON serializable Python object. So it's basically an interface for people coming from the more software-development side of things who don't like NumPy and Pandas magic.
On the other hand I don't personally like it, and I'm not sure it is worth the hustle to implement negative terms in it. What do you think @rbroc ?

Fine by me to leave as is for now - I like how “semantically” transparent get_topics is but this is something we can re-evaluate later. :)

Copy link
Collaborator

@rbroc rbroc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I’ll leave it to you to decide if for transparency you want to rename show_negative arguments to show_lowest.

@x-tabdeveloping x-tabdeveloping removed the request for review from jankounchained March 22, 2024 10:57
@x-tabdeveloping x-tabdeveloping merged commit 1639f28 into main Mar 22, 2024
1 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants