Extend OpenAI finish_reason handling #1985

Status: Open (10 commits to merge into base: master)

Conversation

@steven-solomon steven-solomon commented May 10, 2024

As part of #1979, I added:

  • explicit handling for the finish_reason values: length, content_filter, and stop
  • handling of finish_reason for any unspecified cases
  • logging of repr_doc_ids in all error cases to improve debugging
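
For illustration only, the branching described in the list above could look roughly like the sketch below. It is not the code from this PR: the helper name `label_from_output`, the placeholder labels, and the choice to keep truncated text for `length` are all assumptions, and the completion choice is faked with `SimpleNamespace` so the snippet runs without the OpenAI client.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("finish_reason_sketch")


def label_from_output(output, repr_doc_ids):
    """Hypothetical helper: turn one completion choice into a topic label."""
    reason = output.finish_reason
    if reason == "stop":
        # Normal completion: use the generated text as the label.
        return output.text.strip()
    if reason == "length":
        # Token limit reached: keep the truncated text (an assumption) and log it.
        logger.warning("Output truncated by token limit for document IDs: %s", repr_doc_ids)
        return output.text.strip()
    if reason == "content_filter":
        # Content was filtered: fall back to a placeholder label (an assumption).
        logger.warning("Output filtered for document IDs: %s", repr_doc_ids)
        return "Output content filtered"
    # Any finish_reason not handled explicitly above.
    logger.warning("Unexpected finish_reason %r for document IDs: %s", reason, repr_doc_ids)
    return "No label returned"


# Usage with a faked choice object
fake = SimpleNamespace(finish_reason="stop", text="  cats and dogs \n")
print(label_from_output(fake, [3, 17, 42]))
```

Logging the document IDs in every non-`stop` branch is what makes a failed label traceable to the documents that were sent in the prompt.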

Fixes #1979

@steven-solomon steven-solomon marked this pull request as ready for review May 10, 2024 14:58
Owner

@MaartenGr MaartenGr left a comment

Apologies for the delay, I left some comments here and there.

Review threads on bertopic/representation/_openai.py: 6 (5 marked outdated)

@steven-solomon
Author

@MaartenGr, I made the following updates based on your feedback:

  • Renamed choice to output
  • Introduced a logging prefix, OpenAI Topic Representation
  • Improved the copy of the logs based on your feedback
    • Modified references to doc_ids to use document IDs
    • Fixed typos
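
As a rough sketch of what a shared logging prefix can look like (not the code from this PR), the standard library's `logging.LoggerAdapter` can prepend the prefix to every message. The class name `PrefixAdapter`, the logger name, and the message wording are assumptions made for the example; only the prefix text comes from the list above.

```python
import logging

LOG_PREFIX = "OpenAI Topic Representation"


class PrefixAdapter(logging.LoggerAdapter):
    """Hypothetical adapter that prepends a fixed prefix to each log message."""

    def process(self, msg, kwargs):
        return f"{self.extra['prefix']} - {msg}", kwargs


logger = PrefixAdapter(logging.getLogger("prefix_sketch"), {"prefix": LOG_PREFIX})

# The message refers to "document IDs" rather than the internal name doc_ids.
logger.warning("No label returned for document IDs: %s", [3, 17, 42])
```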

Open questions:

@steven-solomon steven-solomon requested a review from MaartenGr May 22, 2024 19:31
@MaartenGr
Owner

@steven-solomon Apologies for the delay, the last few weeks were busy. It seems that there are some conflicts with the main branch. Other than that, I just started the pipeline and expect it to run without any issues. Moreover, it seems there might be an issue with selecting the correct output variable.

@MaartenGr MaartenGr mentioned this pull request Jun 6, 2024
4 tasks
@steven-solomon
Author

> Moreover, it seems there might be an issue with selecting the correct output variable.

I think I see what you mean. The definition of output on L221 should be up another level so it is accessible in L243. Is that correct or is there something else amiss?

I am planning on adding the change to use the model's logger in the next day or two.

@MaartenGr
Owner

@steven-solomon

> I think I see what you mean. The definition of output on L221 should be up another level so it is accessible in L243. Is that correct or is there something else amiss?

No, I meant that the response variable assigned on L240 and L242 is not doing anything:

                if self.exponential_backoff:
                    response = completions_with_backoff(self.client, model=self.model, prompt=prompt, **self.generator_kwargs)
                else:
                    response = self.client.completions.create(model=self.model, prompt=prompt, **self.generator_kwargs)

since they are not accessed on L243:

                label = output.text.strip()
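
A minimal sketch of the kind of fix this points at, assuming the label should come from the first choice of the response that was just received. It is not the actual patch: `fake_create` is a hypothetical stand-in for the completions call so the snippet runs without network access, and `extract_label` is an invented name.

```python
from types import SimpleNamespace


def fake_create(model, prompt, **kwargs):
    """Hypothetical stand-in for the completions call; mimics the response shape."""
    choice = SimpleNamespace(text="  topic label \n", finish_reason="stop")
    return SimpleNamespace(choices=[choice])


def extract_label(prompt, model="placeholder-model", **generator_kwargs):
    response = fake_create(model=model, prompt=prompt, **generator_kwargs)
    # Bind output from this response, so the label reflects the call just made
    # instead of a variable defined in another scope.
    output = response.choices[0]
    return output.text.strip()


print(extract_label("Describe this topic"))
```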

@steven-solomon
Author

@MaartenGr I have made the switch to pull the logger from the topic_model parameter and added a fix for L243.

@MaartenGr
Owner

@steven-solomon There is no topic_model.logger attribute so that wouldn't work, unfortunately. What was wrong with the previous approach?

Development

Successfully merging this pull request may close these issues.

Handle Responsible AI scenarios for OpenAI
2 participants