Reproducibility and Dimensionality Reduction Issues with BERTopic Using High-Dimensional Embeddings #2244

Answered by MaartenGr
Feng-Xin-yu asked this question in Q&A

Could this issue be due to an incompatibility between the stella_en_v5 model and the BERTopic library, or to something intrinsic to the model itself? I have tested this on multiple servers and observed the same problem.

It shouldn't be related to BERTopic, since all BERTopic does here is call the model to create embeddings. You could test creating the embeddings yourself with sentence-transformers and check whether they change when you run the encoding multiple times.
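As a minimal sketch of that test, assuming stella_en_v5 is loaded from the Hugging Face Hub (the checkpoint id `dunzhang/stella_en_v5` and the `trust_remote_code` flag are assumptions; substitute whatever you actually use), one could encode the same documents twice and compare the results:

```python
# Sketch: check whether the embedding model itself is deterministic,
# independently of BERTopic.
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Topic modeling with BERTopic.",
    "Reproducibility of sentence embeddings.",
]

# Assumed checkpoint id; replace with the exact model you are loading.
model = SentenceTransformer("dunzhang/stella_en_v5", trust_remote_code=True)

emb_1 = model.encode(docs, show_progress_bar=False)
emb_2 = model.encode(docs, show_progress_bar=False)

# If the encoder is deterministic, both runs should agree up to float noise.
print("max abs difference:", np.abs(emb_1 - emb_2).max())
print("identical within tolerance:", np.allclose(emb_1, emb_2, atol=1e-6))
```

If the embeddings only differ when the model is reloaded in a fresh process or across servers, the variation would point to the embedding model or its environment rather than to BERTopic itself.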

I also noticed a minor topic representation reproducibility issue: when using ChatGPT to assist with topic naming, the topic representations exhibit slight variations across runs. Is this caused by the inherent randomness of ChatGPT's responses?

That is related th…
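For reference, a minimal sketch of pinning down the LLM side, assuming the topic labels are produced through the OpenAI chat completions API (the model name and prompt below are placeholders): setting `temperature=0` and passing a fixed `seed` reduces, but does not guarantee eliminating, run-to-run variation in the generated names.

```python
# Sketch: request topic labels with settings that reduce response randomness.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {
            "role": "user",
            "content": "Give a short label for a topic about: "
                       "machine learning, embeddings, clustering.",
        }
    ],
    temperature=0,  # near-greedy decoding
    seed=42,        # best-effort determinism; identical outputs are not guaranteed
)
print(response.choices[0].message.content)
```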
