cc @cparish312 - you could take a look at another repo I made, which demonstrates how you could do this programmatically: https://github.com/jasonjmcghee/ragpipe
-
I was trying to use ActivityWatch along with a Chrome extension to generate activity logs. I'd send the logs to a summarization prompt that created a kind of diary of my activity. While it did a decent job of understanding generally what I was up to, I wanted to give it more complete data, and screen capture seemed like a decent approach. I looked into building my own script for this in Python, but the resulting text looked pretty jumbled. I haven't looked too deeply into this project yet, but the results I was getting look similar to what I see when I print out allText from the SQLite DB or use the "Copy Recent Context" feature. I eventually moved from Python to macOS Shortcuts, which has a built-in image-to-text action, and the results from that are much better. I don't have a way to run it continuously and efficiently, though, so for now I just have a hotkey that converts screenshots on demand and sends them to an AI summarizer.
It would be cool to use something like this to continually process images into a narrative of the user's activity. If that works well, there might be some interesting follow-up work in building advisor agents that use the diary as context for advice, or to start a refresher conversation with me in the morning.
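To make the idea a bit more concrete, here is a rough sketch of how timestamped OCR text could be bucketed into time windows and turned into a diary-style summarization prompt. Everything in it is an assumption for illustration: the `Capture` structure, the 15-minute window, and the prompt wording are hypothetical and not taken from this project or its database schema. It also assumes naive (timezone-free) timestamps and leaves the actual LLM call out entirely.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable

EPOCH = datetime(1970, 1, 1)


@dataclass
class Capture:
    """One OCR result from a screenshot (hypothetical structure)."""
    taken_at: datetime  # naive timestamp, assumed to be local time
    app: str            # name of the frontmost application
    text: str           # text extracted from the screenshot


def group_into_windows(
    captures: Iterable[Capture],
    window: timedelta = timedelta(minutes=15),
) -> dict[datetime, list[Capture]]:
    """Bucket captures into fixed-size time windows, in chronological order."""
    buckets: dict[datetime, list[Capture]] = {}
    for cap in sorted(captures, key=lambda c: c.taken_at):
        # Floor the timestamp to the start of the window it falls in.
        start = EPOCH + ((cap.taken_at - EPOCH) // window) * window
        buckets.setdefault(start, []).append(cap)
    return buckets


def build_diary_prompt(
    captures: Iterable[Capture],
    window: timedelta = timedelta(minutes=15),
    max_chars: int = 500,
) -> str:
    """Build a plain-text prompt asking a model to write a diary entry."""
    lines = ["Summarize the following screen activity as a short diary entry.", ""]
    for start, caps in group_into_windows(captures, window).items():
        # dict.fromkeys drops duplicates while preserving first-seen order.
        apps = ", ".join(dict.fromkeys(c.app for c in caps))
        texts = dict.fromkeys(c.text.strip() for c in caps if c.text.strip())
        # Truncate so one busy window cannot dominate the prompt.
        snippet = " | ".join(texts)[:max_chars]
        lines.append(f"[{start:%H:%M}] apps: {apps}")
        lines.append(f"  text: {snippet}")
    return "\n".join(lines)
```

Dropping repeated text within a window is a cheap way to keep the prompt small, since consecutive screenshots of the same screen tend to produce the same text. The returned string would then go to whatever summarizer you already use.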
I think providing context to LLMs will allow people to have more useful interactions with them. It could even allow LLMs to drive certain conversations. While this could be very useful, it also opens the door to manipulation. As such, I agree it is important to do this as open source, so that users can be sure their data is secure and are aware of how LLMs are being instructed to engage with them.