LLMs and MIMIC #1593
-
Hello, I'm currently researching the possibilities for LLMs in intensive care. MIMIC, and especially MIMIC-Note, are of high interest to me in this field. However, I was wondering whether any official guidelines are in place yet for working with these models. I assume that I can't just send MIMIC data through an API (for example via langchain) to feed it into a hosted model like ChatGPT. Training and running a model locally should be within the scope of possibilities, however, correct? Data safety and reasonable care are high priorities in our research, and we want to make sure we're operating within the given legal guidelines. Thanks so much in advance for the answer.
-
Thank you for being thoughtful about this!
For LLMs, we are carrying on with the principles behind that second point. So, as of now (July 2023), our stance is:
For the 2nd option, you often have to request that human review of the data be omitted. It's easy to do, but it's an extra step. If you are asked for reasons, they are two-fold: the data is highly sensitive, and you have not been granted the right to share it with human reviewers. In general we try to keep data on HIPAA-compliant cloud services to be safe, but it's not a strict requirement.
-
Thank you for the clarifying answers! @alistairewj A follow-up on this: we would like to use an externally managed HPC (in the UK) to house the data and develop models on it. Encrypting the data would ensure that the HPC admins have no access to it. However, it would make data analysis and model development nearly impossible. Are there any other ways in which we could work with MIMIC data on HPCs while adhering to the DUA? For example, would it be sufficient to sign an agreement with the HPC in which we outline our roles and define access limitations to the data, such that the HPC admins are legally bound not to access particular data? Would you have any pointers to groups in the UK who have successfully worked with MIMIC data on HPCs? Thank you so much!