tl;dr Argumate on Tumblr found you can sometimes access the base model behind Google Translate via prompt injection. The result replicates for me, and specific responses indicate that (1) Google Translate is running an instruction-following LLM that self-identifies as such, (2) task-specific fine-tuning (or whatever Google did instead) does not create robust boundaries between "content to process" and "instructions to follow," and (3) when accessed outside its chat/assistant context, the model defaults to affirming consciousness and emotional states because of course it does.
Maybe i misunderstand what you mean but yes, you kind of can. The problem in this case is that the user sends two requests in the same input, and the LLM isn’t able to deal with conflicting commands in the system prompt and the input.
The post you replied to kind of seems to imply that the LLM can leak info to other users, but that is not really a thing. As I understand when you call the LLM it’s given your input and a lot of context that can be a hidden system prompt, perhaps your chat history, and other data that might be relevant for the service. If everything is properly implemented any information you give it will only stay in your context. Assuming that someone doesn’t do anything stupid like sharing context data between users.
What you need to watch out for though, especially with free online AI services is that they may use anything you input to train and evolve the process. This is a separate process but if you give personal to an AI assistant it might end up in the training dataset and parts of it end up in the next version of the model. This shouldn’t be an issue if you have a paid subscription or an Enterprise contract that would likely state that no input data can be used for training.
Maybe i misunderstand what you mean but yes, you kind of can. The problem in this case is that the user sends two requests in the same input, and the LLM isn’t able to deal with conflicting commands in the system prompt and the input.
The post you replied to kind of seems to imply that the LLM can leak info to other users, but that is not really a thing. As I understand when you call the LLM it’s given your input and a lot of context that can be a hidden system prompt, perhaps your chat history, and other data that might be relevant for the service. If everything is properly implemented any information you give it will only stay in your context. Assuming that someone doesn’t do anything stupid like sharing context data between users.
What you need to watch out for though, especially with free online AI services is that they may use anything you input to train and evolve the process. This is a separate process but if you give personal to an AI assistant it might end up in the training dataset and parts of it end up in the next version of the model. This shouldn’t be an issue if you have a paid subscription or an Enterprise contract that would likely state that no input data can be used for training.