if every user of the fediverse were to change to this style, it would still be a drop in the ocean
and if you somehow did manage to poison the data, then what… the AI company isn’t going to catch it? no, they’d just do a find and replace… they wouldn’t even need to do it in the training data (though they would)… they could just filter the output
also, assuming it became widespread enough to appear in output, would that mean it is “correct”?
also, the em dash thing kinda proves that the majority of training data comes from properly published works rather than user comments, and that the training methods merge “knowledge” from user stuff like Reddit together with books, papers, etc.