among the reasons I use large pre-trained language models sparingly in my computer-generated poetry practice is that being able to know whose voices I'm speaking with is... actually important, as is understanding how the output came to have its shape (long thread, sorry)
understanding how the output came to have its shape is important because it's what makes it possible to experiment and iterate on poetic ideas. (I'm taking for granted that understanding a language model's training set is essentially the same thing as understanding its predictions)
if I train a language model on, say, a single novel, understanding the output of the model's predictions is at least somewhat intuitive. it's possible for me to draw conclusions about that novel (and my own aesthetic reactions to it) from what the model's predictions produce
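to make that concrete, here's a minimal sketch of the kind of small, traceable model I have in mind: a word-level Markov chain trained on a single novel's plain text. (the filename, the n-gram order, and the function names here are my own assumptions for illustration, not a description of any particular project.)

```python
import random
from collections import defaultdict

def train(tokens, order=2):
    """map each n-gram of words to the words observed to follow it"""
    model = defaultdict(list)
    for i in range(len(tokens) - order):
        context = tuple(tokens[i:i + order])
        model[context].append(tokens[i + order])
    return model

def generate(model, order=2, length=30):
    """walk the chain forward from a random starting context"""
    out = list(random.choice(list(model.keys())))
    for _ in range(length):
        successors = model.get(tuple(out[-order:]))
        if not successors:
            break
        out.append(random.choice(successors))
    return " ".join(out)

# "novel.txt" is a placeholder for any single novel's plain text
with open("novel.txt") as f:
    tokens = f.read().split()

print(generate(train(tokens)))
```

with a model this small, I can grep any generated phrase right back to its contexts in the novel, which is exactly the kind of intuition about the output I mean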
a novel is, of course (thx Bakhtin) already a distillation of many different voices. but if I make a poem from the predictions of a language model trained on a single novel, I can still plausibly trace the voices I'm speaking with & understand my relationship to them
just as important: with a (small) model like this, it's at least possible—if not always easy—for me to make a determination about whether or not I'm authorized to speak with those voices, both morally and legally
but large language models need large amounts of text to train, and as researchers have moved from one corpus to the next in pursuit of more tokens (brown, wikipedia, common crawl...) attribution of authorship has gone from "plausible but difficult" to "impossible and undesirable"
large pre-trained language models have the effect of flattening authorship at scale—of making it appear as though a single voice is speaking by drawing on many voices—of wresting the unequivocal from the equivocal—a "feigned impartiality" to paraphrase Safiya Noble
(sometimes this is what you want, of course, even for poetry! it's interesting to curate texts and consider their properties together, setting aside the ways that they're different in order to draw out similarity. I do this with poetry from Project Gutenberg, for example.)
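(and for what it's worth, at this scale the pooling can still keep attribution recoverable. a sketch, with made-up filenames standing in for Project Gutenberg plain-text files:)

```python
import random

# placeholder filenames standing in for Project Gutenberg plain-text files
sources = ["dickinson.txt", "whitman.txt", "rossetti.txt"]

corpus = []
for path in sources:
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                corpus.append((path, line))  # keep the source next to each line

# draw lines across authors; the attribution travels with each line
for source, line in random.sample(corpus, 5):
    print(f"{line}  [{source}]")
```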
but I think there's a reason that when people report their experiences using large pre-trained language models in order to generate creative output, they almost inevitably write about the model as though it's doing the talking: "gpt-3 wrote..." or "gpt-3 said..."—
—which is that people don't (can't?) understand where those words are coming from, and so attribute them to the *model*, instead of to (e.g.) common crawl as a whole, or to the individual writers whose work contributed to the predictions in the model's output
one consequence of this kind of authorship misattribution: if the model itself is seen as the source of the output, the owners of the model can dismiss any claim of ownership from those who contributed to the training data (cf https://nedroidcomics.tumblr.com/post/41879001445/the-internet)
but part of being a poet for me is being in dialogue with other writers, and so it's important for me to recognize and attribute my sources (whenever possible) when making work that remixes text resulting from other people's labor. large language models make this kinda impossible
(I don't think the legal copyright status of machine learning models and ML-generated works is especially important for the argument I'm making—as an artist, the ethical standards I'm held to regarding ownership & appropriation aren't coterminous with "is this technically legal")
otoh, as much as researchers advertise their pre-trained models as "general," even a corpus like common crawl is *incredibly specific*, incorporating a very narrow set of speakers & representing a very small slice of human linguistic competence (see e.g. https://www.bloomberg.com/news/newsletters/2020-07-24/ai-says-men-are-lazy)
so maybe my reluctance to make use of these models is just as much a function of how dreary and distressing it seems to co-write poetry with twelve years of chewed-up internet, regardless of how powerful the language model might be. shrug, the end