.@OpenAI GPT-3 Thoughts and Takeaways

Demos are fun, but let's discuss the details.

This thread covers sentence completion, trade-offs, few-shot learning, fine-tuning, technical takeaways, industry impacts, ethics, fun facts, and open questions.

cc @gdb

(1/13)
🤓 Short Intro

There are 4⃣ language models of increasing quality at the cost of increased latency: Ada, Babbage, Curie, Davinci.

There are 2⃣ API endpoints: Completion and Search. We’ll talk mostly about Completion because it’s the main endpoint.

(2/13)
Completion Parameters (example call after the list)

🥇 Prompt - Input text.
🥈 Max_tokens - Output token length.
🌡️ Temperature -
⬇️ = less random + more deterministic. ⬆️ = more “creative.”
4⃣ Top_p - Diversity via nucleus sampling.
5⃣ Frequency_penalty - ⬆️ = ⬇️ repetition.

(3/13)
6⃣ Presence_penalty - ⬆️ = ⬆️ new topics.
7⃣ N - Number of completions to generate.
🎱 Stream - Whether to stream back partial progress.
9⃣ Logprobs - High logprob = model is more confident.
🛑 Stop - Where the API will stop generating further tokens.

(4/13)
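To make these parameters concrete, here is a minimal sketch of a Completion call. It assumes the 2020-era `openai` Python client and an API key in the OPENAI_API_KEY environment variable; the prompt and parameter values are purely illustrative, not a recommended configuration.

```python
# Minimal sketch of a Completion call showing the parameters above.
# Assumes the 2020-era `openai` Python client; prompt and values are illustrative.
import os

import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.Completion.create(
    engine="davinci",        # one of: ada, babbage, curie, davinci
    prompt="Translate English to French: cheese =>",
    max_tokens=16,           # output token budget
    temperature=0.7,         # lower = more deterministic, higher = more "creative"
    top_p=1.0,               # nucleus sampling; usually tune this OR temperature
    frequency_penalty=0.0,   # raise to discourage repetition
    presence_penalty=0.0,    # raise to encourage new topics
    n=1,                     # number of completions to generate
    stream=False,            # set True to stream partial results back
    logprobs=5,              # also return log probabilities of the top 5 tokens
    stop=["\n"],             # stop generating at the first newline
)

print(response["choices"][0]["text"])
```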
I wonder whether GPT-3 would be good at writing a math book, because you'd want the prose to be creative but the mathematical parts to be logical and repeatable. You probably wouldn't want a math book that was creatively written and then says 2+2=5. (Toy sketch below.)

(7/13)
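A toy illustration of that trade-off, under the same assumptions as the sketch above (2020-era `openai` client, made-up prompts): sample the part that must be exact and repeatable at temperature 0, and the prose at a higher temperature.

```python
# Low temperature for the parts that must be repeatable, higher for the parts
# that should read well. Toy prompts; assumes openai.api_key is already set.
import openai

exact = openai.Completion.create(
    engine="davinci",
    prompt="Q: What is 2 + 2?\nA:",
    max_tokens=3,
    temperature=0.0,   # deterministic: always picks the most likely tokens
)

prose = openai.Completion.create(
    engine="davinci",
    prompt="Write an engaging introduction to a chapter on addition:",
    max_tokens=60,
    temperature=0.9,   # more varied, "creative" phrasing
)

print(exact["choices"][0]["text"].strip())
print(prose["choices"][0]["text"].strip())
```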
😯 Few-shot > Fine-tune

Back in the day (a few months ago), you needed to fine-tune a pre-trained model on a task-specific supervised dataset.

Today, with GPT-3, you get similar results by simply prepending a few task-specific examples to the prompt at inference time (sketch below).

(8/13)
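Here is a sketch of what prepending task-specific examples looks like in practice. The sentiment-classification task and example tweets are made up for illustration; the pattern is just a few labeled examples followed by a new input, with the model expected to continue it.

```python
# Few-shot "priming": no fine-tuning, just labeled examples in the prompt.
# Toy sentiment task; assumes openai.api_key is already set.
import openai

few_shot_prompt = """Tweet: I loved the new update, everything feels faster.
Sentiment: Positive

Tweet: The app keeps crashing on launch.
Sentiment: Negative

Tweet: Just tried the new release and it works great.
Sentiment:"""

response = openai.Completion.create(
    engine="davinci",
    prompt=few_shot_prompt,
    max_tokens=3,
    temperature=0.0,   # classification-style output: take the most likely label
    stop=["\n"],
)

print(response["choices"][0]["text"].strip())  # e.g. "Positive"
```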
🤑 Industry Impacts

@OpenAI will be competing with AI-as-an-API startups, like @rev, and big tech companies with ML solutions, like @googlecloud.

Bigger models need better hardware.

Companies will need to upgrade their ML serving infrastructure to serve them.

(10/13)
🧐 AI Ethics

The paper discusses social impact and potential misuse. @openai enabled a “Flag Toxicity” filter by default and lets users send feedback about “unsafe” content. They’re also working on a semantically deep toxicity filter built on the API.

(11/13)
🥳 Fun Facts

GPT - June 2018 release date, 150M parameters, 5GB training set.
GPT-2 - February 2019, 1.5B, 50GB.
GPT-3 - June 2020, 175B, 570GB.
GPT-4 - June 2021, 1.5T, 5.7TB.

GPT-4 predicted by GPT-3.

(12/13)
🤔 My Personal Open Questions

How deep is the model's understanding?
How do we optimize the parameters? Random search? (A toy sketch follows after this tweet.)
How do we evaluate the model, both in general and specifically with respect to priming?

If anyone has any ideas, please feel free to reply.

(13/13)
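On the parameter-optimization question: one naive answer is a random search over the sampling parameters, scored by whatever metric fits the task. Everything below (the ranges, the toy `score` function, the prompt) is a hypothetical sketch, not an established recipe.

```python
# Naive random search over sampling parameters, scored by a placeholder metric.
# Hypothetical sketch; assumes openai.api_key is already set.
import random

import openai

def score(text):
    # Placeholder metric: favor outputs with more distinct words.
    words = text.split()
    return len(set(words)) / (len(words) + 1)

best = None
for _ in range(10):
    params = {
        "temperature": random.uniform(0.0, 1.0),
        "top_p": random.uniform(0.5, 1.0),
        "frequency_penalty": random.uniform(0.0, 1.0),
    }
    resp = openai.Completion.create(
        engine="curie",
        prompt="Summarize in one sentence: GPT-3 is a 175B-parameter language model.",
        max_tokens=32,
        **params,
    )
    candidate = (score(resp["choices"][0]["text"]), params)
    if best is None or candidate[0] > best[0]:
        best = candidate

print(best)
```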