1/ So I spent a day trying to understand GPT-3: what it is and how it works, from the pov of a layperson.

Compiling my learnings here. 👇

Hope it helps a few to get a basic understanding of GPT-3 and how it works.

It's time for a

GPT-3 MEGATHREAD! 🔥🔥🔥
2/ Disclaimers:

I've dabbled with basic linear/logistic regression and Bayesian models earlier in my career. My knowledge is still pretty amateurish, and this thread isn't meant to be *precise* so much as it's meant to be a layperson-friendly peek into the black box that's GPT-3.
3/ Having said that, if you find any grievous mistakes, please reply and I'll add corrections to the thread.

All quotes used in this thread are from the GPT-3 paper.
4/ To start off, let's see how any deep learning model "learns."
5/ There are essentially two components to a machine learning model: PARAMETERS and DATA.

The model learns a set of PARAMETERS that it uses to identify and predict the nature of any given input. It learns these parameters from the DATA it is trained on.
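To make that concrete, here's a toy sketch (nothing to do with GPT-3's actual scale or architecture): learning the two PARAMETERS of a straight line from DATA by gradient descent. The data points and learning rate are made up for illustration, but the same basic loop, vastly scaled up, is what trains deep learning models.

```python
# Toy example: learn parameters w (slope) and b (intercept)
# of y = w*x + b from training data, via gradient descent.

data = [(1, 3), (2, 5), (3, 7), (4, 9)]  # points on y = 2x + 1

w, b = 0.0, 0.0   # parameters start at arbitrary values
lr = 0.01         # learning rate (step size), chosen by hand

for _ in range(5000):
    # Nudge each parameter in the direction that reduces the
    # average squared prediction error on the training data.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # close to 2.0 and 1.0
```

GPT-3 does the same kind of thing with 175 billion parameters instead of two, and a huge text corpus instead of four points.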
6/ As an analogy, we identify a cat using parameters like

does it have the shape of a cat?
features of a cat?
color of a cat?
do others call it a cat?
etc.

And we can do this based on our training dataset — what we already know about cats from our past.
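A rough sketch of that analogy in code: combine those feature checks with weights into a "cat-ness" score. The features, weights, and cutoff here are all invented for illustration, but this weighted-sum-then-squash pattern is the basic building block of many classifiers.

```python
import math

def cat_score(features, weights):
    # Weighted sum of feature checks, squashed into (0, 1)
    # with a logistic function.
    s = sum(weights[k] * features.get(k, 0) for k in weights)
    return 1 / (1 + math.exp(-s))

# Hypothetical weights "learned" from past cat-spotting experience.
weights = {"cat_shape": 2.0, "whiskers": 1.5, "fur": 1.0,
           "others_call_it_cat": 2.5, "bias": -3.0}

tabby   = {"cat_shape": 1, "whiskers": 1, "fur": 1,
           "others_call_it_cat": 1, "bias": 1}
toaster = {"fur": 0, "bias": 1}

print(cat_score(tabby, weights) > 0.5)    # True  -> call it a cat
print(cat_score(toaster, weights) > 0.5)  # False -> not a cat
```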
7/ For most of us, that would be something we learned as kids:

that this is a cat and this isn't a cat.

We often don't notice this parametric reasoning happening every time we see a cat because it happens subconsciously, in real time.