Hey, today is #MindblowingMonday 🤯!

I want to tell you about Language Models, the family of machine learning techniques behind most of the recent hype in natural language processing.

❓ Want to know more about them? 🧵👇
A language model is a computational representation of human language that captures which sentences are more likely to appear in a given language.

🎩 Formally, a language model is a probability distribution over the sentences in a language.
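
One standard way to make that concrete (notation mine, writing a sentence as a sequence of words w_1 … w_n): the chain rule factors the sentence probability into one conditional probability per word.

```latex
P(w_1, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})
```

The models below differ mainly in how they approximate each of those conditional terms.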

❓ What are they used for? 👇
⚙️ Language models allow computers to understand and manipulate language at least to some degree. They are used in machine translation, speech to text, optical character recognition, text generation, and many more applications!

They come in many flavors 👇
The simplest language model is the *unigram model*, also called a *bag of words* (BOW).

👉 In BOW, each word is assigned its own probability, and the probability of a sentence is computed as the product of its word probabilities, assuming all words are independent.

But of course, this isn't true.
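
Here's a minimal sketch of the unigram idea in Python (toy corpus and helper names are mine, just for illustration): estimate each word's probability from counts, then multiply them to score a sentence.

```python
from collections import Counter

# Toy corpus: in practice you would train on millions of sentences.
corpus = "the cat sat on the mat the dog sat on the rug".split()

counts = Counter(corpus)
total = sum(counts.values())

def word_prob(word):
    """P(word) estimated as its relative frequency in the corpus."""
    return counts[word] / total

def sentence_prob(sentence):
    """Unigram / bag-of-words score: product of independent word probabilities."""
    prob = 1.0
    for word in sentence.split():
        prob *= word_prob(word)
    return prob

print(sentence_prob("the cat sat on the mat"))
print(sentence_prob("the mat sat on the cat"))  # same score: word order is ignored!
```

Both test sentences get exactly the same score, because word order and context are ignored completely.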
For example, "water" is a more commonly used word than "philosophy", but the phrase "philosophy is the mother of science" is arguably much more likely than the phrase "water is the mother of science".

💡 The likelihood of a phrase depends upon all its words.
This dependency can be modelled with an n-gram model, in which the likelihood of each word is computed conditioned on the words that precede it in the phrase (within a window of size n).

💡 If we start a phrase with "philosophy", we are more likely to see the word "science" later on than "shark".
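
A minimal bigram (n=2) sketch of that idea, again with a toy corpus and hypothetical helper names: each word's probability is estimated conditioned on the previous word.

```python
from collections import Counter

corpus = "philosophy is the mother of science . water is wet .".split()

# Count single words and adjacent pairs (bigrams).
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def next_word_prob(prev, word):
    """P(word | prev) ≈ count(prev, word) / count(prev)."""
    return bigrams[(prev, word)] / unigrams[prev]

# "philosophy is" appears in the corpus, "water mother" does not.
print(next_word_prob("philosophy", "is"))   # 1.0 in this tiny corpus
print(next_word_prob("water", "mother"))    # 0.0
```

Real n-gram models add smoothing so unseen pairs don't get a probability of exactly zero.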
☝️ The problem with n-gram models is that the total number of parameters you need to store grows exponentially with n.

If you want to capture phrases of length n=10, you need N^10 numbers, where N is the number of words in the language!
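
A quick back-of-the-envelope check of that explosion (the vocabulary size is a made-up but realistic figure):

```python
N = 100_000   # assume a vocabulary of 100,000 words
n = 10        # context windows of length 10

print(f"{N**n:.0e} possible 10-grams")  # 1e+50 -- far more than you could ever store
```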
⭐ Neural language models (aka continuous space language models) are a solution to this exponential explosion.

They try to jointly learn a vector representation (aka an embedding) for every word, together with some mathematical operation on those vectors that approximates the likelihood.
⚙️ Neural language models are built by training a neural network to predict some relationships between words and the phrases in which they appear.

The most popular neural language model is possibly *word2vec*, trained to predict a word given a small window of words around it.
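
A minimal sketch of training such a model with the gensim library (assuming a recent gensim 4.x is installed; the toy sentences are just for illustration, real training needs a large corpus):

```python
from gensim.models import Word2Vec

# Each sentence is a list of tokens; real training uses millions of them.
sentences = [
    ["philosophy", "is", "the", "mother", "of", "science"],
    ["science", "is", "built", "on", "philosophy"],
    ["water", "is", "wet"],
]

# vector_size: dimension of the embeddings; window: context size around each word.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, epochs=50)

print(model.wv["science"][:5])                       # first few components of the embedding
print(model.wv.similarity("science", "philosophy"))  # cosine similarity between two words
```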
👉 Modern neural language models have more complex neural network architectures.

Popular examples are BERT and the family of GPT models, of which GPT-3 recently took Twitter by surprise with its ability to speak nonstop about anything, often without much sense.
😇 The nice thing about language models is that they can be trained independently of any NLP problem and then used inside specific applications with a little fine-tuning.
😇 They also improve efficiency. A big company (like OpenAI or Google) can train a big language model, and then the rest of us mortals can use it without having to pay millions in GPU training time.
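
For example, here's a minimal sketch of reusing a pretrained BERT through the Hugging Face transformers library (assuming it's installed; the model weights are downloaded on first use):

```python
from transformers import pipeline

# Load a pretrained masked language model; no GPU-scale training on our side.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

# Ask the model which words it finds most likely in the blank.
for prediction in unmasker("Language models are [MASK] to train."):
    print(prediction["token_str"], round(prediction["score"], 3))
```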

⚠️ But they don't come without issues 👇
🤔 Language models encode the "common" language found in their training data, so all the human biases in that data are implicitly stored in them.

For example, the phrase "boy is a programmer" is considered more likely by a model than "girl is a programmer", simply because the Internet has more examples of the first phrase.
☝️ If used without care, these language models will introduce subtle biases in your application that are very hard to discover and debug. Understanding and fixing these biases is one of the most exciting and important issues in AI safety!
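
One way to see this kind of bias for yourself, reusing the fill-mask pipeline from above (a rough probe, not a rigorous audit; exact scores depend on the model):

```python
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

# Compare the scores the model assigns to different completions of the blank.
for result in unmasker("[MASK] is a programmer."):
    print(result["token_str"], round(result["score"], 3))
```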
As usual, if you like this topic, reply in this thread or @ me at any time. Feel free to ❤️ like and 🔁 retweet if you think someone else could benefit from knowing this stuff.

🧵 Read this thread online at <https://apiad.net/tweetstorms/mindblowingmonday/languagemodels>