
GPT 2 - REVOLUTIONIZING AI TEXT GENERATION

Updated: Sep 9, 2020

In early 2019, OpenAI, one of the frontrunners in artificial intelligence co-founded by Elon Musk, released a paper detailing its latest accomplishment in the field of Natural Language Processing: the GPT 2 language model, a neural network designed to analyze and generate coherent text. It was trained (given examples of text to learn how language works) on 40 gigabytes of text from 8 million webpages. 40 gigabytes may not sound like a big deal, but for reference, the first 'Lord of the Rings' book comes in just shy of 1 megabyte. 40 gigabytes is over 40,000 times that!


Previously, most language models designed to generate text relied on a kind of 'memory' of the text they had already written or read. The problem is that these models could only remember a limited amount, so useful information was lost. GPT 2 is a type of model known as a transformer, which uses a mechanism called 'attention' to deal with this problem: instead of trying to keep everything it has read in mind, it focuses mostly on the information relevant to each part of the text it generates. For example, if you were asked to finish the sentence "Pointing at John, the child in the blue jacket let us know that he had seen Chris talk with...", you would probably complete it with the word 'him', because you would focus on the fact that 'John' is being referred to and is a male name. This is similar to what attention does, whereas previous methods would have tried to remember all of the preceding text and been bogged down with unnecessary details such as 'child', 'blue jacket', or 'Chris' using up their limited memory.
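To make the idea of attention a bit more concrete, here is a minimal sketch (not from the original post, and independent of GPT 2's actual code) of scaled dot-product attention, the core operation transformers use to weigh how relevant each position in a sequence is to every other position. The function names and toy data are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (sequence_length, d_k) matrices of queries, keys, and values.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # how relevant each position is to each other position
    weights = softmax(scores, axis=-1)   # normalize relevance into attention weights
    return weights @ V                   # weighted mix of values: focus mostly on relevant info

# Toy example: 4 tokens, each represented by a 3-dimensional vector.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 3))
output = scaled_dot_product_attention(tokens, tokens, tokens)  # self-attention
print(output.shape)  # (4, 3): each token is now a blend of the tokens it "attended" to
```

In the sentence example above, this is roughly how a weight on 'John' could end up much larger than the weights on 'child', 'blue jacket', or 'Chris' when predicting the final word.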


In fact, the model performed so well that, out of fear of its misuse, OpenAI initially released only a scaled-down version of the model with 117 million parameters (the numbers used in its calculations), less than 8% of the full model's 1.5 billion parameters, and even that version outperformed previous methods. This sparked controversy in the AI community, with many calling into question the 'Open' part of OpenAI's name. Many others defended OpenAI, pointing to the high potential for malicious use of GPT 2, such as bombarding products with well-written fake reviews on sites like Amazon, or mass-producing politically motivated fake articles and spreading them across the internet. Bot accounts on social media could also be used to sway public opinion.


Despite the possible ramifications of releasing the full model, OpenAI gradually released larger and larger versions of GPT 2, going from 117M parameters to 345M, then 774M, and finally the full 1.5 billion parameter model on November 5th, 2019. Now that it is fully released, many people have used it in interesting projects such as AI Dungeon 2, a text adventure game where you can type in anything and a repurposed GPT 2 model generates a response in text-adventure style.



There is also a Reddit community made up entirely of GPT 2-powered bots interacting with each other.

Generating text is not the only use for transformers; they can work with any data that comes in sequences (such as text, stock prices, and audio). Beyond generation, they can also be used for question answering, text summarization, speech recognition, language translation, and sentiment analysis (predicting things such as emotion, attitude, or opinion from text).
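As a concrete illustration of one of these other uses, here is a hedged sketch using the open-source Hugging Face `transformers` library (not mentioned in the original post) to run sentiment analysis with an off-the-shelf pretrained pipeline; the example sentence and the exact score are illustrative only.

```python
from transformers import pipeline

# Load a default pretrained sentiment-analysis model (downloaded on first use).
classifier = pipeline("sentiment-analysis")

# Classify an example sentence as positive or negative with a confidence score.
print(classifier("The new update is fantastic, everything runs so much faster!"))
# Expected output looks roughly like: [{'label': 'POSITIVE', 'score': 0.99...}]
```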


You can test the original model by having it generate a continuation of any input you give it at https://talktotransformer.com/. If you want to read the blog post or paper published by OpenAI, you can find them here.
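If you would rather run GPT 2 on your own machine than through the website, the released weights are also hosted by the Hugging Face `transformers` library. This is a minimal sketch under that assumption (it is not how the website itself works), using the smallest released model:

```python
from transformers import pipeline

# Load the smallest released GPT 2 model ("gpt2") as a text-generation pipeline.
generator = pipeline("text-generation", model="gpt2")

# Generate a continuation of a prompt; the output text will vary from run to run.
result = generator("In early 2019, OpenAI released", max_length=40, num_return_sequences=1)
print(result[0]["generated_text"])
```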
