UPDATED 20:52 EDT / FEBRUARY 14 2019


OpenAI holds back research on new system since it can write fake news

OpenAI, the Silicon Valley research firm funded by various tech leaders to better humanity, has decided not to release its full research on a revolutionary artificial intelligence system. The problem: It’s apparently so good that it can write coherent fake news.

The system, GPT-2, is a general-purpose language algorithm that uses machine learning to translate text, answer questions and predictively write text. The last of these capabilities is where the concern lies, because GPT-2 can create fake news based on something as simple as an opening sentence.

In one example, shared by Technology Review, GPT-2 wrote a full, coherent fake news article based on nothing more than the opening sentence, “Russia has declared war on the United States after Donald Trump accidentally …”

In another example, shared by the researchers, the algorithm created an entire article based on a human prompt, “In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English.”

GPT-2 isn’t perfect as it stands, and its success rate varies with the topic.

“We find that it takes a few tries to get a good sample, with the number of tries depending on how familiar the model is with the context,” the researchers explain. “When prompted with topics that are highly represented in the data (Brexit, Miley Cyrus, Lord of the Rings, and so on), it seems to be capable of generating reasonable samples about 50 percent of the time. The opposite is also true: on highly technical or esoteric types of content, the model can perform poorly.”

Although GPT-2 was designed to generate realistic text for positive applications, it could also be put to nefarious uses. Along with the generation of fake news, other potential uses include the impersonation of others online, automation of the production of abusive or faked content to post on social media, and automation of the production of spam and phishing content.

“Due to concerns about large language models being used to generate deceptive, biased, or abusive language at scale, we are only releasing a much smaller version of GPT-2 along with sampling code,” the researchers wrote. “We are not releasing the dataset, training code, or GPT-2 model weights.”

The decision to withhold the research may ultimately be moot, however.

“We are aware that some researchers have the technical capacity to reproduce and open source our results,” the researchers noted. “We believe our release strategy limits the initial set of organizations who may choose to do this, and gives the AI community more time to have a discussion about the implications of such systems.”

Image: mikemacmarketing/Flickr
