A more powerful version of OpenAI’s fake news generator has been made available online, just months after its creators described it as too dangerous to release.
Back in February, the San Francisco-based start-up co-founded by Elon Musk and Sam Altman announced the creation of a text-generating algorithm (GPT-2) – a program so mind-bendingly brilliant at spewing fictitious (but convincing) written passages that they decided it best to keep the technology under wraps lest it be used to create fake news. Instead, they released a modified (smaller) version of the program.
Then, in May, the company unveiled a larger model, around three times the size of the first as measured by parameter count. And last week, they released another, approximately six times the size of the first.
According to a recent statement on their website, they are releasing a 774-million-parameter model (the February edition had 124 million parameters) after “subsequent research with partners and the AI community into the model’s potential for misuse and societal benefit”. They have also announced they’ll be publishing an open-source legal agreement to make model-sharing partnerships easier. OpenAI says it knows of at least five GPT-2 replications currently in existence.
One of those was also released last week, Wired reports. Its creators, Aaron Gokaslan and Vanya Cohen, wanted to show that you don’t have to be a Silicon Valley tech company with millions of dollars and PhD grads at your disposal to build these fake-text-generating programs. They built theirs with free cloud computing provided by Google and used text gathered from millions of websites linked to on Reddit, as OpenAI did for GPT-2. They say a high-schooler with some coding knowledge could do something similar.
Their argument is that it is better to have these algorithms in the open so that we know what we’re dealing with.
“This allows everyone to have an important conversation about security, and researchers to help secure against future potential abuses,” Cohen told Wired.
“I’ve gotten scores of messages, and most of them have been like, ‘Way to go.’”
The purpose of these algorithms is to create complete articles on any subject from a human-written prompt – and they can produce incredibly plausible copy filled with false quotes and misattributions (take this example of an undiscovered herd of unicorns). However, GPT-2 isn’t perfect. It’s a machine-learning program that bases its output on statistical patterns of language rather than understanding. This could lead to a story about a fire that takes place underwater, for example, or an article that feels like someone’s thrown a bunch of loosely connected sentences together.
And as the research team behind GPT-2 has explained, it performs better on content that concerns politics and popular culture (both better represented in the 8 million or so web pages used to train the algorithm) than on more niche or technical subjects.
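To get a feel for what "statistical patterns of language rather than understanding" means, here is a deliberately tiny sketch in Python. It is not how GPT-2 works (GPT-2 is a large neural network); it is a toy word-pair model, and the function names and the four-sentence corpus are made up for illustration. All it does is record which word followed which, then extend a prompt by picking a follower at random.

```python
import random
from collections import defaultdict

def build_bigram_table(text):
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return table

def generate(table, prompt_word, length=12, seed=0):
    """Extend a one-word prompt by sampling a word seen after the last one."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    output = [prompt_word]
    for _ in range(length - 1):
        candidates = table.get(output[-1])
        if not candidates:  # word never seen with a follower: stop early
            break
        output.append(rng.choice(candidates))
    return " ".join(output)

# Hypothetical toy corpus, invented for this example.
corpus = (
    "the fire spread through the forest . "
    "the diver swam through the water . "
    "the fire burned under the bridge . "
    "the diver rested under the water ."
)
table = build_bigram_table(corpus)
print(generate(table, "the"))
```

Because the model only knows which words tend to sit next to each other, nothing stops it from stitching together "the fire spread through the water": every adjacent pair appears in the corpus, but the sentence as a whole makes no sense. That is a miniature version of the underwater-fire problem described above.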
Still, we decided to give it a go. Here’s what it came up with when we plugged in the first line of an article about an orbiting spacecraft (the first line below is the human prompt):
A mysterious spacecraft has been orbiting the Earth for the past 720 days, and only the US Air Force knows why.
The satellite, operated by NASA and operated by the California Institute of Technology, is the first ever for Mars.
“It’s the best opportunity we have in hundreds of years to look for signs of life,” said NASA’s planetary scientist, Jim Green.
“A small, Earth-like planet is like a laboratory. We need to learn a lot about a planet to decide whether life is more likely to be there than elsewhere.”
But the discovery has sparked concerns about the fate of the spacecraft. Some scientists believe it may still contain life.
Image caption There are no signs of life on the Red Planet
The telescope was launched last year to a “hot” orbit at 35,000km above the planet’s surface. It’s hoped it will give scientists a view of Mars as it rotates and may even help scientists understand the planet’s surface.
The telescope, nicknamed MESSENGER, was launched when scientists thought the Red Planet might have an atmosphere.
And here’s the original article for comparison.