Monday, September 16, 2024

We made a cat drink a beer with Runway’s AI video generator, and it sprouted hands


A screen capture of an AI-generated video of a cat drinking a can of beer, created by Runway Gen-3 Alpha.

In June, Runway debuted a new text-to-video synthesis model called Gen-3 Alpha. It converts written descriptions called “prompts” into HD video clips without sound. We’ve since had a chance to use it and wanted to share our results. Our tests show that careful prompting isn’t as important as matching concepts likely found in the training data, and that achieving amusing results likely requires many generations and selective cherry-picking.

An enduring theme of all generative AI models we’ve seen since 2022 is that they can be excellent at mixing concepts found in training data but are typically very poor at generalizing (applying learned “knowledge” to new situations the model has not explicitly been trained on). That means they can excel at stylistic and thematic novelty but struggle with fundamental structural novelty that goes beyond the training data.

What does all that mean? In the case of Runway Gen-3, lack of generalization means you might ask for a sailing ship in a swirling cup of coffee, and provided that Gen-3’s training data includes video examples of sailing ships and swirling coffee, that’s an “easy” novel combination for the model to make fairly convincingly. But if you ask for a cat drinking a can of beer (in a beer commercial), it will generally fail because there aren’t likely many videos of photorealistic cats drinking human beverages in the training data. Instead, the model will pull from what it has learned about videos of cats and videos of beer commercials and combine them. The result is a cat with human hands pounding back a brewsky.

A few basic prompts

During the Gen-3 Alpha testing phase, we signed up for Runway’s Standard plan, which provides 625 credits for $15 a month, plus some bonus free trial credits. Each generation costs 10 credits per second of video, and we created 10-second videos for 100 credits apiece. So the number of generations we could make was limited.
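To make that budget concrete, here is a minimal sketch of the arithmetic using only the figures quoted above (625 credits for $15, 10 credits per second of video). The function and constant names are our own illustration, not part of any Runway tool, and the bonus trial credits are left out because their amount is not stated.

```python
# Figures quoted in the article; names below are illustrative only.
CREDITS_PER_MONTH = 625   # Standard plan allowance
PLAN_PRICE_USD = 15       # monthly price of the plan
CREDITS_PER_SECOND = 10   # cost of one second of generated video


def clip_cost_credits(seconds: int) -> int:
    """Credits needed for one clip of the given length."""
    return seconds * CREDITS_PER_SECOND


def clips_per_month(seconds: int, credits: int = CREDITS_PER_MONTH) -> tuple[int, int]:
    """Return (whole clips affordable, credits left over)."""
    return divmod(credits, clip_cost_credits(seconds))


def clip_cost_usd(seconds: int) -> float:
    """Approximate dollar cost of one clip at the plan's per-credit price."""
    return PLAN_PRICE_USD * clip_cost_credits(seconds) / CREDITS_PER_MONTH


if __name__ == "__main__":
    clips, leftover = clips_per_month(10)
    print(f"10-second clips per month: {clips} (with {leftover} credits left over)")
    print(f"Approximate cost per clip: ${clip_cost_usd(10):.2f}")
```

By this estimate, the monthly allowance alone covers six 10-second clips at roughly $2.40 each, which is why rerunning prompts was not an option.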

We first tried a few standards from our past image synthesis tests, like cats drinking beer, barbarians with CRT TV sets, and queens of the universe. We also dipped into Ars Technica lore with the “moonshark,” our mascot. You’ll see all those results and more below.

We had so few credits that we couldn’t afford to rerun them and cherry-pick, so what you see for each prompt is exactly the one generation we received from Runway.

“A highly-intelligent person reading “Ars Technica” on their computer when the screen explodes”

“commercial for a new flaming cheeseburger from McDonald’s”

“The moonshark jumping out of a computer screen and attacking a person”

“A cat in a car drinking a can of beer, beer commercial”

“Will Smith eating spaghetti” triggered a filter, so we tried “a black man eating spaghetti.” (Watch until the end.)

“Robotic humanoid animals with vaudeville costumes roam the streets collecting protection money in tokens”

“A basketball player in a haunted passenger train car with a basketball court, and he is playing against a team of ghosts”

“A herd of one million cats running on a hillside, aerial view”

“video game footage of a dynamic 1990s third-person 3D platform game starring an anthropomorphic shark boy”
