Microsoft-backed start-up OpenAI unveiled its text-to-video generative tool Sora on 16 February 2024. Search interest in the artificial intelligence-assisted video creator hit a 12-month high after the teaser release, according to Google Trends. Although the tool is not yet available to the public, Sora has already sparked debates about how widely it could be adopted in the video production industry: AI enthusiasts and long-time OpenAI fans have greeted Sora with open arms, while others have raised concerns.
Several other generative AI tools provoked similar discussions before Sora. In 2022, for example, many talked about workplaces replacing designers and illustrators with image-generation software such as DALL-E. At users’ fingertips, these tools offer high-quality text-to-visual features at a “cheaper cost than hiring humans”. Their creativity has been challenged by the media and users alike, often with comments such as “AI cannot outperform human creations”.
AI has come a long way since then, advancing significantly by the day. We are now at a stage where singers criticise AI-generated covers of their voices, teachers seek ways to detect ChatGPT use in their students’ assignments, and developers build tools to identify AI-made content.
Now, with Sora, a video generation tool that lets users make short videos without a camera or actors, we may have to turn our eyes to the impact text-to-video tools could have outside the digital realm.
What is Sora?
Sora is a video generator that can transform text prompts into clips matching the description. According to OpenAI’s post on X, formerly known as Twitter, it can make videos up to 60 seconds long “featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions”. The more detailed a prompt is, the more accurate the output becomes.
Take this video from OpenAI’s official website as an example.
The prompt OpenAI used reads: “A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colourful lights. Many pedestrians walk about.”
The name “Sora” translates to “sky” in Japanese. The team of 13 people behind this video creator, including Tim Brooks and Bill Peebles, said they chose the name because it “evokes the idea of limitless creative potential”.
Can I use Sora now?
Sora is not open to everyone at the moment; only a small, select group of people can access it. OpenAI said in an interview that it is still testing the tool to “understand the system’s dangers”. Working with academics, researchers, and creators, the company is assessing how the tool could be misused.
Still, people can get an early peek at the platform before Sora’s release via its official website or X. When OpenAI’s CEO Sam Altman unveiled the project on his X account, he invited users to suggest prompts in the comments and see how the tool performed. Altman replied with videos corresponding to each user’s comment, such as “a monkey playing chess in a park” or “a bicycle race on the ocean with different animals as athletes.”
OpenAI has not yet disclosed an official release date for Sora.
Where can it be used?
Sora gives users great flexibility: it supports videos in various formats and sizes, offers editing features, and accepts images or videos as prompts. This means the tool can be utilised across a range of businesses.
Just as ChatGPT serves as a tool for educational purposes, Sora can be useful software for both teachers and students. For example, teachers can use Sora to create easy-to-understand visual aids for specific class scenarios or topics, while students may be able to diversify their assignments or study materials by transforming their written work into videos.
Sora’s launch could also be good news for marketers and advertisers. Companies will be able to produce brand promotional content with high-quality visuals, and a faster production process may lead to more active interaction with customers on social media.
It can also be a powerful assistant for e-commerce companies, such as online shopping malls, which could create dynamic video demonstrations of their products instead of image or text descriptions, offering more engaging and detailed explanations.
Why are some people concerned about Sora?
News of Sora has raised concerns among creators and videography-related businesses who fear their jobs could be automated by the AI-powered platform. For example, MrBeast, one of the most-subscribed YouTubers, commented on Altman’s post on X: “Sam plz don’t make me homeless”.
Others have raised questions about ethics and misinformation, a common issue with many other AI-assisted programs. OpenAI said it will address these questions during the testing period with “red teamers” who can advise on “misinformation, hateful content, and bias”.
The company added that there will be systematic measures to prevent Sora from creating ethically problematic videos, such as rejecting prompts that violate its policies, including “extreme violence, sexual content, hateful imagery, celebrity likeness, or the IP of others”. To curb the spread of possible misinformation generated by Sora, OpenAI said it will work on a program that can identify content made by the tool.
Are there other tech companies preparing to launch AI video creators?
Competition among AI-powered video generators became fiercer in mid-2023. In June that year, American start-up Runway surprised many with its video generator “Gen-2”, a text-to-video system that is now available for anyone to use. Several months later, London-based AI video company Stability AI entered the scene with “Stable Video Diffusion”, which the company said is strictly for “research” purposes and will not be released for public access.
Backed by their huge resources, tech giants are leading the latest developments in AI video production. Meta offered a first look at its video generator Emu Video in September 2023, following its earlier video generation tool Make-A-Video in 2022. Most recently, Google unveiled its generative AI video model Lumiere in January 2024, making demo clips and prompts available for viewing, but it has not disclosed an exact release date yet. According to reports, Amazon is also working on a text-to-video program to be released soon.
Many say Sora is an advancement over competing text-to-video generators from Runway, Meta and even Google, especially in quality and resolution. Although this will need verifying after the official launch, Sora’s ability to generate videos up to 60 seconds long is already a win for the industry, considering that many other programs are limited to roughly 3 to 10 seconds.
As big tech firms mark the first couple of months of 2024 with their latest developments in AI video generators, it will be interesting to watch how these tools could reshape media-related businesses throughout the year.