The primary goal of the Open-Sora initiative is to make advanced video production capabilities widely accessible by offering an open-source alternative to the technology pioneered in OpenAI's Sora project [1]. It aims to lower the barriers that independent creators, educators, and small businesses often encounter in video production [1].
Open-Sora 1.1 introduced several enhancements over version 1.0. It extended the supported video length from 2 seconds to 15 seconds and made output more flexible, supporting resolutions from 144p to 720p and a range of aspect ratios [3][4]. It also added a comprehensive video processing pipeline and improved prompting, enabling image and video prompts for video generation.
The training process for Open-Sora 1.2 involves a 3D-VAE model, rectified flow, and score conditioning [1]. The model was trained on over 30 million data points using 80,000 GPU hours, and it supports various video resolutions and aspect ratios [2]. Its command-line inference interface supports multiple configurations, including text-to-video and image-to-video generation [2].
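To make the rectified flow component more concrete, the following is a minimal sketch of the generic technique in plain Python. It is not Open-Sora's code: the function names, the toy vectors, and the convention that noise sits at t = 0 and data at t = 1 are all assumptions chosen for illustration. The idea is that a model learns the constant velocity along the straight line between a noise sample and a data sample, and sampling integrates that velocity.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight line from x0 (t=0, noise) to x1 (t=1, data)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def velocity_target(x0, x1):
    """Constant velocity along the straight path: d/dt x_t = x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]

def rectified_flow_loss(predict_velocity, x0, x1, t):
    """Mean squared error between predicted and target velocity at x_t."""
    x_t = interpolate(x0, x1, t)
    target = velocity_target(x0, x1)
    pred = predict_velocity(x_t, t)
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)

def euler_sample(predict_velocity, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        v = predict_velocity(x, i * dt)
        x = [a + dt * b for a, b in zip(x, v)]
    return x

# Toy usage: an "oracle" that already knows the true velocity stands in
# for a trained network (hypothetical, for illustration only).
rng = random.Random(0)
data = [1.0, -2.0, 0.5]
noise = [rng.gauss(0.0, 1.0) for _ in data]
oracle = lambda x, t: velocity_target(noise, data)

loss = rectified_flow_loss(oracle, noise, data, t=0.3)
sample = euler_sample(oracle, noise, num_steps=10)
```

With the oracle velocity the loss is zero and the Euler integration lands on the data point up to floating-point error. In an actual training run, a neural network would replace the oracle and be fit by minimizing this loss over many noise/data pairs and random values of t.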