So Far It's All Still AI

People online collectively lost their minds yesterday over this video a Reddit user generated with Veo 3, Google's new multimodal model that can produce video and audio from a prompt or a source image. Depending on who you talk to, this is the newest frontier in hellish AI slop and/or the future of video content writ large. I try to avoid goalpost-moving too much, so I'll be the first to admit this is wild. As a flashpoint moment it's comparable to the original DALL-E release. We're light-years from Will Smith eating spaghetti: the videos are coherent and look and sound really good. I actually had trouble sleeping last night thinking about the implications of an infinite content generator applied to the infinite-scrolling paradigm. If people have brainrot now, I am terrified to find out what's next.
All that said, I think people going apoplectic over this need to step back and think about it a little bit. First off, the video was generated from a prompt - the skits themselves are ideas the user admitted to having had for a while, so the creative part still came from a person. Despite all the advances in the technology, I have yet to see even a short story written by an LLM that was compelling, let alone a novel or a TV show script. The dream of telling ChatGPT to make you 15 seasons of a show about a time-traveling pelican is still some ways away. Second, the short format is an ideal showcase for video generation - I have yet to see a long-form video that manages to maintain context and coherence. These videos look crazy good, and I'm sure TikTok is already awash with them, but until I see a 45-minute one I'm not gonna be convinced they can replace long-form video.
Even assuming those are solvable problems, I'm skeptical about how this stuff will actually look in practice. Presumably because of the relatively small screen size I watched it on and the quirks of how the human mind processes moving images and fills in details, a lot of the videos in Interdimensional Cable weren't obviously AI the way most generated writing or images are to me. Will this actually hold up in a context like, say, a movie theater, or even in something I'm watching on my TV instead of scrolling past on my phone?
Still, models like Veo are likely to have a major impact on filmmaking and game development. Their ability to animate a still image, edit video to add or remove elements, and handle other tasks traditionally kept in the domain of editors, animators, and modelers is likely to shake up industries that are already unstable. I don't know that it's a death knell for SFX, necessarily, but unsexy and time-consuming stuff like rigging models may go the way of the dinosaurs. This type of application (less vibe-coding and more "speeding up a tedious workflow") also bothers me less. If someone spends their time designing a beautiful character in Blender and uses AI to animate it, that doesn't feel as gross to me as a fully-generated work. Field-specific applications - automating fiddly and difficult parts of a highly technical workflow - are always the ones that actually become killer apps. They might not get headlines the way the threat of an infinite content machine does, but they're actually useful in a way fully-generated stuff hasn't been so far.
There's a hard-to-attribute line I like a lot when talking about these tools: "if it's useful, it isn't AI". Historically, once something from AI research became actually useful to people, they stopped calling it AI - at that point it's just part of a technology. The workflow-automation applications for models like Veo are the ones most likely to graduate from being AI to being useful. The same is true for LLMs, diffusion models, and plenty of other tech: I have yet to be convinced of their utility as purely generative machines, but as an integrated part of a workflow or a method of automating away tedium they do have promise. Unfortunately this approach does not get you billions of dollars from SoftBank, so presumably the slop factory will continue grabbing headlines until something changes.