Text-based editing powered by AI is taking the post-production world by storm. Adobe and Blackmagic Design have both announced that their editing apps will include it as a built-in feature. Text-based editing uses artificial intelligence to produce a transcript of your video, then lets you edit the video by selecting text in that transcript. The major advantage is time savings, especially for documentaries and interviews. The Lumberjack system introduced AI-powered text-based editing for Final Cut Pro several years ago, so the big news is that DaVinci Resolve and Premiere Pro now have it built in. We’ll look at each system’s strengths and weaknesses so that you can decide which tool is right for your workflow.
The case for text-based editing
The script lays out the story for narrative films and TV shows. Avid’s Media Composer features ScriptSync to help editors match their edits to the shooting script. Its PhraseFind feature then allows you to search your audio clips and find just what you are looking for. But non-scripted shows like reality TV and documentaries operate in reverse. The final “script,” as it were, is really the byproduct of the editing process. It’s even been recognized that documentary editors are writers. Documentaries often pull together an enormous number of interviews. Those interviews often overlap on common subjects. Those subjects form the building blocks of the film’s story. So there was a real need for more powerful and more cost-effective solutions for text-based editing.
There’s also a huge amount of buzz around “text-to-video” tools like Runway, but that is a different thing. Runway uses AI to create video clips from text prompts. “Text-based editing” uses AI to create a transcript from a video, like an interview. Then you use that transcript to edit your video by pulling together the important clips. These clips might come from a single interview or multiple interviews.
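To make the idea concrete, here is a minimal sketch of how selecting text can map back to a piece of video. It assumes the transcript stores a start and end time for every word; the `Word` class, the `selection_to_range` helper, and the sample data are all hypothetical, and none of the tools discussed here necessarily work this way internally.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the clip
    end: float


def selection_to_range(words, first, last):
    """Return (start, end) in seconds for the selection words[first..last]."""
    if not 0 <= first <= last < len(words):
        raise ValueError("selection is outside the transcript")
    return words[first].start, words[last].end


# Hypothetical word-level transcript of a short interview answer.
transcript = [
    Word("We", 0.0, 0.2),
    Word("found", 0.2, 0.6),
    Word("the", 0.6, 0.7),
    Word("manuscript", 0.7, 1.4),
    Word("in", 1.4, 1.5),
    Word("Egypt", 1.5, 2.1),
]

# Selecting "the manuscript in Egypt" (words 2 through 5) gives the range to cut.
print(selection_to_range(transcript, 2, 5))
```

The point of the sketch is that the editor only ever touches words; the in and out points fall out of the timing data attached to them.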
A “paper edit” is the product of using transcripts of interviews printed out on paper to craft an edit. This is the analog method of “text-based” editing. You can actually cut up the portions of the interviews and lay them out, and then group them by topic. This sounds archaic, but it can really help you to see the full story. Another version of the paper edit is to print out a list of markers from the interviews summarizing each point and interviewee discussed. Here’s an example of a paper edit from the 2017 documentary Fragments of Truth.
Paper edit from the Fragments of Truth documentary (2017, Reuben Evans)
In this example, each of the markers was typed out after a portion of an interview had been watched. Then the markers were printed out, and cards were made that listed the common subjects. Those became the building blocks for the film.
Paper edit from the Fragments of Truth documentary (2017, Reuben Evans)
As you can see, this process could benefit greatly from some technological improvements. This is one area where artificial intelligence can shave days, if not weeks, off the time it takes to log and organize your footage.
Those improvements took center stage at NAB 2023 when Adobe and Blackmagic Design announced that text-based editing would ship with their NLEs. You just have to love Adobe’s marketing tagline for Text-based editing, “No more paper cuts.”
Blackmagic Design included text-based editing in the DaVinci Resolve 18.5 beta. It brings the basics of text-based editing to DaVinci Resolve Studio ($295). Blackmagic calls it “Speech to Text.”
DaVinci Resolve Speech to Text (Blackmagic Design, 2023)
Resolve can automatically create transcripts for you using AI. It will identify silent portions of your clips as well. Simply select a clip in your media bin and click the “Transcribe Audio” button. Resolve will transcribe the text and note silent portions with ellipses. When you highlight the text of your transcription, Resolve will highlight that portion of your clip in the timeline. Resolve can use that transcription to create captions for your video as well. The YouTube channel “Creative Video Tips” has a great tutorial on Speech to Text editing in DaVinci Resolve.
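As a rough illustration of the ellipsis idea, the sketch below marks a pause whenever the gap between two words exceeds a threshold. This is an assumption about how such a display could be produced, not a description of Resolve’s internals; the `render_with_silences` function, the one-second threshold, and the sample timings are made up for the example.

```python
def render_with_silences(words, min_gap=1.0):
    """Join (text, start, end) tuples, inserting "..." wherever the pause
    between two consecutive words is at least min_gap seconds."""
    parts = []
    previous_end = None
    for text, start, end in words:
        if previous_end is not None and start - previous_end >= min_gap:
            parts.append("...")
        parts.append(text)
        previous_end = end
    return " ".join(parts)


# Hypothetical word timings with a two-second pause in the middle.
words = [
    ("So", 0.0, 0.3),
    ("anyway", 0.4, 0.9),
    ("we", 3.0, 3.2),
    ("left", 3.2, 3.6),
]

print(render_with_silences(words))
```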
You can see in the video that Resolve only addresses a couple of aspects of text-based editing. The reviewer has to resort to a “hack”: he uses one timeline to organize clips and another to do his edit. The organizing timeline functions like the “organization cards” in a paper edit. That makes DaVinci Resolve’s implementation pretty good for a single interview or a short video. But it falls a bit short of indexing and organizing the contents of an entire film because it doesn’t incorporate some key metadata. Identifying who is saying what in a documentary interview is highly beneficial. For instance, you may have a host appear in multiple locations or several speakers in a single interview.
Just a few days before Blackmagic Design announced Speech to Text, Adobe announced, “Premiere Pro is the only professional editing software to incorporate Text-Based Editing.” While that claim didn’t last long, Adobe’s implementation did go further than Blackmagic’s feature. Premiere Pro automatically transcribes clips and produces captions. Importantly, it allows you to identify the speakers.
It would be nice to see some more advanced tools when it comes to identifying speakers. Currently, the editor has to go through each phrase and identify the speaker. Adobe has demonstrated automatic speaker identification, but it hasn’t shipped that feature to beta yet.
Premiere’s text-based workflow adds a couple of other important features as well. Editors can import a transcript created through a service like Rev.com and associate it with the clip. This is handy if your audio contains technical terms or foreign languages.
Adobe Premiere Pro Text-based editing (Adobe, 2023)
Editors can export the transcripts that Premiere provides as well. This adds value to the transcripts because those transcripts can be uploaded to social media sites along with the video for increased SEO performance.
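A timed transcript also maps naturally onto a caption file. The sketch below writes transcript segments in the widely used SubRip (.srt) layout: a counter, a timing line, the text, and a blank line. It is only an illustration of the format; the `srt_timestamp` and `to_srt` helpers and the sample segments are hypothetical and say nothing about what Premiere or Resolve actually export.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02},{ms:03}"


def to_srt(segments):
    """Build SRT text from (start, end, text) segments."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


# Hypothetical transcript segments from an interview.
segments = [
    (0.0, 2.5, "We found the manuscript in Egypt."),
    (2.5, 6.0, "Nobody expected it to be that old."),
]

print(to_srt(segments))
```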
Both Premiere Pro and DaVinci Resolve allow you to insert clips from the transcription window. You can identify silent sections in your clips in both NLEs.
Adobe also provides a workspace for text-based editing in Premiere, making the feature feel more refined than Blackmagic’s implementation. It feels like Adobe has laid the foundation for more functionality in this workspace in the future. For now, it is still limited in its ability to function as an organizational tool for a film with common topics across multiple speakers, as is the case with most non-scripted work. Even so, Premiere appears to have the upper hand when it comes to built-in integration.
In 2018, Philip Hodgetts from Intelligent Assistance presented Lumberjack Builder. Organizing footage is known as “logging,” hence the name Lumberjack.
Lumberjack then grew into a whole suite of logging and editing tools, culminating in the release of their new Lumberjack Builder NLE. Originally released for FCP, the Lumberjack system also works with Premiere Pro. It was the first system to connect AI for transcription with an editing interface and the first text-based editing tool.
Lumberjack combines transcription with keywords and other metadata that allow you to organize an entire project’s worth of footage and pull it together into an actual text-based edit. This comes from a deep understanding of the purpose of a paper edit. It is designed to work with keywords across clips the way an editor uses cards for organizing when doing a paper edit.
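The “cards” idea can be sketched as a simple keyword index: every logged range carries keywords, and looking up a keyword returns every matching range across all interviews. This is a hypothetical illustration of the concept only; the data layout, file names, and the `build_keyword_index` helper are assumptions and do not reflect Lumberjack’s actual data model.

```python
from collections import defaultdict

# Hypothetical logged ranges: (clip, speaker, start, end, keywords).
logged = [
    ("interview_a.mov", "Host", 12.0, 47.5, {"discovery", "Egypt"}),
    ("interview_b.mov", "Scholar", 95.0, 130.0, {"dating", "Egypt"}),
    ("interview_a.mov", "Host", 210.0, 260.0, {"dating"}),
]


def build_keyword_index(ranges):
    """Map each keyword to the list of (clip, speaker, start, end) ranges
    tagged with it, in the order the ranges were logged."""
    index = defaultdict(list)
    for clip, speaker, start, end, keywords in ranges:
        for keyword in sorted(keywords):
            index[keyword].append((clip, speaker, start, end))
    return dict(index)


index = build_keyword_index(logged)

# Every range about "Egypt", regardless of which interview it came from.
for clip, speaker, start, end in index["Egypt"]:
    print(f"{clip} [{start}-{end}] {speaker}")
```

Each key in the index plays the role of one topic card on the table, with the clips stacked underneath it.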
The key difference here is that Resolve and Premiere use the text as a “source,” but the “destination” is still the timeline. You read the words in the “source,” but you have to listen in the “destination.” Lumberjack, by contrast, uses the same interface whether you are working through your source interviews or the timeline that you are assembling. Either way, the editor is working with blocks of text. This makes it a powerful tool for documentary filmmakers.
For films in languages other than English, Lumberjack offers 16 languages for free. And it integrates with a third-party transcription service for another 50 languages at 25 cents a minute.
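For budgeting, the per-minute rate is easy to work out. The sketch below assumes simple per-minute billing with no minimums, rounding rules, or volume discounts, which may not match the service’s actual pricing terms; the function name is made up for the example.

```python
RATE_CENTS_PER_MINUTE = 25  # the third-party rate quoted above


def transcription_cost_dollars(minutes):
    """Estimated cost in dollars, assuming flat per-minute billing."""
    return minutes * RATE_CENTS_PER_MINUTE / 100


# Ten hours of interview footage.
print(transcription_cost_dollars(10 * 60))
```

Under those assumptions, ten hours of interviews comes to $150.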
Finally, Lumberjack offers real-time logging for interviews with their iOS app. The app enables metadata tagging by people, locations, or other key topics right on set. When combined with AI transcription and text-based editing, it’s a powerful solution. Once you have finished your “paper edit” in Lumberjack, just send it over to FCP or Premiere and start the process of trimming.
The AI-powered online video editing app, Descript, uses a text-based editing approach as well. It’s designed to be easily accessible for anyone who needs to make simple videos like presentations. Descript also features an audio mode that is designed for podcasters. One of the big features of Descript is that it will help to identify and eliminate “verbal clutter.” Those are the umms and ahhs that we say when we don’t quite know what to say next.
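Removing verbal clutter can be pictured as filtering a timed transcript. In the sketch below, filler words are skipped and the remaining words are merged into the ranges of audio to keep. The filler list, the `keep_ranges` helper, and the timings are illustrative assumptions; Descript’s real detection is certainly more sophisticated than a word list.

```python
FILLERS = {"um", "umm", "uh", "ah", "ahh"}  # illustrative list, not Descript's


def keep_ranges(words):
    """Return merged (start, end) ranges covering every non-filler word.

    words is a list of (text, start, end) tuples in playback order.
    """
    ranges = []
    for text, start, end in words:
        if text.lower().strip(".,!?") in FILLERS:
            continue  # drop the filler; this leaves a gap in the kept audio
        if ranges and ranges[-1][1] == start:
            ranges[-1] = (ranges[-1][0], end)  # extend the current range
        else:
            ranges.append((start, end))
    return ranges


# Hypothetical word timings for "So, umm, we started".
words = [
    ("So,", 0.0, 0.3),
    ("umm,", 0.3, 0.9),
    ("we", 0.9, 1.1),
    ("started", 1.1, 1.6),
]

print(keep_ranges(words))
```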
Descript offers “Scenes” as an easy way to insert your b-roll. The editor inserts a slash into the transcript to identify the beginning and end of a scene. And then you just drag a clip or graphic onto that spot.
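The slash convention is easy to picture as a split on the transcript text. The sketch below is a toy model of that idea, with hypothetical file names and a made-up `split_scenes` helper; it is not Descript’s implementation or file format.

```python
def split_scenes(transcript):
    """Split a transcript into scenes at each "/" marker, dropping empty pieces."""
    return [scene.strip() for scene in transcript.split("/") if scene.strip()]


script = "Welcome to the demo. / Here is the new dashboard. / Thanks for watching."
scenes = split_scenes(script)

# Pair each scene with a hypothetical b-roll clip dropped onto it.
broll = ["intro.mp4", "dashboard.mp4", "outro.mp4"]
for scene, clip in zip(scenes, broll):
    print(f"{clip}: {scene}")
```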
Descript now has the backing of OpenAI, so it will be really interesting to see what they come up with in the future.
Text-based editing is nothing new in the sense that Intelligent Assistance has been offering it for years. At the same time, it feels totally new, because far more people have accessed it through Resolve and Premiere Pro in the past few weeks than in the past few years. It is a tool that has proven its worth, whether through the old-school paper edit or the latest AI tech. So many AI-powered features will be coming to post-production professionals that it will be hard to keep up. Some will be of dubious usefulness, while others will transform job descriptions overnight. But the best tools will be the ones that empower storytellers to efficiently work their craft so that we can all do more of what we love.