Scaling Video Files
Create an app to time scale your video files
Our time scaling technique for audio is adapted to uniformly speed up or slow down the rate of play of video.
ScaleVideo
The associated Xcode project implements a SwiftUI app for macOS and iOS that scales video files stored on your device or in iCloud.
A default video file is provided to set the initial state of the app.
After a video is imported it is displayed in the VideoPlayer where it can be viewed along with its scaled counterpart.
Select the scale factor from a slider.
Classes
The project consists of:
- ScaleVideoApp - The App for import, scale and export.
- ScaleVideoObservable - An ObservableObject that manages the user interaction to scale and play video files.
- ScaleVideo - The AVFoundation and vDSP code that reads, scales and writes video files.
1. ScaleVideoApp
Tap the Import Button to present a file navigator to select a video file to scale.
Button(action: { showFileImporter = true }, label: {
Label("Import", systemImage: "square.and.arrow.down")
})
The action sets the state showFileImporter to display a browser for video files using the fileImporter modifier.
.fileImporter(isPresented: $showFileImporter, allowedContentTypes: [.movie, .quickTimeMovie, .mpeg4Movie], allowsMultipleSelection: false) { result in
do {
showURLLoadingProgress = true
guard let selectedURL: URL = try result.get().first else { return }
scaleVideoObservable.loadSelectedURL(selectedURL) { wasLoaded in
if !wasLoaded {
alertInfo = AlertInfo(id: .urlNotLoaded, title: "Video Not Loaded", message: (scaleVideoObservable.errorMesssage ?? "No information available."))
}
showURLLoadingProgress = false
}
} catch {
print(error.localizedDescription)
}
}
In fileImporter’s completion the method loadSelectedURL of ScaleVideoObservable copies the video file at selectedURL to our Documents folder.
Since the file may be located in iCloud it may require downloading if it is not already on your device, so an indeterminate ProgressView is displayed as an overlay in color Tangerine:
let tangerine = Color(red: 0.98, green: 0.57, blue: 0.21, opacity:0.9)
.overlay(Group {
if showURLLoadingProgress {
ProgressView("Loading...")
.padding()
.background(RoundedRectangle(cornerRadius: 16).fill(tangerine))
}
})
After the video is loaded it is displayed in a VideoPlayer with its file name:
VideoPlayer(player: scaleVideoObservable.player)
Text(scaleVideoObservable.videoURL.lastPathComponent)
Tap the Video and Scaled buttons to play the imported video and scaled video:
Button(action: { scaleVideoObservable.playOriginal() }, label: {
Label("Video", systemImage: "play.circle")
})
Button(action: { scaleVideoObservable.playScaled() }, label: {
Label("Scaled", systemImage: "play.circle.fill")
})
Tap the Scale button to begin the scaling process.
Button(action: { scaleVideoObservable.scale() }, label: {
Label("Scale", systemImage: "timelapse")
})
.padding()
The method scale of ScaleVideoObservable sets the state isScaling so the ScaleProgressView displays, and creates a ScaleVideo object to perform the scaling of the video using the selected scale factor and frame rate.
The scale factor is chosen from a Slider to scale a video duration from 0.1 to 2 times the original duration:
Slider(
value: $scaleVideoObservable.factor,
in: 0.1...2
)
The video frame rate can be selected using a Picker to 24, 30 or 60 frames per second:
enum FPS: Int, CaseIterable, Identifiable {
case twentyFour = 24, thirty = 30, sixty = 60
var id: Self { self }
}
Picker("Frame Rate", selection: $scaleVideoObservable.fps) {
Text("24").tag(FPS.twentyFour)
Text("30").tag(FPS.thirty)
Text("60").tag(FPS.sixty)
}
.pickerStyle(.segmented)
During the process progress values and images are sent to the ScaleProgressView, where the scaling can also be cancelled.
Tap the Export button to present a navigator to select a name and location to save the scaled video file.
Button(action: { scaleVideoObservable.prepareToExportScaledVideo(); showFileExporter = true }, label: {
Label("Export", systemImage: "square.and.arrow.up.fill")
})
The action sets the state showFileExporter to display a file browser using the fileExporter modifier.
fileExporter requires that we provide a FileDocument representing the file we want to export, so the action also invokes the method prepareToExportScaledVideo of ScaleVideoObservable that creates a VideoDocument which implements the FileDocument protocol for the url to export.
.fileExporter(isPresented: $showFileExporter, document: scaleVideoObservable.videoDocument, contentType: UTType.quickTimeMovie, defaultFilename: scaleVideoObservable.videoDocument?.filename) { result in
if case .success = result {
do {
let exportedURL: URL = try result.get()
alertInfo = AlertInfo(id: .exporterSuccess, title: "Scaled Video Saved", message: exportedURL.lastPathComponent)
}
catch {
}
} else {
alertInfo = AlertInfo(id: .exporterFailed, title: "Scaled Video Not Saved", message: (scaleVideoObservable.videoDocument?.filename ?? ""))
}
}
The fileExporter completion presents an alert indicating whether or not the file was saved.
The PickVideoView defines a state alertInfo of type AlertInfo to present alerts for multiple situations.
@State private var alertInfo: AlertInfo?
AlertInfo is a struct whose value identifies the alert to be presented when the state is changed:
struct AlertInfo: Identifiable {
enum AlertType {
case urlNotLoaded
case exporterSuccess
case exporterFailed
}
let id: AlertType
let title: String
let message: String
}
So for example when the export succeeds it sets the state as:
alertInfo = AlertInfo(id: .exporterSuccess, title: "Scaled Video Saved", message: exportedURL.lastPathComponent)
And then the alert modifier will be invoked to display the alert with the corresponding title and message:
.alert(item: $alertInfo, content: { alertInfo in
Alert(title: Text(alertInfo.title), message: Text(alertInfo.message))
})
Both iOS and macOS store the generated file in the Documents folder.
On the Mac the Documents folder can be accessed using the provided Go to Documents button.
For iOS the app’s Info.plist includes an entry for Application supports iTunes file sharing, so the Documents folder can be accessed in the Finder of your connected device. It also includes an entry for Supports Document Browser, so the Documents folder can be accessed in the ‘On My iPhone’ section of the Files app.
Note that the app’s init method clears the Documents folder every time it launches using a FileManager extension.
FileManager.clearDocuments()
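The extension itself is not listed here; a minimal sketch of what clearing the Documents folder might look like (an assumption for illustration, not necessarily the project’s implementation):
import Foundation

extension FileManager {
    // Hypothetical sketch: remove every item in the app's Documents folder.
    class func clearDocuments() {
        let manager = FileManager.default
        guard let documents = manager.urls(for: .documentDirectory, in: .userDomainMask).first else { return }
        if let items = try? manager.contentsOfDirectory(at: documents, includingPropertiesForKeys: nil) {
            for item in items {
                try? manager.removeItem(at: item)
            }
        }
    }
}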
2. ScaleVideoObservable
In the ScaleVideoApp section several methods of ScaleVideoObservable were mentioned:
A. loadSelectedURL - loads the video file selected by the user to import.
B. scale - scales the selected video.
C. prepareToExportScaledVideo - creates a VideoDocument to save the scaled video.
Let’s now discuss these methods in detail.
A. loadSelectedURL - loads the video file selected by the user to import.
Back to ScaleVideoObservable Methods
func loadSelectedURL(_ url:URL, completion: @escaping (Bool) -> ()) {
copyURL(url) { copiedURL in
DispatchQueue.main.async {
if let copiedURL = copiedURL {
self.videoURL = copiedURL
self.player = AVPlayer(url: copiedURL)
completion(true)
}
else {
completion(false)
}
}
}
}
This method invokes copyURL which tries to copy the selected url to a copiedURL that references a file by the same name in the Documents folder, deleting any existing file by that name.
copyURL first calls startAccessingSecurityScopedResource because some files encountered by the file navigator may be security-scoped.
If the file copy initially fails, it may be because the file is in iCloud and has not yet been downloaded. In that case copyURL calls tryDownloadingUbiquitousItem and, if that succeeds, it then tries to copy that url to Documents.
func copyURL(_ url: URL, completion: @escaping (URL?) -> ()) {
let scoped = url.startAccessingSecurityScopedResource()
let filename = url.lastPathComponent
if let copiedURL = FileManager.documentsURL("\(filename)") {
try? FileManager.default.removeItem(at: copiedURL)
do {
try FileManager.default.copyItem(at: url, to: copiedURL)
if scoped {
url.stopAccessingSecurityScopedResource()
}
completion(copiedURL)
}
catch {
tryDownloadingUbiquitousItem(url) { downloadedURL in
if let downloadedURL = downloadedURL {
do {
try FileManager.default.copyItem(at: downloadedURL, to: copiedURL)
if scoped {
url.stopAccessingSecurityScopedResource()
}
completion(copiedURL)
}
catch {
self.errorMesssage = error.localizedDescription
print("Can't copy the URL: \(error.localizedDescription)")
if scoped {
url.stopAccessingSecurityScopedResource()
}
completion(nil)
}
}
else {
self.errorMesssage = error.localizedDescription
print("Can't copy the URL: \(error.localizedDescription)")
if scoped {
url.stopAccessingSecurityScopedResource()
}
completion(nil)
}
}
}
}
else {
completion(nil)
}
}
The method tryDownloadingUbiquitousItem tries to download url on a background thread with startDownloadingUbiquitousItem. We can be notified of its successful completion by using an NSFileCoordinator:
func tryDownloadingUbiquitousItem(_ url: URL, completion: @escaping (URL?) -> ()) {
var downloadedURL:URL?
if FileManager.default.isUbiquitousItem(at: url) {
let queue = DispatchQueue(label: "com.limit-point.startDownloadingUbiquitousItem")
let group = DispatchGroup()
group.enter()
DispatchQueue.global(qos: .userInteractive).asyncAfter(deadline: .now()) {
do {
try FileManager.default.startDownloadingUbiquitousItem(at: url)
let error:NSErrorPointer = nil
let coordinator = NSFileCoordinator(filePresenter: nil)
coordinator.coordinate(readingItemAt: url, options: NSFileCoordinator.ReadingOptions.withoutChanges, error: error) { readURL in
downloadedURL = readURL
}
if let error = error {
self.errorMesssage = error.pointee?.localizedFailureReason
print("Can't download the URL: \(self.errorMesssage ?? "No available error from NSFileCoordinator")")
}
group.leave()
}
catch {
self.errorMesssage = error.localizedDescription
print("Can't download the URL: \(error.localizedDescription)")
group.leave()
}
}
group.notify(queue: queue, execute: {
completion(downloadedURL)
})
}
else {
self.errorMesssage = "URL is not ubiquitous item"
completion(nil)
}
}
If the user-selected file is successfully loaded it is displayed in the AVPlayer and is ready to be scaled:
self.player = AVPlayer(url: copiedURL)
B. scale - scales the selected video.
Back to ScaleVideoObservable Methods
Scaling is performed by the scale method that creates and runs a ScaleVideo object.
The ScaleVideo initializer init:
init?(path : String, desiredDuration: Float64, frameRate: Int32, destination: String, progress: @escaping (CGFloat, CIImage?) -> Void, completion: @escaping (URL?, String?) -> Void)
Arguments:
- path: String - The path of the video file to be scaled.
- desiredDuration: Float64 - The desired duration in seconds of the scaled video.
- frameRate: Int32 - The desired frame rate of the scaled video.
- destination: String - The path of the scaled video file.
- progress - A handler that is periodically executed to send progress images and values.
- completion - A handler that is executed when the operation has completed to send a message of success or not.
Example usage is provided in the code:
func testScaleVideo() {
let fm = FileManager.default
let docsurl = try! fm.url(for:.documentDirectory, in: .userDomainMask, appropriateFor: nil, create: true)
let destinationPath = docsurl.appendingPathComponent("DefaultVideoScaled.mov").path
let scaleVideo = ScaleVideo(path: kDefaultURL.path, desiredDuration: 8, frameRate: 30, destination: destinationPath) { p, _ in
print("p = \(p)")
} completion: { result, error in
print("result = \(String(describing: result))")
}
scaleVideo?.start()
}
The scale method begins by setting the state isScaling so that the ScaleProgressView is displayed. The progress view displays progress images with throttling.
The selected scale factor is used to compute the desired video duration based on the duration of the video to be scaled.
func scale() {
self.player.pause()
isScaling = true
let filename = self.videoURL.deletingPathExtension().lastPathComponent + "-scaled.mov"
let destinationPath = FileManager.documentsURL("\(filename)")!.path
let asset = AVAsset(url: self.videoURL)
self.progressFrameImage = asset.assetFirstFrame()?.cgimage()
DispatchQueue.global(qos: .userInitiated).async {
var lastDate = Date()
var updateProgressImage = true
var totalElapsed:TimeInterval = 0
let desiredDuration:Float64 = asset.duration.seconds * self.factor
self.scaleVideo = ScaleVideo(path: self.videoURL.path, desiredDuration: desiredDuration, frameRate: Int32(self.fps.rawValue), destination: destinationPath, progress: { (value, ciimage) in
DispatchQueue.main.async {
self.progress = value
self.progressTitle = "Progress \(Int(value * 100))%"
}
let elapsed = Date().timeIntervalSince(lastDate)
lastDate = Date()
totalElapsed += elapsed
if totalElapsed > 0.3 && updateProgressImage {
updateProgressImage = false
totalElapsed = 0
var previewImage:CGImage?
autoreleasepool {
if let image = ciimage {
previewImage = image.cgimage()
}
}
DispatchQueue.main.async {
autoreleasepool {
if let previewImage = previewImage {
self.progressFrameImage = previewImage
}
}
updateProgressImage = true
}
}
}, completion: { (resultURL, errorMessage) in
DispatchQueue.main.async {
self.progress = 0
if let resultURL = resultURL, self.scaleVideo?.isCancelled == false {
self.scaledVideoURL = resultURL
}
else {
self.scaledVideoURL = kDefaultURL
}
self.playScaled()
self.isScaling = false
}
})
self.scaleVideo?.start()
}
}
C. prepareToExportScaledVideo - creates a VideoDocument to save the scaled video.
Back to ScaleVideoObservable Methods
The scaled video is prepared for export by creating a VideoDocument for it to be passed to the fileExporter:
func prepareToExportScaledVideo() {
videoDocument = VideoDocument(url:self.scaledVideoURL)
}
Our VideoDocument class specifies writable UTTypes for video, specifies an initializer from a URL, and implements the fileWrapper method for writing:
class VideoDocument : FileDocument {
var filename:String?
var url:URL?
static var readableContentTypes: [UTType] { [UTType.movie, UTType.quickTimeMovie, UTType.mpeg4Movie] }
init(url:URL) {
self.url = url
filename = url.deletingPathExtension().lastPathComponent
}
required init(configuration: ReadConfiguration) throws {
}
func fileWrapper(configuration: WriteConfiguration) throws -> FileWrapper {
guard let url = self.url
else {
throw CocoaError(.fileWriteUnknown)
}
let fileWrapper = try FileWrapper(url: url)
return fileWrapper
}
}
3. ScaleVideo
Time scaling is performed by the ScaleVideo class on both the video frames and audio samples simultaneously:
ScaleVideo Writers
1. Video writer writeVideoOnQueue
Writes time scaled video frames:
This resampling method implements upsampling by repeating frames and downsampling by skipping frames of the video to stretch or contract it in time respectively.
2. Audio writer writeAudioOnQueue
Writes time scaled audio samples:
This method is based on the technique developed in the blog ScaleAudio. But rather than scale the whole audio file at once, as is done in ScaleAudio, we implement scaling in a progressive manner where audio is scaled, when it can be, as it is read from the file being scaled.
The ScaleVideo class is initialized with:
init?(path : String, desiredDuration: Float64, frameRate: Int32, destination: String, progress: @escaping (CGFloat, CIImage?) -> Void, completion: @escaping (URL?, String?) -> Void)
The frameDuration is set using the frameRate. It is used later to enforce the desired frame rate:
let scale:Int32 = 600
self.frameDuration = CMTime(value: 1, timescale: CMTimeScale(frameRate)).convertScale(scale, method: CMTimeRoundingMethod.default)
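For example, at 30 frames per second the frame duration converts to 20 units in the 600 timescale (1/30 s); at 24 fps it is 25/600 and at 60 fps it is 10/600:
import CoreMedia

let frameDuration30 = CMTime(value: 1, timescale: 30).convertScale(600, method: .default)
print(frameDuration30.value, frameDuration30.timescale) // 20 600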
The timeScaleFactor is set using the duration of the video and the desired duration:
self.timeScaleFactor = self.desiredDuration / CMTimeGetSeconds(videoAsset.duration)
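For example, with illustrative numbers (not from the project), a 10 second video scaled to a desired duration of 20 seconds yields a timeScaleFactor of 2, so every presentation time is pushed out to twice its original value and, as described below, frames are repeated to fill the gaps:
let videoDuration: Float64 = 10    // seconds
let desiredDuration: Float64 = 20  // seconds (slider factor 2)
let timeScaleFactor = desiredDuration / videoDuration // 2.0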
We also set the values of some properties we will need for audio scaling:
if let outputSettings = audioReaderSettings(),
let sampleBuffer = self.videoAsset.audioSampleBuffer(outputSettings:outputSettings),
let sampleBufferSourceFormat = CMSampleBufferGetFormatDescription(sampleBuffer),
let audioStreamBasicDescription = sampleBufferSourceFormat.audioStreamBasicDescription
{
outputBufferSize = sampleBuffer.numSamples
channelCount = Int(audioStreamBasicDescription.mChannelsPerFrame)
totalSampleCount = self.videoAsset.audioBufferAndSampleCounts(outputSettings).sampleCount
sourceFormat = sampleBufferSourceFormat
}
- outputBufferSize - The number of audio samples in a sample buffer
- channelCount - The number of audio channels
- totalSampleCount - The total number of audio samples
- sourceFormat - A format description CMAudioFormatDescription of the audio
The ciOrientationTransform is the transform we apply to video frames to display them with the proper orientation for progress feedback:
ciOrientationTransform = videoAsset.ciOrientationTransform()
ScaleVideo is a subclass of the VideoWriter class that performs the tasks which set up the AVFoundation methods for reading and writing video and audio sample buffers. The VideoWriter is not an abstract class and can actually be used to read and write a video as a passthrough copy of it, with sample buffers simply read and rewritten as is:
func testVideoWriter() {
let fm = FileManager.default
let docsurl = try! fm.url(for:.documentDirectory, in: .userDomainMask, appropriateFor: nil, create: true)
let destinationPath = docsurl.appendingPathComponent("DefaultVideoCopy.mov").path
let videoWriter = VideoWriter(path: kDefaultURL.path, destination: destinationPath, progress: { p, _ in
print("p = \(p)")
}, completion: { result, error in
print("result = \(String(describing: result))")
})
videoWriter?.start()
}
The ScaleVideo subclass overrides videoReaderSettings and videoWriterSettings since sample buffers for the video frames need to be decompressed in order to adjust their presentation times for time scaling. In videoReaderSettings color properties are specified if the input is HDR video:
// For HDR input specify SDR color properties in the videoReaderSettings
func isHDR() -> Bool {
let hdrTracks = videoAsset.tracks(withMediaCharacteristic: .containsHDRVideo)
return hdrTracks.count > 0
}
override func videoReaderSettings() -> [String : Any]? {
var settings:[String : Any]?
settings = [kCVPixelBufferPixelFormatTypeKey as String: NSNumber(value: kCVPixelFormatType_32BGRA as UInt32)]
if isHDR() {
settings?[AVVideoColorPropertiesKey]
= [AVVideoColorPrimariesKey:
AVVideoColorPrimaries_ITU_R_709_2,
AVVideoTransferFunctionKey:
AVVideoTransferFunction_ITU_R_709_2,
AVVideoYCbCrMatrixKey:
AVVideoYCbCrMatrix_ITU_R_709_2]
}
return settings
}
// Write compressed
override func videoWriterSettings() -> [String : Any]? {
return [AVVideoCodecKey : AVVideoCodecType.h264, AVVideoWidthKey : movieSize.width, AVVideoHeightKey : movieSize.height]
}
Two primary tasks are performed by the ScaleVideo method writeVideoAndAudio:
func writeVideoAndAudio() {
self.writeVideoOnQueue(self.videoQueue)
self.writeAudioOnQueue(self.audioQueue)
}
The overridden writeVideoOnQueue and writeAudioOnQueue are reimplementations of the VideoWriter methods that perform the time scaling of video and audio. Let’s now discuss these method overrides in detail.
1. Video writer writeVideoOnQueue
This resampling method implements upsampling by repeating frames and downsampling by skipping frames of the video to stretch or contract it in time respectively.
override func writeVideoOnQueue(_ serialQueue: DispatchQueue) {
guard self.videoReader.startReading() else {
self.finishVideoWriting()
return
}
var lastPercent:CGFloat = 0
videoWriterInput.requestMediaDataWhenReady(on: serialQueue) {
while self.videoWriterInput.isReadyForMoreMediaData, self.writingVideoFinished == false {
if self.currentIndex == 0 {
lastPercent = self.copyNextSampleBufferForResampling(lastPercent: lastPercent)
}
guard self.isCancelled == false else {
self.videoReader?.cancelReading()
self.finishVideoWriting()
return
}
guard self.sampleBuffer != nil else {
self.finishVideoWriting()
return
}
autoreleasepool { () -> Void in
if self.currentTime <= self.sampleBufferPresentationTime {
if self.appendNextSampleBufferForResampling() {
self.currentTime = CMTimeAdd(self.currentTime, self.frameDuration)
}
else {
self.sampleBuffer = nil
}
}
else {
lastPercent = self.copyNextSampleBufferForResampling(lastPercent: lastPercent)
}
}
}
}
}
First we check for cancellation with isCancelled, which is set to true when the user taps the Cancel button in the ScaleProgressView, or for whether the next sampleBuffer is nil, either of which stops video writing by calling finishVideoWriting:
guard self.isCancelled == false else {
self.videoReader?.cancelReading()
self.finishVideoWriting()
return
}
guard self.sampleBuffer != nil else {
self.finishVideoWriting()
return
}
ScaleVideo has a property currentTime which keeps track of the current video time at which the scaled video frames will be inserted.
Each time we read a sample buffer in copyNextSampleBufferForResampling to append to the output video, its time scaled presentation time is stored in the property sampleBufferPresentationTime.
We check if this scaled sampleBufferPresentationTime occurs after the currentTime, and append it with appendNextSampleBufferForResampling if it does, but with its time stamp adjusted to match the currentTime; otherwise we move on to the next sample buffer with copyNextSampleBufferForResampling:
if self.currentTime <= self.sampleBufferPresentationTime {
if self.appendNextSampleBufferForResampling() {
self.currentTime = CMTimeAdd(self.currentTime, self.frameDuration)
}
else {
self.sampleBuffer = nil
}
}
else {
lastPercent = self.copyNextSampleBufferForResampling(lastPercent: lastPercent)
}
This process repeats or skips sample buffers as needed, and the currentTime is incremented by the frameDuration whenever a sample buffer is appended:
self.currentTime = CMTimeAdd(self.currentTime, self.frameDuration)
Moreover, as we will see in appendNextSampleBufferForResampling, sample buffers are ensured to have presentation times that match the currentTime so that the video is written with time scaling while matching the desired frame rate, or frameDuration.
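To make the repeat/skip behavior concrete, here is a small self-contained sketch of the same idea using plain Doubles in place of sample buffers (an illustration only, not the project code):
// Source frame times are scaled by the time scale factor; the output walks
// forward in steps of frameDuration and either repeats or skips source frames.
func resample(sourceFrameTimes: [Double], timeScaleFactor: Double, frameDuration: Double) -> [(Double, Double)] {
    guard let firstTime = sourceFrameTimes.first else { return [] }
    var output: [(Double, Double)] = []   // (output time, source frame time)
    var currentTime = 0.0
    var index = 0
    var scaledTime = firstTime * timeScaleFactor
    while index < sourceFrameTimes.count {
        if currentTime <= scaledTime {
            // Append the current source frame at currentTime (frames repeat when slowing down).
            output.append((currentTime, sourceFrameTimes[index]))
            currentTime += frameDuration
        } else {
            // Move on to the next source frame (frames are skipped when speeding up).
            index += 1
            if index < sourceFrameTimes.count {
                scaledTime = sourceFrameTimes[index] * timeScaleFactor
            }
        }
    }
    return output
}

// Slowing down 2x: source frames are repeated to fill the longer timeline.
print(resample(sourceFrameTimes: [0, 1, 2, 3], timeScaleFactor: 2, frameDuration: 1))
// Speeding up 2x: every other source frame is skipped.
print(resample(sourceFrameTimes: [0, 1, 2, 3], timeScaleFactor: 0.5, frameDuration: 1))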
Write Video
Let’s look at the two methods referred to in writeVideoOnQueue:
A. copyNextSampleBufferForResampling - copy a time scaled video frame
B. appendNextSampleBufferForResampling - append a time scaled video frame
A. copyNextSampleBufferForResampling - copy a time scaled video frame
While the AVAssetWriterInput videoWriterInput is ready to accept data we read video frame sample buffers with the method copyNextSampleBufferForResampling:
func copyNextSampleBufferForResampling(lastPercent:CGFloat) -> CGFloat {
self.sampleBuffer = nil
guard let sampleBuffer = self.videoReaderOutput?.copyNextSampleBuffer() else {
return lastPercent
}
self.sampleBuffer = sampleBuffer
if self.videoReaderOutput.outputSettings != nil {
var presentationTimeStamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
presentationTimeStamp = CMTimeMultiplyByFloat64(presentationTimeStamp, multiplier: self.timeScaleFactor)
if let adjustedSampleBuffer = sampleBuffer.setTimeStamp(time: presentationTimeStamp) {
self.sampleBufferPresentationTime = presentationTimeStamp
self.sampleBuffer = adjustedSampleBuffer
}
else {
self.sampleBuffer = nil
}
}
self.currentIndex += 1
let percent:CGFloat = min(CGFloat(self.currentIndex)/CGFloat(self.frameCount), 1.0)
self.cumulativeProgress += ((percent - lastPercent) * self.progressFactor)
self.progressAction(self.cumulativeProgress, self.sampleBuffer?.ciimage()?.transformed(by:ciOrientationTransform))
print(self.cumulativeProgress)
return percent
}
In this method a CMSampleBuffer sampleBuffer is read with copyNextSampleBuffer:
sampleBuffer = self.videoReaderOutput?.copyNextSampleBuffer()
Note that whenever the sampleBuffer property is nil, reading in writeVideoOnQueue stops:
guard self.sampleBuffer != nil else {
self.finishVideoWriting()
return
}
Next we check that outputSettings is not nil, a reminder that we need decompressed samples to modify the presentation times, and retrieve the presentation time with CMSampleBufferGetPresentationTimeStamp:
var presentationTimeStamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
The sample buffer presentation time is then scaled by the timeScaleFactor previously computed in init using CMTimeMultiplyByFloat64:
presentationTimeStamp = CMTimeMultiplyByFloat64(presentationTimeStamp, multiplier: self.timeScaleFactor)
Then the sample buffer time stamp is adjusted to the scaled value presentationTimeStamp and we set the sampleBuffer property to this adjusted sample buffer, recording its adjusted presentation time in sampleBufferPresentationTime:
if let adjustedSampleBuffer = sampleBuffer.setTimeStamp(time: presentationTimeStamp) {
self.sampleBufferPresentationTime = presentationTimeStamp
self.sampleBuffer = adjustedSampleBuffer
}
else {
self.sampleBuffer = nil
}
This code is using a CMSampleBuffer extension setTimeStamp, written specifically for sample buffers that are expected to be video frames:
func setTimeStamp(time: CMTime) -> CMSampleBuffer? {
var count: CMItemCount = 0
guard CMSampleBufferGetSampleTimingInfoArray(self, entryCount: 0, arrayToFill: nil, entriesNeededOut: &count) == noErr, count == 1 else {
return nil
}
let timingInfoArray = [CMSampleTimingInfo(duration: CMTime.invalid, presentationTimeStamp: time, decodeTimeStamp: CMTime.invalid)]
var sampleBuffer: CMSampleBuffer?
guard CMSampleBufferCreateCopyWithNewTiming(allocator: nil, sampleBuffer: self, sampleTimingEntryCount: count, sampleTimingArray: timingInfoArray, sampleBufferOut: &sampleBuffer) == noErr else {
return nil
}
return sampleBuffer
}
Finally the currentIndex is updated for progress monitoring, which also sends a progress image CIImage to the progress handler with proper orientation for display:
self.currentIndex += 1
let percent:CGFloat = min(CGFloat(self.currentIndex)/CGFloat(self.frameCount), 1.0)
self.cumulativeProgress += ((percent - lastPercent) * self.progressFactor)
self.progressAction(self.cumulativeProgress, self.sampleBuffer?.ciimage()?.transformed(by:ciOrientationTransform))
Since we have two threads running the scaling of audio and video, the progress value is modified by the progressFactor with value 0.5 so that each thread contributes to half of the progress.
B. appendNextSampleBufferForResampling - append a time scaled video frame
After copying a time scaled video frame the next step is to append it to the output with appendNextSampleBufferForResampling:
func appendNextSampleBufferForResampling() -> Bool {
var appended = false
if let sampleBuffer = self.sampleBuffer {
if self.currentTime != sampleBufferPresentationTime {
if let adjustedSampleBuffer = sampleBuffer.setTimeStamp(time: self.currentTime) {
appended = self.videoWriterInput.append(adjustedSampleBuffer)
}
}
else {
appended = self.videoWriterInput.append(sampleBuffer)
}
}
return appended
}
As noted previously the currentTime is the time the next sample buffer will be inserted into the output video.
The current sampleBuffer, which was time adjusted using the timeScaleFactor in copyNextSampleBufferForResampling, had its presentation time saved in the current sampleBufferPresentationTime.
If the sampleBuffer presentation time matches the current time we simply append it to the output:
appended = self.videoWriterInput.append(sampleBuffer)
But if it does not match the currentTime then we adjust the presentation time of the sample buffer to the currentTime, append it and update the currentTime by incrementing it by the frameDuration:
if self.currentTime != sampleBufferPresentationTime {
if let adjustedSampleBuffer = sampleBuffer.setTimeStamp(time: self.currentTime) {
appended = self.videoWriterInput.append(adjustedSampleBuffer)
}
}
2. Audio writer writeAudioOnQueue
This method implements scaling of the audio samples and is based on the technique of our blog ScaleAudio. It will be referred to as progressive scaling, in which audio is scaled in blocks, or subarrays of the whole, as the audio data is accumulated.
The discussion here focuses on how the method of ScaleAudio has been adapted to scaling audio progressively.
The method achieves the goal of having no discrepancy between scaling an array by joining scaled subarrays and scaling the whole array at once.
The simpler approach of only scaling each individual sample buffer returned by copyNextSampleBuffer may cause ‘clicking’ sounds in the scaled audio file due to such discrepancies. This is especially noticeable in pure tone audio, but can also be witnessed with small arrays of numbers.
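Here is a small numeric illustration of that discrepancy using plain linear interpolation over evenly spaced control points (a simplification: the project uses vDSP_vlintD and the smoothstep variant of the control ramp):
// Linearly resample `samples` to `length` values using evenly spaced control points.
func resampleLinearly(_ samples: [Double], to length: Int) -> [Double] {
    guard samples.count > 1, length > 1 else { return samples }
    return (0..<length).map { n in
        let control = Double(samples.count - 1) * Double(n) / Double(length - 1)
        let index = Int(control)
        let fraction = control - Double(index)
        let next = min(index + 1, samples.count - 1)
        return samples[index] + fraction * (samples[next] - samples[index])
    }
}

let ramp: [Double] = [0, 30, 60, 90]

// Scaling the whole array at once: a smooth ramp.
print(resampleLinearly(ramp, to: 8))
// [0.0, 12.857..., 25.714..., 38.571..., 51.428..., 64.285..., 77.142..., 90.0]

// Scaling each half separately and joining: a jump of 30 appears at the seam,
// which in audio data is heard as a click.
print(resampleLinearly([0, 30], to: 4) + resampleLinearly([60, 90], to: 4))
// [0.0, 10.0, 20.0, 30.0, 60.0, 70.0, 80.0, 90.0]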
The method is also concerned with reducing memory requirements by keeping only a minimal amount of audio and control data in memory while progressing: otherwise memory usage can grow quite large.
override func writeAudioOnQueue(_ serialQueue:DispatchQueue) {
let length = Int(Double(totalSampleCount) * self.timeScaleFactor)
guard let controlBlocks = ControlBlocks(length: length, count: totalSampleCount, size: outputBufferSize, smoothly: true) else {
self.finishAudioWriting()
return
}
guard let audioReader = self.audioReader, let audioWriterInput = self.audioWriterInput, let audioReaderOutput = self.audioReaderOutput, audioReader.startReading() else {
self.finishAudioWriting()
return
}
var arrays_to_scale = [[Int16]](repeating: [], count: channelCount)
var scaled_array:[Int16] = []
var nbrItemsRemoved:Int = 0
var nbrItemsToRemove:Int = 0
var block = controlBlocks.first()
controlBlocks.removeFirst()
func update_arrays_to_scale() {
if nbrItemsToRemove > arrays_to_scale[0].count {
nbrItemsRemoved += arrays_to_scale[0].count
nbrItemsToRemove = nbrItemsToRemove - arrays_to_scale[0].count
for i in 0...arrays_to_scale.count-1 {
arrays_to_scale[i].removeAll()
}
}
else if nbrItemsToRemove > 0 {
for i in 0...arrays_to_scale.count-1 {
arrays_to_scale[i].removeSubrange(0...nbrItemsToRemove-1)
}
nbrItemsRemoved += nbrItemsToRemove
nbrItemsToRemove = 0
}
}
func lastIndexAdjusted(_ array:[Double]) -> Int? {
guard array.count > 0, let last = array.last else {
return nil
}
var lastIndex = Int(trunc(last))
if last - trunc(last) > 0 {
lastIndex += 1
}
return lastIndex
}
func offsetBlock(_ block:[Double]?) -> [Double]? {
if let block = block {
return vDSP.add(-trunc(block[0]), block)
}
return nil
}
var lastPercent:CGFloat = 0
var bufferSamplesCount:Int = 0
audioWriterInput.requestMediaDataWhenReady(on: serialQueue) {
while audioWriterInput.isReadyForMoreMediaData, self.writingAudioFinished == false {
guard self.isCancelled == false else {
self.audioReader?.cancelReading()
self.finishAudioWriting()
return
}
if let sampleBuffer = audioReaderOutput.copyNextSampleBuffer() {
bufferSamplesCount += sampleBuffer.numSamples
if let bufferSamples = self.extractSamples(sampleBuffer) {
let channels = bufferSamples.extract_array_channels(channelCount: self.channelCount)
for i in 0...arrays_to_scale.count-1 {
arrays_to_scale[i].append(contentsOf: channels[i])
}
update_arrays_to_scale()
while true {
if let controlBlockOffset = offsetBlock(block), let indexAdjusted = lastIndexAdjusted(controlBlockOffset), indexAdjusted < arrays_to_scale[0].count {
var scaled_channels:[[Int16]] = []
for array_to_scale in arrays_to_scale {
scaled_channels.append(array_to_scale.scaleToD(control: controlBlockOffset))
}
if let scaled_channels_interleaved = self.interleave_arrays(scaled_channels) {
scaled_array.append(contentsOf: scaled_channels_interleaved)
}
block = controlBlocks.first()
if let controlBlock = block {
let controlBlockIndex = Int(trunc(controlBlock[0]))
nbrItemsToRemove = nbrItemsToRemove + (controlBlockIndex - nbrItemsRemoved)
update_arrays_to_scale()
controlBlocks.removeFirst()
}
}
else {
break
}
}
if scaled_array.count > 0 {
if let sourceFormat = self.sourceFormat, let scaledBuffer = self.sampleBufferForSamples(audioSamples: scaled_array, channelCount: self.channelCount, formatDescription: sourceFormat), audioWriterInput.append(scaledBuffer) == true {
scaled_array.removeAll()
}
else {
audioReader.cancelReading()
}
}
}
let percent = Double(bufferSamplesCount)/Double(self.totalSampleCount)
self.cumulativeProgress += ((percent - lastPercent) * self.progressFactor)
lastPercent = percent
self.progressAction(self.cumulativeProgress, nil)
print(self.cumulativeProgress)
}
else {
self.finishAudioWriting()
}
}
}
}
First we check for cancellation with isCancelled, which is set to true when the user taps the Cancel button in the ScaleProgressView, or for whether the next sampleBuffer is nil, either of which stops audio writing by calling finishAudioWriting:
guard self.isCancelled == false else {
self.audioReader?.cancelReading()
self.finishAudioWriting()
return
}
if let sampleBuffer = audioReaderOutput.copyNextSampleBuffer() {
...
}
else {
self.finishAudioWriting()
}
Then we begin writing scaled audio samples.
Write Audio
A. Prerequisites - Progressive scaling reuses methods of ScaleAudio
B. Memory Management - Discusses how memory usage is kept low with a virtual array
C. scaleToD - The method of ScaleAudio is adapted to progressive scaling
D. Progressive Scaling - Discusses how audio sample buffers are read, scaled and written sequentially
A. Prerequisites
The prerequisite for progressive audio scaling is the discussion of audio scaling in ScaleAudio, most notably how control points are used by the vDSP routine vDSP_vlint, and its counterpart vDSP_vlintD, in its method scaleToD.
These utility methods have been borrowed from ScaleAudio with no functional changes:
func extractSamples(_ sampleBuffer:CMSampleBuffer) -> [Int16]?
func interleave_arrays(_ arrays:[[Int16]]) -> [Int16]?
func sampleBufferForSamples(audioSamples:[Int16], channelCount:Int, formatDescription:CMAudioFormatDescription) -> CMSampleBuffer?
We also reuse the following two extensions on Array unchanged:
func extract_array_channel(channelIndex:Int, channelCount:Int) -> [Int16]?
func extract_array_channels(channelCount:Int) -> [[Int16]]
B. Memory Management
Audio is sampled at a high rate in order to reproduce frequencies up to the Nyquist frequency, so it doesn’t take long before the total number of audio samples for a video’s audio track becomes very large. Consequently the arrays of Int16 and Double values that store the audio samples and control points for scaling can become very large, and the following describes a technique to alleviate that.
The basic idea is to accumulate enough audio samples as they are being read to scale and fill one sample buffer for writing, and then dispose of them as processed.
This is achieved by using a virtual array for control points and by storing accumulated extracted samples in an array that has items removed after they have been scaled, to save memory that can otherwise grow quite large.
- ControlBlocks - A virtual array that calculates blocks of control points to pass to vDSP_vlintD via scaleToD
- update_arrays_to_scale - Removes audio samples from memory after they are processed
Let’s discuss each in more detail:
1. ControlBlocks
Scaling the audio samples in blocks requires some management of the control points we pass to the vDSP routine that linearly interpolates for scaling.
Since we process the audio in blocks we need the interpolating control points available as corresponding blocks in lieu of having the whole array of control points available as we did in ScaleAudio.
To that end the new ControlBlocks class is introduced:
class ControlBlocks {
var length:Int // length of controls array
var count:Int // length of array (controls are indexes into this array)
var size:Int // block sizes
var smoothly:Bool
var currentBlockIndex:Int = 0 // block start index into virtual array of count `length` controls
init?(length:Int, count:Int, size:Int, smoothly:Bool) {
guard length > 0, count > 0, size > 0 else {
return nil
}
self.length = length
self.count = count
self.size = size
self.smoothly = smoothly
}
func control(n:Int) -> Double { // n in 0...length-1
if length > 1, n == length-1 {
return Double(count-1)
}
if count == 1 || length == 1 {
return 0
}
if smoothly, length > count {
let denominator = Double(length - 1) / Double(count - 1)
let x = Double(n) / denominator
return floor(x) + simd_smoothstep(0, 1, simd_fract(x))
}
return Double(count - 1) * Double(n) / Double(length-1)
}
func removeFirst() {
currentBlockIndex += size
}
func first() -> [Double]? {
guard currentBlockIndex < length else {
return nil
}
let start = currentBlockIndex
let end = Swift.min(currentBlockIndex + size, length)
var block = [Double](repeating: 0, count: end-start)
for n in start...end-1 {
block[n-start] = control(n: n)
}
return block
}
func blocks() -> [[Double]] { // for testing
var blocks:[[Double]] = []
while let block = self.first() {
blocks.append(block)
self.removeFirst()
}
return blocks
}
}
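As a quick illustration (with assumed numbers), a small ControlBlocks instance produces the same increasing ramp of control points it would for the whole array, just delivered in blocks:
// Illustrative: control points for scaling 8 audio samples down to 4, delivered in blocks of 2.
if let controlBlocks = ControlBlocks(length: 4, count: 8, size: 2, smoothly: true) {
    print(controlBlocks.blocks()) // [[0.0, 2.333...], [4.666..., 7.0]]
    // The ramp still runs from 0 to count-1, exactly as if the whole control
    // array had been generated at once, but during scaling only one block
    // needs to be generated at a time.
}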
This class, a virtual array, is mimicking operations we can perform on a mutable array: namely first() and removeFirst().
It is simulating advancing through an array of blocks of control points by removing each block as it is used to scale, so the next block needed is the first.
Rather than create an array of all the control points as we do in ScaleAudio, the values from first() are returned as the values of the method control, which produces the same values as before for the whole control array.
Since we are processing audio samples as subarrays of the whole, and only those not yet processed, we need to adjust the control points returned by first() by their offset in the actual array to start at index 0 - since control points are indices into the array being scaled.
So we define a function in writeAudioOnQueue that does this, named offsetBlock, which adjusts the elements of an array by subtracting the integer part of the first element from all the elements:
func offsetBlock(_ block:[Double]?) -> [Double]? {
if let block = block {
return vDSP.add(-trunc(block[0]), block)
}
return nil
}
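For example, with an assumed control block whose points reference global sample indices starting at 5:
print(offsetBlock([5.25, 6.0, 6.75]) ?? []) // [0.25, 1.0, 1.75]
// The fractional interpolation weights are unchanged; the points now index the
// in-memory subarray whose first element corresponds to global index 5.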
2. update_arrays_to_scale
As audio samples are extracted from sample buffers during reading they are accumulated into mutable arrays, one for each channel:
var arrays_to_scale = [[Int16]](repeating: [], count: channelCount)
And the scaled channels are interleaved into scaled_array:
var scaled_array:[Int16] = []
To limit memory the arrays_to_scale have items removed periodically when it is determined previous values are no longer needed.
This requires some bookkeeping using the variables nbrItemsRemoved and nbrItemsToRemove:
var nbrItemsRemoved:Int = 0
var nbrItemsToRemove:Int = 0
And a method that performs this task when needed:
func update_arrays_to_scale() {
if nbrItemsToRemove > arrays_to_scale[0].count {
nbrItemsRemoved += arrays_to_scale[0].count
nbrItemsToRemove = nbrItemsToRemove - arrays_to_scale[0].count
for i in 0...arrays_to_scale.count-1 {
arrays_to_scale[i].removeAll()
}
}
else if nbrItemsToRemove > 0 {
for i in 0...arrays_to_scale.count-1 {
arrays_to_scale[i].removeSubrange(0...nbrItemsToRemove-1)
}
nbrItemsRemoved += nbrItemsToRemove
nbrItemsToRemove = 0
}
}
Since arrays_to_scale are collecting audio samples as sample buffers are read, and are only processed under certain conditions, this cleanup is performed at two locations: after new samples are added and after samples are processed.
Samples are processed when enough of them have been accumulated to fill a scaled audio buffer. Note that when scaling audio down, i.e. time scale factor < 1, more samples are needed to fill a buffer than when scaling up, i.e. time scale factor >1.
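A small worked trace of this bookkeeping, with assumed numbers:
// Suppose arrays_to_scale[0] currently holds 1024 samples and the samples at
// global indices 0..<3000 were already removed (nbrItemsRemoved == 3000).
// If the next control block starts at global index 3500:
//     nbrItemsToRemove += 3500 - 3000   // 500 more in-memory samples are stale
// update_arrays_to_scale() then removes the first 500 samples of each channel:
//     arrays_to_scale[i].removeSubrange(0...499)
//     nbrItemsRemoved == 3500, nbrItemsToRemove == 0
// Had the block started at global index 5000 instead, nbrItemsToRemove (2000)
// would exceed the 1024 samples in memory, so the arrays are emptied now and
// the remaining 976 are removed as later buffers arrive.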
Recall how control points are used, from ScaleAudio:
The integer part of each is the index into the array to be scaled, and the fractional part is used to perform a weighted average with the successive index.
As audio buffers are read for progressive scaling we need to ensure that enough audio samples are accumulated so that the current control block can be applied by scaleToD.
Since control points are references to elements in the array of audio samples, and they form an increasing ramp, it is the last control point that is used to determine if enough audio samples have been accumulated to apply scaleToD.
If the last control point has a fractional part then we need to make sure the accumulated samples include not only the audio sample referenced by the last control point but also the successive audio sample that will be averaged with it.
So given a block of control points the last audio sample index needed is the ceiling of the last control point.
The following method is defined to return that value:
func lastIndexAdjusted(_ array:[Double]) -> Int? {
guard array.count > 0, let last = array.last else {
return nil
}
var lastIndex = Int(trunc(last))
if last - trunc(last) > 0 {
lastIndex += 1
}
return lastIndex
}
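For example, with a hypothetical control block:
// The last control point 6.25 averages the samples at indices 6 and 7, so at
// least 8 samples (indices 0...7) must be accumulated before this block can be applied.
print(lastIndexAdjusted([4.0, 5.125, 6.25]) ?? -1) // 7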
C. scaleToD
Finally we adjust the scaleToD extension of Array in ScaleAudio with an argument for the control points to use with vDSP_vlintD. This way it can be used with each control point block to scale each corresponding subarray of audio samples:
extension Array where Element == Int16 {
func scaleToD(control:[Double]) -> [Element] {
let length = control.count
guard length > 0 else {
return []
}
let stride = vDSP_Stride(1)
var result = [Double](repeating: 0, count: length)
var double_array = vDSP.integerToFloatingPoint(self, floatingPointType: Double.self)
let lastControl = control[control.count-1]
let lastControlTrunc = Int(trunc(lastControl))
if lastControlTrunc > self.count - 2 {
let zeros = [Double](repeating: 0, count: lastControlTrunc - self.count + 2)
double_array.append(contentsOf: zeros)
}
vDSP_vlintD(double_array,
control, stride,
&result, stride,
vDSP_Length(length),
vDSP_Length(double_array.count))
return vDSP.floatingPointToInteger(result, integerType: Int16.self, rounding: .towardNearestInteger)
}
}
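A small worked example with illustrative values:
// Scale a 4 sample ramp with the control points [0, 1.5, 3]:
// the middle result averages the samples at indices 1 and 2: 20 + 0.5 * (30 - 20) = 25.
// The last control point references the last sample, so one zero is appended
// internally to satisfy the vDSP_vlintD index restriction discussed below.
let scaled = [Int16(10), 20, 30, 40].scaleToD(control: [0, 1.5, 3])
print(scaled) // [10, 25, 40]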
Recall that the need to pad with zeroes in scaleToD is explained in ScaleAudio, and is due to the restriction noted in the documentation of vDSP_vlint, namely:
“However, the integer parts of the values in B must be greater than or equal to zero and less than or equal to M - 2.”
(B is the control point array; M is the length, or count, of the array being scaled)
This technicality requires some discussion about why this zero padding has no effect here:
In the detailed discussion of control points in ScaleAudio it is explained why this padding had no effect on the result. Basically the restriction is to ensure that the array is long enough so that the last interpolating average using the fractional part of the last control point does not extend outside the range of the array being scaled.
In ScaleAudio it is always assured that the last control point never has a fractional part, so it does not matter if a 0 is appended to the array. But we must append the 0 anyway to satisfy the restriction since, by design, the last control point always references the last element of the array being scaled.
Moreover during progressive scaling we use the control point blocks to ensure the array is long enough so that it always includes an element for the last average, should it be needed when the last control point does have a fractional part: the lastIndexAdjusted function is used to ensure that. There will be no padding required for using scaleToD on any control point block but the last one.
If you set a breakpoint to see when the padding occurs you will find that it only occurs one time for each channel when the last batch of audio samples is processed, just as it is at the end in ScaleAudio.
D. Progressive Scaling
With memory management and other preparation outlined above the core operation of writeAudioOnQueue can now be described.
Recall these values, needed now, were set in the ScaleVideo init:
- outputBufferSize - The number of audio samples in a sample buffer
- channelCount - The number of audio channels
- totalSampleCount - The total number of audio samples
- sourceFormat - A format description CMAudioFormatDescription of the audio
We use totalSampleCount and outputBufferSize to define controlBlocks:
let length = Int(Double(totalSampleCount) * self.timeScaleFactor)
guard let controlBlocks = ControlBlocks(length: length, count: totalSampleCount, size: outputBufferSize, smoothly: true) else {
self.finishAudioWriting()
return
}
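For a sense of scale, with assumed numbers not taken from the project, 10 seconds of audio at 44.1 kHz slowed down by a factor of 2 gives:
let totalSampleCount = 441_000   // samples in the source audio track
let timeScaleFactor = 2.0        // slow down: 10 s becomes 20 s
let outputBufferSize = 8_192     // samples per buffer read from the asset
let length = Int(Double(totalSampleCount) * timeScaleFactor) // 882,000 control points
// ControlBlocks serves those 882,000 points in blocks of 8,192, so only one
// block, plus the samples it references, needs to be held in memory at a time.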
The process of scaling begins with reading the next sample buffer, keeping track of the total number of samples for progress, and extracting the audio samples:
if let sampleBuffer = audioReaderOutput.copyNextSampleBuffer() {
bufferSamplesCount += sampleBuffer.numSamples
if let bufferSamples = self.extractSamples(sampleBuffer) {
We then extract the interleaved channels with extract_array_channels and add them to the arrays_to_scale:
let channels = bufferSamples.extract_array_channels(channelCount: self.channelCount)
for i in 0...arrays_to_scale.count-1 {
arrays_to_scale[i].append(contentsOf: channels[i])
}
When elements are added to arrays_to_scale we try some cleanup to reduce memory:
update_arrays_to_scale()
The applicability of update_arrays_to_scale depends on what type of scaling is occurring: scaling down requires more samples than scaling up.
Next we have a while loop that performs the progressive scaling, scaling each channel and interleaving as we do in ScaleAudio:
while true {
if let controlBlockOffset = offsetBlock(block), let indexAdjusted = lastIndexAdjusted(controlBlockOffset), indexAdjusted < arrays_to_scale[0].count {
var scaled_channels:[[Int16]] = []
for array_to_scale in arrays_to_scale {
scaled_channels.append(array_to_scale.scaleToD(control: controlBlockOffset))
}
if let scaled_channels_interleaved = self.interleave_arrays(scaled_channels) {
scaled_array.append(contentsOf: scaled_channels_interleaved)
}
block = controlBlocks.first()
if let controlBlock = block {
let controlBlockIndex = Int(trunc(controlBlock[0]))
nbrItemsToRemove = nbrItemsToRemove + (controlBlockIndex - nbrItemsRemoved)
update_arrays_to_scale()
controlBlocks.removeFirst()
}
}
else {
break
}
}
This loop is scaling blocks of the currently accumulated audio samples with scaleToD.
First it offsets the control points in block - they are offset because the array of samples to scale always starts at index 0, since we remove them with update_arrays_to_scale as they are processed:
let controlBlockOffset = offsetBlock(block)
controlBlockOffset is the first set of offset control points that have not yet been used to scale.
It can only be used to scale audio samples in arrays_to_scale if the last index referenced by the control points is included in the samples, i.e. if we have accumulated enough samples:
let indexAdjusted = lastIndexAdjusted(controlBlockOffset), indexAdjusted < arrays_to_scale[0].count
When these conditions are met we perform the scaling on each channel:
var scaled_channels:[[Int16]] = []
for array_to_scale in arrays_to_scale {
scaled_channels.append(array_to_scale.scaleToD(control: controlBlockOffset))
}
Then accumulate the interleaved scaled channels into the scaled_array:
if let scaled_channels_interleaved = self.interleave_arrays(scaled_channels) {
scaled_array.append(contentsOf: scaled_channels_interleaved)
}
Now load the next block of controls:
block = controlBlocks.first()
Finally clean up the arrays_to_scale if necessary:
if let controlBlock = block {
let controlBlockIndex = Int(trunc(controlBlock[0]))
nbrItemsToRemove = nbrItemsToRemove + (controlBlockIndex - nbrItemsRemoved)
update_arrays_to_scale()
controlBlocks.removeFirst()
}
The nbrItemsToRemove must be updated by the number of items processed, which can be determined from the index referenced by the first element of the current control block array:
controlBlockIndex - nbrItemsRemoved
Then update the arrays:
update_arrays_to_scale()
And remove the first control block:
controlBlocks.removeFirst()
When the while loop is exited we have enough scaled samples to write them to the output video file:
if scaled_array.count > 0 {
if let sourceFormat = self.sourceFormat, let scaledBuffer = self.sampleBufferForSamples(audioSamples: scaled_array, channelCount: self.channelCount, formatDescription: sourceFormat), audioWriterInput.append(scaledBuffer) == true {
scaled_array.removeAll()
}
else {
audioReader.cancelReading()
}
}
After the buffer is written to the file the scaled_array is cleared completely for the next cycle, until there are no more audio sample buffers to read.
Conclusion
By extending the audio scaling technique in ScaleAudio to handle arbitrary duration audio we developed a class ScaleVideo that can be used to time scale videos uniformly by a given scale factor.
The ScaleVideo initializer init:
init?(path : String, desiredDuration: Float64, frameRate: Int32, destination: String, progress: @escaping (CGFloat, CIImage?) -> Void, completion: @escaping (URL?, String?) -> Void)
Arguments:
- path: String - The path of the video file to be scaled.
- desiredDuration: Float64 - The desired duration in seconds of the scaled video.
- frameRate: Int32 - The desired frame rate of the scaled video.
- destination: String - The path of the scaled video file.
- progress - A handler that is periodically executed to send progress images and values.
- completion - A handler that is executed when the operation has completed to send a message of success or not.
Example usage is provided in the code:
func testScaleVideo() {
let fm = FileManager.default
let docsurl = try! fm.url(for:.documentDirectory, in: .userDomainMask, appropriateFor: nil, create: true)
let destinationPath = docsurl.appendingPathComponent("DefaultVideoScaled.mov").path
let scaleVideo = ScaleVideo(path: kDefaultURL.path, desiredDuration: 8, frameRate: 30, destination: destinationPath) { p, _ in
print("p = \(p)")
} completion: { result, error in
print("result = \(String(describing: result))")
}
scaleVideo?.start()
}