ISDK-2241: Co-Viewing example with custom AudioDevice #325

Open · wants to merge 97 commits into master

Conversation

@ceaglest (Contributor) commented Nov 8, 2018

Summary

This PR adds a new Co-Viewing example.

The sample app is mostly functional, but the UI and README.md are not yet complete. These items have separate tickets and will be addressed after this PR (do we need a feature branch?).

The app is hardcoded with streamable content for presenters (see TODOs). Launching the app and tapping "Presenter" selects the hardcoded remote content URL. However, the app can also open the following document types:

public.mpeg-4
com.apple.quicktime-movie

If you encounter one of these types of videos anywhere on your iOS device (for example, in the Files app or Dropbox), you can open it with Co-Viewing. This immediately connects to a Room as a Presenter.
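
A rough sketch of what that entry point could look like; the file-extension check and the hand-off helper are illustrative assumptions, not the exact code in this PR:

```swift
import UIKit
import AVFoundation

// Hypothetical sketch of the document-opening entry point.
class AppDelegate: UIResponder, UIApplicationDelegate {
    var window: UIWindow?

    func application(_ app: UIApplication,
                     open url: URL,
                     options: [UIApplication.OpenURLOptionsKey: Any] = [:]) -> Bool {
        // The URL points at an .mp4/.m4v/.mov file opened from Files, Dropbox, etc.
        guard ["mp4", "m4v", "mov"].contains(url.pathExtension.lowercased()) else {
            return false
        }
        presentPresenterFlow(with: AVPlayer(url: url))
        return true
    }

    private func presentPresenterFlow(with player: AVPlayer) {
        // Hypothetical: show the Presenter UI and immediately connect to a Room
        // with `player` (see the Design sketch below).
    }
}
```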

Design

The Co-Viewing app has both Presenter and Viewer roles. A Viewer is a fairly standard Participant who shares their camera and microphone. A Presenter shares their camera, the video content, and a single audio track containing both microphone and player audio.

[Diagram: Co-Viewing architecture]

[Diagram: Co-Viewing viewer architecture]

Please refer to the internal design doc for more info. I will also add a more detailed diagram of the audio pipeline.
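
For a rough idea of how the Presenter's tracks fit together, here is a minimal sketch assuming the TwilioVideo 2.x API and the ExampleAVPlayerAudioDevice / ExampleAVPlayerSource classes added in this PR; the initializer arguments, track names, and helper signature are assumptions, and error handling is omitted:

```swift
import AVFoundation
import TwilioVideo

// A sketch only, not the exact code in this PR.
func connectAsPresenter(with player: AVPlayer,
                        accessToken: String,
                        delegate: TVIRoomDelegate) -> TVIRoom {
    // The custom device plays remote Participant audio and captures a mix of the
    // microphone and the AVPlayer's audio. It must be installed before other SDK calls.
    TwilioVideo.audioDevice = ExampleAVPlayerAudioDevice()

    // Camera track, just as a regular Participant would share.
    let camera = TVICameraCapturer(source: .frontCamera)!
    let cameraTrack = TVILocalVideoTrack(capturer: camera, enabled: true, constraints: nil, name: "camera")!

    // Content track, fed with frames pulled from the AVPlayerItem's video output.
    let contentSource = ExampleAVPlayerSource(item: player.currentItem!)
    let contentTrack = TVILocalVideoTrack(capturer: contentSource, enabled: true, constraints: nil, name: "content")!

    // A single audio track carrying both microphone and player audio via the custom device.
    let audioTrack = TVILocalAudioTrack(options: nil, enabled: true, name: "mixed-audio")!

    let options = TVIConnectOptions(token: accessToken) { builder in
        builder.roomName = "co-viewing"
        builder.videoTracks = [cameraTrack, contentTrack]
        builder.audioTracks = [audioTrack]
    }
    // The returned TVIRoom must be retained for the lifetime of the connection.
    return TwilioVideo.connect(with: options, delegate: delegate)
}
```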

Limitations

The world of video playback is complex, and AVPlayer seamlessly handles a lot of different content types. Not all of these types are compatible with CoViewing for various reasons. For example, don't expect to be able to stream any FairPlay encrypted content.

The following table lists what kind of content I've tested.

| Content Type | Audio Tap | Video Output | Notes |
| --- | --- | --- | --- |
| Local (.mp4, .m4v, .mov) | | | Content transforms are not handled for portrait video. |
| Progressive Stream (.mp4, .m4v, .mov) | | | |
| HTTP Live Stream (.m3u8) | | | MTAudioProcessingTap is prepared, but doesn't receive samples. |
| FairPlay Encrypted (remote) | | | Untested, known not to work. |
| FairPlay Encrypted (local) | | | Untested. |
| HDR10, Dolby Vision | | | Untested. We request 8-bit 4:2:0 NV12 buffers. |

About HLS

If you really want to stream HLS content, there might be a way. An interested developer could perform several steps to convert a live stream into a progressive-download mp4.

  1. For a given HLS URI, replace https with a custom scheme appscheme.
  2. Provide a custom AVAssetResourceLoader.
  3. Parse the m3u8 playlist.
  4. If it is a master playlist, choose a quality level which is appropriate. Otherwise skip to step 6.
  5. Parse the m3u8 playlist for your selected quality level.
  6. Fetch the transport stream (.ts) or fragmented mp4 (.fmp4) segments in response to AVPlayer's requests.
  7. Demux and Remux to .mp4.

This could be done with heavy dependencies like ffmpeg (remuxing is a one-liner in the CLI), or with lighter open source projects. You could also imagine an offline version of the same process, which downloads all of the segments and then assembles a playable .mp4 file. A minimal sketch of steps 1–2 is shown below.
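
A minimal sketch of steps 1–2, assuming a hypothetical PlaylistResourceLoader delegate class; the playlist parsing and remuxing of steps 3–7 are only indicated by comments:

```swift
import AVFoundation

// Intercepts the custom "appscheme" URLs so requests route through our delegate.
class PlaylistResourceLoader: NSObject, AVAssetResourceLoaderDelegate {
    func resourceLoader(_ resourceLoader: AVAssetResourceLoader,
                        shouldWaitForLoadingOfRequestedResource loadingRequest: AVAssetResourceLoadingRequest) -> Bool {
        guard let url = loadingRequest.request.url,
              var components = URLComponents(url: url, resolvingAgainstBaseURL: false) else {
            return false
        }
        // Restore the real scheme before fetching the playlist or segment.
        components.scheme = "https"
        let realURL = components.url!

        URLSession.shared.dataTask(with: realURL) { data, _, error in
            if let data = data {
                // Steps 3–7 would go here: parse the m3u8, fetch the .ts / .fmp4
                // segments, and remux them into a progressive .mp4 before responding.
                loadingRequest.dataRequest?.respond(with: data)
                loadingRequest.finishLoading()
            } else {
                loadingRequest.finishLoading(with: error)
            }
        }.resume()
        return true
    }
}

// Usage: replace "https" with the custom scheme so the delegate is consulted.
let hlsURL = URL(string: "appscheme://example.com/stream/master.m3u8")!
let asset = AVURLAsset(url: hlsURL)
let loaderDelegate = PlaylistResourceLoader()
asset.resourceLoader.setDelegate(loaderDelegate, queue: DispatchQueue(label: "resource-loader"))
let item = AVPlayerItem(asset: asset)
```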

However, I don't know how Apple's App Store reviewers would feel about such a technique. Ultimately, it would be great if they fixed the bug with AVPlayer and MTAudioProcessingTap, or provided sample code which demonstrates how to use it with HLS content.

TODO

  • Finish sample rate conversion for recording.
  • Decide on final mixing solution for recording.
  • Revisit test content.

ceaglest and others added 30 commits October 29, 2018 16:48
* Need to add KVO.
* Use AVPlayerItemVideoOutput, and CADisplayLink to pull frames.
* TODO - Set the MTAudioTapProcessor.
* Still need to figure out how to consume frames properly.
* TODO - Audio is heavily distorted.
* Add a rendering method for WebRTC audio.
* Hook up both audio tap and rendering inputs to the mixer.
* In this example we don't need any fixed size buffers or other pre-allocated resources. We will simply write
* directly to the AudioBufferList provided in the AudioUnit's rendering callback.
*/
return YES;
Contributor Author:
Comment needs to be updated.

import Foundation
import MediaToolbox

class ExampleAVPlayerAudioTap {
Contributor Author:
This class can probably be removed, unless it's useful to demonstrate why we didn't write the audio code in Swift.

Contributor:
I guess let's remove it if it is not used.

attributes = [
kCVPixelBufferPixelFormatTypeKey as String : kCVPixelFormatType_420YpCbCr8BiPlanarFullRange
] as [String : Any]
}
Contributor Author:
We should explicitly add kCVPixelBufferIOSurfacePropertiesKey. I was doing it wrong before.

kCVPixelBufferIOSurfacePropertiesKey as String : [:]
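
For context, a short sketch of how these attributes, with the IOSurface key included, might be fed to the AVPlayerItemVideoOutput that pulls NV12 frames (a sketch only, not the exact diff):

```swift
import AVFoundation
import CoreVideo

// Pixel buffer attributes requesting 8-bit 4:2:0 NV12, backed by IOSurface.
let attributes: [String: Any] = [
    kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_420YpCbCr8BiPlanarFullRange,
    kCVPixelBufferIOSurfacePropertiesKey as String: [:]
]
let videoOutput = AVPlayerItemVideoOutput(pixelBufferAttributes: attributes)
```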

platform :ios, '11.0'
project 'CoViewingExample.xcproject'

pod 'TPCircularBuffer', '~> 1.6'
Contributor Author:
Just to call it out, this PR is adding a dependency which is only used by the new example.

}
self.capturingContext->deviceContext = context;
self.capturingContext->maxFramesPerBuffer = _capturingFormat.framesPerBuffer;
self.capturingContext->audioBuffer = _captureBuffer;
Contributor Author:
Lines 355-361 should probably be moved into initializeCapturer().

@ceaglest (Contributor Author) commented Nov 9, 2018

Okay, so sample rate conversion is working. The next step is to revisit recording mixing. In the current commit you just get the player audio in stereo due to an early return for testing.


@piyushtank (Contributor) left a comment
Reviewed the Playback side of changes. Reviewing the app and processing tap next, and then the capturing side. 😅

ALWAYS_SEARCH_USER_PATHS = NO;
CLANG_ANALYZER_NONNULL = YES;
CLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;
CLANG_CXX_LANGUAGE_STANDARD = "gnu++14";
Contributor:
Do we want to change this in the sample app?

Contributor Author:
Let's use C++14 to be safe.

ALWAYS_SEARCH_USER_PATHS = NO;
CLANG_ANALYZER_NONNULL = YES;
CLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;
CLANG_CXX_LANGUAGE_STANDARD = "gnu++14";
Contributor:
same here

GCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;
GCC_WARN_UNUSED_FUNCTION = YES;
GCC_WARN_UNUSED_VARIABLE = YES;
IPHONEOS_DEPLOYMENT_TARGET = 11.0;
Contributor:
Is this OK?

Contributor Author:
So far this is all I have tested on, and can validate. We should talk about what versions we want to support in the example.

@class ExampleAVPlayerAudioDevice;

typedef struct ExampleAVPlayerAudioTapContext {
__weak ExampleAVPlayerAudioDevice *audioDevice;
Contributor:
id<TVIAudioDevice> ?

Contributor Author:
I think we need some sort of callback protocol here. If it was just TVIAudioDevice I wouldn't know what custom method to call.
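
Purely as an illustration of the idea, a hypothetical callback protocol could look something like this; the protocol and method names are invented and are not part of this PR:

```swift
import CoreAudio

// Illustrative sketch of a tap-to-device callback protocol.
protocol AVPlayerAudioTapDelegate: AnyObject {
    // Called once the MTAudioProcessingTap knows the source format.
    func audioTapDidPrepare(sourceFormat: AudioStreamBasicDescription)
    // Called for each batch of PCM frames produced by the tap.
    func audioTap(didProduce bufferList: UnsafeMutablePointer<AudioBufferList>, frameCount: UInt32)
}
```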

@property (nonatomic, strong, nullable) TVIAudioFormat *renderingFormat;
@property (nonatomic, assign, readonly) BOOL wantsAudio;
@property (nonatomic, assign) BOOL wantsCapturing;
@property (nonatomic, assign) BOOL wantsRendering;
Contributor:
It would be nice to have documentation around wantsAudio, wantsCapturing and wantsRendering.

audioUnitDescription.componentFlags = 0;
audioUnitDescription.componentFlagsMask = 0;
return audioUnitDescription;
}
Contributor:
Is this method being used anywhere?

if (status != noErr) {
NSLog(@"Could not set stream format on the input bus!");
AudioComponentInstanceDispose(_voiceProcessingIO);
_voiceProcessingIO = NULL;
Contributor:
Should we dispose of the mixer as well? Perhaps we could have a strategy: set up the mixer first, and if _voiceProcessingIO runs into an error, dispose of the mixer too. WDYT?

AudioStreamBasicDescription playerFormatDescription = renderingFormatDescription;
if (self.renderingContext->playoutBuffer) {
playerFormatDescription.mSampleRate = self.audioTapContext->sourceFormat.mSampleRate;
}
Contributor:
A comment would be nice explaining why we are adapting to the buffer (AVPlayer) format.

CoViewingExample/ViewController.swift — outdated comment thread (resolved)
}
}

- (void)handleRouteChange:(NSNotification *)notification {
Contributor:
Did you get a chance to test this?

Contributor Author:
Not yet, though I do see route change events being fired during normal operation (ones that don't trigger a restart).

AudioConverterReset(context->captureFormatConverter);
context->captureFormatConvertIsPrimed = NO;
}
}
Contributor:
This file has format conversion code as well. We will be using ExampleAVPlayerProcessingTap.m in the AVAudioEngine audio device too, where the format conversion will not be required. Can we move the audio format conversion code outside this file? Another option is to have a processing tap per audio device, in which case we would need to add a layer to the folder structure inside AudioDevices.

Contributor:
Also, I noticed that the audio capturing side stops working when the app goes into the background. It resumes when coming back into the foreground.

// Adjust for what the format converter actually produced, in case it was different than what we asked for.
producerBufferList->mBuffers[0].mDataByteSize = ioPacketSize * bytesPerFrameOut;
// printf("Output was: %d packets / %d bytes. Consumed input packets: %d. Cached input packets: %d.\n",
// ioPacketSize, ioPacketSize * bytesPerFrameOut, context.sourcePackets, context.cachePackets);
Contributor:
There is some commented-out code in this file, but I guess you are aware of it and working on it.


if (status != kCVReturnSuccess) {
// TODO
return;
Contributor:
Do we need to delete the context / clean up?

if !output.hasNewPixelBuffer(forItemTime: itemTimestamp) {
// TODO: Consider suspending the timer and requesting a notification when media becomes available.
// print("No frame for host timestamp:", CACurrentMediaTime(), "\n",
// "Last presentation timestamp was:", lastPresentationTimestamp != nil ? lastPresentationTimestamp! : CMTime.zero)
Contributor:
FYI - this file has commented-out logs.

Contributor Author:
In this case, we should log once, suspend the timer and restart it later. This should be an easy fix. 👍
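
A sketch of that approach, assuming the frame pulls are driven by a CADisplayLink: pause the link when no frames are available and ask the video output to notify us when media data arrives again. The class and property names are illustrative; the real logic would live in ExampleAVPlayerSource.

```swift
import AVFoundation
import UIKit

class FramePuller: NSObject, AVPlayerItemOutputPullDelegate {
    let output: AVPlayerItemVideoOutput
    let queue = DispatchQueue(label: "frame-puller")
    var displayLink: CADisplayLink?

    init(output: AVPlayerItemVideoOutput) {
        self.output = output
        super.init()
        output.setDelegate(self, queue: queue)
    }

    func suspendPulls() {
        // Called when hasNewPixelBuffer(forItemTime:) keeps returning false.
        displayLink?.isPaused = true
        output.requestNotificationOfMediaDataChange(withAdvanceInterval: 0.1)
    }

    func outputMediaDataWillChange(_ sender: AVPlayerItemOutput) {
        // New media data is on the way; resume pulling on the next display refresh.
        displayLink?.isPaused = false
    }
}
```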

public var supportedFormats: [TVIVideoFormat] {
get {
let format = TVIVideoFormat()
format.dimensions = CMVideoDimensions(width: 640, height: 360)
Contributor:
A const would be nice.

import AVFoundation
import TwilioVideo

class ExampleAVPlayerSource: NSObject, TVIVideoCapturer {
Contributor:
I feel documentation around scheduling, sample queue and logic would be nice.

attributes: DispatchQueue.Attributes(rawValue: 0),
autoreleaseFrequency: DispatchQueue.AutoreleaseFrequency.workItem,
target: nil)
super.init()
Contributor:
Should super.init() be the first call?

@ceaglest (Contributor Author) commented Nov 10, 2018
The non-null sample queue needs to be set before initializing super, or else the compiler complains. I'm not an expert at initializing Swift classes, so this might be wrong.
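
For reference, a minimal sketch of why the assignment has to come first under Swift's two-phase initialization; the class and label names are illustrative:

```swift
import Foundation

class ExampleSource: NSObject {
    private let sampleQueue: DispatchQueue

    override init() {
        // Every stored property must have a value before delegating up to NSObject;
        // referencing `self` or calling super.init() before this point is a compile error.
        sampleQueue = DispatchQueue(label: "com.example.video-samples",
                                    qos: .userInteractive,
                                    attributes: DispatchQueue.Attributes(rawValue: 0),
                                    autoreleaseFrequency: .workItem,
                                    target: nil)
        super.init()
    }
}
```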

// Configure access token either from server or manually.
// If the default wasn't changed, try fetching from server.
if (accessToken == "TWILIO_ACCESS_TOKEN") {
let urlStringWithRole = tokenUrl + "?identity=" + name
Contributor:
Let's remove the Rooms demo URL specific logic from here.

CoViewingExample/ViewController.swift — outdated comment thread (resolved)
@srinivasdr commented Nov 4, 2020

@ceaglest When I run the above example, only the presenter's AVPlayer audio is audible on the viewer side, not the presenter's microphone audio. How can I make both the microphone and AVPlayer audio audible/shared, so that the viewer can hear both the presenter's microphone and AVPlayer audio?
