This article records what I learned, and the pitfalls I hit, while building a new feature: letting viewers record short clips of a live stream. I'm sharing it here.
Requirements
On the viewer side, the user should be able to record a short clip of the stream that is currently playing. At the same time, the on-screen interactions (gifts, chat, danmu comments, and so on) have to be captured and composited into the recorded video together with the stream.
Background
The player is Qiniu's PLPlayerKit. While playing a stream, the framework exposes the decoded data through two callback methods.
/**
 Callback for the frame that is about to be rendered.
 Only available for live streaming.

 @param player          the PLPlayer object issuing the callback
 @param frame           the YUV data of the frame about to be rendered.
                        Use CVPixelBufferGetPixelFormatType to get the YUV type:
                        software decoding yields kCVPixelFormatType_420YpCbCr8Planar,
                        hardware decoding yields kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange.
 @param pts             presentation timestamp, in ms
 @param sarNumerator    sar numerator
 @param sarDenominator  sar denominator
                        (sar stands for storage aspect ratio, the display ratio of the video stream)

 @discussion sarNumerator = 0 means the parameter is invalid.

 @since v2.4.3
 */
- (void)player:(nonnull PLPlayer *)player willRenderFrame:(nullable CVPixelBufferRef)frame pts:(int64_t)pts sarNumerator:(int)sarNumerator sarDenominator:(int)sarDenominator;
/**
 Callback for audio data.

 @param player                  the PLPlayer object issuing the callback
 @param audioBufferList         the audio data
 @param audioStreamDescription  the audio format description
 @param pts                     presentation timestamp: the time, relative to the SCR (system clock
                                reference), at which the decoder should present the frame. The SCR can
                                be thought of as the time at which the decoder should start reading
                                the data from disk.
 @param sampleFormat            sample format, see the PLPlayerAVSampleFormat enum
 @return audioBufferList        the audio data

 @since v2.4.3
 */
- (nonnull AudioBufferList *)player:(nonnull PLPlayer *)player willAudioRenderBuffer:(nonnull AudioBufferList *)audioBufferList asbd:(AudioStreamBasicDescription)audioStreamDescription pts:(int64_t)pts sampleFormat:(PLPlayerAVSampleFormat)sampleFormat;
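For context, hooking the video callback up to the recording pipeline described below could look roughly like the sketch here. The recorder property and isRecording flag are placeholder names of my own, not part of PLPlayerKit; addVideoPixelBuffer:pts:fps: appears later in this post.

// Sketch only: forward each decoded frame to a hypothetical recorder object.
- (void)player:(nonnull PLPlayer *)player willRenderFrame:(nullable CVPixelBufferRef)frame
           pts:(int64_t)pts sarNumerator:(int)sarNumerator sarDenominator:(int)sarDenominator
{
    if (frame == NULL || !self.isRecording) {
        return;
    }
    // Pass the YUV frame and its pts (ms) on; the fps value here is assumed to match the stream.
    [self.recorder addVideoPixelBuffer:frame pts:pts fps:24];
}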
Analysis
Given the requirement to render the user-interaction layer together with the stream, the first thing that came to mind was multi-texture blending in OpenGL: blend a texture created from the video stream with a texture created from the on-screen elements, turn the blended output back into video frames, synchronize them with the audio using the pts values from the callbacks, and write everything into a video file.
The blending and video-writing parts of that pipeline are already implemented very well by GPUImage, which is built on OpenGL ES. Rather than reinventing the wheel, I decided to build the feature on top of GPUImage.
Video Data
The video data available through the callback is a CVPixelBufferRef in either kCVPixelFormatType_420YpCbCr8Planar (software decoding) or kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange (hardware decoding) format.
/**
 @abstract Whether to use video toolbox hardware decoding.
 @discussion With video toolbox enabled, the player tries hardware decoding first and falls back to
 software decoding on failure.
 @warning Only effective for rtmp/flv live streams; disabled by default. Requires iOS 8.0 or later.
 @since v2.1.4
 */
extern NSString * _Nonnull PLPlayerOptionKeyVideoToolbox;
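For completeness, turning the option on when creating the player might look like this minimal sketch, assuming the usual PLPlayerOption API (liveURL is a placeholder):

// Sketch only: enable hardware decoding when creating the player.
PLPlayerOption *option = [PLPlayerOption defaultOption];
// Try hardware decoding first; the player falls back to software decoding on failure.
[option setOptionValue:@(YES) forKey:PLPlayerOptionKeyVideoToolbox];

PLPlayer *player = [PLPlayer playerWithURL:liveURL option:option];
player.delegate = self; // receive the willRenderFrame / willAudioRenderBuffer callbacks above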
Although GPUImage already has good, well-trodden support for kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange data, for compatibility we also have to accept kCVPixelFormatType_420YpCbCr8Planar data as input.
Code
GPUImagePixelRender subclasses GPUImageOutput and acts as the class that feeds the video texture into the GPUImage filter chain. Its setup basically mirrors the initialization flow of GPUImageMovie; the key change is the shader. kCVPixelFormatType_420YpCbCr8Planar stores Y, U and V in three separate planes, so uploading it necessarily means sampling from three textures.
// DTVRecordVideoFrame: a model object that holds one frame of video data
- (DTVRecordVideoFrame *)creatTextureYUV:(CVPixelBufferRef)pixelBuffer
{
    OSType pixelType = CVPixelBufferGetPixelFormatType(pixelBuffer);
    NSAssert(pixelType == kCVPixelFormatType_420YpCbCr8Planar, @"pixelType error ...");

    int pixelWidth = (int)CVPixelBufferGetWidth(pixelBuffer);
    int pixelHeight = (int)CVPixelBufferGetHeight(pixelBuffer);

    CVPixelBufferLockBaseAddress(pixelBuffer, 0);

    DTVRecordVideoFrame *yuv = [[DTVRecordVideoFrame alloc] init];
    // Dimensions of the video frame
    yuv.width = pixelWidth;
    yuv.height = pixelHeight;

    // Copy the three YUV planes
    size_t y_size = pixelWidth * pixelHeight;
    uint8_t *yuv_y_frame = malloc(y_size);
    uint8_t *y_frame = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
    memcpy(yuv_y_frame, y_frame, y_size);
    yuv.Y = yuv_y_frame;

    size_t u_size = y_size / 4;
    uint8_t *yuv_u_frame = malloc(u_size);
    uint8_t *u_frame = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1);
    memcpy(yuv_u_frame, u_frame, u_size);
    yuv.U = yuv_u_frame;

    size_t v_size = y_size / 4;
    uint8_t *yuv_v_frame = malloc(v_size);
    uint8_t *v_frame = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 2);
    memcpy(yuv_v_frame, v_frame, v_size);
    yuv.V = yuv_v_frame;

    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
    return yuv;
}
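One caveat: the copy above assumes each plane is tightly packed, i.e. bytes-per-row equals the plane width. CVPixelBuffer planes can carry row padding, so a stride-aware copy is safer; a small sketch (DTVCopyPlane is a hypothetical helper, not from the original code):

// Copy one plane row by row in case the plane's bytes-per-row is larger than its visible width.
static void DTVCopyPlane(CVPixelBufferRef pixelBuffer, size_t planeIndex,
                         uint8_t *dst, size_t width, size_t height)
{
    uint8_t *src = (uint8_t *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, planeIndex);
    size_t bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, planeIndex);
    for (size_t row = 0; row < height; row++) {
        memcpy(dst + row * width, src + row * bytesPerRow, width);
    }
}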
After extracting the data, creating the FBO and uploading the vertex and texture data is not covered in detail here; GPUImageMovie is a good reference. Below is the code that creates the texture objects.
- (void)setupTexture:(DTVRecordVideoFrame *)videoFrame
{
    if (0 == _textures[0]) glGenTextures(3, _textures);

    const uint8_t *pixelByte[3] = { videoFrame.Y, videoFrame.U, videoFrame.V };
    const int widths[3]  = { videoFrame.width,  videoFrame.width / 2,  videoFrame.width / 2 };
    const int heights[3] = { videoFrame.height, videoFrame.height / 2, videoFrame.height / 2 };

    for (int i = 0; i < 3; i++) {
        glBindTexture(GL_TEXTURE_2D, _textures[i]);
        glTexImage2D(GL_TEXTURE_2D,
                     0,
                     GL_LUMINANCE,
                     widths[i],
                     heights[i],
                     0,
                     GL_LUMINANCE,
                     GL_UNSIGNED_BYTE,
                     pixelByte[i]);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
        glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
        glBindTexture(GL_TEXTURE_2D, 0);
    }
}
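Uploading the vertices and binding the samplers follows GPUImageMovie and is not shown above. At draw time, pointing the three shader samplers at the three textures might look roughly like this; the *_TextureUniform variables are assumed to be uniform locations cached from glGetUniformLocation when the program was linked:

// Bind the Y/U/V planes to texture units 2/3/4 and wire them to the shader samplers.
glActiveTexture(GL_TEXTURE2);
glBindTexture(GL_TEXTURE_2D, _textures[0]);
glUniform1i(_yTextureUniform, 2);

glActiveTexture(GL_TEXTURE3);
glBindTexture(GL_TEXTURE_2D, _textures[1]);
glUniform1i(_uTextureUniform, 3);

glActiveTexture(GL_TEXTURE4);
glBindTexture(GL_TEXTURE_2D, _textures[2]);
glUniform1i(_vTextureUniform, 4);

glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);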
Now the shader. The shader code is adapted from kxmovie; it samples the three YUV planes and converts them to RGB:
NSString *const kGPUImageYUVPlanarFragmentShaderString = SHADER_STRING
(
 varying highp vec2 textureCoordinate;

 uniform sampler2D s_texture_y;
 uniform sampler2D s_texture_u;
 uniform sampler2D s_texture_v;

 void main()
 {
     highp float y = texture2D(s_texture_y, textureCoordinate).r;
     highp float u = texture2D(s_texture_u, textureCoordinate).r - 0.5;
     highp float v = texture2D(s_texture_v, textureCoordinate).r - 0.5;

     highp float r = y + 1.402 * v;
     highp float g = y - 0.344 * u - 0.714 * v;
     highp float b = y + 1.772 * u;

     gl_FragColor = vec4(r, g, b, 1.0);
 }
);
View Data
GPUImageUIElement exists precisely to generate a texture from a view or layer, so in theory it could be used directly. However, not all of our on-screen animations are driven by layer.contents; some are UIView animations or CAAnimation-based layer animations, and if you capture the view or layer directly, those animations simply never show up. If you know the CALayer layer tree, the reason is obvious: a layer's modelLayer jumps straight to the final value the moment a property is set, while the presentationLayer is the one that actually goes through the gradual transition. The view.layer we normally work with is the modelLayer, i.e. already at the end value, which is why the captured texture looks odd while animations are running.
Knowing that, I modified GPUImageUIElement so that every time it captures data to build its texture, it renders the presentationLayer instead.
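A rough sketch of that change, modelled on GPUImageUIElement's own update method (only the presentationLayer part is the actual point; everything else follows what GPUImageUIElement already does):

// Sketch: draw the presentation layer (the in-flight animation state) into a bitmap
// context instead of view.layer, which is the model layer and already holds the end values.
- (void)updateWithPresentationLayer:(CALayer *)presentationLayer
{
    if (!presentationLayer) return;

    CGSize layerPixelSize = presentationLayer.bounds.size;
    GLubyte *imageData = (GLubyte *)calloc(1, (int)layerPixelSize.width * (int)layerPixelSize.height * 4);

    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    CGContextRef imageContext = CGBitmapContextCreate(imageData,
                                                      (int)layerPixelSize.width, (int)layerPixelSize.height,
                                                      8, (int)layerPixelSize.width * 4, colorSpace,
                                                      kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
    [presentationLayer renderInContext:imageContext];
    CGContextRelease(imageContext);
    CGColorSpaceRelease(colorSpace);

    // ...upload imageData to outputFramebuffer's texture and notify the targets,
    // exactly as GPUImageUIElement's update method already does.

    free(imageData);
}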
I also found that GPUImageUIElement creates a brand-new FBO on every texture update and never returns it to the cache, so I changed the code below to hand the framebuffer back once it has been used.
for (id<GPUImageInput> currentTarget in targets)
{
    if (currentTarget != self.targetToIgnoreForUpdates)
    {
        NSInteger indexOfObject = [targets indexOfObject:currentTarget];
        NSInteger textureIndexOfTarget = [[targetTextureIndices objectAtIndex:indexOfObject] integerValue];

        [currentTarget setInputSize:layerPixelSize atIndex:textureIndexOfTarget];
        [currentTarget setInputFramebuffer:outputFramebuffer atIndex:textureIndexOfTarget];
        [currentTarget newFrameReadyAtTime:kCMTimeIndefinite atIndex:textureIndexOfTarget];
    }
}
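The change itself is small. Following GPUImage's lock/unlock convention, once every target has been handed the framebuffer, the element gives up its own hold on it so the framebuffer cache can recycle it, something like:

// After the loop above: release this element's hold on the framebuffer so
// GPUImageFramebufferCache can hand it out again on the next update.
[outputFramebuffer unlock];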
Compositing
Plan A: composite and draw one frame for every video frame received. Our streams typically run at 24 or 36 fps while the screen refreshes at 60 Hz. Testing showed that refreshing at the video frame rate saves CPU, but when the video stutters the overlay elements freeze with it, and the animations are not smooth enough.
Plan B: refresh the screen elements with a CADisplayLink, and refresh the video frame whenever new frame data arrives.
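The display-link side of Plan B is plain CADisplayLink plumbing; the startDisplayLinkTimer / stopDisplayLinkTimer helpers referenced later could be sketched like this:

// Sketch only: drive writerFrame, which re-captures the presentation layer and
// blends it with the current video frame.
- (void)startDisplayLinkTimer
{
    if (self.displayLink) return;
    self.displayLink = [CADisplayLink displayLinkWithTarget:self selector:@selector(writerFrame)];
    [self.displayLink addToRunLoop:[NSRunLoop mainRunLoop] forMode:NSRunLoopCommonModes];
}

- (void)stopDisplayLinkTimer
{
    [self.displayLink invalidate];
    self.displayLink = nil;
}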
GPUImageMovieWriter is used to write the composited frames into a local video file.
// Buffer an incoming video frame
- (void)addVideoPixelBuffer:(CVPixelBufferRef)pixelBuffer pts:(int64_t)videoPts fps:(int)videoFPS
{
    // Enough frames have already been buffered
    if (_hasFillFrame) {
        return;
    }
    // Remember the pts of the first frame
    if (!_firstFramePTS) _firstFramePTS = videoPts;

    DTVRecordVideoFrame *videoframe = [self creatTextureYUV:pixelBuffer];
    if (videoframe.Y == NULL || videoframe.U == NULL || videoframe.V == NULL) {
        NSLog(@"invalid video frame");
        return;
    }

    videoframe.pts = videoPts;
    // How long this frame stays on screen
    videoframe.duration = _previousFrame ? (videoPts - _previousFrame.pts) : (1 / 24.f * 1000);
    // The frame's pts inside the video we are recording
    videoframe.frameTime = CMTimeMake((videoPts - _firstFramePTS) * 600, 600 * 1000);

    // Buffer the frame
    [self.videoBuffer addObject:videoframe];
    _previousFrame = videoframe;

    if (self.videoBuffer.count > 3 && !self.displayLink) {
        // Start cycling through the buffered video frames
        [self tick];
    }
}
tick switches to the next frame based on how many frames are buffered and how long each frame lasts, and the result is written into the video through GPUImageMovieWriter.
https://github.com/BradLarson/GPUImage/issues/1729 explains how to solve the problem that occurs when GPUImageMovieWriter writes AVFileTypeMPEG4.
- (void)tick
{
    if (self.videoBuffer.count < 1) {
        if (_hasFillFrame) {
            [self stopDisplayLinkTimer];
            if (_movieWriter) {
                [_movieWriter finishRecording];
                [_blendFilter removeTarget:_movieWriter];
                _movieWriter = NULL;
            }
            if (self.completeBlock) self.completeBlock(_coverImage);
        }
        else {
            _renderVideoFrame = NO;
            NSLog(@"stalled...");
        }
    }
    else {
        _renderVideoFrame = YES;
        DTVRecordVideoFrame *frameTexture = self.videoBuffer.firstObject;

        if (!self.movieWriter) {
            unlink([DefaultFuckVideoPath UTF8String]);
            _movieWriter = [[GPUImageMovieWriter alloc] initWithMovieURL:[NSURL fileURLWithPath:DefaultFuckVideoPath] size:CGSizeMake(540, 960) fileType:AVFileTypeMPEG4 outputSettings:nil];
            _movieWriter.encodingLiveVideo = YES;
            _movieWriter.hasAudioTrack = YES;
            _movieWriter.assetWriter.movieFragmentInterval = kCMTimeInvalid;

            [self.pixelRender addTarget:self.blendFilter];
            [self.layerRender addTarget:self.blendFilter];
            [self.blendFilter addTarget:_movieWriter];
            [_movieWriter startRecording];
        }

        [self startDisplayLinkTimer];

        runAsynchronouslyOnVideoProcessingQueue(^{
            [self.pixelRender processVideoFrame:frameTexture];
        });

        dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(frameTexture.duration * NSEC_PER_MSEC)), dispatch_get_main_queue(), ^{
            [self.videoBuffer removeObjectAtIndex:0];
            [self tick];
        });

        // Grab a frame to use as the cover image
        if (CMTimeGetSeconds(_previousFrame.frameTime) > 0.5f && !_coverImage) {
            [self.blendFilter useNextFrameForImageCapture];
            _coverImage = [self.blendFilter imageFromCurrentFramebuffer];
        }
    }
}
The CADisplayLink callback writerFrame is responsible for re-capturing the on-screen elements; they are composited with the current video frame _currentFrame through a GPUImageAlphaBlendFilter to produce the final frame.
- (void)writerFrame
{
    [self.layerRender updateWithPresentationLayer:_renderView.layer.presentationLayer];
}
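The blend filter itself needs no special setup. A possible lazy getter (property names follow the rest of this post) simply sets mix to 1.0 so that the overlay's own alpha decides how it is composited over the video texture:

// A possible lazy getter for the blend filter used above.
- (GPUImageAlphaBlendFilter *)blendFilter
{
    if (!_blendFilter) {
        _blendFilter = [[GPUImageAlphaBlendFilter alloc] init];
        // With mix = 1.0 the overlay's own alpha determines the blend over the video frame.
        _blendFilter.mix = 1.0;
    }
    return _blendFilter;
}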
Audio
The audio has to stay in sync with the video. The Qiniu callback hands us an AudioBufferList, which has to be converted into the CMSampleBufferRef that GPUImageMovieWriter expects.
// Recompute the audio pts relative to the first video pts
CMTime time = CMTimeMake((audioPts - _firstFramePTS) * 600, 600 * 1000);

// Convert the AudioBufferList into a CMSampleBufferRef
CMSampleBufferRef audioBuffer = NULL;
CMFormatDescriptionRef format = NULL;
CMSampleTimingInfo timing = {CMTimeMake(1, audioStreamDescription.mSampleRate), time, kCMTimeInvalid};

UInt32 size = audioBufferList->mBuffers->mDataByteSize / sizeof(UInt32);
UInt32 mNumberChannels = audioBufferList->mBuffers->mNumberChannels;
CMItemCount numSamples = (CMItemCount)size / mNumberChannels;

OSStatus status;
status = CMAudioFormatDescriptionCreate(kCFAllocatorDefault, &audioStreamDescription, 0, NULL, 0, NULL, NULL, &format);
if (status != noErr) {
    // format is still NULL here, so there is nothing to release
    return;
}
status = CMSampleBufferCreate(kCFAllocatorDefault, NULL, false, NULL, NULL, format, numSamples, 1, &timing, 0, NULL, &audioBuffer);
if (status != noErr) {
    CFRelease(format);
    return;
}
status = CMSampleBufferSetDataBufferFromAudioBufferList(audioBuffer, kCFAllocatorDefault, kCFAllocatorDefault, 0, audioBufferList);
if (status != noErr) {
    CFRelease(format);
    CFRelease(audioBuffer);
    return;
}

if (_movieWriter && audioBuffer) {
    [_movieWriter processAudioBuffer:audioBuffer];
}
// Release the CoreMedia objects created above to avoid leaking one per callback
CFRelease(format);
CFRelease(audioBuffer);
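For completeness, this conversion naturally lives inside the Qiniu audio callback. A sketch of the wrapper (the recorder property and its processAudioBufferList:asbd:pts: method are placeholder names, not PLPlayerKit API):

// Sketch only: forward the audio data to the recorder, then hand the buffer back to the player.
- (AudioBufferList *)player:(PLPlayer *)player willAudioRenderBuffer:(AudioBufferList *)audioBufferList
                       asbd:(AudioStreamBasicDescription)audioStreamDescription
                        pts:(int64_t)pts sampleFormat:(PLPlayerAVSampleFormat)sampleFormat
{
    [self.recorder processAudioBufferList:audioBufferList
                                     asbd:audioStreamDescription
                                      pts:pts];
    // Return the original buffer so the player keeps playing the audio unchanged.
    return audioBufferList;
}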
Summary
Building this feature taught me a lot: the CALayer layer tree, video pixel formats, audio conversion, basic audio/video synchronization, and a deeper understanding of GPUImage. It was well worth the effort.