iOS Audio/Video Capture and RTMP Streaming

The project has wrapped up for now. My own understanding is still only partial, but I hope this write-up can help beginners like me.
First, a quick map of the concepts involved:

1. H.264

The final step of video encoding is entropy coding. H.264 offers two entropy coding methods: context-adaptive variable-length coding (CAVLC) and context-adaptive binary arithmetic coding (CABAC).

2. AAC

Advanced Audio Coding: a compression format designed specifically for audio. Unlike MP3, it uses a newer, more efficient coding algorithm with a better quality-to-bitrate trade-off; to the listener, AAC audio has no obvious loss in quality.

3. PCM

The raw data produced by audio capture; this is what gets fed to the hardware encoder.

4. YUV

The raw data produced by video capture; this is what gets fed to the hardware encoder.

5. Timestamps

The key parameter for keeping audio and video in sync in a live stream.

6. RTMP streaming

The means by which the live stream is pushed to the server.

Part One: capture. The first thing to do is capture: we capture raw YUV data and convert it to H.264, then capture PCM data, convert it to AAC, and send both.

1. Video capture. Since the captured video ultimately has to be converted to H.264, we need both VideoToolbox.framework and AVFoundation.framework:
VideoToolbox.framework does the encoding, turning YUV data into H.264, while AVFoundation.framework's job is capturing the raw YUV data.

// Capture delegate that delivers the raw frames; the session setup is not repeated here (there are plenty of examples online, and a minimal sketch follows below)
-(void) captureOutput:(AVCaptureOutput*)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection*)connection
{
}
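Since the author omits the capture setup, here is a minimal AVCaptureSession sketch that delivers YUV frames to the delegate above. The 640x480 preset, the NV12 pixel format, and the method name are my assumptions, not the author's code:

- (void)setupCaptureSession
{
    AVCaptureSession *session = [[AVCaptureSession alloc] init];
    session.sessionPreset = AVCaptureSessionPreset640x480;

    // Camera input
    AVCaptureDevice *camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    NSError *err = nil;
    AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:camera error:&err];
    if (input && [session canAddInput:input]) [session addInput:input];

    // Ask for 4:2:0 bi-planar YUV (NV12), the layout VideoToolbox takes as input
    AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];
    output.videoSettings = @{ (id)kCVPixelBufferPixelFormatTypeKey :
                                  @(kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange) };
    [output setSampleBufferDelegate:self
                              queue:dispatch_queue_create("video.capture", DISPATCH_QUEUE_SERIAL)];
    if ([session canAddOutput:output]) [session addOutput:output];

    [session startRunning];
}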

(1) Initialize the VTCompressionSession.
When initializing a VTCompressionSession you supply the width, the height, and the codec type kCMVideoCodecType_H264, along with a callback function that is invoked after each video frame has been successfully encoded. With everything prepared, create the session with VTCompressionSessionCreate, then set the frame rate and other properties by calling VTSessionSetProperty.

// Encoder initialization: width/height are the frame dimensions, iBite the target bit rate
- (void) initEncode:(int)width  height:(int)height bite:(int)iBite
{
    dispatch_sync(aQueue, ^{
        
        // For testing out the logic, lets read from a file and then send it to encoder to create h264 stream
        
        // Create the compression session; note the H.264 codec type and the didCompressH264 callback
        OSStatus status = VTCompressionSessionCreate(NULL, width, height, kCMVideoCodecType_H264, NULL, NULL, NULL, didCompressH264, (__bridge void *)(self),  &EncodingSession);
        NSLog(@"H264: VTCompressionSessionCreate %d", (int)status);
        
        if (status != 0)
        {
            NSLog(@"H264: Unable to create a H264 session");
            error = @"H264: Unable to create a H264 session";
            
            return ;
            
        }
        
        // The average bit rate largely determines picture clarity
        // Set the properties
        VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
        VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_AllowFrameReordering, kCFBooleanFalse);
        VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, (__bridge CFTypeRef _Nonnull)(@(GOP_SIZE)));
        VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Main_AutoLevel);
        VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_AverageBitRate, (__bridge CFTypeRef _Nonnull)@(iBite));
        VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_ExpectedFrameRate, (__bridge CFTypeRef _Nonnull)@(FRAME_RATE));
        VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_DataRateLimits, (__bridge CFTypeRef _Nonnull)@[@(iBite/8),@(1)]);
    
        // Tell the encoder to start encoding
        VTCompressionSessionPrepareToEncodeFrames(EncodingSession);
    });
}

(2) Feed the raw frames captured by the camera to the VTCompressionSession for hardware encoding.

The camera delivers each frame as an unencoded CMSampleBuffer. Use the provided CMSampleBufferGetImageBuffer function to extract the CVPixelBufferRef from it, then hand that frame to VTCompressionSessionEncodeFrame for hardware encoding. Once the frame is encoded, the callback set at session initialization is invoked automatically.

    dispatch_sync(aQueue, ^{
        
        frameCount++;
        // Get the CVImageBuffer: this is the raw camera frame handed to the VTCompressionSession created above for hardware encoding
        CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);
        
        // Create properties
        CMTime presentationTimeStamp = CMTimeMake(frameCount, 1000); // frame index on a 1000-unit timescale
        //CMTime duration = CMTimeMake(1, DURATION);
        VTEncodeInfoFlags flags;
        
        // Pass it to the encoder
        OSStatus statusCode = VTCompressionSessionEncodeFrame(EncodingSession,
                                                              imageBuffer,
                                                              presentationTimeStamp,
                                                              kCMTimeInvalid,
                                                              NULL, NULL, &flags);
        // Check for error
        if (statusCode != noErr) {
            NSLog(@"H264: VTCompressionSessionEncodeFrame failed with %d", (int)statusCode);
            error = @"H264: VTCompressionSessionEncodeFrame failed ";
            
            // End the session
            VTCompressionSessionInvalidate(EncodingSession);
            CFRelease(EncodingSession);
            EncodingSession = NULL;
            return;
        }
        //            NSLog(@"H264: VTCompressionSessionEncodeFrame Success");
    });

(3) In the callback, convert the successfully encoded CMSampleBuffer into an H.264 stream and send it over the network.
This is basically the reverse of the hardware decoding process.

void didCompressH264(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags,
                     CMSampleBufferRef sampleBuffer )
{
//        NSLog(@"didCompressH264 called with status %d infoFlags %d", (int)status, (int)infoFlags);
    NSLog(@"H264");
    if (status != 0) return;
    
    if (!CMSampleBufferDataIsReady(sampleBuffer))
    {
        NSLog(@"didCompressH264 data is not ready ");
        return;
    }
    H264Encoder* encoder = (__bridge H264Encoder*)outputCallbackRefCon;
    
    // Check if we have got a key frame first
    bool keyframe = !CFDictionaryContainsKey( (CFArrayGetValueAtIndex(CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true), 0)), kCMSampleAttachmentKey_NotSync);
    encoder->countFrame=encoder->countFrame+1;
    
//    NSLog(@"dzf  frameCount%d",encoder->countFrame);
    if (keyframe)
    {
//        NSLog(@"dzf  keyframe is true ");
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        // CFDictionaryRef extensionDict = CMFormatDescriptionGetExtensions(format);
        // Get the extensions
        // From the extensions get the dictionary with key "SampleDescriptionExtensionAtoms"
        // From the dict, get the value for the key "avcC"
        
        size_t sparameterSetSize, sparameterSetCount;
        const uint8_t *sparameterSet;
        OSStatus statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sparameterSet, &sparameterSetSize, &sparameterSetCount, 0 );
        if (statusCode == noErr)
        {
            // Found sps and now check for pps
            size_t pparameterSetSize, pparameterSetCount;
            const uint8_t *pparameterSet;
            OSStatus statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pparameterSet, &pparameterSetSize, &pparameterSetCount, 0 );
            if (statusCode == noErr)
            {
                // Found pps
                encoder->sps = [NSData dataWithBytes:sparameterSet length:sparameterSetSize];
                encoder->pps = [NSData dataWithBytes:pparameterSet length:pparameterSetSize];
                if (encoder->_delegate)
                {
                    [encoder->_delegate gotSpsPps:encoder->sps pps:encoder->pps];
                }
            }
        }
    }
    CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length, totalLength;
    char *dataPointer;
    OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer);
    if (statusCodeRet == noErr) {

        // Send the data: walk the AVCC block buffer one NAL unit at a time
        size_t bufferOffset = 0;
        static const int AVCCHeaderLength = 4;
        while (bufferOffset < totalLength - AVCCHeaderLength) {
            
            // Read the NAL unit length
            uint32_t NALUnitLength = 0;
            memcpy(&NALUnitLength, dataPointer + bufferOffset, AVCCHeaderLength);
            
            // Convert the length value from Big-endian to Little-endian
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);
            
            NSData* data = [[NSData alloc] initWithBytes:(dataPointer + bufferOffset + AVCCHeaderLength) length:NALUnitLength];
            [encoder->_delegate gotEncodedData:data isKeyFrame:keyframe];
            
            // Move to the next NAL unit in the block buffer
            bufferOffset += AVCCHeaderLength + NALUnitLength;
        }
        // Hand off the data accumulated for this frame
        [encoder->_delegate oneFrameEncodeEnd:keyframe];
    }
}

Note that the head of the video stream is made up of the SPS and PPS. In this callback we check for that header information and package the ordinary NAL units for sending separately: the header data is sent first, then the regular video data. Concretely: parse out the SPS and PPS parameter sets and prepend a start code to each to assemble NALUs; extract the video data and replace its AVCC length prefix with a start code to assemble NALUs; then send the NALUs out.

Code for sending the video header (SPS/PPS):

- (void)gotSpsPps:(NSData*)sps pps:(NSData*)pps
{
//    NSLog(@"gotSpsPps");
    frameCount2 = [_h264Encoder getFreameCound];
    
    const char bytes[] = "\x00\x00\x00\x01";
    size_t length = (sizeof bytes) - 1; //string literals have implicit trailing '\0'
    NSData *ByteHeader = [NSData dataWithBytes:bytes length:length];
    
    mysps = sps;
    mypps = pps;
    [mutableData appendData:ByteHeader];
    [mutableData appendData:mysps];
    [mutableData appendData:ByteHeader];
    [mutableData appendData:mypps];
    pos = pos + sps.length + pps.length + ByteHeader.length*2;
    
    NSMutableData *mutableDataTem1 = [[NSMutableData alloc] init];
    [mutableDataTem1 appendData:ByteHeader];
    [mutableDataTem1 appendData:mysps];
    long tem1 = sps.length + ByteHeader.length;
    [self sendData:sizeof(Byte)*tem1 data:(char*)[mutableDataTem1 bytes]];
    
    NSMutableData *mutableDataTem = [[NSMutableData alloc] init];
    [mutableDataTem appendData:ByteHeader];
    [mutableDataTem appendData:mypps];
    long tem = pps.length + ByteHeader.length;
    [self sendData:sizeof(Byte)*tem data:(char*)[mutableDataTem bytes]];
}

Code for sending the frame payload:

- (void)oneFrameEncodeEnd:(BOOL)isKeyFrame
{
    FrameData *frameData = [[FrameData alloc] init];
    
    frameData.Iframe = isKeyFrame;
    frameData.frame_len = (int) pos;
    frameData.frame_seq = total_vseq;
    frameData.stream_index = 0;
    
    
    frameData.frame_data = (Byte *)malloc(sizeof(Byte)*pos);//new Byte[pos];
    memcpy(frameData.frame_data,[mutableData bytes], pos*sizeof(Byte));
    
    [_videoArray addObject:frameData];
    total_vseq++;
    
    //if(isKeyFrame)
    //NSLog(@"add one h264 h264  h264  frame to videoArray---seq:%ld",total_vseq);
    
    mysps = nil;
    mypps = nil;
    
    [mutableData resetBytesInRange:NSMakeRange(0, [mutableData length])];
    [mutableData setLength:0];
    
    pos = 0;
}
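The post doesn't show how _videoArray is drained. A hypothetical sender loop, reusing the same sendData:data: helper that pushes the SPS/PPS above (a sketch of one possible design, not the author's code):

- (void)drainVideoQueue
{
    while (_videoArray.count > 0) {
        FrameData *frame = _videoArray.firstObject;
        [_videoArray removeObjectAtIndex:0];
        // Push one complete Annex B frame, then free the malloc'ed buffer
        [self sendData:frame.frame_len data:(char *)frame.frame_data];
        free(frame.frame_data);
    }
}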

Audio capture and sending

Encode the captured PCM data to AAC; there is plenty of related sample code online to learn from. A sketch of what such an encoder does internally follows the capture code below.

-(void) captureOutput:(AVCaptureOutput*)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection*)connection
{
    static BOOL             firstStartTimer = false;
    static long             num = 0;

    if (connection == _audioConnection) {
        NSLog(@"captureOutput audio");
        
        char szBuf[4096];
        memset(szBuf, 0, sizeof(szBuf));
        uint32_t  nSize = sizeof(szBuf);
//        AudioStreamBasicDescription inputFormat = *(CMAudioFormatDescriptionGetStreamBasicDescription(CMSampleBufferGetFormatDescription(sampleBuffer))); // input audio format
        
        AudioStreamBasicDescription outputFormat = *(CMAudioFormatDescriptionGetStreamBasicDescription(CMSampleBufferGetFormatDescription(sampleBuffer))); // despite the name, this is the captured PCM format
        nSize = (uint32_t)CMSampleBufferGetTotalSampleSize(sampleBuffer);
        if (nSize > sizeof(szBuf)) nSize = sizeof(szBuf); // guard: never copy more than szBuf holds
        CMBlockBufferRef databuf = CMSampleBufferGetDataBuffer(sampleBuffer);
        if (CMBlockBufferCopyDataBytes(databuf, 0, nSize, szBuf) == kCMBlockBufferNoErr)
        {
            int32_t nOffSet = 0;
            while (nOffSet < nSize)
            {
                int outsize = 0;
                char szOutBuf[4096] = {0};
                
                int nInSize = 0;
                if (nSize - nOffSet >= 640) {
                    nInSize = 640;
                } else {
                    nInSize = nSize - nOffSet;
                }
                
                outsize = [ecdoer AACEncoderEncode:lHand inData:szBuf + nOffSet inSize:nInSize outData:szOutBuf maxOutSize:4096];
                //            [ecdoer AACEncoderClose:outsize];
                if (outsize > 0)
                {
                    [self sendAacDataLen:outsize data:szOutBuf ptsTime:0];
                }

                nOffSet += 640;
            }

        }

    }
}
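The AACEncoderEncode: call above is the author's own wrapper, which the post doesn't show. As a reference point only, here is a minimal sketch of a PCM-to-AAC step built on AudioToolbox's AudioConverter API; the 44.1 kHz / 16-bit / mono format and every name below are my assumptions:

#import <AudioToolbox/AudioToolbox.h>

typedef struct {
    char  *bytes;  // PCM input not yet consumed
    UInt32 size;
} PCMFeed;

// Input callback: hands the converter our whole PCM chunk in one shot.
static OSStatus feedPCM(AudioConverterRef conv, UInt32 *ioNumPackets,
                        AudioBufferList *ioData,
                        AudioStreamPacketDescription **outDesc, void *user)
{
    PCMFeed *feed = (PCMFeed *)user;
    if (feed->size == 0) { *ioNumPackets = 0; return -1; } // no more input for now
    ioData->mNumberBuffers = 1;
    ioData->mBuffers[0].mNumberChannels = 1;
    ioData->mBuffers[0].mData = feed->bytes;
    ioData->mBuffers[0].mDataByteSize = feed->size;
    *ioNumPackets = feed->size / 2; // one packet = one 16-bit mono sample
    feed->size = 0;
    return noErr;
}

static AudioConverterRef MakeAACConverter(void)
{
    AudioStreamBasicDescription inFmt = {0}, outFmt = {0};
    inFmt.mSampleRate       = 44100;
    inFmt.mFormatID         = kAudioFormatLinearPCM;
    inFmt.mFormatFlags      = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
    inFmt.mBytesPerPacket   = 2;
    inFmt.mFramesPerPacket  = 1;
    inFmt.mBytesPerFrame    = 2;
    inFmt.mChannelsPerFrame = 1;
    inFmt.mBitsPerChannel   = 16;
    outFmt.mSampleRate       = 44100;
    outFmt.mFormatID         = kAudioFormatMPEG4AAC; // AAC-LC
    outFmt.mChannelsPerFrame = 1;
    outFmt.mFramesPerPacket  = 1024;                 // one AAC packet = 1024 samples
    AudioConverterRef conv = NULL;
    AudioConverterNew(&inFmt, &outFmt, &conv);
    return conv;
}

// Encode one PCM chunk; returns the number of AAC bytes written to outBuf.
static int EncodePCMChunk(AudioConverterRef conv, char *pcm, UInt32 pcmLen,
                          char *outBuf, UInt32 outCap)
{
    PCMFeed feed = { pcm, pcmLen };
    AudioBufferList outList = {0};
    outList.mNumberBuffers = 1;
    outList.mBuffers[0].mNumberChannels = 1;
    outList.mBuffers[0].mData = outBuf;
    outList.mBuffers[0].mDataByteSize = outCap;
    UInt32 numPackets = 1; // ask for a single AAC packet
    AudioStreamPacketDescription outDesc = {0};
    AudioConverterFillComplexBuffer(conv, feedPCM, &feed, &numPackets, &outList, &outDesc);
    return numPackets > 0 ? (int)outList.mBuffers[0].mDataByteSize : 0;
}

Note that a production encoder must buffer leftover samples between calls (the 640-byte chunks above hold only 320 mono samples, fewer than the 1024 one AAC packet needs) and may need to prepend ADTS headers, depending on what the receiver expects.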

Sending the AAC data:

- (void)sendAacDataLen:(int) totalLength data: (char*) dataPointer ptsTime:(int64_t)pts{

    int ret = WM_RTMPLIVESDK_InputData(WMRtmpLiveDataType_AAC, (const char* )dataPointer, totalLength, [self getNowTime]);
    NSLog(@"~~~~~~~~~iAAc[%lld]",[self getNowTime]);
    if (ret == 1) {
        NSLog(@"~~~~~aac~~~~~sendData ret[%d] totalLength[%d]",ret,(int)totalLength);
    }
    // WM_RTMPLIVESDK_InputData returns 1 on failure, 0 on success
}
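getNowTime isn't shown in the post. Given the closing remark that both streams are stamped with the current time, a plausible millisecond implementation would be:

- (int64_t)getNowTime
{
    // Current wall-clock time in milliseconds, used as the RTMP timestamp
    return (int64_t)([[NSDate date] timeIntervalSince1970] * 1000.0);
}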

There is also plenty of RTMP streaming code online; you can wrap rtmplib (librtmp) in a C++ library of your own and call that from the app.
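For orientation (this is not the author's wrapper), a minimal librtmp publishing skeleton looks roughly like this; the URL is a placeholder, and muxing the H.264/AAC packets into FLV tags is left out:

#include <librtmp/rtmp.h>

RTMP *r = RTMP_Alloc();
RTMP_Init(r);
char url[] = "rtmp://example.com/live/stream"; // placeholder URL
RTMP_SetupURL(r, url);
RTMP_EnableWrite(r);  // we are publishing, not playing
if (RTMP_Connect(r, NULL) && RTMP_ConnectStream(r, 0)) {
    // Each call pushes one chunk of FLV tag data (the muxed H.264/AAC):
    // RTMP_Write(r, flvTagData, flvTagSize);
}
RTMP_Close(r);
RTMP_Free(r);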

Part Two: closing remarks

I won't elaborate further here; the overall capture-and-send process is as described above. One more point: both the video and the audio send paths stamp data with the current time. When testing, you can also use a fixed 20 ms interval to investigate latency. The logic sends a batch of audio packets followed by one video frame, because audio packets are far more numerous; if audio frames are dropped the stuttering is clearly audible, whereas a single dropped video frame is very hard for the human eye to notice.
For the more detailed theory, I recommend this blog post:
http://www.lxweimin.com/p/a6530fa46a88
