Two Ways to Use the Speech Framework (Speech-to-Text)

I recently found some spare time to catch up on newer technology and came across the Speech framework Apple introduced last year. It converts speech to text directly, which is impressive and arguably beats what some of the big speech vendors offer. The drawback is that it only supports iOS 10 and later, but it is still a big step forward.

I. Development Environment Requirements

Xcode 8 or later: Speech.framework only ships with Xcode 8 and newer toolchains.
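If your deployment target is lower than iOS 10, guard any use of the framework with a runtime version check. Below is a minimal sketch under that assumption; the helper name speechRecognitionIsAvailable is made up for illustration (with Xcode 9 or later you could use if (@available(iOS 10.0, *)) instead):

#import <Speech/Speech.h>

// Hypothetical helper: returns YES only when the Speech framework is usable (iOS 10+).
static BOOL speechRecognitionIsAvailable(void) {
    NSOperatingSystemVersion ios10 = (NSOperatingSystemVersion){10, 0, 0};
    return [[NSProcessInfo processInfo] isOperatingSystemAtLeastVersion:ios10];
}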

II. Creating the Project

1. Link Speech.framework (Build Phases -> Link Binary With Libraries -> +)

2. Add the following entries to Info.plist (a raw-XML sketch follows the two keys):

Privacy - Speech Recognition Usage Description

Uses speech recognition

Privacy - Microphone Usage Description

Uses the microphone
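For reference, those two Privacy entries map to the raw keys NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription. A minimal Info.plist fragment might look like the following; the description strings are placeholders, so replace them with wording that fits your app:

<key>NSSpeechRecognitionUsageDescription</key>
<string>Uses speech recognition to transcribe audio</string>
<key>NSMicrophoneUsageDescription</key>
<string>Uses the microphone to capture speech</string>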

3. Method 1: Recognizing a Local Audio File

#import "ViewController.h"
#import <Speech/Speech.h>

@interface ViewController () <SFSpeechRecognitionTaskDelegate>

@property (nonatomic, strong) SFSpeechRecognitionTask *recognitionTask;
@property (nonatomic, strong) SFSpeechRecognizer      *speechRecognizer;
@property (nonatomic, strong) UILabel                 *recognizerLabel;

@end

@implementation ViewController

- (void)dealloc {
    [self.recognitionTask cancel];
    self.recognitionTask = nil;
}

- (void)viewDidLoad {
    [super viewDidLoad];
    self.view.backgroundColor = [UIColor whiteColor];

    // 0. Request authorization for speech recognition
    [SFSpeechRecognizer requestAuthorization:^(SFSpeechRecognizerAuthorizationStatus status) {
        switch (status) {
            case SFSpeechRecognizerAuthorizationStatusNotDetermined:
                // The user has not been asked yet
                break;
            case SFSpeechRecognizerAuthorizationStatusDenied:
                // The user declined speech recognition
                break;
            case SFSpeechRecognizerAuthorizationStatusRestricted:
                // Speech recognition is restricted on this device
                break;
            case SFSpeechRecognizerAuthorizationStatusAuthorized:
                // Authorized: recognition requests may be sent
                break;
            default:
                break;
        }
    }];

    // 1. Create the SFSpeechRecognizer instance (Simplified Chinese here)
    self.speechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:[[NSLocale alloc] initWithLocaleIdentifier:@"zh_CN"]];
    // Note: since iOS 9 the identifier @"zh" maps to Traditional Chinese (zh_TW), not Simplified Chinese
//    [SFSpeechRecognizer supportedLocales]; // all locales the recognizer supports
//    for (NSLocale *locale in [SFSpeechRecognizer supportedLocales].allObjects) {
//        NSLog(@"countryCode:%@  languageCode:%@", locale.countryCode, locale.languageCode);
//    }

    // 2. Create the recognition request from a local audio file in the bundle
    SFSpeechURLRecognitionRequest *request = [[SFSpeechURLRecognitionRequest alloc] initWithURL:[NSURL fileURLWithPath:[[NSBundle mainBundle] pathForResource:@"1122334455.mp3" ofType:nil]]];

    // 3. Start the recognition task
    self.recognitionTask = [self recognitionTaskWithRequest1:request];
}

// Block-based API: partial and final results (or an error) arrive in one handler
- (SFSpeechRecognitionTask *)recognitionTaskWithRequest0:(SFSpeechURLRecognitionRequest *)request {
    return [self.speechRecognizer recognitionTaskWithRequest:request resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable error) {
        if (!error) {
            NSLog(@"Speech recognition succeeded -- %@", result.bestTranscription.formattedString);
        } else {
            NSLog(@"Speech recognition failed -- %@", error);
        }
    }];
}

// Delegate-based API: progress and results are delivered via SFSpeechRecognitionTaskDelegate
- (SFSpeechRecognitionTask *)recognitionTaskWithRequest1:(SFSpeechURLRecognitionRequest *)request {
    return [self.speechRecognizer recognitionTaskWithRequest:request delegate:self];
}

- (void)didReceiveMemoryWarning {
    [super didReceiveMemoryWarning];
}

#pragma mark - SFSpeechRecognitionTaskDelegate

// Speech was detected in the audio
- (void)speechRecognitionDidDetectSpeech:(SFSpeechRecognitionTask *)task {
}

// A partial (hypothesized) transcription is available
- (void)speechRecognitionTask:(SFSpeechRecognitionTask *)task didHypothesizeTranscription:(SFTranscription *)transcription {
}

// The final recognition result is available
- (void)speechRecognitionTask:(SFSpeechRecognitionTask *)task didFinishRecognition:(SFSpeechRecognitionResult *)recognitionResult {
    NSDictionary *attributes = @{
                                 NSFontAttributeName:[UIFont systemFontOfSize:18],
                                 };
    // Size the label to fit the recognized text
    CGRect rect = [recognitionResult.bestTranscription.formattedString boundingRectWithSize:CGSizeMake(self.view.bounds.size.width - 100, CGFLOAT_MAX) options:NSStringDrawingUsesLineFragmentOrigin attributes:attributes context:nil];
    self.recognizerLabel.text = recognitionResult.bestTranscription.formattedString;
    self.recognizerLabel.frame = CGRectMake(50, 120, rect.size.width, rect.size.height);
}

// All of the audio has been read
- (void)speechRecognitionTaskFinishedReadingAudio:(SFSpeechRecognitionTask *)task {
}

// The task was cancelled
- (void)speechRecognitionTaskWasCancelled:(SFSpeechRecognitionTask *)task {
}

// The task finished, successfully or not
- (void)speechRecognitionTask:(SFSpeechRecognitionTask *)task didFinishSuccessfully:(BOOL)successfully {
    if (successfully) {
        NSLog(@"Recognition finished");
    }
}

#pragma mark - getter

- (UILabel *)recognizerLabel {
    if (!_recognizerLabel) {
        _recognizerLabel = [[UILabel alloc] initWithFrame:CGRectMake(50, 120, self.view.bounds.size.width - 100, 100)];
        _recognizerLabel.numberOfLines = 0;
        _recognizerLabel.font = [UIFont preferredFontForTextStyle:UIFontTextStyleBody];
        _recognizerLabel.adjustsFontForContentSizeCategory = YES;
        _recognizerLabel.textColor = [UIColor orangeColor];
        [self.view addSubview:_recognizerLabel];
    }
    return _recognizerLabel;
}

@end

4. Method 2: Recognizing Live Speech from the Microphone

#import "ViewController.h"
#import <Speech/Speech.h>

@interface ViewController () <SFSpeechRecognizerDelegate>

@property (nonatomic, strong) AVAudioEngine *audioEngine;                           // audio engine that captures microphone input
@property (nonatomic, strong) SFSpeechRecognizer *speechRecognizer;                 // speech recognizer
@property (nonatomic, strong) SFSpeechAudioBufferRecognitionRequest *speechRequest; // recognition request fed from audio buffers
@property (nonatomic, strong) SFSpeechRecognitionTask *currentSpeechTask;           // current recognition task
@property (nonatomic, strong) UILabel *showLb;    // label that displays the transcript
@property (nonatomic, strong) UIButton *startBtn; // start/stop button

@end

@implementation ViewController

- (void)viewDidLoad
{
    [super viewDidLoad];
    // Set up the AVAudioEngine and the recognizer up front; a fresh
    // SFSpeechAudioBufferRecognitionRequest is created for each recording session
    self.audioEngine = [AVAudioEngine new];
    self.speechRecognizer = [SFSpeechRecognizer new]; // uses the device locale
    self.startBtn.enabled = NO;

    [SFSpeechRecognizer requestAuthorization:^(SFSpeechRecognizerAuthorizationStatus status)
    {
        if (status != SFSpeechRecognizerAuthorizationStatusAuthorized)
        {
            // Bail out unless the user authorized speech recognition
            return;
        }
        // Install a tap on the input node so every captured buffer is
        // forwarded to the current recognition request
        [self.audioEngine.inputNode installTapOnBus:0 bufferSize:1024 format:[self.audioEngine.inputNode outputFormatForBus:0] block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when)
        {
            // Feed the captured audio to the request (a no-op while speechRequest is still nil)
            [self.speechRequest appendAudioPCMBuffer:buffer];
        }];
        // Preallocate the resources the engine needs to start
        [self.audioEngine prepare];
        // The authorization callback is not guaranteed to arrive on the main queue,
        // so hop back before touching UIKit
        dispatch_async(dispatch_get_main_queue(), ^{
            self.startBtn.enabled = YES;
        });
    }];
}

- (void)onStartBtnClicked
{
    if (self.currentSpeechTask.state == SFSpeechRecognitionTaskStateRunning)
    {   // A recognition task is running: stop dictating
        [self.startBtn setTitle:@"Start recording" forState:UIControlStateNormal];
        [self stopDictating];
    }
    else
    {   // No task is running: start dictating
        [self.startBtn setTitle:@"Stop recording" forState:UIControlStateNormal];
        self.showLb.text = @"Waiting...";
        [self startDictating];
    }
}

- (void)startDictating
{
    NSError *error;
    // Start the audio engine
    [self.audioEngine startAndReturnError:&error];
    // Create a fresh request for this session
    self.speechRequest = [SFSpeechAudioBufferRecognitionRequest new];
    // Kick off recognition with the buffer-based request
    self.currentSpeechTask =
    [self.speechRecognizer recognitionTaskWithRequest:self.speechRequest resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable error)
    {
        // Show each partial/final transcription as it arrives
        if (result == nil) return;
        self.showLb.text = result.bestTranscription.formattedString;
    }];
}

- (void)stopDictating
{
    // Stop the audio engine and tell the request that no more audio is coming
    [self.audioEngine stop];
    [self.speechRequest endAudio];
}

#pragma mark - getter

- (UILabel *)showLb {
    if (!_showLb) {
        _showLb = [[UILabel alloc] initWithFrame:CGRectMake(50, 180, self.view.bounds.size.width - 100, 100)];
        _showLb.numberOfLines = 0;
        _showLb.font = [UIFont preferredFontForTextStyle:UIFontTextStyleBody];
        _showLb.text = @"Waiting...";
        _showLb.adjustsFontForContentSizeCategory = YES;
        _showLb.textColor = [UIColor orangeColor];
        [self.view addSubview:_showLb];
    }
    return _showLb;
}

- (UIButton *)startBtn {
    if (!_startBtn) {
        _startBtn = [UIButton buttonWithType:UIButtonTypeCustom];
        _startBtn.frame = CGRectMake(50, 80, 80, 80);
        [_startBtn addTarget:self action:@selector(onStartBtnClicked) forControlEvents:UIControlEventTouchUpInside];
        [_startBtn setBackgroundColor:[UIColor redColor]];
        [_startBtn setTitle:@"Record" forState:UIControlStateNormal];
        [_startBtn setTitleColor:[UIColor whiteColor] forState:UIControlStateNormal];
        [self.view addSubview:_startBtn];
    }
    return _startBtn;
}

@end
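One thing this live-recognition sample leaves out is configuring the shared AVAudioSession before the engine starts capturing. The fragment below is a hedged sketch of what is commonly added at the top of startDictating; it is not part of the original demo, and the category/mode/options chosen here are assumptions you may want to adjust:

// Assumed addition (not in the original demo): put the audio session into a
// recording-friendly configuration before starting the AVAudioEngine.
NSError *sessionError = nil;
AVAudioSession *session = [AVAudioSession sharedInstance];
[session setCategory:AVAudioSessionCategoryRecord
                mode:AVAudioSessionModeMeasurement
             options:AVAudioSessionCategoryOptionDuckOthers
               error:&sessionError];
[session setActive:YES
       withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation
             error:&sessionError];
if (sessionError) {
    NSLog(@"Audio session setup failed -- %@", sessionError);
}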

5. Appendix: Language Codes

Each supported language/region has a locale identifier (for example zh_CN for Simplified Chinese or en_US for US English) that can be passed to SFSpeechRecognizer's initWithLocale:.
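Rather than maintaining a static table, you can ask the framework which locales it supports on the current device. A small sketch:

// Log every locale the Speech framework can recognize, with a human-readable name
for (NSLocale *locale in [SFSpeechRecognizer supportedLocales]) {
    NSLog(@"%@ -- %@", locale.localeIdentifier,
          [locale displayNameForKey:NSLocaleIdentifier value:locale.localeIdentifier]);
}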

6. Know the How and the Why

Key classes in the Speech framework:

SFSpeechRecognizer: the central class of the framework. It is used to request user authorization, configure the locale, and send recognition requests to Apple's speech service.

SFSpeechRecognitionTask: represents a single recognition request in flight; every request is tracked by an SFSpeechRecognitionTask instance, and the SFSpeechRecognitionTaskDelegate protocol defines the callbacks that report its progress.

SFSpeechRecognitionRequest: the abstract recognition request class; instantiate one of its subclasses.

SFSpeechURLRecognitionRequest: builds a recognition request from an audio file URL.

SFSpeechAudioBufferRecognitionRequest: builds a recognition request from a live audio stream (PCM buffers).

SFSpeechRecognitionResult: the result of a recognition request.

SFTranscription: a transcription of the recognized speech (see the sketch below for how to inspect it).
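To make the last two classes concrete, the fragment below sketches how a result can be unpacked inside a resultHandler block (the result variable is assumed to come from that handler); bestTranscription, transcriptions, segments, substring, confidence, timestamp and duration are the framework's own properties:

// Inside a recognition resultHandler block: result.bestTranscription is the most
// likely transcription, result.transcriptions holds the alternatives, and each
// SFTranscription is made up of SFTranscriptionSegment objects.
if (result.isFinal) {
    for (SFTranscription *transcription in result.transcriptions) {
        NSLog(@"candidate: %@", transcription.formattedString);
    }
    for (SFTranscriptionSegment *segment in result.bestTranscription.segments) {
        NSLog(@"'%@' confidence:%.2f timestamp:%.2f duration:%.2f",
              segment.substring, segment.confidence, segment.timestamp, segment.duration);
    }
}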

7. Demo Repository

https://github.com/BeanMan/SpeechFramWork

Reference articles:

http://www.lxweimin.com/p/c4de4ee2134d

http://www.lxweimin.com/p/487147605e08

These notes stand on the shoulders of giants.

From rookie to veteran, let's keep moving forward together. If you found this useful, please give it a like or a follow.

Questions and discussion are always welcome by email: 383708669@qq.com
