Top Tips for Developing a Recordist Function

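Efficient record keeping is more important than ever. In the digital era, we have to process a rapidly growing volume of information, such as audio and video, within limited time. A real-time transcription function is useful in so many of these scenarios that it has become essential.
In audio or video conferences, this function records the meeting minutes for later reference, which is more convenient than writing them down by hand. I have watched my kids struggle to take notes during online courses, so I know how much easier a transcription function can make this. In short, it removes the chore of writing down everything the teacher says, so kids can focus on the lecture itself and easily review the content afterwards. Live captions also give viewers real-time subtitles for a better watching experience.
As a coder, I believe that "actions speak louder than words." That is why I developed a real-time transcription function using the real-time transcription capability of ML Kit.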

Demo

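The function transcribes up to five hours of speech in real time. It supports Chinese, English (or a mix of the two), and French. The output text is punctuated and includes timestamps.
The function has a few requirements. French support depends on the phone model, whereas Chinese and English are available on all phone models. The function also requires an Internet connection.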
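
Because the output carries timestamps, a caption view usually has to turn a time offset into something readable. The snippet below is a minimal sketch of that step in plain Java. CaptionTimestamp and its methods are hypothetical helpers, not part of ML Kit, and the assumption that offsets arrive in milliseconds should be checked against the ML Kit documentation.

```java
import java.util.Locale;

/** Hypothetical helper: formats a time offset (assumed to be in milliseconds) for captions. */
final class CaptionTimestamp {
    private CaptionTimestamp() {}

    /** Converts a non-negative millisecond offset to "HH:mm:ss.SSS". */
    static String format(long offsetMillis) {
        if (offsetMillis < 0) {
            throw new IllegalArgumentException("offset must be non-negative");
        }
        long hours = offsetMillis / 3_600_000;
        long minutes = (offsetMillis / 60_000) % 60;
        long seconds = (offsetMillis / 1_000) % 60;
        long millis = offsetMillis % 1_000;
        return String.format(Locale.ROOT, "%02d:%02d:%02d.%03d", hours, minutes, seconds, millis);
    }

    /** Prefixes a transcribed sentence with its start time. */
    static String captionLine(long offsetMillis, String sentence) {
        return "[" + format(offsetMillis) + "] " + sentence;
    }

    public static void main(String[] args) {
        // Prints: [01:02:03.045] Welcome to the meeting.
        System.out.println(captionLine(3_723_045L, "Welcome to the meeting."));
    }
}
```

Using hours as the largest unit keeps the format valid for the full five-hour session length mentioned above.
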
Now, on to the main point of this article: how I developed this real-time transcription function.

Development Procedure

i. Make the necessary preparations. These are described in detail in the References section.
ii. Create and configure a speech recognizer.

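import com.huawei.hms.mlsdk.speechrtt.MLSpeechRealTimeTranscription;
import com.huawei.hms.mlsdk.speechrtt.MLSpeechRealTimeTranscriptionConfig;
import com.huawei.hms.mlsdk.speechrtt.MLSpeechRealTimeTranscriptionConstants;

MLSpeechRealTimeTranscriptionConfig config = new MLSpeechRealTimeTranscriptionConfig.Factory()
    // Set the language, which can be Chinese, English, both Chinese and English, or French.
    .setLanguage(MLSpeechRealTimeTranscriptionConstants.LAN_ZH_CN)
    // Punctuate the text recognized from the speech.
    .enablePunctuation(true)
    // Include the time offset of each sentence in the result.
    .enableSentenceTimeOffset(true)
    // Include the time offset of each word in the result.
    .enableWordTimeOffset(true)
    .create();
// Obtain the speech recognizer instance.
MLSpeechRealTimeTranscription mSpeechRecognizer = MLSpeechRealTimeTranscription.getInstance();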

iii. Create a callback for the speech recognition result listener.

import android.os.Bundle;
import com.huawei.hms.mlsdk.speechrtt.MLSpeechRealTimeTranscriptionListener;

// Implement the MLSpeechRealTimeTranscriptionListener API and its methods to receive callbacks.
// API reference: https://developer.huawei.com/consumer/en/doc/development/hiai-References/mlspeechrealtimetranscriptionlistener-0000001159518088
protected class SpeechRecognitionListener implements MLSpeechRealTimeTranscriptionListener {
    @Override
    public void onStartListening() {
        // The recorder starts to receive speech.
    }

    @Override
    public void onStartingOfSpeech() {
        // The speech recognizer detects the user speaking.
    }

    @Override
    public void onVoiceDataReceived(byte[] data, float energy, Bundle bundle) {
        // Receives the original PCM stream and the audio power. This callback does not run
        // in the main thread, so the returned data is processed in a sub-thread.
    }

    @Override
    public void onRecognizingResults(Bundle partialResults) {
        // Receive recognized text from MLSpeechRealTimeTranscription.
    }

    @Override
    public void onError(int error, String errorMessage) {
        // Callback when an error occurs during recognition.
    }

    @Override
    public void onState(int state, Bundle params) {
        // Notify the app of the recognizer status change.
    }
}
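
The listener above leaves onRecognizingResults empty. One common way to handle streaming results is to keep revisable (interim) text apart from sentences that are already final, so the display does not flicker or duplicate text. The following is a minimal sketch of that idea in plain Java. TranscriptBuffer is a hypothetical helper and not part of ML Kit, and it assumes the recognized text and an "is final" flag have already been read out of the result Bundle; how to do that is not covered in this article, so check the ML Kit API reference.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that separates interim text from finalized sentences. */
final class TranscriptBuffer {
    private final List<String> finalizedSentences = new ArrayList<>();
    private String interimText = "";

    /**
     * Records one recognition update.
     *
     * @param text    text carried by the update (assumed to be extracted from the result Bundle)
     * @param isFinal true if this text will no longer be revised (assumed flag)
     */
    synchronized void onResult(String text, boolean isFinal) {
        if (text == null || text.trim().isEmpty()) {
            return; // Ignore empty updates.
        }
        if (isFinal) {
            finalizedSentences.add(text.trim());
            interimText = ""; // The interim text has been superseded.
        } else {
            interimText = text.trim(); // Replace, do not append, interim text.
        }
    }

    /** Returns a copy of the sentences finalized so far. */
    synchronized List<String> finalized() {
        return new ArrayList<>(finalizedSentences);
    }

    /** Returns the finalized sentences followed by the current interim text. */
    synchronized String display() {
        StringBuilder sb = new StringBuilder(String.join(" ", finalizedSentences));
        if (!interimText.isEmpty()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(interimText);
        }
        return sb.toString();
    }
}
```

The methods are synchronized because listener callbacks may arrive on a thread other than the one that renders the text.
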

iv. Bind the listener to the speech recognizer.

mSpeechRecognizer.setRealTimeTranscriptionListener(new SpeechRecognitionListener());

v. Call startRecognizing to start speech recognition.

mSpeechRecognizer.startRecognizing(config);

vi. When recognition is complete, stop it and release the resources occupied by the recognizer.

if (mSpeechRecognizer != null) {
    mSpeechRecognizer.destroy();
}
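
Releasing the recognizer exactly once, even when something fails along the way, is easier to get right if the cleanup lives in one place. The sketch below illustrates that pattern in plain Java. TranscriptionEngine and TranscriptionSession are hypothetical names invented for this illustration and are not ML Kit classes; in a real app, the two engine methods would delegate to the startRecognizing and destroy calls shown above.

```java
/** Hypothetical stand-in for the recognizer's lifecycle; not part of ML Kit. */
interface TranscriptionEngine {
    void start();
    void destroy();
}

/** Wraps an engine so that its resources are released at most once. */
final class TranscriptionSession implements AutoCloseable {
    private final TranscriptionEngine engine;
    private boolean released;

    TranscriptionSession(TranscriptionEngine engine) {
        if (engine == null) {
            throw new IllegalArgumentException("engine must not be null");
        }
        this.engine = engine;
    }

    /** Starts recognition; fails fast if the session was already released. */
    synchronized void start() {
        if (released) {
            throw new IllegalStateException("session already released");
        }
        engine.start();
    }

    /** Releases the engine. Calling this more than once has no further effect. */
    @Override
    public synchronized void close() {
        if (released) {
            return;
        }
        released = true;
        engine.destroy();
    }
}
```

With this wrapper, close() can be called from a try-with-resources block or from a teardown hook without worrying about double release.
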

References

Configuring Necessary Information During Preparation

Adding the Plug-In and the Maven Repository Address, and Configuring Build Dependencies

Source

Top Tips for Developing a Recordist Function: https://dev.to/hmscore/top-tips-for-developing-a-recordist-function-3fnp
