Add Speech Recognition to Your PC, Even to Your TV


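Overview of My Submission

Most of our everyday computer use involves the computer as a sound device, so I thought it would be nice to hook the default audio output up to speech recognition and recognize every word, no matter which software is playing it: Teams, YouTube, TikTok, Twitter, Edge, VLC, … And how far can we push it, for example to captions for cable TV? 😊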

Submission Category:

Accessibility Advocates

Link to Code on GitHub

bleakview / deepgramwinsys

A Deepgram sound-to-text converter for all sound emitted by Windows

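What can you find in this repo?

How to get started with Deepgram in Windows Forms

A sample custom label control with a border in Windows Forms

How to get and capture the system-wide default audio output

How to record the captured audio as mp3

How to save and retrieve system settings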


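Additional Resources / Info

System-wide speech recognition basically needs two components: a speech recognition service such as Deepgram, and a way to tap into the sound produced by the PC. Since we will be hooking directly into the system, I chose C# as the language. On Windows, loopback (the technical term for hooking into the system's own sound output) is available through WASAPI (the Windows Audio Session API), and I chose the NAudio library to access it.
And here are some results.

It works in Teams.

It works in the browser.

While the system has settings, transparency... properties, the most important part is getting the audio and recognizing it: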

private async void ConvertAndTranscript()
{
    //enter credentials for deepgram
    var credentials = new Credentials(textBoxApiKey.Text);
    //Create our export folder to record sound and CSV file
    var outputFolder = CreateRecordingFolder();
    //File settings
    var dateTimeNow = DateTime.Now;
    var fileName = $"{dateTimeNow.Year}_{dateTimeNow.Month}_{dateTimeNow.Day}#{dateTimeNow.Hour}_{dateTimeNow.Minute}_{dateTimeNow.Second}_record";
    var soundFileName = $"{fileName}.mp3";
    var csvFileName = $"{fileName}.csv";
    var outputSoundFilePath = Path.Combine(outputFolder, soundFileName);
    var outputCSVFilePath = Path.Combine(outputFolder, csvFileName);
    //init deepgram
    var deepgramClient = new DeepgramClient(credentials);
    //init loopback interface
    _WasapiLoopbackCapture = new WasapiLoopbackCapture();
    //generate memory stream and deepgram client
    using (var memoryStream = new MemoryStream())
    using (var deepgramLive = deepgramClient.CreateLiveTranscriptionClient())
    {
        //the format that we will send to deepgram is 24 kHz, 16 bit, 2 channels
        var waveFormat = new WaveFormat(24000, 16, 2);
        var deepgramWriter = new WaveFileWriter(memoryStream, waveFormat);
        //mp3 writer if we wanted to save audio
        LameMP3FileWriter? mp3Writer = checkBoxSaveMP3.Checked ?
            new LameMP3FileWriter(outputSoundFilePath, _WasapiLoopbackCapture.WaveFormat, LAMEPreset.STANDARD_FAST) : null;

        //file writer if we wanted to save as csv
        StreamWriter? csvWriter = checkBoxSaveAsCSV.Checked ? File.CreateText(outputCSVFilePath) : null;
        //deepgram options
        var options = new LiveTranscriptionOptions()
        {
            Punctuate = true,
            Diarize = true,
            Encoding = Deepgram.Common.AudioEncoding.Linear16,
            ProfanityFilter = checkBoxProfinityAllowed.Checked,
            Language = _SelectedLanguage.LanguageCode,
            Model = _SelectedModel.ModelCode,
        };
        //connect
        await deepgramLive.StartConnectionAsync(options);
        //handle transcripts received from deepgram; this is mostly taken from their samples
        deepgramLive.TranscriptReceived += (s, e) =>
        {
            try
            {
                if (e.Transcript.IsFinal &&
                   e.Transcript.Channel.Alternatives.First().Transcript.Length > 0)
                {
                    var transcript = e.Transcript;
                    var text = $"{transcript.Channel.Alternatives.First().Transcript}";
                    _CaptionForm?.captionLabel.BeginInvoke((Action)(() =>
                    {
                        csvWriter?.WriteLine($@"{DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss \"GMT\"zzz")},""{text}""");
                        _CaptionForm.captionLabel.Text = text;
                        _CaptionForm?.captionLabel.Refresh();
                    }));
                }
            }
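            catch (Exception ex)
            {
                //don't let a UI or file error kill the event handler, but leave a trace
                System.Diagnostics.Debug.WriteLine(ex);
            }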
        };
        deepgramLive.ConnectionError += (s, e) =>
        {
            //connection errors are currently ignored
        };
        //when Windows tells us that there is sound data ready to be processed
        //(event driven, better than polling)
        _WasapiLoopbackCapture.DataAvailable += (s, a) =>
        {
            mp3Writer?.Write(a.Buffer, 0, a.BytesRecorded);
            var buffer = ToPCM16(a.Buffer, a.BytesRecorded, _WasapiLoopbackCapture.WaveFormat);
            deepgramWriter.Write(buffer, 0, buffer.Length);
            deepgramLive.SendData(memoryStream.ToArray());
            //empty the stream so the next chunk does not carry stale bytes;
            //resetting Position alone would leave the old Length in place
            memoryStream.SetLength(0);
        };
        //when recording has stopped, flush and release all file handles
        _WasapiLoopbackCapture.RecordingStopped += (s, a) =>
        {
            if (mp3Writer != null)
            {
                mp3Writer.Dispose();
                mp3Writer = null;
            }
            if (csvWriter != null)
            {
                csvWriter.Dispose();
                csvWriter = null;
            }
            _WasapiLoopbackCapture.Dispose();
        };
        _WasapiLoopbackCapture.StartRecording();
        while (_WasapiLoopbackCapture.CaptureState != NAudio.CoreAudioApi.CaptureState.Stopped)
        {
            if (_CancellationTokenSource?.IsCancellationRequested == true)
            {
                _CancellationTokenSource?.Dispose();
                _CancellationTokenSource = null;
                return;
            }
            //wait without blocking the calling thread
            await Task.Delay(500);
        }
    }
}
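
The `ToPCM16` helper called in the `DataAvailable` handler is not shown in this post. WASAPI loopback capture typically delivers 32-bit IEEE float samples, while Deepgram is told to expect `Linear16`, so some conversion step is needed. Here is a minimal sketch of such a conversion, not the author's implementation: the names `PcmConverter` and `FloatToPcm16` are hypothetical, the input is assumed to be 32-bit float on a little-endian machine, and the sample rate and channel count are left untouched.

```csharp
using System;

public static class PcmConverter
{
    // Hypothetical helper: converts 32-bit IEEE float samples to 16-bit signed PCM.
    // Assumes a little-endian machine and does not resample or remix channels.
    public static byte[] FloatToPcm16(byte[] input, int bytesRecorded)
    {
        int sampleCount = bytesRecorded / 4;      // 4 bytes per float sample
        var output = new byte[sampleCount * 2];   // 2 bytes per 16-bit sample

        for (int i = 0; i < sampleCount; i++)
        {
            float sample = BitConverter.ToSingle(input, i * 4);

            // Clamp to [-1, 1] so out-of-range samples do not wrap around.
            if (sample > 1f) sample = 1f;
            if (sample < -1f) sample = -1f;

            short pcm = (short)(sample * short.MaxValue);

            // Store the 16-bit sample low byte first.
            output[i * 2] = (byte)(pcm & 0xFF);
            output[i * 2 + 1] = (byte)((pcm >> 8) & 0xFF);
        }

        return output;
    }
}
```

A call such as `PcmConverter.FloatToPcm16(a.Buffer, a.BytesRecorded)` could stand in for `ToPCM16` only if the capture format matches these assumptions; matching the 24 kHz format declared for Deepgram would need an additional resampling step.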

The rest of the code (showing and hiding forms, and so on) is there so that you get ready-to-run code.

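So, finally, how can you see captions on a TV? To achieve this you must somehow feed the TV signal into the PC. I use a USB capture card, and once I have the audio signal I use OBS to process the capture card input. Since the program processes all output sound, the source of the signal makes no difference. I then use the computer's HDMI output to send the signal to the TV. As far as the TV and the cable box are concerned, nothing has changed.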

P.S.: If you experience lag, check your network connection. There also seems to be a problem with the memory stream, and I'm not happy with the hacky solution. All PRs are welcome.
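
One thing worth checking for the memory-stream issue, offered as a guess rather than a diagnosis: `MemoryStream.ToArray()` returns everything up to `Length`, regardless of `Position`. If a stream is reused as a scratch buffer and only rewound with `Position = 0`, a shorter write leaves bytes from the previous chunk at the tail of the next array. The sketch below uses only `System.IO`; the class and method names are made up for illustration.

```csharp
using System;
using System.IO;

public static class ScratchBufferDemo
{
    // Writes two chunks through one reused MemoryStream and returns the size
    // of the array that would be sent after the second write.
    public static int SecondChunkSize(bool truncate)
    {
        using (var stream = new MemoryStream())
        {
            stream.Write(new byte[] { 1, 2, 3, 4 }, 0, 4);  // first chunk: 4 bytes
            stream.ToArray();                                // "send" the first chunk

            if (truncate)
                stream.SetLength(0);   // drops the old contents and rewinds
            else
                stream.Position = 0;   // rewinds only; Length stays at 4

            stream.Write(new byte[] { 9, 9 }, 0, 2);        // second chunk: 2 bytes
            return stream.ToArray().Length;
        }
    }
}
```

`ScratchBufferDemo.SecondChunkSize(false)` returns 4 (two new bytes followed by two stale ones), while `ScratchBufferDemo.SecondChunkSize(true)` returns 2.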

Reference

For more on this topic (Add Speech Recognition to Your PC, Even to Your TV), see the original article: https://dev.to/bleakview/add-speech-recognition-to-your-pc-even-to-your-tv-4j0n
