Building a transcription pipeline with S3 → Lambda → Transcribe → S3

1 S3(input)

Create an S3 bucket to hold the audio files.
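
The bucket can also be created from code instead of the console. The following is a minimal sketch using boto3's create_bucket; the bucket name and region are placeholders chosen for illustration, not values from the original article. The one detail that tends to trip people up is that us-east-1 must be created without a CreateBucketConfiguration.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build the keyword arguments for boto3's s3.create_bucket."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and rejects an explicit LocationConstraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_bucket(bucket_name, region):
    """Create the bucket (needs AWS credentials; not called in this sketch)."""
    import boto3  # imported lazily so the helper above works without AWS access

    s3 = boto3.client("s3", region_name=region)
    return s3.create_bucket(**create_bucket_kwargs(bucket_name, region))


# "example-input-bucket" is a hypothetical name used only for illustration.
print(create_bucket_kwargs("example-input-bucket", "ap-northeast-1"))
```

The same helper can be reused for the output bucket in step 3.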

2 Lambda

Create the function from the s3-get-object-python blueprint.

Select the S3 bucket created in "1 S3(input)" and check "Enable trigger".
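
For reference, the trigger that the console sets up corresponds roughly to an S3 bucket notification. The sketch below builds such a configuration and applies it with put_bucket_notification_configuration. The function ARN, the bucket name, and the ".mp3" suffix filter are assumptions added for illustration; the original article only uses the console checkbox. The console also grants S3 permission to invoke the function automatically, which this sketch does not do, and the API call replaces any notification configuration already on the bucket.

```python
def build_notification_config(function_arn, suffix=".mp3"):
    """Build a notification configuration that invokes a Lambda on object creation."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:*"],
                # Assumption: only react to keys ending in the given suffix.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
                },
            }
        ]
    }


def enable_trigger(bucket_name, function_arn):
    """Apply the configuration (needs AWS credentials; not called in this sketch)."""
    import boto3  # imported lazily so the builder above works without AWS access

    s3 = boto3.client("s3")
    s3.put_bucket_notification_configuration(
        Bucket=bucket_name,
        NotificationConfiguration=build_notification_config(function_arn),
    )


# Hypothetical ARN used only for illustration.
config = build_notification_config(
    "arn:aws:lambda:ap-northeast-1:123456789012:function:example-transcribe"
)
print(config["LambdaFunctionConfigurations"][0]["Events"])
```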

import json
import urllib.parse
import boto3

print('Loading function')

s3 = boto3.client('s3')


def lambda_handler(event, context):
    print("Received event: " + json.dumps(event, indent=2))

    # Get the object from the event and show its content type
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    try:
        response = s3.get_object(Bucket=bucket, Key=key)
        print("CONTENT TYPE: " + response['ContentType'])
        return response['ContentType']
    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e

Upload an audio file to the S3 bucket and check CloudWatch Logs to confirm that the function ran.
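
Before uploading anything, the event-parsing part of the handler can be exercised locally with a hand-built event. The sketch below is trimmed to the fields the blueprint reads; the bucket name and key are made up. It also shows why unquote_plus is needed: S3 URL-encodes the key in the event, so spaces arrive as "+" and parentheses as percent escapes.

```python
import urllib.parse


def extract_bucket_and_key(event):
    """Pull the bucket name and decoded object key out of an S3 event."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["object"]["key"], encoding="utf-8")
    return bucket, key


# A hypothetical event, reduced to the fields used above.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-input-bucket"},
                "object": {"key": "recordings/my+voice+memo%281%29.mp3"},
            }
        }
    ]
}

bucket, key = extract_bucket_and_key(sample_event)
print(bucket)  # example-input-bucket
print(key)     # recordings/my voice memo(1).mp3
```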

3 S3(output)

Create an S3 bucket to receive the transcription output.

4 Modifying the Lambda function

First, attach AmazonTranscribeFullAccess and AmazonS3FullAccess to the function's execution role.

※ Edit the Lambda function, using the TranscribeService documentation as a reference.
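
If the role is managed from code, attaching the two AWS managed policies might look like the sketch below. The role name is a hypothetical placeholder; the policy names are the ones given in this step. Both are broad FullAccess policies, which is convenient for a walkthrough but wider than the pipeline strictly needs.

```python
MANAGED_POLICIES = ["AmazonTranscribeFullAccess", "AmazonS3FullAccess"]


def managed_policy_arn(policy_name):
    """Return the ARN of an AWS managed policy in the standard partition."""
    return "arn:aws:iam::aws:policy/" + policy_name


def attach_policies(role_name):
    """Attach the policies (needs AWS credentials; not called in this sketch)."""
    import boto3  # imported lazily so the helper above works without AWS access

    iam = boto3.client("iam")
    for name in MANAGED_POLICIES:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=managed_policy_arn(name))


for name in MANAGED_POLICIES:
    print(managed_policy_arn(name))
```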

import json
import urllib.parse
import boto3
import datetime

s3 = boto3.client('s3')
transcribe = boto3.client('transcribe')

def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    try:
        transcribe.start_transcription_job(
            TranscriptionJobName= datetime.datetime.now().strftime('%Y%m%d%H%M%S') + '_Transcription',
            LanguageCode='ja-JP',
            Media={
                # Transcribe accepts an S3 URI, so the region does not need to be hard-coded
                'MediaFileUri': 's3://' + bucket + '/' + key
            },
            OutputBucketName='naata-ouput'
        )
    except Exception as e:
        print(e)
        print('Error starting transcription job for object {} from bucket {}. Make sure the object exists and the bucket is in the same region as this function.'.format(key, bucket))
        raise e
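
One thing to watch in the handler above: the job name is a timestamp with one-second resolution, and Transcribe rejects a job name that already exists, so two files uploaded within the same second would collide. A minimal sketch of one way around this is to append a short random suffix. The helper name, the suffix length, and the bucket names here are assumptions for illustration, not part of the original article.

```python
import datetime
import uuid


def build_job_request(bucket, key, output_bucket, language_code="ja-JP"):
    """Build keyword arguments for transcribe.start_transcription_job."""
    timestamp = datetime.datetime.now(datetime.timezone.utc).strftime("%Y%m%d%H%M%S")
    # A random suffix keeps names unique for uploads within the same second.
    job_name = "{}_{}_Transcription".format(timestamp, uuid.uuid4().hex[:8])
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": "s3://{}/{}".format(bucket, key)},
        "OutputBucketName": output_bucket,
    }


# Hypothetical bucket names and key, for illustration only.
request = build_job_request("example-input-bucket", "voice.mp3", "example-output-bucket")
print(request["TranscriptionJobName"])
print(request["Media"]["MediaFileUri"])
```

In the handler, the result would be passed as transcribe.start_transcription_job(**request).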

5 Transcription

Upload an mp3 file to the bucket from "1 S3(input)", then confirm that the transcript
was written to the "3 S3(output)" bucket, which is shown as the Output data location
of the transcription job.
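
The output object is a JSON document, and the transcribed text sits under results.transcripts. The sketch below pulls the text out of such a document. The sample JSON is a heavily abbreviated, made-up illustration of that shape, not real job output, and the fetch helper assumes the object key is the job name followed by ".json", which is the default when no output key is specified.

```python
import json


def extract_transcript(transcript_json):
    """Return the transcribed text from a Transcribe output document."""
    document = json.loads(transcript_json)
    return "\n".join(t["transcript"] for t in document["results"]["transcripts"])


def fetch_transcript(output_bucket, job_name):
    """Download and parse the result (needs AWS credentials; not called here)."""
    import boto3  # imported lazily so the parser above works without AWS access

    s3 = boto3.client("s3")
    body = s3.get_object(Bucket=output_bucket, Key=job_name + ".json")["Body"].read()
    return extract_transcript(body.decode("utf-8"))


# Abbreviated, hypothetical sample of the output shape.
sample = json.dumps(
    {
        "jobName": "20240101000000_Transcription",
        "results": {"transcripts": [{"transcript": "こんにちは"}], "items": []},
        "status": "COMPLETED",
    }
)
print(extract_transcript(sample))
```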

Reference

Original article (Building a transcription pipeline with S3 → Lambda → Transcribe → S3): https://qiita.com/leomaro7/items/ad9726391d547ea3bcfd
