Trying out a BigQuery federated query against Cloud SQL data from Dataflow Workbench.

Starting Dataflow Workbench


Google Cloud Console > Dataflow > Workbench

Click New Notebook > Apache Beam > Without GPUs, leave the settings at their defaults, and click "Create".
Once the instance has started, click "OPEN JUPYTERLAB".

Code

import apache_beam as beam
from apache_beam.runners.interactive.interactive_runner import InteractiveRunner
import apache_beam.runners.interactive.interactive_beam as ib
from apache_beam.options import pipeline_options
#from apache_beam.options.pipeline_options import GoogleCloudOptions
#import google.auth
from apache_beam.io import ReadFromBigQuery
ib.options.recording_duration = '1m'
options = pipeline_options.PipelineOptions(project='<project id>', temp_location='gs://<bucket name>/temp')

p = beam.Pipeline(InteractiveRunner(), options=options)
# The Compute Engine default service account must be granted the
# BigQuery Connection User role so it can use the Cloud SQL connection.
query='SELECT * FROM EXTERNAL_QUERY("projects/<project id>/locations/us/connections/cloudesql-fed", "SELECT * FROM federation_test.item;");'
query_results = p | beam.io.ReadFromBigQuery(
    query=query, use_standard_sql=True)
ib.show(query_results, include_window_info=True)
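
The query string above nests one SQL statement inside another, which is easy to get wrong by hand. A minimal sketch of a helper that assembles it is shown below. The function name `build_external_query` and the project ID `my-project` are hypothetical and used only for illustration. The sketch also assumes the inner statement fits on a single line.

```python
def build_external_query(project_id: str, location: str,
                         connection_id: str, inner_sql: str) -> str:
    """Wrap a Cloud SQL statement in a BigQuery EXTERNAL_QUERY call.

    Hypothetical helper for illustration; not part of Apache Beam or BigQuery.
    """
    # Full connection resource name, in the same form as the query above.
    connection = (
        f"projects/{project_id}/locations/{location}"
        f"/connections/{connection_id}"
    )
    # The inner statement is passed as a double-quoted string literal,
    # so backslashes and double quotes inside it must be escaped.
    escaped = inner_sql.replace("\\", "\\\\").replace('"', '\\"')
    return f'SELECT * FROM EXTERNAL_QUERY("{connection}", "{escaped}")'


# "my-project" is a placeholder project ID (assumption).
query = build_external_query(
    "my-project", "us", "cloudesql-fed",
    "SELECT * FROM federation_test.item",
)
print(query)
```

The returned string can be passed as the `query` argument of `beam.io.ReadFromBigQuery` together with `use_standard_sql=True`, as in the code above.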

Result


As the output shows, data stored in Cloud SQL can also be read through a BigQuery federated query.
