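Visualizing the comment flow of a Twitch stream archive

This is the day 22 article of the MYJLab Advent Calendar 2020.

I had planned to write about responder, but I ran out of time and gave up on that...

I had previously fetched Twitch comments and tinkered with them, so I turned that into this article.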

What we'll do

Fetch the comments of a Twitch stream archive and write them to a CSV file.

Use Matplotlib to visualize the comment flow.

Preparation

Environment

Python 3.7.9
Jupyter Notebook

Twitch API

Obtain a client-id from Twitch Developers.

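Implementation

Fetching the stream archive comments

Fetch the comments from a Twitch stream archive and write them to a CSV file.

This time I fetched the comments from one of stylishnoob's stream archives.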

get_comments.py

import csv

import requests

client_id = 'your client-id'
video_id = '841484712'

base_url = 'https://api.twitch.tv/v5/videos/' + video_id + '/comments'
headers = {'client-id': client_id}

# Open the file once. newline='' avoids blank rows on Windows,
# and utf-8 keeps non-ASCII comments intact.
with open('comments.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    # Header row, so the columns can be referred to by name later
    writer.writerow(['commented_sec', 'comment'])

    # First request: start from the beginning of the video
    params = {'content_offset_seconds': 0}
    while True:
        r = requests.get(base_url, headers=headers, params=params)
        r.raise_for_status()
        raw_data = r.json()

        for comment in raw_data['comments']:
            writer.writerow([
                comment['content_offset_seconds'],
                comment['message']['body']
            ])

        # Second and later requests: follow the cursor until no '_next' is returned
        if '_next' not in raw_data:
            break
        params = {'cursor': raw_data['_next']}
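The request loop above keeps following a cursor until the response no longer contains `_next`. As a minimal sketch of that paging pattern in isolation, the generator below takes a hypothetical `fetch_page` callable (not part of the original script) that stands in for the `requests.get(...).json()` call, so the logic can be tried without network access. The response shape (`comments`, `_next`) is taken from the script above; the fake pages are made up.

```python
def iter_comments(fetch_page):
    """Yield (offset_seconds, body) for every comment across all pages.

    fetch_page is a hypothetical callable: it receives a cursor (None for
    the first page) and returns a dict shaped like the responses above.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        for comment in page['comments']:
            yield comment['content_offset_seconds'], comment['message']['body']
        # No '_next' key means this was the last page
        if '_next' not in page:
            break
        cursor = page['_next']


# Two fake pages standing in for real API responses (assumed data)
FAKE_PAGES = {
    None: {'comments': [{'content_offset_seconds': 1.37,
                         'message': {'body': 'OhMyDog OhMyDog'}}],
           '_next': 'page2'},
    'page2': {'comments': [{'content_offset_seconds': 5.502,
                            'message': {'body': 'sekiKansya'}}]},
}

rows = list(iter_comments(FAKE_PAGES.get))
print(rows)
```

Keeping the paging separate from the HTTP call and the CSV writing makes it possible to check the termination condition on canned data before spending real requests.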

Generated CSV file (excerpt)

comments.csv

commented_sec,comment
1.37,OhMyDog OhMyDog
5.502,sekiKansya
7.952,sekiKansya sekiKansya sekiKansya
10.084,sekiKansya sekiKansya sekiKansya
11.785,sekiOct sekiOct sekiOct
13.287,せきさああああああああああん
13.481,sekiKansya sekiKansya sekiKansya
13.527,sekiKansya sekiKansya sekiKansya
14.75,OhMyDog OhMyDog OhMyDog
14.983,感謝します

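Visualizing the comment flow

Use Matplotlib to visualize the comment flow by plotting the number of comments per minute.

From here on, everything runs in Jupyter Notebook.

First, import the required libraries and load the CSV file generated above.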

import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import display

df = pd.read_csv('comments.csv')

Next, process the data.

# Convert the comment timestamps from seconds to whole minutes
df['commented_min'] = (df['commented_sec'] // 60).astype(int)
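To see what this binning step produces, here is a small sketch on made-up offsets. It also counts the comments per minute and fills in minutes that received no comments with 0; that zero-filling is an addition for illustration, not something the original code does.

```python
import pandas as pd

# Made-up offsets in seconds; minute 1 deliberately has no comments
sample = pd.DataFrame({
    'commented_sec': [1.37, 5.502, 59.9, 125.0, 130.2],
    'comment': ['a', 'b', 'c', 'd', 'e'],
})

# Floor-divide by 60 to get the minute bucket, as in the article
sample['commented_min'] = (sample['commented_sec'] // 60).astype(int)

# Count rows per minute, then reindex so empty minutes appear as 0
per_minute = sample.groupby('commented_min').size()
per_minute = per_minute.reindex(range(per_minute.index.max() + 1), fill_value=0)

print(per_minute.tolist())  # [3, 0, 2]
```

Without the `reindex`, a quiet minute would simply be missing from the result, and the line on the plot would jump straight over it.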

Finally, draw the graphs.

# Minute in which the last comment was posted
last_commented_minutes = int(df['commented_min'].max())

# Comments per minute; reindex so that minutes without comments count as 0
counts = df.groupby('commented_min').size()
counts = counts.reindex(range(last_commented_minutes + 1), fill_value=0)

# The number of graphs depends on last_commented_minutes: one graph per 60 minutes
graph_number = last_commented_minutes // 60 + 1

# figsize also depends on graph_number: (20, 7) per graph.
# squeeze=False keeps ax two-dimensional even when there is only one graph.
fig, ax = plt.subplots(graph_number, 1, figsize=(20, graph_number * 7), squeeze=False)

for first_index in range(0, last_commented_minutes + 1, 60):
    axis = ax[first_index // 60, 0]

    # The 60-minute window shown in this graph
    window = counts.iloc[first_index : first_index + 60]

    # Number of comments per minute
    axis.plot(window.index, window.values, marker=".", label='the number of comments')

    axis.grid(axis='both')
    axis.set_xlim(first_index, first_index + 60)
    axis.legend()
plt.show()

The blue line shows the number of comments per minute.

The comment flow clearly varies quite a bit over time.
This is a rough measure, but it looks usable for spotting the high points of a stream.
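To make "high points" a little more concrete, one option is to list the minutes with the most comments. This is only a sketch and not something the original notebook does: `top_minutes` is a hypothetical helper, and the counts below are made up.

```python
import pandas as pd


def top_minutes(per_minute, n=3):
    """Return the n busiest minutes as (minute, count) pairs, busiest first."""
    busiest = per_minute.nlargest(n)
    return [(int(minute), int(count)) for minute, count in busiest.items()]


# Made-up comments-per-minute series, indexed by minute
per_minute = pd.Series([12, 40, 8, 95, 33], index=range(5))

print(top_minutes(per_minute, n=2))  # [(3, 95), (1, 40)]
```

With the DataFrame from this article, the input would be something like `df.groupby('commented_min').size()`.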

References

A summary of how to fetch comments and video information from the major sites where comments flow alongside the video

Original article: https://qiita.com/kanekom/items/42ed3cd079fa5409ae58
