Python crawler: scraping the Bilibili popular video ranking


1. Parsing with bs4


import requests
from bs4 import BeautifulSoup
import datetime

if __name__ == '__main__':
    url = 'https://www.bilibili.com/v/popular/rank/all'
    headers = {
        # fill in your own request headers here
    }
    page_text = requests.get(url=url, headers=headers).text
    soup = BeautifulSoup(page_text, 'lxml')
    # every video on the ranking page lives in its own li tag
    li_list = soup.select('.rank-list > li')
    with open('bZhanRank_bs4.txt', 'w', encoding='utf-8') as fp:
        fp.write('Scraped at: ' + str(datetime.datetime.now()) + '\n\n')
        for li in li_list:
            # rank
            li_rank = li.find('div', class_='num').string
            li_rank = 'Rank: ' + li_rank + ', '
            # title
            li_title = li.find('div', class_='info').a.string.strip()
            li_title = 'Title: ' + li_title + ', '
            # view count
            li_viewCount = li.select('.detail>span')[0].text.strip()
            li_viewCount = 'Views: ' + li_viewCount + ', '
            # danmaku (bullet comment) count
            li_danmuCount = li.select('.detail>span')[1].text.strip()
            li_danmuCount = 'Danmaku: ' + li_danmuCount + ', '
            # uploader name
            li_upName = li.find('span', class_='data-box up-name').text.strip()
            li_upName = 'Uploader: ' + li_upName + ', '
            # overall score
            li_zongheScore = li.find('div', class_='pts').div.string
            li_zongheScore = 'Score: ' + li_zongheScore
            fp.write(li_rank + li_title + li_viewCount + li_danmuCount + li_upName + li_zongheScore + '\n')

The scraping results are as follows.

2. Parsing with XPath


import requests
from lxml import etree
import datetime

if __name__ == "__main__":
    # request headers
    headers = {
        # fill in your own request headers here
    }
    # target url
    url = 'https://www.bilibili.com/v/popular/rank/all'
    # fetch the page source
    page_text = requests.get(url=url, headers=headers).content.decode('utf-8')
    # build an etree object from the page source
    tree = etree.HTML(page_text)
    # select every li tag in the ranking list
    li_list = tree.xpath('//ul[@class="rank-list"]/li')
    # open the output file
    with open('./bZhanRank.txt', 'w', encoding='utf-8') as fp:
        # write the scrape time
        fp.write('Time: ' + str(datetime.datetime.now()) + '\n\n')
        # iterate over the li tags and extract each video's details
        for li in li_list:
            # rank
            li_rank = li.xpath('.//div[@class="num"]/text()')
            # xpath returns a list, so [0] takes its first element
            li_rank = 'Rank: ' + li_rank[0] + '\n'
            # title
            li_title = li.xpath('.//a/text()')
            li_title = 'Title: ' + li_title[0] + '\n'
            # view count
            li_viewCount = li.xpath('.//div[@class="detail"]/span[1]/text()')
            # .strip() removes the surrounding whitespace
            li_viewCount = 'Views: ' + li_viewCount[0].strip() + '\n'
            # danmaku (bullet comment) count
            li_barrageCount = li.xpath('.//div[@class="detail"]/span[2]/text()')
            li_barrageCount = 'Danmaku: ' + li_barrageCount[0].strip() + '\n'
            # uploader name
            li_upName = li.xpath('.//span[@class="data-box up-name"]//text()')
            li_upName = 'Uploader: ' + li_upName[0].strip() + '\n'
            # overall score
            li_score = li.xpath('.//div[@class="pts"]/div/text()')
            li_score = 'Score: ' + li_score[0] + '\n\n'
            # write the record
            fp.write(li_rank + li_title + li_viewCount + li_barrageCount + li_upName + li_score)
            print(li_rank.strip() + ' scraped successfully!')

The scraping results are as follows.
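Both scripts above index `[0]` into each query result, which raises an IndexError as soon as the page markup changes and a query returns an empty list. A minimal sketch of a defensive helper follows. The name `first_text` and its `default` parameter are hypothetical and not part of the original scripts; plain Python lists stand in for XPath results here, since `text()` queries return lists of strings.

```python
def first_text(nodes, default=''):
    """Return the first non-empty, stripped string in nodes, or default."""
    for node in nodes:
        text = str(node).strip()
        if text:
            return text
    return default


# A text() query result is a list of strings, so lists are used as stand-ins.
print(first_text(['\n      1.2M\n    ']))
print(first_text([], default='N/A'))
```

With such a helper, a line like `li_viewCount[0].strip()` could become `first_text(li_viewCount, 'N/A')`, so one missing field would not abort the whole run.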

3. Parsing with XPath (displaying the cover images after binarization)


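#---------- imports ----------
import requests  # send HTTP requests
from lxml import etree  # parse the page with xpath
import datetime  # timestamp the output
from PIL import Image  # convert the image format
import cv2  # grayscale conversion and binarization
from io import BytesIO  # wrap downloaded bytes as a stream
import re  # regular-expression parsing
#---------- functions ----------
def dJpg(url, title):
    """
    Download the Bilibili webp cover at url and save it as a jpeg file.
    :param url: image url
    :param title: output file name without the extension
    :return: None (writes title + '.jpg')
    """
    headers = {
        # fill in your own request headers here
    }
    resp = requests.get(url, headers=headers)
    byte_stream = BytesIO(resp.content)
    im = Image.open(byte_stream)
    if im.mode == "RGBA":
        # JPEG has no alpha channel: flatten onto a white background
        im.load()
        background = Image.new("RGB", im.size, (255, 255, 255))
        background.paste(im, mask=im.split()[3])
        im = background
    elif im.mode != "RGB":
        im = im.convert("RGB")
    im.save(title + '.jpg', 'JPEG')
def handle_image(img_path):
    """
    Convert a color image to a binary (black and white) image.
    :param img_path: path of the image file
    :return: the binarized image
    """
    # read the image (OpenCV loads it in BGR channel order)
    img = cv2.imread(img_path)
    # convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # binarize: pixels above 127 become 255, all others become 0
    ret, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    return binary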
 
#---------- main program ----------
if __name__ == "__main__":
    #----- variables -----
    list_rank = []  # ranking information
    list_pic_url = []  # cover image urls

    #----- scraping (text) -----

    # request headers
    headers = {
        'User-Agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36 SLBrowser/7.0.0.2261 SLBChan/10'
    }
    # target url
    url = 'https://www.bilibili.com/v/popular/rank/all'
    # fetch the page source
    page_text = requests.get(url=url,headers=headers).content.decode('utf-8')
    # build an etree object from the page source
    tree = etree.HTML(page_text)
    # select every li tag in the ranking list
    li_list = tree.xpath('//ul[@class="rank-list"]/li')

    #----- scraping (images) -----

    # the page source also embeds "others" lists (recommended videos);
    # their entries would pollute the image match, so locate them first
    others_ex = r'"others".*?"tid"(.*?)]'
    list_others = re.findall(others_ex, page_text, re.S)
    # remove the others content from the page source
    for l in list_others:
        page_text = page_text.replace(l, '')
    pic_ex = r'"copyright":.*?,"pic":"(.*?)","title":".*?"'
    list_pic = re.findall(pic_ex, page_text, re.S)
    # take the file name after the last escaped slash (\u002F) of each
    # matched url and build the full image url from it
    for i in list_pic:
        index = i.rfind('u002F')
        pic_url = 'http://i1.hdslb.com/bfs/archive/' + i[index + 5:] + '@228w_140h_1c.webp'
        list_pic_url.append(pic_url)
 
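    #----- write the results -----
    # open the output file
    with open('./bZhanRank2.txt', 'w', encoding='utf-8') as fp:
        # write the header with the scrape time
        fp.write('Bilibili popular ranking, ' + 'time: ' + str(datetime.datetime.now()) + '\n')
        # signature line
        fp.write('  ：MB\n')
        fp.write('*'*10 + '        ' + '*'*10 + '\n\n')

        # iterate over the li tags and extract each video's details
        for i in range(len(li_list)):
            # rank
            li_rank = li_list[i].xpath('.//div[@class="num"]/text()')
            pic_title = li_rank[0]  # the rank doubles as the image file name
            # xpath returns a list, so [0] takes its first element
            li_rank = 'Rank: ' + li_rank[0] + '\n'
            # title
            li_title = li_list[i].xpath('.//a/text()')
            li_title = 'Title: ' + li_title[0] + '\n'
            # view count
            li_viewCount = li_list[i].xpath('.//div[@class="detail"]/span[1]/text()')
            # .strip() removes the surrounding whitespace
            li_viewCount = 'Views: ' + li_viewCount[0].strip() + '\n'
            # danmaku (bullet comment) count
            li_barrageCount = li_list[i].xpath('.//div[@class="detail"]/span[2]/text()')
            li_barrageCount = 'Danmaku: ' + li_barrageCount[0].strip() + '\n'
            # uploader name
            li_upName = li_list[i].xpath('.//span[@class="data-box up-name"]//text()')
            li_upName = 'Uploader: ' + li_upName[0].strip() + '\n'
            # overall score
            li_score = li_list[i].xpath('.//div[@class="pts"]/div/text()')
            li_score = 'Score: ' + li_score[0] + '\n\n'
            # write the record (text part)
            fp.write(li_rank + li_title + li_viewCount + li_barrageCount + li_upName + li_score)

            # download the cover from its url and save it as a jpeg
            dJpg(list_pic_url[i], str(pic_title))
            # binarize the saved jpeg
            img = handle_image(str(pic_title) + '.jpg')

            # shrink the image (width 120, height 40) so it fits beside the text
            img = cv2.resize(img, (120, 40))
            height, width = img.shape
            for row in range(0, height):
                for col in range(0, width):
                    # a pixel value of 0 is black: write '1' to the txt file
                    if img[row][col] == 0:
                        ch = '1'
                        fp.write(ch)
                    # any other pixel is white: write a space
                    else:
                        fp.write(' ')
                fp.write('*\n')
            fp.write('\n\n\n')
            print(li_rank.strip() + ' scraped successfully!')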

Before viewing the results in Notepad, change Notepad's format settings as follows to get a better visual effect.

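The scraping results are as follows. (Each cover image is binarized first. The script then iterates over every pixel, writing '1' for pixels whose value is 0 (black) and a space for pixels whose value is 255 (white), which gives a simple text rendering of the image.)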
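The pixel-to-character step can be sketched in isolation. The version below is an illustration, not the author's code: `render_binary` and its parameters are hypothetical names, and a small hand-written grid of 0/255 values stands in for the array that `cv2.threshold` would return.

```python
def render_binary(pixels, dark='1', light=' ', row_end='*'):
    """Render rows of 0/255 pixel values as text, one output line per row."""
    lines = []
    for row in pixels:
        # 0 is black and becomes the dark character; anything else is white
        chars = [dark if value == 0 else light for value in row]
        lines.append(''.join(chars) + row_end)
    return '\n'.join(lines)


# A 3x4 stand-in for a binarized image: a small ring of black pixels.
sample = [
    [255, 0, 0, 255],
    [0, 255, 255, 0],
    [255, 0, 0, 255],
]
print(render_binary(sample))
```

Building each row as one string and writing it once also avoids calling `fp.write` for every single pixel, which matters once the image is made larger.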

Note that the images above and below the horizontal line were not scraped at the same time.

To balance the text display against the image display, the image above was forced to a small size, so it is not very sharp. To show the image clearly, ignore the effect on the text, set a larger image size, and reduce Notepad's font size (to prevent line wrapping). The image can then be displayed clearly, as in the following figure.

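4. Analysis process
(1) URL: get the address of the Bilibili video ranking page.

(2) Request headers: right-click the page and choose Inspect to open the developer tools, click Network, select any request, and copy its request headers.

(3) Page analysis: click the element picker in the top-left corner of the developer tools and select a video on the page. Each video turns out to be stored in its own li tag.

(4) Page analysis: selecting a video's title shows that the title is the text content of an a tag. The remaining video information is located the same way.

(5) Page analysis: the view count text inside the span tag is surrounded by whitespace, so the code removes it with the strip() method.

(6) Debugging: when debugging the code, the list of scraped image urls comes back empty.

(7) Troubleshooting: checking the position of the tag that stores the image url shows that the position is correct.

(8) Troubleshooting: since the information is empty, the page may be loading it asynchronously with JavaScript to reduce the loading cost. Clicking XHR in the developer tools and looking for a request that carries the image urls finds no such request.

(9) Troubleshooting: right-click the page and view the page source. Searching the source for an image url shows that all image urls are stored at the very end of the page source, so they can be extracted with a regular expression.

(10) Troubleshooting: the regular expression also matches the "others" lists. These lists hold the recommended videos shown under some entries and must be removed first; otherwise they interfere with the regular-expression parsing.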
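Steps (9) and (10) can be sketched on a small stand-in for the page source. Everything in the sample string is invented for illustration (the file names `aaa.jpg`, `bbb.jpg`, `zzz.jpg` and the simplified field layout), and the patterns here are simpler than the ones used in the script; `extract_cover_urls` is a hypothetical name. The sketch also assumes that an "others" list never contains a nested `]`.

```python
import re


def extract_cover_urls(page_text):
    """Drop the "others" lists, then collect and unescape the "pic" urls."""
    # Step (10): empty every "others" list so recommended videos are ignored.
    cleaned = re.sub(r'"others":\[.*?\]', '"others":[]', page_text, flags=re.S)
    # Step (9): pull the remaining "pic" values out of the source text.
    escaped = re.findall(r'"pic":"(.*?)"', cleaned)
    # The source stores '/' as the six characters \u002F; restore the slashes.
    return [url.replace('\\u002F', '/') for url in escaped]


# Invented sample: two ranked videos, the first with one recommended video.
sample = (
    r'{"pic":"http:\u002F\u002Fi1.hdslb.com\u002Fbfs\u002Farchive\u002Faaa.jpg",'
    r'"title":"A","others":[{"pic":"http:\u002F\u002Fi1.hdslb.com\u002Fbfs'
    r'\u002Farchive\u002Fzzz.jpg","title":"rec"}]},'
    r'{"pic":"http:\u002F\u002Fi1.hdslb.com\u002Fbfs\u002Farchive\u002Fbbb.jpg",'
    r'"title":"B","others":[]}'
)

for url in extract_cover_urls(sample):
    print(url)
```

Without the `re.sub` step, the `zzz.jpg` entry from the recommended video would be collected too, and the image list would no longer line up one-to-one with the ranked li tags.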

This concludes the walkthrough of scraping the Bilibili popular video ranking with a Python crawler.
