Using find and find_all in Beautiful Soup


A web-scraping staple: how to use find and find_all in BeautifulSoup
Without further ado, here is the sample HTML first:


<html>
  <head>
    <title>
      index
    </title>
  </head>
  <body>
     <div>
        <ul>
           <li id="flask" class="item-0"><a href="link1.html" rel="external nofollow">first item</a></li>
           <li class="item-1"><a href="link2.html" rel="external nofollow">second item</a></li>
           <li class="item-inactie"><a href="link3.html" rel="external nofollow">third item</a></li>
           <li class="item-1"><a href="link4.html" rel="external nofollow">fourth item</a></li>
           <li class="item-0"><a href="link5.html" rel="external nofollow">fifth item</a></li>
         </ul>
     </div>
    <li> hello world </li>
  </body>
</html>

Before using BeautifulSoup, you need to create a BeautifulSoup instance.


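# Create a BeautifulSoup instance
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'lxml')
# Arguments:
# the first is the content for BeautifulSoup to parse, the second is the parser to use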

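Note that every module you import must be installed beforehand. The lxml parser used here is a separate package and has to be installed before this code will run. The parsers that BeautifulSoup supports are listed in its documentation.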
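Since lxml may not be installed everywhere, one option is to fall back to Python's built-in parser. The sketch below is only an illustration: `make_soup` is a hypothetical helper name, not part of the original article or of BeautifulSoup. It relies on `FeatureNotFound`, the exception BeautifulSoup raises when the requested parser is unavailable.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred="lxml"):
    """Parse markup with the preferred parser, falling back to html.parser.

    make_soup is a hypothetical helper used only for this illustration.
    """
    try:
        return BeautifulSoup(markup, preferred)
    except FeatureNotFound:
        # Raised when the requested parser (for example lxml) is not installed.
        return BeautifulSoup(markup, "html.parser")


soup = make_soup("<ul><li id='flask'>first item</li></ul>")
print(soup.find("li").text)
```

Different parsers can build slightly different trees from malformed HTML, so a fallback like this is best kept for markup that is reasonably well formed.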

Next, an introduction to find and find_all.
1. find
Returns only the first matching object.
Syntax:


find(name, attrs, recursive, text, **kwargs)
# recursive: search all descendants (True, the default) or only direct children (False)

Parameters:
Parameter    Purpose
name         tag name to search for
text         text content to search for
attrs        attributes to match, passed as a dict
Example:


# Basic use of find
li = soup.find('li')
print('find_li:', li)
print('li.text (content of the tag):', li.text)
print('li.attrs (attributes of the tag):', li.attrs)
print('li.string (content of the tag as a string):', li.string)

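Output:
find_li: <li class="item-0" id="flask"><a href="link1.html" rel="external nofollow">first item</a></li>
li.text (content of the tag): first item
li.attrs (attributes of the tag): {'id': 'flask', 'class': ['item-0']}
li.string (content of the tag as a string): first item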
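A few details of find are easy to miss: it returns None when nothing matches, recursive=False limits the search to direct children, and text content can be matched too. The sketch below uses a trimmed copy of the sample HTML and the built-in html.parser (an assumption made so it runs without lxml). It also uses the `string=` keyword, which is the newer name for the `text` argument in recent Beautiful Soup 4 releases.

```python
from bs4 import BeautifulSoup

html = """
<body>
  <div><ul><li id="flask" class="item-0"><a href="link1.html">first item</a></li></ul></div>
  <li> hello world </li>
</body>
"""
soup = BeautifulSoup(html, "html.parser")

# find returns None when nothing matches, so check before using the result.
missing = soup.find("table")
print(missing)

# recursive=False restricts the search to direct children of the tag.
body = soup.find("body")
direct_li = body.find("li", recursive=False)
print(direct_li.text.strip())

# string= matches on a tag's text content (older releases call it text=).
a = soup.find("a", string="first item")
print(a["href"])
```

Because a failed find gives None, chaining something like `soup.find("table").text` raises an AttributeError, so a None check is a sensible habit.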
find can also match using the 'attribute=value' form.


li = soup.find(id='flask')
print(li, '\n')


<li class="item-0" id="flask"><a href="link1.html" rel="external nofollow">first item</a></li>
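The value in an 'attribute=value' filter does not have to be a plain string. As a small sketch (again on a trimmed copy of the sample HTML, parsed with html.parser as an assumption), the filter can be True to require that the attribute exists, or a compiled regular expression to match part of the value:

```python
import re

from bs4 import BeautifulSoup

html = """
<ul>
  <li id="flask" class="item-0"><a href="link1.html">first item</a></li>
  <li class="item-1"><a href="link2.html">second item</a></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# id=True matches the first tag that has an id attribute at all.
with_id = soup.find(id=True)
print(with_id["id"])

# A compiled regular expression is searched for in the attribute value.
link = soup.find("a", href=re.compile(r"^link2"))
print(link.text)

# A tag name and an attribute filter can be combined.
item = soup.find("li", id="flask")
print(item.a["href"])
```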

Note that class is a reserved keyword in Python, so matching on a tag's class attribute needs a special approach. There are two ways:

Pass the attribute as a dict through the attrs parameter

Use BeautifulSoup's own special keyword class_


# Method 1: pass the attribute as a dict through attrs
find_class = soup.find(attrs={'class': 'item-1'})
print('findclass:', find_class, '\n')
# Method 2: use BeautifulSoup's special keyword class_
beautifulsoup_class_ = soup.find(class_='item-1')
print('BeautifulSoup_class_:', beautifulsoup_class_, '\n')

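Output:
findclass: <li class="item-1"><a href="link2.html" rel="external nofollow">second item</a></li>

BeautifulSoup_class_: <li class="item-1"><a href="link2.html" rel="external nofollow">second item</a></li>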
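One thing worth knowing about class matching: class is a multi-valued attribute, which is why li.attrs shows it as a list. A class filter matches when any one of a tag's classes equals the value. The markup below is a made-up example, not taken from the sample HTML, and html.parser is used so the sketch runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with a tag that carries two classes.
html = '<p class="note warning">careful</p><p class="note">fine</p>'
soup = BeautifulSoup(html, "html.parser")

# class_ matches if any single class on the tag equals the value.
first = soup.find(class_="warning")
print(first.text)
print(first["class"])

# attrs={'class': ...} behaves the same way and finds the same tag.
same = soup.find(attrs={"class": "warning"})
print(same is first)
```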

2. find_all
Returns all matching results, unlike find, which returns only the first match.
Syntax:


find_all(name, attrs, recursive, text, limit, **kwargs)

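Parameter    Purpose
name         tag name to search for
text         text content to search for
attrs        attributes to match, passed as a dict
The syntax is the same as for find.
Here is the code: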


# Basic use of find_all
li_all = soup.find_all('li')
for li in li_all:
    print('---')
    print('Matched li:', li)
    print('li content:', li.text)
    print('li attributes:', li.attrs)

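Output:
---
Matched li: <li class="item-0" id="flask"><a href="link1.html" rel="external nofollow">first item</a></li>
li content: first item
li attributes: {'id': 'flask', 'class': ['item-0']}
---
Matched li: <li class="item-1"><a href="link2.html" rel="external nofollow">second item</a></li>
li content: second item
li attributes: {'class': ['item-1']}
---
Matched li: <li class="item-inactie"><a href="link3.html" rel="external nofollow">third item</a></li>
li content: third item
li attributes: {'class': ['item-inactie']}
---
Matched li: <li class="item-1"><a href="link4.html" rel="external nofollow">fourth item</a></li>
li content: fourth item
li attributes: {'class': ['item-1']}
---
Matched li: <li class="item-0"><a href="link5.html" rel="external nofollow">fifth item</a></li>
li content: fifth item
li attributes: {'class': ['item-0']}
---
Matched li: <li> hello world </li>
li content:  hello world 
li attributes: {}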
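The limit parameter from the find_all signature is not shown in the article's examples, so here is a short sketch of it. It works on a trimmed copy of the sample HTML with html.parser (an assumption, to avoid needing lxml), and also shows that name can be a list and that an unsuccessful search returns an empty list.

```python
from bs4 import BeautifulSoup

html = """
<ul>
  <li class="item-0"><a href="link1.html">first item</a></li>
  <li class="item-1"><a href="link2.html">second item</a></li>
  <li class="item-1"><a href="link4.html">fourth item</a></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# limit stops the search after the given number of matches.
first_two = soup.find_all("li", limit=2)
print([li.text for li in first_two])

# A list of names matches tags with any of those names.
mixed = soup.find_all(["ul", "a"])
print(len(mixed))

# No match gives an empty list rather than None.
nothing = soup.find_all("table")
print(nothing)
```

Since an empty result is just an empty list, looping over the return value of find_all is safe even when nothing matched, which is not true of the single result from find.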
A more flexible way to search with find_all:


# The most flexible way to use find_all
li_quick = soup.find_all(attrs={'class': 'item-1'})
for li in li_quick:
    print('Most flexible search:', li)

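Output:
Most flexible search: <li class="item-1"><a href="link2.html" rel="external nofollow">second item</a></li>
Most flexible search: <li class="item-1"><a href="link4.html" rel="external nofollow">fourth item</a></li>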
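For comparison, the same class search can be written in a few other ways. This sketch is an addition to the article: it puts the attrs form next to the class_ keyword, the CSS selector method select, and a function filter. It uses a trimmed copy of the sample HTML and html.parser as an assumption.

```python
from bs4 import BeautifulSoup

html = """
<ul>
  <li id="flask" class="item-0"><a href="link1.html">first item</a></li>
  <li class="item-1"><a href="link2.html">second item</a></li>
  <li class="item-inactie"><a href="link3.html">third item</a></li>
  <li class="item-1"><a href="link4.html">fourth item</a></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

by_attrs = soup.find_all(attrs={"class": "item-1"})
by_keyword = soup.find_all("li", class_="item-1")
by_css = soup.select("li.item-1")  # CSS selector form

hrefs = [li.a["href"] for li in by_attrs]
print(hrefs)

# A function filter receives each tag and returns True for the ones to keep.
no_id = soup.find_all(lambda tag: tag.name == "li" and not tag.has_attr("id"))
print(len(no_id))
```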

Full code:


# coding=utf8
# @Author= CaiJunxuan
# @QQ=469590490
# @Wechat:15916454524

# beautifulsoup

# Import BeautifulSoup
from bs4 import BeautifulSoup

# Sample HTML
html = '''
<html>
  <head>
    <title>
      index
    </title>
  </head>
  <body>
     <div>
        <ul>
           <li id="flask" class="item-0"><a href="link1.html" rel="external nofollow">first item</a></li>
           <li class="item-1"><a href="link2.html" rel="external nofollow">second item</a></li>
           <li class="item-inactie"><a href="link3.html" rel="external nofollow">third item</a></li>
           <li class="item-1"><a href="link4.html" rel="external nofollow">fourth item</a></li>
           <li class="item-0"><a href="link5.html" rel="external nofollow">fifth item</a></li>
         </ul>
     </div>
    <li> hello world </li>
  </body>
</html>
'''

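# Create a BeautifulSoup instance
soup = BeautifulSoup(html, 'lxml')
# Arguments:
# the first is the content for BeautifulSoup to parse, the second is the parser
# html.parser: Python's built-in parser, usable when lxml is not installed
# lxml: parse with the lxml library
# html5lib: the most lenient parser, parses the way HTML5 does
# Which one to use is up to you; note that lxml requires installing the lxml package

# Basic use of bs4, starting with simple examples: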

# Basic use of find
li = soup.find('li')
print('find_li:', li)
print('li.text (content of the tag):', li.text)
print('li.attrs (attributes of the tag):', li.attrs)
print('li.string (content of the tag as a string):', li.string)
print(50 * '*', '\n')

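# find can also select with the 'attribute=value' form
li = soup.find(id='flask')
print(li, '\n')
# class is a reserved keyword in Python, so matching a tag's class attribute needs a special approach
# There are two ways to match on class
# Method 1: pass the attribute as a dict through attrs
find_class = soup.find(attrs={'class': 'item-1'})
print('findclass:', find_class, '\n')
# Method 2: use BeautifulSoup's special keyword class_
beautifulsoup_class_ = soup.find(class_='item-1')
print('BeautifulSoup_class_:', beautifulsoup_class_, '\n')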

# Basic use of find_all
li_all = soup.find_all('li')
for li in li_all:
    print('---')
    print('Matched li:', li)
    print('li content:', li.text)
    print('li attributes:', li.attrs)

# The most flexible way to use find_all
li_quick = soup.find_all(attrs={'class': 'item-1'})
for li in li_quick:
    print('Most flexible search:', li)
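As a closing illustration of how the two methods fit together, the sketch below collects the text and href of every link inside an li. `extract_links` is a hypothetical helper, not something from the original code. It uses html.parser so it runs without lxml, and it skips any li that has no link, such as the `<li> hello world </li>` in the sample.

```python
from bs4 import BeautifulSoup

html = """
<div>
  <ul>
    <li id="flask" class="item-0"><a href="link1.html">first item</a></li>
    <li class="item-1"><a href="link2.html">second item</a></li>
  </ul>
</div>
<li> hello world </li>
"""


def extract_links(markup):
    """Return (text, href) pairs for the first link in each li (hypothetical helper)."""
    soup = BeautifulSoup(markup, "html.parser")
    links = []
    for li in soup.find_all("li"):   # every li in the document
        a = li.find("a")             # first link inside it, or None
        if a is None:
            continue                 # for example <li> hello world </li>
        links.append((a.get_text(strip=True), a.get("href")))
    return links


links = extract_links(html)
print(links)
```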

That concludes this walkthrough of find and find_all in Beautiful Soup.


This text may be freely shared or copied, provided the URL of this document is kept as the reference URL.

Licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
