Download PDF Files Using Python

This article explains how to download a PDF file using Python's requests library.

Approach

Import the requests library.

Request the URL and get a response object.

Use the response object to save the PDF file, then return True.

If the PDF cannot be downloaded, return False.

Implementation

The following program downloads a PDF file from the provided URL and saves it in the current working directory.

#!/usr/bin/env python3
import os
import requests


def download_pdf_file(url: str) -> bool:
    """Download PDF from given URL to local directory.

    :param url: The url of the PDF file to be downloaded
    :return: True if PDF file was successfully downloaded, otherwise False.
    """

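    # Request URL and get response object; the timeout keeps the call from hanging
    try:
        response = requests.get(url, timeout=30)
    except requests.exceptions.RequestException as error:
        # Connection errors and timeouts also count as a failed download
        print(f'Uh oh! Could not reach {url}: {error}')
        return False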

    # isolate PDF filename from URL
    pdf_file_name = os.path.basename(url)
    if response.status_code == 200:
        # Save in current working directory
        filepath = os.path.join(os.getcwd(), pdf_file_name)
        with open(filepath, 'wb') as pdf_object:
            pdf_object.write(response.content)
            print(f'{pdf_file_name} was successfully saved!')
            return True
    else:
        print(f'Uh oh! Could not download {pdf_file_name},')
        print(f'HTTP response status code: {response.status_code}')
        return False


if __name__ == '__main__':
    # URL from which pdfs to be downloaded
    URL = 'https://raw.githubusercontent.com/seraph776/DevCommunity/main/PDFDownloader/assests/the_raven.pdf'
    download_pdf_file(URL)
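
For large PDFs, reading the whole body through response.content keeps the entire file in memory. A possible variation, sketched here under assumptions, writes the file chunk by chunk and then checks that the result starts with the PDF signature `%PDF-`. The helper names `save_in_chunks` and `looks_like_pdf` are hypothetical and not part of the original program.

```python
from typing import Iterable


def save_in_chunks(chunks: Iterable[bytes], filepath: str) -> int:
    """Write an iterable of byte chunks to filepath and return the bytes written."""
    total = 0
    with open(filepath, 'wb') as pdf_object:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                pdf_object.write(chunk)
                total += len(chunk)
    return total


def looks_like_pdf(filepath: str) -> bool:
    """Return True if the file begins with the PDF signature '%PDF-'."""
    with open(filepath, 'rb') as pdf_object:
        return pdf_object.read(5) == b'%PDF-'
```

With requests, the chunks would typically come from a streamed response, for example `save_in_chunks(response.iter_content(chunk_size=8192), filepath)` after calling `requests.get(url, stream=True, timeout=30)`. The signature check helps catch cases where a server answers with status 200 but sends an HTML error page instead of a PDF.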

Output

the_raven.pdf was successfully saved!
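
The filename in this output comes from calling os.path.basename on the URL. That works for the example URL, but a URL with a query string or percent-encoded characters would produce an awkward filename. The sketch below shows one way to handle such URLs using only the standard library. The function name `filename_from_url` and the fallback name `download.pdf` are assumptions made for illustration.

```python
import os
from urllib.parse import unquote, urlparse


def filename_from_url(url: str, default: str = 'download.pdf') -> str:
    """Derive a local filename from the path component of a URL."""
    # urlparse separates the path from any ?query or #fragment part
    path = urlparse(url).path
    # Decode percent-escapes such as %20, then keep only the last path segment
    name = os.path.basename(unquote(path))
    # Fall back to a default when the URL has no usable last segment
    return name or default
```

For the URL used in this article the result is still `the_raven.pdf`, so the helper could replace the os.path.basename(url) call without changing the output shown above.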

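Conclusion

After reading this article, you should be able to download a PDF using Python's requests library. Some websites are harder to retrieve data from than others. If a PDF file cannot be downloaded, check the HTTP response status code to see what went wrong. If you found this article helpful, please leave a comment.

Code available on GitHub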
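
As a rough aid for reading status codes, the following sketch maps a code to its standard reason phrase and a short hint. The function name `explain_status` is hypothetical, and the hints are general rules of thumb rather than a diagnosis for any particular website.

```python
from http import HTTPStatus


def explain_status(status_code: int) -> str:
    """Return the reason phrase for an HTTP status code plus a short hint."""
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        # The code is not defined in the standard library's HTTPStatus enum
        phrase = 'Unknown status'

    if 200 <= status_code < 300:
        hint = 'the request succeeded'
    elif 300 <= status_code < 400:
        hint = 'the resource has moved; check the redirect target'
    elif status_code in (401, 403):
        hint = 'access was refused; credentials or extra headers may be required'
    elif status_code == 404:
        hint = 'the URL may be wrong or the file may have been removed'
    elif status_code == 429:
        hint = 'too many requests; wait before retrying'
    elif 400 <= status_code < 500:
        hint = 'the request was rejected; check the URL and parameters'
    elif 500 <= status_code < 600:
        hint = 'the server failed; retrying later may help'
    else:
        hint = 'not a standard HTTP status code'
    return f'{status_code} {phrase}: {hint}'
```

It could be called with the value the program already prints, for example `print(explain_status(response.status_code))`.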

Reference

The original article on this topic (Download PDF Files Using Python) is available at https://dev.to/seraph776/download-pdf-files-using-python-4064

You may freely share or copy this text, but please keep the URL of this article as the reference URL.
