Using Google Reverse Images API from SerpApi


What is Google Reverse Images

Simple Hello World

Detailed Code

Prerequisites

Code Explanation

Why using API?

Links

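What is Google Reverse Images

In short, it helps you quickly find visually similar images from around the web. You can upload a photo from desktop or mobile, or paste the URL of a photo, and almost instantly it shows related images used on other websites as well as different sizes of the same photo.

For example, you can take a photo of a Minecraft pillow (or paste its URL) and use it to search for information or other similar images.

The results may include:

Search results for objects in the image.

Similar images.

Websites with the image or a similar image.

This blog post shows how to use SerpApi to parse data from reverse image results. We do this without browser automation, which makes it much faster.

Example of the results Google returns for a given image:

Typical reverse image process:

Simple Hello World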

from serpapi import GoogleSearch
import os, json

image_url = "https://user-images.githubusercontent.com/81998012/182214192-59dfb3fe-522c-4979-bb42-9f8091dfd9d6.jpg"

params = {
    # https://docs.python.org/3/library/os.html#os.getenv
    "api_key": os.getenv("API_KEY"),    # your SerpApi API key
    "engine": "google_reverse_image",   # SerpApi search engine
    "image_url": image_url,             # image URL to perform a reverse search
    "hl": "en",                         # language of the search
    "gl": "us"                          # country of the search
    # other parameters
}

search = GoogleSearch(params)           # where data extraction happens on the SerpApi backend
results = search.get_dict()             # JSON -> Python dictionary

# ["image_results"] is essentially Google organic results
print(json.dumps(results["image_results"], indent=4, ensure_ascii=False))

Detailed Code

from serpapi import GoogleSearch
import os, json

image_urls = [
    "https://user-images.githubusercontent.com/81998012/182214192-59dfb3fe-522c-4979-bb42-9f8091dfd9d6.jpg",
    "https://user-images.githubusercontent.com/81998012/182025185-27df7683-24d5-4747-904b-9f3a6045705b.jpg",
    "https://user-images.githubusercontent.com/81998012/182025195-fec95c5c-aee1-448b-9165-ce9dc1b77a56.jpg",
    "https://user-images.githubusercontent.com/81998012/182027073-4b09a0b7-ec55-415f-bcb0-7a457e87c0b4.jpg",
    "https://user-images.githubusercontent.com/81998012/182025215-ce739965-5c4f-4735-8581-566e03b609f2.jpg",    
]


def main():
    google_reverse_image_data = {}

    for index, image_url in enumerate(image_urls, start=1):
        google_reverse_image_data[f"results for image {index}"] = {}

        params = {
            # https://docs.python.org/3/library/os.html#os.getenv
            "api_key": os.getenv("API_KEY"),    # your SerpApi API key
            "engine": "google_reverse_image",   # SerpApi search engine
            "image_url": image_url,             # image URL to perform a reverse search
            "location": "Dallas",               # location from where search comes from
            "hl": "en",                         # language of the search
            "gl": "us"                          # country of the search
            # other parameters
        }

        search = GoogleSearch(params)           # where data extraction happens on the SerpApi backend
        results = search.get_dict()             # JSON -> Python dictionary

        # some queries may not include this information
        if results.get("knowledge_graph"):      # .get() avoids a KeyError when the key is absent
            knowledge_graph = {}

            knowledge_graph["title"] = results["knowledge_graph"]["title"]
            knowledge_graph["description"] = results["knowledge_graph"]["description"]

            google_reverse_image_data[f"results for image {index}"]["knowledge_graph"] = knowledge_graph

        # some queries may not include organic results
        if results.get("image_results"):
            google_reverse_image_data[f"results for image {index}"]["organic_results"] = []

            for result in results["image_results"]:
                image_results = {}

                image_results["position"] = result["position"]
                image_results["title"] = result["title"]
                image_results["link"] = result["link"]
                image_results["snippet"] = result["snippet"]

                google_reverse_image_data[f"results for image {index}"]["organic_results"].append(image_results)

        # some queries may not include this information
        if results.get("inline_images"):
            google_reverse_image_data[f"results for image {index}"]["inline_images"] = []

            for result in results["inline_images"]:
                google_reverse_image_data[f"results for image {index}"]["inline_images"].append({
                    "source": result["source"],
                    "thumbnail": result["thumbnail"]
                })

    return google_reverse_image_data


if __name__ == "__main__":
    print(json.dumps(main(), indent=4, ensure_ascii=False))

Prerequisites

pip install google-search-results

Code Explanation

Import libraries:

from serpapi import GoogleSearch
import os, json

Library        Purpose
os             Returns the value of an environment variable (here, the SerpApi API key).
json           Converts the extracted data to a JSON string.
GoogleSearch   Scrapes and parses Google results using the SerpApi web scraping library.
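
As a small illustration of what os and json do here, the sketch below reads an environment variable and serializes a dictionary. The variable names EXAMPLE_API_KEY and EXAMPLE_NOT_SET and the sample data are made up for this example and are not part of the original code.

```python
import json
import os

# Hypothetical variable, set here only so the example runs on its own.
os.environ["EXAMPLE_API_KEY"] = "secret-key"
os.environ.pop("EXAMPLE_NOT_SET", None)    # make sure this one is undefined

api_key = os.getenv("EXAMPLE_API_KEY")     # value of the variable
missing = os.getenv("EXAMPLE_NOT_SET")     # None, because it is not defined

data = {"title": "Stairs", "snippet": "Unsplash ahora está disponible"}

# ensure_ascii=False keeps non-ASCII characters readable;
# the default (True) escapes them as \uXXXX sequences.
pretty = json.dumps(data, indent=4, ensure_ascii=False)
escaped = json.dumps(data)

print(api_key, missing)
print(pretty)
print(escaped)
```

Because os.getenv() returns None for an undefined variable, it is worth checking the key before sending a request.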

Next, we need a list of image URLs to search with (it can be any iterable):

image_urls = [
    "https://user-images.githubusercontent.com/81998012/182214192-59dfb3fe-522c-4979-bb42-9f8091dfd9d6.jpg",
    "https://user-images.githubusercontent.com/81998012/182025185-27df7683-24d5-4747-904b-9f3a6045705b.jpg",
    "https://user-images.githubusercontent.com/81998012/182025195-fec95c5c-aee1-448b-9165-ce9dc1b77a56.jpg",
    "https://user-images.githubusercontent.com/81998012/182027073-4b09a0b7-ec55-415f-bcb0-7a457e87c0b4.jpg",
    "https://user-images.githubusercontent.com/81998012/182025215-ce739965-5c4f-4735-8581-566e03b609f2.jpg",    
]

Next, we create a main function (optional) and a temporary dict to store the extracted data:

def main():
    google_reverse_image_data = {}

In the next step, we iterate over all image_urls and pass each value to the "image_url" key of params:

for index, image_url in enumerate(image_urls, start=1):
    google_reverse_image_data[f"results for image {index}"] = {}

    params = {
        "engine": "google_reverse_image",   # SerpApi search engine
        "image_url": image_url,             # image URL to perform a reverse search
        "location": "Dallas",               # location from where search comes from
        "hl": "en",                         # language of the search
        "gl": "us",                         # country of the search
        # https://docs.python.org/3/library/os.html#os.getenv
        "api_key": os.getenv("API_KEY"),    # your SerpApi API key
    }

    search = GoogleSearch(params)           # where data extraction happens on the SerpApi backend
    results = search.get_dict()

Code          Explanation
enumerate()   Adds a counter to an iterable and returns it. Here it is used to show more explicitly which results belong to which image.
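
To make the role of enumerate() concrete, here is a minimal sketch that builds the same kind of "results for image N" keys. The two URLs are placeholders chosen for the example, not the images used in this post.

```python
# Placeholder URLs, used only for illustration.
image_urls = [
    "https://example.com/a.jpg",
    "https://example.com/b.jpg",
]

data = {}

# start=1 makes the counter begin at 1 instead of the default 0
for index, image_url in enumerate(image_urls, start=1):
    data[f"results for image {index}"] = {"image_url": image_url}

print(list(data))
```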

Now we need to check, with if statements, whether the specific data we want was returned. In this case we check only knowledge_graph, organic_results (image_results), and inline_images:

if results.get("knowledge_graph"):      # .get() avoids a KeyError when the key is absent
    knowledge_graph = {}

    knowledge_graph["title"] = results["knowledge_graph"]["title"]
    knowledge_graph["description"] = results["knowledge_graph"]["description"]

    google_reverse_image_data[f"results for image {index}"]["knowledge_graph"] = knowledge_graph

if results.get("image_results"):
    google_reverse_image_data[f"results for image {index}"]["organic_results"] = []

    for result in results["image_results"]:
        image_results = {}

        image_results["position"] = result["position"]
        image_results["title"] = result["title"]
        image_results["link"] = result["link"]
        image_results["snippet"] = result["snippet"]

        google_reverse_image_data[f"results for image {index}"]["organic_results"].append(image_results)

if results.get("inline_images"):
    google_reverse_image_data[f"results for image {index}"]["inline_images"] = []

    for result in results["inline_images"]:
        google_reverse_image_data[f"results for image {index}"]["inline_images"].append({
            "source": result["source"],
            "thumbnail": result["thumbnail"]
        })
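
One detail about checks like these: indexing a dict with a key that is not present raises KeyError, while dict.get() returns None (or a default you supply). The sketch below shows the difference on a hand-written, trimmed response. The results dictionary is hypothetical sample data, not real API output.

```python
# Hypothetical, trimmed response that has no "knowledge_graph" key
# and whose organic result has no "snippet".
results = {
    "image_results": [
        {
            "position": 1,
            "title": "Stairs - Wikipedia",
            "link": "https://en.wikipedia.org/wiki/Stairs",
        }
    ],
}

extracted = {}

# .get() returns None for a missing key, so the block is simply skipped
knowledge_graph = results.get("knowledge_graph")
if knowledge_graph:
    extracted["knowledge_graph"] = {
        "title": knowledge_graph.get("title"),
        "description": knowledge_graph.get("description"),
    }

# a default of [] lets the loop run zero times when the key is missing
organic_results = []
for result in results.get("image_results", []):
    organic_results.append({
        "position": result.get("position"),
        "title": result.get("title"),
        "link": result.get("link"),
        "snippet": result.get("snippet"),   # None when the field is absent
    })
if organic_results:
    extracted["organic_results"] = organic_results

# plain indexing on the same missing key raises KeyError
try:
    results["knowledge_graph"]
    raised_key_error = False
except KeyError:
    raised_key_error = True

print(extracted)
print(raised_key_error)
```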

Now we need to return the data:

return google_reverse_image_data

The last step is to add the Python idiom that runs main() only when the file is executed as a script, and to print the data:

if __name__ == "__main__":
    print(json.dumps(main(), indent=4, ensure_ascii=False))

Output:

{
    "results for image 1": {
        "knowledge_graph": {
            "title": "Stairs",
            "description": "Stairs are a structure designed to bridge a large vertical distance by dividing it into smaller vertical distances, called steps. Stairs may be straight, round, or may consist of two or more straight pieces connected at angles. Types of stairs include staircases, ladders, and escalators."
        },
        "organic_results": [
            {
                "position": 1,
                "title": "Stairs - Wikipedia",
                "link": "https://en.wikipedia.org/wiki/Stairs",
                "snippet": "Stairs are a structure designed to bridge a large vertical distance by dividing it into smaller vertical distances, called steps. Stairs may be straight, ..."
            }, ... other organic results
            {
                "position": 4,
                "title": "Foto HD de Claudio Schwarz - Unsplash",
                "link": "https://unsplash.com/es/fotos/ipcsI15th5I",
                "snippet": "Nuevo: Unsplash ahora está disponible en varios idiomas. Puedes volver a cambiar al inglés cuando quieras. Próximamente habrá más idiomas."
            }
        ],
        "inline_images": [
            {
                "source": "https://www.flickr.com/photos/thepiratesgospel/6107309586/",
                "thumbnail": "https://serpapi.com/searches/62e907cb5b54ef5af08d6ff2/images/6886d2b2c5499da05d656e39563b001cc2a1b485150c7a5761ed1190edbccb0f.jpeg"
            }, ... other thumbnails
            {
                "source": "https://en-gb.facebook.com/qualitycarpetsdirect/posts/",
                "thumbnail": "https://serpapi.com/searches/62e907cb5b54ef5af08d6ff2/images/6886d2b2c5499da08e88e75176317a1c08b36d2c2800f2329de12d162eab24e9.jpeg"
            }
        ]
    }, ... other results
    "results for image 5": {
        "knowledge_graph": {
            "title": "Art",
            "description": "Art is a diverse range of human activity, and resulting product, that involves creative or imaginative talent expressive of technical proficiency, beauty, emotional power, or conceptual ideas."
        },
        "organic_results": [
            {
                "position": 1,
                "title": "Art.com | Wall Art: Framed Prints, Canvas Paintings, Posters ...",
                "link": "https://www.art.com/",
                "snippet": "Shop Art.com for the best selection of wall art and photo prints online! Low price guarantee, fast shipping & free returns, and custom framing options ..."
            },
            {
                "position": 2,
                "title": "Art - Wikipedia",
                "link": "https://en.wikipedia.org/wiki/Art",
                "snippet": "Art is a diverse range of human activity, and resulting product, that involves creative or imaginative talent expressive of technical proficiency, beauty, ..."
            }
        ],
        "inline_images": [
            {
                "source": "https://www.leireunzueta.com/journal/tag/summer",
                "thumbnail": "https://serpapi.com/searches/62e907d0e13508b8c60f4c3b/images/6c0c95a05f3c4aa45e83ffe98a6112df67130eb20d484feef2c133f72ab49a3f.jpeg"
            }, ... other thumbnails
            {
                "source": "https://unsplash.com/photos/onMwdrVfMuE",
                "thumbnail": "https://serpapi.com/searches/62e907d0e13508b8c60f4c3b/images/6c0c95a05f3c4aa4deae63325b5a810305c9d32f794fc72c1849753080f116fa.jpeg"
            }
        ]
    }
}

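Why using API?

No need to create a parser from scratch and maintain it.

No need to bypass blocks from Google yourself: solving CAPTCHAs or working around IP blocks.

No need to pay for proxies and CAPTCHA solvers.

No need to use browser automation.

SerpApi handles everything on the backend with very fast response times and without browser automation, which makes it much faster.

The average is ~2.07 seconds based on 20 search queries (the screenshot shows 15 search queries).

Links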
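
A rough way to take such an average yourself is to time each call and divide the total by the number of calls. The sketch below assumes a stand-in function, fake_search, that sleeps briefly instead of calling the API. Both fake_search and average_seconds are hypothetical helpers written for this illustration, so the number it prints says nothing about SerpApi itself.

```python
import time


def fake_search(query):
    """Stand-in for a real API call: sleeps briefly instead of using the network."""
    time.sleep(0.01)
    return {"query": query}


def average_seconds(func, queries):
    """Call func once per query and return the mean wall-clock time in seconds."""
    timings = []
    for query in queries:
        start = time.perf_counter()
        func(query)
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)


queries = [f"query {i}" for i in range(5)]
avg = average_seconds(fake_search, queries)
print(f"average: {avg:.3f}s over {len(queries)} queries")
```

To time real searches, func could be replaced with a function that builds params and calls GoogleSearch(params).get_dict(), keeping in mind that every call counts against your API quota.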

Code in the online IDE

Google Reverse Image API

Originally published at SerpApi: https://serpapi.com/blog/scrape-google-finance-markets-in-python/


Add a Feature Request 💫 or a Bug 🐞

Reference

Source for this post (Using Google Reverse Images API from SerpApi): https://dev.to/chukhraiartur/using-google-reverse-images-api-from-serpapi-17f7
