The Scrapy crawler framework command line in Python (very detailed)

Knowledge point 1. Creating a project

scrapy startproject testproject
# creates a project named testproject

The output is:


C:\Users\qs418>scrapy startproject testproject
New Scrapy project 'testproject', using template directory 'd:\\python_exe\\lib\\site-packages\\scrapy\\templates\\project', created in:
    C:\Users\qs418\testproject

You can start your first spider with:
    cd testproject
    scrapy genspider example example.com

Knowledge point 2. Entering the project directory

cd testproject

Knowledge point 3. Creating a spider

scrapy genspider baidu www.baidu.com
# creates a spider named baidu

The output is:

Created spider 'baidu' using template 'basic' in module:
  testproject.spiders.baidu

Knowledge point 4. Listing the available templates

scrapy genspider -l

Output:

Available templates:
  basic
  crawl
  csvfeed
  xmlfeed

Knowledge point 5. Specifying a template

scrapy genspider -t crawl zhihu www.zhihu.com

Output:



C:\Users\qs418>scrapy genspider -t crawl zhihu www.zhihu.com
Created spider 'zhihu' using template 'crawl'

Knowledge point 6. Notes on the remaining commands

crawl: runs a spider. You pass the name of the spider to run (the name, not the file name), for example:

scrapy crawl zhihu

check: runs the contract checks defined for a spider, which helps catch errors in the code. It also takes the spider name:

scrapy check zhihu

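scrapy list: returns the names of all spiders in the project.

scrapy edit: opens a spider for editing from the command line.

fetch: returns the page source as Scrapy downloads it, which is the same content the response holds.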

scrapy fetch http://www.baidu.com

Suppress the log and print the headers instead of the body:

scrapy fetch --nolog --headers http://www.baidu.com

Output:


C:\Users\qs418>scrapy fetch --nolog --headers http://www.baidu.com
> Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> Accept-Language: en
> User-Agent: Scrapy/1.5.1 (+https://scrapy.org)
> Accept-Encoding: gzip,deflate
>
< Cache-Control: private, no-cache, no-store, proxy-revalidate, no-transform
< Content-Type: text/html
< Date: Thu, 02 Aug 2018 04:36:31 GMT
< Last-Modified: Mon, 23 Jan 2017 13:27:32 GMT
< Pragma: no-cache
< Server: bfe/1.0.8.18
< Set-Cookie: BDORZ=27315; max-age=86400; domain=.baidu.com; path=/
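
In this output, lines starting with `>` are the request headers Scrapy sent and lines starting with `<` are the response headers it received. As a small illustration, the sketch below splits such output into two dictionaries; `split_headers` is a hypothetical helper written for this example and the sample text is shortened from the output above.

```python
def split_headers(output):
    """Split `scrapy fetch --headers` output into request and response headers."""
    request, response = {}, {}
    for line in output.splitlines():
        if line.startswith(">"):
            target = request
        elif line.startswith("<"):
            target = response
        else:
            continue
        # Split on the first colon only, so values containing ":" stay intact.
        name, sep, value = line[1:].partition(":")
        if sep:
            target[name.strip()] = value.strip()
    return request, response


sample = """\
> Accept-Language: en
> User-Agent: Scrapy/1.5.1 (+https://scrapy.org)
>
< Content-Type: text/html
< Server: bfe/1.0.8.18
"""

request_headers, response_headers = split_headers(sample)
print(request_headers["User-Agent"])
print(response_headers["Server"])
```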

Disable redirects with --no-redirect:

scrapy fetch --no-redirect http://www.baidu.com

view: saves the page to a local file and opens it in the browser, so you can see the page as Scrapy sees it. This can be useful for testing.

scrapy view http://www.baidu.com

shell: starts an interactive command-line session and lists the variables available in it.

scrapy shell http://www.baidu.com

parse: fetches a URL and runs it through the spider callback you specify, then prints the result in formatted form.

settings: gets the current configuration values.

scrapy settings -h
# -h prints the help for the command

Output:

C:\Users\qs418\quotetutorial>scrapy settings -h
Usage
=====
  scrapy settings [options]

Get settings values

Options
=======
--help, -h              show this help message and exit
--get=SETTING           print raw setting value
--getbool=SETTING       print setting value, interpreted as a boolean
--getint=SETTING        print setting value, interpreted as an integer
--getfloat=SETTING      print setting value, interpreted as a float
--getlist=SETTING       print setting value, interpreted as a list

Global Options
--------------
--logfile=FILE          log file. if omitted stderr will be used
--loglevel=LEVEL, -L LEVEL
                        log level (default: DEBUG)
--nolog                 disable logging completely
--profile=FILE          write python cProfile stats to FILE
--pidfile=FILE          write process ID to FILE
--set=NAME=VALUE, -s NAME=VALUE
                        set/override setting (may be repeated)
--pdb                   enable pdb on failure

C:\Users\qs418\quotetutorial>

runspider: runs a spider contained in a single Python file. Unlike crawl, it takes the file name:

scrapy runspider baidu.py

version: prints the Scrapy version. With -v it also prints the versions of Python and the main dependencies.

scrapy version -v

bench: runs a quick benchmark (scrapy bench) to measure crawling speed on the current machine.


