Python Crawler
Python Tutorial
Packages
Requests
import requests
BeautifulSoup
Iteration
Minor Errors
Information Extraction
Find_all
Targeted Crawling
Chinese Characters
chr(12288)
tplt = "{0:^10}\t{1:{3}^10}\t{2:^10}"
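The template above centers three columns in a width of 10, and the second column takes its fill character from argument 3. A minimal sketch, assuming the intent is to pad Chinese text with the full-width space chr(12288) so the columns line up; the helper format_row and the sample rows are hypothetical and only for illustration:

```python
# Full-width (ideographic) space, U+3000. It is as wide as one CJK character,
# so padding with it keeps columns of Chinese text aligned.
FULL_WIDTH_SPACE = chr(12288)

# Column 1 uses the fill character supplied as format argument 3.
tplt = "{0:^10}\t{1:{3}^10}\t{2:^10}"


def format_row(rank, name, score):
    """Format one row, padding the name column with full-width spaces."""
    return tplt.format(rank, name, score, FULL_WIDTH_SPACE)


# Hypothetical sample data, not taken from any real ranking.
rows = [("1", "甲大学", "90.0"), ("2", "乙学院", "85.5")]

print(format_row("排名", "学校名称", "总分"))
for row in rows:
    print(format_row(*row))
```

With the default ASCII space as fill, the middle column would drift because a Chinese character is wider than a half-width space in most monospaced fonts.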
Scrapy
- Start a scrapy project
scrapy startproject python123demo
python123demo/: root directory
  scrapy.cfg: configuration file of Scrapy
  python123demo/: user code
    __init__.py: initialization script
    items.py: Items code template (Inheritance)
    middlewares.py: Middlewares code template (Inheritance)
    pipelines.py: Pipelines code template (Inheritance)
    settings.py: settings of the project
    spiders/: directory of spider code templates (Inheritance)
- Generate a scrapy spider
cd python123demo
scrapy genspider demo python123.io
- Review demo.py
python123demo/spiders/demo.py
# -*- coding: utf-8 -*-
import scrapy


class DemoSpider(scrapy.Spider):
    name = 'demo'
    allowed_domains = ['python123.io']
    start_urls = ['http://python123.io/']

    def parse(self, response):
        pass
- Configure the spider
python123demo/spiders/demo.py
# -*- coding: utf-8 -*-
import scrapy


class DemoSpider(scrapy.Spider):
    name = 'demo'
    # allowed_domains = ['python123.io']
    start_urls = ['http://python123.io/ws/demo.html']

    def parse(self, response):
        # Use the last URL segment as the local file name.
        fname = response.url.split('/')[-1]
        # response.body is bytes, so the file is opened in binary mode.
        with open(fname, 'wb') as f:
            f.write(response.body)
        self.log('Saved file {0}.'.format(fname))
- Run the spider
pwd
scrapy crawl demo
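The parse method above takes the last segment of response.url as the file name. That works for http://python123.io/ws/demo.html, but it yields an empty string for a URL ending in a slash and keeps any query string. A hedged alternative using only the standard library is sketched below; filename_from_url and its default of index.html are hypothetical choices, not part of Scrapy or of the original spider:

```python
import posixpath
from urllib.parse import urlparse


def filename_from_url(url, default="index.html"):
    """Return the last path segment of url, or default when there is none."""
    # urlparse separates the path from the query string and fragment.
    path = urlparse(url).path
    # basename returns '' for paths that are empty or end in '/'.
    name = posixpath.basename(path)
    return name or default


print(filename_from_url("http://python123.io/ws/demo.html"))
print(filename_from_url("http://python123.io/"))
```

Inside the spider, this helper could replace the split('/') expression if pages without a file-like final segment need to be saved as well.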
# -*- coding: utf-8 -*-
The return statement sends a specified value back to its caller, whereas yield can produce a sequence of values. We should use yield when we want to iterate over a sequence but don't want to store the entire sequence in memory.
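A small sketch of the difference; squares_list and squares_iter are hypothetical names used only for this illustration. The first function builds the whole list before returning it, while the second hands out one value at a time:

```python
def squares_list(n):
    """Build the full list in memory, then return it once."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_iter(n):
    """Yield one square at a time; no list is ever stored."""
    for i in range(n):
        yield i * i


print(squares_list(5))        # [0, 1, 4, 9, 16]
print(sum(squares_iter(5)))   # 30
```

For a small n the two behave the same from the caller's point of view; the generator version matters when the sequence is large or is consumed only once.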
The yield keyword is used in Python generators. A generator function is defined like a normal function, but whenever it needs to generate a value, it does so with the yield keyword rather than return. If the body of a def contains yield, the function automatically becomes a generator function.
def gen(n):
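To see the generator-function rule in action, here is a minimal sketch. The name countdown is hypothetical and is not the gen function from the listing above. Calling a generator function does not run its body; it returns a generator object that produces values on demand:

```python
import inspect


def countdown(n):
    """Yield n, n-1, ..., 1. The yield makes this a generator function."""
    while n > 0:
        yield n
        n -= 1


print(inspect.isgeneratorfunction(countdown))  # True

g = countdown(3)   # no code in the body has run yet
print(next(g))     # 3
print(list(g))     # [2, 1]
```

Once the generator is exhausted, a further call to next(g) raises StopIteration, which is how a for loop knows when to stop.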
Previous Close | 208.08 |
Open | 211.70 |
Bid | 225.60 x 800 |
Ask | 225.70 x 900 |
Day's Range | 202.32 - 224.46 |
52 Week Range | 60.97 - 224.46 |
Volume | 32,390,121 |
Avg. Volume | 14,473,820 |
Market Cap | 63.121B |
Beta (5Y Monthly) | N/A |
PE Ratio (TTM) | 6,396.29 |
EPS (TTM) | 0.04 |
Earnings Date | Jun 02, 2020 |
Forward Dividend & Yield | N/A (N/A) |
Ex-Dividend Date | N/A |
1y Target Est | 132.07 |
Fair Value
Related Research
- Analyst Report: Zoom Video Communications, Inc.
- Technical Assessment: Bearish in the Intermediate-Term