TIL Python Basics Day 48 - Selenium
Selenium
CSS selector tip: https://saucelabs.com/resources/articles/selenium-tips-css-selectors
Installing driver
- The Selenium package can drive different browsers (Chrome, Firefox, Safari, etc.). A browser-specific driver provides the bridge between Selenium and the browser. (These notes use the driver for Chrome.)
from selenium import webdriver  # requires: pip install selenium
# Raw string, so the backslashes in the Windows path are not treated as escapes.
chrome_driver_path = r"C:\dayeon2020\chromedriver.exe"  # for mac: no .exe
driver = webdriver.Chrome(executable_path=chrome_driver_path)
driver.get("https://www.amazon.com/Cuckoo-CRP-P0609S-Cooker-10-10-11-60/dp/B01JRTZVVM/ref=sr_1_4?qid=1611738512&sr=8-4")
driver.close()  # closes a single window
# driver.quit()  # ends the entire session, regardless of how many tabs are open
Locating
Find and locate HTML elements
- Beautiful Soup (BS) has its limits when a website renders its content with JavaScript, Angular, etc.
driver.get("https://www.amazon.com/Cuckoo-CRP-P0609S-Cooker-10-10-11-60/dp/B01JRTZVVM/ref=sr_1_4?qid=1611738512&sr=8-4")
price = driver.find_element_by_id("priceblock_ourprice")
print(price.text)
driver.quit() #entire program. regardless how many tabs
Locating with class name
search bar
logo
from selenium import webdriver
#install selenium
chrome_driver_path = r"C:\dayeon2020\chromedriver.exe"
driver = webdriver.Chrome(executable_path=chrome_driver_path)
driver.get("https://www.python.org/")
#search_bar
search_bar = driver.find_element_by_name("q")
print(search_bar.tag_name) #print: input
print(search_bar.get_attribute("placeholder")) #print: search
#logo
logo = driver.find_element_by_class_name("python-logo")
print(logo.size) #print: {'height': 72, 'width': 255}
driver.quit() #entire program. regardless how many tabs
Locating with selector
- Note the dot at the start of the selector (".documentation-widget a"): it marks a class selector. Matching on one class is enough, even though the element's full class name is "small-widget documentation-widget".
#selector
driver.get("https://www.python.org/")
documentation_link = driver.find_element_by_css_selector(".documentation-widget a")
print(documentation_link.text) #print: docs.python.org
driver.close()
XPath
- XPath can be used to navigate through elements and attributes in an XML document.
- Right click and copy the XPath. We need to change the double quotes inside the XPath into single quotes, because they would clash with the double quotes that wrap the Python string.
driver.get("https://www.python.org/")
bug_link = driver.find_element_by_xpath("//*[@id='site-map']/div[2]/div/ul/li[3]/a")
print(bug_link.text) #print: Submit Website Bug
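The quoting rule can be tried without a browser. The sketch below uses the standard library's xml.etree.ElementTree as a stand-in for Selenium (it supports only a subset of XPath), on a tiny made-up document, to show that the two quoting styles select the same element.

```python
import xml.etree.ElementTree as ET

# Tiny made-up document, loosely shaped like a site map list.
doc = ET.fromstring(
    "<body><div id='site-map'><ul>"
    "<li><a>About</a></li>"
    "<li><a>Downloads</a></li>"
    "<li><a>Submit Website Bug</a></li>"
    "</ul></div></body>"
)

# Same XPath, two quoting styles: the inner quotes must differ from the outer ones.
outer_double = ".//*[@id='site-map']/ul/li[3]/a"
outer_single = './/*[@id="site-map"]/ul/li[3]/a'

link_a = doc.find(outer_double)
link_b = doc.find(outer_single)
print(link_a.text)
print(link_a is link_b)
```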
Difference between find_elements and find_element
find_elements: returns all matches as a list. find_element: returns only the first match, as a single WebElement.
Article: Locating strategy
- find_elements
events = driver.find_elements_by_xpath('//*[@id="content"]/div/section/div[2]/div[2]/div/ul')
print(events)
# -> prints the WebElement object inside a list:
# [<selenium.webdriver.remote.webelement.WebElement
# (session="0621dbfca2d047fd85c6dba14c554b6c", element="fac0caf1-8b8a-4ee0-9f2d-5d347cb3df72")>]
print(type(events))
# -> <class 'list'>
for i in events:
    print(i.text)
- find_element
Use ".text.splitlines()" to put each item into a list when there are multiple list items under the ul element.
event = driver.find_element_by_xpath('//*[@id="content"]/div/section/div[2]/div[2]/div/ul')
print(event)
# -> prints a single WebElement object:
# <selenium.webdriver.remote.webelement.WebElement
# (session="5107d054b8b39ed4b87b64ddb660c292", element="4c2f8ec2-f025-4fab-a2e3-2b6f99ef154e")>
print(type(event))
#-> <class 'selenium.webdriver.remote.webelement.WebElement'>
Task. scraping events section at Python.org (module #414)
In my code, range(1, 6) is not scalable: it hard-codes the number of events.
Angela used a CSS selector after inspecting the structure of the website.
Choosing the right method comes only with experience and knowledge of HTML and CSS.
<MY CODE>
from selenium import webdriver
chrome_driver_path = r"C:\dayeon2020\chromedriver.exe"
driver = webdriver.Chrome(executable_path=chrome_driver_path)
driver.get("https://www.python.org/")
events = {}
for i in range(1, 6):
    time = driver.find_element_by_xpath(f"//*[@id='content']/div/section/div[2]/div[2]/div/ul/li[{i}]/time")
    name = driver.find_element_by_xpath(f"//*[@id='content']/div/section/div[2]/div[2]/div/ul/li[{i}]/a")
    events[i - 1] = {
        'time': f"2021-{time.text}",
        'name': name.text
    }
print(events)
# print: {0: {'time': '2021-01-30', 'name': 'BelPy 2021'},
#         1: {'time': '2021-01-30', 'name': 'PyCamp Leipzig'},
#         2: {'time': '2021-02-19', 'name': 'PyCascades 2021'},
#         3: {'time': '2021-03-18', 'name': 'PyCon Cameroon 2021'},
#         4: {'time': '2021-04-22', 'name': 'GeoPython 2021'}}
driver.quit()
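One possible way around the hard-coded range(1, 6), sketched with the same find_element(s)_by_xpath calls used in these notes: fetch every li under the list first, then look up time and a relative to each li. scrape_events is a hypothetical helper name, and the function expects an already-opened driver.

```python
def scrape_events(driver):
    """Collect every event under the list, however many there are."""
    items = driver.find_elements_by_xpath(
        "//*[@id='content']/div/section/div[2]/div[2]/div/ul/li")
    events = {}
    for index, item in enumerate(items):
        # A leading "./" makes the XPath relative to the current <li>.
        when = item.find_element_by_xpath("./time")
        name = item.find_element_by_xpath("./a")
        events[index] = {"time": when.text, "name": name.text}
    return events
```

Usage would be events = scrape_events(driver) after driver.get("https://www.python.org/"). The loop length now follows the page instead of a fixed number.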
Angela's
<Angela's code>
event_times = driver.find_elements_by_css_selector(".event-widget time")
event_names = driver.find_elements_by_css_selector(".event-widget li a")
print(event_times) #-> selenium object
events = {}
for n in range(len(event_times)):
    events[n] = {
        "time": event_times[n].text,
        "name": event_names[n].text
    }
print(events)
# print: {0: {'time': '01-30', 'name': 'BelPy 2021'},
#         1: {'time': '01-30', 'name': 'PyCamp Leipzig'},
#         2: {'time': '02-19', 'name': 'PyCascades 2021'},
#         3: {'time': '03-18', 'name': 'PyCon Cameroon 2021'},
#         4: {'time': '04-22', 'name': 'GeoPython 2021'}}
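The range(len(...)) loop pairs two parallel lists by index. A small variation, shown here on plain strings standing in for the .text of each element, is to pair them with zip and enumerate. pair_events is a hypothetical helper name, not something from the course.

```python
def pair_events(times, names):
    """Build {index: {'time': ..., 'name': ...}} from two parallel lists."""
    return {
        index: {"time": when, "name": name}
        for index, (when, name) in enumerate(zip(times, names))
    }


# Sample strings standing in for the .text of each WebElement.
times = ["01-30", "01-30", "02-19"]
names = ["BelPy 2021", "PyCamp Leipzig", "PyCascades 2021"]
print(pair_events(times, names))
```

zip stops at the shorter list, so a mismatch between the two selectors cannot raise an IndexError, though it would silently drop the extra items.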
From the Q&A section: using a dictionary comprehension
This is something I initially tried and failed to do.
.splitlines(): splits the text into a list, one item per line
range(0, len(events), 2): the scalable way! From zero to the end of the list, in steps of 2
events = driver.find_element_by_xpath(
'//*[@id="content"]/div/section/div[2]/div[2]/div/ul').text.splitlines()
print(events)
# ['01-30', 'BelPy 2021', '01-30', 'PyCamp Leipzig',
#  '02-19', 'PyCascades 2021', '03-18', 'PyCon Cameroon 2021', '04-22', 'GeoPython 2021']
dictionary = {i: {'time': events[i], 'name': events[i + 1]} for i in range(0, len(events), 2)}
print(dictionary)
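The same idea can be tried on a plain string. The sample text below is made up to look like the ul element's .text. One detail worth noticing: with i as the key, the dictionary keys come out as 0, 2, 4, and dividing by 2 gives consecutive keys instead.

```python
# Sample text shaped like the ul element's .text: one time or name per line.
ul_text = "01-30\nBelPy 2021\n01-30\nPyCamp Leipzig\n02-19\nPyCascades 2021"

lines = ul_text.splitlines()

# Keyed by the loop variable itself: the keys are the even indexes 0, 2, 4.
by_even_index = {i: {"time": lines[i], "name": lines[i + 1]}
                 for i in range(0, len(lines), 2)}

# Integer-dividing by 2 gives consecutive keys 0, 1, 2 instead.
by_position = {i // 2: {"time": lines[i], "name": lines[i + 1]}
               for i in range(0, len(lines), 2)}

print(list(by_even_index))
print(list(by_position))
```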
Hidden year problem
- The year is hidden by CSS if the window is too small.
When printing the text inside the "time" tag, only the text that is visible (i.e. text where the CSS property "visibility" is equal to "visible") will be printed.
However, on python.org the CSS property "visibility" of the year component of the event is set to "hidden" when the browser window is resized to a certain width.
This is why, when the browser window has a specific width, the year component of the event is not displayed.
Example 1:
driver.set_window_size(width=100, height=200)
driver.get("https://www.python.org/")
event_times = driver.find_elements_by_css_selector(".event-widget time")
for time in event_times:
    print(time.text)
Example 2:
driver.maximize_window()
driver.get("https://www.python.org/")
event_times = driver.find_elements_by_css_selector(".event-widget time")
for time in event_times:
    print(time.text)
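A possible way to sidestep the hidden year, sketched under an assumption: if each time tag also carries a machine-readable datetime attribute, then get_attribute("datetime") would return the full timestamp regardless of window width. That attribute and the sample value below are assumptions, not something verified in these notes, and full_date is a hypothetical helper that shows only the parsing step.

```python
from datetime import datetime


def full_date(datetime_attr):
    """Turn an ISO 8601 timestamp string into 'YYYY-MM-DD'."""
    return datetime.fromisoformat(datetime_attr).date().isoformat()


# In the scraper this value would come from time.get_attribute("datetime");
# a hard-coded sample stands in so the snippet runs without a browser.
sample = "2021-01-30T00:00:00+00:00"
print(full_date(sample))
```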
Task. Wikipedia
from selenium import webdriver
chrome_driver_path = r"C:\dayeon2020\chromedriver.exe"
driver = webdriver.Chrome(executable_path=chrome_driver_path)
driver.get("https://en.wikipedia.org/wiki/Main_Page")
num = driver.find_element_by_id("articlecount")
print(num.text) #print 6,237,906 articles in English
num2 = driver.find_element_by_css_selector("#articlecount a")
print(num2.text) #print 6,237,906
driver.quit()
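Both values come back as strings. If the count is needed as a number, a small helper can strip the thousands separators. This is a sketch, and parse_article_count is a hypothetical name; it works on either form of the text printed above.

```python
import re


def parse_article_count(text):
    """Return the first comma-grouped number in text as an int."""
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return int(match.group().replace(",", ""))


print(parse_article_count("6,237,906 articles in English"))
print(parse_article_count("6,237,906"))
```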
How to Automate Filling Out Forms and Clicking Buttons with Selenium
click() with .find_element_by_link_text
The link text is the text that sits between the anchor tags (<a> and </a>).
from selenium import webdriver
chrome_driver_path = r"C:\dayeon2020\chromedriver.exe"
driver = webdriver.Chrome(executable_path=chrome_driver_path)
driver.get("https://en.wikipedia.org/wiki/Main_Page")
count = driver.find_element_by_css_selector("#articlecount a")
# count.click()
all_portals = driver.find_element_by_link_text("All portals")
all_portals.click()
# driver.quit()
click() with send_keys()
We need to import Keys, which holds constants for special keys such as ENTER, SHIFT and ALT.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
chrome_driver_path = r"C:\dayeon2020\chromedriver.exe"
driver = webdriver.Chrome(executable_path=chrome_driver_path)
driver.get("https://en.wikipedia.org/wiki/Main_Page")
search = driver.find_element_by_name("search")
search.send_keys("python")
search.send_keys(Keys.ENTER)
#### task. filling in the signup form
top = driver.find_element_by_class_name("top")
top.send_keys("python")
middle = driver.find_element_by_class_name("middle")
middle.send_keys("lee")
bottom = driver.find_element_by_class_name("bottom")
bottom.send_keys("[email protected]")
# click = driver.find_element_by_class_name("btn-block")
# click.send_keys(Keys.ENTER)
Project: The Cookie Clicker Project
time()
Function time.time returns the current time in seconds since 1st Jan 1970. The value is in floating point, so you can even use it with sub-second precision. In the beginning the value t_end is calculated to be "now" + 15 minutes. The loop will run until the current time exceeds this preset ending time.
Try this:
import time
t_end = time.time() + 60 * 15
while time.time() < t_end:
    pass  # do whatever you do
This will run for 15 min x 60 s = 900 seconds.
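The same pattern can be extended with a second timer for a periodic task inside the main loop. The sketch below uses short durations so it finishes quickly. run_for, action and check_every are hypothetical names, and in a clicker-style project action might be a click and the periodic check might look at available upgrades.

```python
import time


def run_for(seconds, action, check_every):
    """Call action() repeatedly for `seconds`; count the periodic checks."""
    t_end = time.time() + seconds
    next_check = time.time() + check_every
    actions = 0
    checks = 0
    while time.time() < t_end:
        action()
        actions += 1
        if time.time() >= next_check:
            checks += 1  # a periodic task would go here
            next_check = time.time() + check_every
    return actions, checks


clicks = []
actions, checks = run_for(0.2, lambda: clicks.append(1), check_every=0.05)
print(actions == len(clicks), checks)
```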
word
on a similar vein
Author And Source
For more on this topic (TIL Python Basics Day 48 - Selenium), see the original post: https://velog.io/@daylee/TIL-Python-Basics-Day-48-Selenium
Attribution: the original author's information is included at the original URL, and the copyright belongs to the original author.