Web Scraping: More Selenium & Ways To Interact With Websites (Python)

wordpress_logon function

Yesterday I introduced Selenium to my web scraping fun; if you missed it, see https://geektechstuff.com/2019/05/07/web-scraping-introducing-selenium-python/.

Today I am going to use Selenium to interact with a website by:

  • Opening a webpage
  • Logging in
  • Retrieving information from the page

As much as I’ve enjoyed using Wikipedia in my previous examples (and if you too are using Wikipedia, please consider donating to them), today I am going to look at logging into WordPress.com.

geektechstuff.com is hosted on WordPress.com and every so often I like seeing how many daily visitors my site has. So I thought it would be cool if I created a Python program to do this for me.

def wordpress_login():
    # import selenium webdriver and the By locator constants
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    import time
    # logon details
    username_wp = "geektechstuff@virginmedia.com"
    password_wp = ""
    # set browser / browser options
    browser = webdriver.Safari()
    # get page
    # (assumed URL: the address was missing from the original listing)
    browser.get("https://wordpress.com/log-in")
    # delay to give page chance to load
    time.sleep(5)
    # Fields by ID for username page
    username_field = browser.find_element(By.ID, "usernameOrEmail")
    username_field.send_keys(username_wp)
    # element by XPath
    login_button = browser.find_element(By.XPATH, "//button[text()='Continue']")
    login_button.click()
    time.sleep(5)
    # password page: fill in the password and click the same button again
    password_field = browser.find_element(By.ID, "password")
    password_field.send_keys(password_wp)
    login_button.click()
    time.sleep(5)
    # first link in the page header
    wp_hub = browser.find_element(By.XPATH, "//*[@id='header']/a[1]")
    wp_hub.click()
    time.sleep(5)
    # looking for visitors stats
    # element by CSS selector
    stat = browser.find_element(By.CSS_SELECTOR, "#my-stats-content > div.card.stats-module.is-chart-tabs > ul > li:nth-child(2) > a > span.value")
    stat_value = stat.text
    time.sleep(5)
    print("Total number of visitors so far today =", stat_value)
    browser.quit()
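
The fixed five-second pauses in wordpress_login work, but they always wait the full time, even when the page is ready sooner, and they can still be too short on a slow connection. One alternative is to poll until the element actually appears. The helper below is only a sketch of that idea in plain Python: the name wait_until and its timeout and poll parameters are my own inventions for illustration and are not part of Selenium (Selenium ships its own WebDriverWait class in selenium.webdriver.support.ui for the same job).

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() until it returns something truthy or timeout passes.

    Any exception raised by condition() is treated as "not ready yet",
    which is how a missing element would typically show up.
    wait_until is a hypothetical helper, not a Selenium function.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except Exception:
            pass  # e.g. the element is not on the page yet
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        time.sleep(poll)
```

Inside wordpress_login, a time.sleep(5) followed by an element lookup could then become a single call that wraps the same lookup in a lambda, for example wait_until(lambda: <lookup for the password field>). The lookup is retried every half second and returns as soon as it succeeds.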

I’ve uploaded a video to show wordpress_login in action; please note the output on the terminal:

 

Selenium can select elements in several ways, including:

  • CSS Selector
  • XPath
  • ID
  • Link Text

The wordpress_login function uses a few of these selection methods (ID, XPath and CSS selector) to get the task done, but you may be wondering where the details for each one come from. I’ve put a very brief video together to show how I accessed them via Safari’s Develop menu.
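
Once the values have been copied out of the Develop menu, it can help to keep them together in one place rather than scattered through the function. The sketch below stores each one as a (strategy, value) pair. The strategy strings are the values behind Selenium's By.ID, By.XPATH, By.CSS_SELECTOR and By.LINK_TEXT constants; the dictionary name LOCATORS, the keys, the describe helper and the "Stats" link text are my own assumptions for illustration (the function above never selects by link text).

```python
# Strategy names as plain strings. These match the values of Selenium's
# By.ID, By.XPATH, By.CSS_SELECTOR and By.LINK_TEXT constants.
ID = "id"
XPATH = "xpath"
CSS_SELECTOR = "css selector"
LINK_TEXT = "link text"

# Selectors used by wordpress_login, plus one hypothetical link-text entry.
LOCATORS = {
    "username_field": (ID, "usernameOrEmail"),
    "continue_button": (XPATH, "//button[text()='Continue']"),
    "visitor_count": (
        CSS_SELECTOR,
        "#my-stats-content > div.card.stats-module.is-chart-tabs"
        " > ul > li:nth-child(2) > a > span.value",
    ),
    "stats_link": (LINK_TEXT, "Stats"),  # hypothetical, not used in the post
}


def describe(name):
    """Return a one-line summary of how a named element is located."""
    strategy, value = LOCATORS[name]
    return "{}: find by {} -> {}".format(name, strategy, value)


for name in LOCATORS:
    print(describe(name))
```

Each pair can be unpacked straight into Selenium's generic lookup, for example browser.find_element(*LOCATORS["username_field"]), so if WordPress.com changes its page layout only the dictionary needs updating.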