Installing Selenium and ChromeDriver on OS X
Want to use Selenium to scrape with Chrome on OS X? Let’s do it!
We’ll need to install a couple things:
- Selenium, which allows you to control browsers from Python
- ChromeDriver, which allows software to control Chrome (like Selenium!)
Installing ChromeDriver
STEP ONE: Downloading ChromeDriver
First, download ChromeDriver from its terribly ugly site. It looks like a scam or like it was put together by a 12-year-old, but I promise it’s good and cool and nice.
You’ll want chromedriver_mac64.zip. That link should download 2.40, but if you want something more recent just go to the page and download the right thing.
STEP TWO: Unzipping ChromeDriver
Unzip chromedriver_mac64.zip and it will give you a file called chromedriver (no extension). This is the magic software!
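If you’d rather unzip from Python than double-click, here is a minimal sketch. It assumes the archive holds a single top-level file named chromedriver, as described above; the function name `unzip_chromedriver` and both of its arguments are made up for illustration.

```python
import os
import stat
import zipfile


def unzip_chromedriver(zip_path, dest_dir):
    """Extract the archive and return the path to the chromedriver file.

    Assumes the zip contains a top-level file called "chromedriver".
    """
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    binary = os.path.join(dest_dir, "chromedriver")
    # Python's zipfile does not restore Unix permissions,
    # so set the executable bit by hand.
    mode = os.stat(binary).st_mode
    os.chmod(binary, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return binary
```

You would call it with something like `unzip_chromedriver("chromedriver_mac64.zip", ".")`, swapping in wherever your download actually landed.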
STEP THREE: Moving ChromeDriver somewhere sensible
Now we need to move ChromeDriver somewhere that Python and Selenium will be able to find it (a.k.a. in your PATH).
You could do this from the command line, but newer versions of OS X are… problematic, so let’s just do it using Finder.
- Open up a new Finder window (the file browsing thing)
- From the top menu select Go > Go to Folder…
- Type /usr/local/bin and click Go
- Copy the chromedriver file to this folder. If things are going right, it’ll ask you for your password.
Done and done!
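Not sure the copy worked? One quick way to check is to ask Python whether it can see chromedriver on your PATH. This is just a sketch using the standard library; the helper name `find_chromedriver` is made up for illustration.

```python
import shutil


def find_chromedriver(name="chromedriver"):
    """Return the full path to the executable if it is on PATH, else None."""
    return shutil.which(name)


location = find_chromedriver()
if location:
    print("Found ChromeDriver at", location)
else:
    print("ChromeDriver is not on your PATH yet")
```

If it reports that ChromeDriver isn't on your PATH, double-check that the file really ended up in /usr/local/bin.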
Installing Selenium
If you google Selenium, a lot of the time you’ll see things about “Selenium server” and blah blah blah. You don’t need that: you aren’t running a huge complex of automated browser testing machines. We just need plain ol’ Selenium.
Let’s use pip3 to install Selenium for Python 3.
pip3 install selenium
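To confirm the install landed in the Python you’re actually running, you can look up the package version. A minimal sketch, assuming Python 3.8 or newer (that’s when `importlib.metadata` arrived); the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("selenium")
print("selenium version:", version if version else "not installed")
```

If it says "not installed", pip probably put Selenium into a different Python than the one you ran this with.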
Installing Chrome
Oh, you also need to make sure you have Chrome installed and that it’s in the Applications folder. If it isn’t in Applications then ChromeDriver and Selenium won’t be able to find it.
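You can check for Chrome from Python too. This sketch assumes the usual install location, /Applications/Google Chrome.app; the constant `CHROME_APP` and the helper `chrome_installed` are names made up for illustration.

```python
import os

# Assumed default location of Chrome on OS X; adjust if yours lives elsewhere.
CHROME_APP = "/Applications/Google Chrome.app"


def chrome_installed(app_path=CHROME_APP):
    """Return True if the Chrome app bundle exists at app_path."""
    # A .app bundle is really a directory, so isdir is the right check.
    return os.path.isdir(app_path)


print("Chrome found" if chrome_installed() else "Chrome is not in Applications")
```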
Test it
Want to make sure it works? Run the following to pull all of the headlines from the New York Times homepage.
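from selenium import webdriver
from selenium.webdriver.common.by import By

# Start Chrome through ChromeDriver (found via your PATH)
driver = webdriver.Chrome()
driver.get("https://www.nytimes.com")
# find_elements_by_class_name is gone from newer Selenium 4 releases; use By.CLASS_NAME
headlines = driver.find_elements(By.CLASS_NAME, "story-heading")
for headline in headlines:
    print(headline.text.strip())
# Close the browser when finished
driver.quit()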