
This code uses Selenium and Beautiful Soup to scrape job listings from Indeed.com. It opens a webdriver, enters a search query, clicks the search button, and parses the resulting HTML with Beautiful Soup. The script extracts the job title, company, and location for each listing and outputs them to the console.


PKHarsimran/Indeed_JobTitle_Scrapper


Job Scraper using Selenium and Beautiful Soup

A simple Python script that scrapes job listings from the Indeed website using Selenium and Beautiful Soup.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

You will need the following libraries installed:

click
openpyxl
pandas
selenium
bs4

You can install them using pip with the following command:

    pip install click openpyxl pandas selenium bs4

Running the script

Clone this repository to your local machine and run the script using Python:

    python scrape_indeed.py --driver [DriverClass] --search-query [search-query] --search-location [search-location]

Options

--driver (required) - The webdriver to use. Can be one of the following: Firefox, Chrome, Safari, or Edge.

--search-query (required) - The query to search for on indeed.com.

--search-location (required) - The location to search in on indeed.com. Can be a city and country, or Remote.
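As a rough illustration of how these three options could be declared with Click, here is a minimal sketch. It is not the script's actual source: the function name `main`, the parameter name `driver_name`, and the echoed summary line are assumptions made for the example, and the real script would create the Selenium driver where the comment indicates.

```python
import click

# The four driver names listed in the options above.
DRIVER_NAMES = ["Firefox", "Chrome", "Safari", "Edge"]


@click.command()
@click.option(
    "--driver",
    "driver_name",  # hypothetical parameter name, chosen for this sketch
    required=True,
    type=click.Choice(DRIVER_NAMES),
    help="The webdriver to use.",
)
@click.option("--search-query", required=True, help="The query to search for.")
@click.option("--search-location", required=True, help="The location to search in.")
def main(driver_name, search_query, search_location):
    """Validate the command-line options and report what would be searched."""
    # A real scraper would start the selected Selenium driver here.
    click.echo(
        f"driver={driver_name} query={search_query!r} location={search_location!r}"
    )
```

To use this as a command-line entry point, call `main()` under the usual `if __name__ == "__main__":` guard. Because `--driver` is declared with `click.Choice`, Click rejects any value outside the four listed names with a usage error before the function body runs.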

The script uses the selected webdriver to navigate to indeed.com, enter the search query and location, and scrape job information from the results page. The job information is then printed to the console and stored in an Excel workbook. If a workbook with the same name already exists, the data is appended to the existing workbook.
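The parse-and-append step described above might look roughly like the sketch below. It assumes the page source has already been fetched (for example from Selenium's `driver.page_source`). The CSS class names, the sample HTML, and the helper names `parse_jobs` and `append_to_workbook` are hypothetical. Indeed's real markup differs and changes over time, so the selectors would need to be adapted.

```python
from pathlib import Path

import pandas as pd
from bs4 import BeautifulSoup

COLUMNS = ["title", "company", "location"]

# Stand-in markup for a results page; class names are invented for this sketch.
SAMPLE_HTML = """
<div class="job-card"><h2 class="job-title">Data Engineer</h2>
  <span class="company">Acme</span><span class="location">Remote</span></div>
<div class="job-card"><h2 class="job-title">QA Analyst</h2>
  <span class="company">Globex</span><span class="location">Toronto, Canada</span></div>
"""


def _text(card, selector):
    """Return the stripped text of the first match, or '' if nothing matches."""
    node = card.select_one(selector)
    return node.get_text(strip=True) if node else ""


def parse_jobs(html):
    """Extract title, company, and location from each job card in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        {
            "title": _text(card, ".job-title"),
            "company": _text(card, ".company"),
            "location": _text(card, ".location"),
        }
        for card in soup.select("div.job-card")
    ]


def append_to_workbook(rows, path):
    """Write rows to an .xlsx file, keeping any rows already stored there."""
    frame = pd.DataFrame(rows, columns=COLUMNS)
    if Path(path).exists():
        existing = pd.read_excel(path, engine="openpyxl")
        frame = pd.concat([existing, frame], ignore_index=True)
    frame.to_excel(path, index=False, engine="openpyxl")
    return frame
```

Calling `append_to_workbook(parse_jobs(SAMPLE_HTML), "jobs.xlsx")` twice leaves four rows in the workbook. This sketch appends by reading the existing sheet and rewriting the whole file, which is simple but may not be how the script itself does it.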

Built With

  • Selenium - A web testing framework used to automate web browsers
  • Beautiful Soup - A library used to extract data from HTML and XML files
  • Click - A Python library for creating command line interfaces
  • pandas - A Python library for data analysis and manipulation, providing high-performance, easy-to-use data structures
  • openpyxl - A Python library for reading and writing Excel 2010 (.xlsx) files
