This project is focused on scraping millions of emails dynamically from thousands of web pages automatically from the website [fredmiranda.com](https://fredmiranda.com/). The goal of this project is to create a dataset of email addresses that can be used for various purposes.

Anas1108/Scrap-Millions-of-Emails


Scrape Millions of Emails Using Selenium and Python

Introduction

This project uses Selenium and Python to scrape millions of email addresses automatically from thousands of pages on fredmiranda.com. The goal is to build a dataset of email addresses that can be used for various purposes.

Tools

  • Selenium
  • Python
  • IPython Notebook
  • CSV library

Usage

  1. Install Selenium and its dependencies:
       pip install selenium
  2. Clone or download the repository from GitHub.
  3. Open email_scraping.ipynb in IPython Notebook.
  4. Install the libraries listed in the first cell of the notebook.
  5. Set the url variable to the web page you want to scrape.
  6. Run all the cells in the notebook. The code scrapes the starting page and keeps running until every page has been scraped.
  7. When the run finishes, a CSV file named email_dataset.csv is created in the same directory as the notebook. It contains the email addresses scraped from the website.
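
The final step, writing the results to email_dataset.csv, could look roughly like the sketch below. It is not taken from the notebook: the function name write_email_csv, the single "email" column, and the lowercase de-duplication are all assumptions made for illustration, and the sample addresses use the reserved example.com domain.

```python
import csv


def write_email_csv(emails, path="email_dataset.csv"):
    """Write unique addresses to a one-column CSV and return the row count.

    Hypothetical helper: the column name "email" and the case-insensitive
    de-duplication are assumptions, not the notebook's confirmed behaviour.
    """
    seen = set()
    unique = []
    for address in emails:
        normalized = address.strip().lower()
        # Skip blanks and anything already recorded.
        if normalized and normalized not in seen:
            seen.add(normalized)
            unique.append(normalized)

    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["email"])
        writer.writerows([address] for address in unique)
    return len(unique)
```

Calling write_email_csv(["a@example.com", "A@example.com ", "b@example.com"]) would write a header row plus two address rows, because the first two entries differ only in case and whitespace.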

Conclusion

This project demonstrates how to use Selenium and Python to scrape email addresses automatically from thousands of web pages. The resulting CSV dataset can be used for various purposes and is easy to export to other formats.
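
As one illustration of exporting the dataset to another format, the sketch below converts the CSV to JSON using only the standard library. It assumes the CSV has a header row; the function name csv_to_json and the default output file name email_dataset.json are hypothetical and do not come from the repository.

```python
import csv
import json


def csv_to_json(csv_path="email_dataset.csv", json_path="email_dataset.json"):
    """Convert a CSV file with a header row into a JSON array of objects.

    Hypothetical helper: the file layout (one header row, then data rows)
    is an assumption about email_dataset.csv.
    """
    with open(csv_path, newline="", encoding="utf-8") as source:
        # DictReader uses the header row as the keys of each record.
        rows = list(csv.DictReader(source))

    with open(json_path, "w", encoding="utf-8") as target:
        json.dump(rows, target, indent=2)
    return len(rows)
```

The function returns the number of records written, which gives a quick check that the export matches the CSV row count.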
