Scraping-Sneakers-with-BeautifulSoup

Image source: Pexels.com

Project Description

This project scrapes sneakers123.com to gather information about sneakers, including Brand, Name, ID, Number of Stores, Availability, Price, and Discount. The goal is to collect a dataset of over 200,000 entries, which is then cleaned to produce a refined dataset of approximately 150,000 entries for further analysis.

What I'm Doing

In this project, we scraped the sneakers123.com website to extract essential data about sneakers: the brand, name, ID, number of stores selling the sneaker, availability, price, and discount. We aimed to create a comprehensive dataset that captures the diverse range of sneakers available on the website.

Scraping Process

We used Python with the Beautiful Soup and Requests libraries to automate data extraction from sneakers123.com. The process involved navigating through web pages, parsing the HTML content, and extracting the relevant fields from each sneaker listing.
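The fetch-and-parse flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual scraper: the page structure of sneakers123.com is not documented here, so the CSS class names (`product-card`, `brand`, `name`, `stores`, `price`), the `data-id` attribute, the sample HTML, and the helper names `fetch_page`, `text_or_none`, and `parse_listings` are all assumptions made for the example.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical markup: the real listing HTML may look quite different.
SAMPLE_HTML = """
<div class="product-card" data-id="SKU-001">
  <span class="brand">Nike</span>
  <span class="name">Air Max 90</span>
  <span class="stores">12 stores</span>
  <span class="price">$129.99</span>
</div>
<div class="product-card" data-id="SKU-002">
  <span class="brand">Adidas</span>
  <span class="name">Samba OG</span>
  <span class="stores">7 stores</span>
</div>
"""


def fetch_page(url):
    """Download one page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def text_or_none(card, selector):
    """Return the stripped text of the first match, or None if absent."""
    node = card.select_one(selector)
    return node.get_text(strip=True) if node else None


def parse_listings(html):
    """Turn one page of listing HTML into a list of dicts, one per sneaker."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("div.product-card"):
        rows.append({
            "id": card.get("data-id"),
            "brand": text_or_none(card, "span.brand"),
            "name": text_or_none(card, "span.name"),
            "stores": text_or_none(card, "span.stores"),
            "price": text_or_none(card, "span.price"),
        })
    return rows


# Parse the inline sample instead of hitting the network.
listings = parse_listings(SAMPLE_HTML)
```

Keeping the download step (`fetch_page`) separate from the parsing step (`parse_listings`) makes the parser easy to check against saved HTML, and missing fields come back as `None` so they can be handled later during cleaning.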

Data Cleaning

After gathering the raw data, we performed data cleaning procedures to refine the dataset. This involved handling missing values, standardizing formats, removing duplicates, and ensuring data integrity. The goal was to prepare a clean dataset suitable for analysis and modeling.
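The cleaning steps named above (missing values, format standardization, duplicate removal) could look something like the following Pandas sketch. The column names, the made-up sample rows, and the `clean` helper are assumptions for illustration; the project's actual cleaning rules are not spelled out here.

```python
import pandas as pd

# Made-up rows standing in for the raw scrape output.
raw = pd.DataFrame({
    "id": ["SKU-001", "SKU-001", "SKU-002", None],
    "brand": [" nike", " nike", "Adidas ", "Puma"],
    "price": ["$129.99", "$129.99", None, "$80.00"],
})


def clean(df):
    """Return a cleaned copy of the raw listings frame."""
    # Handle missing values: rows without an ID cannot be identified.
    out = df.dropna(subset=["id"]).copy()
    # Standardize formats: trim whitespace and normalize brand casing.
    out["brand"] = out["brand"].str.strip().str.title()
    # Strip currency symbols and convert prices to floats (NaN if missing).
    out["price"] = pd.to_numeric(
        out["price"].str.replace(r"[^0-9.]", "", regex=True),
        errors="coerce",
    )
    # Remove duplicates: keep the first row seen for each sneaker ID.
    return out.drop_duplicates(subset=["id"]).reset_index(drop=True)


cleaned = clean(raw)
```

Whether to drop rows with a missing price or keep them as `NaN` is a judgment call; this sketch keeps them so that the decision can be made at analysis time.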

Dataset Overview

  • Original Dataset Size: 200,000+ entries
  • Refined Dataset Size: Approximately 150,000 entries

Technologies Used

  • Python
  • Beautiful Soup
  • Requests
  • Pandas

Next Steps

The cleaned dataset is now ready for further analysis and insights. We can explore trends in sneaker popularity, pricing dynamics, brand preferences, and more. The refined dataset provides a solid foundation for conducting statistical analysis and building machine learning models.
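As one possible starting point for the brand and pricing questions mentioned above, a per-brand summary can be built with a Pandas group-by. The frame below uses made-up rows and assumed column names (`brand`, `price`); it only illustrates the shape of such an analysis, not results from the real dataset.

```python
import pandas as pd

# Made-up rows standing in for the cleaned dataset.
cleaned = pd.DataFrame({
    "brand": ["Nike", "Nike", "Adidas"],
    "price": [120.0, 80.0, 100.0],
})

# Count listings and average the price per brand, most-listed brands first.
brand_summary = (
    cleaned.groupby("brand")["price"]
    .agg(listings="count", mean_price="mean")
    .sort_values("listings", ascending=False)
)
```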