This is Part 1 of a two-part analysis exploring Bergen's bicycle rental trends, now implemented in Python! In this installment, we perform data cleaning, build basic predictive models, and end with an interactive map showcasing bike traffic across stations. Part 2 will delve deeper into advanced methods, integrating weather data for precision modeling and actionable insights.
- Bysykkel Data: Historical bike rental data provided by Bergen Bysykkel.
- Weather Data: Hourly weather observations obtained from SeKlima.
- Python-Powered Data Cleaning:
  - Combined hourly bike rental data for 2024 into a clean and structured format.
  - Addressed missing values, ensured consistent timestamps, and added derived features (e.g., weekday, hour).
- Foundational Predictive Modeling:
  - Used Python's `scikit-learn` library for linear regression to predict hourly rentals by station, weekday, and time.
- Interactive Mapping:
  - Created an interactive map using Python's `folium` library to visualize station activity, complete with color-coded markers and popups.
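
The cleaning and modeling steps listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the trip data is synthetic, the column names `started_at` and `start_station_id` are assumptions about the trip export format, and `hourly_rentals` is a hypothetical helper introduced only for this example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for the trip data (column names are assumptions).
rng = np.random.default_rng(0)
n_trips = 5000
trips = pd.DataFrame({
    "started_at": pd.Timestamp("2024-06-01")
    + pd.to_timedelta(rng.integers(0, 14 * 24 * 60, n_trips), unit="min"),
    "start_station_id": rng.choice([101, 102, 103], n_trips),
})
# Blank out a few timestamps to mimic missing values in raw data.
trips.loc[trips.sample(50, random_state=0).index, "started_at"] = pd.NaT


def hourly_rentals(trips: pd.DataFrame) -> pd.DataFrame:
    """Aggregate individual trips into rentals per station and hour."""
    df = trips.copy()
    # Parse timestamps consistently; unparseable values become NaT.
    df["started_at"] = pd.to_datetime(df["started_at"], errors="coerce")
    # Drop rows that lack a start time or a station.
    df = df.dropna(subset=["started_at", "start_station_id"])
    # Truncate each trip's start time to the beginning of its hour.
    df["hour_start"] = df["started_at"].dt.floor("h")
    counts = (
        df.groupby(["start_station_id", "hour_start"])
        .size()
        .rename("rentals")
        .reset_index()
    )
    # Derived features used by the model below.
    counts["weekday"] = counts["hour_start"].dt.day_name()
    counts["hour"] = counts["hour_start"].dt.hour
    return counts


hourly = hourly_rentals(trips)

# Treat station, weekday, and hour as categorical predictors.
features = ["start_station_id", "weekday", "hour"]
encoder = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), features)]
)
model = make_pipeline(encoder, LinearRegression())
model.fit(hourly[features], hourly["rentals"])

hourly["predicted"] = model.predict(hourly[features])
print(hourly.head())
```

Note that this aggregation only produces rows for hours with at least one rental; hours with zero rentals would have to be added explicitly (for example by reindexing) if the model should learn from them. One-hot encoding the hour lets the linear model fit a separate effect per hour rather than forcing a straight-line trend across the day.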
- Integration of weather data to evaluate its influence on bike rentals.
- Advanced predictive models leveraging machine learning techniques.
- Comparative analysis of model performances.
Stay tuned as we transition from foundational techniques to advanced predictive analytics in Python!
- Python version >= 3.8
- Required libraries: `pandas`, `numpy`, `matplotlib`, `seaborn`, `scikit-learn`, `folium`, etc.
The initial problem framing and directions for this project are inspired by the BAN400 course at the Norwegian School of Economics (NHH). To my knowledge, the earliest solution to this exam available on GitHub is:
- Hoa Nguyen’s Bergen Bysykkel Analysis, which utilized 2021 data and was implemented in R.
Building on these foundations, this project transitions the analysis to Python, leveraging tools like `pandas`, `scikit-learn`, and `folium` to provide an enhanced and updated perspective using 2024 data.
A notable feature in this implementation is the use of Python's `folium` library for geospatial visualization. This approach mirrors the functionality of the `leaflet` package in R, leveraging free OpenStreetMap data to create an interactive mapping experience, complete with zoom controls, hover effects, and categorical markers.
- This project references the BAN400 preparation material as a learning guide while ensuring originality in its implementation.
- All work is aligned with NHH’s academic integrity guidelines and complies with the Bergen Bysykkel API's open-source licensing.