This is an example repo that uses an Airbnb dataset with a machine learning algorithm to predict rent prices.


Airbnb Machine Learning Workshop Repository

This repository contains the files and source code for the Ruby Unconference 2017 Airbnb Workshop. Follow the steps below to configure the Python environment, then use the interactive Jupyter notebooks to review the workshop.

Setup

  • Summary: To execute the Jupyter notebooks you need a Python distribution with all the necessary packages (e.g. IPython, NumPy, scikit-learn). The Anaconda distribution already includes these packages, so installing it is the simplest route. For installation instructions, visit Anaconda.

Setting up your Python environment

Note: If you have already configured a Python development environment and are comfortable with it, you can proceed directly to installing the dependencies and running the notebook.

These instructions were created to install an isolated environment so that it does not interfere with any work you may have.

(option 1) Using pyenv and virtualenv (lighter)

Installing pyenv from github

This will get you going with the latest version of pyenv and make it easy to fork and contribute any changes back upstream.

  1. Check out pyenv where you want it installed. A good place to choose is $HOME/.pyenv (but you can install it somewhere else).

     $ git clone https://github.com/pyenv/pyenv.git ~/.pyenv
    
  2. Define environment variable PYENV_ROOT to point to the path where pyenv repo is cloned and add $PYENV_ROOT/bin to your $PATH for access to the pyenv command-line utility.

    $ echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bash_profile
    $ echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bash_profile

    Zsh note: Modify your ~/.zshenv file instead of ~/.bash_profile.
    Ubuntu and Fedora note: Modify your ~/.bashrc file instead of ~/.bash_profile.
    Proxy note: If you use a proxy, export http_proxy and HTTPS_PROXY too.

  3. Add pyenv init to your shell to enable shims and autocompletion. Please make sure eval "$(pyenv init -)" is placed toward the end of the shell configuration file since it manipulates PATH during the initialization.

    $ echo -e 'if command -v pyenv 1>/dev/null 2>&1; then\n  eval "$(pyenv init -)"\nfi' >> ~/.bash_profile

    Zsh note: Modify your ~/.zshenv file instead of ~/.bash_profile.
    Ubuntu and Fedora note: Modify your ~/.bashrc file instead of ~/.bash_profile.

  4. Restart your shell so the path changes take effect. You can now begin using pyenv.

    $ exec "$SHELL"

  5. Install Python versions into $(pyenv root)/versions. For example, to download and install Python 2.7.13, run:

    $ pyenv install 2.7.13

  6. Install virtualenv.

    $ pip install virtualenv

  7. Clone this repository.

    $ git clone git@github.com:codescrum/airbnb-ml-workshop.git

  8. Go into the project directory.

    $ cd airbnb-ml-workshop

  9. Create a virtualenv environment inside the project folder.

    $ virtualenv .

  10. Activate the virtualenv environment in the project folder.

    $ source bin/activate

  11. Once the environment is active, install the dependencies into it.

    $ pip install -r requirements.txt
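To confirm that the environment created and activated in steps 9 and 10 is the one actually in use, a small check like the following may help. It is a minimal sketch, not part of the workshop code: the helper name in_virtualenv is hypothetical, and the check assumes that classic virtualenv releases set sys.real_prefix while newer releases leave sys.base_prefix pointing at the original installation.

```python
from __future__ import print_function  # keeps print() consistent on Python 2.7

import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    # Classic virtualenv releases set sys.real_prefix inside the environment.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv releases (and the stdlib venv module) keep
    # sys.base_prefix at the original installation, so it differs from
    # sys.prefix only inside an environment.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtualenv())
```

Run it with the python on your PATH after source bin/activate. If it reports False, the shell is probably still using the system interpreter.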

(option 2) Using Anaconda

Note: If you already have Anaconda or another Python distribution installed, use it. Otherwise:

  • Install Anaconda according to its installation instructions.
  • [Optional] Run pip install -r anaconda-requirements.txt to install the required packages manually.

Running the jupyter notebook

  • Download the dataset from insideairbnb (link) to the data directory.
  • Extract the downloaded gzip files in the data directory: gunzip *.gz
  • Run the Jupyter notebook inside the notebooks directory: jupyter-notebook notebooks/ML_Workshop.ipynb
  • If you want to open the notebook from a different device on the same network, run jupyter notebook list in the terminal. It prints a URL (e.g. http://IP_ADDRESS:PORT/?token=xxxxxxxxxxxxxxxxxxxxxxx) that you can open in that device's browser, where you can then select the notebook you need.
  • If you wish to use the dark Jupyter notebook theme (possibly a bit easier on your eyes), run the following command: jt -t chesterish -T (the -T option keeps the toolbar visible, which is useful to have)
  • To go back to the default Jupyter notebook theme, run jt -r, which removes the customizations.
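The extraction step above can also be done from Python, for example on a machine without the gunzip command. The following is a minimal sketch under stated assumptions: the downloaded files are gzip archives with a .gz suffix, they sit in a directory named data, and the function name extract_gz_files is hypothetical. Unlike plain gunzip, it keeps the original archives next to the extracted files.

```python
import glob
import gzip
import os
import shutil


def extract_gz_files(data_dir="data"):
    """Decompress every *.gz file in data_dir and return the new file paths.

    Hypothetical helper for illustration; the original archives are kept.
    """
    extracted = []
    for gz_path in sorted(glob.glob(os.path.join(data_dir, "*.gz"))):
        # Drop the ".gz" suffix, e.g. "name.csv.gz" becomes "name.csv".
        out_path = gz_path[:-len(".gz")]
        # Stream the data so large files are never held in memory at once.
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        extracted.append(out_path)
    return extracted


if __name__ == "__main__":
    for path in extract_gz_files("data"):
        print("extracted", path)
```

Run it from the project directory so that the relative data path resolves to the folder that holds the downloads.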

Dependencies used

  • Python 2.7.13
  • jupyter ~=1.0.0
  • jupyterthemes ~=0.18.0
  • numpy ~=1.12.1
  • pandas ~=0.20.1
  • scikit-learn ~=0.18.1
  • seaborn ~=0.8
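The ~= operator in the list above is the "compatible release" specifier: ~=1.12.1 accepts any 1.12.x release at or above 1.12.1, and ~=0.8 accepts any 0.x release at or above 0.8. The sketch below illustrates that rule for plain numeric versions only. The names parse_version, is_compatible and REQUIRED are hypothetical, and the code is a simplification of what pip does (it does not handle pre-releases or suffixes such as rc1).

```python
# Pins copied from the dependency list above (the "~=" prefix is implied).
REQUIRED = {
    "jupyter": "1.0.0",
    "jupyterthemes": "0.18.0",
    "numpy": "1.12.1",
    "pandas": "0.20.1",
    "scikit-learn": "0.18.1",
    "seaborn": "0.8",
}


def parse_version(text):
    """Turn '1.12.1' into (1, 12, 1); only plain numeric versions are handled."""
    return tuple(int(part) for part in text.split("."))


def _pad(version, width):
    """Right-pad a version tuple with zeros so that 0.8 compares equal to 0.8.0."""
    return version + (0,) * (width - len(version))


def is_compatible(installed, spec):
    """Check an installed version against a '~=spec' compatible-release pin."""
    wanted = parse_version(spec)
    have = parse_version(installed)
    if len(wanted) < 2:
        raise ValueError("~= needs at least two version components")
    prefix = wanted[:-1]  # every component except the last must match exactly
    width = max(len(have), len(wanted))
    have, wanted = _pad(have, width), _pad(wanted, width)
    return have >= wanted and have[:len(prefix)] == prefix


if __name__ == "__main__":
    # The installed version below is made up for illustration.
    print(is_compatible("1.12.3", REQUIRED["numpy"]))
```

In practice pip enforces these pins itself when it reads the requirements file; the sketch only shows which versions a pin admits.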

Who do I talk to?

  • Mail to: jairo.diaz |at| codescrum.com, miguel.diaz |at| codescrum.com or milton.arango |at| codescrum.com
  • By Codescrum
