ML4T Local Environment

Attention

STARTING IN FALL 2019, THIS COURSE USES PYTHON 3.6. MAKE CAREFUL NOTE OF THIS AND DO NOT FALL BACK ON OLD WIKI PAGES FOR PROJECT TEMPLATES AND ENVIRONMENT CONFIGURATION INSTRUCTIONS.

The information on this page is for those who are interested in having a Python development environment on their own machine. Keep in mind that even if you set up your own environment, your code still MUST run correctly on Gradescope. Please see ML4T_Software_Setup for information on how to run your code and how to check out the code scaffolding for the projects.

Overview

Important notes

  • We use a specific, static dataset for this course, which we will provide. If you download your own data from Yahoo (or elsewhere), you will get wrong answers on assignments.
  • While these instructions should work on Windows, macOS, or Linux, we strongly recommend developing on a recent version of Ubuntu LTS, as there may be significant differences on Windows. You can easily create an Ubuntu-based virtual machine image to develop on using any freely available VM software (VirtualBox is probably the easiest and cheapest).
  • Regardless of your OS and install method, it has to work on Gradescope. If your code fails to run on Gradescope, “it works on my machine” is not a valid excuse, and you will receive no credit.

The assignments in this class are in Python (version 3.6) and rely heavily on a few important libraries. These libraries are under active development, which unfortunately means there can be compatibility issues between versions. This isn’t an issue when using Gradescope, as it is set up with compatible versions of each library, but if you want to work on your local machine it is very important to make sure you have exactly the same library versions. To that end, here is a list of each library and its version number, provided in the conda environment format:

name: ml4t
dependencies:
- python=3.6
- cycler=0.10.0
- kiwisolver=1.1.0
- matplotlib=3.0.3
- numpy=1.16.3
- pandas=0.24.2
- pyparsing=2.4.0
- python-dateutil=2.8.0
- pytz=2019.1
- scipy=1.2.1
- seaborn=0.9.0
- six=1.12.0
- joblib=0.13.2
- pytest=5.0
- future=0.17.1
- pip
- pip:
  - pprofile==2.0.2
  - jsons==0.8.8

If you are familiar with conda, you can use this file to create an environment for this class that matches those version numbers. Here is an outline:

  1. Install miniconda/anaconda (if it is not already installed). Save the above yml fragment as environment.yml.
  2. Create an environment for this class:

conda env create --file environment.yml

  3. Activate the new environment:

conda activate ml4t
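
Once the environment is active, it can help to confirm that the main libraries really are at the pinned versions. The following is a minimal sketch, not an official course tool: the helper names (installed_version, find_mismatches) are hypothetical, only a handful of the libraries from the list are checked, and it assumes each library exposes a __version__ attribute. A pin such as 5.0 is treated as matching 5.0.x, which is an assumption about how the pin should be read.

```python
import importlib

# Import names and pinned versions copied from the environment file above.
# Note that python-dateutil is imported as "dateutil".
EXPECTED = {
    "numpy": "1.16.3",
    "pandas": "0.24.2",
    "matplotlib": "3.0.3",
    "scipy": "1.2.1",
    "seaborn": "0.9.0",
    "dateutil": "2.8.0",
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it is unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(expected):
    """Return {name: (wanted, found)} for every library that does not match."""
    mismatches = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        # Assumption: a pin like "5.0" also accepts "5.0.1", "5.0.2", ...
        matches = found is not None and (
            found == wanted or found.startswith(wanted + ".")
        )
        if not matches:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(EXPECTED)
    for name, (wanted, found) in sorted(problems.items()):
        print("{}: expected {}, found {}".format(name, wanted, found))
    if not problems:
        print("All checked libraries match the pinned versions.")
```

Save this under any file name and run it with the ml4t environment active. A library reported with "found None" is either not installed or does not expose __version__. Passing this check does not replace testing on Gradescope.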

Matplotlib on Mac

If you are using a Mac and get an exception with a stack trace that mentions libtk and tkinter when attempting to plot charts, try the following to change the backend:

$ mkdir -p ~/.matplotlib
$ echo "backend: TkAgg" > ~/.matplotlib/matplotlibrc
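
To confirm that the setting was written as intended, you can read it back. This is a small sketch under stated assumptions: read_backend_setting is a hypothetical helper, and it only handles the simple "key: value" lines (with optional # comments) used in the file created above.

```python
import os


def read_backend_setting(rc_path):
    """Return the value of the 'backend' key in a matplotlibrc file, or None."""
    if not os.path.isfile(rc_path):
        return None
    with open(rc_path) as rc_file:
        for line in rc_file:
            # Drop trailing comments, then look for a "key: value" pair.
            setting = line.split("#", 1)[0].strip()
            if ":" not in setting:
                continue
            key, value = setting.split(":", 1)
            if key.strip() == "backend":
                return value.strip()
    return None


if __name__ == "__main__":
    # Same path as in the shell commands above.
    rc_path = os.path.expanduser("~/.matplotlib/matplotlibrc")
    print("backend in {}: {}".format(rc_path, read_backend_setting(rc_path)))
```

If the commands above were run, this should report TkAgg. Inside the ml4t environment, matplotlib.get_backend() reports the backend Matplotlib has actually selected.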

Optional software