ML4T Software Setup

Notice

Each assignment’s wiki page links to a zip file containing the grading script and any template code or data for that assignment. A zip file containing the grading and util modules, as well as the data, is available here: ML4T_2022Spr. Instructions for running the provided test scripts are listed below.

Overview

Most of the projects in this class include a local testing script that students can use to evaluate their own work. We provide these testing scripts with the template code so that students can test their code and confirm that it is API compatible.

Important Notes

Your code MUST run properly on Gradescope. You will need to set up a local development environment with specific libraries and versions; follow the instructions here: ML4T_Local_Environment.
We use a specific, static dataset for this course, provided in the repository detailed below. If you download your own data from Yahoo (or elsewhere), you will get wrong answers on the assignments.
We reserve the right to modify the grading script while maintaining API compatibility with what is described on the project pages. This includes modifying or withholding test cases, changing point values to match the given rubric, and changing timeout limits to accommodate grading deadlines. The scripts are provided as a convenience to help students avoid common pitfalls or mistakes and are intended to be used as a sanity check. Passing all tests does not guarantee full credit on the assignment and should be considered a necessary but insufficient condition for completing an assignment.
Using github.gatech.edu to back up your work is an excellent idea which we encourage. However, make sure that you do not make your solutions to the assignments public. It’s easy to do this accidentally, so please be careful:
- Do not put your solutions in a public repository. Repositories on github.com are public by default. The Georgia Tech GitHub, github.gatech.edu, provides the same interface and allows for free private repositories for students.

Getting code templates

As of Spring 2018, code for each individual assignment is provided in a zip file linked from that assignment’s project page. The data, grading module, and util.py, which are common across all assignments, are available here: ML4T_2022Spr.zip (same file as above).

Running the grading scripts

The above zip file contains the grading scripts, data, and util.py for all assignments. Some project pages also link to a zip file containing a directory with template code. Extract that directory into the same directory that contains the data and grading directories and util.py (ML4T_2022Spr/). To complete the assignments, you’ll need to modify the templates according to the assignment description.
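The layout described above can be checked with a short script. This is a minimal sketch, not part of the course materials: the function names are hypothetical, and the entry names `data`, `grading`, and `util.py` are assumed from the description above.

```python
from pathlib import Path

# Assumed names, taken from the description above: a "data" directory,
# a "grading" directory, and util.py, all directly under the course root.
EXPECTED_ENTRIES = ("data", "grading", "util.py")


def missing_entries(course_root):
    """Return the expected entries that are absent from course_root."""
    root = Path(course_root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


def project_dirs(course_root):
    """List subdirectories that are not one of the shared directories."""
    root = Path(course_root)
    shared = {"data", "grading"}
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_dir()
        and p.name not in shared
        and not p.name.startswith((".", "__"))
    )


if __name__ == "__main__":
    root = Path("ML4T_2022Spr")  # adjust to wherever you extracted the zip
    if root.is_dir():
        print("missing:", missing_entries(root))
        print("projects:", project_dirs(root))
    else:
        print(f"{root} not found; extract the course zip file first")
```

An empty "missing" list and your project folder (for example assess_portfolio) under "projects" suggest the template was extracted into the right place.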

To test your code, you’ll need to set your PYTHONPATH to include the grading module and the utility module util.py, both of which are one directory up from the project directories. Here’s an example of how to run the grading script for the optional (deprecated) assignment Assess Portfolio (note that grade_analysis.py is included in the template zip file for Assess Portfolio):
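A minimal sketch of such a run, written in Python so that it behaves the same on any operating system. It is an illustration under stated assumptions, not the course's official command: the helper names `grader_env` and `run_grader` are hypothetical, and the script name grade_analysis.py is assumed to match the file in the Assess Portfolio template.

```python
import os
import subprocess
import sys
from pathlib import Path


def grader_env(project_dir):
    """Copy the current environment and prepend the parent dir to PYTHONPATH."""
    project = Path(project_dir).resolve()
    # The parent directory is where the grading module and util.py live,
    # one level up from the project directory.
    entries = [str(project.parent)]
    env = os.environ.copy()
    existing = env.get("PYTHONPATH")
    if existing:
        entries.append(existing)
    env["PYTHONPATH"] = os.pathsep.join(entries)
    return env


def run_grader(project_dir, script="grade_analysis.py"):
    """Run the grading script from inside project_dir; return its exit code."""
    result = subprocess.run(
        [sys.executable, script],
        cwd=project_dir,
        env=grader_env(project_dir),
    )
    return result.returncode


if __name__ == "__main__":
    # Assumes this is started from the project folder, e.g. assess_portfolio/.
    # In a POSIX shell this sketch is roughly:
    #   PYTHONPATH=.. python grade_analysis.py
    if Path("grade_analysis.py").exists():
        sys.exit(run_grader("."))
    print("grade_analysis.py not found in the current directory")
```

Copying the environment instead of replacing it keeps any PYTHONPATH entries you already rely on.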

This assumes you’re running the command from the folder ML4T_2022Spr/assess_portfolio/. The script prints a lot of information and produces two text files: points.txt and comments.txt. Scanning the full printed output will probably help you trace errors back to your code, while comments.txt contains a succinct summary of which test cases failed and the specific errors (without the backtrace). Here’s an example of the contents of comments.txt for the first assignment using the unchanged template:

The comments.txt file contains a summary of which tests passed or failed, along with any error messages. The points.txt file reports the autograder score; the teaching staff use it to automate grading of submitted code in batch runs, and students can safely ignore it.
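After a run, the two output files can be read from a script as well as opened by hand. The sketch below assumes only the file names given above and makes no assumption about their internal format; the helper name `read_grader_output` is hypothetical.

```python
from pathlib import Path


def read_grader_output(project_dir, names=("comments.txt", "points.txt")):
    """Return {filename: text} for each grader output file that exists."""
    folder = Path(project_dir)
    found = {}
    for name in names:
        path = folder / name
        if path.is_file():
            found[name] = path.read_text()
    return found


if __name__ == "__main__":
    # Assumes the grading script was run in the current directory.
    for name, text in read_grader_output(".").items():
        print(f"--- {name} ---")
        print(text.rstrip())
```

Files that do not exist are skipped, so the helper is safe to call before the grading script has been run.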