Project 3 (Summer 2020)

For full instructions on how to complete the projects for this class, see the full project description. The process as a whole is the same for each of the three projects: what differs is the type of problems your agent will be solving.

For Project 3, your agent will be solving Problem Sets D & E.

Your grade will be based on three components: your agent’s performance (30%), your revision and reflection process (20%), and your project journal (50%).

Performance (30%)

Upon submission, your agent will be tested against Problem Sets D and E’s Basic, Test, Challenge, and Raven’s problems. Only its performance on the Basic and Test problems will impact your grade.

Each of those four sets (Basic D, Basic E, Test D, and Test E) is worth one-fourth of the performance grade (7.5% of the project each). For each set, you will receive full credit if your agent correctly answers at least 7 out of 12 problems on that set. You will receive no credit if your agent correctly answers fewer than 7 problems on the set. That means that your agent should answer at least 7 out of 12 Basic D problems, 7 out of 12 Basic E problems, 7 out of 12 Test D problems, and 7 out of 12 Test E problems correctly to get the full 30%.

In determining your Performance score, we will take your best submission for each set. Your best scores for each set do not have to come on the same submission; we will use your best submission per set.
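
The scoring rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the autograder's actual code: the function name `performance_score`, the dictionary layout, and the sample scores are all hypothetical.

```python
# Illustrative sketch only: names and data layout are assumptions,
# not the autograder's real implementation.
SETS = ("Basic D", "Basic E", "Test D", "Test E")
PASS_THRESHOLD = 7      # correct answers (out of 12) needed for credit on a set
POINTS_PER_SET = 7.5    # each set is one-fourth of the 30% performance grade


def performance_score(submissions):
    """Return the performance grade (0 to 30) from a list of submissions.

    Each submission maps a set name to the number of problems answered
    correctly on that set. The best result per set is used, and the best
    results may come from different submissions.
    """
    total = 0.0
    for set_name in SETS:
        best = max((s.get(set_name, 0) for s in submissions), default=0)
        if best >= PASS_THRESHOLD:
            total += POINTS_PER_SET
    return total


# Hypothetical scores from two submissions.
example = [
    {"Basic D": 8, "Basic E": 5, "Test D": 7, "Test E": 4},
    {"Basic D": 6, "Basic E": 9, "Test D": 6, "Test E": 6},
]
print(performance_score(example))  # 22.5
```

In this made-up example, neither submission alone passes more than two sets, but taking the best result per set gives credit for Basic D, Basic E, and Test D, for 22.5 of the 30 points.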

Good performance on the test itself is not the be-all, end-all goal of the project, but we do believe that building an agent that performs at least adequately is a key part of accomplishing the project’s learning goals. This policy is specifically designed to incentivize designing an agent that does decently well, but also still allow room for creative approaches that may not be able to achieve the absolute best possible score.

Make sure to submit with the -error-check-3 flag before submitting to the full autograder to ensure that your code will run error-free. Otherwise, you may lose an attempt.

Revise & Reflect (20%)

Unlike other projects in other classes, this project is not solely focused on the final deliverable. Instead, we are explicitly focused on the revision and reflection process. We expect everyone to develop their agent, regularly submit it for testing against the autograder, reflect on the results, and revise their agent to attempt to perform better. You should use the Basic and Challenge problems for rapid feedback and improvement on your local computer, but you should regularly submit against the autograder as well to see your performance on the Test and Raven’s problems. You have up to 10 attempts against the full autograder.

From everyone, we want to see one of three things:

- Your agent reaches perfect performance.
- A few (4-5 or more) submissions where your agent generally is getting better and better, although never reaching perfection.
- Several (7-8 or more) submissions where you are clearly attempting to improve your agent, even if you are not successful.

Note that although those sound somewhat objective, there is some subjectivity here. Submitting your agent five times within a couple of hours with slight tweaks to each submission is clearly not as good as submitting on each of five consecutive days, making more substantive improvements each time. We will look at the rate of submission, the amount of revision, and the improvement to performance to assign your Revise & Reflect grade. Note that the project journal will also ask you to write a brief reflection on each individual submission: if you are able to write something substantive and interesting on each submission to the autograder, you are likely on the right track.

Note that if we observe you resubmitting the same code several times with no notable revision simply to check off the submission counts above, you will receive 0 for this section and may be further penalized on the Journal (as you could not honestly write about resubmitting the same code several times). (We do, however, differentiate this from the occasional resubmission, especially if your code is non-deterministic; but your journal would reflect that rationale as well.)

Revise & Reflect data will be pulled from the autograder; you do not need to submit anything to Canvas for this.

Journal (50%)

For the journal, you will write a personal journal on your process of constructing the agent. This journal should follow this structure, with each section and journal entry clearly marked:

Introduction: A ~1 page introduction to your overall idea for addressing the problem. Ideally, you’ll write this prior to even beginning implementation. This should include a reference to Projects 1 and 2: are you building on what you did before, or are you starting from scratch on a new approach?
Journal Entries: For each submission to the autograder, write a short reflection. In the reflection, you should address the following questions:
- When was this submission sent in?
- What did you change for this version? Why?
- How would you compare this version of the agent to the way you feel you, a human, approach the problems? Does it think similarly to how you think, or differently?
- How did it perform? What problems or types of problems did it do well on? Where did it struggle? How is its efficiency?
Conclusion: A ~1 page summary of your final agent. In the conclusion, you should address the following questions:
- How would you characterize the overall process of designing your agent? Trial-and-error? Deliberate improvement? Targeting one type of problem at a time?
- How similar do you feel your final agent is to how you, a human, would approach the test? Why or why not?
- What improvements would you make if you had more time and/or more computational resources?

The maximum length of your journal is based on your number of submissions to the autograder. For each submission, you may have 1 page in your reflection, plus 3 additional pages: one for the Introduction, one for the Conclusion, and one to give you a little flexibility. If you include any citations, your references section does not count against your length limit, but any figures or diagrams do.

For example, if you submit to the autograder 5 times, your journal may be up to 8 pages. If you submit to the autograder 10 times, your journal may be up to 13 pages. Note that we only enforce the overall length limit, not a per-section length limit. We expect, for example, that you will probably write more for your first submission than for any subsequent submission. We would expect that for most people, the journal entry for the first submission would be 2-3 pages, and each subsequent one may be less than a page.
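
The length limit is simple arithmetic, sketched here in Python for quick checking. The function name `max_journal_pages` is hypothetical and used for illustration only.

```python
# Illustrative sketch only: the function name is an assumption.
EXTRA_PAGES = 3  # Introduction + Conclusion + one page of flexibility


def max_journal_pages(num_submissions):
    """Return the maximum journal length in pages (references excluded).

    One page is allowed per autograder submission, plus three extra pages.
    """
    if num_submissions < 0:
        raise ValueError("number of submissions cannot be negative")
    return num_submissions + EXTRA_PAGES


print(max_journal_pages(5))   # 8
print(max_journal_pages(10))  # 13
```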

Your journal must be written in JDF format. Any content beyond the length corresponding to your number of submissions will not be considered for a grade.

If you would like to include additional information beyond the length limit, you may include it in clearly-marked appendices. These materials will not be used in grading your assignment, but they may help you get better feedback from your classmates and grader.

Submission Instructions

Your actual code should be submitted directly to the autograder according to the full project directions. Complete your project journal using JDF, then save your submission as a PDF. Journals should be submitted to the corresponding assignment submission page in Canvas. You should submit a single PDF for this assignment. This PDF will be ported over to Peer Feedback for peer review by your classmates. If your assignment involves things (like videos, working prototypes, etc.) that cannot be provided in PDF, you should provide them separately (through OneDrive, Google Drive, Dropbox, etc.) and submit a PDF that links to or otherwise describes how to access that material.

This is an individual assignment. All work you submit should be your own. Make sure to cite any sources you reference, and use quotes and in-line citations to mark any direct quotes.

Late work is not accepted without advance agreement except in cases of medical or family emergencies. In the case of such an emergency, please contact the Dean of Students.

Grading Information

Your overall project grade will be posted to the gradebook in Canvas, along with a comment indicating the breakdown of scores across these three categories. Students whose agents perform in the top 10 in the class (Basic D + Basic E + Test D + Test E, ties broken by Raven’s and Challenge problems) will receive 5 extra points on their final average.

Peer Review

After submission, your assignment will be ported to Peer Feedback for review by your classmates. Grading is not the primary function of this peer review process; the primary function is simply to give you the opportunity to read and comment on your classmates’ ideas, and receive additional feedback on your own. All grades will come from the graders alone.

You receive 1.5 participation points for completing a peer review by the end of the day Thursday; 1.0 for completing a peer review by the end of the day Sunday; and 0.5 for completing it after Sunday but before the end of the semester. For more details, see the participation policy.