{"id":2461,"date":"2022-01-09T18:08:48","date_gmt":"2022-01-09T18:08:48","guid":{"rendered":"https:\/\/lucylabs.gatech.edu\/ml4t\/project-4-documentation\/"},"modified":"2022-01-10T11:53:00","modified_gmt":"2022-01-10T11:53:00","slug":"project-4-documentation","status":"publish","type":"page","link":"https:\/\/lucylabs.gatech.edu\/ml4t\/spring2022\/project-4-documentation\/","title":{"rendered":"Project 4 Documentation"},"content":{"rendered":"<p>[et_pb_section fb_built=&#8221;1&#8243; _builder_version=&#8221;3.22&#8243; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;][et_pb_row _builder_version=&#8221;3.25&#8243; background_size=&#8221;initial&#8221; background_position=&#8221;top_left&#8221; background_repeat=&#8221;repeat&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;][et_pb_column type=&#8221;4_4&#8243; _builder_version=&#8221;3.25&#8243; custom_padding=&#8221;|||&#8221; global_colors_info=&#8221;{}&#8221; custom_padding__hover=&#8221;|||&#8221; theme_builder_area=&#8221;post_content&#8221;][et_pb_text _builder_version=&#8221;4.5.6&#8243; background_size=&#8221;initial&#8221; background_position=&#8221;top_left&#8221; background_repeat=&#8221;repeat&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;]<\/p>\n<div class=\"document\">\n<div class=\"documentwrapper\">\n<div class=\"bodywrapper\">\n<div class=\"body\" role=\"main\">\n<div class=\"section\" id=\"module-DTLearner\">\n<h1 style=\"text-align: center;\">Project 4: Defeat Learners<\/h1>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<h2><span style=\"text-decoration: underline;\">DTLearner.py<\/span><\/h2>\n<p>&nbsp;<\/p>\n<dl class=\"class\">\n<dt id=\"DTLearner.DTLearner\"><em class=\"property\">class <\/em><code class=\"sig-prename descclassname\">DTLearner.<\/code><code class=\"sig-name descname\">DTLearner<\/code><span class=\"sig-paren\">(<\/span><em class=\"sig-param\">leaf_size=1<\/em>, <em 
class=\"sig-param\">verbose=False<\/em><span class=\"sig-paren\">)<\/span><\/dt>\n<dd>\n<p>This is a decision tree learner object that is implemented incorrectly. You should replace this DTLearner with your own correct DTLearner from Project 3.<\/p>\n<dl class=\"field-list simple\">\n<dt class=\"field-odd\">Parameters<\/dt>\n<dd class=\"field-odd\">\n<ul class=\"simple\">\n<li><strong>leaf_size<\/strong> (<em>int<\/em>) \u2013 The maximum number of samples to be aggregated at a leaf. Defaults to 1.<\/li>\n<li><strong>verbose<\/strong> (<em>bool<\/em>) \u2013 If \u201cverbose\u201d is True, your code can print out information for debugging.<br \/> If \u201cverbose\u201d is False, your code should not generate ANY output. When we test your code, verbose will be False.<\/li>\n<\/ul>\n<\/dd>\n<\/dl>\n<dl class=\"method\">\n<dt id=\"DTLearner.DTLearner.add_evidence\"><code class=\"sig-name descname\">add_evidence<\/code><span class=\"sig-paren\">(<\/span><em class=\"sig-param\">data_x<\/em>, <em class=\"sig-param\">data_y<\/em><span class=\"sig-paren\">)<\/span><\/dt>\n<dd>\n<p>Add training data to the learner.<\/p>\n<dl class=\"field-list simple\">\n<dt class=\"field-odd\">Parameters<\/dt>\n<dd class=\"field-odd\">\n<ul class=\"simple\">\n<li><strong>data_x<\/strong> (<em>numpy.ndarray<\/em>) \u2013 A set of feature values used to train the learner<\/li>\n<li><strong>data_y<\/strong> (<em>numpy.ndarray<\/em>) \u2013 The value we are attempting to predict given the X data<\/li>\n<\/ul>\n<\/dd>\n<\/dl>\n<\/dd>\n<\/dl>\n<dl class=\"method\">\n<dt id=\"DTLearner.DTLearner.author\"><code class=\"sig-name descname\">author<\/code><span class=\"sig-paren\">(<\/span><span class=\"sig-paren\">)<\/span><\/dt>\n<dd>\n<dl class=\"field-list simple\">\n<dt class=\"field-odd\">Returns<\/dt>\n<dd class=\"field-odd\">\n<p>The GT username of the student<\/p>\n<\/dd>\n<dt class=\"field-even\">Return type<\/dt>\n<dd class=\"field-even\">\n<p>str<\/p>\n<\/dd>\n<\/dl>\n<\/dd>\n<\/dl>\n<dl 
class=\"method\">\n<dt id=\"DTLearner.DTLearner.query\"><code class=\"sig-name descname\">query<\/code><span class=\"sig-paren\">(<\/span><em class=\"sig-param\">points<\/em><span class=\"sig-paren\">)<\/span><\/dt>\n<dd>\n<p>Estimate a set of test points given the model we built.<\/p>\n<dl class=\"field-list simple\">\n<dt class=\"field-odd\">Parameters<\/dt>\n<dd class=\"field-odd\">\n<p><strong>points<\/strong> (<em>numpy.ndarray<\/em>) \u2013 A numpy array with each row corresponding to a specific query.<\/p>\n<\/dd>\n<dt class=\"field-even\">Returns<\/dt>\n<dd class=\"field-even\">\n<p>The predicted result of the input data according to the trained model<\/p>\n<\/dd>\n<dt class=\"field-odd\">Return type<\/dt>\n<dd class=\"field-odd\">\n<p>numpy.ndarray<\/p>\n<\/dd>\n<\/dl>\n<\/dd>\n<\/dl>\n<\/dd>\n<\/dl>\n<p><span class=\"target\" id=\"module-gen_data\"><\/span><\/p>\n<p><span class=\"target\"><\/span><\/p>\n<h2><span style=\"text-decoration: underline;\"><span class=\"target\">gen_data.py<\/span><\/span><\/h2>\n<p><span class=\"target\"><\/span><\/p>\n<dl class=\"function\">\n<dt id=\"gen_data.author\"><code class=\"sig-name descname\">author<\/code><span class=\"sig-paren\">(<\/span><span class=\"sig-paren\">)<\/span><\/dt>\n<dd>\n<dl class=\"field-list simple\">\n<dt class=\"field-odd\">Returns<\/dt>\n<dd class=\"field-odd\">\n<p>The GT username of the student<\/p>\n<\/dd>\n<dt class=\"field-even\">Return type<\/dt>\n<dd class=\"field-even\">\n<p>str<\/p>\n<\/dd>\n<\/dl>\n<\/dd>\n<\/dl>\n<dl class=\"function\">\n<dt id=\"gen_data.best_4_dt\"><code class=\"sig-name descname\">best_4_dt<\/code><span class=\"sig-paren\">(<\/span><em class=\"sig-param\">seed=1489683273<\/em><span class=\"sig-paren\">)<\/span><\/dt>\n<dd>Returns data that performs significantly better with DTLearner than LinRegLearner.<br \/> The data set should include from 2 to 10 columns in X, and one column in Y.<br \/> The data should contain from 10 (minimum) to 1000 (maximum) 
rows.<\/dd>\n<\/dl>\n<dl class=\"field-list simple\">\n<dt class=\"field-odd\">Parameters<\/dt>\n<dd class=\"field-odd\">\n<p><strong>seed<\/strong> (<em>int<\/em>) \u2013 The random seed for your data generation.<\/p>\n<\/dd>\n<dt class=\"field-even\">Returns<\/dt>\n<dd class=\"field-even\">\n<p>Data on which DTLearner performs significantly better than LinRegLearner.<\/p>\n<\/dd>\n<dt class=\"field-odd\">Return type<\/dt>\n<dd class=\"field-odd\">\n<p>numpy.ndarray<\/p>\n<\/dd>\n<\/dl>\n<dl class=\"function\">\n<dt id=\"gen_data.best_4_lin_reg\"><code class=\"sig-name descname\">best_4_lin_reg<\/code><span class=\"sig-paren\">(<\/span><em class=\"sig-param\">seed=1489683273<\/em><span class=\"sig-paren\">)<\/span><\/dt>\n<dd>\n<p>Returns data that performs significantly better with LinRegLearner than DTLearner.<br \/> The data set should include from 2 to 10 columns in X, and one column in Y.<br \/> The data should contain from 10 (minimum) to 1000 (maximum) rows.<\/p>\n<dl class=\"field-list simple\">\n<dt class=\"field-odd\">Parameters<\/dt>\n<dd class=\"field-odd\">\n<p><strong>seed<\/strong> (<em>int<\/em>) \u2013 The random seed for your data generation.<\/p>\n<\/dd>\n<dt class=\"field-even\">Returns<\/dt>\n<dd class=\"field-even\">\n<p>Data on which LinRegLearner performs significantly better than DTLearner.<\/p>\n<\/dd>\n<dt class=\"field-odd\">Return type<\/dt>\n<dd class=\"field-odd\">\n<p>numpy.ndarray<\/p>\n<\/dd>\n<\/dl>\n<\/dd>\n<\/dl>\n<p><span class=\"target\" id=\"module-LinRegLearner\"><\/span><\/p>\n<p><span class=\"target\"><\/span><\/p>\n<p><span class=\"target\"><\/span><\/p>\n<h2><span style=\"text-decoration: underline;\"><span class=\"target\">LinRegLearner.py<\/span><\/span><\/h2>\n<p><span class=\"target\"><\/span><\/p>\n<dl class=\"class\">\n<dt id=\"LinRegLearner.LinRegLearner\"><em class=\"property\">class <\/em><code class=\"sig-prename descclassname\">LinRegLearner.<\/code><code class=\"sig-name 
descname\">LinRegLearner<\/code><span class=\"sig-paren\">(<\/span><em class=\"sig-param\">verbose=False<\/em><span class=\"sig-paren\">)<\/span><\/dt>\n<dd>\n<p>This is a linear regression learner. It is implemented correctly.<\/p>\n<dl class=\"field-list simple\">\n<dt class=\"field-odd\">Parameters<\/dt>\n<dd class=\"field-odd\"><strong>verbose<\/strong> (<em>bool<\/em>) \u2013 If \u201cverbose\u201d is True, your code can print out information for debugging.<br \/> If \u201cverbose\u201d is False, your code should not generate ANY output. When we test your code, verbose will be False.<\/dd>\n<\/dl>\n<\/dd>\n<\/dl>\n<dl class=\"method\">\n<dt id=\"LinRegLearner.LinRegLearner.add_evidence\"><code class=\"sig-name descname\">add_evidence<\/code><span class=\"sig-paren\">(<\/span><em class=\"sig-param\">data_x<\/em>, <em class=\"sig-param\">data_y<\/em><span class=\"sig-paren\">)<\/span><\/dt>\n<dd>\n<p>Add training data to the learner.<\/p>\n<dl class=\"field-list simple\">\n<dt class=\"field-odd\">Parameters<\/dt>\n<dd class=\"field-odd\">\n<ul class=\"simple\">\n<li><strong>data_x<\/strong> (<em>numpy.ndarray<\/em>) \u2013 A set of feature values used to train the learner<\/li>\n<li><strong>data_y<\/strong> (<em>numpy.ndarray<\/em>) \u2013 The value we are attempting to predict given the X data<\/li>\n<\/ul>\n<\/dd>\n<\/dl>\n<\/dd>\n<\/dl>\n<dl class=\"method\">\n<dt id=\"LinRegLearner.LinRegLearner.author\"><code class=\"sig-name descname\">author<\/code><span class=\"sig-paren\">(<\/span><span class=\"sig-paren\">)<\/span><\/dt>\n<dd>\n<dl class=\"field-list simple\">\n<dt class=\"field-odd\">Returns<\/dt>\n<dd class=\"field-odd\">\n<p>The GT username of the student<\/p>\n<\/dd>\n<dt class=\"field-even\">Return type<\/dt>\n<dd class=\"field-even\">\n<p>str<\/p>\n<\/dd>\n<\/dl>\n<\/dd>\n<\/dl>\n<dl class=\"method\">\n<dt id=\"LinRegLearner.LinRegLearner.query\"><code class=\"sig-name descname\">query<\/code><span class=\"sig-paren\">(<\/span><em 
class=\"sig-param\">points<\/em><span class=\"sig-paren\">)<\/span><\/dt>\n<dd>\n<p>Estimate a set of test points given the model we built.<\/p>\n<dl class=\"field-list simple\">\n<dt class=\"field-odd\">Parameters<\/dt>\n<dd class=\"field-odd\">\n<p><strong>points<\/strong> (<em>numpy.ndarray<\/em>) \u2013 A numpy array with each row corresponding to a specific query.<\/p>\n<\/dd>\n<dt class=\"field-even\">Returns<\/dt>\n<dd class=\"field-even\">\n<p>The predicted result of the input data according to the trained model<\/p>\n<\/dd>\n<dt class=\"field-odd\">Return type<\/dt>\n<dd class=\"field-odd\">\n<p>numpy.ndarray<\/p>\n<\/dd>\n<\/dl>\n<\/dd>\n<\/dl>\n<p><span class=\"target\" id=\"module-testbest4\"><\/span><\/p>\n<p><span class=\"target\"><\/span><\/p>\n<h2><span style=\"text-decoration: underline;\"><span class=\"target\">testbest4.py<\/span><\/span><\/h2>\n<p><span class=\"target\"><\/span><\/p>\n<dl class=\"function\">\n<dt id=\"testbest4.compare_os_rmse\"><code class=\"sig-name descname\">compare_os_rmse<\/code><span class=\"sig-paren\">(<\/span><em class=\"sig-param\">learner1<\/em>, <em class=\"sig-param\">learner2<\/em>, <em class=\"sig-param\">x<\/em>, <em class=\"sig-param\">y<\/em><span class=\"sig-paren\">)<\/span><\/dt>\n<dd>\n<p>Compares the out-of-sample root mean squared error of your LinRegLearner and DTLearner.<\/p>\n<dl class=\"field-list simple\">\n<dt class=\"field-odd\">Parameters<\/dt>\n<dd class=\"field-odd\">\n<ul class=\"simple\">\n<li><strong>learner1<\/strong> (<em>LinRegLearner.LinRegLearner<\/em>) \u2013 An instance of LinRegLearner<\/li>\n<li><strong>learner2<\/strong> (<em>DTLearner.DTLearner<\/em>) \u2013 An instance of DTLearner<\/li>\n<li><strong>x<\/strong> (<em>numpy.ndarray<\/em>) \u2013 X data generated from either gen_data.best_4_dt or gen_data.best_4_lin_reg<\/li>\n<li><strong>y<\/strong> (<em>numpy.ndarray<\/em>) \u2013 Y data generated from either gen_data.best_4_dt or 
gen_data.best_4_lin_reg<\/li>\n<\/ul>\n<\/dd>\n<dt class=\"field-even\">Returns<\/dt>\n<dd class=\"field-even\">\n<p>The root mean squared error of each learner<\/p>\n<\/dd>\n<dt class=\"field-odd\">Return type<\/dt>\n<dd class=\"field-odd\">\n<p>tuple<\/p>\n<\/dd>\n<\/dl>\n<\/dd>\n<\/dl>\n<dl class=\"function\">\n<dt id=\"testbest4.test_code\"><code class=\"sig-name descname\">test_code<\/code><span class=\"sig-paren\">(<\/span><span class=\"sig-paren\">)<\/span><\/dt>\n<dd>\n<p>Performs a test of your code and prints the results<\/p>\n<\/dd>\n<\/dl>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p><!-- \/divi:html --><\/p>\n<p><!-- divi:html --><\/p>\n<div class=\"clearer\"><\/div>\n<p><!-- \/divi:html --><\/p>\n<p><!-- divi:html --><\/p>\n<div class=\"footer\">\u00a92020, ML4T Staff |\u00a0Powered by <a href=\"http:\/\/sphinx-doc.org\/\">Sphinx 2.2.0<\/a> &amp; <a href=\"https:\/\/github.com\/bitprophet\/alabaster\">Alabaster 0.7.12<\/a><\/div>\n<p><!-- \/divi:html --><\/p>\n<p>[\/et_pb_text][\/et_pb_column][\/et_pb_row][\/et_pb_section]<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Project 4: Defeat Learners &nbsp; &nbsp; DTLearner.py &nbsp; class DTLearner.DTLearner(leaf_size=1, verbose=False) This is a decision tree learner object that is implemented incorrectly. You should replace this DTLearner with your own correct DTLearner from Project 3. Parameters leaf_size (int) \u2013 The maximum number of samples to be aggregated at a leaf, defaults to 1. 
verbose (bool) [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"parent":2441,"menu_order":21,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_et_pb_use_builder":"on","_et_pb_old_content":"","_et_gb_content_width":"","footnotes":""},"class_list":["post-2461","page","type-page","status-publish","hentry"],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/lucylabs.gatech.edu\/ml4t\/wp-json\/wp\/v2\/pages\/2461","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/lucylabs.gatech.edu\/ml4t\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/lucylabs.gatech.edu\/ml4t\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/lucylabs.gatech.edu\/ml4t\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/lucylabs.gatech.edu\/ml4t\/wp-json\/wp\/v2\/comments?post=2461"}],"version-history":[{"count":3,"href":"https:\/\/lucylabs.gatech.edu\/ml4t\/wp-json\/wp\/v2\/pages\/2461\/revisions"}],"predecessor-version":[{"id":2541,"href":"https:\/\/lucylabs.gatech.edu\/ml4t\/wp-json\/wp\/v2\/pages\/2461\/revisions\/2541"}],"up":[{"embeddable":true,"href":"https:\/\/lucylabs.gatech.edu\/ml4t\/wp-json\/wp\/v2\/pages\/2441"}],"wp:attachment":[{"href":"https:\/\/lucylabs.gatech.edu\/ml4t\/wp-json\/wp\/v2\/media?parent=2461"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}