Month: February 2019

Prelab Assignment

Instructions
You must meet all base requirements in the syllabus to receive any credit.
Work in your Prelab06 directory, and do not copy any data les into that folder.
Remember to commit & push the required le(s) to GitHub periodically. We will only grade
what’s been pushed to GitHub!
Do not push to GitHub any le that is not required. You will lose points if your repository
contains more les than required!
Make sure you le compiles. You will not receive any credit if your le does not compile.
Name and spell the le, and the functions, exactly as instructed. Your scripts will be graded by an
automated process. You will lose some points, per our discretion, for any function that does
not match the expected name.
Make sure your output from all functions match the given examples. You will not receive points
for a question whose output mismatches the expected result.
Unless otherwise specied, you cannot use any external library, but you can use any module in the
Python Standard Library to solve this lab, i.e. anything under:
https://docs.python.org/3.7/library/index.html
Make sure you are using Python 3.7 for your Prelab.

Desperately Seeking Sutton

Description
One aspect of research in reinforcement learning (or any scientific field) is the replication of
previously published results. There are a few benefits you might reap from replicating papers.
One benefit of replication is that it augments your understanding of the material. Another benefit
is that it puts you in a good position both to extend existing literature and consider new
contributions to your field. Replication is also often challenging. You may find that values of key
parameters are missing, that described methods are ambiguous, or even that there are subtle
errors. Sometimes obtaining the same pattern of results is not possible.
For this project, you will read Richard Sutton’s 1988 paper
. Then you will create an implementation and replication of the results
found in figures 3, 4, and 5. (It might also be informative to compare these results with those in
Chapter 7 of Sutton’s textbook: “
You will present your work in a written report of a maximum of 5 pages. The report should
include a description of the experiment replicated, how the experiment was implemented (the
environment, algorithms, etc), and the outcome of the experiment. You should describe how well
your results match the results given in the paper as well as any significant differences. Describe
any pitfalls you ran into while trying to replicate the experiment from the paper (e.g. unclear
parameters, contradictory descriptions of the procedure to follow, results that differ wildly from
the published results). What steps did you take to overcome those pitfalls? What assumptions did
you make? And, why were these assumptions justified? Add anything else that you think is
relevant to discuss.

Generic Sequences

Objective
The purpose of this project is to develop a generic class representing the sequence
data structure. For storage of data the class shall use an ArrayList object. You will also
review how to construct and use Java classes as well as obtain experience with
software design and testing.
ABET Program Learning Outcomes:
The ability to recognize the need for data structures and choose the appropriate
data structure (1, 2, 6)
The Problem
In this project you implement a sequence ADT as defined in the book. However, your
class will be ArrayList based rather than array based. The fields will be chosen
accordingly and in the application class you will test and demonstrate the behavior of
the sequence ADT.
Design, Implementation
The reading book provides good help and patterns for design and implementation.
However, as opposed to the concrete types used in the book, you shall implement a
generic
Sequence class for the data structure. In another class named Applications
you shall test and exercise the Sequence methods. For this purpose the generic code
will be run with the concrete type Integer from the Java library.
1. Sequence class specification
This class will represent a set sequence, that is duplicate values are not
supposed to be in the sequence. You do not have to ensure this property at
adding elements, we may consider this a precondition for all methods. You may
opt for implementing a bag sequence, in that case you should change the field,
see comment below. The specification of a non-generic DoubleArraySeq class is
fairly well detailed in your reading, pp 150
– 158. Follow that design when
adequate and care for the differences as consequences of generic
implementation and ArrayList storage.

Vertebral Column Data Set

This Biomedical data set was built by Dr. Henrique da Mota during a medical residence
period in Lyon, France. Each patient in the data set is represented in the data set
by six biomechanical attributes derived from the shape and orientation of the pelvis
and lumbar spine (in this order): pelvic incidence, pelvic tilt, lumbar lordosis angle,
sacral slope, pelvic radius and grade of spondylolisthesis. The following convention is
used for the class labels: DH (Disk Hernia), Spondylolisthesis (SL), Normal (NO) and
Abnormal (AB). In this exercise, we only focus on a binary classication task NO=0
and AB=1.
(a) Download the Vertebral Column Data Set from: https://archive.ics.uci.
edu/ml/datasets/Vertebral+Column.
(b) Pre-Processing and Exploratory data analysis:
i. Make scatterplots of the independent variables in the dataset. Use color to
show Classes 0 and 1.
ii. Make boxplots for each of the independent variables.
Use color to show
Classes 0 and 1 (see ISLR p. 129).
iii. Select the rst 70 rows of Class 0 and the rst 140 rows of Class 1 as the
training set and the rest of the data as the test set.
(c) Classication using KNN on Vertebral Column Data Set
i. Write code for k-nearest neighbors with Euclidean metric (or use a software
package).
ii. Test all the data in the test database with k nearest neighbors. Take de-
cisions by majority polling.
Plot train and test errors in terms of k for
k ∈ {208, 205, . . . , 7, 4, 1, } (in reverse order). You are welcome to use smaller
increments of k. Which k is the most suitable k among those values? Cal-
culate the confusion matrix, true positive rate, true negative rate, precision,
and F -score when k = k.1
iii. Since the computation time depends on the size of the training set, one may
only use a subset of the training set. Plot the best test error rate,which
is obtained by some value of k, against the size of training set, when the
size of training set is N ∈ {10, 20, 30, . . . , 210}.Note: for each N , select
your training set by choosing the rst N/3 rows of Class 0 and the rst
N N/3 rows of Class 1 in the training set you creatd in 1(b)iii. Also, for
each N , select the optimal k from a set starting from k = 1, increasing by 5.
For example, if N = 200, the optimal k is selected from {1, 6, 11, . . . , 196}.
This plot is called a Learning Curve.
Let us further explore some variants of KNN.

Practice Generic Nodes

In this assignment you are to implement a generic Node class with a type parameter for the
elements stored in a linked list.
Exercises – Requirements
1. (2 pts) Add a generic Node class to your Java project.
2. (16 pts total) In the class
(2 pts) Declare the usual fields data and link; follow the book specification but use
the generic type parameter as necessary
(8 pts) Define the accessor methods and mutator methods for both fields)
(6 pts) Define an initializer constructor, see the book
Remember the generic syntax: at the definition the name of the constructor is
Node, however every time you use Node for a type you must postfix the generic;
when the constructor is used to instantiate a Node object a concrete type for the
generic parameter like Node<String> must be used
3. (56 pts total, 8 pts each) Define the Node class methods as required below.
Define the addNodeAfter( ) method as explained in the PP presentation, use the
generic type as needed.
Define removeNodeAfter( ) method as explained in the PP presentation, use the
generic type as needed.
Define a non-static listLength( ) method; no parameter; the method iterates thru
the list from the calling object (this), counts the nodes and returns the count; use
while loop for iteration
Define a static listLength( ) method, see the book; must use its own generic type;
takes a node parameter; the method iterates thru the list from parameter node),
counts the nodes and returns the count; use for loop for iteration
Define a static listSearch ( ) method, see the book; must use its own generic type;
takes a node parameter, and a data parameter, both generic; the method iterates
thru the list from parameter node, checks if data is target (use the equals( )

Mathematics

You may work with a single partner, but you will have to write your own
solution. If you do work with a partner, please list both of your names on
your assignment. You are not allowed to discuss the assignment with anyone
other than your partner, the TAs, and the instructor.
Question 1) (5 points) Write out Church’s Thesis in your own words.
Question 2) (30 points)
Let L = {x ∈ {a, b, c}|x = anbncn, n ∈ N}. You will need to write out a
TM Mthat decides L.
a) (15 points) Describe in words how Mworks. That is, use descrip-
tions like “Write a ’#’ and move left” or “Move right until you see a
blank character”.
b) (15 points) Formally describe your TM, using the terminology
from example 3.9 in Sipser M2.
Question 3) (25 points)
a) (20 points) Let Mbe the automaton from Figure 3.8 in Sipser.
Suppose that Mis given an input of 0000. Give the rst 10 congu-
rations that Mwill take.
b) (5 points) Suppose that Mis given an arbitrary input x and
allowed to run, generating the congurations C0, C1, . . .. What is the
largest number of characters by which any Cand Ci+1 can dier?
Why?
Question 4) (20 points) Use dovetailing to show that the language
L= { M |M halts on at least one input}
is recognizable.
Question 5) (20 points) Show that any innite recognizable language has
an innite decidable subset.

Image Processing

Topic: Image enhancement in the spatial domain. (Chapter 3 of the text book) using
Matlab Image Processing Toolbox
Tasks:
1. Intensity transformation: Select a gray level transformation method (Gamma
correction, histogram equalization, etc.) and apply it to images. Discuss the
results.
2. Spatial Filtering:
a. Select a linear smoothing filter and apply it to a noisy image. Vary the
filter size (3 x 3, 5 x 5, etc.) and compare the results.
b. Select an edge extraction filter and apply it to both noisy and noise-free
image and discuss your results.
c. Select an image sharpening filter and apply it to different kind of
images and discuss your findings.
d. Apply a non-linear order-statistic filter (e.g. median filter) on the noisy
image in part (a) and compare the results.
Report must contain the following information:
a. A brief discussion of the concept.
b. Spatial filters and parameter (if any) used.
c. Results (figures, tables, etc.) and concluding remarks.
d. Matlab code with necessary comments.
Note: You need to know how to add noise of different types to a clean image.

Image classification, dimensionality reduction

Notes:
(a) Submit a zip file of the completed iPython notebooks to Tabula. Make sure to put comments in your
code explaining what you have done and what you have observed from the results.
(b) This assignment will contribute to 25% of your overall mark.
This assignment consists of 7 exercises which are split into 4 parts. The first 3 parts can be found in the lab
of Classification while the last part can be found in the lab of Dimensionality Reduction. Please download
the corresponding ipynb files from the module web page
https://warwick.ac.uk/fac/sci/dcs/teaching/material/cs909.
Part 1: Binary Classification of MNIST Dataset
In the first set of tasks, you will evaluate a number of popular classifiers for the task of recognising
handwritten digits from MNIST dataset. Specifically, we will focus on distinguishing between 7 and
9 which are known to be a hard pair. You will use the scikit-learn libraries.
Binary Classification in scikit-learn
All classifiers in scikit-learn follow a common pattern that makes life much easier. Follow these steps for all
the tasks below.
1. Instantiate the classifier with appropriate parameters
2. Train/fit the classifier with training data and correct labels
3. Test the classifier with unseen data
4. Evaluate the performance of classifier

Compute and Elastic Scaling

In this assignment you will be using a web application that is provided to you (MiniTwit – a
simple Twitter clone) and deploying it in EC2 using many of the features you would want to use
for a service of this type. First you will set up your AWS account to be ready to create EC2
instances (virtual servers), and then you will create an instance and SSH to it. Then you will run
the MiniTwit application and see that you can reach and use the app from your web browser.
Next you will set up an EC2 instance to automatically start the MiniTwit application when the
instance launches, so that you don’t need to SSH in and run it manually. Then you will
configure Elastic Load Balancing (ELB) to distribute user (browser) connections between
multiple MiniTwit servers, and configure Auto Scaling to scale the number of MiniTwit servers
out and in, depending on the load (i.e., how many browser requests are hitting them each
second).
Also, here are a few top-level documentation pages that you may use now and in future
projects too.
Part 1: Setting up EC2
In this part, you will create a single EC2 instance and run a simple web application on it using
the Flask framework for Python. Before you start, you will need your AWS account fully set up,
per the instructions that were emailed to you.
First, log in to AWS and go to the EC2 service. (You can use the AWS search bar, or expand “All
Services” and find EC2 in the Compute section.) If you’re in the right place, you should see a
toolbar on the left with options like Instances, AMIs, Volumes, Security Groups, etc. The first
thing you need to do is make sure you’re connecting to the correct Region (i.e., the correct data
center – Amazon has several datacenters worldwide for running AWS services). On the upper
right, between your account name and “Support” it should say the name of a location (e.g.,
Oregon, Ohio, N. Virginia, etc.). If it does not say Oregon, then click on it and choose the option
“US West (Oregon)” from the drop down. It should now say Oregon, and you will be using the
Oregon Region (data center) from now on. (You shouldn’t need to do this ever again, unless
you change it, but it’s always a good idea to double check each time you log in.)
Create a Key Pair
The next thing you need to do is go to the “Key Paris” section (under the NETWORK & SECURITY
heading). You should see a button at the top of the main pane called “Create Key Pair” – click on
this. You will be prompted to give it a name – name it whatever you think makes sense. (It won’t
matter much, since you’ll probably only have one key pair for now.) Your browser will then
prompt you to download a file with a .pem extension – this file contains your private key, which
you will use to SSH to your EC2 instances. Save your .pem file somewhere safe – if you lose it,
you cannot recover it! Should you lose your .pem file, your key pair will become useless; you
will need to create a new key pair then, save the new .pem file, and associate all your EC2
instances with the new key pair.
You will need to be sure the permissions are set correctly on your .pem file – if not, SSH may
refuse to connect! On Linux or MacOS, open a command prompt and change to the directory
containing your .pem file. Then set the permissions to read-only for the file owner with this
command:
chmod 400 <pem key file>
Configure a Security Group
Finally, configure a default Security Group. This is basically a firewall, which restricts what kind
of network traffic from which hosts (machines on the network) is allowed to reach your EC2
instances. Click on “Security Groups” (under the NETWORK & SECURITY heading). There should
be one security group already there with a name along the lines of “default”. Select that  

security group. In the details section below the list of security groups, you will see
“Description”, “Inbound”, “Outbound”, etc. Click on the “Outbound” tab. It should say
something like “All traffic – All – All – 0.0.0.0/0”, which basically means instances using this
security group can send any network traffic to anywhere. This is fine; what they’re allowed to
receive is the real security concern, so now click on “Inbound”. You should see a rule that says
it will accept all traffic of all types incoming from a particular Security Group ID, and that ID
happens to be the ID of this group – in other words, instances in this security group can send
anything to other instances in the same security group. We’ll need to add a couple things to let
us reach these instances from outside the cloud, though. Click on the Edit button, and then use
the Add Rule button to add the following incoming rules:
SSH – TCP – 22 – Custom – 66.194.72.0/24
a. this allows SSH to your instances from campus computers, such as cs1.seattleu.edu
b. the “Description” field is optional, but you may want to enter something to remind
yourself of what this rule is for when you look at it again weeks or months later
SSH – TCP – 22 – My IP
c. choosing “My IP” instead of “Custom” will automatically fill in your current IP
d. this allows SSH to your instances from the computer you are currently using – this
may be redundant if you are on campus, but setting this up with your home IP
address may be convenient for working from home
e. Note: if you change to a different computer, this does not follow you! so make sure
you’re on the computer you want to use for your work when you set up this rule.
Custom TCP Rule – TCP – 5000 – 0.0.0.0/0
f. this allows connecting to your instances from anywhere on port 5000, which we will
use instead of the default HTTP port 80 in order to keep our service a little more
private (although not really because anyone can easily find port 5000 by trial-and-
error) – in effect, the service on port 5000 will be publicly accessible!
Now that all that’s done, you probably won’t need to mess with it any more. (Maybe
occasionally, but not frequently… e.g., if you want to update the “My IP” rule for a new
location…)
Part 2: Launching an EC2 Instance
Now that everything’s set up, let’s launch an EC2 instance and run a web service. Click on
“Instances” (under the INSTANCES section), then click on the “Launch Instance” button. Now
you will have to choose which AMI (disk image) to use for your EC2 instance’s virtual hard drive.
On the left are several categories – choose the “My AMIs” category, then check the “Shared
with me” option.
You should see an image called “Flask Base” (possibly with a version number, like “Flask Base
v1.3”). If you do not see this AMI listed, please contact the instructor.