Month: January 2018

RPS and Image Transformations

Rock, Paper, Scissors (30 points)

For the first part of this project, you will implement the game of Rock, Paper, Scissors. For those unfamiliar with the rules, the game is typically played by two people who use hand gestures to represent rock (a closed fist), paper (an open hand), or scissors (a V made with two fingers). Both players display their choice at the same time, and the winner is determined by the following rules (the winner is named first in each):

Scissors cuts paper, paper covers rock, rock breaks scissors

Your job is to write a program where a human can play against the computer in a best-of-5 tournament. The first to win three games wins the match. Have the human player enter their choice, and then have the computer randomly pick its choice. If the two match, the game is a tie and doesn’t count. Otherwise you will add one to the score of the winner. After the match is over, you should ask the user if they would like to play again.

Example:

Welcome to Rock, Paper, Scissors

Would you like to play? yes

What is your choice? scissors
The computer chooses rock. You lose this game!

The score is now you: 0 computer: 1

Hints

  • Generating random numbers in C is a two-step process. First, we need to seed the random number generator once per program. The idiom to do this is:
    srand((unsigned int)time(NULL));
  • When we need random numbers, we can use the rand() function. It returns an int between 0 and RAND_MAX, inclusive. We can use modulus to reduce it to the range we need:

int value = rand() % (high - low + 1) + low;
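As a rough illustration of how the two hints fit together with the rules of the game, here is a minimal sketch. The enum encoding and the names seed_rng, random_in_range and judge are assumptions made for this example and are not part of the assignment; reading the human's choice and keeping score are left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical encoding of the three choices (not given in the assignment). */
enum choice { ROCK = 0, PAPER = 1, SCISSORS = 2 };

/* Seed the generator; call this once at the start of the program. */
void seed_rng(void)
{
    srand((unsigned int)time(NULL));
}

/* Returns a pseudo-random integer in [low, high] using the modulus idiom. */
int random_in_range(int low, int high)
{
    assert(low <= high);
    return rand() % (high - low + 1) + low;
}

/* Returns 0 for a tie, 1 if the human wins, -1 if the computer wins.
   With the encoding above, each choice beats the one numbered just below it,
   wrapping around: paper(1) beats rock(0), scissors(2) beats paper(1),
   and rock(0) beats scissors(2). */
int judge(enum choice human, enum choice computer)
{
    if (human == computer)
        return 0;
    return ((human - computer + 3) % 3 == 1) ? 1 : -1;
}
```

The computer's move could then be picked with (enum choice)random_in_range(0, 2), and the result of judge() used to decide whose score to increment.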

Image Transformations (70 points)

A Bitmap Image File (typical extension: .bmp) is a container format for a big array of pixels. There are a variety of ways that BMP files can encode image data, but we will focus on one particular form and write a program that performs two simple image transformations: inverting the image and converting a color image to grayscale.

Inverting an image means to take each pixel (a “picture element” – basically one discrete colored point in a larger image) and produce the “opposite” color, which we will define as being the bitwise-NOT of each pixel’s color value.

Converting a color image to grayscale is precisely what it says: we will take the various colors of the image and replace them with differing intensities of gray.

We will be assuming Windows Bitmap files whose contents are 24-bit RGB color. This means that each pixel is represented by a 24-bit number, split into three 8-bit parts. The first part is the intensity of the color blue, the second is the intensity of the color green, and the third is the intensity of the color red, each expressed as an integer value from 0-255. (Yes, that’d actually make it BGR and not RGB, but BMP is just weird that way…)
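The two transformations can be sketched on a raw buffer of pixel bytes as follows. This is only an illustration under stated assumptions: the function names are made up for this example, the buffer is assumed to hold tightly packed blue-green-red triples, and the file header and the padding that BMP adds at the end of each row are ignored. The grayscale formula here is a plain average of the three channels, which is an assumption; the assignment may prescribe a different (for example weighted) formula.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inversion as defined above: bitwise NOT of every 8-bit channel value.
   Because all three channels are treated alike, this works byte by byte. */
void invert_pixels(uint8_t *data, size_t num_bytes)
{
    assert(data != NULL);
    for (size_t i = 0; i < num_bytes; i++)
        data[i] = (uint8_t)~data[i];
}

/* ASSUMPTION: gray level = plain average of blue, green and red.
   Each pixel occupies three consecutive bytes in B, G, R order. */
void grayscale_pixels(uint8_t *data, size_t num_pixels)
{
    assert(data != NULL);
    for (size_t i = 0; i < num_pixels; i++) {
        uint8_t *p = data + 3 * i;  /* p[0] = blue, p[1] = green, p[2] = red */
        uint8_t gray = (uint8_t)((p[0] + p[1] + p[2]) / 3);
        p[0] = gray;
        p[1] = gray;
        p[2] = gray;
    }
}
```

Setting all three channels to the same value is what makes a pixel a shade of gray; the choice of formula only changes how bright that shade is.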

NLP

1 Task: Language Modeling of Different Datasets
Your task is to analyze the similarities and differences in different domains using your language model.
1.1 Data
The data archive (available on Canvas) contains corpora from three different domains, with a train, test, dev, and
readme file for each of them. The domains are summarized below, but feel free to uncompress and examine the
files themselves for more details (this will be quite helpful when performing your analysis).
Brown Corpus: The objective of this corpus is to be the standard corpus representing present-day (i.e. 1979)
edited American English. More details are available at http://www.hit.uib.no/icame/brown/bcm.html.
Gutenberg Corpus: This corpus contains a selection of text from public domain works by authors including Jane
Austen and William Shakespeare (see the readme file for the full list). More details about Project Gutenberg is
Reuters Corpus: A collection of financial news articles that appeared on the Reuters newswire in 1987. The corpus
is hosted on the UCI ML repository at https://archive.ics.uci.edu/ml/datasets/Reuters-21578+
1.2 Source Code
I have released some initial source code, available at https://github.com/sameersingh/uci-statnlp/tree/
. The interface and a simple implementation of a language model are available in lm.py, which you
can extend to implement your models. In generator.py, I provide a generic sentence sampler for a language
model. The file data.py contains the main function, which reads in all the train, test, and dev files from the archive,
trains all the unigram models, and computes the perplexity of all the models on each other’s data. The README
file provides a little more detail. Of course, feel free to ignore the code if you do not find it useful.
2 What to Submit?
Prepare and submit a single write-up (PDF, maximum 5 pages) and relevant source code (compressed in a single
zip or tar.gz file; we will not be compiling or executing it, nor will we be evaluating the quality of the code) to
Canvas. Do not include your student ID number, since we might share it with the class if it’s worth highlighting.
The write-up and code should contain the following.
2.1 Implement a Language Model (20 points)
The primary task of the homework is to implement a non-trivial language model. You are free to pick the type of
the model, such as discriminative/neural or generative. If you decide to implement an n-gram language model, it
should at least use the previous two words, i.e. a trigram model (with appropriate filtering and smoothing). Your
language model should support “start of sentence”, i.e. when the context is empty or does not have enough tokens.
Use appropriate smoothing to ensure your language model outputs a non-zero and valid probability distribution
for out-of-vocabulary words as well. In order to make things efficient for evaluation and analysis, it might be
worthwhile to implement serialization of the model to disk, perhaps using pickle.
In the write up, define and describe the language model in detail (saying “trigram+laplace smoothing” is not
sufficient). Include any implementation details you think are important (for example, if you implemented your
own sampler, or an efficient smoothing strategy). Also describe what the hyper-parameters of your model are and
how you set them (you should use the dev split of the data if you are going to tune it).
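To make the requirement of a non-zero, valid distribution concrete, here is a minimal sketch of add-k smoothing for a trigram estimate. It is only one simple option, not necessarily the one you should choose, and it is written in C for illustration even though the provided code is Python. The function name and parameters are assumptions made for this example and do not come from lm.py.

```c
#include <assert.h>

/* Add-k smoothed estimate of P(w | u, v) for a trigram model.
   trigram_count: how often (u, v, w) was seen in training
   context_count: how often the context (u, v) was seen in training
   vocab_size:    number of word types, including an unknown-word token
   k:             smoothing constant, a hyper-parameter to tune on the dev split
   All names here are hypothetical. */
double addk_prob(long trigram_count, long context_count, long vocab_size, double k)
{
    assert(trigram_count >= 0 && context_count >= trigram_count);
    assert(vocab_size > 0 && k > 0.0);
    return ((double)trigram_count + k)
         / ((double)context_count + k * (double)vocab_size);
}
```

Because k is added once for every word in the vocabulary, the estimates for a fixed context still sum to one, and a word never seen after that context (or an entirely unseen context) still receives a small positive probability.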
2.2 Analysis on In-Domain Text (40 points)
Here, you will train a model for each of the domains, and analyze each one only on text from its own domain.
Empirical Evaluation: Compute the perplexity of the test set for each of the three domains (the provided
code would do this for you), and compare it to the unigram model. If it is easy to include a baseline version
of your model, for example leaving out some features or using only bigrams, please do so. Provide further
empirical analysis of the performance of your model, such as the performance as hyper-parameters and/or
the amount of training data are varied, or implementing an additional metric.
Qualitative: Show examples of sampled sentences to highlight what your models represent for each domain.
It might be quite illustrative to start with the same prefix, and show the different sentences each of them
results in. You may also hand-select, or construct, sentences for each domain, and show how usage of certain
words/phrases is scored by all of your models (function lm.logprob_sentence() might be useful for this).
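For reference, the perplexity used in the empirical evaluation above is commonly defined as shown below. This is the standard textbook form; the provided code may differ in details such as the logarithm base or how sentence boundaries are counted. The symbols are introduced here for illustration: w_1, ..., w_N are the tokens of the test set and P is the probability your model assigns to each token given its context (for a trigram model, the previous two tokens).

```latex
\mathrm{PP}(w_1, \dots, w_N)
  = \exp\!\left( -\frac{1}{N} \sum_{i=1}^{N} \ln P(w_i \mid w_{i-2}, w_{i-1}) \right)
```

Lower is better. As a sanity check, a model that assigns the uniform probability 1/V to every token has perplexity exactly V.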
2.3 Analysis on Out-of-Domain Text (40 points)
In this part, you have to evaluate your models on text from a domain different from the one it was trained on. For
example, you will be analyzing how a model trained on the Brown corpus performs on the Gutenberg text.
Empirical Evaluation: Include the perplexity of all three of your models on all three domains (a 3 × 3 matrix,
as computed in data.py). Compare these to the unigram models, and your baselines if any, and discuss
the results (e.g. if unigram outperforms one of your models, why might that happen?). Include additional
graphs/plots/tables to support your analysis.
Qualitative Analysis: Provide an analysis of the above results. Why do you think certain models/domains
generalize better to other domains? What might it say about the language used in the domains and their
similarity? Provide graphs, tables, charts, examples, or other summary evidence to support any claims you
make (you can reuse the same tools as the qualitative analysis in § 2.2, or introduce new ones).

GradeCalculator

The main objective of this assignment is to build a small prototype that simulates and tests a
partial set of the requirements of a fully-featured tool that an instructor can use to maintain
partial grades and compute final grades in a course. The second objective is to give you
practice designing and coding simple classes and methods using loops, conditionals and console
I/O in the Java language. The last is to give you an initial exposure to object-oriented techniques
for solving computational problems. To attain these goals, you will organize the programming of this
prototype around the following modular entities:
First, you will write a Java class named Student that allows the creation and maintenance of the
marks, as well as of the final mark, for each student in the course.
Second, you will write a Java class named GradeCalculator which will drive the operational
features of your tool and the text-based console I/O for receiving commands from and issuing
results to a user.
Description
I — User Interface and Program Flow
Upon starting, your tool will display on the console the following menu:
Grade Calculator (Version 0.1). Author: [Your first and last name]
1 – Simulate Course Marks
2 – View/Update Student Marks
3 – Run Mark Statistics
Select Option [1, 2 or 3] (9 to Quit):
Upon completing the processing of any of the operations 1, 2, or 3, your program will return to
display the previous menu and wait for the user to select a menu option. If the user enters 9,
the program terminates.
a) Simulate Course Marks – Selecting this option will run a simulation that creates all the
Student objects for a course of size N students, and assigns to each student randomly-generated
marks for assignments 1 and 2 and the final exam. If Student objects from previous course
marks simulations exist when running this simulation, their status attribute will be set to false
before creating the Student objects corresponding to the new simulation.
First, the program will prompt the user to enter the class size by displaying the message: Enter
course enrollment size:
Next, the program will issue in sequence the following prompts to the user, asking for weight
percentages within given ranges: Enter weight assignment 1 (20-30):, Enter weight assignment
2 (20-30):, Enter weight final exam (40-60):. If any of the weights is out of range, the program
will repeat the prompt for the respective weight. If the entered weights do not add up to 100,
the program will display the message << Error: weights do not add up to 100% >>, and will
return to and display the selection menu.
b) View/Update Student Marks – Selecting this option will display the prompt Enter Student
Number:
If the user enters an invalid student number, or the number corresponds to an inactive Student
object (i.e., object’s status attribute is set to false), the program will display the message:
[entered student number] is invalid, and will return to and display the selection menu.
Otherwise, it will display the message: View or Update? (V/U):, and will wait for the user to
enter a selection. If the user enters V, the program will display all the course marks for the
selected student. If the user selects U, the program will display the prompt Mark Type? (A1, A2
or FE): and will wait for a selection from the user. Once the user has input one of the possible
options, the program will display the selected mark as: [Mark Type] is [mark], and will return
to and display the selection menu. If the user enters an invalid mark type, the program will
display the message: [Mark Type] is an invalid mark type, and will return to and display the
selection menu.
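The two weight checks described under option (a) above can be sketched as follows. The sketch is written in C only to match the other examples on this page; the assignment itself calls for Java, where the same logic would live in methods of GradeCalculator. The function names, and the choice of whole-number weights, are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* True if value lies in the inclusive range [low, high].
   A false result corresponds to repeating the prompt for that weight. */
bool weight_in_range(int value, int low, int high)
{
    assert(low <= high);
    return value >= low && value <= high;
}

/* True if the three weights total 100 percent.
   A false result corresponds to the "weights do not add up to 100%" error. */
bool weights_add_up(int assignment1, int assignment2, int final_exam)
{
    return assignment1 + assignment2 + final_exam == 100;
}
```

Note that the two checks are separate on purpose: weights of 20, 20 and 40 are each within range but still trigger the error message because they total 80.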

web site design

The detailed functional requirements (basic requirements) are as follows:
1. Users can add a favorite cat with pictures and a description.
2. Users can list all the cats on the web page.
3. Users can search and sort the cats.
4. Users can browse the detailed information for a selected cat.
5. Users can comment on any cat.
6. Users can “like” any cat.
7. The system should give each cat a score based on some kind of
algorithm combining the comments and “like”s for that cat.
8. Administrators can browse the “like” statistics for cats.
9. Administrators can browse the score statistics for cats.
10. Administrators have the right to remove any cat item from the website.
Additional requirements for the coursework:
  • Language: English
  • Use WordPress as the open-source platform to build your web application
  • Name your variables and functions with meaningful identifiers.
Please analyze the above requirements and produce your team’s design
document. The design document should include at least the following
content:
1. The corresponding Entity-Relationship diagram (including the original
WordPress entities);
2. SQL statements for the basic requirements;
3. A web page navigation diagram;
4. The skin of your website application (WordPress theme)

C Language

Part 1: Why learn and use C? (25 points)

To start off this assignment, please look into the history of C, the influence it has had on other languages, and some examples of what the language is used for today and then answer the question “Why learn and use C?”. Your answer should be at least 1 paragraph.

There is no “correct” answer to this question. I think it is an important question to reflect on when learning any new programming language, though, which is why I have included it as part of this assignment.

Part 2: Defining the Problem and your Approach Towards Finding a Solution (75 points)

The following is very much a real-world example (though scaled down so that it will be possible for you to complete the assignment in the time that is left in this semester) of a problem that you could solve using C.

The Problem

Given a small database of person accounts (a SQLite3 database, which will be provided to you shortly; SQLite is itself written in C), determine which set of accounts in that database are authorized to access a resource and should therefore be sent to the appropriate destination “system” based on the roles associated with those accounts.

The database will contain person records, roles, resources (by “resource” I mean something that requires a person to have a specific role in order to access it), a mapping of the roles to people, and a mapping of the roles to resources. This should be all of the information that you need in order to programmatically make the determination about which accounts are authorized. I will post an ERD detailing this prior to lab on 11/16.

Each destination “system” that I mentioned will just be a location in local memory (again, for the sake of keeping the scale manageable). When writing to these locations (and I will post more details about the what and how of this piece soon), you will need to allocate memory, write to it, and have a way to maintain it. You should not need these specific implementation details to come up with a preliminary solution design though.
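To make the shape of the problem concrete, here is a rough sketch of the authorization check using plain in-memory arrays in place of the database. Everything in it is an assumption made for illustration: the ERD has not been posted, so the struct and function names, the use of integer ids, and the idea of returning a heap-allocated array as the “destination” are all hypothetical, and a real solution would read the mappings from the SQLite database instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical in-memory stand-ins for the two mapping tables. */
struct person_role   { int person_id; int role_id; };
struct role_resource { int role_id;   int resource_id; };

/* True if some role held by person_id grants access to resource_id. */
bool is_authorized(int person_id, int resource_id,
                   const struct person_role *pr, size_t n_pr,
                   const struct role_resource *rr, size_t n_rr)
{
    for (size_t i = 0; i < n_pr; i++) {
        if (pr[i].person_id != person_id)
            continue;
        for (size_t j = 0; j < n_rr; j++) {
            if (rr[j].role_id == pr[i].role_id &&
                rr[j].resource_id == resource_id)
                return true;
        }
    }
    return false;
}

/* Builds a heap-allocated list of the authorized ids among `people`.
   The number of ids is stored in *out_count and the caller must free()
   the result. Returns NULL if nobody is authorized or allocation fails. */
int *authorized_accounts(const int *people, size_t n_people, int resource_id,
                         const struct person_role *pr, size_t n_pr,
                         const struct role_resource *rr, size_t n_rr,
                         size_t *out_count)
{
    assert(out_count != NULL);
    *out_count = 0;
    if (n_people == 0)
        return NULL;

    int *result = malloc(n_people * sizeof *result);
    if (result == NULL)
        return NULL;

    for (size_t i = 0; i < n_people; i++) {
        if (is_authorized(people[i], resource_id, pr, n_pr, rr, n_rr))
            result[(*out_count)++] = people[i];
    }
    if (*out_count == 0) {
        free(result);
        return NULL;
    }
    return result;
}
```

The point of the sketch is the two-step lookup (person to role, then role to resource) and the fact that whoever receives the allocated list becomes responsible for freeing it, which is the kind of ownership question worth settling in your design.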

Designing a Solution

For this part of the assignment, you should not be programming anything. Instead, you should make sure that you understand the functional details of this assignment.

It should also be an opportunity for you to examine your approach to problem solving. The number one reason that I’ve seen students (and even professionals) fail to complete an assignment or task is that they did not understand what the problem was asking and the steps they needed to take in order to solve it. If you take the time to document and think about your design and approach, I do not think that you will find yourself in a place where you are unable to complete this assignment.

For this part of the assignment, how you present your design is up to you. A couple of options are to create a process flow diagram or to create a specification document of some kind, such as a scaled-down version of a software requirements specification (SRS) document (some things that normally go into an SRS document just aren’t applicable, or aren’t worth spending too much time on, because of the limited scale of this assignment). I will post some examples of documentation that I’ve written to solve problems similar to this one.

Part 3: Writing Pseudo-code (75 points)

Once you have your preliminary design, you should then start to think about specifics of your implementation. Rather than jumping right in and writing C code though, you should first take the intermediate step of writing pseudo-code. I will expect you to turn this in.

When you move on from part 3 to part 4, do not worry about going back and updating your pseudo-code if you find that you’ve deviated from it in your actual code. The pseudo-code is just a stepping stone to help you write the actual code, so I don’t think it is necessary or useful to update it once it is done unless you find that you need to do a massive overhaul of your code and once again use the pseudo-code as your stepping stone.

Part 4: Implementing the Solution (100 points)

I am purposefully holding off on writing anything here. I will post more details here over the weekend. I really strongly recommend that you do these parts in order (at least parts 2-5), so you should not need these details until you’ve completed parts 2 and 3.

Part 5: Testing your Solution (25 points)