Author: cs daixie

Digital Image Processing

1. DFT:
(15 Pts.) Write code for computing the forward Fourier transform, inverse Fourier transform, and magnitude of the Fourier transform.
The input to your program is a 2D matrix of size 15X15.

– Starter code available in directory frequency_filtering/
– dft/Dft.py: Edit the functions “forward_transform”, “inverse_transform”, and “magnitude”, you are welcome to add more functions.
– For this part of the assignment, please implement your own code for all computations, do not use inbuilt functions – like “fft”, “dft”, “abs” from numpy, opencv or other libraries – that directly accomplish the objective of the question. You can use functions such as “sin” and “cos” from the math package as needed.
– This part of the assignment can be run using dip_hw3_part_1.py (Do Not modify this file)
– Usage:

python dip_hw3_part_1.py
– Please make sure your code runs when you run the above command from prompt/terminal
– Do not save any output in your code. Any output images or files will be saved to “output/” folder (dip_hw3_part_1.py automatically does this)

————-
2. Linear Filtering:
(30 Pts.) Write code to perform spatial filtering on an image using 5X5 Gaussian filter and a 3X3 Laplacian filter.

– Starter code available in directory spatial_filtering/
– spatial_filtering/filtering.py:
– \__init__(): Will initialize the required variable for filtering (image). DO NOT edit this function.
– get_gaussian_filter(): Write your code to compute a 5X5 Gaussian filtering here.
– get_laplacian_filter(): Wring your code to initialize a 3X3 Laplacian filtering here.
– filter(): Wring your code to perform filtering operation on the input image using the specified filter.
– For this part of the assignment, please implement your own code for all computations, do not use inbuilt functions from numpy, opencv or other libraries that directly accomplish the objective of the question. You can use functions from math package if needed.
– This part of the assignment can be run using dip_hw3_part_2.py (Do Not modify this file)
– Usage:

python dip_hw3_part_2.py
– Please make sure your code runs when you run the above command from prompt/terminal
– Do not save any output in your code. Any output images or files will be saved to “output/” folder (dip_hw3_part_2.py automatically does this)

————-
3. Frequency Filtering:
(30 Pts.) Write code to perform image filtering in the frequency domain. The input image is corrupted with periodic noise.
To perform filtering:
1. Compute the DFT of the image.
2. Inspect the DFT to identify the noise frequencies.
3. Create a filter using trial and error to reject the noise frequencies.
4. Compute the inverse DFT of the filtered frequencies to obtain the filtered image.

The input to your program is an image with periodic noise.

– Starter code available in directory frequency_filtering/
– frequency_filtering/filtering.py:
– \__init__(): Will initialize the required variable for filtering (image). An addsinNoise function will be called to add sine noise to a given image. DO NOT edit this function.
– get_mask: Write your code to generate the masks to reject noise frequencies.
– filter(): Write your code to perform image filtering here. The required variables have been initialized and can be used as self.image, self.mask.
– The function returns three images, filtered image, magnitude of the DFT, and magnitude of filtered dft
– Perform necessary post-processing to make the images visible. (Log compression, full contrast stretch, etc.)
– post_process_image(): Write your code to perform post-processing here.
– For this part of the assignment, You can use **inbuilt functions to compute the Fourier transform**.
– For example, **you are welcome to use fft and dft libraries that are available in numpy and opencv**
– This part of the assignment can be run using dip_hw3_part_3.py (Do Not modify this file)
– Usage:

python dip_hw3_part_3.py
– Please make sure your code runs when you run the above command from prompt/terminal
– Do not save any output in your code. Any output images or files will be saved to “output/” folder (dip_hw3_filter.py automatically does this)

Verilog

• Simulate the two modules for functional correctness with the testbenches provided. Ensure that the generated result matches golden result provided.
• Perform synthesis checks with OpenRAM memory modules.

Design Description:

To build the hardware, for the FPGA implentation you are provided a simple dual port memory module, mem. v, which you can use to instantiate memory blocks accordingly. For ASICs, OpenRAM is used to generate SRAM blocks.
A high-level picture of the memory structure is shown below. You have to design s2mm. sv and mm2s. sv blocks.

OpenRAM: SRAM Blocks
The lab manual on OpenRAM provides details of how OpenRAM is used and configured to generate SRAM macros. The description here details how OpenRAM is used in this lab.
The configuration file config_sram_lr _lw. py is used to configure the SRAM macros needed for Lab4. In this lab memories are instantiated in s2mm. sv and mm2s. sv. The details of these modules will be expanded in the next two sections, the memory requirements are summarized below.
Memory in s2mm. sv has the following requirements:

• Port : 1 Read 1 Write
• Width : 32
• Depth : Atleast M*M/N

Memory in mm2s. sv has the following requirements:

• Port : 1 Read 1 Write
• Width : D_W*2
• Depth : Atleast M*M/N

For each module and combination of paremeters M, N and o_w, different RAMs need to be generated.
Note: OpenRAM and the reference memory block given to you in mem. v have similar timings (NOT exact). You can use this in your design.
Stream to Memory-Mapped s2nwn.sv:
The systolic core previously designed in Lab 3 requires the input matrices A and B are streamed to the system one element per cycle with suitable phase offsets for different rows and columns. When M>N, we have to reuse data as multiple sub-matrix multiplications are needed. Consequently, some form of memory structure is needed for the purpose of holding the matrix elements before they are presented to the NxN. As a result, each memory block will be holding M*M/N matrix elements.

Writing Mechanism:
The s2mm.sv module has several inputs, two of which are s_axis_s2mm_tdata and s_axis_s2mm_tvalid, which represent streamed data and valid signals, respectively. Since only one data port and one valid port are available to us, the data/valids must be streamed serially across the 2*N memory blocks. For high-frequency operation, your design should pipeline the data/valids across the 2*N memory blocks through a set of 2*N pipeline stages (or shift registers). You must also generate write addresses and memory enable signals internally based on the system inputs/outputs. The mechanism for writing the 2*N memories is sequential such that the i th set of streamed M*M/N matrix elements is written to the ith memory block, so that eventually 2*M*M values are written in total. Note that all data that is written to memory should be valid. You can assume that s_axis_s2mm_tready signal is set to 1 which permits the host CPU to send you data continuously without interruption if possible. The inputs s_axis_s2mm_tkeep and the s_axis_s2mm_tlast signals can be ignored for this design.

 

sample database system

In this project, you are asked to design and implement a sample database system. Here are
general requirements.

The system should support a data model, which can be relation (as in MySQL) or
JSON (as in Firebase and MongoDB), or any other model of your choice. Users of the
system will structure their data using the model provided by the system.
• The system should have its own query language which should be different from
existing query languages, including SQL, and queries provided by Firebase and
MongoDB. It is ok that the language is like natural language, e.g., “find employees
who are at least 25 years old”.
• The query language should support projection (selecting a subset of rows), filtering
(selecting rows), join (e.g., combining multiple data tables), grouping, aggregation,
and ordering.
• The system should also provide commands for inserting, deleting, and updating the
data. These commands can be like that in existing database systems.
• You are free to decide how you store the data in a data model (e.g., you may store a
table in a file), and how you implement the data modification commands above.
• Your system should not load the entire database into the memory and process
queries and data modifications on the entire database. Instead, you should assume
that the database may potentially handle a large amount of data that might not fit in
the main memory.
• You should implement an interactive command line interface (similar to MySQL,
MongoDB, sftp clients, etc.) for users to interact with the systems, issue commands,
and get results. As an example, suppose your database is called MyDB, your
interface may look like:

o MyDB > create table person(a int);
o MyDB>Tablecreated.
o MyDB > insert into …
o MyDB > find employees who are at least 25 years old o MyDB>…
o MyDB > exit
• You should show how to create a database using your system to store multiple realworld
data

sets, and how the queries and data modifications work on the data sets.
• Your dataset should be some existing real-world dataset available on the Web. For
example,
Kaggle, google, etc. are good places to find such datasets。

Project

In this project, we implement the (forward) Gauss-Seidel method and the preconditioned steepest
descent methd (PSD) for solving the linear system Ax = b with A being symmetric positive de nite.
The preconditioner in this project is from the Jacobi iterative method: P = D = diag(A). The error
estimator in both iterative methods is the relative residual jjrkjj
jjbjj where rk = b􀀀Axk, and jjbjj is length
of b. We choose the tolerance  = 10􀀀10.

1. Implement the (forward) Gauss-Seidel method for solving Ax = b. Download the matrix A.dat
from Project 2 in Brightspace. Read the matix A (16001600) into your code. Generate b = Axt
with all components of xt being one. Run your code, and output the iteration number k and
relative residual jjrkjj
jjbjj into a le. Plot the relative residual as a function of iteration number. An
example of such plot is Figure 2.4 on Page 128 of the textbook (2nd Edition, attached).
2. Implement the preconditioned steepest descent method. The algorithm: selecting x0, computing
r0 = b 􀀀 Ax0, then:

Simulation based Process Scheduling

Explore multiple Schedulers and what OS tracks

You will learn how to write a Discrete Event Simulator

Specification of Process and Random Behavior
0 200 40 90
40 100 10 40
50 20 10 10
60 200 5 20
Process
1
Process
4
process
arrival/start
Total
CPU time
CPU
Burst
IO
Burst

Cpu
burst = random ( range [ 1.. Proc.cpu_burst ]

Ioburst
= random ( range [ 1.. Proc.io_burst ]

Project

Part 1
Write a MIPS assembly language subroutine called PrintName that displays your username to
the console in SPIM. Use the syscalls discussed in class and in Appendix A of the textbook to
accomplish this. Your program should be named project_1_part_1.s. Submit ONLY the
subroutine, using the template found on the Projects page of the course website. Your
subroutine will be graded by making multiple subroutine calls (using jal PrintName) to
PrintName to ensure that it works repeatedly.

Part 2
Write a MIPS assembly language subroutine called GetCode that asks the user to enter a 7 bit
code consisting of ones and zeros. When the user is finished entering the data, they should hit
the Enter key. The data should be stored in memory as a NULL terminated ASCII string at the
address passed into the routine in register a1. The user should be prompted for the data by
displaying a prompt to the console asking them to enter the data. These prompts can be stored in
the beginning of the data segment, and should not reside outside of the range 0x10000000
through 0x1000FFFF in memory. Use the syscalls discussed in class and in Appendix A of the
textbook to implement this subroutine. Your program should be named project_1_part_2.s.
Submit ONLY the subroutine, using the template found on the Projects page of the course
website. Your subroutine will be graded by making multiple subroutine calls (using jal
GetCode) to GetCode to ensure that it works repeatedly.

Part 3
Write a MIPS assembly language subroutine called GetOnes that accepts a pointer to a NULL
terminated ASCII string in register a1. The routine should count the number of ones (ASCII
value 0x31) in the string, and return that number is register v0. The string should not be
modified by your routine. Your program should be named project_1_part_3.s. Submit ONLY
the subroutine, using the template found on the Projects page of the course website. Your
subroutine will be graded by making multiple subroutine calls (using jal GetOnes) to GetOnes to
ensure that it works repeatedly.

Part 4
Write a MIPS assembly language program that tests Parts 2 and 3. The program should allow
the user to repeatedly enter a code (calling GetCode) and obtain the number of ones in the code
(calling GetOnes). When the number of ones are obtained, that result should be displayed on the
screen. The program should end when the user decides to quit. Your program should be named
project_1_part_4.s.

Parrallel Gol

Programming Project #3: A parallel distributed implementation of Conway’s Game of Life (GoL) // orat least my version of it!

Assignment type: You can work in team sizes of at most two including yourself. If you want to workindividually, that’s fine too. But either way you should fill out the cover sheet. If you are working in teams,please ensure you are compliant with the submission requirements (see last section).

You are expected to the Pleiades cluster for this project.

Preliminaries: This project assumes that you have already understood how to write and test a simplehelloworld.c (https://wsu.instructure.com/courses/1650006/files/108598122?wrap=1)(https://wsu.instructure.com/courses/1650006/files/108598122/download?download_frd=1) MPI program onthe cluster. Please read the documentation(https://wsu.instructure.com/courses/1650006/pages/clusterinstructions) on how to use the Pleiadescluster, and how to compile and run an MPI job on the cluster before you work on this assignment.

For this project you may need to use one or more of the following MPI communication primitives:

System Integration

Background Information: Study Abroad – Assignment 2 by Dr. Peter

In recent years, the opportunity to incorporate international study into their degrees has become increasinglyaccessible for Australian students. Pursuing a study abroad experience is viewed as a pathway to enhancedemployability after graduation by students and as a valuable point of distinction among job applicants byemployers. Typically lasting one semester, these study abroad programs can also extend up to a year for thoseseeking a more extended international academic experience.

The Application Process:

The high-level process architecture of a study abroad application covers the core, support, and managementprocess. The core processes include activities of a student study abroad application and the studentadministration team revolving around an application (figure 1).

Project Brief (4 marks)

Provide a description on: the group meeting schedule or plan, each member’s role in the project, identifying risksand ways to manage risks, ways to communicate between members, and a log recording each meeting,discussion and dialogue between members in relation to the project. Note: all members need to contribute to theproject. The description and log need to be submitted together with the four tasks.

Task 1 Collaborative Business Process and EAI (4 marks)

When a student puts an application to study a course aboard, we need check her/his degree program structure,the relevant information about the course available in the partner’s university, to judge if the selected coursemeets the required credit points and the contents, and to check if her/his degree program allows such selection.A degree program structure normally contains the total credit points required, the core units required at differentlevel; while the unit information includes the prerequisite units and offering semester. Different universities mayhave different degree rules and course requirements.

MIPS Assembly

Part 1

Write a MIPS assembly language subroutine called PrintName that displays your username tothe console in SPIM. Use the syscalls discussed in class and in Appendix A of the textbook toaccomplish this. Your program should be named project_1_part_1.s. Submit ONLY thesubroutine, using the template found on the Projects page of the course website. Yoursubroutine will be graded by making multiple subroutine calls (using jal PrintName) toPrintName to ensure that it works repeatedly.

Part 2

Write a MIPS assembly language subroutine called GetCode that asks the user to enter a 7 bitcode consisting of ones and zeros. When the user is finished entering the data, they should hitthe Enter key. The data should be stored in memory as a NULL terminated ASCII string at theaddress passed into the routine in register a1. The user should be prompted for the data bydisplaying a prompt to the console asking them to enter the data. These prompts can be stored inthe beginning of the data segment, and should not reside outside of the range 0x10000000through 0x1000FFFF in memory. Use the syscalls discussed in class and in Appendix A of thetextbook to implement this subroutine. Your program should be named project_1_part_2.s.Submit ONLY the subroutine, using the template found on the Projects page of the coursewebsite. Your subroutine will be graded by making multiple subroutine calls (using jalGetCode) to GetCode to ensure that it works repeatedly.

Part 3

Write a MIPS assembly language subroutine called GetOnes that accepts a pointer to a NULLterminated ASCII string in register a1. The routine should count the number of ones (ASCIIvalue 0x31) in the string, and return that number is register v0. The string should not bemodified by your routine. Your program should be named project_1_part_3.s. Submit ONLYthe subroutine, using the template found on the Projects page of the course website. Yoursubroutine will be graded by making multiple subroutine calls (using jal GetOnes) to GetOnes toensure that it works repeatedly.

Part 4

Write a MIPS assembly language program that tests Parts 2 and 3. The program should allowthe user to repeatedly enter a code (calling GetCode) and obtain the number of ones in the code(calling GetOnes). When the number of ones are obtained, that result should be displayed on thescreen. The program should end when the user decides to quit. Your program should be namedproject_1_part_4.s.

Designing High Performant Systems for AI

1 [45+10 points]

Convolution of an image with a filter is a key operation in Convolutional Neural Networks(CNN). Consider the following algorithm to perform convolution between an H × W imageI using a k × k kernel K to produce output image O.

Listing 1: Naive Convolution Algorithm

1 for (Ix = 0; Ix < W; Ix++)

2 for (Iy = 0 ; Iy < H; Iy++)

3 for (Kx = 0; Kx < k; Kx++)

4 for (Ky = 0; Ky <k; Ky++) {

5 If (Ix+kx < W && Iy+Ky < H )

6 O[Ix][Iy] = O[Ix][Iy] + I[Ix+Kx][Iy+Ky]*K[Kx][Ky];

7 }

a) What are the dimensions of the output image O? [4 points]

b) What are the average data reuse rates of the elements of I, K and O in the algorithmabove? Data reuse rate is defined as the average of the ratio of the number of computeoperations performed on a data item and the number of times a data item is fetched.Assume there is no cache. Explicitly specify any other assumptions regarding localmemory that you make. [4.5 points]

c) What is the average data reuse rate of the entire algorithm? [1.5 points]Now consider the processor-memory architecture considered in the class. 2 pipelinesrunning at 2 Ghz. Memory latency of 10 ns (20 cycles) and a memory bandwidth of 64 bitsat 1 Ghz. Assume each data item is 32 bit.

c) Calculate the Sustained Performance (Best Case) of the algorithm. Use assumption 3that we discussed in the class for calculations. Do not worry too much about the edgecases (when If case is false). [5 points]

d) Calculate the Sustained Performance (Worst Case) of the algorithm. Again use assumption 3 that we discussed in the class for calculations. [5 points]

Assume you have a cache of size k24. Now, answer the following questions.

e) If we redesign the algorithm such that chunks of size k24of input image I are fetchedinto the cache and all the convolution operations in which this chunk is involved areperformed, what will be the actual data reuse rates of O and K. Note that the datareuse rate of I will be k2 as we are ignoring the edge cases for simplicity. Also, notethat there can be multiple such potential redesigns, so explain the idea of your redesignclearly. [10 points]

f) If we redesign the algorithm such that chunks of size k24of filter K are fetched into thecache and all the convolution operations in which this chunk is involved are performed,what will be the actual data reuse rates of O and I. Note that the data reuse rate ofK will be H × W. Also, note that there can be multiple such potential redesigns, soexplain the idea of your redesign clearly. [10 points]

g) Which of the above two redesigns of the algorithm will lead to maximizing the overalldata reuse, i.e., data reuse for all the data items combined ? [5 points]

h) (Extra Credits towards overall grade) Can you redesign the algorithm so that the datareuse rate of O turns out to be k2? Write the pseudo code for the same to receive theextra credits. [10 points]