Group Course Project 2024
Deadline: 23:59, June 5th 2024
Attention: Students should first understand the project requirements, and then work out the workflow of the program, including the main function and sub-functions. The project requires knowledge of file reading, which can be found in your textbook and lecture notes.
Requirements
1. In this project, please use A, B, C, …, S as the matrix names in the program.
A well-structured program should include some sub-functions for special purpose, for example: Reshape, Multiply, Add, Activate and SoftMax. (If your program has only one main function, you might lose some marks for non-functional requirements, but this will not affect marks for functional requirements.)
2. If a user inputs one image file name (e.g., p1.pgm), then output the file name and the digit after the recognition (e.g., p1.pgm: 9)
3. If a user inputs more than one image file name (e.g., p1.pgm, p2.pgm), the program should output each file name and its corresponding digit in the list (e.g., p1.pgm: 5, p2.pgm: 2).
4. After the output is given, there should be a prompt asking whether the user wants to continue. The user inputs Y to continue and N otherwise. If the user inputs Y, ask the user to input image file name(s) again and repeat the previous work.
Sample I/O:
5. Take a screenshot of the result of each number recognition and put the screenshots in a pdf file. The same pdf file should also include the names of the group members. Each group only needs to submit ONE project!
6. When you submit your program, submit all the files, including the pdf, pgm, .c, and .h (if there are any) files. If the program has compile errors, no score will be given.
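Requirements 3 and 4 can be handled by two small helpers, sketched below. The names split_file_names and wants_to_continue are hypothetical, and the choice of commas and whitespace as separators (and of accepting a lowercase y) is an assumption, since the handout only shows "p1.pgm, p2.pgm" and "Y"/"N".

```c
#include <string.h>

/* Hypothetical helper: split a line such as "p1.pgm, p2.pgm" into file
   names. Assumption: names are separated by commas and/or whitespace.
   The line is modified in place and names[] points into it. */
int split_file_names(char *line, char *names[], const int max_names)
{
    int count = 0;
    char *token = strtok(line, ", \t\r\n");
    while (token != NULL && count < max_names) {
        names[count++] = token;           /* remember where this name starts */
        token = strtok(NULL, ", \t\r\n"); /* move on to the next name */
    }
    return count;                         /* number of names found */
}

/* Hypothetical helper: returns 1 if the answer starts with Y (or y,
   which is an assumption), and 0 otherwise. */
int wants_to_continue(const char *answer)
{
    return answer[0] == 'Y' || answer[0] == 'y';
}
```

In main, one possible loop is: read a line with fgets, split it, recognize and print each "name: digit" pair, then read the Y/N answer and stop when wants_to_continue returns 0.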
Project Detail
In this project, students will use the trained data to write a program that recognizes an image of a handwritten digit, for example,
[image of a handwritten digit] → 0
The data are trained by a neural network, which is a computational model inspired by how biological neural networks function in the human brain. Neural networks consist of layers of interconnected nodes, often referred to as neurons. In the context of machine learning, neural networks are used for tasks such as pattern recognition, classification, regression, and more. A reference is provided in case you are interested in this topic.
For the model, we have built 1 Input Layer, 2 Hidden Layers, and 1 Output Layer:
- Input Layer: Utilizes Flatten to convert the 2D array of 28x28 pixels into a 1D array of 784 pixels. It’s necessary to flatten the input before passing it to the dense layers.
- Hidden Layers: Each of these layers contains 128 neurons and utilizes the ReLU activation function.
- Output Layer: This layer consists of 10 neurons, each representing a digit from 0 to 9, and utilizes the Softmax activation function to obtain the probability for each digit.
We have already trained the data and stored the resulting weights and biases in .txt files. For more information about the weights and biases, please refer to the reference provided. W1.txt and W2.txt correspond to the weights of the first and the second hidden layers, respectively. B1.txt and B2.txt are the biases. By using these data, you can write a program that recognizes a single handwritten digit.
The image is stored in a file. To recognize the image, the image information in the file must be read and stored in a two-dimensional array (matrix), and a certain algorithm is used to calculate the similarity of the handwritten digit to each of the 10 digits from 0 to 9. The digit with the highest similarity will be the output. For example,
[Figure 1: From image to array (Phase I), then from array to digit (Phase II); the example image is recognized as 0]
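Picking "the digit with the highest similarity" amounts to finding the index of the largest entry in the final 1-by-10 vector. A minimal sketch follows; the function name most_likely_digit is hypothetical and not part of the suggested declarations.

```c
/* Hypothetical helper: return the index of the largest entry of a
   1-by-n vector. For a 1-by-10 score vector the index is the digit. */
int most_likely_digit(const int n, const double scores[1][n])
{
    int best = 0;                          /* index of the largest value so far */
    for (int i = 1; i < n; i++) {
        if (scores[0][i] > scores[0][best]) {
            best = i;                      /* found a larger score */
        }
    }
    return best;
}
```

If several entries tie, this sketch returns the first one, which is one possible convention.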
You may follow the steps below to write a program that recognizes digits, or you can write your own code to do this. You will get full credit if your program recognizes the digits correctly.
Phase I: Prepare the necessary functions
(1) Write a function to read a pgm file and store in a 28-by-28 matrix:
- Input: A file that stores an image with 28×28 pixels. Each pixel can be represented as an integer between 0 and 255 (inclusive).
- Output: Matrix A (28×28).
- Declare an array double A[28][28] (double matches the declaration suggested below). Use the fgets function, which was covered in Chapter 12, to skip the first four lines of the file, which contain some basic information about the image.
- Use fgetc function repeatedly to read each pixel from the fifth line until 784 chars have been obtained. Store these chars in the 28-by-28 matrix.
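- Normalize each pixel value by dividing it by 255.0 and store the result in the array (Matrix A). Read each byte into an int rather than a char: a char may be signed and cannot hold the fractional result.
int pixel = fgetc(fp);            /* fgetc returns an int: 0..255, or EOF */
A[i][j] = (double)pixel / 255.0;  /* i, j: row and column index; value in [0, 1] */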
You may declare the function as
void pgm_to_matrix(const char *file_path, const int n_rows, const int n_cols,
double A[n_rows][n_cols]);
Example: if you read the file example/example0.pgm, then your matrix should match the following when each value is printed with 2 decimal digits.
→
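The steps in (1) can be sketched as follows. This is only one possible implementation: it assumes a binary PGM file whose header occupies exactly four lines, as described above, and it simply returns early (leaving A partially filled) if the file cannot be read.

```c
#include <stdio.h>

/* Sketch of pgm_to_matrix. Assumption: the header is exactly four lines
   and is followed by n_rows * n_cols raw bytes, one per pixel. */
void pgm_to_matrix(const char *file_path, const int n_rows, const int n_cols,
                   double A[n_rows][n_cols])
{
    char line[256];                        /* buffer for one header line */
    FILE *fp = fopen(file_path, "rb");     /* binary mode: pixels are raw bytes */
    if (fp == NULL) {
        perror(file_path);                 /* report why the file could not be opened */
        return;
    }
    for (int i = 0; i < 4; i++) {          /* skip the four header lines */
        if (fgets(line, sizeof line, fp) == NULL) {
            fclose(fp);
            return;
        }
    }
    for (int i = 0; i < n_rows; i++) {
        for (int j = 0; j < n_cols; j++) {
            int pixel = fgetc(fp);         /* 0..255, or EOF if the file is too short */
            if (pixel == EOF) {
                fclose(fp);
                return;
            }
            A[i][j] = (double)pixel / 255.0;   /* normalize to [0, 1] */
        }
    }
    fclose(fp);
}
```

A call such as pgm_to_matrix("example/example0.pgm", 28, 28, A) followed by printing A with "%.2f" gives output that can be compared against the example.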
(2) Write a function to reshape an m-by-n matrix to a 1-by-(m*n) matrix. For example, if the 2-by-2 matrix is [1 2; 3 4], then the output should be a 1-by-4 matrix [1 2 3 4]. You may declare the function as
void matrix_to_vector(const int n_rows, const int n_cols, const double matrix[n_rows][n_cols], double vector[1][n_rows*n_cols]);
(3) Write a function to convert a 1-by-(m*n) matrix to an m-by-n matrix, which is the reverse procedure of (2). For example, if the input is a 1-by-4 matrix [1 2 3 4], then the output should be the 2-by-2 matrix [1 2; 3 4].
You may declare the function as
void vector_to_matrix(const int n_rows, const int n_cols,
const double vector[1][n_rows*n_cols],
double matrix[n_rows][n_cols]);
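The two reshape functions in (2) and (3) can be sketched as below. Both use the row-by-row order implied by the [1 2; 3 4] example, so element (i, j) of the matrix corresponds to position i*n_cols + j of the vector.

```c
/* Sketch of (2): copy the matrix row by row into a 1-by-(m*n) vector. */
void matrix_to_vector(const int n_rows, const int n_cols,
                      const double matrix[n_rows][n_cols],
                      double vector[1][n_rows * n_cols])
{
    for (int i = 0; i < n_rows; i++) {
        for (int j = 0; j < n_cols; j++) {
            vector[0][i * n_cols + j] = matrix[i][j];
        }
    }
}

/* Sketch of (3): the reverse copy, from the vector back into the matrix. */
void vector_to_matrix(const int n_rows, const int n_cols,
                      const double vector[1][n_rows * n_cols],
                      double matrix[n_rows][n_cols])
{
    for (int i = 0; i < n_rows; i++) {
        for (int j = 0; j < n_cols; j++) {
            matrix[i][j] = vector[0][i * n_cols + j];
        }
    }
}
```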
(4) Write a function to read data in a txt file and store it in a 1-by-n matrix, where n is the number of elements in the txt file. You may create a simple txt file to verify that your function is correct. You may declare the function as
void read_txt_data(const char *filename, const int n, double array[1][n]);
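A minimal sketch of (4) is shown below. It assumes the numbers in the txt file are separated by whitespace (spaces or newlines); the handout does not state the file layout, so check the actual files before relying on this.

```c
#include <stdio.h>

/* Sketch of read_txt_data. Assumption: the file holds at least n numbers
   separated by whitespace. On a short or malformed file, the remaining
   entries of array are left unchanged and a message is printed. */
void read_txt_data(const char *filename, const int n, double array[1][n])
{
    FILE *fp = fopen(filename, "r");
    if (fp == NULL) {
        perror(filename);                  /* report why the file could not be opened */
        return;
    }
    for (int i = 0; i < n; i++) {
        /* %lf reads one double and skips any leading whitespace */
        if (fscanf(fp, "%lf", &array[0][i]) != 1) {
            fprintf(stderr, "%s: expected %d values, read %d\n", filename, n, i);
            break;
        }
    }
    fclose(fp);
}
```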
(5) Write a function to add two matrices. Please refer to the Linear Algebra we learned last semester on how to add two matrices. You may declare the function as
void add_matrices(const int n_rows, const int n_cols, const double mat1[n_rows][n_cols],
const double mat2[n_rows][n_cols], double result[n_rows][n_cols]);
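Matrix addition is element-wise, so (5) can be sketched with two nested loops:

```c
/* Sketch of add_matrices: result[i][j] = mat1[i][j] + mat2[i][j]. */
void add_matrices(const int n_rows, const int n_cols,
                  const double mat1[n_rows][n_cols],
                  const double mat2[n_rows][n_cols],
                  double result[n_rows][n_cols])
{
    for (int i = 0; i < n_rows; i++) {
        for (int j = 0; j < n_cols; j++) {
            result[i][j] = mat1[i][j] + mat2[i][j];
        }
    }
}
```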
(6) Write a function to multiply two matrices. Please refer to the Linear Algebra we learned last semester on how to do matrix-matrix multiplication. You may declare the function as
void matrix_mul(const int n_rows, const int n_cols, const int p, const double matrix1[n_rows][p],
const double matrix2[p][n_cols], double matrix_product[n_rows][n_cols]);
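A sketch of (6) using the standard triple loop, where entry (i, j) of the product is the dot product of row i of matrix1 and column j of matrix2:

```c
/* Sketch of matrix_mul: (n_rows x p) times (p x n_cols) gives (n_rows x n_cols). */
void matrix_mul(const int n_rows, const int n_cols, const int p,
                const double matrix1[n_rows][p],
                const double matrix2[p][n_cols],
                double matrix_product[n_rows][n_cols])
{
    for (int i = 0; i < n_rows; i++) {
        for (int j = 0; j < n_cols; j++) {
            double sum = 0.0;              /* accumulates the dot product */
            for (int k = 0; k < p; k++) {
                sum += matrix1[i][k] * matrix2[k][j];
            }
            matrix_product[i][j] = sum;
        }
    }
}
```

A quick check by hand: [1 2; 3 4] times [5 6; 7 8] should give [19 22; 43 50].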
(7) Activate function. Write a function that takes the maximum of the m-th element of the vector F and 0, and stores it in G:
Activate: G[m] = max(F[m], 0), m = 0, 1, ..., SIZE-1.
You may declare the function as
void active_function(const int SIZE, const double F[1][SIZE],
double G[1][SIZE]);
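The activation in (7) is the ReLU function mentioned in the model description. A minimal sketch, using a conditional expression instead of a library max function:

```c
/* Sketch of active_function: G[0][m] = max(F[0][m], 0) for every m. */
void active_function(const int SIZE, const double F[1][SIZE],
                     double G[1][SIZE])
{
    for (int m = 0; m < SIZE; m++) {
        G[0][m] = (F[0][m] > 0.0) ? F[0][m] : 0.0;  /* negative values become 0 */
    }
}
```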
(8) SoftMax. The softmax of a vector L is given by
S[i] = exp(L[i] - max(L)) / (sum over all j of exp(L[j] - max(L))),
where max(L) is the maximum value in Matrix L. For example, if L=[-1, 0.2, 4, -19, 10], then max(L) is 10. C has no built-in max function for doubles; fmax and exp are declared in <math.h>, so put #include <math.h> at the beginning of the program if you use them. You may define the function as
void soft_max(const int n_rows, const int n_cols,
const double L[n_rows][n_cols], double S[n_rows][n_cols]);
Note:
1. Always check the correctness of your function by some easy examples!
2. Include necessary comments for each line of your code so that it can be understood easily.
Phase II: Single handwritten digit recognition. You may use the data in the example folder to verify your code.
Step 1: Read the pgm file and reshape the matrix to a 1 × 784 matrix by using the functions you wrote in Phase 1. If you follow the suggested declarations, they are pgm_to_matrix and matrix_to_vector. Using the example file “example/example0.pgm”, the matrices are as follows.
Example:
Matrix A (28×28)
↓ Reshape
Matrix B (1×784): [0.00 … 0.04 0.59 0.99 … 0.00] (for details, refer to the file example/B.txt)
In this step, you should name the 28×28 matrix A and the 1×784 matrix B.
Step 2: Apply the weights of the first hidden layer. The weights are stored in input_files/W1.txt. Use the function matrix_mul from Phase 1 to multiply B and C:
- read the weights from input_files/W1.txt and store in a 1×100352 array, you may name it W1
- use the function vector_to_matrix to reshape the array to a 784×128 array named C
Note that the first 784 elements in input_files/W1.txt should be the first row of C.
- use the function matrix_mul to multiply B and C, and store the result in D. For example, following the suggested declaration, you may call the function by
matrix_mul(n_rows_B, n_cols_C, n_cols_B, B, C, D);
where you have to define n_rows_B, n_cols_C, and n_cols_B.
Input: Matrix B (1×784 array from Step 1) Matrix C (784×128 array)
Output: Matrix D.
Multiplication: Matrix D is the multiplication of Matrix B and Matrix C.
Example:
Matrix B (1×784): [0.00 … 0.04 0.59 0.99 … 0.00] (for details, refer to the file example/B.txt) and Matrix C (read from the given file W1.txt)
↓ Multiply
Matrix D (1×128): [-1.624571 -2.204840 -3.310512 -0.760301 -0.164472 …] (for details, refer to the file example/D.txt)
Step 3: Apply the biases of the first hidden layer. The biases are stored in input_files/B1.txt.
Input: Matrix D (1×128 array from Step 2) and Matrix E (1×128 array from file input_files/B1.txt)
Output: Matrix F (1×128 array)
Add: F[m][n] = D[m][n] + E[m][n]
Example:
Matrix D (1×128): [-1.624571 -2.204840 -3.310512 -0.760301 -0.164472 …] (for details, refer to the file example/D.txt) and Matrix E (read from the given file B1.txt)
↓ Add
Matrix F (1×128): [-1.681936 -2.193661 -3.388107 -0.646890 -0.148641 -1.297442 0.836912 …] (for details, refer to the file example/F.txt)
Figure 5: Phase II, first add
Step 4: Activation. Use the function active_function.
Input: Matrix F (1×128 array from Step 3)
Output: Matrix G (1×128 array)
Activate: G[m] = max(F[m], 0)
Example:
Matrix F (1×128): [-1.681936 -2.193661 -3.388107 -0.646890 -0.148641 -1.297442 0.836912 …] (for details, refer to the file example/F.txt)
↓ Activate
Matrix G (1×128): [0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.836912 …] (for details, refer to the file example/G.txt)
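Steps 2 to 4 together compute G = max(B×C + E, 0). As a cross-check on small hand-made inputs, the three operations can be fused into one loop, as sketched below. The function name hidden_layer is hypothetical and used only for illustration; the project itself asks for separate Multiply, Add and Activate functions.

```c
/* Hypothetical cross-check helper: multiply, add the bias, then activate.
   B is 1-by-n_in, C is n_in-by-n_out, E and G are 1-by-n_out. */
void hidden_layer(const int n_in, const int n_out,
                  const double B[1][n_in], const double C[n_in][n_out],
                  const double E[1][n_out], double G[1][n_out])
{
    for (int j = 0; j < n_out; j++) {
        double d = 0.0;                    /* Step 2: D[0][j] = sum over k of B[0][k] * C[k][j] */
        for (int k = 0; k < n_in; k++) {
            d += B[0][k] * C[k][j];
        }
        double f = d + E[0][j];            /* Step 3: F = D + E */
        G[0][j] = (f > 0.0) ? f : 0.0;     /* Step 4: G = max(F, 0) */
    }
}
```

For example, with B = [1 2], C = [1 -1 0; 0.5 -2 1] and E = [0.5 0 -3], the intermediate values are D = [2 -5 2] and F = [2.5 -5 -1], so G = [2.5 0 0].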
Step 5: Apply the weights of the second hidden layer. The weights are stored in input_files/W2.txt. This step is similar to Step 2, but uses different matrices and weights. input_files/W2.txt contains one long vector; as in Step 2, you should store it in an array of size 1×1280 and reshape it to a 128×10 matrix. Note that the first 128 elements in input_files/W2.txt should be the first row of the 128×10 matrix.
Repeat Step 2: replace B with G and replace W1.txt with W2.txt. The output is Matrix H.
Example:
Matrix G (1×128): [0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.836912 …] (for details, refer to the file G.txt) and the matrix from the file W2.txt
↓ Second multiply
Matrix H (1×10): [13.662090 -15.919009 -1.698807 -7.702862 -6.640148 -7.278686 0.050854 1.045354 -7.263999 -1.825507]