2024-08-17

Assignment 1

Big Data and Machine Learning for Economics and Finance

Provide a document that contains your answers, R code, code output, and comments on the code output. Comment each line of your R code as well. Give thorough explanations throughout.

Exercise 1. (75 points) For this exercise, download the file Assignment1_data1.csv and create a data frame called Data1.

I. Consider the following figure constructed from the dataset Data1.

Figure 1. Some box plots

a) Write the R code to reproduce that figure.

b) What are the conditioning variables in those plots? Give a thorough interpretation.

II. Consider another figure constructed from the same dataset.

Figure 2. Some histograms

a) Write the R code to reproduce that figure.

b) Describe the distributional information that lies behind each histogram.

III. Consider the following prediction models:

a) Predict Y given Z.

b) Predict Y given X.

c) Predict Y given X and Z.

d) Predict Y given X, Z, and XZ. (XZ stands for the "product" of X and Z and is called an interaction variable.)

Given X = 2.7 and Z = "B", give the prediction from each of the four models (using the function predict is not allowed in this question).
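As an illustration of how a prediction can be assembled by hand from fitted coefficients, here is a minimal sketch for model d). It does not use Data1: the data frame toy is simulated and purely hypothetical, and it assumes that Y is numeric and that Z is a factor with levels "A", "B" and "C" (only the level "B" is confirmed by the question). The coefficient names ZB and X:ZB follow from R's default treatment contrasts under that assumption.

```r
# Hypothetical stand-in for Data1 (the real file is not reproduced here).
set.seed(1)
toy <- data.frame(
  X = runif(60, 0, 5),                                       # numeric predictor
  Z = factor(sample(c("A", "B", "C"), 60, replace = TRUE))   # assumed factor levels
)
toy$Y <- 1 + 2 * toy$X + ifelse(toy$Z == "B", 3, 0) + rnorm(60)

# Model d): Y on X, Z and the interaction of X and Z.
fit_d <- lm(Y ~ X * Z, data = toy)
b <- coef(fit_d)   # named vector: (Intercept), X, ZB, ZC, X:ZB, X:ZC

# Manual prediction at X = 2.7 and Z = "B":
# baseline intercept + baseline slope * X
# + intercept shift for level B + extra slope for level B * X.
x0 <- 2.7
pred_manual <- b[["(Intercept)"]] + b[["X"]] * x0 + b[["ZB"]] + b[["X:ZB"]] * x0
pred_manual
```

Models a) to c) would follow the same pattern with the corresponding terms dropped from the formula and from the sum.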

IV. Write R code to reproduce the following plot. Interpret what the plot represents.

Figure 3. Some regression output plot

V. Referring back to question III, compute the training mean squared error for each model.

Deduce which model gives the best predictions.
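The training mean squared error is the average squared residual over the data used to fit the model. A minimal sketch, again on simulated and purely hypothetical data (the data frame toy, its two factor levels and the list name fits are assumptions, not part of the assignment):

```r
# Hypothetical stand-in for Data1.
set.seed(2)
toy <- data.frame(
  X = runif(50, 0, 5),
  Z = factor(sample(c("A", "B"), 50, replace = TRUE))
)
toy$Y <- 0.5 + 1.5 * toy$X + rnorm(50)

# The four models from question III, fitted by least squares.
fits <- list(
  a = lm(Y ~ Z,     data = toy),   # Y given Z
  b = lm(Y ~ X,     data = toy),   # Y given X
  c = lm(Y ~ X + Z, data = toy),   # Y given X and Z
  d = lm(Y ~ X * Z, data = toy)    # Y given X, Z and XZ
)

# Training MSE: mean of the squared residuals on the fitting sample.
train_mse <- sapply(fits, function(f) mean(residuals(f)^2))
train_mse
```

Because models a) to c) are nested in model d), least squares cannot give d) a larger training MSE than any of the others, so a comparison on training data alone tends to favour the most flexible model.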

Exercise 2. (25 points) I asked ChatGPT the following question:

Question: Assume we have the following sample of the two variables X and Y: (1.1, 3.2, -2, 5, 0.1, -1, 4) and (A, B, C, A, C, B, A), respectively. Using the 3-nearest neighbour regression method, predict Y when X = 2.

ChatGPT gave the following long answer:

To predict Y for X = 2 using 3-nearest neighbor regression, follow these steps:

1. Calculate the distances between the new point X = 2 and each point in the dataset.

- For X = 1.1, the distance is | 2 - 1.1 | = 0.9.

- For X = 3.2, the distance is | 2 - 3.2 | = 1.2.

- For X = -2, the distance is | 2 - (-2) | = 4.

- For X = 5, the distance is | 2 - 5 | = 3.

- For X = 0.1, the distance is | 2 - 0.1 | = 1.9.

- For X = -1, the distance is | 2 - (-1) | = 3.

- For X = 4, the distance is | 2 - 4 | = 2.

2. Sort these distances in ascending order and identify the three smallest distances.

The distances are: 0.9, 1.2, 1.9, 2, 3, 3, 4.

The three smallest distances are 0.9, 1.2, and 1.9.

3. Find the corresponding Y values for these distances:

- Distance 0.9 corresponds to Y = 3.2 (from X = 1.1).

- Distance 1.2 corresponds to Y = 5 (from X = 3.2).

- Distance 1.9 corresponds to Y = 0.1 (from X = 0.1).

4. Compute the average of these Y values to predict Y for X = 2:

Predicted Y = (3.2 + 5 + 0.1)/3 = 8.3/3 ≈ 2.77

So, the predicted Y for X = 2 using the 3-nearest neighbor regression method is approximately 2.77.

Find all the mistakes in ChatGPT's output (if there are any). If ChatGPT's answer is incorrect, provide the correct answer. Justify all arguments thoroughly.
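The neighbour search itself can be checked with a few lines of base R. The sketch below uses the sample exactly as stated in the question and assumes that Y is treated as a categorical label, so the neighbours' labels are tabulated rather than averaged. The variable names are illustrative, and how to turn the tabulated labels into a final prediction is left open.

```r
# Sample from the question: X is numeric, Y holds the labels.
X <- c(1.1, 3.2, -2, 5, 0.1, -1, 4)
Y <- c("A", "B", "C", "A", "C", "B", "A")

x_new <- 2   # point at which to predict
k <- 3       # number of neighbours

# Absolute distance from each sample point to x_new.
d <- abs(X - x_new)

# Indices of the k smallest distances.
nearest <- order(d)[1:k]

# Labels of the k nearest neighbours and their relative frequencies.
neighbour_labels <- Y[nearest]
label_share <- table(neighbour_labels) / k
label_share
```

If two or more labels share the highest frequency, a tie-breaking rule has to be stated explicitly before a single label can be reported.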
