BSAN3210 Technology of Business Analytics Semester 2 2024

2024-11-11

The School-based Take-Home Assessment (A3)

Course code and name

BSAN3210 Technology of Business Analytics

Semester

Semester 2 2024

Assessment type

School Based Take-home Assessment

Due date and time

The assessment will be available from 9am on the 7th of November.

The assessment is due at 5pm on the 14th of November.

Please note: you will not be able to access the assessment task after this time.

Assessment window

You have a seven-day window in which you must complete your assessment. You can access and submit your assessment at any time within this seven-day window.

Weighting

This assessment is worth 30 percent of your total mark for this course.

Permitted materials

This is an open book assessment – all materials permitted (including online resources).

Required/recommended materials

To complete this assessment successfully you will need a computer with the following software: Microsoft Word, R, and RStudio.

Instructions

The assessment consists of 7 questions worth a total of 100 points. Questions 1, 2 and 5 are worth 10 points each, Questions 3 and 6 are worth 15 points each, and Questions 4 and 7 are worth 20 points each. Please answer all of the questions and clearly label each answer with reference to the question it addresses. Submit your answers to all seven questions as a Microsoft Word document.

Assessment extension/deferral

Please begin your assessment as soon as possible within the available window. However, if you become unwell or experience exceptional circumstances while completing this assessment, submit an extension request before the due date/time: https://my.uq.edu.au/information-and-services/manage-my-program/exams-and-assessment/applying-extension

Who to contact

Should you have any issues with the assessment task, you should contact Zara Taba using the course email address (z.taba@uq.edu.au).

Important assessment conditions

The normal academic integrity rules apply to this assessment task.

You cannot cut-and-paste material other than your own work as answers.

You are not permitted to consult any other person – whether directly, online, or through any other means – about any aspect of this assessment during the period that this assessment is available.

If it is found that you have given or sought outside assistance with this assessment then that will be deemed to be cheating and will result in disciplinary action.

By undertaking this online assessment you will be deemed to have acknowledged UQ’s academic integrity pledge and to have made the following declaration:

“I certify that my submitted answers are entirely my own work and that I have neither given nor received any unauthorised assistance on this assessment item”.

Question 1 (10 marks)

a) Why is a data repository essential for organisations in the modern digital landscape? Are there specific scenarios where a data repository may not be necessary or may be overkill?

b) Describe the various architectures used for data repositories, such as data warehouses and data lakes. What are the key differences between these architectures, and in what situations might each be best applied?

c) Evaluate the advantages and limitations of each data repository architecture.

d) Explore the role of data repositories in supporting advanced analytics, AI, and machine learning initiatives. How can the choice of architecture influence an organisation’s ability to leverage data-driven insights?

Question 2 (10 marks)

a) How do "errors" and "bugs" differ in the context of software development?

b) Why is software testing a critical part of the development lifecycle?

c) Describe the various levels of software testing.

d) In your own project experience, which levels of testing have you implemented? Can you discuss specific examples or challenges you encountered?

Question 3 (15 marks)

The Cybersecurity Framework (discussed in the lecture notes) outlines a seven-step process for establishing or improving cybersecurity programs. Consider a small healthcare clinic that stores patient data electronically and relies on digital systems for daily operations. Write 100 words for each step on how you might use this Framework to manage cybersecurity in the clinic. Identify specific threats that are likely in this healthcare environment and how you would address them. Due to the brief word count, focus on one or two primary threats, such as data breaches or ransomware. List any assumptions about the clinic’s current cybersecurity status prior to this work.

Question 4 (20 marks)

Suppose you are working in an e-commerce company. Your team consists of 20 members, and each member is responsible for analysing customer feedback from 100 unique customers on recent purchases. They should gather information on the product type, customer age, and satisfaction level for each day over one week. Explain how you might use each of the following pre-processing steps in this context:

a) Data consolidation

b) Data cleansing

c) Data transformation

d) Data reduction

Write about 100 words for each bullet point above.

Question 5 (10 marks)

What are the key differences between the data.frame and data.table types in data manipulation? What are the advantages and limitations of each, and in what contexts might one be more suitable than the other? For advanced data manipulation tasks, which type provides greater efficiency and flexibility?

Question 6 (15 marks)

Using the “mtcars” dataset in R, answer the following questions. For each question, provide the exact R code you ran in RStudio and a copy of each plot (take a screenshot of the R plot).

a) Create a scatterplot and a bar plot using variables of your choice, each with an appropriate title and labels. (Include your R code here.)

b) Use a loop structure to produce 4 different plots with different colours and symbols. All 4 plots should appear in one window.

Question 7 (20 marks)

Install the ggplot2 package in R and use the “diamonds” dataset. Use the sqldf package to answer the following questions. You need to provide the R code and the R output:

a) Find the top 5 diamonds (highest price) for each combination of cut and colour, and report their average carat weight (use just one query; you may use a nested query).

b) Group the diamonds by clarity and provide the average price in each group, sorted in ascending order.