Assessment (non-exam) Brief
Module code/name | MSIN0154
Module leader name |
Academic year | 2024/25
Term | 1
Assessment title | Individual Assignment 1
Individual/group assessment | Individual
Submission deadlines: Students should submit all work by the published deadline date and time. Students experiencing sudden or unexpected events beyond their control that affect their ability to complete assessed work by the set deadlines may request mitigation via the extenuating circumstances procedure. Students with disabilities or ongoing, long-term conditions should explore a Summary of Reasonable Adjustments.
Return and status of marked assessments: Students should expect to receive feedback within one calendar month of the submission deadline, as per UCL guidelines. The module team will update you if there are delays through unforeseen circumstances (e.g. ill health). All results when first published are provisional until confirmed by the Examination Board.
Copyright Note to students: Copyright of this assessment brief is with UCL and the module leader(s) named above. If this brief draws upon work by third parties (e.g. Case Study publishers) such third parties also hold copyright. It must not be copied, reproduced, transferred, distributed, leased, licensed or shared with any other individual(s) and/or organisations, including web-based organisations, without permission of the copyright holder(s) at any point in time.
Academic Misconduct: Academic Misconduct is defined as any action or attempted action that may result in a student obtaining an unfair academic advantage. Academic misconduct includes plagiarism, self-plagiarism, obtaining help from or sharing work with others (whether individuals or organisations), or any other form of cheating that may result in a student obtaining an unfair academic advantage. Refer to Academic Manual Chapter 6, Section 9: Student Academic Misconduct Procedure - 9.2 Definitions.
Referencing: You must reference and provide full citation for ALL sources used, including AI sources, articles, text books, lecture slides and module materials. This includes any direct quotes and paraphrased text. If in doubt, reference it. If you need further guidance on referencing please see UCL’s referencing tutorial for students. Failure to cite references correctly may result in your work being referred to the Academic Misconduct Panel.
Use of Artificial Intelligence (AI) Tools in your Assessment: Your module leader will explain to you if and how AI tools can be used to support your assessment. In some assessments, the use of generative AI is not permitted at all. In others, AI may be used in an assistive role which means students are permitted to use AI tools to support the development of specific skills required for the assessment as specified by the module leader. In others, the use of AI tools may be an integral component of the assessment; in these cases the assessment will provide an opportunity to demonstrate effective and responsible use of AI. See page 3 of this brief to check which category use of AI falls into for this assessment. Students should refer to the UCL guidance on acknowledging use of AI and referencing AI. Failure to correctly reference use of AI in assessments may result in students being reported via the Academic Misconduct procedure. Refer to the section of the UCL Assessment success guide on Engaging with AI in your education and assessment.
For staff reference only: template version 1.0 September 2024
Content of this assessment brief
Section | Content
A | Core information
B | Coursework brief and requirements
C | Module learning outcomes covered in this assessment
D | Groupwork instructions (if applicable)
E | How your work is assessed
F | Additional information
Section A Core information
Submission date | 26/11/2024
Submission time | 10 am
Assessment is marked out of: | 100
% weighting of this assessment within total module mark | 30%
Section B Assessment Brief and Requirements
The assignment includes five questions. The first question focuses on the normal distribution. The second explores the concept of the value of information through conditional probability. The third involves working with a dataset and performing a series of hypothesis tests. The fourth applies probabilistic models in a business context. The fifth discusses how concepts from the course can be applied to a real-world case study.
Please refer to the attached assignment document at the end of this assessment brief for more details.
Section C Module Learning Outcomes covered in this Assessment
This assessment contributes towards the achievement of the following stated module Learning Outcomes as highlighted below:
• Understand key concepts in statistics.
• Interpret data using hypothesis testing.
• Make decisions under uncertainty.
• Apply statistical models to real-world business problems.
Section D: Groupwork Instructions (where relevant/appropriate)
N/A
Section E: How your work is assessed
Within each section of this assessment you may be assessed on the following aspects, as applicable and appropriate to this assessment, and should thus consider these aspects when fulfilling the requirements of each section:
• The accuracy of any calculations required.
• The strengths and quality of your overall analysis and evaluation;
• Appropriate use of relevant theoretical models, concepts and frameworks;
• The rationale and evidence that you provide in support of your arguments;
• The credibility and viability of the evidenced conclusions/recommendations/plans of action you put forward;
• Structure and coherence of your considerations and reports;
• Appropriate and relevant use of, as and where relevant and appropriate, real world examples, academic materials and referenced sources. Any references should use either the Harvard OR Vancouver referencing system (see References, Citations and Avoiding Plagiarism)
• Academic judgement regarding the blend of scope, thrust and communication of ideas, contentions, evidence, knowledge, arguments, conclusions.
• Each assessment requirement(s) has allocated marks/weightings.
Student submissions are reviewed/scrutinised by an internal assessor and are available to an External Examiner for further review/scrutiny before consideration by the relevant Examination Board.
It is not uncommon for some students to feel that their submissions deserve higher marks (irrespective of whether they actually deserve higher marks). To help you assess the relative strengths and weaknesses of your submission please refer to SOM Assessment Criteria Guidelines, located on the Assessment tab of the SOM Student Information Centre Moodle site.
The above is an important link as it specifies the criteria for attaining the pass/fail bandings shown below:
At UG Levels 4, 5 and 6:
80% to 100%: Outstanding Pass - 1st; 70% to 79%: Excellent Pass - 1st; 60% to 69%: Very Good Pass - 2.1; 50% to 59%: Good Pass - 2.2; 40% to 49%: Satisfactory Pass - 3rd; 20% to 39%: Insufficient to Pass - Fail; 0% to 19%: Poor and Insufficient to Pass - Fail.
At PG Level 7:
86% to 100%: Outstanding Pass - Distinction; 70% to 85%: Excellent Pass - Distinction; 60% to 69%: Good Pass - Merit; 50% to 59%: Satisfactory - Pass; 40% to 49%: Insufficient to Pass - Fail; 0% to 39%: Poor and Insufficient to Pass - Fail.
You are strongly advised to review these criteria before you start your work and during your work, and before you submit.
Upon receipt of your mark, you are strongly advised to not compare your mark with marks of other submissions from your student colleagues. Each submission has its own range of characteristics which differ from others in terms of breadth, scope, depth, insights, and subtleties and nuances. On the surface one submission may appear to be similar to another but invariably, digging beneath the surface reveals a range of differing characteristics.
Students who wish to request a review of a decision made by the Board of Examiners should refer to the UCL Academic Appeals Procedure, taking note of the acceptable grounds for such appeals.
Note that the purpose of this procedure is not to dispute academic judgement – it is to ensure correct application of UCL’s regulations and procedures. The appeals process is evidence-based and circumstances must be supported by independent evidence.
Section F: Additional information from module leader (as appropriate)
N/A
Individual Assignment
Instructions
Deadline
The deadline for this individual assignment is 10:00 am on 26/11/2024 (Tuesday). However, it is highly recommended that you start working on it well in advance. Problems 1 and 2 are fully covered by the material from Lectures 1 to 5, while Problem 3 is based on material from Lectures 6 to 8. Problem 4 includes questions of varying difficulty; for example, Problems 4.1 and 4.2 can be answered using knowledge from Lectures 1 to 3. Feel free to be creative in your answers to Problems 4 and 5.
Format
Please provide a detailed explanation of your thought process alongside your solutions. Submit your assignment as an individual PDF file on Moodle. Ensure that your answers are presented in the same order as the problems.
For Problems 1 to 4, which may include mathematical expressions, you have the option to either type your solutions or write them by hand. If writing by hand, take clear photos of your work and compile the images into the PDF. For Problem 5, which focuses on your hypothetical business plan and ideas, you must type your response; hand-written solutions are not permitted for Problem 5.
Make sure to submit a single PDF file, and ensure the file size is less than 10 MB.
Problem 1: Simple Investment (15 points)
Suppose that you have £400,000 and you are contemplating the purchase of two investments, A and B. One year from now, Investment A can be sold for £X per £1 invested, and Investment B can be sold for £Y per £1 invested. You regard X and Y as statistically independent random variables. Assume that both X and Y are normally distributed with a mean of 1.2 and a standard deviation of 0.2.
Problem 1.1: Investing in A (5 points)
If you put all your money in Investment A, what is the probability that you will be able to sell it one year from now at a positive profit?
Problem 1.2: Diversification (5 points)
If you split your money between the two investments and invest £100,000 in Investment A and £300,000 in Investment B, what is the probability you will be able to sell your portfolio for a positive profit a year from now?
Problem 1.3: Optimal Diversification (5 points)
Based on Problems 1.1 and 1.2, discuss the optimal strategy for this investment. Please show any calculations that support your argument.
Problem 2: Value of Information (10 points)
A manager at an electric car manufacturer needs to make a sourcing decision regarding an electric battery. The manufacturer seeks to produce 100,000 units of its latest electric car model, and each unit will require a single battery. Supplier A can produce the batteries for the price of £4000 per unit. Supplier B can produce the batteries for the price of £2000 per unit. However, Suppliers A and B differ in the quality of the batteries they produce, especially in terms of defect rates.
For Supplier A, the manager believes with complete certainty that all 100,000 batteries will be delivered with no defects. For Supplier B, the manager estimates a 60% probability of no defects, a 30% probability of minor defects being found in the battery shipment, requiring additional rework costs amounting to £250,000,000, and a 10% probability of major defects, requiring additional rework costs amounting to £400,000,000. The electric car manufacturer intends to sell the car at a unit price of £35,000. Besides the battery, each car has a production cost of £28,000, independent of the sourcing decision.
Problem 2.1 (4 points): Sourcing without Additional Information.
Based on the information given so far, please answer the following questions:
• (1 point) Assume that the manager decides to source all 100,000 batteries from Supplier A. What will be the expected profit?
• (2 points) Assume that the manager decides to source all 100,000 batteries from Supplier B. What will be the expected profit?
• (1 point) Based on your answers to the two previous questions, which supplier should the manager choose?
Problem 2.2 (6 points): Sourcing with Additional Information.
Suppose that Supplier B agrees to test its batteries and provide the manager with a detailed report before any sourcing decision is made. Based on this report, the manager can assess whether the batteries from Supplier B have major defects, minor defects, or no defects.
• (1 point) Assume that the report shows that Supplier B’s batteries have no defects. Which supplier should the manager choose, and what will be the corresponding profit from selling these 100,000 cars?
• (1 point) Assume that the report shows that Supplier B’s batteries all have minor defects. Which supplier should the manager choose, and what will be the corresponding profit from selling these 100,000 cars?
• (1 point) Assume that the report shows that Supplier B’s batteries all have major defects. Which supplier should the manager choose, and what will be the corresponding profit from selling these 100,000 cars?
• (2 points) Given that the manager receives the report, based on your answers to the previous questions, what will be the expected profit from selling these 100,000 electric cars?
• (1 point) What is the maximum the manager should pay for this information/report? (Hint: compare the results of Problems 2.1.3 and 2.2.3.)
Problem 3: Hypothesis Testing (25 points)
The coffee chain Universal Coffee Lovers (UCL) recently revised its subscription service, the “UCL Club.” Before the change, members were charged £20 per month for unlimited coffee. Under the new policy, members are charged £5 per month, plus £1 for each coffee consumed. To understand how this new subscription model impacts customer behavior, the two CEOs, Akchen and Huang, collected data from club members and asked you to conduct a series of hypothesis tests.
The data file, coffee.csv, is available on Moodle. Please download it and open it in Excel or any other software. It contains information on the coffee consumption behavior of 500 club members before and after the new subscription policy was implemented. The dataset has five columns: the first three columns provide details about each member’s ID, Gender, and Age. In the “Gender” column, the value “1” represents female and “0” represents male. The last two columns report each member’s average daily coffee consumption (in cups) before and after the new subscription took effect. If the consumption value is “-1” in the “After” column, it indicates that the customer has cancelled their subscription. For example, the first club member is a 38-year-old female who previously consumed 1.9 cups of coffee per day under the old subscription policy but has since cancelled her subscription.
Please use the dataset coffee.csv to answer the following questions.
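As a hedged illustration of the file layout described above, the sketch below parses a small in-memory sample and treats the “-1” value in the “After” column as a cancellation marker rather than a consumption figure. Only the first sample row comes from this brief; the header names "ID", "Age" and "Before", the other rows, and the helper name load_members are assumptions made for illustration, so check them against the actual file.

```python
import csv
import io

# Hypothetical sample in the layout described in the brief. Only the first
# data row is taken from the brief; the header names "ID", "Age", "Before"
# and the remaining rows are assumptions for illustration.
SAMPLE = """ID,Gender,Age,Before,After
1,1,38,1.9,-1
2,0,45,2.4,1.6
3,1,29,3.1,2.8
"""


def load_members(handle):
    """Parse member rows and flag those who cancelled (After == -1)."""
    members = []
    for row in csv.DictReader(handle):
        after = float(row["After"])
        cancelled = after == -1
        members.append({
            "id": int(row["ID"]),
            "female": row["Gender"] == "1",   # 1 = female, 0 = male
            "age": int(row["Age"]),
            "before": float(row["Before"]),
            # -1 marks a cancellation, so it is stored as None, not as cups
            "after": None if cancelled else after,
            "cancelled": cancelled,
        })
    return members


members = load_members(io.StringIO(SAMPLE))
retained = [m for m in members if not m["cancelled"]]
print(len(members), "members,", len(retained), "retained")
```

To read the real file, the same function could be called on `open("coffee.csv", newline="")`, provided the header names match. Storing cancellations as None makes it harder to average the marker value into consumption figures by accident.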
Problem 3.1 (5 points): Gender and Coffee Consumption
The CEOs believe that coffee consumption per customer is not influenced by gender, representing the null hypothesis (H0): there is no difference in coffee consumption between genders. The manager, however, believes that gender does influence coffee consumption, representing the alternative hypothesis (H1): there is a difference in coffee consumption between genders. Using the coffee.csv dataset and coffee consumption data under the old subscription model, do we have enough evidence to accept the alternative hypothesis and reject the null hypothesis?
Problem 3.2 (5 points): Age and Coffee Consumption
The CEOs believe that younger customers are more valuable, asserting that they consume more coffee than older customers. This represents the null hypothesis (H0). In contrast, the manager holds the opposite view, forming the alternative hypothesis (H1). Let µY represent the population mean of coffee consumption for customers aged 20-40, and µO represent the population mean for customers aged 41-60. Using the coffee.csv dataset and the coffee consumption data under the old subscription model, do we have enough evidence to accept the alternative hypothesis and reject the null hypothesis?
Problem 3.3 (5 points): Subscription
The CEOs acknowledge that the new policy may negatively impact customer visits to UCL coffee shops. However, they remain optimistic, believing that the reduction in coffee consumption is modest – less than 0.5 cups per day per customer. This represents the null hypothesis (H0). In contrast, the investors believe the impact is more significant, forming the alternative hypothesis (H1). Using the coffee.csv dataset and considering only the customers who remained in the club under the new subscription model, do we have enough evidence to accept the alternative hypothesis and reject the null hypothesis?
Problem 3.4 (10 points): Policy Evaluation
The investors believe that the new subscription model has led to a decrease in revenue from club members (H0). In contrast, the CEOs believe the opposite (H1), thinking that the new model is boosting the business. The CEOs have asked you to conduct a hypothesis test to support their position based on the existing dataset coffee.csv. How would you approach this analysis?
Problem 4: Discrete Choice Modeling (25 points)
In this problem, we explore an application of probability in a business context, specifically focusing on discrete choice modeling. Discrete choice models are essential tools used to understand product demand and predict consumer behavior. The development of these models earned Daniel McFadden the Nobel Prize in Economics in 2000.
Consider a market with a set of n products. Let N = {1,2,...,n} represent the collection of these n products. Since some products may not be available due to stockouts or inventory limitations, we define an assortment S, which is a subset of N (denoted as S ⊆ N), to represent the set of available products. For instance, if there are five products, N = {1,2,3,4,5}, and a store decides not to sell products 3 and 5, the assortment S would be S = {1,2,4}. Additionally, we use 0 to denote the “no-purchase” option. In this context, i ∈ N refers to the action of “buying product i,” while 0 refers to “not buying anything.” We write N+ = {0} ∪ N = {0,1,2,...,n} and S+ = {0} ∪ S.
A choice model M specifies a mapping from any assortment S ⊆ N to a probability distribution PM(· | S) over N+. Here, PM(i | S) represents the probability that a random customer from the market chooses to buy product i and PM(0 | S) represents the probability that the customer decides not to make any purchase. For example, if S = {1,2,4}, then PM(1 | S) = 0.3 and PM(0 | S) = 0.2 imply that a customer will buy product 1 with probability 0.3 and will not buy anything from S with probability 0.2.
One of the most fundamental choice models is the Multinomial-Logit (MNL) Model. In this model, each product i is associated with a parameter vi, which represents its expected utility to customers from an economics viewpoint. The no-purchase option is assumed to have an expected utility 0. Given an assortment S of available products, the MNL model calculates the probability that a customer will buy a particular product as follows:
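\[
P_{\mathrm{MNL}}(i \mid S) =
\begin{cases}
\dfrac{e^{v_i}}{1 + \sum_{j \in S} e^{v_j}}, & \text{if } i \in S, \\[1ex]
0, & \text{if } i \in N \setminus S, \\[1ex]
\dfrac{1}{1 + \sum_{j \in S} e^{v_j}}, & \text{if } i = 0.
\end{cases}
\]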
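The following is a minimal sketch of how these probabilities could be computed. It assumes the standard MNL form, in which an available product i has probability exp(v_i) / (1 + Σ_{j∈S} exp(v_j)) and the no-purchase option has probability 1 / (1 + Σ_{j∈S} exp(v_j)), the 1 being exp(0). The function name mnl_probabilities and the utilities are hypothetical and are not taken from any problem in this assignment.

```python
import math


def mnl_probabilities(utilities, assortment):
    """Return {option: probability} over N+ = {0} ∪ N under the MNL model.

    utilities  -- dict mapping each product i in N to its utility v_i
    assortment -- set S of available products (a subset of the dict keys)
    """
    weights = {i: math.exp(utilities[i]) for i in assortment}
    denom = 1.0 + sum(weights.values())    # the 1 is exp(0) for no-purchase
    probs = {i: 0.0 for i in utilities}    # unavailable products get 0
    for i, w in weights.items():
        probs[i] = w / denom
    probs[0] = 1.0 / denom                 # no-purchase option
    return probs


# Hypothetical utilities for four products; product 3 is out of stock.
v = {1: 1.0, 2: 0.0, 3: 0.4, 4: -0.5}
p = mnl_probabilities(v, {1, 2, 4})
for option in sorted(p):
    print(option, round(p[option], 4))
```

In this sketch a product with utility 0 is chosen exactly as often as the no-purchase option, since both carry weight exp(0) = 1.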
Problem 4.1 (2 points): Probability Distribution
Please verify that the MNL model results in a valid probability distribution over the set N+.
Problem 4.2 (2 points): When Stockout Happens
Assume that N = {1,2,3} and (v1,v2,v3) = (0.5,0.3,0.7). Please calculate the distribution PMNL(· | N) over N+. If product 2 is out of stock, how will this distribution change?
Problem 4.3 (3 points): Product Demand
Based on your answer to the previous problem, do you think products are substitutes or complements under the MNL model? Discuss this and the limitations of the MNL model.
Problem 4.4 (5 points): Simple Logistic Regression
Now, let’s try to estimate the utility parameter in a simplified scenario. Consider a store that sells a set of products, N = {1,2,...,n}, and assume that these products are always in stock so that S = N = {1,2,...,n}. Suppose m customers visited the store, with m1 customers purchasing product 1, m2 purchasing product 2, and so on, up to mn customers purchasing product n. How would you estimate the utilities (v1,...,vn) in this case?
Problem 4.5 (4 points): Customers Have Their Tastes
Products can be described by their features. Let x(i) represent the d-dimensional feature vector of product i. For example, an iPhone 16 Pro priced at £1000, with a 6.7-inch screen and 512GB memory, can be represented by the vector (1000,1,6.7,512), where the features are “price,” “whether it is an iPhone,” “screen size”, and “memory in GB”. Here, a binary value is used to represent “yes” or “no”, with 1 indicating “yes” and 0 indicating “no”.
Given product features x(i) of product i, we assume that the customer’s utility for product i takes the form
\[
u_i = \beta_0 + \sum_{k=1}^{d} \beta_k \cdot x_k^{(i)}, \tag{1}
\]
where each βk is referred to as a partworth parameter. The ui in Equation (1) is known as partworth utility, a numerical score that measures how much each feature of a product influences the customer’s decision to select an alternative.
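As a hedged illustration of the partworth utility, read here as the intercept β0 plus the sum over features of βk times the k-th feature value, the sketch below evaluates it for two made-up products. The function name partworth_utility, the parameter values and the feature vectors are all hypothetical and deliberately differ from those used in the questions.

```python
def partworth_utility(beta0, betas, features):
    """Linear partworth utility: beta0 + sum_k betas[k] * features[k]."""
    if len(betas) != len(features):
        raise ValueError("betas and features must have the same length d")
    return beta0 + sum(b * x for b, x in zip(betas, features))


# Hypothetical partworth parameters (intercept plus d = 2 features).
beta0 = 0.5
betas = (-0.5, 0.25)

# Hypothetical feature vectors x(i) for two products.
products = {"A": (2.0, 4.0), "B": (1.0, 8.0)}

utilities = {name: partworth_utility(beta0, betas, x)
             for name, x in products.items()}
print(utilities)
```

The sign of each parameter shows the direction in which its feature moves the utility: here the first feature lowers it and the second raises it.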
Now, consider a market of two products, N = {1,2}. Each product is characterized by two features: x(1) = (2,1) for product 1 and x(2) = (1,3) for product 2. Suppose the customer’s partworth parameters are given by (β0,β1,β2) = (1,0.8,−0.2). Answer the following questions:
A. (2 points) Under the MNL model, what is the probability that the customer chooses to buy product 1 when both products 1 and 2 are available?
B. (2 points) Which feature, the first or the second, is more likely to be related to price? Explain your reasoning.
Problem 4.6 (5 points): Learning Customers’ Tastes
We aim to estimate the partworth parameters from data. Consider again the simplified version described in Problem 4.4, where assortment S = {1,2,...,n} includes all products, and out of a total of m customers, mi customers purchased product i, for i = 1,...,n. Additionally, each product i is characterized by a feature vector x(i).
• (3 points) How would you estimate the partworth parameters (β0,β1,...,βd) in this context? (Note: you can be creative in answering this question)
• (2 points) Will your suggested method still work if d > n? If not, how would you address this issue?
Problem 4.7 (2 points): Non-linear Utility
The assumption of a linear relationship between a product’s features and its utility has its limitations. For instance, consider a scenario where the partworth parameter for the “screen size” of a phone is positive. This would suggest that increasing the screen size makes customers more likely to buy the phone. However, this assumption breaks down if the screen size becomes excessively large – such as 20 inches, which is bigger than most laptops. Clearly, customers would not be interested in such a phone. How would you address the non-linear relationship between utility and product features in such cases?
Problem 4.8 (2 points): Heterogeneity
Finally, individual customers often have varying preferences. For example, younger customers might prefer smaller phones for convenience, while older customers may favor larger phones for easier readability. This illustrates the heterogeneity in customer preferences. How would you generalize the MNL model to account for such heterogeneity and incorporate diverse customer tastes? Again, you can be creative in answering this question.
Problem 5: Data Collection and Modeling (25 points)
Assume that you plan to open a pizza restaurant in Canary Wharf and would like to study the factors that may affect the restaurant’s revenue. Please briefly discuss your plan, i.e., which factors or variables you are interested in, and which statistical tools and models you will need and why.
Please note that you need to discuss at least five factors that may affect the revenue. You should focus on the explanation and justification of your plan and do not need to actually collect the data or carry out the test.