Linear regression with independent variable

All candidates

  1. (a) In this question, A, B, and C are random variables with means µA, µB, and µC respectively, and a common variance σ². An independent random sample of size 10 is taken from each of these random variables, with the following results:

ā = 49.617,  ∑_{i=1}^{10} a_i² = 24635.30,  ∑_{i=1}^{10} (a_i − ā)² = 16.832

b̄ = 48.526,  ∑_{i=1}^{10} b_i² = 23583.16,  ∑_{i=1}^{10} (b_i − b̄)² = 35.438

c̄ = 51.031,  ∑_{i=1}^{10} c_i² = 26100.33,  ∑_{i=1}^{10} (c_i − c̄)² = 58.704

i) Explain what assumptions on A, B, and C are required in order to perform a one-way ANOVA test of the null hypothesis µA = µB = µC against the alternative that this is false. [2 marks]

ii) Given that those assumptions hold, perform the ANOVA test at the 0.05 significance level. [5 marks]

iii) Let SSw denote the total sum of squares random variables in this ANOVA set-up. What is the distribution of SSw/σ²? [3 marks]

iv) Derive a 95% confidence interval for σ². [4 marks]
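The arithmetic behind parts ii) and iv) can be sketched in Python as follows. This is only an illustration under stated assumptions: it treats SSw as the within-group sum of squares (the sum of the three squared-deviation totals given above), and the critical values are hard-coded from standard F and chi-squared tables rather than computed.

```python
# One-way ANOVA from summary statistics (equal group sizes).
n = 10  # observations per group
means = {"A": 49.617, "B": 48.526, "C": 51.031}
ss_dev = {"A": 16.832, "B": 35.438, "C": 58.704}  # sum of (x_i - mean)^2 per group

k = len(means)
grand_mean = sum(means.values()) / k  # valid because group sizes are equal

# Between-group and within-group sums of squares.
ss_between = n * sum((m - grand_mean) ** 2 for m in means.values())
ss_within = sum(ss_dev.values())

df_between = k - 1        # 2
df_within = k * (n - 1)   # 27

f_stat = (ss_between / df_between) / (ss_within / df_within)

# Assumed table values (not computed here):
F_CRIT = 3.35                      # upper 5% point of F(2, 27)
CHI2_LO, CHI2_HI = 14.573, 43.195  # 2.5% and 97.5% points of chi-squared(27)

reject_h0 = f_stat > F_CRIT

# 95% confidence interval for sigma^2, using ss_within / sigma^2 ~ chi-squared(27).
ci_sigma2 = (ss_within / CHI2_HI, ss_within / CHI2_LO)

print(f"F = {f_stat:.3f}, reject H0: {reject_h0}")
print(f"95% CI for sigma^2: ({ci_sigma2[0]:.3f}, {ci_sigma2[1]:.3f})")
```

The interval follows from inverting the chi-squared pivot: the larger quantile gives the lower endpoint and the smaller quantile the upper endpoint.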

Version 1 Page 2 of 7

(b) You have performed a simple linear regression with independent variable x, resulting in the following plot of the residual at xi against xi. [Residual plot not reproduced.] Does this cast any doubt on the assumptions of the simple linear regression? Explain your answer. [2 marks]


(c) You are given the following paired data

 i    xi       yi
 1     1     0.93
 2     2    −0.70
 3     3    −2.94
 4     4    −5.00
 5     5    −6.90
 6     6    −9.16
 7     7   −10.95
 8     8   −12.87
 9     9   −15.33
10    10   −16.96

for which

x̄ = 5.5,  ȳ = −7.988,

∑_{i=1}^{10} (y_i − ȳ)² = 336.62,

∑_{i=1}^{10} (x_i − x̄)² = 82.5,

∑_{i=1}^{10} (x_i − x̄)(y_i − ȳ) = −166.59.

Assume that the data satisfy the usual simple linear regression model, so that the yi are observations of a random variable Y whose conditional distribution given X = x has the form N(a + bx, σ²).

i) Calculate the predicted mean response if x = 11. [4 marks]

ii) Find a 95% confidence interval for the slope parameter b. [5 marks]
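As a rough check on parts i) and ii), the least-squares quantities can be computed directly from the summary statistics given above. The sketch below assumes the standard estimators (slope = Sxy/Sxx, intercept = ȳ − slope·x̄) and hard-codes the t critical value for 8 degrees of freedom from a standard table; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the question.
n = 10
x_bar, y_bar = 5.5, -7.988
s_xx, s_yy, s_xy = 82.5, 336.62, -166.59

# Least-squares estimates of slope b and intercept a.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Predicted mean response at x = 11.
y_hat_11 = intercept + slope * 11

# Residual sum of squares and the estimate of sigma^2 on n - 2 degrees of freedom.
sse = s_yy - s_xy ** 2 / s_xx
s2 = sse / (n - 2)
se_slope = sqrt(s2 / s_xx)

T_CRIT = 2.306  # assumed table value: 97.5% point of t with 8 degrees of freedom

ci_slope = (slope - T_CRIT * se_slope, slope + T_CRIT * se_slope)

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"predicted mean at x = 11: {y_hat_11:.3f}")
print(f"95% CI for slope: ({ci_slope[0]:.4f}, {ci_slope[1]:.4f})")
```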

Total: 25 marks


  1. i) State the central limit theorem for a sequence of independent and identically distributed random variables X1, X2, … with finite mean µ and variance σ². [3 marks]

ii) Let X be a random variable with the Bernoulli(p) distribution, so that

P(X = 1) = p, P(X = 0) = 1 − p.

a) Derive the expected value of X. [3 marks]

b) Derive the variance of X. [3 marks]

iii) Let B be a random variable with the Binomial(n, p) distribution. Using the central limit theorem and the fact that B has the same distribution as

X1 +···+ Xn,

where X1,…,Xn are independent and each has the Bernoulli(p) distribution, show that

(B − np) / √(np(1 − p))

has approximately a standard normal distribution for large n. [6 marks]

iv) Someone claims to be able to predict the outcome of a coin flip. In n = 100 flips they are correct 55 times. Let B be a random variable counting the number of successful predictions in 100 flips and p be the probability they predict a single flip correctly. Test the hypothesis H0 : p = 1/2 against the alternative H1 : p > 1/2 at significance level α = 0.05 using the test statistic

T = (B − 100p) / √(100p(1 − p))


which you can assume to have approximately the N(0,1) distribution. [6 marks]

v) What is the minimum number of successful predictions out of 100 flips which would cause you to reject the null hypothesis in the test of part iv)? [4 marks]
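The test in parts iv) and v) can be sketched numerically using only the Python standard library. The sketch evaluates T under H0 (p = 1/2) and, as in the question, uses the normal approximation without a continuity correction; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p0, observed, alpha = 100, 0.5, 55, 0.05

# Mean and standard deviation of B under H0.
mean0 = n * p0
sd0 = sqrt(n * p0 * (1 - p0))

# Observed test statistic and one-sided critical value.
t_obs = (observed - mean0) / sd0
z_crit = NormalDist().inv_cdf(1 - alpha)  # upper 5% point of N(0, 1)

# One-sided p-value under the normal approximation.
p_value = 1 - NormalDist().cdf(t_obs)
reject_h0 = t_obs > z_crit

# Smallest count of correct predictions whose statistic exceeds the critical value.
min_successes = next(b for b in range(n + 1) if (b - mean0) / sd0 > z_crit)

print(f"T = {t_obs:.2f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print(f"reject H0: {reject_h0}; minimum successes to reject: {min_successes}")
```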

Total: 25 marks


  1. (a) Hooke’s law states that the elongation L of a spring subjected to a weight force W is given by W = kL, where W is measured in kg, L is measured in mm, and k, the elastic constant of the spring, is expressed in kg/mm. A group of Physics students wants to measure the elastic constant of a spring. Lacking very precise instruments, they perform an experiment in which weights of increasing magnitude, from 10 to 50 kg, are applied to the spring, each weight being applied five times. The measurements are affected by the limited precision with which the lengths can be read and by the fact that the spring does not behave like a perfect spring, so the same weight applied several times does not give the same elongation. The measured elongations, in millimeters, for each test are shown below.

Weight (kg)   Measured elongation L (mm)
W             L1      L2      L3      L4      L5
10            48.6    47.6    48.8    51.5    49.8
15            78.4    77.5    71.6    77.5    73.6
20            95.7    98.6   100.4   102.4    97.3
25           123.5   131.1   118.9   130.6   128.3
30           150.6   154.5   148.3   146.0   153.3
35           175.5   176.3   173.2   181.8   181.8
40           209.4   199.8   197.8   195.9   203.5
45           230.9   233.2   230.5   218.7   222.6
50           245.2   249.2   257.0   256.7   244.3

i. Write the equation of a simple linear regression model with repeated observations for this dataset. [3 marks]

ii. Perform a statistical test to establish if a linear regression model with repeated observations is a good fit for these data. [6 marks]

iii. Perform a statistical test to establish if Hooke’s law can be assumed to be valid for this dataset at the 5% significance level. [5 marks]
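One common way to approach part ii. is a lack-of-fit F test, which splits the residual sum of squares into pure error (variation among repeats at the same weight) and lack of fit. The sketch below is an illustration under assumptions not stated in the question: it regresses elongation L on weight W (treating the weights as fixed), and it leaves the comparison with the F(7, 36) critical value to a table lookup.

```python
# Elongation measurements (mm), keyed by applied weight (kg).
data = {
    10: [48.6, 47.6, 48.8, 51.5, 49.8],
    15: [78.4, 77.5, 71.6, 77.5, 73.6],
    20: [95.7, 98.6, 100.4, 102.4, 97.3],
    25: [123.5, 131.1, 118.9, 130.6, 128.3],
    30: [150.6, 154.5, 148.3, 146.0, 153.3],
    35: [175.5, 176.3, 173.2, 181.8, 181.8],
    40: [209.4, 199.8, 197.8, 195.9, 203.5],
    45: [230.9, 233.2, 230.5, 218.7, 222.6],
    50: [245.2, 249.2, 257.0, 256.7, 244.3],
}

# Flatten to paired observations (weight, elongation).
xs = [w for w, reps in data.items() for _ in reps]
ys = [y for reps in data.values() for y in reps]
n, m = len(ys), len(data)  # total observations, distinct weight levels

x_bar, y_bar = sum(xs) / n, sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Least-squares fit L = intercept + slope * W.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Residual sum of squares about the fitted line.
sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))

# Pure-error sum of squares: deviations of repeats from their own level mean.
ss_pure = sum((y - sum(reps) / len(reps)) ** 2
              for reps in data.values() for y in reps)

# Lack-of-fit sum of squares and the F statistic on (m - 2, n - m) df.
ss_lof = sse - ss_pure
f_lof = (ss_lof / (m - 2)) / (ss_pure / (n - m))

print(f"slope = {slope:.3f} mm/kg, intercept = {intercept:.3f} mm")
print(f"lack-of-fit F = {f_lof:.3f} on ({m - 2}, {n - m}) degrees of freedom")
```

A small F value relative to the tabulated critical value would give no evidence against the straight-line model. Part iii. would then typically examine whether the intercept is compatible with zero, since Hooke's law implies a line through the origin.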

(b) In this question, X is a continuous random variable with probability density function

f(x) = α^{−1} x^{α^{−1} − 1} for 0 < x < 1, and f(x) = 0 otherwise.


Population mean size

The number of students who belong to the dance company at each of several randomly selected small universities is shown below. Estimate the true population mean size of a university dance company with 99% confidence.

21 25 32 22 28 30 29 30 47 26 35 26 35 26 28 28 32 27 40
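A t-based interval is the usual tool for this kind of small-sample estimate. The sketch below assumes the company sizes are approximately normally distributed and hard-codes the critical value for 18 degrees of freedom from a standard t table; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

sizes = [21, 25, 32, 22, 28, 30, 29, 30, 47, 26,
         35, 26, 35, 26, 28, 28, 32, 27, 40]

n = len(sizes)
x_bar = mean(sizes)
s = stdev(sizes)  # sample standard deviation (divides by n - 1)

T_CRIT = 2.878  # assumed table value: 99.5% point of t with n - 1 = 18 df

margin = T_CRIT * s / sqrt(n)
ci = (x_bar - margin, x_bar + margin)

print(f"n = {n}, mean = {x_bar:.3f}, s = {s:.3f}")
print(f"99% CI for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
```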


Mathematical concept

A unit plan integrates multiple standards informing a mathematical concept and is taught in multiple lessons over several days. The large-scale planning process outlines activities that complement one another to ensure a successful learning experience for all students.

When planning units, educators must begin with the standards, learning targets/objectives, and summative assessment. Keeping the end learning goals in mind allows educators to plan the lessons to ensure students stay focused on what is important. Alignment between standards, learning objectives, instruction, and assessment provides a clear direction for both teachers and students. When students understand the goal of the lesson and all instruction, activities, and assessment are aligned, they are more likely to master the skill or concept.

For this assignment, use the “3-Day Unit Plan Template” to create a 3-day instructional unit plan and summative assessments designed for the students outlined in the “Class Profile” in the K-8 grade level of your choice. Use your state’s mathematics standards to select 1-3 math standards from one domain (number and operations, algebra, geometry, measurement, or data analysis and probability) to represent and teach in the unit. Complete the following when designing your unit:

· Clearly integrate instruction and assessments so that standards, learning targets/objectives, learning activities, and assessments are all aligned

· Use a variety of teaching strategies and technologies that encourage the students’ development of critical thinking and problem-solving skills

· Create opportunities for active inquiry, collaboration, and supportive interaction in the elementary classroom using effective verbal, nonverbal, and media communication techniques


Quadratic Equations

Learn about quadratic equations. The quadratic is one of the most common equations in mathematics, with a number of applications in real-world situations.

Respond to the following in a minimum of 175 words:

Write about a problem you have encountered or could encounter where a quadratic equation is useful.
What is the solution to the problem you discussed?


Mathematical induction

This week you will learn about mathematical induction, polynomials, and the parts that make up polynomials. Polynomials are useful, because they can be written to represent real-world situations and solve actual problems.

Respond to the following prompts in a minimum of 175 words:

What is a situation in your life or future career that would require the use of polynomials?
How can you—or did you—apply polynomials to solve your problem?
Provide an example in your response.


Regression (linear regression, multiple regression, or logistic regression)

In this task, you will identify a real-world business situation and use real data to perform a data analysis leading to an actionable recommendation. You are encouraged to select an issue in your workplace or program specialty area (e.g., IT management, HC management, or MBA). Publicly available data is also an option (see Course Tips).
Recommended Analysis Techniques:
• regression (linear regression, multiple regression, or logistic regression)
• time series or trend analysis
Note: you need to specify the specific type(s) of time-series analysis you plan to use or consider in Task 2 – i.e., regression, exponential smoothing, moving average, seasonality using multiple regression
• chi-square
• t-test (one sample, two independent samples, or paired)
• crossover analysis
• break-even analysis
Additional Approved Analysis Techniques:
• statistical process control
• linear programming
• decision tree
• simulation


Solving an equation

Computer-typed reports will not be accepted.

  1. For the ordinary differential equation: (3 Marks)
    dy/dt = y sin³(t)
    The initial condition is 𝑦(0) = 1.
    a. Use MS Excel to apply Euler’s method and solve the ODE at t=1 s using step sizes
    ℎ = 0.01, 0.2, 0.5 𝑠. Plot the results and discuss the effect of reducing the step size
    on the solution.
    b. Use Heun’s method with iterative corrector to solve for the first step, where the
    step size is ℎ = 1 𝑠. (Limit your answer to only three iterations). Assuming that
    the solution obtained from part (a) using ℎ = 0.01 𝑠 is the true solution, calculate
    the percent relative error.
    c. Use RK4 method to solve for the first step, where the step size is ℎ = 1 𝑠. Calculate
    the percent relative error as in part (b).
  2. Use the Euler’s method to solve: (2 Marks)
    d²y/dt² − 0.5t + y = 0,
    where 𝑦(0) = 2 and 𝑦′(0) = 0. Solve from 𝑥 = 0 to 1 using ℎ = 0.1. Show the first
    step in detail and continue the rest of the steps on MS Excel, showing the final results
    in a table.
  3. Use Euler’s method to solve the following system of ODEs: (2 Marks)
    𝒅𝒕 = −𝟐𝒚 + 𝟓𝒆
    𝒅𝒕 =
    , over the range 𝑡 = 0 𝑡𝑜 0.4 𝑠 using a step size of 0.1 with 𝑦(0) = 2 𝑎𝑛𝑑 𝑧(0) = 4. Show
    the first step in detail and continue the rest of the steps on MS Excel, showing the final
    results in a table.
  4. The basic differential equation of the elastic curve for a simply supported, uniformly
    loaded beam is given as: (3 Marks)
    where 𝐸 = the modulus of elasticity and 𝐼 = the moment of inertia. The boundary
    conditions are 𝑦(0) = 𝑦(𝐿) = 0. The following parameter values apply: 𝐸 = 200 GPa,
    𝐼 = 30,000 cm⁴, 𝑤 = 15 kN/m, and 𝐿 = 2 m. Using (a) the shooting
    method and (b) the finite-difference approach (both with 𝛥𝑥 = 0.5 𝑚), calculate the
    deflection of the beam at 𝑥 = 1 𝑎𝑛𝑑 1.5 𝑚 and evaluate the percent true relative error
    if the analytical solution is:
  5. Develop a MatLab flowchart which will solve a 1st order ODE for a given ODE of the
    form 𝑑𝑦/𝑑𝑡 = 𝑓(𝑦,𝑡) for a given boundary condition of y(0)=value using Euler
    method. (2 Marks)
    a. Develop a structured well-documented MatLab code to solve:
    dy/dt = (1 + 4𝑡)√𝑦, 𝑦(0) = 1
    for the domain 0<t<5 with step sizes of 0.5, 0.25, 0.1, 0.01 and 0.001 and 0.0001 and
    use the tic/toc command to record the length of time required for each case.
    b. Plot the solution over the interval 0<t<5 and discuss the differences.
    c. On a separate graph plot the value y(5) as a function of step size and on the same
    graph, plot the execution time using a second y-axis.
    d. Store the ODE as a function. Then use the ode45 command to solve over the
    interval 0<t<5 and the tic/toc command. Comment on the differences in the results
    in (a) and (d).
    Important Notes:
    • Group work is encouraged while solving the problems, however each student should submit his/her
    own work.
    • Cheating and/or plagiarism are forms of academic dishonesty. Such cases will be spotted and a zero
    mark will be given to all copies regardless of who copied whose. Everyone should be protective of
    his/her own work, yet provide the appropriate help whenever asked. In the case of repeated incidents
    from the same student, the case will be raised to the academic integrity officer of the university.
    • Computerized reports will not be accepted, except for the code part. You must solve the assignment
    by hand and submit a scan of your solutions.
    • Students who wish to have a question re-marked must send a detailed email to the instructor
    (Dr. Abdelsalam), along with the report, explaining the case in detail (i.e. where do you deserve
    an extra mark and why?). The instructor will refer the email to the corresponding TA, who will reply to it.
