Testing to Find Relationships Between Two Variables

Three main families of statistical techniques are commonly used to examine the relationship or association between variables. The chi-square family assesses the association between categorical variables. The correlation family measures the strength and direction of the linear relationship between interval/ratio variables, although some tests in this family can also handle other combinations of measurement levels. The regression family goes one step further by assessing the relative strength of one or more variables in predicting the change in another variable.

Before committing to any particular test, you must clarify the nature of the relationship you are interested in, determine how many variables are involved, and determine the measurement levels of each variable.

Preparation
You are encouraged to review the chi-square, correlation, and regression materials from previous weeks. Then, review How to Choose a Statistical Test and the test-selection tutorials linked in the Resources to determine which test is most likely to be appropriate for your data type.

Instructions
Using the Framingham study data set, perform and interpret statistical tests that answer the following research questions. Then, provide a written analysis of your results.

At baseline, was there a significant association between diabetes (variable: diabetes1) and smoking status (variable: cursmoke1)?
At baseline, how much variation in participant cholesterol levels (variable: totchol1) could be explained by the variation in an individual's BMI (variable: bmi1)?
Written Analysis Format and Length
Format your analysis using APA style.

An APA Style Paper Tutorial is provided to help you in writing and formatting your analysis.
Your analysis should be 2–3 pages in length, not including the title page and references page.
Note: The requirements outlined below correspond to the grading criteria in the rubric. Be sure that your statistical analysis addresses each point, at a minimum. You may also want to read the rubric to better understand how each criterion will be assessed.

Perform the appropriate statistical tests (based on the assumption test).
Provide your rationale for test selection.
Interpret the results of your statistical tests (chi-square, correlation, and regression) for each research question.
Consider associated caveats and limitations.
Determine the practical, public health-related implications of your statistical tests (chi-square, correlation, and regression).
What evidence do you have that validates your conclusions?
Write clearly and concisely, using correct grammar, mechanics, and APA formatting.
Write for an academic audience, using appropriate statistical terminology, style, and form.
Express your main points and conclusions coherently.

 

Research Question 1: Association Between Diabetes and Smoking Status

 

The first research question asks: At baseline, was there a significant association between diabetes (variable: diabetes1) and smoking status (variable: cursmoke1)?

 

Rationale for Test Selection

 

To examine the association between two nominal categorical variables, diabetes status (dichotomous: yes/no) and current smoking status (dichotomous: yes/no), the appropriate statistical method is the chi-square test of independence ($\chi^2$). The test compares the observed frequencies in a contingency table with the frequencies that would be expected if no relationship existed between the variables. Because both variables are binary, the data form a 2 × 2 table with one degree of freedom. The test assumes independent observations and adequately large expected counts (conventionally at least 5 in each cell); when that condition is not met, Fisher's exact test is the usual alternative.
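The calculation behind this test can be sketched in a few lines of Python. The example below is a minimal illustration, not the analysis itself: the function name `chi_square_independence_2x2` and the counts in `hypothetical_counts` are assumptions made up for demonstration and are not values from the Framingham data set. It computes the Pearson chi-square statistic without a continuity correction, the p-value for one degree of freedom, and the phi coefficient as a simple effect size.

```python
import math


def chi_square_independence_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    No continuity correction is applied. Returns the statistic,
    the p-value (df = 1), the expected counts, and phi.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count for each cell under independence:
    # (row total * column total) / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    if any(e == 0 for row in expected for e in row):
        raise ValueError("an expected count is zero; the test is undefined")

    chi2 = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )

    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))

    # Phi coefficient: effect size for a 2x2 table
    phi = math.sqrt(chi2 / n)
    return chi2, p_value, expected, phi


# HYPOTHETICAL counts for illustration only (not Framingham values).
# Rows: diabetes yes / no; columns: current smoker yes / no.
hypothetical_counts = [[30, 70], [20, 80]]

chi2, p_value, expected, phi = chi_square_independence_2x2(hypothetical_counts)
print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}, phi = {phi:.3f}")
print("smallest expected count:", min(min(row) for row in expected))
```

Printing the smallest expected count makes it easy to check the expected-frequency assumption before interpreting the p-value. Statistical packages may apply a continuity correction to 2 × 2 tables by default, so their output can differ slightly from this uncorrected statistic.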

Sample Answer

Statistical Analysis of Framingham Study Variables

 

(This section would be presented on the first page of the paper, following the APA-7 Title Page. The document would be double-spaced throughout.)

 


This analysis uses baseline data from the Framingham Heart Study to investigate the association between two categorical variables (diabetes and smoking status) and the predictive relationship between two interval/ratio variables (BMI and total cholesterol). The methodology, results, interpretation, and public health implications are presented separately for each of the two research questions.
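For the second research question, the proportion of variation in total cholesterol explained by BMI is the coefficient of determination ($R^2$) from a simple linear regression. The sketch below shows how that quantity is obtained. It is an illustration under stated assumptions: the function name `simple_linear_regression` and the values in `hypothetical_bmi` and `hypothetical_totchol` are invented for demonstration and are not Framingham observations.

```python
import math


def simple_linear_regression(x, y):
    """Ordinary least squares fit of y on a single predictor x.

    Returns slope, intercept, Pearson r, and R-squared.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # With one predictor, R-squared is the square of Pearson's r
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r, r ** 2


# HYPOTHETICAL values for illustration only (not Framingham data)
hypothetical_bmi = [22.0, 25.0, 28.0, 31.0, 34.0]
hypothetical_totchol = [200.0, 220.0, 230.0, 220.0, 230.0]

slope, intercept, r, r_squared = simple_linear_regression(
    hypothetical_bmi, hypothetical_totchol
)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"r = {r:.3f}, R-squared = {r_squared:.3f}")
```

In a one-predictor model, $R^2$ equals the squared Pearson correlation, which is why the correlation and regression results for this question should agree.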