How do age, race, years of education, region and gender affect self-reported health? To answer this question I found a dataset at Princeton University with the above variables captured in a survey of 3,712 individuals.
I. The Data
The dependent variable, self-reported health, has three possible responses: poor, fair or good.
The predictor variables are
- age, measured in years, which in the sample varies from 40 to 90
- education, measured in years of education, varying from 0 to 17
- race, measured 1 if Black and 0 if non-Black
- region, measured 1 if living in the South, otherwise 0. (The rationale for singling out the South is that previous research has identified it as having some of the lowest state rankings on health indicators. For example, see America’s Health Rankings.)
- gender, measured 1 for female and 0 for male.
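The coding scheme above can be sketched as a small data structure. This is only an illustration: the class name, field names and the `design_row` helper are hypothetical and are not the variable names used in the Princeton dataset.

```python
from dataclasses import dataclass
from typing import List

# Outcome categories as described in the post.
HEALTH_LEVELS = ("poor", "fair", "good")


@dataclass(frozen=True)
class Respondent:
    """One survey respondent; all field names are hypothetical."""
    age: int        # years
    education: int  # years of education
    black: int      # 1 if Black, 0 if non-Black
    south: int      # 1 if living in the South, otherwise 0
    female: int     # 1 if female, 0 if male


def design_row(r: Respondent) -> List[float]:
    """Return [intercept, age, education, black, south, female] as floats."""
    # The three indicator variables must be coded 0 or 1.
    for name in ("black", "south", "female"):
        if getattr(r, name) not in (0, 1):
            raise ValueError(f"{name} must be coded 0 or 1")
    return [1.0, float(r.age), float(r.education),
            float(r.black), float(r.south), float(r.female)]
```

Keeping the indicators as 0/1 values means each coefficient on race, region or gender can be read as the shift associated with belonging to the group coded 1.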
II. Statistical Tool
As the dependent variable has three possible responses and the predictor variables are a mix of continuous and categorical variables, the statistical tool of choice is multinomial logistic regression.
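For reference, a three-outcome multinomial logit can be written as follows. Treating “good” as the base outcome is an assumption made here for illustration only; the post does not state which base category was used. Here x is the vector of predictors (including a constant) and each non-base outcome has its own coefficient vector.

```latex
\Pr(y = j \mid x) =
  \frac{\exp(x^\top \beta_j)}
       {1 + \exp(x^\top \beta_{\text{poor}}) + \exp(x^\top \beta_{\text{fair}})},
  \qquad j \in \{\text{poor}, \text{fair}\}

\Pr(y = \text{good} \mid x) =
  \frac{1}
       {1 + \exp(x^\top \beta_{\text{poor}}) + \exp(x^\top \beta_{\text{fair}})}
```

The three probabilities sum to one for any x, which is why a rise in the probability of one response must be offset by a fall in at least one of the others.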
Self-reported health was regressed on age, race, years of education, region and gender. Somewhat surprisingly, gender was not statistically significant, as determined by both a Wald test and a nested likelihood ratio test. Gender was therefore removed from the regression, yielding a multinomial logistic model of self-reported health with four predictors (age, race, years of education and region).
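The nested likelihood ratio test mentioned above compares the log-likelihoods of the models with and without gender. With three outcomes, dropping one predictor removes two coefficients (one per non-base outcome), so the statistic is compared against a chi-square distribution with 2 degrees of freedom. The following is a minimal sketch of that calculation; the function names are hypothetical, and the closed-form tail probability used here only holds for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate, even df only.

    For even df the tail has the closed form
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only supports positive even df")
    half = max(x, 0.0) / 2.0  # clamp tiny negative statistics to zero
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(ll_full: float, ll_reduced: float, df: int = 2):
    """Return (LR statistic, p-value) for two nested models."""
    stat = 2.0 * (ll_full - ll_reduced)
    return stat, chi2_sf_even_df(stat, df)
```

As a usage example with made-up log-likelihoods (not the values from this analysis), `likelihood_ratio_test(-3000.0, -3001.5)` gives a statistic of 3.0; a p-value above the chosen significance level would support dropping the extra predictor.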
The final model is statistically significant (Prob > chi2 = 0.0000, i.e. p < 0.0001), and each of the four predictors is significant as well.
I believe graphs depicting the probabilities associated with “poor,” “fair” and “good” health responses against the predictor variables provide an illuminating way to understand the covariate associations with health.
A. The Impact of Age on Self-Reported Health
1. Probability of Self-Reported Health = Poor by Age
As age increases, so does the probability of a poor health response, controlling for the other predictors. The shaded areas in this graph and the following graphs depict 95% confidence intervals. Readers should note the vertical probability scale on each graph.
2. Probability of Self-Reported Health = Fair by Age
Holding all other covariates constant, we find that fair health responses also increase with age, with wider confidence intervals.
3. Probability of Self-Reported Health = Good by Age
Survey respondents exhibit significant decreases in the probability of reporting good health as they age, holding constant the other covariates.
B. The Impact of Race on Self-Reported Health
1. Probability of Self-Reported Health = Poor by Race
The following graph depicts significant differences in reporting poor health based on race.
Blacks are observed to have higher probabilities of reporting poor health than non-Blacks, holding all other covariates constant.
2. Probability of Self-Reported Health = Fair by Race
While race is a statistically significant predictor of reporting fair health, the confidence intervals are wide. As age increases, Blacks become less likely to report fair health. The opposite trend is observed for non-Blacks, controlling for all other covariates.
3. Probability of Self-Reported Health = Good by Race
Both groups exhibit decreasing probabilities of reporting good health as they age. However, non-Blacks enjoy higher probabilities of self-reported good health with all other predictors held constant.
C. The Impact of Race & Education on Self-Reported Health
1. Probability of Self-Reported Health = Poor by Race and Education
Holding all other variables constant, race and education significantly impact poor health responses. The fewer the years of education, the higher the probability of a poor health response, with Blacks exhibiting higher probabilities of poor health responses throughout the years of education distribution.
2. Probability of Self-Reported Health = Fair by Race and Education
Fair health responses exhibit diverse probability trends for the two groups. Black probabilities for fair health increase up to 12 years of education and then level off. Non-Black probabilities for fair health increase up to approximately 7 years of education, level off and then decline through the years of education distribution, controlling for all other covariates.
3. Probability of Self-Reported Health = Good by Race and Education
Both Blacks and non-Blacks exhibit increased probabilities of good health throughout the years of education distribution. Nonetheless, non-Blacks enjoy higher probabilities of self-reported good health than Blacks, holding constant the other covariates.
D. The Impact of Age and ‘South’ on Self-Reported Health
1. Probability of Self-Reported Health = Poor by ‘South’ and Age
Individuals living in the South have higher probabilities of self-reported poor health throughout the age distribution than individuals not living in the South, controlling for the influence of the other covariates.
2. Probability of Self-Reported Health = Fair by ‘South’ and Age
The probabilities associated with fair health responses for individuals living in and outside the South vary within a relatively narrow range. At age 40 the probabilities for individuals living outside and in the South are .24 and .26, respectively. At age 90, the respective probabilities are .32 and .28. In both cases the confidence intervals are rather wide, approximately .10.
3. Probability of Self-Reported Health = Good by ‘South’ and Age
Increasing age decreases the probabilities of self-reported good health responses throughout the age distribution, although people living outside the South enjoy higher probabilities of good health responses, controlling for all other covariates.
E. Interpretation of the Findings Using Profiles
It’s useful to understand the relationship among the variables by creating profiles of individuals and then calculating and interpreting the associated probabilities of health responses. In the following I have created two profiles (ideal types in Max Weber’s language):
- Profile 1: Age 40, 17 years of education, non-Black, living outside the South
- Profile 2: Age 70, 8 years of education, Black, living in the South
Here are the respective probabilities for poor and good health responses:
The younger (age 40), more highly educated non-Black living outside the South has a 79% probability of reporting good health. An older (age 70) Black with 8 years of education living in the South has a very low (3%) probability of reporting good health.
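Profile probabilities of this kind come from plugging each profile's covariate values into the fitted model. The sketch below shows the mechanics under two loud assumptions: “good” is treated as the base outcome, and the coefficients are invented placeholders, not the estimates from this analysis, so the resulting numbers are not expected to match the figures reported here. The function and variable names are hypothetical.

```python
import math


def health_probabilities(coefs, profile):
    """Multinomial logit probabilities for one profile.

    coefs maps each non-base outcome to its coefficients (including
    "const"); the base outcome "good" has a linear score of zero.
    """
    scores = {"good": 0.0}
    for outcome, beta in coefs.items():
        scores[outcome] = beta["const"] + sum(
            beta[name] * value for name, value in profile.items()
        )
    # Subtract the largest score before exponentiating for stability.
    top = max(scores.values())
    weights = {k: math.exp(s - top) for k, s in scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# PLACEHOLDER coefficients, invented for illustration only.
PLACEHOLDER_COEFS = {
    "poor": {"const": -1.5, "age": 0.04, "education": -0.15,
             "black": 0.5, "south": 0.4},
    "fair": {"const": -1.0, "age": 0.02, "education": -0.08,
             "black": 0.3, "south": 0.2},
}

# The two profiles defined in the post.
profile_1 = {"age": 40, "education": 17, "black": 0, "south": 0}
profile_2 = {"age": 70, "education": 8, "black": 1, "south": 1}

probs_1 = health_probabilities(PLACEHOLDER_COEFS, profile_1)
probs_2 = health_probabilities(PLACEHOLDER_COEFS, profile_2)
```

Each call returns a dictionary with one probability per response, so the two profiles can be compared outcome by outcome once real coefficient estimates are substituted for the placeholders.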
Age, race, years of educational attainment and living in or outside the South have substantial associations with self-reported probabilities of having poor, fair or good health. Somewhat surprisingly, gender is not a statistically significant predictor of self-reported health.