I saw a group of young adults smoking near the doorway of the grocery store today. That experience made me wonder which demographic variables are associated with this known behavioral risk. The database of choice for this research project is the Behavioral Risk Factor Surveillance System (BRFSS), an ongoing data collection program designed to measure behavioral risk factors in the adult population. Fortunately, I had previously downloaded the dataset for another research project.
The variables of interest in this study are:
- Smokers, the dependent variable, measured 1 = Yes, 0 = No
- Race, a categorical variable, capturing race in two categories: white and non-white
- Education, a categorical variable with four levels of educational attainment: 1) Less than high school, 2) high school diploma, 3) attended college and 4) college degree.
- Income, a categorical variable capturing five levels of income: 1) Less than $15,000, 2) $15,000 to $25,000, 3) $25,000 to $35,000, 4) $35,000 to $50,000 and 5) $50,000 or more
- Age, a continuous variable, capturing the person’s age in years.
The number of observations with complete information equals 380,136.
Because the dependent variable is binary and we are interested in predicting the probability of smoking, the statistical tool of choice is logistic regression. Smokers was regressed on race, education, income and age.
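For readers who want the model written out, one plausible form of the regression described above is sketched here. The symbols are my own notation, and the choice of the lowest education and income levels (and non-white) as reference categories is an assumption, since the coding of the reference groups is not stated in this post.

```latex
% Linear predictor (log-odds of smoking) for respondent i.
% Assumed reference categories: non-white, less than high school, income < $15,000.
\[
\eta_i = \beta_0
  + \beta_1\,\mathrm{White}_i
  + \sum_{j=2}^{4}\gamma_j\,\mathrm{Educ}_{ij}
  + \sum_{k=2}^{5}\delta_k\,\mathrm{Income}_{ik}
  + \theta\,\mathrm{Age}_i
\]
% The predicted probability of smoking is the inverse logit of the linear predictor.
\[
\log\frac{p_i}{1-p_i} = \eta_i
\quad\Longleftrightarrow\quad
p_i = \frac{1}{1+e^{-\eta_i}}
\]
```

Here Educ and Income are indicator variables for each level above the reference, so each coefficient shifts the log-odds of smoking relative to that reference group.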
The model is statistically significant (Prob > chi2 = 0.000, i.e., p < 0.001), as are all covariates (P>|z| = 0.000). To depict the findings, I will present a series of graphs that show the predicted probability of smoking for each predictor across the age distribution.
The probability of smoking decreases as age and educational attainment increase, and the differences between education levels narrow with age.
The same pattern holds for income: smoking probabilities decrease as age and income increase, and the differences between income levels narrow as age increases.
Whites exhibit greater probabilities of smoking than non-whites throughout the age distribution, with the gap narrowing as age increases.
D. Ideal Types
Using Max Weber’s concept of an ideal type - an analytical construct that provides an opportunity to make comparative observations - I calculated predicted probabilities for two ideal-type individuals, both age 40, with the best and worst characteristics associated with the chances of becoming a smoker:
Individual 1 (all the worst characteristics associated with smoking): Age 40, white, with less than a high school diploma and an income of less than $15,000.
Individual 2 (all the best characteristics associated with not smoking): Age 40, non-white, with a college degree and an income of $50,000 or more.
Individual 1 has a 47% probability of becoming a smoker, compared to individual 2's probability of only 7%.
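To put the gap between the two ideal types on the scale that logistic regression actually works in, the two reported probabilities can be converted to odds. The short sketch below uses only the 47% and 7% figures above; the function and variable names are my own, and it does not reproduce the fitted model or its coefficients.

```python
import math


def odds(p: float) -> float:
    """Convert a probability strictly between 0 and 1 into odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return p / (1.0 - p)


def logit(p: float) -> float:
    """Log-odds, the scale on which logistic regression is linear."""
    return math.log(odds(p))


# Predicted probabilities reported above for the two ideal types.
p_individual_1 = 0.47
p_individual_2 = 0.07

# Three common ways to summarize the same gap.
risk_difference = p_individual_1 - p_individual_2
risk_ratio = p_individual_1 / p_individual_2
odds_ratio = odds(p_individual_1) / odds(p_individual_2)

# The gap in log-odds equals the log of the odds ratio.
log_odds_gap = logit(p_individual_1) - logit(p_individual_2)

print(f"Risk difference: {risk_difference:.2f}")
print(f"Risk ratio:      {risk_ratio:.2f}")
print(f"Odds ratio:      {odds_ratio:.2f}")
print(f"Log-odds gap:    {log_odds_gap:.2f}")
```

Read this way, individual 1 is about 40 percentage points more likely to smoke, which corresponds to odds of smoking roughly 11.8 times those of individual 2.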