ID 543: Day 4

Introduction to R

Homework review

Exercise

Characteristic	Male N = 501¹	Female N = 704¹
race_eth_cat
Hispanic	81 (16%)	130 (18%)
Black	138 (28%)	169 (24%)
Non-Black, Non-Hispanic	282 (56%)	405 (58%)
eyesight_cat
Excellent	228 (46%)	246 (35%)
Very Good	162 (32%)	223 (32%)
Good	85 (17%)	164 (23%)
Fair	21 (4.2%)	57 (8.1%)
Poor	5 (1.0%)	14 (2.0%)
glasses	221 (44%)	403 (57%)
age_bir	24.0 (21.0, 29.0)	21.0 (18.0, 26.0)
¹ n (%); Median (Q1, Q3)

Characteristic	Male N = 501¹	Female N = 704¹
Race/ethnicity
Hispanic	81 (16%)	130 (18%)
Black	138 (28%)	169 (24%)
Non-Black, Non-Hispanic	282 (56%)	405 (58%)
Eyesight
Excellent	228 (46%)	246 (35%)
Very Good	162 (32%)	223 (32%)
Good	85 (17%)	164 (23%)
Fair	21 (4.2%)	57 (8.1%)
Poor	5 (1.0%)	14 (2.0%)
Wears glasses	221 (44%)	403 (57%)
Age at first birth	24.0 (21.0, 29.0)	21.0 (18.0, 26.0)
¹ n (%); Median (Q1, Q3)
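
The two tables above differ only in their row headers: raw variable names in the first, readable labels in the second. A minimal sketch of how that difference might be produced with gtsummary::tbl_summary() follows. The data frame nlsy_toy is simulated for illustration and is not the course's nlsy data. The grouping column name sex_cat is an assumption; the other column names are taken from the first table.

```r
library(gtsummary)

# Simulated stand-in for the course data; all values are made up.
set.seed(543)
n <- 300
nlsy_toy <- data.frame(
  sex_cat = factor(sample(c("Male", "Female"), n, replace = TRUE),
                   levels = c("Male", "Female")),
  race_eth_cat = factor(
    sample(c("Hispanic", "Black", "Non-Black, Non-Hispanic"), n, replace = TRUE),
    levels = c("Hispanic", "Black", "Non-Black, Non-Hispanic")
  ),
  eyesight_cat = factor(
    sample(c("Excellent", "Very Good", "Good", "Fair", "Poor"), n, replace = TRUE),
    levels = c("Excellent", "Very Good", "Good", "Fair", "Poor")
  ),
  glasses = rbinom(n, size = 1, prob = 0.5),   # 0/1, summarized on a single row
  age_bir = round(rnorm(n, mean = 23, sd = 5)) # continuous, summarized as median (Q1, Q3)
)

# Without `label`, rows are headed by the raw variable names.
tbl_raw <- tbl_summary(
  nlsy_toy,
  by = sex_cat,
  include = c(race_eth_cat, eyesight_cat, glasses, age_bir)
)

# With `label`, each variable gets a readable row header.
tbl_labeled <- tbl_summary(
  nlsy_toy,
  by = sex_cat,
  include = c(race_eth_cat, eyesight_cat, glasses, age_bir),
  label = list(
    race_eth_cat ~ "Race/ethnicity",
    eyesight_cat ~ "Eyesight",
    glasses ~ "Wears glasses",
    age_bir ~ "Age at first birth"
  )
)

tbl_labeled
```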

Participant characteristics
Variable	Total	Male N = 501	Female N = 704	P
Race/ethnicity				0.3
Hispanic	211 (18%)	81 (16%)	130 (18%)
Black	307 (25%)	138 (28%)	169 (24%)
Non-Black, Non-Hispanic	687 (57%)	282 (56%)	405 (58%)
Eyesight				<0.001
Excellent	474 (39%)	228 (46%)	246 (35%)
Very Good	385 (32%)	162 (32%)	223 (32%)
Good	249 (21%)	85 (17%)	164 (23%)
Fair	78 (6.5%)	21 (4.2%)	57 (8.1%)
Poor	19 (1.6%)	5 (1.0%)	14 (2.0%)
Wears glasses	624 (52%)	221 (44%)	403 (57%)	<0.001
Age at first birth	22.0 (19.0, 27.0)	24.0 (21.0, 29.0)	21.0 (18.0, 26.0)	<0.001
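
The table above adds a Total column, a p-value column, renamed headers, and a caption. One way this kind of customization might be done is sketched below with gtsummary helper functions. The data are simulated, the column name sex_cat is an assumption, and the tests that add_p() picks depend on the data, so the p-values will not match the slide.

```r
library(gtsummary)

# Simulated stand-in for the course data; all values are made up.
set.seed(543)
n <- 300
nlsy_toy <- data.frame(
  sex_cat = factor(sample(c("Male", "Female"), n, replace = TRUE),
                   levels = c("Male", "Female")),
  race_eth_cat = factor(
    sample(c("Hispanic", "Black", "Non-Black, Non-Hispanic"), n, replace = TRUE),
    levels = c("Hispanic", "Black", "Non-Black, Non-Hispanic")
  ),
  glasses = rbinom(n, size = 1, prob = 0.5),
  age_bir = round(rnorm(n, mean = 23, sd = 5))
)

tbl_custom <- tbl_summary(
  nlsy_toy,
  by = sex_cat,
  include = c(race_eth_cat, glasses, age_bir),
  label = list(
    race_eth_cat ~ "Race/ethnicity",
    glasses ~ "Wears glasses",
    age_bir ~ "Age at first birth"
  )
) |>
  add_overall(col_label = "**Total**") |>   # column summarizing everyone
  add_p() |>                                # test comparing the `by` groups
  modify_header(label = "**Variable**", p.value = "**P**") |>
  modify_caption("Participant characteristics")

tbl_custom
```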

Characteristic	Beta	95% CI	p-value
(Intercept)	1,029	-3,747, 5,805	0.7
Sex
Male	—	—
Female	-4,681	-10,553, 1,191	0.12
Age at first birth	482	304, 661	<0.001
Race/ethnicity
Hispanic	—	—
Black	-300	-2,448, 1,849	0.8
Non-Black, Non-Hispanic	6,418	4,501, 8,336	<0.001
Sex/age interaction
Female * Age at first birth	162	-77, 400	0.2
Abbreviation: CI = Confidence Interval

Characteristic	OR	95% CI	p-value
Eyesight
Excellent	—	—
Very Good	0.92	0.70, 1.21	0.5
Good	0.86	0.63, 1.18	0.4
Fair	0.58	0.35, 0.95	0.032
Poor	1.16	0.46, 3.08	0.8
Sex
Male	—	—
Female	1.84	1.45, 2.33	<0.001
Income	1.00	1.00, 1.00	<0.001
Abbreviations: CI = Confidence Interval, OR = Odds Ratio
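
An odds-ratio table like the one above can be sketched with glm() and gtsummary::tbl_regression(exponentiate = TRUE). The slide does not name the outcome, so using glasses as the outcome is an assumption, as are the column names and the simulated data. The sketch also assumes the broom and broom.helpers packages are installed, since tbl_regression() relies on them.

```r
library(gtsummary)

# Simulated stand-in for the course data; all values are made up.
set.seed(543)
n <- 500
toy <- data.frame(
  sex_cat = factor(sample(c("Male", "Female"), n, replace = TRUE),
                   levels = c("Male", "Female")),
  eyesight_cat = factor(
    sample(c("Excellent", "Very Good", "Good", "Fair", "Poor"), n, replace = TRUE),
    levels = c("Excellent", "Very Good", "Good", "Fair", "Poor")
  ),
  income = round(runif(n, min = 0, max = 50000))
)
# Assumed binary outcome: wearing glasses, made more likely for women.
toy$glasses <- rbinom(n, size = 1,
                      prob = plogis(-0.3 + 0.6 * (toy$sex_cat == "Female")))

# Logistic regression on the log-odds scale.
fit_logit <- glm(glasses ~ eyesight_cat + sex_cat + income,
                 data = toy, family = binomial())

# exponentiate = TRUE reports odds ratios instead of log-odds coefficients.
tbl_or <- tbl_regression(
  fit_logit,
  exponentiate = TRUE,
  label = list(
    eyesight_cat ~ "Eyesight",
    sex_cat ~ "Sex",
    income ~ "Income"
  )
)

tbl_or
```

An odds ratio shown as 1.00 (1.00, 1.00) with a small p-value, as in the Income row, usually means the effect per single unit is too small to display at two decimals. Rescaling the predictor (for example, per 10,000 units) is a common remedy.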

Characteristic	Model 1			Model 2
	Beta	95% CI	p-value	Beta	95% CI	p-value
(Intercept)	-1,201	-4,657, 2,256	0.5	1,029	-3,747, 5,805	0.7
Sex
Male	—	—		—	—
Female	-833	-2,283, 617	0.3	-4,681	-10,553, 1,191	0.12
Age at first birth	571	448, 693	<0.001	482	304, 661	<0.001
Race/ethnicity
Hispanic	—	—		—	—
Black	-287	-2,436, 1,863	0.8	-300	-2,448, 1,849	0.8
Non-Black, Non-Hispanic	6,434	4,516, 8,352	<0.001	6,418	4,501, 8,336	<0.001
Sex/age interaction
Female * Age at first birth				162	-77, 400	0.2
Abbreviation: CI = Confidence Interval
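
A side-by-side table like the one above might be built by fitting the two models separately and combining their tables with gtsummary::tbl_merge(). In this sketch the outcome name income, the other column names, and the simulated data are all assumptions. Variable labels are left at their defaults to keep it short.

```r
library(gtsummary)

# Simulated stand-in for the course data; all values are made up.
set.seed(543)
n <- 400
toy <- data.frame(
  sex_cat = factor(sample(c("Male", "Female"), n, replace = TRUE),
                   levels = c("Male", "Female")),
  race_eth_cat = factor(
    sample(c("Hispanic", "Black", "Non-Black, Non-Hispanic"), n, replace = TRUE),
    levels = c("Hispanic", "Black", "Non-Black, Non-Hispanic")
  ),
  age_bir = round(rnorm(n, mean = 23, sd = 5))
)
# Assumed continuous outcome.
toy$income <- round(1000 + 500 * toy$age_bir + rnorm(n, sd = 8000))

# Model 1: main effects only. Model 2: adds a sex-by-age interaction.
fit_main <- lm(income ~ sex_cat + age_bir + race_eth_cat, data = toy)
fit_int  <- lm(income ~ sex_cat * age_bir + race_eth_cat, data = toy)

tbl_main <- tbl_regression(fit_main, intercept = TRUE)
tbl_int  <- tbl_regression(fit_int, intercept = TRUE)

# Place the two tables next to each other under spanning headers.
tbl_both <- tbl_merge(
  list(tbl_main, tbl_int),
  tab_spanner = c("**Model 1**", "**Model 2**")
)

tbl_both
```

The interaction row is blank under Model 1 because that term appears only in the second model.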

ID 543: Day 4
Homework review
Today’s goals
Tables
Easier way to make frequency tables
Cross-tabulations with tabyl()
tabyl()
Other percentages
Other functions
tabyl()
We can easily do a chi-squared test
Exercise
Making more complex tables
gtsummary::tbl_summary()
tbl_summary( nlsy,...
tbl_summary()
Table customization
Exercise
Regression
Helpful regression packages
broom::tidy()
broom::tidy() can also help with other statistics
gtsummary::tbl_regression()
You could put several together
Exercise
Figures in R using ggplot()
Why ggplot?
ggplot builds figures by adding on pieces via a particular “grammar of graphics”
Basic structure of a ggplot
Let’s walk through the creation of a figure
What are some of the layers we may need for this one?
Returning to our basic structure
Scatterplot: geom_point()
What if we want to change the color of the points?
What if we want the color to correspond to values of a variable?
Alternative specification
Exercise
Let’s change the colors
Color palettes
Change the title on the legend
Change the axis scale
We can label the axis better
Exercise
Facets
ggplot(data = nlsy,...
Wait, these look like histograms!
“stat_bin() using bins = 30. Pick better value with binwidth.”
ggplot(nlsy, aes(x...
Themes to make our plots prettier
p <- ggplot(nlsy,...
p + theme_minimal()...
p + theme_dark()...
p + theme_classic()...
p + theme_void()...
p + ggthemes::theme_fivethirtyeight()...
p + ggthemes::theme_excel_new()...
Finally, save it!
More resources
Today’s summary
Today’s functions