Producing output tables
Very often you need to present the results of your analysis to your supervisor or a wider audience, so you want them to look neat and tidy. Traditionally that meant exporting your results from your statistical software and bringing them into something like Word or PowerPoint to clean up. That’s still true to a degree, but there are R packages that let you produce web pages, slide decks, and even an entire book. Output formatting is not the focus of this course, but I have decided to include it nevertheless.
Here we need two new R packages, tableone and labelled:
library(tableone)
library(labelled)
We read in the familiar heart dataset and then convert it to a table using the CreateTableOne() function:
heart <- read.csv("C:/epi551/heart.csv", header=TRUE)
hearttable <- CreateTableOne(data=heart)
print(hearttable)
##
##                                        Overall
##   n                                       303
##   Age (mean (SD))                       54.44 (9.04)
##   Sex (mean (SD))                        0.68 (0.47)
##   Chest_Pain_Type (mean (SD))            3.16 (0.96)
##   Resting_Blood_Pressure (mean (SD))   131.69 (17.60)
##   Serum_Cholesterol (mean (SD))        246.69 (51.78)
##   Fasting_Blood_Sugar (mean (SD))        0.15 (0.36)
##   Resting_ECG (mean (SD))                0.99 (0.99)
##   Max_Heart_Rate_Achieved (mean (SD))  149.61 (22.88)
##   Exercise_Induced_Angina (mean (SD))    0.33 (0.47)
##   ST_Depression_Exercise (mean (SD))     1.04 (1.16)
##   Peak_Exercise_ST_Segment (mean (SD))   1.60 (0.62)
##   Num_Major_Vessels_Flouro (%)
##      ?                                      4 ( 1.3)
##      0.0                                  176 (58.1)
##      1.0                                   65 (21.5)
##      2.0                                   38 (12.5)
##      3.0                                   20 ( 6.6)
##   Thalassemia (%)
##      ?                                      2 ( 0.7)
##      3.0                                  166 (54.8)
##      6.0                                   18 ( 5.9)
##      7.0                                  117 (38.6)
##   Diagnosis_Heart_Disease (mean (SD))    0.94 (1.23)
##   diagnosis = Yes (%)                     139 (45.9)
This is not going to win you any graphic design awards, but it could suffice. One thing that is perhaps not ideal is the long variable names. The words themselves may be fine, but we don’t want them joined by underscores. We can use the var_label() function from the labelled package to attach a label to each variable, and then ask print() to show the labels instead of the names. In this example, I have chosen short labels to save space. You wouldn’t present it like this at a conference, but maybe you would to your team.
# an unnamed vector is matched to the columns by position: one label per column, in order
var_label(heart) <- c("age","sex","pain","bp","chol","sugar","ecg",
                      "maxhr","angina","stdep","peakst","vessels","thal","dx","dx2")
hearttable <- CreateTableOne(data=heart)
print(hearttable, varLabels=TRUE)
##
##                       Overall
##   n                      303
##   age (mean (SD))      54.44 (9.04)
##   sex (mean (SD))       0.68 (0.47)
##   pain (mean (SD))      3.16 (0.96)
##   bp (mean (SD))      131.69 (17.60)
##   chol (mean (SD))    246.69 (51.78)
##   sugar (mean (SD))     0.15 (0.36)
##   ecg (mean (SD))       0.99 (0.99)
##   maxhr (mean (SD))   149.61 (22.88)
##   angina (mean (SD))    0.33 (0.47)
##   stdep (mean (SD))     1.04 (1.16)
##   peakst (mean (SD))    1.60 (0.62)
##   vessels (%)
##      ?                     4 ( 1.3)
##      0.0                 176 (58.1)
##      1.0                  65 (21.5)
##      2.0                  38 (12.5)
##      3.0                  20 ( 6.6)
##   thal (%)
##      ?                     2 ( 0.7)
##      3.0                 166 (54.8)
##      6.0                  18 ( 5.9)
##      7.0                 117 (38.6)
##   dx (mean (SD))        0.94 (1.23)
##   dx2 = Yes (%)          139 (45.9)
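Assigning labels as one long unnamed vector works, but it depends on the column order staying the same. As a minimal sketch of an alternative, var_label() also accepts a named list, so each label is tied to a column name. The toy data frame below is made up for illustration and only borrows a few column names from the heart data.

```r
library(labelled)

# Made-up stand-in for a few columns of the heart data (values are illustrative only)
toy <- data.frame(
  Age = c(63, 67, 41),
  Sex = c(1, 1, 0),
  Serum_Cholesterol = c(233, 286, 204)
)

# Label selected columns by name; columns not listed are left without a label
var_label(toy) <- list(Age = "age", Serum_Cholesterol = "chol")

# Show the label attached to each column
var_label(toy)
```

Naming the columns makes it harder to attach a label to the wrong variable if the dataset later gains or loses a column.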
You’ll notice that most lines read (mean (SD)), and the categorical ones read (%). Maybe you or your audience find this redundant. Setting the explain parameter to FALSE removes it:
print(hearttable, varLabels=TRUE, explain=FALSE)
##
##               Overall
##   n              303
##   age          54.44 (9.04)
##   sex           0.68 (0.47)
##   pain          3.16 (0.96)
##   bp          131.69 (17.60)
##   chol        246.69 (51.78)
##   sugar         0.15 (0.36)
##   ecg           0.99 (0.99)
##   maxhr       149.61 (22.88)
##   angina        0.33 (0.47)
##   stdep         1.04 (1.16)
##   peakst        1.60 (0.62)
##   vessels
##      ?             4 ( 1.3)
##      0.0         176 (58.1)
##      1.0          65 (21.5)
##      2.0          38 (12.5)
##      3.0          20 ( 6.6)
##   thal
##      ?             2 ( 0.7)
##      3.0         166 (54.8)
##      6.0          18 ( 5.9)
##      7.0         117 (38.6)
##   dx            0.94 (1.23)
##   dx2 = Yes      139 (45.9)
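Without the explanation, a row such as sex 0.68 (0.47) no longer tells the reader that it is the mean and SD of a 0/1 code. If you would rather have coded variables summarised as categories, one option is the factorVars argument of CreateTableOne(). The sketch below uses a small made-up data frame rather than the heart file, and the choice of which columns to treat as categorical is my assumption.

```r
library(tableone)

# Made-up stand-in for the heart data (values are illustrative only)
toy <- data.frame(
  Age = c(63, 67, 41, 56, 62, 57),
  Sex = c(1, 1, 0, 1, 0, 0),
  Chest_Pain_Type = c(1, 4, 2, 2, 4, 3)
)

# Summarise the numerically coded columns as categories instead of mean (SD)
toytable <- CreateTableOne(data = toy, factorVars = c("Sex", "Chest_Pain_Type"))
print(toytable)
```

Age is still reported as a mean and SD, while Sex and Chest_Pain_Type are now reported as counts and percentages.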
For categorical variables, like thal, the default is to show the count followed by the percentage in parentheses, but format="p" shows the percentage only:
print(hearttable, varLabels=TRUE, explain=FALSE, format="p")
##
##               Overall
##   n              303
##   age          54.44 (9.04)
##   sex           0.68 (0.47)
##   pain          3.16 (0.96)
##   bp          131.69 (17.60)
##   chol        246.69 (51.78)
##   sugar         0.15 (0.36)
##   ecg           0.99 (0.99)
##   maxhr       149.61 (22.88)
##   angina        0.33 (0.47)
##   stdep         1.04 (1.16)
##   peakst        1.60 (0.62)
##   vessels
##      ?           1.3
##      0.0        58.1
##      1.0        21.5
##      2.0        12.5
##      3.0         6.6
##   thal
##      ?           0.7
##      3.0        54.8
##      6.0         5.9
##      7.0        38.6
##   dx            0.94 (1.23)
##   dx2 = Yes     45.9
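The format parameter takes other values as well: "f" for counts only, "fp" for count then percentage, and "pf" for percentage then count. If the table is still headed for Word or a spreadsheet, print() can also hand back the formatted table as a character matrix when printToggle is FALSE, and that matrix can be written to a file. This is a rough sketch on made-up data, and the output file is a temporary file chosen purely for illustration.

```r
library(tableone)

# Made-up stand-in for two columns of the heart data (values are illustrative only)
toy <- data.frame(
  Age = c(63, 67, 41, 56, 62, 57),
  Thalassemia = factor(c("3.0", "7.0", "3.0", "6.0", "7.0", "3.0"))
)
toytable <- CreateTableOne(data = toy)

# Counts only, without percentages
print(toytable, format = "f")

# Capture the formatted table as a character matrix instead of printing it
tabmat <- print(toytable, format = "pf", printToggle = FALSE)

# Write the matrix to a CSV file (a temporary file here; use your own path)
outfile <- tempfile(fileext = ".csv")
write.csv(tabmat, file = outfile)
```

The CSV can then be opened in a spreadsheet program and pasted into a document for final tidying.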