Whether you’re a professional data scientist, an experienced academic researcher or a student learning the basics of data science, skills in creating publication-ready tables and graphics are essential.
Visually appealing, clear and error-free tables can enhance any report or presentation, and for some publications, such as scientific articles or white papers, formatting tables in line with certain guidelines (e.g., APA format) is a requirement.
Despite all of its amazing functionalities and features, base R lacks easy and straightforward methods to make descriptive (aka summary) statistics tables and correlation tables in publication-ready formats.
Regardless of the particular research topic in data science, it is important to create a descriptive statistics table showing means, standard deviations, medians, and ranges (or more!), and a Pearson (or Spearman) bivariate correlation table showing the strength of associations between variables.
Luckily, there are two R packages that have exactly the functionality that you need to create publication-ready summary statistics and correlation tables in R. These packages are the psych package and the corx package, both of which have many additional options and more advanced capabilities that go beyond the scope of this article.
This step-by-step tutorial will walk you through the process with a sample dataset included in all versions of R, showing exactly what you need to do to format a descriptive statistics table and a correlation table in R, including how to open each one in Excel and add it to a Word document or a Google Docs document.
Creating, formatting and exporting a descriptive statistics table in R
First, you need to load the mtcars dataset that is built into R, which contains data from Motor Trend magazine (published in the U.S.), including fuel consumption and 10 characteristics of automobile design and performance for 32 automobiles (1973–74 models; run ?mtcars in R for more details about the variables).
Next, the psych package and its describe() function (not to be confused with describe() in the Hmisc package) make it easy to calculate a wide variety of descriptive statistics. With a little magic from base R’s write.csv() function, you can export the table to a .csv file, which can be easily opened in Excel and then pasted into Word or Google Docs.
Creating and formatting the descriptives table
#Create descriptives and correlation tables

#install packages, if you do not have
#them already installed
install.packages("psych")
install.packages("corx")

#load packages and set options
library(psych)
library(corx)
options(scipen = 100000)

#Load sample mtcars data into R
#as data.frame object
data("mtcars")

#check mtcars data.frame
head(mtcars)
str(mtcars)
dim(mtcars)

#dataset contains 11 variables about
#car characteristics for 32 cars
#use ?mtcars for more details about
#this R sample dataset

#run descriptive statistics and save
#table as object to export
desc_tab1 <- psych::describe(mtcars)
desc_tab1
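As a quick cross-check on the describe() output, a few of the headline statistics mentioned earlier (n, mean, standard deviation, median, minimum and maximum) can also be computed with base R alone. This is only a sketch: the object name summary_stats is made up for illustration, and it covers just a subset of what describe() reports.

```r
# Base R cross-check of a few statistics that describe() also reports.
# 'summary_stats' is a hypothetical name used only for this example.
data("mtcars")

# One row per variable, one column per statistic
summary_stats <- t(sapply(mtcars, function(x) {
  c(n = length(x),
    mean = mean(x),
    sd = sd(x),
    median = median(x),
    min = min(x),
    max = max(x))
}))

# Round for display, as you would in a report table
summary_stats <- round(summary_stats, 2)
summary_stats
```

If the values here agree with the matching columns in desc_tab1, you can be reasonably confident that nothing went wrong when loading the data.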
Exporting the formatted descriptives table to open in Excel
At this point, all you need to do is take this newly created "desc_tab1" object and save it as a CSV file in your working directory. You can set the working directory with setwd("filepath"), where filepath is the full path to the folder on your computer where you want the table to be exported.
#export descriptive table to CSV
#in working directory to open in Excel
write.csv(desc_tab1, file = "descriptives_table1.csv")
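If you are not sure where the exported file will land, something like the following may help. It is a minimal sketch under a few assumptions: tempdir() stands in for whatever folder you actually want to use, and demo_tab is a small made-up table rather than the describe() output, so the example runs without any extra packages.

```r
# Write a table to an explicit folder and read it back to confirm the export.
# tempdir() and 'demo_tab' are stand-ins used for illustration only.
data("mtcars")
demo_tab <- data.frame(mean = sapply(mtcars, mean),
                       sd = sapply(mtcars, sd))
demo_tab <- round(demo_tab, 2)

# Folder where write.csv() saves files when given a bare file name
getwd()

# Build a full path instead of relying on the working directory
out_file <- file.path(tempdir(), "descriptives_demo.csv")
write.csv(demo_tab, file = out_file)

# Read the file back in; the first column holds the variable names
check_tab <- read.csv(out_file, row.names = 1)
check_tab
```

Building the path with file.path() is one way to avoid changing the working directory at all, which some people find less error-prone than setwd().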
Then just open the new .csv file in Excel, add your full variable names and statistic labels (e.g., mean, median), set the number of decimal places and (boom!) it’s all set to copy and paste into Word or Google Docs!
Creating, formatting and exporting (an APA-style) bivariate correlation table in R
Just like for creating a descriptive statistics table, an excellent R package (corx) has been developed to make the process of creating, formatting and exporting correlation tables so much easier and less time-consuming.
Before you begin, make sure you have installed and loaded the corx package with the code above. Even better, the steps below can be used to calculate either Pearson or Spearman correlations, which have different applications based on the types of variables.
Calculating the Pearson correlations and showing the table in R
First, using the mtcars dataset again as an example, the corx package makes it really easy to create correlation tables, as long as all variables are numeric. As you can see above, we ran the str() function, which shows that all variables are “num” for their type, which is a requirement for running bivariate correlations.
#run bivariate Pearson correlations
#with default options
corx(mtcars)
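The all-numeric requirement can also be checked programmatically rather than by reading the str() output, which is handy for wider datasets. The sketch below uses only base R; the names is_num and numeric_only are made up for this example.

```r
# Check that every column is numeric before running correlations.
# 'is_num' and 'numeric_only' are hypothetical names for illustration.
data("mtcars")

is_num <- sapply(mtcars, is.numeric)

# TRUE when every column qualifies
all(is_num)

# Names of any columns that would need converting or dropping
names(mtcars)[!is_num]

# Keep only the numeric columns (a no-op for mtcars)
numeric_only <- mtcars[, is_num, drop = FALSE]
dim(numeric_only)
```

For mtcars this changes nothing, but with your own data the numeric_only object is what you would pass to the correlation function.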
Exporting the Pearson correlation table
And that’s all there is to it!
As you can see, the table is already APA style formatted in R with a * for p < 0.05 and values are rounded to two decimal places, which you can then export with the base R write.csv() function to save the table as a .csv file to easily open in Microsoft Excel.
#save bivariate correlation table
#and export to working directory
#to open in Excel
write.csv(corx(mtcars), file = "correlations_table1.csv")
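To see where a starred entry comes from, you can test a single pair of variables with base R's cor.test(). This is a rough sketch rather than a description of how corx works internally: the pair (mpg and wt) is chosen arbitrarily, and the object names are made up for the example.

```r
# Pearson correlation and p-value for one pair of variables,
# formatted with a star when p < 0.05.
# 'pair_test', 'r_value', 'p_value' and 'starred' are hypothetical names.
data("mtcars")

pair_test <- cor.test(mtcars$mpg, mtcars$wt, method = "pearson")

r_value <- round(unname(pair_test$estimate), 2)
p_value <- pair_test$p.value

# Two decimal places, plus a star for significance at the 0.05 level
starred <- paste0(format(r_value, nsmall = 2),
                  if (p_value < 0.05) "*" else "")
starred
```

The exact text formatting in the exported table may differ, but the coefficient and its significance should agree with the corresponding cell.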
Calculating Spearman correlations and exporting the table
You might also want to calculate Spearman rank-order correlations and/or just keep the lower part of the table (as the upper part above the diagonal is a mirror image), which you can do with just a few more options in the code as shown below.
#run bivariate Spearman correlations
#and save only lower (non-repeated)
#portion of table
write.csv(corx(mtcars, method = "spearman", triangle = "lower"),
          file = "correlations_table2.csv")
Next, all you have to do is replace the abbreviated variable names with their full names in Excel, and then copy and paste the table into Word or Google Docs.
It’s that easy!
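If you want to see what the Spearman and lower-triangle options amount to, the same idea can be sketched in base R with cor() and upper.tri(). This is an illustration only, not the corx implementation: it has no significance stars, it keeps the diagonal, and the names rho and lower_only are made up for the example.

```r
# Spearman rank-order correlations with the mirrored upper triangle blanked.
# 'rho' and 'lower_only' are hypothetical names for illustration.
data("mtcars")

# Full symmetric matrix of Spearman correlations, rounded for display
rho <- round(cor(mtcars, method = "spearman"), 2)

# Blank out everything above the diagonal, since it repeats the lower half
lower_only <- rho
lower_only[upper.tri(lower_only)] <- NA
lower_only
```

Comparing lower_only with the exported file is a simple way to confirm that the coefficients in your table are the ones you expect.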
A quick pro tip for very large correlation tables: Make the font size very small in Excel (e.g., 8 point), highlight the whole sheet and then use “Format -> AutoFit Column Width” before pasting into a “Landscape” orientation page in Word or Google Docs. This helps the table fit on one page without the sides getting cut off.
The bottom line
Whether you are just getting started in data science or have many years of experience, being able to easily generate publication-ready summary statistics and correlation tables is a key skill that speeds up your workflow and avoids the errors that can happen when manually copying values.
Of course, real world datasets are (usually) not quite as clean or easy to work with as the built-in mtcars dataset in R and likely require some data wrangling in other R packages, so make sure you are up-to-date by reading our article on the top 10 essential R packages for data science and statistics.
Also, don’t miss our curated listing of real world datasets that are free to download and our resources about collecting your own survey data for a study or research project.
If you liked this article, make sure to check out our other recent R tutorials, including our guide to making beautiful pie charts and nested pie charts in R using the plotly package.