Statistics in Stata

After you have installed Stata on your computer, you are ready to import your first data set and run some basic statistics. This page provides a brief summary of how to open Stata, load a data set, and run basic descriptive statistics.

We have also included some more advanced topics and resources at the end of the guide. If you need help downloading and installing the latest version of Stata, make sure to check out our Installing Statistical Software Programs page.

First, import some data into Stata to work with. To do this, click on “File”, then go to “Import” and select the type of data set you would like to import. In most cases, you will be using Excel or .csv files, which are the first and second options in the menu. After you select your file, it will be loaded into Stata automatically, ready for analysis. You can also import data by typing a line of code that includes the path to the file on your computer, as shown below.

* Clear Stata's memory and then import an
* Excel file (.xls or .xlsx) or Comma Separated Values
* (.csv) file

clear

import excel "c:\folder\filename.xls", sheet("sheetname")

* Code to import Comma Separated Values (.csv) files in Stata *

clear

import delimited "c:\mydata1.csv"
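After an import, it can help to confirm that the data arrived as expected. The sketch below is only an illustration under a few assumptions: it uses Stata's bundled auto data set (loaded with sysuse) so that it runs without any file of your own, and the file names auto_example.csv and auto_example.xlsx are hypothetical files written to the current working directory.

```stata
* Load Stata's bundled example data (assumption: used here only
* so the example runs without a file of your own)
sysuse auto, clear

* Write hypothetical example files to the working directory
export delimited using "auto_example.csv", replace
export excel using "auto_example.xlsx", firstrow(variables) replace

* Re-import the .csv file; the clear option empties memory first
import delimited "auto_example.csv", clear

* Check variable names and types, then look at the first rows
describe
list in 1/5

* Re-import the Excel file; firstrow treats the first row of the
* sheet as variable names instead of data
import excel using "auto_example.xlsx", firstrow clear
describe
```

If your own spreadsheet has column headers in its first row, adding the firstrow option to import excel is usually what you want; without it, Stata names the variables A, B, C and so on and treats the header row as data.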

Next, you can run some basic statistics on your data set, including descriptive (aka summary) statistics, correlations between variables, and some simple visualizations.

* Code to run descriptive statistics in Stata

summarize x y z

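* For categorical variables, tab1 produces a one-way
* frequency table for each variable (tabulate itself
* accepts at most two variables)

tab1 x y z

* Two-way cross-tabulation of two categorical variables

tabulate x y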

* Correlations

correlate x y z

* You can also use the pwcorr command

pwcorr x y z

* This command allows for other options, such as
* showing p-values and adding significance stars

pwcorr x y z, sig star(0.05)

* Spearman correlations

spearman x y z

* Scatterplot between two variables

scatter y x

* Add a linear regression line

scatter y x || lfit y x
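To see these commands run from start to finish, here is a minimal sketch on Stata's bundled auto data set. The data set and the variable names (price, mpg, weight, foreign, rep78) are assumptions chosen for illustration and stand in for the x, y and z placeholders above; substitute your own variables.

```stata
* Illustrative run on Stata's bundled example data (an assumption;
* replace price, mpg, weight, etc. with your own variables)
sysuse auto, clear

* Descriptive statistics for continuous variables
summarize price mpg weight

* One-way frequency tables, one per categorical variable
tab1 foreign rep78

* Two-way cross-tabulation
tabulate foreign rep78

* Pairwise Pearson correlations with p-values,
* starring coefficients significant at the 0.05 level
pwcorr price mpg weight, sig star(0.05)

* Spearman rank correlations
spearman price mpg weight

* Scatterplot of price against mpg with a fitted regression line
scatter price mpg || lfit price mpg
```

One difference worth knowing: correlate drops any observation with a missing value in any listed variable, while pwcorr uses all observations available for each pair, so the two can report slightly different coefficients when data are missing.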

There are many additional advanced topics in data science and statistics, and we at Data Science for Anyone are hard at work on creating new expert guides for R, Python, Stata and SPSS. For a super useful and easy-to-use guide to the most common Stata commands for managing, transforming and analyzing data with basic and advanced statistics, we highly recommend that you take a look at this page.

Make sure to check back on our site often for updates, and take a trip over to this page for many free educational resources on data science and statistics provided by the Institute for Digital Research & Education (IDRE) Statistical Consulting group at UCLA!
