Stata is statistical software for data wrangling, visualization, and analysis. It is most commonly used in econometrics and in political, social and epidemiological research. Stata enables you to perform a vast range of statistical analyses and present the results in a well-structured way. You can work either through the menus or by writing Stata commands in a script; a script makes it easy to reproduce your analysis and share it with colleagues or include it in publications. A major advantage of Stata is that it is easier to learn than Python or R. Stata might be the best choice for you if your supervisor or colleagues are using it as well. Although Stata offers very extensive documentation on its built-in commands, finding help online is not as easy as it is for Python or R. Another disadvantage of Stata is the lack of built-in commands for machine learning, although some user-written commands are now available.
The Digital Skills Lab is currently running the following series of Stata workshops:
Workshops will take place in person throughout the year. Click on the links below to book your place or express an interest so that you are notified as soon as workshops are available to book.
Workshops will take place on campus in LRB.R.08 on the lower ground floor of the Library.
Technical Requirements
If you are using your own laptop during the workshops, please ensure you have the following software installed:
- Stata SE 15 or 16. You can request a license and instructions on how to download and install Stata by emailing tech.support@lse.ac.uk. Please ensure you request a license at least 3 working days before the workshop date.
The Department of Methodology also has some online Stata tutorials, produced in 2011, that students have found useful, along with a YouTube training channel. The department also provides training for PhD and MSc students, as well as staff, in the design of social research and in qualitative and quantitative analysis. Information on this can be found on their Methods training page.
If you can't find what you're looking for below, please email digital.skills.lab@lse.ac.uk or attend one of our drop-in sessions for advice.
We also have workshops and self-study courses in SPSS, Python and R. See below if you're not sure which is right for you.
Stata vs SPSS: Which is right for you?
SPSS is similar to Stata in that you can run analyses either through drop-down menus or by writing code. Stata generally has better support available from sources such as the Statalist forum, online documentation and built-in help. Stata is less expensive than SPSS, but SPSS deals better with very large datasets. You can request a license and instructions on how to download and install Stata by emailing tech.support@lse.ac.uk.
Stata vs R or Python
Stata is easier to learn than R and Python, and it makes some types of analysis easier to run. It is still a popular choice in the social sciences, but R and Python are catching up. R and Python are open-source and freely available, whereas Stata requires a licence, which can be expensive. R and Python are better options for data visualisation and machine learning, but you will have to learn to code in order to make full use of them.
The Stata Fundamentals series teaches you the basics of using Stata for statistical analysis. After completing the Stata Fundamentals series, you will be able to import and explore data, use do-files to store commands in a script, and perform a linear regression analysis. Our Stata Fundamentals workshop series is ideal for those with no or very little prior experience of using commands in Stata.
This workshop series teaches you the basics of using commands in Stata and will build your confidence to continue learning independently.
By the end of this session you will understand how to:
- Change the working directory
- Load data from different file formats into Stata
- View data or a subset of your data
- Create basic summary statistics
Click on the link below to check availability and book your place:
Stata Fundamentals 1: Import and inspecting data
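As an illustration of the kind of commands these topics involve, here is a minimal sketch. It assumes the example dataset auto.dta that is installed with Stata; the file names in the commented-out import lines are hypothetical, and the workshop's own materials may differ.

```stata
* Show the current working directory
* (to change it, use: cd "path/to/your/folder")
pwd

* Load the example dataset auto.dta that is installed with Stata
sysuse auto, clear

* Other file formats are loaded with commands such as these
* (file names are hypothetical):
* import delimited "survey.csv", clear
* import excel "survey.xlsx", firstrow clear

* Overview of variable names, storage types and labels
describe

* View a subset of the data: three variables, first five observations
list make price mpg in 1/5

* Basic summary statistics
summarize price mpg weight
```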
In this workshop you will learn how to use do-files to create reproducible scripts and how to use commands to create frequency tables, compute new variables and save your data set.
By the end of this session you will understand how to:
- Use do-files to create reproducible scripts
- Create frequency tables
- Compute new variables
- Save datasets
Click on the link below to check availability and book your place:
Stata Fundamentals 2: Do-files and computing variables
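The following sketch shows what the contents of a short do-file covering these steps might look like. It assumes the example dataset auto.dta that is installed with Stata; the variable name price_k and the file name auto_edited.dta are made up for illustration.

```stata
* Example contents of a do-file; if saved as analysis.do (hypothetical name),
* it can be run with: do analysis.do
sysuse auto, clear

* One-way frequency table of a categorical variable
tabulate foreign

* Two-way frequency table
tabulate rep78 foreign

* Compute a new variable: price in thousands of dollars
generate price_k = price / 1000
label variable price_k "Price (1,000 USD)"

* Save the edited dataset in the current working directory
* (replace overwrites an existing file of the same name)
save "auto_edited.dta", replace
```

Because every step is stored in the script, rerunning the do-file reproduces the same tables and the same saved dataset.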
In this workshop you will learn how to run a simple and multiple linear regression and create scatter plots to visualize the relationship between the variables in your analysis.
By the end of this session you will understand how to:
- Run a simple and multiple linear regression
- Identify key parameters from the output
- Create a scatter matrix
- Create a scatter plot with a line of best fit
Click on the link below to check availability and book your place:
Stata Fundamentals 3: Linear regression
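For orientation, a minimal sketch of these steps using the example dataset auto.dta that is installed with Stata. The choice of variables is an assumption made for illustration and not necessarily what the workshop uses.

```stata
sysuse auto, clear

* Simple linear regression: price explained by fuel efficiency
regress price mpg

* Multiple linear regression with additional predictors
regress price mpg weight foreign

* Scatter matrix of the continuous variables
graph matrix price mpg weight

* Scatter plot with a line of best fit overlaid
twoway (scatter price mpg) (lfit price mpg)
```

In the regress output, the coefficient table reports the estimated coefficients, standard errors and p-values, and the header reports the number of observations and R-squared.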
In this workshop you will learn how to use local macros and foreach loops, essential tools in Stata that help you write scripts more efficiently by reducing the total number of commands. They also make your scripts more readable and less susceptible to errors.
By the end of this session you will understand:
- when and why to use local macros
- how to create and use local macros in do-files
- why and how to use foreach loops
Click on the link below to check availability and book your place:
Stata Fundamentals 4: Foreach Loops
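A minimal sketch of a local macro combined with a foreach loop, assuming the example dataset auto.dta that is installed with Stata. The macro name vars is an arbitrary choice for illustration.

```stata
sysuse auto, clear

* Store a list of variable names in a local macro called vars
local vars price mpg weight

* Refer to the macro contents by wrapping the name in ` and '
summarize `vars'

* Repeat the same commands for each variable in the list
foreach v of local vars {
    display "Summary statistics for `v'"
    summarize `v'
}
```

Run the whole block in one go from a do-file: a local macro only exists during the run in which it is defined, so executing the lines one at a time leaves `vars' empty.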
In this workshop you will learn how the if qualifier lets you run a command on only those observations for which a given expression is true. This way you can restrict a command to specific categories (for instance, only female participants), to a range of values of a continuous variable (for instance, only participants above a minimum income), or to a combination of both.
By the end of this session you will understand:
- the purpose of if qualifiers
- how to use single if qualifiers
- how to combine multiple if qualifiers
Click on the link below to check availability and book your place:
Stata Fundamentals 5: if qualifiers
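A minimal sketch of single and combined if qualifiers, assuming the example dataset auto.dta that is installed with Stata. The thresholds are arbitrary values chosen for illustration.

```stata
sysuse auto, clear

* Single condition on a categorical variable: foreign cars only
summarize price if foreign == 1

* Single condition on a continuous variable: at least 25 miles per gallon
summarize price if mpg >= 25

* Combine conditions with & (and) and | (or)
summarize price if foreign == 1 & mpg >= 25
count if foreign == 0 | price > 10000

* Stata treats missing numeric values as larger than any number,
* so exclude them explicitly when using > or >=
count if rep78 > 3 & !missing(rep78)
```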
Depending on the source, data doesn't always come in a format that is suitable for analysis. The Data Cleaning in Stata series teaches you the fundamental techniques for cleaning and preparing data. After completing this series, you will be able to remove illegal characters, transform text data into a consistent format, correctly label missing values, and apply variable and value labels. You will also know how to reshape data and combine different datasets. Please be aware that the Data Cleaning in Stata series assumes prior knowledge of the material covered in the Stata Fundamentals series.
In this workshop you will learn how to bring string columns into a format that is suitable for data analysis and how to convert a string column containing numerical values into a numeric column.
By the end of this session you will understand how to:
- Identify numerical values that have been imported as strings
- Modify string values
- Convert strings to numeric values
Click on the link below to check availability and book your place:
Data Cleaning in Stata 1: Strings
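A minimal sketch of cleaning a numeric column that was imported as text. The small dataset and the variable names income_str and income are made up for illustration.

```stata
* Build a small made-up dataset with numbers stored as text
clear
input str10 income_str
"1,200"
" 950 "
"n/a"
"3,400"
end

* Confirm that the variable is stored as a string (type str10)
describe income_str

* Modify the string values: remove thousands separators and stray spaces
replace income_str = subinstr(income_str, ",", "", .)
replace income_str = strtrim(income_str)

* Convert to a numeric variable; force turns text that is not a number,
* such as "n/a", into a missing value
destring income_str, generate(income) force

list
```

The force option discards any text that cannot be read as a number, so it is worth inspecting the affected observations before relying on it.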
In this workshop you will learn how to correctly label missing values and duplicates so that Stata can identify them as such when running an analysis or generating plots.
By the end of this session you will understand how to:
- Find and delete duplicates
- Identify observations containing missing values
- Correctly label missing values
- Deal with missing values
Click on the link below to check availability and book your place:
Data Cleaning in Stata 2: Missing values
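A minimal sketch of handling duplicates and missing-value codes. The small dataset is made up for illustration, and it assumes that -99 was used in the raw data as a code for "missing".

```stata
* Build a small made-up dataset; -99 stands for "missing" in the raw data
clear
input id age
1 34
2 -99
2 -99
3 51
end

* Report duplicated observations, then drop the surplus copies
duplicates report
duplicates drop

* Recode -99 to Stata's system missing value (.)
mvdecode age, mv(-99)

* Identify observations with missing values
list if missing(age)
misstable summarize
```

Once -99 is recoded to a missing value, commands such as summarize and regress leave those observations out instead of treating -99 as a real age.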
In this workshop you will learn how to assign variable and value labels, which make your data and analysis output easier to read and interpret.
By the end of this session you will understand how to:
- Assign variable labels
- Assign value labels
Click on the link below to check availability and book your place:
Data Cleaning in Stata 3: Labels
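A minimal sketch of variable and value labels, assuming the example dataset auto.dta that is installed with Stata. The variable expensive and the label name yesno are made up for illustration.

```stata
sysuse auto, clear

* Variable label: a description attached to the variable itself
label variable mpg "Fuel efficiency (miles per gallon)"

* Create a 0/1 indicator to demonstrate value labels
generate byte expensive = price > 10000

* Value labels: first define the mapping, then attach it to the variable
label define yesno 0 "No" 1 "Yes"
label values expensive yesno

* Output now shows "No" and "Yes" instead of 0 and 1
tabulate expensive
```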
Data from online sources often comes in the wide format. The long format, however, is generally considered a cleaner way to represent repeated measurements and is also required for certain analyses in Stata. This workshop teaches you how to reshape data between the wide and long formats.
By the end of this session you will understand:
- Long vs wide data formats
- Reshaping data from wide to long
- Reshaping data with one and multiple identifiers
- Reshaping data from long to wide
- Reshaping data with string suffixes
Click on the link below to check availability and book your place:
Data Cleaning in Stata 4: Reshaping data
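A minimal sketch of reshaping with a single identifier. The small dataset and its variable names are made up for illustration.

```stata
* Wide format: one row per person, one income column per year
clear
input id income2019 income2020
1 100 110
2 200 190
end

* Wide to long: income is the common stub, id identifies each person,
* and the numeric suffix (2019, 2020) becomes the new variable year
reshape long income, i(id) j(year)
list

* Long to wide: back to one row per person
reshape wide income, i(id) j(year)
list
```

In the long format the data has one row per person and year, with the columns id, year and income.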
Data for different economic or sociological variables often has to be downloaded as separate files. This workshop teaches you how to combine multiple datasets into a single file.
By the end of this session you will understand:
- Long vs wide data formats
- Reshaping data from wide to long
- Reshaping data from long to wide
- Merging multiple datasets into one
Click on the link below to check availability and book your place:
Data Cleaning in Stata 5: Merging data
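A minimal sketch of a one-to-one merge on a shared key. Both small datasets, the country codes and the variable names are made up for illustration.

```stata
* First made-up dataset: GDP by country, saved to a temporary file
clear
input str3 country gdp
"AAA" 10
"BBB" 20
end
tempfile gdp
save `gdp'

* Second made-up dataset: population by country
clear
input str3 country population
"AAA" 5
"BBB" 8
"CCC" 3
end

* One-to-one merge on the key variable country
merge 1:1 country using `gdp'

* _merge records where each observation came from:
* 1 = only in the data in memory, 2 = only in the using file, 3 = matched
tabulate _merge
list
```

Here country "CCC" has no GDP record, so it is kept with a missing gdp value and _merge equal to 1. Run the block in one go, since the temporary file name is stored in a local macro.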