STATA - Data Analysis, Comprehensive Statistical Software

Stata BE prices

Stata's strength lies in the analysis of time-based data. It provides not only basic time-series models such as ARIMA but also their multivariate equivalents (VAR and VEC models), and you can model volatility with GARCH models. Kaplan-Meier curves let you analyze survival times, while mixed models help you analyze panel data. A powerful scripting language completes the package.

Stata covers the full range of classical statistics. You can use it for descriptive statistics, hypothesis testing, and data visualization. Stata is typically used in research and development, and its large number of statistical methods supports scientists in many fields of application (social science, econometrics, epidemiology, medical research).

Whether you are a student or a senior researcher, there is a suitable version of Stata: Stata/BE, Stata/SE, or Stata/MP.

Arguments for Stata:

Used in research and development
Wide range of statistical and graphical methods
Comprehensive statistical software
Flexible and especially powerful for analysis of time series
Easy-to-learn yet powerful scripting language

Stata/BE

Stata statistical software is a complete, integrated statistical software package that provides everything you need for data analysis, data management, and graphics. Stata is not sold in modules, which means you get everything you need in one package.

Easy to learn yet fully programmable for the most demanding data management and statistical requirements.

With Stata's menus and dialogs, you can easily point and click or drag and drop your way to all of Stata's statistical, graphical, and data management features. You can completely reshape your data, create group-level variables for panel or longitudinal data, graph a receiver operating characteristic (ROC) curve or impulse-response function (IRF), perform a case-control analysis, estimate a random-effects count-data model or a Cox proportional hazards model, or compute marginal effects from a nonlinear estimator. You can even access the dialog boxes for each command directly from the online help system. This is a great way to explore all of the capabilities of Stata.

Stata Software is available in 3 different flavors

Whether you’re a student or a seasoned research professional, we have a package designed to suit your needs:

Stata/MP: The fastest version of Stata (for quad-core, dual-core, and multicore/multiprocessor computers) that can analyze the most data
Stata/SE: Stata for large datasets
Stata/BE: Stata for mid-sized datasets
Numerics by Stata: Stata for embedded and web applications

Stata/MP is the fastest and largest version of Stata. Virtually any current computer can take advantage of the advanced multiprocessing of Stata/MP. This includes the Intel i3, i5, i7, i9, Xeon, and Celeron, and AMD multi-core chips. On dual-core chips, Stata/MP runs 40% faster overall and 72% faster where it matters, on the time-consuming estimation commands. With more than two cores or processors, Stata/MP is even faster. Find out more about Stata/MP.

Stata/MP, Stata/SE, and Stata/BE all run on any machine, but Stata/MP runs faster. You can purchase a Stata/MP license for up to the number of cores on your machine (maximum is 64). For example, if your machine has eight cores, you can purchase a Stata/MP license for eight cores, four cores, or two cores.

Stata/MP can also analyze more data than any other flavor of Stata. Stata/MP can analyze 10 to 20 billion observations given the current largest computers, and is ready to analyze up to 1 trillion observations once computer hardware catches up.

Stata/SE and Stata/BE differ only in the dataset size that each can analyze. Stata/SE and Stata/MP can fit models with more independent variables than Stata/BE (up to 10,998). Stata/SE can analyze up to 2 billion observations.

Stata/BE allows datasets with as many as 2,048 variables and 2 billion observations. Stata/BE can have at most 798 independent variables in a model.

Numerics by Stata can support any of the data sizes listed above in an embedded environment.

All the above flavors have the same complete set of features and include PDF documentation.
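The size limits quoted in this section can be written down as a small lookup. The following is only an illustrative Python sketch: `smallest_edition` is a hypothetical helper, not part of Stata. The limits of 2,048 variables and 798 or 10,998 independent variables come from the text above. The variable limits of 32,767 for Stata/SE and 120,000 for Stata/MP are StataCorp's published figures. Observation counts are ignored here because they also depend on available RAM.

```python
# Size limits per edition: (max variables, max independent variables).
# 2,048 / 798 / 10,998 are quoted in the text; 32,767 and 120,000 are
# StataCorp's published variable limits for Stata/SE and Stata/MP.
LIMITS = {
    "Stata/BE": (2_048, 798),
    "Stata/SE": (32_767, 10_998),
    "Stata/MP": (120_000, 10_998),
}


def smallest_edition(n_variables, n_independent):
    """Return the first edition (BE, then SE, then MP) whose limits cover
    the request, or None if none does. Hypothetical helper for illustration."""
    for edition, (max_vars, max_indep) in LIMITS.items():
        if n_variables <= max_vars and n_independent <= max_indep:
            return edition
    return None


print(smallest_edition(1_500, 40))     # fits Stata/BE
print(smallest_edition(5_000, 900))    # too large for BE, fits Stata/SE
print(smallest_edition(50_000, 900))   # more variables than SE allows
```

A real purchasing decision would also weigh speed and the number of observations, which this sketch leaves out.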

Product features	Stata/BE	Stata/SE	Stata/MP
Maximum number of variables	2,048	32,767	120,000
Maximum number of observations	2.14 billion	2.14 billion	Up to 20 billion *
Maximum number of independent variables	798	10,998	10,998
Multicore support (time to run logistic regression with 5 million observations and 10 covariates)	1 core (10.0 sec)	1 core (10.0 sec)	2 cores (5.0 sec), 4 cores (2.6 sec), more than 4 cores (even faster)
Complete suite of statistical features	Yes!	Yes!	Yes!
Publication-quality graphics	Yes!	Yes!	Yes!
Matrix programming language	Yes!	Yes!	Yes!
Complete PDF documentation	Yes!	Yes!	Yes!
Exceptional technical support	Yes!	Yes!	Yes!
Includes within-release updates	Yes!	Yes!	Yes!
64-bit version available	Yes!	Yes!	Yes!
Windows, macOS, and Linux	Yes!	Yes!	Yes!
Memory requirements	1 GB	2 GB	4 GB
Disk space requirements	1 GB	1 GB	1 GB

* The maximum number of observations is limited only by the amount of available RAM on your system.
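To make the multicore timings in the table easier to read, they can be converted into speedup and parallel efficiency. This is a minimal Python sketch that uses only the three timings quoted in the table (logistic regression, 5 million observations, 10 covariates). The function name is illustrative, and the result says nothing about other commands or datasets.

```python
# Timings from the table: cores -> seconds for the logistic regression example.
timings = {1: 10.0, 2: 5.0, 4: 2.6}


def speedup_and_efficiency(timings):
    """Return {cores: (speedup, efficiency)} relative to the 1-core time.

    speedup    = time on 1 core / time on n cores
    efficiency = speedup / n (1.0 would be perfect scaling)
    """
    base = timings[1]
    result = {}
    for cores, seconds in timings.items():
        speedup = base / seconds
        result[cores] = (speedup, speedup / cores)
    return result


for cores, (speedup, efficiency) in speedup_and_efficiency(timings).items():
    print(f"{cores} core(s): {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

For the quoted figures, the two-core run is twice as fast as the single-core run, and the four-core run is a little under four times as fast.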

Stata scripting language

Stata's scripting language is easy to learn and helps you get the most out of your data. It lets you not only use and modify existing routines to generate standard reports but also extend Stata with newly created statistical functions.

Efficient Data Management with Stata

Data management with Stata is easy and efficient. Joining datasets, creating new variables, or producing summary tables takes very little time.

Professional Graphics with Stata

Stata provides professional graphics that can be used directly in documents and publications. This includes not only predefined standard graphs but also highly customizable graphics.

Further Information:

https://www.stata.com/why-use-stata/

Trial version of Stata

The producer provides a free 30-day trial version on its website. The trial version contains all the features of Stata. You can register for this license by visiting the following link: http://www.stata.com/customer-service/evaluate-stata/

Compatible operating systems

Stata will run on the platforms listed below. While Stata software is platform-specific, your Stata license is not; therefore, you need not specify your operating system when placing your order for a license.

Platforms

Stata for Windows®

Windows 10 *
Windows 8 *
Windows Server 2019, 2016, 2012 *

* Stata requires 64-bit Windows for x86-64 processors made by Intel® or AMD

Stata for Mac®

Mac with Apple Silicon or 64-bit Intel processor
macOS 11.0 (Big Sur) or newer for Macs with Apple Silicon and macOS 10.12 (Sierra) or newer for Macs with 64-bit Intel processors

Stata for Linux

Any 64-bit (x86-64 or compatible) computer running Linux
For xstata, you need to have GTK 2.24 installed

Hardware requirements

Package	Memory	Disk space
Stata/MP	4 GB	2 GB
Stata/SE	2 GB	2 GB
Stata/BE	1 GB	2 GB

Stata for Linux requires a video card that can display thousands of colors or more (16-bit or 24-bit color)

What's new in Stata?

Tables

Customize your tables of

Summary statistics
Results from hypothesis tests
Regression results
LR and Wald tests, GOF statistics, ...
Results from any Stata command

Export to

Word, Excel
LaTeX
HTML, Markdown
PDF
and more

Bayesian econometrics

Bayesian

VAR models
IRF and FEVD analysis
Dynamic forecasting
Panel/longitudinal-data models
Linear and nonlinear DSGE models

PyStata—Python and Stata

Call Python from Stata.
Call Stata from Python.
Exchange data, metadata, and results seamlessly.
Use Stata from Jupyter Notebook, Spyder, PyCharm IDE, and more.

Jupyter Notebook with Stata

Invoke Stata and Mata from Jupyter Notebook.
Easily reproduce your work and collaborate with others.
Access results from Stata analyses within Python.
Stata output, graphs, and tables seamlessly integrate with your Jupyter Notebook.

Difference-in-differences (DID) and DDD models

Evaluate the effect of a policy, a treatment, or an intervention.
Control for confounding unobserved group and time characteristics.
Use panel data or repeated cross-sections.
Use DID. In vogue since 1855.
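The idea behind a basic two-group, two-period DID comparison can be shown with simple arithmetic. The Python sketch below uses hypothetical group means and a hypothetical function name. It is not how Stata estimates DID models, which work on individual-level data and report standard errors. The estimate is only interpretable as a treatment effect if both groups would have followed parallel trends without the treatment.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """2x2 difference-in-differences on four group means.

    The change in the control group stands in for what would have happened
    to the treated group without treatment (parallel-trends assumption).
    """
    treated_change = treated_post - treated_pre
    control_change = control_post - control_pre
    return treated_change - control_change


# Hypothetical mean outcomes before and after a policy change.
effect = did_estimate(
    treated_pre=20.0, treated_post=28.0,
    control_pre=18.0, control_post=21.0,
)
print(effect)  # (28 - 20) - (21 - 18) = 5.0
```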

Faster Stata

Stata is fast, and keeps getting faster.

Faster sort and collapse
Faster mixed models
Faster estimation commands
Faster import delimited
And more

Interval-censored Cox model

You want to model time to an event.

But you don't know the exact event times—only the intervals in which events happen.

And you don't want to make parametric assumptions.

Try an interval-censored Cox model.

Multivariate meta-analysis

Do you have multiple effect sizes?

Do they share a common control group?

Do they share the same group of subjects?

Multivariate meta-analysis can help.

Bayesian VAR models

You fit your VAR models with var.

You fit your Bayesian regression models with bayes:.

Now fit your Bayesian VAR models with bayes: var.

Bayesian multilevel modeling

Nonlinear, joint, SEM-like, and more.

More multilevel models.

More powerful.

Easier to use.

Treatment-effects lasso estimation

When you want:

Causal inference, average treatment effects, potential-outcome means, double-robust estimation

And you have:

Many (maybe hundreds or thousands of) potential covariates

Use treatment-effects estimation with lasso variable selection.

New functions for dates and times

Calculate durations, such as ages and other differences between datetimes.
Calculate relative dates, or dates from other dates, such as the previous or next birthday or anniversary relative to a reference date.
Extract individual components from datetime values and variables.
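The three kinds of calculation listed above can be illustrated outside Stata. The sketch below uses Python's standard `datetime` module as a stand-in. The function names are hypothetical and are not Stata's date functions. The sketch assumes that a 29 February birthday falls on 1 March in non-leap years, which is one possible convention.

```python
from datetime import date


def _birthday_in(year, birth):
    """Birthday in the given year; 29 Feb maps to 1 Mar in non-leap years."""
    try:
        return date(year, birth.month, birth.day)
    except ValueError:  # only raised for 29 Feb in a non-leap year
        return date(year, 3, 1)


def age_in_years(birth, ref):
    """Completed years between birth and the reference date."""
    years = ref.year - birth.year
    if ref < _birthday_in(ref.year, birth):
        years -= 1  # birthday not yet reached in the reference year
    return years


def next_birthday(birth, ref):
    """First birthday strictly after the reference date."""
    candidate = _birthday_in(ref.year, birth)
    if candidate <= ref:
        candidate = _birthday_in(ref.year + 1, birth)
    return candidate


birth = date(1990, 7, 15)   # hypothetical date of birth
ref = date(2024, 3, 1)      # hypothetical reference date

print(age_in_years(birth, ref))        # 33
print(next_birthday(birth, ref))       # 2024-07-15
print(ref.year, ref.month, ref.day)    # individual components: 2024 3 1
```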

Leave-one-out meta-analysis

Are there influential studies in your data?

Use leave-one-out meta-analysis to find out.
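The leave-one-out idea itself is simple: recompute the pooled effect with each study omitted in turn and see how far the result moves. The Python sketch below assumes a fixed-effect (inverse-variance) pooled estimate and uses hypothetical effect sizes and variances. It illustrates the technique only and is not Stata's implementation.

```python
def pooled_effect(effects, variances):
    """Fixed-effect pooled estimate: inverse-variance weighted mean."""
    weights = [1.0 / v for v in variances]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


def leave_one_out(effects, variances):
    """Pooled estimate recomputed with each study omitted in turn."""
    results = []
    for i in range(len(effects)):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_variances = variances[:i] + variances[i + 1:]
        results.append(pooled_effect(kept_effects, kept_variances))
    return results


# Hypothetical study-level effect sizes and their variances.
effects = [0.10, 0.12, 0.08, 0.60]
variances = [0.01, 0.02, 0.01, 0.02]

overall = pooled_effect(effects, variances)
print(f"all studies: {overall:.3f}")
for i, estimate in enumerate(leave_one_out(effects, variances), start=1):
    print(f"omit study {i}: {estimate:.3f} (shift {estimate - overall:+.3f})")
```

In this made-up example, omitting the fourth study moves the pooled estimate far more than omitting any other study, which is the pattern an influential study produces.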

Galbraith plots

Graphically summarize meta-analysis results

Study-specific effect sizes
Precision of effect sizes
Overall effect size

Detect potential outliers

Assess heterogeneity

Panel-data multinomial logit model

You can model categorical outcomes with mlogit.

You can model panel data with xt.

Now you can do both!

Stata's new xtmlogit command models categorical outcomes that change over time.

Bayesian panel-data models

Bayesian analysis lets you answer probabilistic questions with panel-data models.

How likely is it that an extra year of schooling will increase wages?
What is the probability of default for a low-risk portfolio?

Incorporate prior knowledge, see posterior distributions of random effects, compute Bayesian predictions, and more.

Zero-inflated ordered logit model

Need to model an ordinal outcome?

Have excess zeros (or responses in the lowest category)?

ziologit is the answer.

Nonparametric tests for trend

Do responses have an increasing or decreasing trend? Find out using one of four nonparametric tests for trend:

Cochran–Armitage test
Jonckheere–Terpstra test
Linear-by-linear test
Cuzick's test with ranks

Bayesian IRF and FEVD analysis

What is the effect of a shock over time?

What is the mean or median of the effect for a distribution of probable scenarios?

Bayesian IRF analysis answers these and more.

Bayesian dynamic forecasting

After VAR, you want a dynamic forecast.

After Bayesian estimation, you want statistics of posterior distributions.

Estimate both. Visualize both.

Lasso with clustered data

Your data have ...

many variables.

Your data have ...

clusters of observations.

Your lasso for prediction, model selection, or inference can now select variables while accounting for clustering.

BIC for lasso penalty selection

Which variables should lasso include?

BIC for lasso penalty selection can tell you.

Bayesian linear and nonlinear DSGE models

Forming rational expectations of the future is hard.

DSGE models include these expectations.

Prior information helps.

Do-file Editor enhancements

Persistent bookmarks
Navigation Control
Syntax highlighting for Java, XML, and more
Auto-completion for quotes, parentheses, and brackets

Stata on Apple Silicon

Native M1 processor support
Universal application for both Intel and Apple Silicon Macs
One license, both kinds of hardware

Intel Math Kernel Library (MKL)

Mata functions and operators use heavily optimized LAPACK routines underpinned by the Intel Math Kernel Library.

Use your favorite Stata commands like always; underlying functions are faster, so you get results faster.

Java integration

Use Java interactively (like JShell) from within Stata.
Embed Java code in do-files.
Embed Java code in ado-files.
Compile and execute Java code "on the fly" without external programs.

H2O integration

Start a new H2O cluster or connect to an existing one.
Manipulate data on an H2O cluster.
Access the capabilities of H2O directly in Stata.

JDBC

Connecting Stata to databases is now easier.

Want to access data from Oracle, MySQL, Amazon Redshift, Snowflake, Microsoft SQL Server, and others?

Use jdbc.

Want one driver that works on Windows, Mac, and Linux?

Use jdbc.

Search, browse, and import FRED data

The St. Louis Federal Reserve makes available over 470,000 U.S. and international economic and financial time series. You can now easily search, browse, and import these data.

Multilevel regression for interval-measured outcomes

Incomes are sometimes recorded in groupings, as are people's weights, insect counts, grade-point averages, and hundreds of other measures. Often we have repeated measurements for individuals, or schools, or orchards, etc. So ... we need multilevel regression for interval-measured (interval-censored) outcomes.

Multilevel tobit regression for censored outcomes

Left-censoring, right-censoring, both
Censoring that varies by observation
Make inferences about either the uncensored or the censored outcome
Robust and clustered SEs
Support for survey data

Panel-data cointegration tests

Tests
- Kao
- Pedroni
- Westerlund
Total of nine variants of tests

Tests for multiple breaks in time series

Cumulative sum (CUSUM) test for parameter stability
- CUSUM of recursive residuals
- CUSUM of OLS residuals
Plots with CIs

Multiple-group generalized SEM

Generalized SEM now supports multiple-group analysis. Easily specify groups and test parameter invariance across groups. GSEM models include

continuous, binary, ordinal, count, categorical, and even survival outcomes
multilevel models

ICD-10-CM/PCS

NCHS's ICD-10-CM diagnosis codes
CMS's ICD-10-PCS procedure codes
Verify codes are valid
Create new variables based on codes

Power for cluster randomized designs

Power analysis for comparing

One- and two-sample means
One- and two-sample proportions
Two-sample survivor curves

when you randomize clusters instead of individuals
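Why randomizing clusters changes a power calculation can be sketched with the standard design-effect approximation, 1 + (m - 1) × ICC, for clusters of equal size m. The Python example below uses hypothetical numbers and hypothetical function names. It is a back-of-the-envelope illustration, not the calculation Stata's power commands perform.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from cluster randomization with equal cluster
    sizes: 1 + (m - 1) * ICC, where ICC is the intraclass correlation."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    total = n_clusters * cluster_size
    return total / design_effect(cluster_size, icc)


# Hypothetical design: 20 clusters of 30 subjects, ICC of 0.05.
deff = design_effect(cluster_size=30, icc=0.05)
n_eff = effective_sample_size(n_clusters=20, cluster_size=30, icc=0.05)
print(f"design effect: {deff:.2f}")
print(f"effective sample size: {n_eff:.1f} of 600")
```

Even a small intraclass correlation noticeably reduces the information in the sample, which is why cluster designs need their own power analysis.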

Power for linear regression models

Solve for
- Power
- Sample size
- Effect size
Specify lists of
- Alpha values
- Power levels
- Beta values
- Sample sizes
- And more
Automated tables and graphs

Heteroskedastic linear regression

Model for the variance
Robust and cluster SEs
Survey-data support

Poisson models with sample selection

Counts are common. How many:

Fish did you catch?
Accidents occurred?
Patents does a firm generate?

Outcomes are not always seen.

Folks evade the game warden.
Accidents are not always reported.
Some firms prefer trade secrets to patents.

So you need Poisson models with sample selection.

More in panel data

Nonlinear models with random effects, including random coefficients
Bayesian panel-data models
Interval regression with random intercepts and random coefficients

More in graphics

Transparency in graphs
SVG export

More in statistics

Bayesian survival models
Zero-inflated ordered probit
Add your own power and sample-size methods
Bayesian sample-selection models
And yet more

More in the interface

Stata in Swedish

Stata in Chinese

Improvements to the Do-file Editor

And, even more

Stream random-number generator
Improvements for Java plugins

You will find the complete feature list at the following link:

https://www.stata.com/features/

Stata Features

Data management

data transformations, match-merge, ODBC, XML, by-group processing, append files, sort, row–column transposition, labeling, saving results

Basic statistics

summaries, cross-tabulations, correlations, t tests, equality-of-variance tests, tests of proportions, confidence intervals, factor variables

Linear models

regression; bootstrap, jackknife, and robust Huber/White/sandwich variance estimates; instrumental variables; three-stage least squares; constraints; quantile regression; GLS

Multilevel mixed-effects models

generalized linear models; continuous, binary, and count outcomes; two-, three-, and higher-level models; random-intercepts; random-slopes; crossed random effects; BLUPs of effects and fitted values; hierarchical models; residual error structures; support for survey data in linear models

Binary, count, and discrete outcomes

logistic, probit, tobit; Poisson and negative binomial; conditional, multinomial, nested, ordered, rank-ordered, and stereotype logistic; multinomial probit; zero-inflated and left-truncated count models; selection models; marginal effects

Longitudinal data/panel data

random and fixed effects with robust standard errors; linear mixed models, random-effects probit, GEE, random- and fixed-effects Poisson, dynamic panel-data models, and instrumental-variables regression; panel unit-root tests; AR(1) disturbances

Generalized linear models (GLMs)

ten link functions, user-defined links, seven distributions, ML and IRLS estimation, nine variance estimators, seven residuals

Nonparametric methods

Wilcoxon-Mann-Whitney, Wilcoxon signed ranks and Kruskal-Wallis tests; Spearman and Kendall correlations; Kolmogorov-Smirnov tests; exact binomial CIs; survival data; ROC analysis; smoothing; bootstrapping

Exact statistics

exact logistic and Poisson regression, exact case-control statistics, binomial tests, Fisher's exact test for r × c tables

ANOVA/MANOVA

balanced and unbalanced designs; factorial, nested, and mixed designs; repeated measures; marginal means; contrasts

Multivariate methods

factor analysis, principal components, discriminant analysis, rotation, multidimensional scaling, Procrustean analysis, correspondence analysis, biplots, dendrograms, user-extensible analyses

Cluster analysis

hierarchical clustering; kmeans and kmedian nonhierarchical clustering; dendrograms; stopping rules; user-extensible analyses

Resampling and simulation methods

bootstrapping, jackknife and Monte Carlo simulation; permutation tests

Tests, predictions, and effects

Wald tests; LR tests; linear and nonlinear combinations, predictions and generalized predictions, marginal means, least-squares means, adjusted means; marginal and partial effects; forecast models; Hausman tests

Graphics

line charts, scatterplots, bar charts, pie charts, hi-lo charts, regression diagnostic graphs, survival plots, nonparametric smoothers, distribution Q-Q plots

Survey methods

multistage designs; bootstrap, BRR, jackknife, linearized, and SDR variance estimation; poststratification; DEFF; predictive margins; means, proportions, ratios, totals; summary tables; regression, instrumental variables, probit, Cox regression

Survival analysis

Kaplan-Meier and Nelson-Aalen estimators; Cox regression (frailty); parametric models (frailty); competing risks; hazards; time-varying covariates; left- and right-censoring; Weibull, exponential, and Gompertz analysis

Epidemiology

standardization of rates, case–control, cohort, matched case-control, Mantel-Haenszel, pharmacokinetics, ROC analysis, ICD-9-CM

Time series

ARIMA; ARFIMA; ARCH/GARCH; VAR; VECM; multivariate GARCH; unobserved components model; dynamic factors; state-space models; business calendars; correlograms; periodograms; forecasts; impulse-response functions; unit-root tests; filters and smoothers; rolling and recursive estimation

Multiple imputation

nine univariate imputation methods; multivariate normal imputation; chained equations; explore pattern of missingness; manage imputed datasets; fit model and pool results; transform parameters; joint tests of parameter estimates; predictions

Simple maximum likelihood

specify likelihood using simple expressions; no programming required; survey data; standard, robust, bootstrap, and jackknife SEs; matrix estimators

Programmable maximum likelihood

user-specified functions; NR, DFP, BFGS, BHHH; OIM, OPG, robust, bootstrap, and jackknife SEs; Wald tests; survey data; numeric or analytic derivatives

Other statistical methods

kappa measure of interrater agreement; Cronbach's alpha; stepwise regression; tests of normality

Programming features

adding new commands; command scripting; object-oriented programming; menu and dialog-box programming; Project Manager; plugins

Matrix programming-Mata

interactive sessions, large-scale development projects, optimization, matrix inversions, decompositions, eigenvalues and eigenvectors, LAPACK engine, real and complex numbers, string matrices, interface to Stata datasets and matrices, numerical derivatives, object-oriented programming

Internet capabilities

ability to install new commands, web updating, web file sharing, latest Stata news

Accessibility

Section 508 compliance, accessibility for persons with disabilities

Sample session

A sample session of Stata for Mac, Unix, or Windows.

Community-contributed commands

User-written commands for meta-analysis, data management, survival, econometrics

Graphical user interface

menus and dialogs for all features; Data Editor; Variables Manager; Graph Editor; Project Manager; Do-file Editor; Clipboard Preview Tool; multiple preference sets

Graphics

line charts; scatterplots; bar charts; pie charts; hi-lo charts; contour plots; GUI Editor; regression diagnostic graphs; survival plots; nonparametric smoothers; distribution Q-Q plots

Documentation

20 manuals; 11,000+ pages; seamless navigation; thousands of worked examples; methods and formulas; references

Power and sample size

power; sample size; effect size; minimum detectable effect; means; proportions; variances; correlations; case-control studies; cohort studies; survival analysis; balanced or unbalanced designs; results in tables or graphs

Treatment effects

inverse probability weight (IPW); doubly robust methods; propensity score matching; regression adjustment; covariate matching; multilevel treatments; average treatment effects (ATEs); average treatment effects on the treated (ATETs); potential-outcome means (POMs)

SEM (Structural equation modeling)

graphical path diagram builder; standardized and unstandardized estimates; modification indices; direct and indirect effects; continuous, binary, count, and ordinal outcomes (GLM); multilevel models; random slopes and intercepts; factors scores, empirical Bayes, and other predictions; groups and tests of invariance; goodness of fit; handles MAR data by FIML; correlated data

Functions

statistical; random-number; mathematical; string; date and time

Embedded statistical computations

Numerics by Stata

Contrasts, pairwise comparisons, and margins

compare means, intercepts, or slopes; compare to reference category, adjacent category, grand mean, etc.; orthogonal polynomials; multiple comparison adjustments; graph estimated means and contrasts; interaction plots

GMM and nonlinear regression

generalized method of moments (GMM); nonlinear regression
