Stata's strength lies in the analysis of time-based data. It provides not only basic time-series models such as ARIMA but also their multivariate counterparts (VAR and VEC models). You can also model volatility with GARCH models. Kaplan-Meier curves are available for analysing survival times, and mixed models help with the analysis of panel data. A powerful scripting language completes the package.
Stata covers the full range of classical statistics. You can use it for descriptive statistics, hypothesis testing, and data visualization. It is typically used in research and development, and its wide range of statistical methods serves scientists in many fields of application (social science, econometrics, epidemiology, medical research).
Whether you are a student or a senior researcher, there is a suitable edition of Stata: Stata/BE, Stata/SE, or Stata/MP.
Arguments for Stata:
- Used in research and development
- Wide range of statistical and graphical methods
- Comprehensive statistical software
- Flexible and especially powerful for analysis of time series
- Easy-to-learn yet powerful scripting language
Stata statistical software is a complete, integrated statistical software package that provides everything you need for data analysis, data management, and graphics. Stata is not sold in modules, which means you get everything you need in one package.
Easy to learn yet fully programmable for the most demanding data management and statistical requirements.
With Stata's menus and dialogs, you can easily point and click or drag and drop your way to all of Stata's statistical, graphical, and data management features. You can completely reshape your data, create group-level variables for panel or longitudinal data, graph a receiver operating characteristic (ROC) curve or impulse-response function (IRF), perform a case-control analysis, estimate a random-effects count-data model or a Cox proportional hazards model, or compute marginal effects from a nonlinear estimator. You can even access the dialog boxes for each command directly from the online help system. This is a great way to explore all of the capabilities of Stata.
Stata Software is available in 3 different flavors
Whether you’re a student or a seasoned research professional, we have a package designed to suit your needs:
- Stata/MP: The fastest version of Stata (for quad-core, dual-core, and multicore/multiprocessor computers) that can analyze the most data
- Stata/SE: Stata for large datasets
- Stata/BE: Stata for mid-sized datasets
- Numerics by Stata: Stata for embedded and web applications
Stata/MP is the fastest and largest version of Stata. Virtually any current computer can take advantage of the advanced multiprocessing of Stata/MP. This includes the Intel i3, i5, i7, i9, Xeon, and Celeron, and AMD multi-core chips. On dual-core chips, Stata/MP runs 40% faster overall and 72% faster where it matters, on the time-consuming estimation commands. With more than two cores or processors, Stata/MP is even faster.
Stata/MP, Stata/SE, and Stata/BE all run on any machine, but Stata/MP runs faster. You can purchase a Stata/MP license for up to the number of cores on your machine (maximum is 64). For example, if your machine has eight cores, you can purchase a Stata/MP license for eight cores, four cores, or two cores.
Stata/MP can also analyze more data than any other flavor of Stata. Stata/MP can analyze 10 to 20 billion observations given the current largest computers, and is ready to analyze up to 1 trillion observations once computer hardware catches up.
Stata/SE and Stata/BE differ only in the dataset size that each can analyze. Stata/SE and Stata/MP can fit models with more independent variables than Stata/BE (up to 10,998). Stata/SE can analyze up to 2 billion observations.
Stata/BE allows datasets with as many as 2,048 variables and 2 billion observations. Stata/BE can have at most 798 independent variables in a model.
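In practice, how many observations fit also depends on available memory. As a rough, illustrative sketch (Python, not Stata code), the estimate below multiplies the number of observations by the width of one row. It assumes Stata's numeric storage-type widths (byte 1, int 2, long 4, float 4, double 8 bytes) and ignores string variables and any overhead; the helper name `estimate_dataset_bytes` is hypothetical.

```python
# Bytes per value for Stata's numeric storage types.
STORAGE_BYTES = {"byte": 1, "int": 2, "long": 4, "float": 4, "double": 8}


def estimate_dataset_bytes(n_obs, var_types):
    """Approximate bytes needed to hold n_obs rows of the given numeric variables.

    Hypothetical helper: ignores string variables and per-dataset overhead.
    """
    row_width = sum(STORAGE_BYTES[t] for t in var_types)  # bytes per observation
    return n_obs * row_width


if __name__ == "__main__":
    # Example: 2 billion observations of ten float variables.
    total = estimate_dataset_bytes(2_000_000_000, ["float"] * 10)
    print(f"{total / 1e9:.0f} GB")  # 80 GB
```

Under these assumptions, ten float variables at 2 billion observations already need about 80 GB of RAM, so the practical limit is usually set by the machine rather than by the edition.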
Numerics by Stata can support any of the data sizes listed above in an embedded environment.
All the above flavors have the same complete set of features and include PDF documentation.
Product features | Stata/BE | Stata/SE | Stata/MP |
---|---|---|---|
Maximum number of variables | 2,048 | 32,767 | 120,000 |
Maximum number of observations | 2.14 billion | 2.14 billion | Up to 20 billion |
Maximum number of independent variables | 798 | 10,998 | 10,998 |
Multicore support (time to run logistic regression with 5 million obs and 10 covariates) | 1 core (10.0 sec) | 1 core (10.0 sec) | 2 cores (5.0 sec), 4 cores (2.6 sec), more than 4 cores (even faster) |
Complete suite of statistical features | Yes! | Yes! | Yes! |
Publication-quality graphics | Yes! | Yes! | Yes! |
Matrix programming language | Yes! | Yes! | Yes! |
Complete PDF documentation | Yes! | Yes! | Yes! |
Exceptional technical support | Yes! | Yes! | Yes! |
Includes within-release updates | Yes! | Yes! | Yes! |
64-bit version available | Yes! | Yes! | Yes! |
Windows, macOS, and Linux | Yes! | Yes! | Yes! |
Memory requirements | 1 GB | 2 GB | 4 GB |
Disk space requirements | 1 GB | 1 GB | 1 GB |
* The maximum number of observations is limited only by the amount of available RAM on your system.
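The multicore timings in the comparison table can be turned into speedup and parallel-efficiency figures with simple arithmetic. The sketch below is illustrative Python that uses only the table's numbers (10.0 sec on one core, 5.0 sec on two, 2.6 sec on four); the function name `speedup_and_efficiency` is hypothetical.

```python
def speedup_and_efficiency(baseline_sec, parallel_sec, cores):
    """Return (speedup, efficiency) relative to a single-core baseline."""
    speedup = baseline_sec / parallel_sec   # how many times faster
    efficiency = speedup / cores            # share of ideal linear scaling
    return speedup, efficiency


# Timings from the table: logistic regression, 5 million obs, 10 covariates.
TIMINGS = {1: 10.0, 2: 5.0, 4: 2.6}

if __name__ == "__main__":
    for cores, seconds in TIMINGS.items():
        s, e = speedup_and_efficiency(TIMINGS[1], seconds, cores)
        print(f"{cores} core(s): speedup {s:.2f}x, efficiency {e:.0%}")
```

On these figures, four cores give roughly a 3.85x speedup, which is about 96% of ideal linear scaling for this particular command.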
Stata scripting language
Stata's scripting language is easy to learn and helps you get the most out of your data. You can use and modify the existing routines to generate standard reports, and you can easily extend Stata with new statistical functions of your own.
Efficient data management with Stata
Data management with Stata is easy and efficient. Joining datasets, creating new variables, and producing summary tables take very little time.
Professional Graphics with Stata
Stata provides professional graphics that can be used directly in documents and publications. These include not only predefined standard graphs but also highly customizable graphics.
Further Information:
https://www.stata.com/why-use-stata/
Trial version of Stata
The vendor offers a free 30-day trial version on its website. The trial version contains all the features of Stata. You can register for this license by visiting the following link: http://www.stata.com/customer-service/evaluate-stata/
Compatible operating systems
Stata will run on the platforms listed below. While Stata software is platform-specific, your Stata license is not; therefore, you need not specify your operating system when placing your order for a license.
Platforms
- Windows 10 *
- Windows 8 *
- Windows Server 2019, 2016, 2012 *
* Stata requires 64-bit Windows for x86-64 processors made by Intel® or AMD
- Mac with Apple Silicon or 64-bit Intel processor
- macOS 11.0 (Big Sur) or newer for Macs with Apple Silicon and macOS 10.12 (Sierra) or newer for Macs with 64-bit Intel processors
- Any 64-bit (x86-64 or compatible) machine running Linux
- For xstata, you need to have GTK 2.24 installed
Hardware requirements
Package | Memory | Disk space |
---|---|---|
Stata/MP | 4 GB | 2 GB |
Stata/SE | 2 GB | 2 GB |
Stata/BE | 1 GB | 2 GB |
Stata for Linux requires a video card that can display thousands of colors or more (16-bit or 24-bit color)
What's new in Stata?
- Tables: customize and export your tables.
- Bayesian econometrics
- PyStata: Python and Stata
- Jupyter Notebook with Stata
- Difference-in-differences (DID) and DDD models
- Faster Stata: Stata is fast, and keeps getting faster.
- Interval-censored Cox model: You want to model time to an event. But you don't know the exact event times, only the intervals in which events happen. And you don't want to make parametric assumptions. Try an interval-censored Cox model.
- Multivariate meta-analysis: Do you have multiple effect sizes? Do they share a common control group? Do they share the same group of subjects? Multivariate meta-analysis can help.
- Bayesian VAR models: You fit your VAR models with var. You fit your Bayesian regression models with bayes:. Now fit your Bayesian VAR models with bayes: var.
- Bayesian multilevel modeling: Nonlinear, joint, SEM-like, and more. More multilevel models. More powerful. Easier to use.
- Treatment-effects lasso estimation: When you want causal inference, average treatment effects, potential-outcome means, or double-robust estimation, and you have many (maybe hundreds or thousands of) potential covariates, use treatment-effects estimation with lasso variable selection.
- New functions for dates and times
- Leave-one-out meta-analysis: Are there influential studies in your data? Use leave-one-out meta-analysis to find out.
- Galbraith plots: Graphically summarize meta-analysis results, detect potential outliers, and assess heterogeneity.
- Panel-data multinomial logit model: You can model categorical outcomes with mlogit. You can model panel data with xt. Now you can do both! Stata's new xtmlogit command models categorical outcomes that change over time.
- Bayesian panel-data models: Bayesian analysis lets you answer probabilistic questions with panel-data models. Incorporate prior knowledge, see posterior distributions of random effects, compute Bayesian predictions, and more.
- Zero-inflated ordered logit model: Need to model an ordinal outcome? Have excess zeros (or responses in the lowest category)? ziologit is the answer.
- Nonparametric tests for trend: Do responses have an increasing or decreasing trend? Find out using one of four nonparametric tests for trend.
- Bayesian IRF and FEVD analysis: What is the effect of a shock over time? What is the mean or median of the effect for a distribution of probable scenarios? Bayesian IRF analysis answers these and more.
- Bayesian dynamic forecasting: After VAR, you want a dynamic forecast. After Bayesian estimation, you want statistics of posterior distributions. Estimate both. Visualize both.
- Lasso with clustered data: Your data have many variables. Your data have clusters of observations. Your lasso for prediction, model selection, or inference can now select variables while accounting for clustering.
- BIC for lasso penalty selection: Which variables should lasso include? BIC for lasso penalty selection can tell you.
- Bayesian linear and nonlinear DSGE models: Prior information helps.
- Do-file Editor enhancements
- Stata on Apple Silicon
- Intel Math Kernel Library (MKL): Mata functions and operators use heavily optimized LAPACK routines underpinned by the Intel Math Kernel Library. Use your favorite Stata commands like always; underlying functions are faster, so you get results faster.
- Java integration
- H2O integration
- JDBC: Connecting Stata to databases is now easier. Want to access data from Oracle, MySQL, Amazon Redshift, Snowflake, Microsoft SQL Server, and others? Use jdbc. Want one driver that works on Windows, Mac, and Linux? Use jdbc.
- Search, browse, and import FRED data: The St. Louis Federal Reserve makes available over 470,000 U.S. and international economic and financial time series. You can now easily search, browse, and import these data.
- Multilevel regression for interval-measured outcomes: Incomes are sometimes recorded in groupings, as are people's weights, insect counts, grade-point averages, and hundreds of other measures. Often we have repeated measurements for individuals, or schools, or orchards, etc. So we need multilevel regression for interval-measured (interval-censored) outcomes.
- Multilevel tobit regression for censored outcomes
- Panel-data cointegration tests
- Tests for multiple breaks in time series
- Multiple-group generalized SEM: Generalized SEM now supports multiple-group analysis. Easily specify groups and test parameter invariance across groups.
- ICD-10-CM/PCS
- Power for cluster randomized designs: Power analysis for designs in which you randomize clusters instead of individuals.
- Power for linear regression models
- Heteroskedastic linear regression
- Poisson models with sample selection: Counts are common. How many fish did you catch? How many accidents occurred? How many patents does a firm generate? Outcomes are not always seen. Folks evade the game warden. Accidents are not always reported. Some firms prefer trade secrets to patents. So you need Poisson models with sample selection.
- More in panel data: nonlinear models with random effects, including random coefficients; Bayesian panel-data models; interval regression with random intercepts and random coefficients
- More in graphics: transparency in graphs; SVG export
- More in statistics: Bayesian survival models; zero-inflated ordered probit; add your own power and sample-size methods; Bayesian sample-selection models; and yet more
- More in the interface: Stata in Swedish; Stata in Chinese; improvements to the Do-file Editor
- And even more: stream random-number generator; improvements for Java plugins
The full feature list is available at the following link:
https://www.stata.com/features/
Stata Features
Data management
data transformations, match-merge, ODBC, XML, by-group processing, append files, sort, row–column transposition, labeling, saving results
Basic statistics
summaries, cross-tabulations, correlations, t tests, equality-of-variance tests, tests of proportions, confidence intervals, factor variables
Linear models
regression; bootstrap, jackknife, and robust Huber/White/sandwich variance estimates; instrumental variables; three-stage least squares; constraints; quantile regression; GLS
Multilevel mixed-effects models
generalized linear models; continuous, binary, and count outcomes; two-, three-, and higher-level models; random-intercepts; random-slopes; crossed random effects; BLUPs of effects and fitted values; hierarchical models; residual error structures; support for survey data in linear models
Binary, count, and discrete outcomes
logistic, probit, tobit; Poisson and negative binomial; conditional, multinomial, nested, ordered, rank-ordered, and stereotype logistic; multinomial probit; zero-inflated and left-truncated count models; selection models; marginal effects
Longitudinal data/panel data
random and fixed effects with robust standard errors; linear mixed models, random-effects probit, GEE, random- and fixed-effects Poisson, dynamic panel-data models, and instrumental-variables regression; panel unit-root tests; AR(1) disturbances
Generalized linear models (GLMs)
ten link functions, user-defined links, seven distributions, ML and IRLS estimation, nine variance estimators, seven residuals
Nonparametric methods
Wilcoxon-Mann-Whitney, Wilcoxon signed ranks and Kruskal-Wallis tests; Spearman and Kendall correlations; Kolmogorov-Smirnov tests; exact binomial CIs; survival data; ROC analysis; smoothing; bootstrapping
Exact statistics
exact logistic and Poisson regression, exact case-control statistics, binomial tests, Fisher's exact test for r × c tables
ANOVA/MANOVA
balanced and unbalanced designs; factorial, nested, and mixed designs; repeated measures; marginal means; contrasts
Multivariate methods
factor analysis, principal components, discriminant analysis, rotation, multidimensional scaling, Procrustean analysis, correspondence analysis, biplots, dendrograms, user-extensible analyses
Cluster analysis
hierarchical clustering; kmeans and kmedian nonhierarchical clustering; dendrograms; stopping rules; user-extensible analyses
Resampling and simulation methods
bootstrapping, jackknife and Monte Carlo simulation; permutation tests
Tests, predictions, and effects
Wald tests; LR tests; linear and nonlinear combinations, predictions and generalized predictions, marginal means, least-squares means, adjusted means; marginal and partial effects; forecast models; Hausman tests
Survey methods
multistage designs; bootstrap, BRR, jackknife, linearized, and SDR variance estimation; poststratification; DEFF; predictive margins; means, proportions, ratios, totals; summary tables; regression, instrumental variables, probit, Cox regression
Survival analysis
Kaplan-Meier and Nelson-Aalen estimators; Cox regression (frailty); parametric models (frailty); competing risks; hazards; time-varying covariates; left- and right-censoring; Weibull, exponential, and Gompertz analysis
Epidemiology
standardization of rates, case–control, cohort, matched case-control, Mantel-Haenszel, pharmacokinetics, ROC analysis, ICD-9-CM
Time series
ARIMA; ARFIMA; ARCH/GARCH; VAR; VECM; multivariate GARCH; unobserved components model; dynamic factors; state-space models; business calendars; correlograms; periodograms; forecasts; impulse-response functions; unit-root tests; filters and smoothers; rolling and recursive estimation
Multiple imputation
nine univariate imputation methods; multivariate normal imputation; chained equations; explore pattern of missingness; manage imputed datasets; fit model and pool results; transform parameters; joint tests of parameter estimates; predictions
Simple maximum likelihood
specify likelihood using simple expressions; no programming required; survey data; standard, robust, bootstrap, and jackknife SEs; matrix estimators
Programmable maximum likelihood
user-specified functions; NR, DFP, BFGS, BHHH; OIM, OPG, robust, bootstrap, and jackknife SEs; Wald tests; survey data; numeric or analytic derivatives
Other statistical methods
kappa measure of interrater agreement; Cronbach's alpha; stepwise regression; tests of normality
Programming features
adding new commands; command scripting; object-oriented programming; menu and dialog-box programming; Project Manager; plugins
Matrix programming (Mata)
interactive sessions, large-scale development projects, optimization, matrix inversions, decompositions, eigenvalues and eigenvectors, LAPACK engine, real and complex numbers, string matrices, interface to Stata datasets and matrices, numerical derivatives, object-oriented programming
Internet capabilities
ability to install new commands, web updating, web file sharing, latest Stata news
Accessibility
Section 508 compliance, accessibility for persons with disabilities
Sample session
A sample session of Stata for Mac, Unix, or Windows.
Community-contributed commands
User-written commands for meta-analysis, data management, survival, econometrics
Graphical user interface
menus and dialogs for all features; Data Editor; Variables Manager; Graph Editor; Project Manager; Do-file Editor; Clipboard Preview Tool; multiple preference sets
Graphics
line charts; scatterplots; bar charts; pie charts; hi-lo charts; contour plots; GUI Editor; regression diagnostic graphs; survival plots; nonparametric smoothers; distribution Q-Q plots
Documentation
20 manuals; 11,000+ pages; seamless navigation; thousands of worked examples; methods and formulas; references
Power and sample size
power; sample size; effect size; minimum detectable effect; means; proportions; variances; correlations; case-control studies; cohort studies; survival analysis; balanced or unbalanced designs; results in tables or graphs
Treatment effects
inverse probability weight (IPW); doubly robust methods; propensity score matching; regression adjustment; covariate matching; multilevel treatments; average treatment effects (ATEs); average treatment effects on the treated (ATETs); potential-outcome means (POMs)
SEM (Structural equation modeling)
graphical path diagram builder; standardized and unstandardized estimates; modification indices; direct and indirect effects; continuous, binary, count, and ordinal outcomes (GLM); multilevel models; random slopes and intercepts; factor scores, empirical Bayes, and other predictions; groups and tests of invariance; goodness of fit; handles MAR data by FIML; correlated data
Functions
statistical; random-number; mathematical; string; date and time
Embedded statistical computations
Numerics by Stata
Contrasts, pairwise comparisons, and margins
compare means, intercepts, or slopes; compare to reference category, adjacent category, grand mean, etc.; orthogonal polynomials; multiple comparison adjustments; graph estimated means and contrasts; interaction plots
GMM and nonlinear regression
generalized method of moments (GMM); nonlinear regression