SUDAAN Procedures

SUDAAN is a single program comprising a family of nine analytic and two new pre-analytic procedures. The two pre-analytic procedures are WTADJUST, which computes weight adjustments using a model-based weight calibration methodology, and HOTDECK, which imputes data using the Cox-Iannacchione Weighted Sequential Hot Deck method. SUDAAN procedures are used to analyze data from complex sample surveys and from other observational and experimental studies involving repeated measures and cluster-correlated data. Included in SUDAAN are procedures for descriptive statistics and regression modeling.

Weighting and Imputation Procedures

WTADJUST (NEW in SUDAAN 10): Produces nonresponse and poststratification sample weight adjustments using a model-based calibration approach. A weight truncation option is available for trimming extreme weights. Any loss or gain in the weight sum caused by trimming is accounted for in the subsequent computation of the weight adjustments.
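
As a rough illustration of what a poststratification adjustment does, the Python sketch below scales the weights within each cell so that they sum to a known control total. This is the simplest cell-based form of calibration, not the model-based method that WTADJUST implements, and the function name `poststratify` and the sample data are assumptions made for the example.

```python
from collections import defaultdict


def poststratify(weights, cells, control_totals):
    """Scale weights so each cell's weight sum equals its control total.

    Hypothetical helper for illustration; this is plain cell-based
    poststratification, not WTADJUST's model-based calibration.
    """
    # Sum the current weights within each poststratification cell.
    cell_sums = defaultdict(float)
    for w, c in zip(weights, cells):
        cell_sums[c] += w
    # Multiply each weight by (control total / current cell sum).
    return [w * control_totals[c] / cell_sums[c] for w, c in zip(weights, cells)]


# Made-up sample: two cells with known population totals.
weights = [10.0, 20.0, 30.0, 40.0]
cells = ["a", "a", "b", "b"]
control_totals = {"a": 60.0, "b": 140.0}
adjusted = poststratify(weights, cells, control_totals)
```

After the adjustment, the weights in cell "a" sum to 60 and those in cell "b" sum to 140, matching the control totals.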

HOTDECK (NEW in SUDAAN 10): Performs the Cox-Iannacchione Weighted Sequential Hot Deck method of imputation. With this method, the weighted means and percentages computed from the imputed data are equal, in expectation, to the weighted means and percentages of the respondent data within user-specified imputation classes. The procedure also lets the user control the number of times a particular donor can be used.

Descriptive Procedures

CROSSTAB—Computes frequencies, percentage distributions, odds ratios, relative risks, and their standard errors (or confidence intervals) for user-specified cross-tabulations, as well as tests of independence for single and stratified two-way tables. NEW in SUDAAN 10: Additional Cochran-Mantel-Haenszel hypothesis tests for single and stratified two-way tables, additional test statistics for all hypotheses (analogous to regression procedures), and a goodness-of-fit test for categorical count data against a set of known proportions. Also includes a new method for computing confidence intervals for extreme proportions.
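
To make the odds ratio and relative risk concrete, the Python sketch below computes both point estimates from a weighted two-way table. It covers only the point estimates; the design-based standard errors and confidence intervals that CROSSTAB reports are not reproduced. All function names and data values are assumptions made for the example.

```python
def weighted_two_by_two(weights, exposed, outcome):
    """Return weighted cell totals (a, b, c, d) for a 2x2 table.

    a: exposed with outcome,   b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    """
    a = b = c = d = 0.0
    for w, e, o in zip(weights, exposed, outcome):
        if e and o:
            a += w
        elif e:
            b += w
        elif o:
            c += w
        else:
            d += w
    return a, b, c, d


def odds_ratio(a, b, c, d):
    """Odds of the outcome among the exposed over odds among the unexposed."""
    return (a * d) / (b * c)


def relative_risk(a, b, c, d):
    """Risk of the outcome among the exposed over risk among the unexposed."""
    return (a / (a + b)) / (c / (c + d))


# Made-up data: one record per table cell, with its sample weight.
weights = [20.0, 80.0, 10.0, 90.0]
exposed = [1, 1, 0, 0]
outcome = [1, 0, 1, 0]
table = weighted_two_by_two(weights, exposed, outcome)
table_or = odds_ratio(*table)
table_rr = relative_risk(*table)
```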

RATIO—Computes estimates, standard errors, and confidence limits of generalized ratios of the form Σᵢ wᵢxᵢ / Σᵢ wᵢyᵢ, where wᵢ is the sample weight for observation i. Also computes standardized estimates and tests single-degree-of-freedom contrasts among levels of a categorical variable.
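
The point estimate of a generalized ratio is a weighted sum of one variable divided by a weighted sum of another. The Python sketch below shows that calculation only; it does not attempt the design-based standard errors that RATIO produces. The function name `weighted_ratio` and the data are assumptions made for the example.

```python
def weighted_ratio(weights, numerator, denominator):
    """Return sum(w_i * x_i) / sum(w_i * y_i) for paired observations.

    Hypothetical helper: x is the numerator variable and y the
    denominator variable, following the formula in the text.
    """
    num = sum(w * x for w, x in zip(weights, numerator))
    den = sum(w * y for w, y in zip(weights, denominator))
    if den == 0:
        raise ZeroDivisionError("weighted denominator total is zero")
    return num / den


# Made-up data: three observations with sample weights.
weights = [2.0, 3.0, 5.0]
x = [1.0, 0.0, 1.0]
y = [4.0, 2.0, 2.0]
ratio = weighted_ratio(weights, x, y)  # weighted totals are 7 and 24
```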

DESCRIPT—Computes estimates of means, totals, proportions, percentages, geometric means, quantiles, and their standard errors and confidence limits. Also computes standardized estimates and tests of single-degree-of-freedom contrasts among levels of a categorical variable. NEW in SUDAAN 10: Enhanced method for estimating quantiles and their standard errors.

Survival Procedures

SURVIVAL—Fits discrete and continuous proportional hazards models to failure time data; also estimates hazard ratios and their confidence intervals for each model parameter. Includes facilities for time-dependent covariates, the counting process style of input, stratified baseline hazards, and Schoenfeld and Martingale residuals. Estimates conditional and predicted marginals and tests hypotheses about the marginals. NEW in SUDAAN 10: SURVIVAL provides estimates of exponentiated contrasts among model parameters (with confidence intervals), which are useful for the computation of user-specified hazard ratios.

KAPMEIER—Fits the Kaplan-Meier model, also known as the product limit estimator, to survival data from sample surveys and other clustered data applications. KAPMEIER uses either discrete or continuous time variables to provide point estimates for the survival curve for failure time outcomes that may contain censored observations.
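
The product limit estimator multiplies, over the observed event times, the proportion of those still at risk who survive each event time. The Python sketch below is a minimal weighted version of that calculation. It gives point estimates only and is not KAPMEIER's implementation; the function name `kaplan_meier` and the data are assumptions made for the example.

```python
def kaplan_meier(times, events, weights=None):
    """Return [(t, S(t))] at each distinct event time.

    times:   follow-up time for each observation
    events:  1 if the failure was observed, 0 if censored
    weights: optional sample weights (defaults to 1 for everyone)
    """
    if weights is None:
        weights = [1.0] * len(times)
    # Only times at which at least one failure occurred change the curve.
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Weighted number still under observation just before time t.
        at_risk = sum(w for ti, w in zip(times, weights) if ti >= t)
        # Weighted number of failures exactly at time t.
        failed = sum(
            w for ti, e, w in zip(times, events, weights) if ti == t and e
        )
        survival *= 1.0 - failed / at_risk
        curve.append((t, survival))
    return curve


# Made-up data: five subjects, two of them censored.
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

With unit weights, the estimated survival probabilities for this sample are approximately 0.8, 0.6, and 0.3 at times 1, 2, and 3.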

Regression Procedures

REGRESS—Fits linear regression models and performs hypothesis tests concerning the model parameters. Uses Generalized Estimating Equations (GEE) to efficiently estimate regression parameters with robust and model-based variance estimation. Estimates conditional and predicted marginals and tests hypotheses about the marginals.

LOGISTIC—Fits logistic regression models to binary data and computes hypothesis tests for model parameters; also estimates odds ratios and their confidence intervals for each model parameter. Uses GEE to efficiently estimate regression parameters with robust and model-based variance estimation. Estimates conditional and predicted marginals and tests hypotheses about the marginals. NEW in SUDAAN 10: LOGISTIC provides estimates of exponentiated contrasts among model parameters (with confidence intervals), which are useful for the computation of user-specified odds ratios. Also provides estimates of model-adjusted risks, risk differences, and risk ratios.
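
An exponentiated contrast turns a linear combination of model coefficients into a ratio measure, such as a user-specified odds ratio. The Python sketch below assumes the coefficient vector and its covariance matrix are already available from a fitted model, and it uses a normal-approximation interval with z = 1.96; SUDAAN's own interval computation may differ. The function name `exponentiated_contrast` and all numbers are assumptions made for the example.

```python
import math


def exponentiated_contrast(beta, cov, contrast, z=1.96):
    """Return (estimate, lower, upper) for exp(contrast . beta).

    beta:     fitted coefficients
    cov:      covariance matrix of beta, as a list of rows
    contrast: contrast coefficients, one per element of beta
    z:        critical value (normal approximation, an assumption here)
    """
    n = len(beta)
    # Linear combination of coefficients on the log scale.
    estimate = sum(c * b for c, b in zip(contrast, beta))
    # Variance of the linear combination: c' V c.
    variance = sum(
        contrast[i] * cov[i][j] * contrast[j]
        for i in range(n)
        for j in range(n)
    )
    se = math.sqrt(variance)
    return (
        math.exp(estimate),
        math.exp(estimate - z * se),
        math.exp(estimate + z * se),
    )


# Made-up fit: intercept plus two indicator coefficients.
beta = [-1.0, 0.5, 0.2]
cov = [
    [0.04, 0.0, 0.0],
    [0.0, 0.01, 0.002],
    [0.0, 0.002, 0.01],
]
# Odds ratio comparing the second coefficient's level to the third's.
odds_ratio, lower, upper = exponentiated_contrast(beta, cov, [0, 1, -1])
```

The same calculation yields a hazard ratio or an incidence density ratio when the coefficients come from a proportional hazards or log-linear model.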

MULTILOG—Fits logistic and multinomial logistic regression models to ordinal and nominal categorical data and computes hypothesis tests for model parameters; estimates odds ratios and their confidence intervals for each model parameter; uses GEE to efficiently estimate regression parameters with robust and model-based variance estimation. Estimates conditional and predicted marginals and tests hypotheses about the marginals. NEW in SUDAAN 10: MULTILOG provides estimates of exponentiated contrasts among model parameters (with confidence intervals), which are useful for the computation of user-specified odds ratios. Also provides estimates of model-adjusted risks, risk differences, and risk ratios.

LOGLINK—Fits log-linear regression models to count data not in the form of proportions. Uses GEE to efficiently estimate regression parameters, with robust and model-based variance estimation. Estimates conditional and predicted marginals and tests hypotheses about the marginals. NEW in SUDAAN 10: LOGLINK provides estimates of exponentiated contrasts among model parameters (with confidence intervals), which are useful for the computation of user-specified incidence density ratios.

Utility Procedure

RECORDS—Prints observations from the input data set, obtains the contents of the input data set, converts an input data set from one type to another, and sorts a data set.

Also NEW in SUDAAN 10

  • Enhanced memory manager allows SUDAAN to process very large data sets. Use the USEVMEM option available on every procedure to control programming efficiency.
  • Use the new NOTSORTED option to specify input data sets that are not sorted by the required design variables (e.g., the NEST variables).
  • All print output can be requested in ASCII or in the new RTF format.