IBM SPSS Statistics, AMOS. Regression, Base, Premium, Professional, Standard
About IBM SPSS Statistics
WHAT IS SPSS STATISTICS?
SPSS Statistics is a software program for statistical data analysis. Commands can be executed using the menu system or using command syntax. It is available for Windows, Macintosh, and Linux operating systems.
Note: SPSS originally stood for "Statistical Package for the Social Sciences". SPSS has also gone by the name PASW Statistics, which stood for "Predictive Analytics Software".
WHY USE SPSS?
- Intuitive drop-down menu system is easy to learn for beginners.
- Syntax adds flexibility, customization, and automation options.
- Supported on Windows, Macintosh, and Linux operating systems.
- Widely used in the social sciences.
WHAT FILE TYPES ARE SPECIFICALLY ASSOCIATED WITH SPSS?
- *.sav - An SPSS data file. Contains the raw data and any associated labels and type definitions.
- *.sps - An SPSS syntax file. Allows you to execute a specific set of commands on a chosen data file.
- *.spv - A 'viewer' file that was created in SPSS version 16 or later. Displays all of the logs and the output created during an SPSS session.
- *.spo - A 'viewer' file that was created in SPSS version 15 or earlier. Displays all of the logs and the output created during an SPSS session.
- *.jnl - A 'journal' file. Records all syntax executed during an SPSS session.
Note: Viewer files with *.spo extensions were created in SPSS Statistics version 15 or earlier. These files cannot be opened in SPSS Statistics versions 16 and later.
About IBM SPSS Amos
WHAT IS SPSS AMOS?
IBM SPSS Amos is a software program used to fit structural equation models (SEM). Unlike SPSS Statistics, SPSS Amos is only available for the Windows operating system.
Amos is technically "standalone" - it can be used without having SPSS Statistics installed.
WHAT FILE TYPES ARE SPECIFICALLY ASSOCIATED WITH SPSS AMOS?
- *.amw - An Amos model file, which contains the model diagram.
IBM SPSS Statistics Standard Edition
Fundamental analytical capabilities for a wide variety of business and research questions.
The IBM SPSS Statistics Standard Edition offers the core statistical procedures business managers and analysts need to address fundamental business and research questions. This software provides tools that allow users to quickly view data, formulate hypotheses for additional testing, and carry out procedures to clarify relationships between variables, create clusters, identify trends and make predictions.
The IBM SPSS Statistics Standard edition includes the following key capabilities:
Linear models
- Statistics Standard includes generalized linear mixed models (GLMM) for use with hierarchical data.
- This software has general linear models (GLM) and mixed models procedures.
- It includes generalized linear models (GENLIN), covering widely used statistical models such as linear regression for normally distributed responses, logistic models for binary data, and loglinear models for count data. GENLIN also offers many other useful statistical models through its very general model formulation.
- Generalized estimating equations (GEE) procedures extend generalized linear models to accommodate correlated longitudinal data and clustered data.
Nonlinear models
- Multinomial logistic regression (MLR) predicts categorical outcomes with more than two categories.
- Binary logistic regression classifies data into two groups.
- Nonlinear regression (NLR) and constrained nonlinear regression (CNLR) estimate parameters of nonlinear models.
- Probit analysis evaluates the value of stimuli using a logit or probit transformation of the proportion responding.
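As a rough illustration of how binary logistic regression sorts cases into two groups, the Python sketch below fits a one-predictor logistic model by gradient ascent. It is not SPSS syntax and not the estimation routine SPSS uses; the function names, the learning rate and the small usage/purchase dataset are assumptions made for this example.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(b0 + b1 * x) by gradient ascent
    on the mean log-likelihood (illustrative, not optimized)."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            residual = y - sigmoid(b0 + b1 * x)
            grad0 += residual
            grad1 += residual * x
        b0 += learning_rate * grad0 / n
        b1 += learning_rate * grad1 / n
    return b0, b1


def classify(x, b0, b1, cutoff=0.5):
    """Assign a case to group 1 if its predicted probability exceeds the cutoff."""
    return 1 if sigmoid(b0 + b1 * x) > cutoff else 0


# Hypothetical data: hours of product use and whether the customer purchased.
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
bought = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic(hours, bought)
print(classify(1.0, b0, b1), classify(4.5, b0, b1))
```

The 0.5 cutoff is a common default; a different cutoff trades off the two kinds of misclassification.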
Simulation capabilities
- Monte Carlo techniques provide the ability to simulate data according to parameters you specify, and then use that simulated data as input for predicting an outcome.
- The parameters used can be modified to simulate the data and compare outcomes.
- Specifications for a simulation can be saved to a simulation plan file.
- Simulations may be run using specifications from a loaded simulation plan file. Users can also provide specifications in the user interface and run the simulation from the interface.
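The simulation workflow described above (specify parameters, simulate data, compare outcomes, save the specification) can be sketched in a few lines of Python. This is only an analogy: the profit model, the parameter names and the use of a JSON string as a stand-in for a simulation plan file are assumptions for illustration, not the SPSS file format or interface.

```python
import json
import random
import statistics


def run_simulation(plan):
    """Simulate profit = demand * (price - cost) with normally distributed demand."""
    rng = random.Random(plan["seed"])
    margin = plan["unit_price"] - plan["unit_cost"]
    profits = []
    for _ in range(plan["n_runs"]):
        # Demand cannot be negative, so truncate draws at zero.
        demand = max(0.0, rng.gauss(plan["mean_demand"], plan["sd_demand"]))
        profits.append(demand * margin)
    return profits


# Hypothetical simulation specification.
plan = {
    "seed": 0,
    "n_runs": 10_000,
    "mean_demand": 1000.0,
    "sd_demand": 150.0,
    "unit_price": 12.0,
    "unit_cost": 7.0,
}

# Save the specification, then run from the saved copy.
saved_plan = json.dumps(plan)
baseline = run_simulation(json.loads(saved_plan))

# Modify one parameter and compare outcomes.
scenario = run_simulation({**plan, "unit_price": 13.0})

print(round(statistics.fmean(baseline)), round(statistics.fmean(scenario)))
```

Because both runs share the same seed, the difference between the two averages reflects only the changed price, not random variation.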
Customized tables
- When inferential statistics are included, means or proportions can be compared across demographic groups, customer segments, time periods or other categorical variables.
- The software creates summary statistics, from simple counts for categorical variables to measures of dispersion, and sorts categories by any summary statistic used.
- It includes three significance tests: chi-square test of independence, comparison of column means (t test), and comparison of column proportions (z test).
- An interactive table builder provides drag and drop capabilities for creating pivot tables.
- It can exclude specific categories, display missing-value cells and add subtotals to tables.
- Tables can be previewed in real time and modified as they are created.
- Tables are exportable to Microsoft® Word, Excel®, PowerPoint® or HTML for use in reports.
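To make the comparison of column proportions (z test) concrete, here is a minimal Python sketch of a pooled two-proportion z test. It assumes two independent columns and applies no multiple-comparison adjustment; the function name and the example counts are hypothetical, and the output of the SPSS procedure may differ in detail.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for H0: the two column proportions are equal."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_error
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical table: 60 of 100 respondents in segment A and
# 40 of 100 in segment B chose the product.
z, p_value = two_proportion_z_test(60, 100, 40, 100)
print(round(z, 3), round(p_value, 4))
```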
IBM SPSS Statistics Professional Edition
Tools to address the challenges of the entire analytic life cycle
The IBM SPSS Statistics Professional Edition goes beyond the core statistical capabilities offered in the Standard Edition to address issues of data quality, data complexity, automation and forecasting. It is designed for users who perform many types of in-depth and non-standard analyses and who need to save time by automating data preparation tasks.
The IBM SPSS Statistics Professional edition includes the following key capabilities:
Linear models
- Statistics Professional includes generalized linear mixed models (GLMM) for use with hierarchical data.
- This software has general linear models (GLM) and mixed models procedures.
- It includes generalized linear models (GENLIN), covering widely used statistical models such as linear regression for normally distributed responses, logistic models for binary data, and loglinear models for count data. GENLIN also offers many other useful statistical models through its very general model formulation.
- Generalized estimating equations (GEE) procedures extend generalized linear models to accommodate correlated longitudinal data and clustered data.
Nonlinear models
- Multinomial logistic regression (MLR) predicts categorical outcomes with more than two categories.
- Binary logistic regression classifies data into two groups.
- Nonlinear regression (NLR) and constrained nonlinear regression (CNLR) estimate parameters of nonlinear models.
- Probit analysis evaluates the value of stimuli using a logit or probit transformation of the proportion responding.
Simulation capabilities
- Monte Carlo techniques provide the ability to simulate data according to parameters you specify, and then use that simulated data as input for predicting an outcome.
- The parameters used can be modified to simulate the data and compare outcomes.
- Specifications for a simulation can be saved to a simulation plan file.
- Simulations may be run using specifications from a loaded simulation plan file. Users can also provide specifications in the user interface and run the simulation from the interface.
Customized tables
- When inferential statistics are included, means or proportions can be compared across demographic groups, customer segments, time periods or other categorical variables.
- The software creates summary statistics, from simple counts for categorical variables to measures of dispersion, and sorts categories by any summary statistic used.
- It includes three significance tests: chi-square test of independence, comparison of column means (t test), and comparison of column proportions (z test).
- An interactive table builder provides drag and drop capabilities for creating pivot tables.
- It can exclude specific categories, display missing-value cells and add subtotals to tables.
- Tables can be previewed in real time and modified as they are created.
- Tables are exportable to Microsoft® Word, Excel®, PowerPoint® or HTML for use in reports.
Data preparation
- SPSS Statistics Professional identifies suspicious or invalid cases, variables and data values.
- The software lets you view patterns of missing data and summarize variable distributions.
- Optimal Binning bins scale variables to give the best possible outcome for algorithms designed for nominal attributes.
- The Automated Data Preparation (ADP) tool detects and corrects quality errors and imputes missing values in one efficient step.
- Recommendations and visualizations help you determine which data to use.
Data validity and missing values
- SPSS Statistics Professional examines data from several different angles using one of six diagnostic reports, then estimates summary statistics and imputes missing values.
- It diagnoses serious missing data imputation problems.
- The software replaces missing values with estimates.
- It displays a snapshot for each type of missing value and any extreme values for each case.
- Hidden bias is removed by replacing missing values with estimates to include all groups—even those with poor responsiveness.
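The idea of summarizing missing-data patterns and then replacing missing values with estimates can be sketched in Python. The source does not say which estimation method is used, so this example assumes the simplest one, mean imputation; the function names and the survey values are hypothetical.

```python
import statistics


def missing_counts(rows):
    """Count missing (None) entries per variable across a list of case dicts."""
    counts = {}
    for row in rows:
        for name, value in row.items():
            counts[name] = counts.get(name, 0) + (1 if value is None else 0)
    return counts


def impute_mean(values):
    """Replace each missing value with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical survey cases with some missing responses.
cases = [
    {"age": 34, "income": 52000.0},
    {"age": None, "income": 61000.0},
    {"age": 45, "income": None},
    {"age": 29, "income": 48000.0},
]

print(missing_counts(cases))
print(impute_mean([case["age"] for case in cases]))
```

Mean imputation keeps every case in the analysis but understates variability, which is one reason more elaborate estimation methods exist.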
Decision trees
- SPSS Statistics Professional visually determines how your model flows so you can find specific subgroups and relationships.
- The software creates classification trees directly within IBM SPSS Statistics so you can use results to segment and group cases directly within the data.
- It includes four established tree-growing algorithms:
  - CHAID: a fast, statistical, multi-way tree algorithm that explores data quickly and efficiently, and builds segments and profiles with respect to the desired outcome.
  - Exhaustive CHAID: a modification of CHAID, which examines all possible splits for each predictor.
  - Classification and regression trees (C&RT): a complete binary tree algorithm, which partitions data and produces accurate homogeneous subsets.
  - QUEST: a statistical algorithm that selects variables without bias and builds accurate binary trees quickly and efficiently.
- Selection or classification/prediction rules are generated in IBM SPSS Statistics syntax, SQL statements or simple text (through syntax).
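To show what partitioning data into homogeneous subsets means for a binary tree, the Python sketch below finds the single best split on one numeric predictor using Gini impurity. Gini impurity is a common criterion for binary classification trees, but its use here is an assumption for illustration; this is not the SPSS implementation, and the names and data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly homogeneous)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_binary_split(xs, ys):
    """Return (threshold, weighted impurity) of the best split 'x <= threshold'."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    best_threshold, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical predictor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical cases: customer age and whether they responded to an offer.
ages = [22, 25, 30, 35, 41, 48, 52, 60]
responded = ["no", "no", "no", "no", "yes", "no", "yes", "yes"]

threshold, impurity = best_binary_split(ages, responded)
# A rule in simple-text form, similar in spirit to exported selection rules.
print(f"IF age <= {threshold} THEN segment = low_response (impurity {impurity:.4f})")
```

A full tree repeats this search inside each resulting subset until a stopping rule is met.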
Forecasting
- SPSS Statistics Professional enables you to deliver information in ways that your organization’s decision-makers can understand and use.
- It automatically determines the best-fitting ARIMA or exponential smoothing model to analyze your historic data.
- Hundreds of different time series can be modeled at once, rather than one variable at a time.
- Models are saved to a central file so that forecasts can be updated when data changes without having to re-set parameters or re-estimate models.
- Scripts can be written to update models with new data automatically.
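As a small-scale analogy for automatically picking a best-fitting exponential smoothing model, the Python sketch below tries a grid of smoothing weights for simple exponential smoothing and keeps the one with the lowest one-step-ahead squared error. The grid, the error measure and the sales figures are assumptions for this example; the SPSS procedure considers a wider family of models and is not reproduced here.

```python
def one_step_errors(series, alpha):
    """Return (sum of squared one-step-ahead errors, final smoothed level)."""
    level = series[0]
    sse = 0.0
    for y in series[1:]:
        sse += (y - level) ** 2          # the forecast for y is the previous level
        level = alpha * y + (1 - alpha) * level
    return sse, level


def fit_simple_smoothing(series, grid=None):
    """Pick the smoothing weight with the lowest error; return (alpha, next forecast)."""
    if grid is None:
        grid = [step / 20 for step in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    best_alpha = min(grid, key=lambda a: one_step_errors(series, a)[0])
    return best_alpha, one_step_errors(series, best_alpha)[1]


# Hypothetical monthly sales, several series modeled in one pass.
all_series = {
    "store_a": [120.0, 132.0, 128.0, 141.0, 150.0, 147.0],
    "store_b": [80.0, 79.0, 83.0, 81.0, 82.0, 80.0],
}
forecasts = {name: fit_simple_smoothing(values) for name, values in all_series.items()}
print(forecasts)
```

Re-running `fit_simple_smoothing` on an extended series is the scripted equivalent of updating a forecast when new data arrive.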
IBM SPSS Statistics Premium Edition
An “all-in-one” edition designed for enterprise businesses with multiple advanced analytics requirements
The IBM SPSS Statistics Premium Edition helps data analysts, planners, forecasters, survey researchers, program evaluators and database marketers – among others – to easily accomplish tasks at every phase of the analytical process. It includes a broad array of fully integrated Statistics capabilities and related products for specialized analytical tasks across the enterprise. The software will improve productivity significantly and help achieve superior results for specific projects and business goals.
The IBM SPSS Statistics Premium edition includes the following key capabilities:
Linear models
- Statistics Premium includes generalized linear mixed models (GLMM) for use with hierarchical data.
- This software has general linear models (GLM) and mixed models procedures.
- It includes generalized linear models (GENLIN), covering widely used statistical models such as linear regression for normally distributed responses, logistic models for binary data, and loglinear models for count data. GENLIN also offers many other useful statistical models through its very general model formulation.
- Generalized estimating equations (GEE) procedures extend generalized linear models to accommodate correlated longitudinal data and clustered data.
Nonlinear models
- Multinomial logistic regression (MLR) predicts categorical outcomes with more than two categories.
- Binary logistic regression classifies data into two groups.
- Nonlinear regression (NLR) and constrained nonlinear regression (CNLR) estimate parameters of nonlinear models.
- Probit analysis evaluates the value of stimuli using a logit or probit transformation of the proportion responding.
Simulation capabilities
- Monte Carlo techniques provide the ability to simulate data according to parameters you specify, and then use that simulated data as input for predicting an outcome.
- The parameters used can be modified to simulate the data and compare outcomes.
- Specifications for a simulation can be saved to a simulation plan file.
- Simulations may be run using specifications from a loaded simulation plan file. Users can also provide specifications in the user interface and run the simulation from the interface.
Customized tables
- When inferential statistics are included, means or proportions can be compared across demographic groups, customer segments, time periods or other categorical variables.
- The software creates summary statistics, from simple counts for categorical variables to measures of dispersion, and sorts categories by any summary statistic used.
- It includes three significance tests: chi-square test of independence, comparison of column means (t test), and comparison of column proportions (z test).
- An interactive table builder provides drag and drop capabilities for creating pivot tables.
- It can exclude specific categories, display missing-value cells and add subtotals to tables.
- Tables can be previewed in real time and modified as they are created.
- Tables are exportable to Microsoft® Word, Excel®, PowerPoint® or HTML for use in reports.
Data preparation
- Statistics Premium identifies suspicious or invalid cases, variables and data values.
- The software lets you view patterns of missing data and summarize variable distributions.
- Optimal Binning bins scale variables to give the best possible outcome for algorithms designed for nominal attributes.
- The Automated Data Preparation (ADP) tool detects and corrects quality errors and imputes missing values in one efficient step.
- Recommendations and visualizations help you determine which data to use.
Data validity and missing values
- Statistics Premium examines data from several different angles using one of six diagnostic reports, then estimates summary statistics and imputes missing values.
- It diagnoses serious missing data imputation problems.
- The software replaces missing values with estimates.
- It displays a snapshot for each type of missing value and any extreme values for each case.
- Hidden bias is removed by replacing missing values with estimates to include all groups—even those with poor responsiveness.
Categorical and numeric data
- This software discovers underlying relationships through perceptual maps, biplots and triplots.
- It uses procedures similar to conventional regression, principal components and canonical correlation to predict outcomes and reveal relationships, helping you work with and understand nominal (e.g. job category) and ordinal (e.g. education level) data.
- Statistics Premium lets you visually interpret datasets and see how rows and columns relate in large tables of scores, counts, ratings, rankings or similarities.
- The software handles non-normal residuals in numeric data or nonlinear relationships between predictor variables (e.g. customer or product attributes) and the outcome variable (e.g. purchase/non-purchase).
- Techniques include Ridge Regression, the Lasso, the Elastic Net, variable selection and model selection for both numeric and categorical data.
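For readers unfamiliar with the regularization techniques named above, the Python sketch below shows the core idea of ridge regression in its simplest form: one centered numeric predictor, where the penalty shrinks the least-squares slope toward zero. It covers only the plain numeric case and is not the SPSS procedure; the function name, the penalty values and the data are assumptions for illustration.

```python
import statistics


def ridge_slope(xs, ys, penalty):
    """Slope of a one-predictor ridge fit on centered data:
    sum(dx * dy) / (sum(dx * dx) + penalty). A penalty of 0 gives least squares."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / (s_xx + penalty)


# Hypothetical data lying exactly on the line y = 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

for penalty in (0.0, 1.0, 5.0):
    print(penalty, ridge_slope(x, y, penalty))
```

The Lasso and the Elastic Net use different penalties that can shrink some coefficients all the way to zero, which is why they are associated with variable selection.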
Decision trees
- Statistics Premium visually determines how your model flows so you can find specific subgroups and relationships.
- The software creates classification trees directly within IBM SPSS Statistics so you can use results to segment and group cases directly within the data.
- It includes four established tree-growing algorithms:
  - CHAID: a fast, statistical, multi-way tree algorithm that explores data quickly and efficiently, and builds segments and profiles with respect to the desired outcome.
  - Exhaustive CHAID: a modification of CHAID, which examines all possible splits for each predictor.
  - Classification and regression trees (C&RT): a complete binary tree algorithm, which partitions data and produces accurate homogeneous subsets.
  - QUEST: a statistical algorithm that selects variables without bias and builds accurate binary trees quickly and efficiently.
- Selection or classification/prediction rules are generated in IBM SPSS Statistics syntax, SQL statements or simple text (through syntax).
Forecasting
- Statistics Premium enables you to deliver information in ways that your organization’s decision-makers can understand and use.
- It automatically determines the best-fitting ARIMA or exponential smoothing model to analyze your historic data.
- Hundreds of different time series can be modeled at once, rather than one variable at a time.
- Models are saved to a central file so that forecasts can be updated when data changes without having to re-set parameters or re-estimate models.
- Scripts can be written to update models with new data automatically.
Structural equation modeling
- Statistics Premium tests hypotheses and confirms relationships among observed and latent variables – moving beyond regression to gain additional insight.
- It lets you build models that more realistically reflect complex relationships because any numeric variable, whether observed (such as non-experimental data from a survey) or latent (such as satisfaction and loyalty), can be used to predict any other numeric variable.
- The software’s visual framework lets you compare, confirm and refine models.
- Multivariate analysis encompasses and extends standard methods – including regression, factor analysis, correlation and analysis of variance.
- This product includes three data imputation methods: regression, stochastic regression and Bayesian.
Bootstrapping
- Statistics Premium estimates the sampling distribution of an estimator by re-sampling with replacement from the original sample.
- It estimates the standard errors and confidence intervals of a population parameter such as the mean, median, proportion, odds ratio, correlation coefficient, regression coefficient and many others.
- The software lets you create thousands of alternate versions of your datasets for more accurate analysis.
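The resampling procedure described above can be sketched in Python for one estimator, the mean. The sketch resamples with replacement from the original sample, then reads a standard error and a percentile confidence interval off the resampled means. The function name, the number of resamples and the data are assumptions for this example, and percentile intervals are only one of several bootstrap interval types.

```python
import random
import statistics


def bootstrap_mean(sample, n_resamples=2000, confidence=0.95, seed=0):
    """Return (standard error, (lower, upper)) for the sample mean."""
    rng = random.Random(seed)
    n = len(sample)
    # Each resample draws n values with replacement from the original sample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    std_error = statistics.stdev(means)
    tail = (1 - confidence) / 2
    lower = means[round(tail * (n_resamples - 1))]
    upper = means[round((1 - tail) * (n_resamples - 1))]
    return std_error, (lower, upper)


# Hypothetical sample of ten satisfaction scores.
scores = [12, 15, 9, 20, 14, 11, 18, 16, 13, 17]
std_error, (lower, upper) = bootstrap_mean(scores)
print(round(std_error, 2), (round(lower, 2), round(upper, 2)))
```

Swapping `statistics.fmean` for another function, such as `statistics.median`, applies the same procedure to a different estimator.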
Advanced sampling assessment and testing
- Statistics Premium provides the specialized planning tools and statistics needed to work with complex sample designs, such as stratified, clustered or multistage sampling.
- It helps you achieve better results because it incorporates the sample design into survey analysis.
- Users can more accurately work with numerical and categorical outcomes in complex sample designs using algorithms for analysis and prediction, including predicting time to an event.
- Wizards make it easy to create plans, analyze data and interpret results.
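To illustrate why incorporating the sample design matters, the Python sketch below estimates a population mean from a stratified sample, weighting each stratum by its share of the population, and compares it with the naive unweighted mean. It assumes simple random sampling within each stratum; the function name, strata sizes and values are hypothetical, and clustered or multistage designs need more than this.

```python
import math
import statistics


def stratified_mean(strata):
    """strata: list of (population_size, sample_values) pairs.
    Return (weighted mean estimate, standard error)."""
    total_population = sum(size for size, _ in strata)
    estimate = 0.0
    variance = 0.0
    for size, values in strata:
        weight = size / total_population
        n = len(values)
        estimate += weight * statistics.fmean(values)
        # Stratum contribution with a finite population correction.
        variance += weight ** 2 * statistics.variance(values) / n * (1 - n / size)
    return estimate, math.sqrt(variance)


# Hypothetical design: a large stratum of small accounts and a small
# stratum of large accounts, each sampled three times.
strata = [
    (900, [10.0, 12.0, 14.0]),
    (100, [50.0, 54.0, 58.0]),
]

estimate, std_error = stratified_mean(strata)
naive = statistics.fmean([v for _, values in strata for v in values])
print(round(estimate, 2), round(naive, 2))
```

The naive mean treats both strata as equally common, so it overstates the average whenever the over-sampled stratum has larger values.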
Direct marketing and product decision-making tools
- Statistics Premium segments customers or contacts by creating clusters of those who are like each other, and distinctly different from others.
- The software profiles customers or contacts with shared characteristics to improve the targeting of marketing offers and campaigns.
- It develops propensity scores to identify those who are most likely to purchase.
- Test package performance can be compared to control packages.
- Responses to campaigns are identified by postal code.
- Campaign response data integrates with Salesforce.com to track leads and report on sales pipeline.
High-end charts and graphs
- Statistics Premium has dozens of built-in visualization templates to communicate analytic results.
- "Drag-and-drop" graph creation eliminates the need for programming skills.
- Style sheets and graph templates can be customized to set new graphic standards across your enterprise or match branding.
- Graphs are deployed in operational systems using IBM SPSS Collaboration and Deployment Services, IBM SPSS Statistics and IBM SPSS Modeler.
- The software supports a wide array of data sources, including delimiter-separated files, IBM SPSS Statistics data files and common database sources such as DB2, SQL Server, Oracle and Sybase.