STATISTICA 7 - New Features and Enhancements
Enhancements to Existing Functionality
- Extended Import/Export
- Enhanced Graph Updating
- Interactive Graphs, Brushing
- "By Group" Analysis for Statistics and Graphs
- Variable and Case Metadata
- Case Metadata
- Variable Metadata
- Workbook Multi-Item Display
- New Recording / Reporting Options for Case Selection Conditions
- Web Browser Document Type
- Enhanced Text Importing
- Automatic Variable Classification
- Licensing Changes
- Sorting
- Merge Data
- Stacking/Unstacking
- Enhanced Spreadsheet Formulas and Case Selection Conditions
- Further Expanded STATISTICA Visual Basic Functionality
- "All Values" Categorization Method
- Basic Statistics
- Quality Control
- Process Analysis
SEWSS Enhancements
- Aggregated Data
- Sets
- 21 CFR Part 11 Compliance
New Products and Analysis Modules
- STATISTICA NIPALS Algorithm (PCA/PLS)
- STATISTICA Sequence, Association and Link Analysis
- STATISTICA Multivariate Statistical Process Control (MSPC)
- Random Forests
Enhancements to Existing Functionality:
Extended Import/Export
Added support for importing and exporting to/from:
- SAS Data Files (binary)
- SAS Transport Files
- SPSS Data Files (binary)
- SPSS Portable Files (Replaces V6 SPSS POR import
functionality)
- Minitab Data Files (binary)
- JMP Data Files (binary)
Enhanced Graph Updating
- Support for maintaining integrated
"data-graphs" exploratory environments.
- STATISTICA Graphs will update when the source
Spreadsheet data change even after the respective STATISTICA
analyses are closed.
- Graphs can be re-linked to new Spreadsheets and
Variables, allowing currently customized graphs (titles,
scaling, embedded objects, bar shading, etc.) to be used
as "Templates" for deployment to different
data sets.
Interactive Graphs
- Tight integration between Graphs and their source
Spreadsheets.
- Brush points on Scatterplots and the Cases will
automatically become marked in the respective
Spreadsheet, so the subsets can be used in subsequent
analyses.
- Brushing states will propagate to the source
Spreadsheet and then to all other open Graphs based on
the same Spreadsheet; this feature enables the user to
brush points on one graph and view the corresponding
Cases highlighted on other open Graphs.
- Brushing events will update the Spreadsheet, marking
Cases as Labeled/Unlabeled, Excluded/Included, or
Marked/Unmarked. Related graphs tied to the same data
can then be updated to reflect the brushing events
performed on the first graph.
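The propagation described above (graph to Spreadsheet, then Spreadsheet to every other linked graph) can be sketched as a small observer pattern. This is only an illustration in Python, not STATISTICA code; the class and method names (`Spreadsheet`, `Graph`, `brush`, `refresh`) are hypothetical.

```python
class Spreadsheet:
    """Holds the per-case Marked state and notifies linked graphs of changes."""

    def __init__(self, n_cases):
        self.marked = [False] * n_cases
        self.graphs = []

    def link(self, graph):
        self.graphs.append(graph)
        graph.refresh(self.marked)

    def set_marked(self, case_indices, value=True):
        for i in case_indices:
            self.marked[i] = value
        # Every graph based on this spreadsheet sees the new state.
        for graph in self.graphs:
            graph.refresh(self.marked)


class Graph:
    def __init__(self, name, sheet):
        self.name = name
        self.sheet = sheet
        self.highlighted = set()
        sheet.link(self)

    def brush(self, case_indices):
        # Brushing writes to the source spreadsheet, not to the graph itself.
        self.sheet.set_marked(case_indices)

    def refresh(self, marked):
        self.highlighted = {i for i, is_marked in enumerate(marked) if is_marked}


sheet = Spreadsheet(5)
scatter = Graph("scatterplot", sheet)
histogram = Graph("histogram", sheet)
scatter.brush([1, 3])  # brush two points on one graph
```

After the call to `scatter.brush`, the same two cases are marked in the spreadsheet and highlighted in the histogram, because the state lives in the spreadsheet rather than in either graph.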
"By Group" Analysis for Statistics and Graphs
- All STATISTICA Analyses and Graphs now support
the selection of one or more "By Variables."
The specified analysis is repeated for each unique level
(value) of the "By Variables." For example, a
Multiple Linear Regression model can be specified and
calculated independently for subsets of cases defined by
each unique value of variable City (e.g., Dallas,
Atlanta, Pittsburgh, Chicago...).
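The By Group idea can be sketched in a few lines of Python. This is a minimal illustration, not STATISTICA code: it uses a simple one-predictor regression rather than Multiple Linear Regression, and the data, the variable names Spend and Sales, and the helper functions are invented for the example.

```python
from collections import defaultdict


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_by_group(rows, by, x, y):
    """Repeat the same fit independently for each unique value of the By variable."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append((row[x], row[y]))
    return {level: fit_line(points) for level, points in groups.items()}


# Hypothetical data: two levels of the By variable City.
rows = [
    {"City": "Dallas", "Spend": 1.0, "Sales": 3.0},
    {"City": "Dallas", "Spend": 2.0, "Sales": 5.0},
    {"City": "Dallas", "Spend": 3.0, "Sales": 7.0},
    {"City": "Atlanta", "Spend": 1.0, "Sales": 2.0},
    {"City": "Atlanta", "Spend": 2.0, "Sales": 2.5},
    {"City": "Atlanta", "Spend": 3.0, "Sales": 3.0},
]
models = fit_by_group(rows, by="City", x="Spend", y="Sales")
```

Each entry of `models` holds the (intercept, slope) pair fitted only on the cases of that city, so the groups never influence each other.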
Variable and Case Metadata
Metadata can now be defined for Cases and Variables to offer
new analytic options and simplify and speed up specifying
new analyses.
Case Metadata:
- Marker Type: Defines the point marker shape to be used
for the respective Case(s); used in Graph Types such as
Scatterplots (for example, one particular case can be
assigned a "red star" marker, and it will
appear as such in all scatterplots).
- Marker Color: Defines the point marker color to be
used for the respective Case(s).
- Excluded: User can mark a case as Excluded. An
Excluded case will be omitted from calculations, but
will still be present in graphical displays.
- Hidden: User can hide a point in a graph, i.e., the
point will still be used in computations, but will not
be displayed in the graph.
- Label: User can select to label individual cases
within graphs.
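The difference between Excluded and Hidden is easy to mix up, so here is a minimal Python sketch of the two filters as described above. The dictionary layout and function names are assumptions made for the illustration; they do not reflect how STATISTICA stores case states.

```python
# Hypothetical cases with two of the case states described above.
cases = [
    {"x": 1.0, "y": 10.0, "excluded": False, "hidden": False},
    {"x": 2.0, "y": 12.0, "excluded": True, "hidden": False},
    {"x": 3.0, "y": 50.0, "excluded": False, "hidden": True},
    {"x": 4.0, "y": 14.0, "excluded": False, "hidden": False},
]


def cases_for_calculation(cases):
    """Excluded cases are omitted from calculations; Hidden cases still count."""
    return [c for c in cases if not c["excluded"]]


def cases_for_display(cases):
    """Hidden cases are not drawn; Excluded cases are still shown."""
    return [c for c in cases if not c["hidden"]]


used = cases_for_calculation(cases)
mean_y = sum(c["y"] for c in used) / len(used)
plotted_x = [c["x"] for c in cases_for_display(cases)]
```

Here the hidden case (y = 50) still contributes to `mean_y`, while the excluded case (x = 2) still appears in `plotted_x`.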
Variable Metadata:
- Measurement Type (Auto, Continuous, Categorical,
Ordinal): Used for automatic variable classification in
Analyses and, optionally, automatically populating
variable selection list boxes only with variables of the
appropriate types.
- Excluded: Prevents display in Variable selection
dialogs.
- Label: The User can define a variable as a Label
variable. The values of a Label variable will be used as
point labels within appropriate graphs.
- Case state: User can save case states to a specified
variable.
- Properties: User can create custom metadata fields
(name-value pairs) to be stored and associated with a
Variable. For example, a User can define an "Upper
Control Limit" property for a variable and assign it a
value of "2.6". A STATISTICA Visual
Basic (SVB) macro can query the Variable Properties,
including the custom Upper Control Limit Property, to
apply it to Quality Control Charts based on this
Variable. With this approach, the same SVB macro can be
applied to different data and dynamically use
appropriate QC Chart limits and specifications.
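The Properties idea can be illustrated with a short Python sketch. It is not SVB and does not use any STATISTICA object model: the dictionary layout, the variable name Thickness, and the function `points_above_limit` are hypothetical. It only shows why storing the limit with the variable lets the same routine run unchanged on different data.

```python
def points_above_limit(values, properties, key="Upper Control Limit"):
    """Return the indices of values above the limit stored in the properties."""
    if key not in properties:
        return []  # no limit defined for this variable
    limit = float(properties[key])  # the property value is stored as text
    return [i for i, v in enumerate(values) if v > limit]


# Hypothetical variable carrying its own name-value properties.
variable = {
    "name": "Thickness",
    "properties": {"Upper Control Limit": "2.6"},
    "values": [2.1, 2.7, 2.4, 2.9],
}
flagged = points_above_limit(variable["values"], variable["properties"])
```

Applying the same function to another variable with a different "Upper Control Limit" value requires no change to the code.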
Workbook Multi-Item Display
- By default, selecting a folder in a Workbook now
displays the contents of that folder in a single pane,
with the respective Spreadsheets and Graphs arranged
adjacent to each other in a grid.
- Workbooks now support the ability to View/Print the
contents of a Workbook folder in a user-defined grid
configuration.
New Recording / Reporting Options for Case Selection Conditions
- Currently specified case selection conditions can now
be automatically displayed in the title areas of all
respective graphs (generated from the selected subsets
of cases) and in the header areas of all result
spreadsheets.
Web Browser Document Type
- Support for Integrated Internet Explorer (IE) Windows
in the STATISTICA Application.
- The integrated IE Window offers one more way to
easily build custom User Interfaces in STATISTICA,
in this case using standard HTML scripting.
- IE Windows support HTML applications that can host
native STATISTICA Spreadsheet and Graph objects
for interactive editing, brushing, etc.
- Seamless integration of desktop STATISTICA and WebSTATISTICA
running on a remote server.
Enhanced Text Importing
- The import of text files has been enhanced through the
"auto" text import method. Users can now have
the system automatically determine which columns should
be imported as variables of type Text (instead of
variables of type Double with text labels), or users can
manually specify which columns are to be imported as
text.
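The text does not say how the "auto" method decides between Text and Double. As a rough sketch under one plausible assumption (a column is numeric only if every non-empty value parses as a number), the decision could look like this in Python; the function name and sample columns are invented.

```python
def infer_column_type(values):
    """Return "Double" if every non-empty value parses as a number, else "Text"."""
    for value in values:
        if value.strip() == "":
            continue  # treat blanks as missing data, not as text
        try:
            float(value)
        except ValueError:
            return "Text"
    return "Double"


# Hypothetical columns read from a text file.
columns = {
    "Weight": ["12.5", "13", "", "11.75"],
    "Batch": ["A17", "B02", "A19", "C44"],
}
types = {name: infer_column_type(values) for name, values in columns.items()}
```

A manual override, as mentioned above, would simply replace entries of `types` for the columns the user names.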
Automatic Variable Classification
- To speed up and simplify the process of selecting
variables for analyses, Variable selection in Analyses
and Graphs will (optionally) limit the display of
Variables to the types that are appropriate for their
respective roles in the Analyses. For example, in
"By Group" Analyses, by default only
Categorical Variables will be displayed for selection as
the By Variables.
Licensing Changes
- STATISTICA Concurrent Licensing has been
enhanced to allow for more granular licensing of modules
and offline usage while a STATISTICA User is
disconnected from the network (as well as supporting
"trial period" usage of individual modules).
Sorting
- Improved user interface for defining complex sorting
scenarios with many keys.
- Support via automation for up to 14 sort keys.
Merge Data
- Merging from both open and disk-based Spreadsheets.
- Addition of "Cartesian-join" merge.
- Enhanced user interface.
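For readers unfamiliar with the term, a Cartesian join pairs every row of one table with every row of the other. The Python sketch below shows the unconditional form; whether the STATISTICA option pairs all cases or only cases within matching key values is not stated here, so treat this as an illustration of the general concept. The table contents are invented.

```python
from itertools import product


def cartesian_merge(left, right):
    """Pair every row of `left` with every row of `right`."""
    return [{**l, **r} for l, r in product(left, right)]


# Hypothetical spreadsheets with 2 and 3 cases.
plants = [{"Plant": "North"}, {"Plant": "South"}]
shifts = [{"Shift": 1}, {"Shift": 2}, {"Shift": 3}]
merged = cartesian_merge(plants, shifts)  # 2 x 3 = 6 rows
```

The result grows multiplicatively with the input sizes, which is worth keeping in mind before merging large spreadsheets this way.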
Stacking/Unstacking
- Added ability to Interleave output when stacking.
- Stacking - Unstacked variables can be
included/excluded from results.
- Unstacking - Added options for handling multiple cross
tab values.
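The sketch below shows one reading of the Interleave option: without it, stacked output lists all values of the first variable, then all values of the second; with it, the values alternate case by case. This interpretation, the function name, and the sample variables Before and After are assumptions for illustration only.

```python
def stack(rows, variables, interleave=False):
    """Stack the named variables into (source variable, value) pairs."""
    if interleave:
        # Case by case: Before, After, Before, After, ...
        return [(v, row[v]) for row in rows for v in variables]
    # Variable by variable: all Before values, then all After values.
    return [(v, row[v]) for v in variables for row in rows]


rows = [{"Before": 1, "After": 2}, {"Before": 3, "After": 4}]
blocked = stack(rows, ["Before", "After"])
interleaved = stack(rows, ["Before", "After"], interleave=True)
```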
Enhanced Spreadsheet Formulas and Case Selection Conditions
STATISTICA now provides an even broader selection of
regular expression (including so-called fuzzy text
searching) functions that can be used in spreadsheet and
case selection formulas. For example:
- RE_SEARCH - search for text in a variable using
regular expressions.
- RE_MATCH - compare text using regular expressions.
- RE_REPLACE - text replacement in a variable using
regular expressions.
- LIKE - compare text using an operator similar to SQL's
LIKE keyword.
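The argument order and pattern dialect of these functions are not given here, so the following Python sketch only shows analogous operations with the standard `re` module: searching, whole-string matching, replacement, and a small translation of SQL-style LIKE patterns. The helper `like` and the sample names are written for this example and are not STATISTICA functions.

```python
import re


def like(text, pattern):
    """SQL-style LIKE: % matches any run of characters, _ matches exactly one."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), text, flags=re.DOTALL) is not None


names = ["Dallas", "Atlanta", "Pittsburgh", "Chicago"]

# Search: the pattern may occur anywhere in the text.
found = [n for n in names if re.search(r"tt", n)]
# Match: the whole text must fit the pattern.
whole = [n for n in names if re.fullmatch(r"[A-D]\w+", n)]
# Replace: substitute every run of lowercase "a" with "A".
replaced = [re.sub(r"a+", "A", n) for n in names]
# LIKE: text ending in "a" followed by exactly one more character.
like_hits = [n for n in names if like(n, "%a_")]
```

In a case selection condition, expressions of this kind would act as the filter that decides which cases enter the analysis.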
Further Expanded STATISTICA Visual Basic Functionality
- Go even further using STATISTICA as an
efficient programming platform for developing highly
interactive custom graphics applications.
- Embed a wide variety of ActiveX controls within STATISTICA
graphs.
"All Values" Categorization Method
- A new method of categorization in graphs allows for up
to 255 distinct categories of integer or non-integer
values.
Basic Statistics
- Enhanced breakdown tables: empty rows are now
eliminated from the generated tables.
Quality Control
- STATISTICA QC Charts support aggregated data
(means, ranges, standard deviations) as input. This
capability is particularly useful when automated data
collection equipment and instruments output only
aggregated data for each sample.
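To show why aggregated input is sufficient, here is a minimal Python sketch that computes X-bar chart limits from sample means and ranges alone, using the standard Shewhart formula (grand mean plus or minus A2 times the average range). The formula and A2 constants are textbook values; how STATISTICA QC Charts compute their limits is not described here, and the data are invented.

```python
# Standard Shewhart A2 constants for subgroup sizes 2 to 5.
A2 = {2: 1.880, 3: 1.023, 4: 0.729, 5: 0.577}


def xbar_limits(sample_means, sample_ranges, n):
    """Return (LCL, center line, UCL) for an X-bar chart with subgroup size n."""
    center = sum(sample_means) / len(sample_means)
    mean_range = sum(sample_ranges) / len(sample_ranges)
    half_width = A2[n] * mean_range
    return center - half_width, center, center + half_width


# Hypothetical instrument output: one mean and one range per sample of 5 parts.
means = [10.0, 10.2, 9.8, 10.0]
ranges = [0.5, 0.4, 0.6, 0.5]
lcl, center, ucl = xbar_limits(means, ranges, n=5)
```

No individual measurement is needed at any point, which is the situation described above for equipment that reports only per-sample summaries.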
Process Analysis
- Gage Linearity analysis.
- The 5,000-case limit has been removed.
SEWSS Enhancements:
Aggregated Data
SEWSS supports aggregated data (means, ranges,
standard deviations) as input. This capability is
particularly useful when automated data collection equipment and
instruments output only aggregated data for each sample.
Sets
Samples can be grouped using labels, which allow for
unique specifications and limits for each category of the label.
This label, also called the Set Name, provides the user with
the following options:
- Set Names are for labeling only.
- Calculate limits and sigma from the samples in a set
and apply to the same Set.
- Calculate limits and sigma from the first X samples in
a set and apply to all future samples in the same Set.
- Use the values returned in the Query with
specification and limit column types assigned to them
and apply them to the same Set.
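The third option (limits from the first X samples of a Set, applied to later samples of the same Set) can be sketched as follows. This Python example assumes simple mean plus or minus 3 sigma limits on individual values; the actual SEWSS calculations are not specified here, and the Set Names and data are invented.

```python
from collections import defaultdict
from statistics import mean, stdev


def limits_by_set(samples, first_x):
    """samples: (set_name, value) pairs in collection order."""
    by_set = defaultdict(list)
    for set_name, value in samples:
        by_set[set_name].append(value)
    limits = {}
    for set_name, values in by_set.items():
        baseline = values[:first_x]  # only the first X samples define the limits
        center, sigma = mean(baseline), stdev(baseline)
        limits[set_name] = (center - 3 * sigma, center + 3 * sigma)
    return limits


def out_of_limits(samples, limits):
    """Check every sample against the limits of its own Set."""
    return [(s, v) for s, v in samples if not limits[s][0] <= v <= limits[s][1]]


samples = [
    ("Product A", 10.0), ("Product A", 10.2), ("Product A", 9.8),
    ("Product B", 50.0), ("Product B", 51.0), ("Product B", 49.0),
    ("Product A", 10.1), ("Product B", 60.0),
]
limits = limits_by_set(samples, first_x=3)
flagged = out_of_limits(samples, limits)
```

Because each Set carries its own limits, a value that is normal for one product is not judged against the limits of another.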
21 CFR Part 11 Compliance
Extensions to SEWSS offer options to better
keep track of SEWSS users' activities and to increase
the administrator's control over the way in which SEWSS
is being used. These features are also required for complete
compliance in a 21 CFR Part 11 environment. They include
logging of system changes, implementing a Windows integrated
logon environment, and locking Spreadsheets and Graphs from
modifications.
New Products and Analysis Modules:
STATISTICA NIPALS Algorithm (PCA/PLS) - an implementation of
a number of techniques known as Principal Component Analysis
(PCA) and Partial Least Squares (PLS). In STATISTICA,
PCA and PLS are implemented using the state-of-the-art
NIPALS algorithm (Nonlinear Iterative Partial Least
Squares), a mathematical procedure designed to extract
systematic variations, relationships, and information from
datasets. STATISTICA NIPALS simplifies the analysis
at hand while effectively combating the curse of high
dimensionality (typically present when the number of
variables is large). STATISTICA NIPALS is also
particularly suited for use in data diagnostics, making it
an ideal tool for Quality Control in many areas of
science and technology, for example the pharmaceutical,
biochemical, and semiconductor industries. Important features
include:
- Scalability: The ability to handle datasets with a very
large number of variables.
- Data diagnostics and inter-variable relations: Capable
of applying PCA to data diagnostics, while also using
Partial Least Squares for relating a number of
predictors to a set of outcome variables (whether in a
classification or a regression problem).
- Integrated Graphical Analysis: Wide selection of
integrated graphical techniques including batches
plotted in the component space, importance plot of
components, and univariate and multivariate QC Charts.
- Cross-validation: Integrated options for
cross-validation to evaluate the number of components to
extract.
- Quality Control: Wide selection of univariate and
multivariate QC Charts for offline analysis or for
automatic updating as new data are collected.
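For orientation, the core NIPALS iteration for PCA can be written in a few lines. The Python sketch below is the textbook procedure (alternate between estimating loadings and scores until the scores stop changing, then deflate the data), written without external libraries; it is not the STATISTICA implementation, it omits scaling, missing-data handling, and PLS, and the function name and data are invented.

```python
import math


def nipals_pca(data, n_components, tol=1e-10, max_iter=500):
    """Extract principal components one at a time with the NIPALS iteration."""
    n, p = len(data), len(data[0])
    col_means = [sum(row[j] for row in data) / n for j in range(p)]
    X = [[row[j] - col_means[j] for j in range(p)] for row in data]  # mean-center
    scores, loadings = [], []
    for _ in range(n_components):
        # Start from the column with the largest sum of squares.
        start = max(range(p), key=lambda j: sum(X[i][j] ** 2 for i in range(n)))
        t = [X[i][start] for i in range(n)]
        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            # Loadings: regress each column of X on the current scores.
            load = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
            norm = math.sqrt(sum(v * v for v in load))
            load = [v / norm for v in load]
            # Scores: project each row of X onto the loadings.
            t_new = [sum(X[i][j] * load[j] for j in range(p)) for i in range(n)]
            change = math.sqrt(sum((a - b) ** 2 for a, b in zip(t_new, t)))
            t = t_new
            if change < tol:
                break
        scores.append(t)
        loadings.append(load)
        # Deflate: remove the extracted component before finding the next one.
        X = [[X[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
    return scores, loadings


# Hypothetical data lying exactly on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
scores, loadings = nipals_pca(data, n_components=1)
```

Because components are extracted one at a time, the procedure can stop after a few components even when the number of variables is very large, which is the scalability property listed above.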
STATISTICA Sequence, Association and Link Analysis - this
new, stand-alone product addresses the needs of clients in
retailing, banking, insurance, and similar industries by
implementing the fastest known, highly scalable sequence
analysis algorithm with the ability to derive Association and
Sequence rules in a single analysis. Furthermore, the
program is a stand-alone application that can be
used for model building and deployment.
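As background on what an association rule is, the Python sketch below derives simple one-item rules from a handful of transactions using the usual support and confidence measures. It is a brute-force illustration of the concept only: it does not handle sequences, and it is not the algorithm used by the product. The function name, thresholds, and transactions are invented.

```python
from itertools import combinations


def association_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)
    baskets = [set(t) for t in transactions]
    items = sorted(set().union(*baskets))

    def support(itemset):
        # Fraction of transactions containing every item in the itemset.
        return sum(1 for b in baskets if itemset <= b) / n

    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = pair_support / support({antecedent})
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, pair_support, confidence))
    return rules


transactions = [
    ["bread", "butter"],
    ["bread", "butter", "jam"],
    ["bread", "jam"],
    ["butter"],
]
rules = association_rules(transactions, min_support=0.5, min_confidence=0.6)
```

Each rule reads "transactions containing the antecedent also contain the consequent", with support measuring how common the pair is and confidence how reliable the implication is.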
STATISTICA Multivariate Statistical Process Control (MSPC)
- this new, stand-alone product (available in enterprise,
client-server versions) is designed for advanced process
control applications in many industries, including
pharmaceutical, chemical and bio-chemical, food production
and others; it provides the widest selection of univariate
and multivariate techniques for statistical process control
applications. Analytic capabilities include, among many
others:
- Partial Least Squares - comprehensive implementation
of NIPALS algorithm for partial least squares regression
including hierarchical PLS and multi-way PLS.
- Principal Components - comprehensive implementation of
NIPALS algorithm for Principal Components Analysis
including hierarchical PCA and multi-way PCA.
- Scalable to hundreds of thousands of parameters,
including process parameters, in-process tests, and
finished product tests.
- Integrated Graphical Analysis - wide selection of
integrated graphical techniques including batches
plotted in the component space, importance plot of
components, and univariate and multivariate QC Charts.
- Cross-validation - integrated options for
cross-validation to evaluate the number of components to
extract.
- Quality Control - wide selection of univariate and
multivariate QC Charts for offline analysis or for
automatic updating as new data are collected.
Random Forests - this new
module of STATISTICA Data Miner applications offers
cutting-edge techniques for building flexible models for
classification and regression; particularly well-suited for
extremely large numbers of predictor variables.