NEWS
May 8, 2012 - Provalis Research releases three sentiment analysis dictionaries in WordStat format.
December 12, 2011 - Provalis Research releases version 4.0 of QDA Miner qualitative data analysis software.
August 30, 2010 - Provalis Research announces the release of WordStat 6.1 content analysis and text mining software.
April 21, 2010 - Provalis Research announces the release of WordStat 6 content analysis and text mining software.
WordStat v6.1
Content Analysis & Text Mining Software
LIST OF FEATURES
TEXT PROCESSING CAPABILITIES
- Content analysis on collections of large documents and on short alphanumeric variables (up to 255 characters).
- Dictionary-moderated lemmatization and stemming (English, French, Italian, German and Spanish; contact us for other languages).
- Ability to call an external text pre-processing EXE or DLL (a sample English Porter stemmer and an n-grams transformation are included).
- Optional exclusion of pronouns, conjunctions, etc., through user-defined exclusion lists (or stop lists).
- Categorization of words or phrases using existing or user-defined
content analysis dictionaries.
- Word categorization based on Boolean (AND, OR, NOT) and proximity rules (NEAR, AFTER, BEFORE).
- Word and phrase substitution and scoring using wildcards and
weighting.
- Frequency analysis on keywords, phrases, derived categories
or concepts, or user-defined codes entered manually within a text.
- Interactive development and easy maintenance of hierarchical
content analysis dictionaries, taxonomies, or categorization schema.
- Drag-and-drop editor for easy assignment of words and phrases to categories.
- Ability to restrict the analysis to specific portions of a text
or to exclude comments and annotations.
- Ability to perform a word frequency analysis on a random sample of cases (useful for large projects).
- Integrated spell-checking with support for more than 20 languages
such as English, French, Spanish, etc.
- Integrated thesauruses to assist the creation of
taxonomies and comprehensive categorization schemas (English, French, Spanish, Italian, Portuguese and German).
- Powerful case filtering on any numeric or alphanumeric field
and on code occurrence.
- Prints presentation-quality tables and graphics.
- Imports ANSI and Unicode text files, MS Word, WordPerfect, RTF, HTML and PDF documents.
- Exports any table to Excel, SPSS, ASCII, tab-separated or comma-separated value files, or HTML files.
- Flexible keyword highlighting (the text editor can display all categories using different colors).
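To make the dictionary-based categorization described above more concrete, here is a minimal Python sketch of the general technique: apply a stop list, then count tokens that match wildcard patterns in a user-defined dictionary. It is not WordStat's implementation or file format. The `STOP_LIST`, `DICTIONARY`, `tokenize` and `categorize` names, and the sample categories, are all hypothetical.

```python
import re
from collections import Counter
from fnmatch import fnmatchcase

# Hypothetical stop list and dictionary; patterns may use * as a wildcard.
STOP_LIST = {"the", "a", "and", "of", "it", "was", "but"}
DICTIONARY = {
    "POSITIVE": ["good", "great", "excel*"],
    "NEGATIVE": ["bad", "poor*", "terribl*"],
}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def categorize(text, dictionary=DICTIONARY, stop_list=STOP_LIST):
    """Count how many non-excluded tokens fall into each dictionary category."""
    counts = Counter()
    for token in tokenize(text):
        if token in stop_list:
            continue  # excluded word: never counted
        for category, patterns in dictionary.items():
            if any(fnmatchcase(token, pattern) for pattern in patterns):
                counts[category] += 1
    return counts

sample = "The service was excellent and the food was great, but parking was poor."
print(categorize(sample))
```

In this sample, "excellent" and "great" fall under POSITIVE and "poor" under NEGATIVE. A real dictionary would also need phrase entries and the Boolean and proximity rules listed above, which this sketch leaves out.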
UNIVARIATE WORD FREQUENCY ANALYSIS
VOCABULARY AND PHRASE EXTRACTION
- Vocabulary finder extracts technical terms, product and company names as well as common misspellings.
- Phrase extraction tool allows one to easily identify recurring phrases and idioms.

NORM CREATION AND COMPARISON
- Ability to create norm files based on word frequency analysis or on frequencies of content categories.
- Comparison of concept or word frequencies to previously saved norm files.
KEYWORD RETRIEVAL FUNCTION
KEYWORD CO-OCCURRENCE ANALYSIS
- Integrated clustering and dendrogram display of keyword co-occurrence.
- First- and second-order proximity analysis.
- Proximity plot to easily identify all keywords co-occurring
with a target word or content category.
- 2D and 3D multidimensional scaling on either joint frequency
or co-occurrence of words or categories.
- Flexible keyword co-occurrence criteria (within a case, a sentence,
a paragraph, a window of n words, a user-defined segment) as well
as clustering methods (first- and second-order proximity, choice
of similarity measures).
- Easy text retrieval from dendrograms or proximity plots allows one to drill down to the original sources.
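As an illustration of the "window of n words" co-occurrence criterion mentioned above, the following Python sketch counts keyword pairs that fall inside the same window. It is a simplified illustration under stated assumptions, not a description of how WordStat computes co-occurrence: the `cooccurrence` function is hypothetical, and two keywords are assumed to co-occur when their positions differ by fewer than `window` tokens.

```python
from collections import Counter

def cooccurrence(tokens, keywords, window=5):
    """Count unordered keyword pairs whose positions differ by fewer than `window` tokens."""
    positions = [(i, token) for i, token in enumerate(tokens) if token in keywords]
    pairs = Counter()
    for index, (i, first) in enumerate(positions):
        for j, second in positions[index + 1:]:
            if j - i >= window:
                break  # positions are sorted, so later keywords are even farther away
            if first != second:
                pairs[tuple(sorted((first, second)))] += 1
    return pairs

tokens = "price and quality matter but price beats service when quality drops".split()
counts = cooccurrence(tokens, {"price", "quality", "service"}, window=4)
print(counts)
```

The resulting pair counts form the kind of co-occurrence matrix that clustering or multidimensional scaling could then take as input. Other criteria (same sentence, paragraph or case) would only change how the text is segmented before counting.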
CASE OR DOCUMENT SIMILARITY ANALYSIS
- Hierarchical clustering, multidimensional scaling and proximity
plot may be used to explore the similarity between documents or
cases.
MULTIPLE RESPONSES AND COMPARISONS
- Univariate word frequency analysis and crosstabulation
on information stored in several text fields.
- Comparison of keyword frequency or occurrence between variables.
- Computes inter-rater agreement measures (percentage of agreement, Cohen's Kappa, Scott's Pi, Krippendorff's R and r-bar, free marginal) based on codes manually entered in different variables.
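For readers unfamiliar with the agreement measures listed above, here is a short Python sketch of two of them, percentage of agreement and Cohen's Kappa, using their standard textbook definitions. The function names and the sample codes are hypothetical, and the sketch makes no claim about WordStat's own computation or output.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Share of cases to which both raters assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both raters must code the same non-empty set of cases")
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = len(codes_a)
    p_observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement from each rater's own marginal code frequencies.
    p_expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    # Undefined (division by zero) when both raters always use one identical code.
    return (p_observed - p_expected) / (1 - p_expected)

rater_a = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
rater_b = ["pos", "neg", "neg", "neg", "pos", "pos", "pos", "neg"]
print(percent_agreement(rater_a, rater_b))  # 0.75
print(cohens_kappa(rater_a, rater_b))       # 0.5
```

Here the raters agree on 6 of 8 cases (0.75). Each rater uses both codes half the time, so chance agreement is 0.5 and Kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5.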
COMPARISONS BETWEEN SUBGROUPS AND TEMPORAL TREND ANALYSIS
- Comparison between any textual field and any nominal or ordinal variable (such as gender, age groups, etc.).
- Automatic transformation of date variables into week days, months, quarters of year, years, or decades for identification of temporal trends.
- Choice between 11 different association measures to assess the relationship between word occurrence and nominal or ordinal variables (Chi-square, Likelihood ratio, Tau-a, Tau-b, Tau-c, symmetric Somers' D, asymmetric Somers' Dxy and Dyx, Gamma, Pearson's R, Spearman's Rho).
- Computation of statistics on either absolute or relative frequencies.
- Ability to sort the matrix in alphabetical order of words, by word frequency or word occurrence, by the obtained statistic or by its probability.
- Visually compare items between subgroups using bar charts and
line charts.
- Correspondence analysis (statistics, 2D & 3D joint plots).
This feature is accessible from the crosstab page and allows
one to see graphically the relationship between nominal variables
and codes resulting from a content analysis.
- Heatmap plot (with dual-clustering of keywords and variables)

AUTOMATED DOCUMENT CLASSIFICATION
KEYWORD-IN-CONTEXT (KWIC)
- Ability to display a KWIC table to examine the textual context
of a word, word pattern, or content category.
- Ability to sort the KWIC table on any independent (numeric or categorical) variable.
- Ability to jump to the document in order to view the full context or edit the original document.
- KWIC list can be saved in data files (Excel, SPSS or delimited files) for further processing.
- Customizable KWIC display (paragraph, sentence or user-defined segment).
- Concordance report (displays all hits as a list of paragraphs, sentences or user-defined segments).
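The keyword-in-context idea above can be sketched in a few lines of Python. This is only an illustration of the general KWIC technique: the `kwic` function, its `width` parameter (number of context words on each side) and the use of a regular expression as the word pattern are assumptions, not WordStat features or syntax.

```python
import re

def kwic(text, pattern, width=3):
    """Return (left context, hit, right context) for every word matching `pattern`."""
    tokens = re.findall(r"\w+", text)
    regex = re.compile(pattern, re.IGNORECASE)
    rows = []
    for i, token in enumerate(tokens):
        if regex.fullmatch(token):
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows

text = "The analyst coded each interview. Coding took two weeks, and the codes were reviewed."
rows = kwic(text, r"cod\w*", width=2)
for left, hit, right in rows:
    # Align the hits in a middle column, as KWIC tables usually do.
    print(f"{left:>20} | {hit:^8} | {right}")
```

Each row keeps the hit with its surrounding words, so the rows can be sorted or exported like any other table. Showing a full sentence or paragraph instead of a fixed number of words would require segmenting the text first.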
FULL INTEGRATION WITH STATISTICAL AND QUALITATIVE ANALYSIS SOFTWARE
- Alphanumeric variables can be stored in the same file as all
other numeric and categorical variables.
- Variable selection, statistical analysis and content analysis
are performed within the same application program.
- Results of concept, keyword or word frequency analysis can be transformed into numerical or polynomial variables in the existing project or exported to disk into a new data file (Excel, SPSS, delimited files, etc.) for further statistical analysis (such as factor analysis, multiple regression, time series and other predictive modeling techniques).
- Ability to perform numeric and alphanumeric transformations or to apply filters on records of the data file to restrict the analysis to specific subgroups.
UTILITY PROGRAMS
- Document Conversion Wizard: utility program to easily import documents. Various file formats may be imported directly, such as:
- Plain text (ANSI, Unicode), HTML, RTF, MS Word, WordPerfect, Adobe PDF
- Optional removal of leading and trailing spaces and hard returns in text files.
- Extraction of numeric, alphanumeric
and date variables from structured documents.
- Extraction options may be saved on disk
and later retrieved.