Change log

0.10.0 “ICAME38” May 23, 2017

This release is a considerable update over earlier versions. These are the main areas that have seen large changes:

the interface has been redesigned and streamlined
a completely new data management system, including a revised way of handling functions
the new visualization designer allows interactive construction of figures from the query results

Here is an incomplete list of the things that have changed or been added.

Interface

interface redesign:
- move column selection to the left as ‘Data features’
- place Query button more prominently in the middle, and rename it
- replace Aggregation widget by ‘Data management’ toolbox
add new dialogs to the ‘Help’ menu:
- regular expression tester
- ‘How to cite’ dialog
- module information dialog
simplify query file widget
collect hidden columns in a side bar
there is now a search widget
change ‘Query’ button to ‘New query’
make keyboard shortcuts more consistent
add value substitutions
improve TextGrid export features
use Icons8 icon set
add user data columns
external links are now persistent when changing the corpus or quitting the program
group and summary functions are now saved on quitting the program
greatly improve the speed of browsing the results table
placeholder for empty cells is now configurable

Data management

the displayed context can now be changed after the query
add option to restrict contexts to sentence boundaries
functions are now added to columns in the results table
add completely revised filter dialog
add completely revised functions dialog
introduce data groups, which can split the results into subsets and allow functions to act only on the subsets
introduce Group and Summary functions (the latter replaces the Statistics special table)
add new data functions (Subcorpus size, Frequency ptw, Normalized frequency, Number of matches, Number of unique matches, Row number, Type-token ratio)
add new string functions (CHAIN, UPPER, LOWER), use regex for COUNT
make regular expression function generally more robust
add logical functions (ADD, EQUAL, GREATER, GREATEREQUAL, LESS, LESSTHAN, NOTEQUAL, AND, OR, XOR)
add G-test matrix (using corrected probabilities for highlighting)
add test statistics (log-likelihood test, chi-square test) and effect sizes (phi coefficient, odds ratio)
show both left and right conditional probabilities in Collocations aggregation
add stopword lists for many languages
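
Several of the data functions and effect sizes listed above are simple ratios. The following is a minimal sketch of their textbook definitions, not the program's actual implementation; all function names here are hypothetical, and empty or zero-count input is not handled.

```python
from math import sqrt


def frequency_ptw(matches, subcorpus_size):
    """Matches per thousand words of the (sub)corpus."""
    return 1000 * matches / subcorpus_size


def type_token_ratio(tokens):
    """Number of distinct tokens divided by the total number of tokens."""
    return len(set(tokens)) / len(tokens)


def odds_ratio(a, b, c, d):
    """Odds ratio of the 2x2 contingency table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def phi_coefficient(a, b, c, d):
    """Phi coefficient of the 2x2 contingency table [[a, b], [c, d]]."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


print(frequency_ptw(5, 2000))                                 # 2.5
print(type_token_ratio(["the", "cat", "saw", "the", "dog"]))  # 0.8
```
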

Visualization

introduce the Visualization designer
add new visualizations: heatbar plot, regression plot, scatter plot, violin plot, box-whisker plot
allow vertical plots where sensible

Corpora

add reference corpus support
provide functions that use the reference corpus (Keyness LL, Keyness %DIFF, as well as frequency functions in the reference corpus)
major changes for corpora that provide audio (currently, only Buckeye is supported):
- add spectrogram and waveform contexts
- store audio in databases
- allow audio playback
improve segment lookup in corpora that contain segments
allow building ‘corpora’ from CSV files
add support for encoding detection when reading plain text files
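
The two keyness functions named above compare the frequency of an item in the queried corpus with its frequency in the reference corpus. The sketch below uses the commonly cited formulas for log-likelihood keyness and %DIFF. It assumes that these are the definitions the program follows, which the change log does not state, and the function names are hypothetical.

```python
from math import log


def keyness_ll(freq, size, ref_freq, ref_size):
    """Log-likelihood keyness of an item that occurs `freq` times in a
    corpus of `size` tokens and `ref_freq` times in a reference corpus
    of `ref_size` tokens."""
    total = freq + ref_freq
    # Expected frequencies if the item were equally common in both corpora.
    expected = size * total / (size + ref_size)
    ref_expected = ref_size * total / (size + ref_size)
    ll = 0.0
    # A zero frequency contributes nothing to the sum.
    if freq > 0:
        ll += freq * log(freq / expected)
    if ref_freq > 0:
        ll += ref_freq * log(ref_freq / ref_expected)
    return 2 * ll


def keyness_pdiff(freq, size, ref_freq, ref_size):
    """%DIFF: percentage difference between the normalized frequencies."""
    if ref_freq == 0:
        raise ValueError("%DIFF is undefined for a zero reference frequency")
    norm = freq / size
    ref_norm = ref_freq / ref_size
    return 100 * (norm - ref_norm) / ref_norm


print(keyness_ll(100, 10000, 50, 10000))
print(keyness_pdiff(100, 10000, 50, 10000))
```

Log-likelihood measures how reliable the frequency difference is, while %DIFF measures its size, which is why the two are usually reported together.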

Queries

internal change: rewrite SQL code generator, which speeds up multi-word queries
allow regular expression queries (can be activated in the settings)
introduction of _NULL special query item (issue #97)
introduction of _PUNCT special query item as a placeholder for any punctuation mark
add query cache that can speed up repeated queries (experimental)

Test coverage

(only the core modules are reported)

corpus.py 44%
corpusbuilder.py 19%
filters.py 78%
functionlists.py 62%
functions.py 59%
links.py 68%
managers.py 36%
queries.py 28%
session.py 26%
tables.py 19%
textgrids.py 54%
tokens.py 89%

0.9.2a September 1, 2017

fix issue with desktop icon on Windows

0.9.2 May 1, 2016

fix issue with NLTK module detection
add support for .docx, .odt, and HTML files when building a corpus
allow query results from speech corpora to be saved as Praat TextGrids
Brown installer: added
Buckeye installer: now use lemma transcripts for lemma query items
COHA installer: fix issue with file names
Switchboard installer: provide full conversation and speaker information

0.9.1 March 22, 2016

fix issue in Buckeye and CELEX2 installers
add module information to About dialog

0.9 March 21, 2016

initial public release