Change log

0.10.0 “ICAME38” May 23, 2017

This release is a considerable update over earlier versions. The main areas that have seen large changes are:

  • a redesigned and streamlined interface
  • a completely new data management system, including a revised way of handling functions
  • a new visualization designer that allows interactive construction of figures from the query results

The following list covers most, but not all, of the changes and additions.


Interface

  • interface redesign:
    • move column selection to the left as ‘Data features’
    • place Query button more prominently in the middle, and rename it
    • replace Aggregation widget by ‘Data management’ toolbox
  • add new dialogs to the ‘Help’ menu:
    • regular expression tester
    • ‘How to cite’ dialog
    • module information dialog
  • simplify query file widget
  • collect hidden columns in a side bar
  • there is now a search widget
  • change ‘Query’ button to ‘New query’
  • make keyboard shortcuts more consistent
  • add value substitutions
  • improve TextGrid export features
  • use Icons8 icon set
  • add user data columns
  • external links are now persistent when changing the corpus or quitting the program
  • group and summary functions are now saved on quitting the program
  • greatly improve the speed of browsing the results table
  • placeholder for empty cells is now configurable

Data management

  • the displayed context can now be changed after the query
  • add option to restrict contexts to sentence boundaries
  • functions are now added to columns in the results table
  • add completely revised filter dialog
  • add completely revised functions dialog
  • introduce data groups, which can split the results into subsets and allow functions to act only on the subsets
  • introduce Group and Summary functions (the latter replaces the Statistics special table)
  • add new data functions (Subcorpus size, Frequency ptw, Normalized frequency, Number of matches, Number of unique matches, Row number, Type-token ratio)
  • add new string functions (CHAIN, UPPER, LOWER), use regex for COUNT
  • make regular expression function generally more robust
  • add G-test matrix (using corrected probabilities for highlighting)
  • add test statistics (log-likelihood test, chi-square test) and effect sizes (phi coefficient, odds ratio)
  • show both left and right conditional probabilities in Collocations aggregation
  • add stopword lists for many languages
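
As a rough illustration of the chi-square test and the two effect sizes named in the list above, the following sketch computes them from a 2×2 contingency table. It is not taken from the program's source: the function name `two_by_two_stats` and the cell labels `a`, `b`, `c`, `d` are assumptions made for this example, and no continuity correction is applied.

```python
import math


def two_by_two_stats(a, b, c, d):
    """Return chi-square, phi and odds ratio for the table [[a, b], [c, d]].

    Hypothetical helper for illustration only. All four arguments are
    observed counts; every row and column total must be non-zero.
    """
    observed = [[a, b], [c, d]]
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]

    # Pearson chi-square: sum of (observed - expected)^2 / expected
    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed[i][j] - expected) ** 2 / expected

    # Phi coefficient: chi-square scaled by the sample size
    phi = math.sqrt(chi_square / n)

    # Odds ratio: infinite if one of the off-diagonal cells is empty
    odds_ratio = (a * d) / (b * c) if b * c else math.inf
    return chi_square, phi, odds_ratio


chi_square, phi, odds_ratio = two_by_two_stats(10, 20, 30, 40)
print(round(chi_square, 3), round(phi, 3), round(odds_ratio, 3))
```

The log-likelihood test uses the same expected frequencies but sums `observed * log(observed / expected)` over the cells instead.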


Visualization

  • introduce the Visualization designer
  • add new visualizations: heatbar plot, regression plot, scatter plot, violin plot, box-whisker plot
  • allow vertical plots where sensible


Corpora

  • add reference corpus support
  • provide functions that use the reference corpus (Keyness LL, Keyness %DIFF, as well as frequency functions in the reference corpus)
  • major changes for corpora that provide audio (currently, only Buckeye is supported):
    • add spectrogram and waveform contexts
    • store audio in databases
    • allow audio playback
  • improve segment lookup in corpora that contain segments
  • allow building ‘corpora’ from CSV files
  • add support for encoding detection when reading plain text files
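
The two keyness functions mentioned in the list above compare the frequency of an item in the queried corpus with its frequency in the reference corpus. The sketch below shows the textbook formulas for a log-likelihood keyness score and for %DIFF. Whether the program uses exactly these variants is an assumption; the function and parameter names are hypothetical.

```python
import math


def keyness_ll(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood keyness of an item (two-term form; illustrative only)."""
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref
    # Expected frequencies if the item were equally common in both corpora
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size

    ll = 0.0
    for observed, expected in ((freq_target, expected_target),
                               (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


def keyness_pct_diff(freq_target, size_target, freq_ref, size_ref):
    """%DIFF: relative difference of the normalized frequencies, in percent.

    Requires freq_ref > 0, otherwise the division is undefined.
    """
    norm_target = freq_target / size_target
    norm_ref = freq_ref / size_ref
    return (norm_target - norm_ref) * 100 / norm_ref


# An item that occurs 200 times in 10,000 tokens of the queried corpus
# and 100 times in 10,000 tokens of the reference corpus
print(keyness_ll(200, 10000, 100, 10000))
print(keyness_pct_diff(200, 10000, 100, 10000))
```

The log-likelihood score indicates how reliable the frequency difference is, while %DIFF expresses its size, so the two are usually read together.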


Queries

  • internal change: rewrite SQL code generator, which speeds up multi-word queries
  • allow regular expression queries (can be activated in the settings)
  • introduce the _NULL special query item (issue #97)
  • introduce the _PUNCT special query item as a placeholder for any punctuation mark
  • add query cache that can speed up repeated queries (experimental)
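
To illustrate how a query cache can speed up repeated queries, here is a minimal sketch of a least-recently-used cache keyed by the query string. It does not reflect the program's actual implementation: the class `QueryCache`, its `max_size` parameter, and the `run_query` callback are all hypothetical names chosen for this example.

```python
from collections import OrderedDict


class QueryCache:
    """Keep the results of the most recently used queries (illustrative only)."""

    def __init__(self, max_size=128):
        self.max_size = max_size
        self._entries = OrderedDict()

    def get(self, query_string, run_query):
        """Return the cached result, or run the query and store its result."""
        if query_string in self._entries:
            # Cache hit: mark the entry as most recently used
            self._entries.move_to_end(query_string)
            return self._entries[query_string]

        # Cache miss: execute the (expensive) query and remember the result
        result = run_query(query_string)
        self._entries[query_string] = result
        if len(self._entries) > self.max_size:
            # Drop the least recently used entry
            self._entries.popitem(last=False)
        return result


executed = []


def fake_query(query_string):
    """Stand-in for a real corpus query; records each actual execution."""
    executed.append(query_string)
    return query_string.upper()


cache = QueryCache(max_size=2)
cache.get("the *", fake_query)
cache.get("the *", fake_query)  # served from the cache, not executed again
print(executed)
```

A real cache would also have to be invalidated when the corpus or the query settings change, which this sketch leaves out.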

Test coverage

(only the core modules are reported)

44% 19% 78% 62% 59% 68% 36% 28% 26% 19% 54% 89%

0.9.2a September 1, 2016

  • fix issue with desktop icon on Windows

0.9.2 May 1, 2016

  • fix issue with NLTK module detection
  • add support for .docx, .odt, and HTML files when building a corpus
  • allow query results from speech corpora to be saved as Praat TextGrids
  • Brown installer: added
  • Buckeye installer: now use lemma transcripts for lemma query items
  • COHA installer: fix issue with file names
  • Switchboard installer: provide full conversation and speaker information

0.9.1 March 22, 2016

  • fix issue in Buckeye and CELEX2 installers
  • add module information to About dialog

0.9 March 21, 2016

  • initial public release