About Coquery
Features
Coquery is a free corpus query tool for linguists, lexicographers,
translators, and anybody who wishes to search and analyse a text corpus.
Corpora
- Use the corpus manager to install one of the supported corpora
- Build your own corpus from PDF, MS Word, OpenDocument, HTML, or plain text files
- Filter your query, for example by year, genre, or speaker gender
- Choose which corpus features will be included in your query results
- View every token that matches your query within its context
Queries
- Match tokens by orthography, phonetic transcription, lemma, or gloss, and restrict your query by part-of-speech
- Use string functions, e.g. to test whether a token contains a letter sequence
- Use the same query syntax for all installed corpora
- Automate queries by reading them from an input file
- Store your results as CSV files or Praat TextGrid files (for time-annotated corpora)
Analysis
- Summarize the query results as frequency or contingency tables
- Create a G-test matrix for query results to detect statistically significant differences
- Run statistical tests of independence, and estimate the effect sizes
- Calculate entropies and relative or normalized frequencies
- Fetch collocations, and calculate association statistics like mutual information scores or conditional probabilities
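
The statistics named in this list are standard measures, so a short sketch may help show what they compute. The code below is not Coquery's own implementation. It is a minimal standard-library illustration, and every function name in it is hypothetical. It assumes raw token counts are already available, for example from an exported frequency table.

```python
import math


def per_million(count, corpus_size):
    """Normalized frequency: occurrences per one million tokens."""
    return count * 1_000_000 / corpus_size


def entropy(counts):
    """Shannon entropy (in bits) of a frequency distribution."""
    total = sum(counts.values())
    return -sum(
        (c / total) * math.log2(c / total)
        for c in counts.values()
        if c > 0
    )


def pointwise_mutual_information(pair_count, w1_count, w2_count, n):
    """PMI (in bits) of a word pair, given co-occurrence and single counts.

    n is the total number of observations the counts are drawn from.
    """
    return math.log2((pair_count * n) / (w1_count * w2_count))


def g_statistic(table):
    """G-test statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # empty cells contribute nothing to the sum
                expected = row_totals[i] * col_totals[j] / n
                g += observed * math.log(observed / expected)
    return 2 * g
```

A table whose rows have identical proportions, such as `[[10, 10], [10, 10]]`, gives a G statistic of zero. Larger values indicate a stronger departure from independence and would normally be compared against a chi-squared distribution.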
Visualizations
- Use bar charts, heat maps, or bubble charts to visualize frequency distributions
- Illustrate diachronic changes by using time series plots
- Show the distribution of tokens within a corpus in a barcode or a beeswarm plot
Databases
- Either use easy-to-use internal databases, or connect to a powerful MySQL server
- Access large corpora on a MySQL server over the network
- Link data tables from different corpora, e.g. to include phonetic transcriptions in a corpus that does not contain them
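
Linking tables across corpora, as described in the last item, amounts to a relational join. The sketch below shows the general idea with Python's built-in `sqlite3` module and an in-memory database. The table names, column names, and sample rows are invented for the illustration and do not reflect Coquery's actual database schema.

```python
import sqlite3


def link_transcriptions():
    """Join a token table to a separate lexicon that holds transcriptions."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one table per "corpus".
    conn.executescript(
        """
        CREATE TABLE corpus_tokens (token_id INTEGER PRIMARY KEY, word TEXT);
        CREATE TABLE lexicon (word TEXT PRIMARY KEY, transcription TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO corpus_tokens (word) VALUES (?)",
        [("cat",), ("sat",), ("xyzzy",)],
    )
    conn.executemany(
        "INSERT INTO lexicon (word, transcription) VALUES (?, ?)",
        [("cat", "kæt"), ("sat", "sæt")],
    )
    # LEFT JOIN keeps every token, even when no transcription is found.
    rows = conn.execute(
        """
        SELECT t.word, l.transcription
        FROM corpus_tokens AS t
        LEFT JOIN lexicon AS l ON l.word = t.word
        ORDER BY t.token_id
        """
    ).fetchall()
    conn.close()
    return rows
```

The left join is a deliberate choice here: tokens without a match in the linked table stay in the result with an empty transcription and are not silently dropped.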
Supported corpora
Coquery already has installers for the following linguistic corpora:
Note that in order to use these corpora, you first need to obtain the corpus
data from the linked websites.
If a corpus is missing from the list of supported corpora, you can
either write a custom installer for it, or contact the Coquery
developer to ask whether an installer for it could be included
in a future release of Coquery.
License
Coquery is free software released under the terms of the
GNU General Public License (version 3). This license gives you
the freedom to use Coquery for any purpose. It also allows you to copy,
modify, and redistribute the software, as long as the modified software is
also licensed under the GNU GPL.