Searching the Corpora: Example Queries
The search and visualization tool ANNIS is the most powerful way to use the texts for research purposes. The sample queries below demonstrate some of the kinds of searches you can construct. ANNIS queries are written in the ANNIS query language, which also accepts regular expressions. If you are familiar with ANNIS or regular expressions, jump right in. If not, you may wish to try some of the sample queries and then substitute terms or search parameters to adapt them to your needs and learn the system. After clicking on the magnifying glass, you will be taken to a new page with the ANNIS query and results. The query appears in the box on the upper left, the corpus or corpora you are searching are selected on the lower left, and your search results appear in the panel on the right.
Note: our language tagger tags for the oldest language of origin, including Hebrew. For *all* loan words, use the following query to capture all language tags:
Also, a compound word containing both Greek and Coptic contains a language tag only for the Greek morph within the compound. Hence, we use syntax for finding overlapping search fields ("_o_") rather than equivalent fields ("_=_").
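To make the difference between the two operators concrete, here is a minimal Python sketch. It is not part of ANNIS: the `Span` class, the helper functions, and the token positions are hypothetical and only model the idea of identical coverage ("_=_") versus overlap ("_o_") for a compound word whose language tag covers just one morph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical annotation span covering tokens start..end (inclusive)."""
    start: int
    end: int


def identical_coverage(a: Span, b: Span) -> bool:
    """Models the idea behind _=_: both spans cover exactly the same tokens."""
    return a.start == b.start and a.end == b.end


def overlaps(a: Span, b: Span) -> bool:
    """Models the idea behind _o_: the spans share at least one token."""
    return a.start <= b.end and b.start <= a.end


# Assumed layout: a compound word covers tokens 0-1, but only the
# second morph (token 1) carries a language tag.
word = Span(0, 1)
lang_tag = Span(1, 1)

print(identical_coverage(word, lang_tag))  # False: the tag is narrower than the word
print(overlaps(word, lang_tag))            # True: they share token 1
```

Under these assumptions, a search that requires identical coverage would miss the compound, while a search that only requires overlap would find it.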
- Search for focalizing converters in Besa's letters:
pos="CFOC"
- Search for the lemma "ⲥⲟⲛ" in Shenoute's writings:
lemma="ⲥⲟⲛ"
Compare to a search for the (normalized) word "ⲥⲟⲛ" in Shenoute's writings:
norm="ⲥⲟⲛ"
- Look for locational expressions in the Apophthegmata Patrum corpus (currently a subset of the corpus is annotated for locational expressions):
entity="place"
- Find some mentions of the following terms of kinship in the translation of Abraham Our Father:
translation=/.*([Mm]other|[Bb]rother|[Ff]ather|[Ss]ister|[Ss]on|[Dd]aughter).*/
- Search for the normalized word ϯ in the Coptic Treebank but not as a verb:
pos!="V" _=_ norm="ϯ"
- Search for lines ending with a letter written in small print in Besa's letters:
hi_rend=/.*small.*/ _r_ lb_n
- Filter by metadata to see how many lines of Abraham Our Father don't come from the manuscript MONB.YA:
lb_n @* msName!="MONB.YA"
- Find words with the morpheme ⲙⲛⲧ- in Besa's letters and Shenoute's Acephalous Work 22:
morph="ⲙⲛⲧ"
- Find common nouns referring back to proper names in the Apophthegmata Patrum corpus (currently a subset of the corpus is annotated for coreference between these entities):
pos="N" & pos="NPROP" & entity & entity & #3 ->coref[type=/diff|appos/] #4 & #3 _r_ #1 & #4 _r_ #2
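If regular expressions are new to you, the kinship-term query in the list above is a good one to experiment with. The sketch below tries the same pattern in Python. It assumes that a regular expression in an ANNIS query has to match the entire annotation value (which would explain the leading and trailing `.*`), and that Python's `re` module behaves closely enough to the engine ANNIS uses for a simple pattern like this one. The sample sentences and the `matches_translation` helper are invented for illustration and do not come from the corpus.

```python
import re

# The pattern from the kinship-term query, without the surrounding slashes.
KINSHIP = re.compile(
    r".*([Mm]other|[Bb]rother|[Ff]ather|[Ss]ister|[Ss]on|[Dd]aughter).*"
)


def matches_translation(value: str) -> bool:
    """Return True if the whole value matches the pattern.

    fullmatch is used to mimic the assumed whole-value matching in ANNIS.
    """
    return KINSHIP.fullmatch(value) is not None


# Invented sample translations, not taken from the corpus.
samples = [
    "Abraham our father",
    "He spoke to the brothers",
    "The person left",          # matches too: "son" occurs inside "person"
    "They went to the city",
]

for text in samples:
    print(text, "->", matches_translation(text))

# Without the surrounding .* the pattern no longer matches a longer value.
print(re.fullmatch(r"[Ff]ather", "Abraham our father") is not None)  # False
```

The third sample shows why the query finds only "some mentions" reliably: a plain substring pattern such as "son" also matches inside unrelated words, so results may need manual review.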