XXQ - an informal introduction
1. Searching for words
There are a couple of options you can use with lemma queries. In XXQ, as in many XML vocabularies, options are carried by attributes. The convention in XXQ is that an attribute which is either on or off takes the value true or false.
ignorecase tells Xaira to search for any word with the spelling supplied, irrespective of capitalisation. This option is true by default, so if you want to search only for the form exactly as spelt, you must turn it off, as in:
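The effect of this option can also be illustrated outside Xaira. The following is a minimal Python sketch, not XXQ and not Xaira's implementation: the function find_forms, its ignorecase parameter and the word list are hypothetical names chosen to mirror the description above.

```python
# Hypothetical stand-in for a corpus word list; Xaira would consult its index.
words = ["Parrot", "parrot", "PARROT", "carrot"]

def find_forms(spelling, candidates, ignorecase=True):
    """Return the candidates matching `spelling`.

    ignorecase=True (the default, as described for XXQ) disregards
    capitalisation; ignorecase=False keeps only the form exactly as spelt.
    """
    if ignorecase:
        return [w for w in candidates if w.casefold() == spelling.casefold()]
    return [w for w in candidates if w == spelling]

any_case = find_forms("Parrot", words)                     # default behaviour
exact_only = find_forms("Parrot", words, ignorecase=False)  # option turned off
print(any_case)
print(exact_only)
```

With the default setting all three capitalisations of the word are returned; with the option turned off only the capitalised form survives.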
pattern tells Xaira to treat the search value as a regular expression; by default it is false. Regular expressions are a topic in their own right and there isn’t space to discuss them here. For many purposes all you need to know is that a dot matches any single character, and that a character followed by * may be repeated 0 or more times, by + 1 or more times, and by ? 0 or 1 times. To search for both singular and plural parrots the following query would work:
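The four rules just described can be tried out independently of Xaira. The sketch below uses Python's re module purely as a stand-in; Xaira's own regular expression dialect may differ in detail. The word list and the helper matching are hypothetical, and the use of re.fullmatch assumes that a pattern has to match the whole word.

```python
import re

# Hypothetical stand-in for indexed word forms.
words = ["parrot", "parrots", "parrotss", "carrot", "parent"]

def matching(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [w for w in candidates if re.fullmatch(pattern, w)]

optional_s = matching(r"parrots?", words)  # ? : the s occurs 0 or 1 times
star_s = matching(r"parrots*", words)      # * : the s occurs 0 or more times
plus_s = matching(r"parrots+", words)      # + : the s occurs 1 or more times
any_first = matching(r".arrot", words)     # . : any single first character

print(optional_s)
print(star_s)
print(plus_s)
print(any_first)
```

The pattern parrots? is the one that picks out exactly the singular and the plural form, which is why ? is the natural choice for this kind of search.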
Remember that if you want to use one of the magic regular expression characters as a character in its own right, you must escape it by preceding it with a backslash (\). The following two queries find the same words:
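The equivalence between a plain search and an escaped pattern can be checked in Python, again only as a stand-in for Xaira's regular expression handling. The word forms and the helper matching are hypothetical; re.escape is a standard library function that inserts the backslashes for you.

```python
import re

# Hypothetical word forms, some containing a literal dot.
forms = ["etc.", "etcx", "U.S.", "UXSX"]

def matching(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [w for w in candidates if re.fullmatch(pattern, w)]

# Plain (non-pattern) search: simple string comparison.
literal = [w for w in forms if w == "etc."]

# Pattern search with the dot escaped: finds the same word.
escaped = matching(r"etc\.", forms)

# Pattern search with the dot left unescaped: the dot matches any character.
unescaped = matching(r"etc.", forms)

# re.escape adds the backslashes automatically.
auto_escaped = matching(re.escape("U.S."), forms)

print(literal, escaped, unescaped, auto_escaped)
```

Leaving the dot unescaped also lets in etcx, which is the kind of surprise the escaping rule is meant to prevent.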
(b) the person who builds a corpus may decide to override the normal rules for word breaking using special tags. In the British National Corpus (BNC), for example, where words are tagged to show their part of speech, words that are run together orthographically are indexed separately: the word can’t, for example, is indexed as ca and n’t to show that ca (=can) is a verb and n’t (=not) is the negative particle. So