Corpus Linguistics

[Please note: these pages are no longer maintained and may be out of date.]


These pages have been created as part of the W3-Corpora Project at the University of Essex.

 


Annotated Corpora

Apart from the plain text, a corpus can also carry additional linguistic information, called 'annotation'. Annotation can be of different kinds, for example prosodic, semantic or historical. The most common type of annotated corpus is the grammatically tagged corpus, in which each word has been assigned a word class label (part-of-speech tag). The Brown Corpus, the LOB Corpus and the British National Corpus (BNC) are examples of grammatically annotated corpora. The London-Lund Corpus (LLC) has been prosodically annotated. The Susanne Corpus is an example of a parsed corpus, that is, a corpus that has been syntactically analysed and annotated.

Annotated corpora constitute a very useful tool for research. In the Tutorial you can find examples of how to make use of the annotation when searching a corpus.
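
As a rough illustration of why part-of-speech tags help when searching, the Python sketch below separates the noun 'can' from the verb 'can' in a tiny tagged sample. The "word_TAG" layout, the tag names and the helper functions parse_tagged and find_by_tag are assumptions made for this example only; they do not reproduce the format or tagset of any of the corpora mentioned above.

```python
from typing import List, Tuple

# Hypothetical sample in a "word_TAG" layout. The tags are illustrative
# and are not taken from any particular corpus tagset.
SAMPLE = "The_DET dog_NOUN can_VERB open_VERB the_DET can_NOUN ._PUNCT"


def parse_tagged(text: str) -> List[Tuple[str, str]]:
    """Split whitespace-separated 'word_TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in text.split():
        # Split on the last underscore so words containing '_' stay intact.
        word, _, tag = token.rpartition("_")
        pairs.append((word, tag))
    return pairs


def find_by_tag(pairs: List[Tuple[str, str]], word: str, tag: str) -> List[int]:
    """Return the token positions where `word` occurs with the given tag."""
    return [
        i
        for i, (w, t) in enumerate(pairs)
        if w.lower() == word.lower() and t == tag
    ]


pairs = parse_tagged(SAMPLE)
noun_hits = find_by_tag(pairs, "can", "NOUN")  # positions of 'can' as a noun
verb_hits = find_by_tag(pairs, "can", "VERB")  # positions of 'can' as a verb
```

A search on the word form alone would return both occurrences of 'can'; restricting the search by tag returns only the reading of interest.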

Further information about corpus annotation and annotated corpora can be found, for example, in the book Corpus Annotation: Linguistic Information from Computer Text Corpora, or by using the following links:

W3-Corpora project: 1996-98. This page is no longer maintained.