Corpora are collections of texts of naturally occurring language that linguists use to test their hypotheses about language use.
The British National Corpus consists of 100 million words of spoken and written language.
The CHILDES (Child Language Data Exchange System) corpus is a child language corpus for research on language acquisition.
The Corpus of Contemporary American English contains 440 million words of American English.
The Scottish Corpus of Texts and Speech (SCOTS) contains texts in Scots and Standard Scottish English.