Today, large corpora consisting of hundreds of millions or even billions of words, along with new empirical and statistical methods for organizing and analyzing these data, promise new insights into the use of language. Already, the data extracted from these large corpora reveal that language use is more flexible and complex than most rule-based systems have tried to account for, providing a basis for progress in the performance of Natural Language Processing...