Natural language processing using very large corpora
Susan Armstrong, Kenneth Ward Church, Pierre Isabelle, Sandra Manzi, Evelyne Tzoukermann, David Yarowsky

The 1990s have been an exciting time for researchers working with large collections of text. Text is available like never before. It was not all that long ago that researchers referred to the Brown Corpus as a 'large' corpus. The Brown Corpus, a 'mere' million words collected at Brown University in the 1960s, is about the same size as a dozen novels, the complete works of William Shakespeare, the Bible, a collegiate dictionary or a week of a newswire service. Today, one can easily surf the web and download millions of words in no time at all.

What can we do with all this data? It is better to do something simple than nothing at all. Researchers in large corpora are using basically brute force methods to make progress on some of the hardest problems in natural language processing, including part-of-speech tagging, word sense disambiguation, parsing, machine translation, information retrieval, and discourse analysis. They are overcoming the so-called knowledge-acquisition bottleneck by processing vast quantities of data, more text than anyone could possibly read in a lifetime, and estimating all sorts of 'central and typical' facts that any speaker of the language would be expected to know, e.g. word frequencies, word associations and typical predicate-argument relations.

Much of this work has been reported at a series of annual meetings, known as the Workshop on Very Large Corpora (WVLC) and related meetings sponsored by ACL/SIGDAT (Association for Computational Linguistics' special interest group on data). Subsequent meetings have been held in Asia (1994, 1997), America (1995, 1996, 1997) and Europe (1995, 1996). The papers in this book represent much of the best of the first three years of this workshop/conference as selected by a competitive review process.