High Performance Multidimensional Analysis and Data Mining

Goil S., Choudhary A.

Summary information from data in large databases is used to answer queries in On-Line Analytical Processing (OLAP) systems and to build decision support systems over them. The Data Cube is used to calculate and store summary information on a variety of dimensions; when the number of dimensions is large, it is computed only partially. Queries posed on such systems are quite complex and require different views of data. These may either be answered from a materialized cube in the data cube or calculated on the fly. Further, data mining for associations can be performed on the data cube.

Analytical models need to capture the multidimensionality of the underlying data, a task for which multidimensional databases are well suited. They are also amenable to parallelism, which is necessary to deal with large (and still growing) data sets. Multidimensional databases store data in a multidimensional structure on which analytical operations are performed. A challenge for these systems is how to handle large data sets in a large number of dimensions. These techniques are also applicable to scientific and statistical databases (SSDB), which employ large multidimensional databases and dimensional operations over them.

In this paper we present: (1) a parallel infrastructure for OLAP multidimensional databases integrated with association rule mining; (2) the Bit-Encoded Sparse Structure (BESS) for sparse data storage in chunks; (3) scheduling optimizations for parallel computation of complete and partial data cubes; (4) an implementation of a large-scale multidimensional database engine suitable for dimensional analysis used in OLAP and SSDB for (a) a large number of dimensions (20-30) and (b) large data sets (tens of gigabytes).

Our implementation on the IBM SP-2 can handle large data sets and a large number of dimensions by using disk I/O. Results are presented showing its performance and scalability.
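To make the notion of complete and partial data cubes concrete, here is a minimal sequential sketch in Python. It is not the authors' parallel implementation and does not use BESS or chunked storage. The function name `compute_cube`, the `max_dims` cutoff, and the sample `sales` rows are all assumptions made for illustration. In particular, limiting the cube to group-bys of at most `max_dims` dimensions is only one possible way to define a partial cube.

```python
from collections import defaultdict
from itertools import combinations


def compute_cube(rows, dims, measure, max_dims=None):
    """Sum `measure` for every group-by over subsets of `dims`.

    Returns {group_by_dims: {key_tuple: total}}. If `max_dims` is given,
    only group-bys with at most that many dimensions are computed
    (a hypothetical definition of a partial cube).
    """
    if max_dims is None:
        max_dims = len(dims)
    cube = {}
    for k in range(max_dims + 1):
        for group in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in group)
                totals[key] += row[measure]
            cube[group] = dict(totals)
    return cube


# Hypothetical fact table with three dimensions and one measure.
sales = [
    {"product": "pen", "store": "north", "year": 1997, "units": 10},
    {"product": "pen", "store": "south", "year": 1997, "units": 5},
    {"product": "ink", "store": "north", "year": 1998, "units": 7},
]
DIMS = ("product", "store", "year")

full_cube = compute_cube(sales, DIMS, "units")                  # all 2^3 group-bys
partial_cube = compute_cube(sales, DIMS, "units", max_dims=1)   # grand total + 1-D group-bys
```

With d dimensions the complete cube has 2^d group-bys, so at the 20-30 dimensions mentioned in the abstract a full computation becomes impractical. This naive version also rescans the input for every group-by, whereas practical systems typically derive coarser aggregates from finer ones.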