Italiano (Italian) English (Inglese)
Thursday, 5 December 2024

Technical Reports

Technical Report Details
Authors:Paolo Ferragina
Travis Gagie
Giovanni Manzini
Scientific Area:Text Compression and Indexing
Title:Space-Conscious Data Indexing and Compression in a Streaming Model
Published on:TR-INF-2008-02-01-UNIPMN
Publisher:Computer Science Department, UPO
Year:2008
URL:http://www.di.unipmn.it...R-INF-2008-02-01-UNIPMN.pdf
Abstract:Compressed full-text indexes, data compressors, and streaming algorithms are three powerful tools for dealing with massive datasets. Surprisingly, algorithms for building compressed indexes (or files) are space-inefficient and/or incompatible with a streaming setting. Space-conscious streaming algorithms are crucial because current disk features make sequential disk accesses at least two orders of magnitude faster than random accesses. For the first time in the literature, we investigate string problems in the streaming model and prove upper and lower bounds on the complexity of building compressed indexes and compressing data in terms of the classic product: (internal- memory space) x (number of passes).