From f5708470d3df7ea06a0d58da7821d1f004379ca5 Mon Sep 17 00:00:00 2001
From: Manos Athanassoulis
Date: Fri, 21 Jun 2024 11:11:53 -0400
Subject: [PATCH] minor fixes

---
 hdms/2024/program.html | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/hdms/2024/program.html b/hdms/2024/program.html
index a58e85c..b4b43d2 100644
--- a/hdms/2024/program.html
+++ b/hdms/2024/program.html
@@ -366,10 +366,10 @@

Download the program 1:10 - 1:20pm Chimp: Efficient Lossless Floating Point Compression for Time Series Databases
Liakos, Panagiotis; Papakonstantinopoulou, Katia; Kotidis, Yannis
-
+ Applications in diverse domains such as astronomy, economics and industrial monitoring increasingly press the need for analyzing massive collections of time series data. The sheer size of the latter hinders our ability to store them efficiently and also yields significant storage costs. Applying general-purpose compression algorithms would effectively reduce the size of the data, at the expense of introducing significant computational overhead. Time Series Management Systems that have emerged to address the challenge of handling this overwhelming amount of information cannot tolerate the ingestion rate restrictions that such compression algorithms would cause. Data points are usually encoded using faster, streaming compression approaches. However, the techniques that contemporary systems use do not fully utilize the compression potential of time series data, with implications for both storage requirements and access times. In this work, we propose a novel streaming compression algorithm, suitable for floating point time series data. We empirically establish properties exhibited by a diverse set of time series and harness these features in our proposed encodings. Our experimental evaluation demonstrates that our approach readily outperforms competing techniques, attaining compression ratios that are competitive with slower general-purpose algorithms, and on average around 50% of the space required by state-of-the-art streaming approaches. Moreover, our algorithm outperforms all earlier techniques with regard to both compression and access time, offering a significantly improved trade-off between space and speed. The aforementioned benefits of our approach - in terms of space requirements, compression time and read access - significantly improve the efficiency with which we can store and analyze time series data.
+
@@ -382,7 +382,7 @@
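The Chimp abstract above concerns streaming compression of floating point time series. The Python sketch below is only an illustrative XOR baseline of that family (close in spirit to the Gorilla-style streaming approaches the paper compares against), not the paper's actual Chimp encoding; the function names (double_to_bits, xor_stream) and the sample readings are invented for the example.

import struct

def double_to_bits(value: float) -> int:
    """Reinterpret a 64-bit float as an unsigned integer bit pattern."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]

def xor_stream(values):
    """Yield (xor, leading_zeros, trailing_zeros) for consecutive values.

    Streaming float compressors in this family exploit the fact that
    consecutive measurements often share sign, exponent, and top mantissa
    bits, so the XOR of their bit patterns has long runs of zeros that
    can be encoded compactly.
    """
    prev = double_to_bits(values[0])
    for v in values[1:]:
        cur = double_to_bits(v)
        x = prev ^ cur
        if x == 0:
            # Identical value: a single control bit suffices in practice.
            yield x, 64, 0
        else:
            leading = 64 - x.bit_length()
            trailing = (x & -x).bit_length() - 1
            yield x, leading, trailing
        prev = cur

if __name__ == "__main__":
    # Hypothetical sensor readings; real time series show similar locality.
    series = [23.125, 23.25, 23.25, 23.375, 23.5]
    for x, lead, trail in xor_stream(series):
        # Only the 64 - lead - trail "meaningful" bits need to be stored,
        # plus a few control bits describing the two zero runs.
        meaningful = 0 if x == 0 else 64 - lead - trail
        print(f"xor={x:016x} leading={lead:2d} trailing={trail:2d} "
              f"meaningful_bits={meaningful}")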

Download the program 3:30 - 4:00pm
- Keynote 2: Verena Kantere
+ Keynote 2: Verena Kantere (University of Ottawa)
+ More often than not, spatial objects are associated with some context, in the form of text, descriptive tags (e.g., points of interest, Flickr photos), or linked entities in semantic graphs (e.g., Yago2, DBpedia). Hence, location-based retrieval should be extended to consider not only the locations but also the context of the objects, especially when the retrieved objects are too many and the query result is overwhelming. In this article, we study the problem of selecting a subset of the query result that is the most representative. We argue that objects with similar context and nearby locations should be proportionally represented in the selection. Proportionality dictates the pairwise comparison of all retrieved objects and hence bears a high cost. We propose novel algorithms which greatly reduce the cost of proportional object selection in practice. In addition, we propose pre-processing, pruning, and approximate computation techniques whose combination reduces the computational cost of the algorithms even further. We theoretically analyze the approximation quality of our approaches. Extensive empirical studies on real datasets show that our algorithms are effective and efficient. A user evaluation verifies that proportional selection is preferable to random selection and selection based on object diversification.
+
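The abstract above notes that proportional selection dictates pairwise comparison of all retrieved objects and hence bears a high cost. The Python sketch below is only a naive, quadratic greedy baseline meant to illustrate that cost; the combined spatial/context similarity, the equal weighting, the residual-mass discounting, and the sample points of interest are all assumptions invented for the example and are not the paper's algorithms.

import math

def similarity(a, b, loc_scale=1.0):
    """Combined spatial + context similarity in [0, 1] (illustrative only)."""
    (x1, y1, tags1), (x2, y2, tags2) = a, b
    spatial = math.exp(-math.hypot(x1 - x2, y1 - y2) / loc_scale)
    jaccard = len(tags1 & tags2) / len(tags1 | tags2) if (tags1 | tags2) else 0.0
    return 0.5 * spatial + 0.5 * jaccard

def proportional_select(results, k):
    """Greedy baseline: repeatedly pick the object that best represents the
    still under-represented part of the result set.

    Every candidate is compared against every result object in each round,
    which is why this naive approach is quadratic in the result size, the
    cost the paper's algorithms are designed to reduce.
    """
    residual = {i: 1.0 for i in range(len(results))}  # unrepresented "mass"
    selected = []
    for _ in range(min(k, len(results))):
        best, best_gain = None, -1.0
        for i in residual:
            gain = sum(residual[j] * similarity(results[i], results[j])
                       for j in residual)
            if gain > best_gain:
                best, best_gain = i, gain
        selected.append(best)
        # Discount the mass of objects the new representative already covers,
        # so dense clusters keep earning representatives in proportion.
        for j in residual:
            residual[j] *= 1.0 - similarity(results[best], results[j])
        del residual[best]
    return [results[i] for i in selected]

if __name__ == "__main__":
    # Hypothetical points of interest: (x, y, set of descriptive tags).
    pois = [
        (0.0, 0.0, {"cafe", "wifi"}),
        (0.1, 0.1, {"cafe"}),
        (0.2, 0.0, {"cafe", "bar"}),
        (5.0, 5.0, {"museum"}),
        (5.1, 5.2, {"museum", "art"}),
        (9.0, 1.0, {"park"}),
    ]
    for p in proportional_select(pois, 3):
        print(p)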