Scalable and Interactive Real-Time Visualization of Time Series Data PDF Free Download

Name: Scalable and Interactive Real-Time Visualization of Time Series Data PDF
Author: alyons

1 / 59

3 views•59 pages

Scalable and Interactive Real-Time Visualization of Time Series Data PDF Free Download

Scalable and Interactive Real-Time Visualization of Time Series Data PDF free Download. Think more deeply and widely.

Scalable and Interactive

Real-Time Visualization

of Time Series Data

Bachelor’s Thesis

Tim Koch

September 28, 2020

Kiel University

Department of Computer Science

Software Engineering Group

Advised by: Prof. Dr. Wilhelm Hasselbring

Sören Henning, M.Sc.

Eidesstattliche Erklärung

Hiermit erkläre ich an Eides statt, dass ich die vorliegende Arbeit selbstständig verfasst

und keine anderen als die angegebenen Quellen und Hilfsmittel verwendet habe.

Kiel, 28. September 2020

iii

Abstract

Time series data is everywhere. Everything that can be measured can be measured over

time, which forms the deﬁnition of time series data. Visualization helps in perceiving this

data by reducing the cognitive load. Interaction with the data can lead to an even better

experience for users.

This thesis evaluates an approach of an interactive and scalable visualization of time

series data. The approach consists of enhancing an existing visualization library and adding

new features to meet the requirements. The library already supports real-time visualization

and the postulated support for user interactions. To enable its scalability, a cache to reduce

bandwidth load is integrated. Moreover, prefetching algorithms increase the performance

of the solution.

After introducing the approach, this thesis also describes its integration into an existing

web application. On this integration a feasibility and performance evaluation is conducted,

which shows its advantages over the current version of the web application. Additionally,

it proves the support for interactions like panning and zooming in the chart, as well as

a real-time functionality. Moreover, the presented solution is scalable and does not show

performance weaknesses in our test scenarios.

Contents

1 Introduction 1

1.1 Motivation....................................... 1

1.2 Goals .......................................... 2

1.2.1 G1: Enhanced Interactivity by Dynamic Data Loading . . . . . . . . . 2

1.2.2 G2: Real-Time Functionality . . . . . . . . . . . . . . . . . . . . . . . . 2

1.2.3 G3: Scalability and Performance . . . . . . . . . . . . . . . . . . . . . . 2

1.2.4 G4: Integrate Approach into the Titan Control Center Frontend . . . . 3

1.2.5 G5: Rewrite CanvasPlot as TypeScript Code . . . . . . . . . . . . . . . 3

1.3 DocumentStructure ................................. 3

2 Foundations and Technologies 5

2.1 IndustrialDevOps .................................. 5

2.2 The Industrial DevOps Platform Titan . . . . . . . . . . . . . . . . . . . . . . . 5

2.2.1 Titan Control Center Architecture . . . . . . . . . . . . . . . . . . . . . 6

2.2.2 Titan Control Center Frontend . . . . . . . . . . . . . . . . . . . . . . . 7

2.3 The Web Frontend Framework Vue.js . . . . . . . . . . . . . . . . . . . . . . . 9

2.4 The Visualization Library D3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2.5 The Plotting Library CanvasPlot . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2.6 The TypeScript Programming Language . . . . . . . . . . . . . . . . . . . . . . 11

3 Evaluation of a Base Approach 13

3.1 EvaluationCriteria .................................. 13

3.1.1 FrameworkCriteria ............................. 13

3.1.2 Conceptual Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

3.2 EvaluatedApproaches ................................ 15

3.2.1 The System Management Visualization Tool LiveRAC . . . . . . . . . 15

3.2.2 Interactive Visualization Tool ATLAS for Large Time Series Data . . . 15

3.2.3 The Visual Exploration System ForeCache . . . . . . . . . . . . . . . . 16

3.2.4 The Customized D3-Wrapper CanvasPlot . . . . . . . . . . . . . . . . . 17

3.3 Evaluation of Possible Base Approaches . . . . . . . . . . . . . . . . . . . . . . 18

4 Visualization Approach 21

4.1 Requirements for Dynamic Data Loading . . . . . . . . . . . . . . . . . . . . . 21

4.1.1 Features to Maintain in the Titan Control Center . . . . . . . . . . . . 21

4.1.2 Features yet to be Implemented into the Titan Control Center . . . . . 21

4.2 Approach for a Scalable and Interactive Visualization . . . . . . . . . . . . . . 22

vii

Contents

4.2.1 Architectural Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

4.2.2 DataFlow ................................... 22

4.2.3 Conceptual Approach to Meet the Requirements from Section 4.1 . . 23

4.2.4 Approach to Enhance Scalability . . . . . . . . . . . . . . . . . . . . . . 23

4.3 Approach for Migrating to TypeScript . . . . . . . . . . . . . . . . . . . . . . . 24

5 Implementation 25

5.1 Migration of CanvasPlot to TypeScript . . . . . . . . . . . . . . . . . . . . . . . 25

5.1.1 Split CanvasPlot into Two Files . . . . . . . . . . . . . . . . . . . . . . . 25

5.1.2 Introduce TypeScript’s Class Inheritance for CanvasPlot . . . . . . . . 25

5.1.3 Add Types to CanvasPlot . . . . . . . . . . . . . . . . . . . . . . . . . . 26

5.2 Implementation of a Scalable and Interactive Solution . . . . . . . . . . . . . . 26

5.2.1 Implementation of the Class and File Structure . . . . . . . . . . . . . 26

5.2.2 Implementation of the DownloadManager . . . . . . . . . . . . . . . . 27

5.2.3 Implementation of the Zoom Handler . . . . . . . . . . . . . . . . . . . 28

5.2.4 CachingtheData............................... 28

5.2.5 Retrieving Data Points from the Cache . . . . . . . . . . . . . . . . . . 30

5.3 Integration of Jest as a Test Framework . . . . . . . . . . . . . . . . . . . . . . 30

5.3.1 Testing the Injection of Non-Overlapping Intervals . . . . . . . . . . . 31

5.3.2 Testing the Calculation of Uncached Time Periods . . . . . . . . . . . 31

6 Evaluation 33

6.1 EvaluationEnvironment............................... 33

6.2 EvaluationofFeasibility ............................... 33

6.2.1 Methodology ................................. 33

6.2.2 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

6.2.3 ThreatstoValidity .............................. 37

6.3 Evaluation of Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

6.3.1 Methodology ................................. 38

6.3.2 Results..................................... 38

6.3.3 Discussion................................... 40

6.3.4 ThreatstoValidity .............................. 41

7 Related Work 43

8 Conclusions and Future Work 45

8.1 Conclusions ...................................... 45

8.2 FutureWork...................................... 46

8.2.1 Future Work within our Solution . . . . . . . . . . . . . . . . . . . . . . 46

8.2.2 Future Work within the Titan Control Center . . . . . . . . . . . . . . 46

8.2.3 Future Work within the Field of Scalable Visualizations . . . . . . . . 46

viii

Contents

Bibliography 47

Chapter 1

Introduction

Time series data is a key component in engineering, business, and science. Providing

visual representations for this type of data helps people to better interpret the inherent

information. Line charts are widely used as such visualizations of data. To automate the

process of converting time series data into charts, a programmatical approach can be

valuable. Many frameworks and libraries have been published for this reason. Enabling

the user to interact with the charts can improve the user experience and thus enhance the

beneﬁt of those charts. This interaction can be, for example, zooming through multiple

detail levels and temporal resolutions.

The informativeness can be further improved by using real-time data. Therefore, visual-

ization tools have to cope with the inherent challenges of processing real-time data. While

providing this feature, such tools should still scale well with big amounts of data. This

includes handling scaled data performantly as well as reducing the amount of displayed

data to a human-perceivable level.

1.1 Motivation

The Titan project is a research project on how to transfer the DevOps approach into

industrial production environments such as factories [Hasselbring et al. 2019]. DevOps is

the concept of bundling development and operation of software closer [Lwakatare et al.

2015]. It especially aims at reducing release cycle times.

Within this project, the Titan Control Center serves as a platform for monitoring and

analyzing the Industrial DevOps processes [Henning et al. 2019]. It provides an information

dashboard [Few 2006] built as a single-page application with Vue.js [You 2014]. Among

others, it uses time series charts to visualize power consumption data from the connected

industrial environment. The charts are rendered using the CanvasPlot

visualization library.

The current implementation meets most of the requirements for informative time series

charts mentioned above. Still, there are possible improvements to introduce. In this thesis

we are aiming at the scalability and interactivity. The charts used in the Control Center are

already interactive in a way that the user can zoom into the chart to see different detail

levels. Despite that, the system fetches the data points to display only once, which happens

on the initial page load. Because downloading all data points from the underlying database

1https://github.com/a-johanson/canvas-plot

1. Introduction

would be impossible for large data sets, only data points for a speciﬁc interval are fetched.

By default conﬁguration, this interval includes the last 60 minutes. That is why older data

can only be made visible by selecting a past date and time in a dedicated input form. This

conﬁguration could of course be modiﬁed to a bigger interval, but this approach would

not be scalable.

The motivation of this thesis is to ﬁnd and implement an approach to dynamically load

data when needed. This includes loading data for other periods when zooming out, as well

as loading a higher resolution of data for the given period when zooming in. The objective

is to enable this feature without decreasing the performance of the chart tool.

1.2 Goals

The overall goal of this thesis is to evaluate an approach to enhance the interactivity of time

series charts. Given that, the approach is to be implemented into the Titan Control Center.

1.2.1 G1: Enhanced Interactivity by Dynamic Data Loading

The existing solution including CanvasPlot is missing a functionality to dynamically load

additional data points. The ﬁrst goal of this thesis is to implement this feature. There are

two main reasons for this goal. It enhances the user experience since the user will be able to

see data for a time period longer than one hour. With that, it also supports data comparison

over a longer period, which is an essential part of analyzing time series data.

1.2.2 G2: Real-Time Functionality

The current state of implementation already enables real-time visualization. While imple-

menting 1.2.1, this behavior shall not be discarded because of its beneﬁt for monitoring

production environments in real-time.

1.2.3 G3: Scalability and Performance

When implementing the dynamic loading feature, scalability is important to keep the

solution interactive under massive loads. Missing performance could render the achieve-

ments in interactivity useless. We have to assure that only a reasonable amount of data

is (dynamically) loaded into the application. For example, it would inevitably lead to

decreased performance, if we loaded data points for each second over periods of weeks or

months.

1.3. Document Structure

1.2.4 G4: Integrate Approach into the Titan Control Center Frontend

After fulﬁlling the goals G1, G2, and G3, we will integrate a prototype into the Titan

Control Center. Therefore, the prototype is integrated into a git fork of the Control Center

Frontend, whereafter we create a merge request.

1.2.5 G5: Rewrite CanvasPlot as TypeScript Code

The code of CanvasPlot uses ECMAScript 5 (short ES5), a JavaScript version that does not

support classes. There are multiple reasons to convert this code to a TypeScript class with

ES6. Firstly, almost all of the frontend components within the Titan project are written

in TypeScript. So converting the CanvasPlot would strongly contribute to a consistent

code base of the whole project. Secondly, TypeScript provides the advantage of static

type-checking. Moreover, the code can be cleaned up using classes and arrow functions

from ES6 [Saboury et al. 2017].

1.3 Document Structure

Following this introductory chapter, we present the technologies and foundations used in

this thesis in Chapter 2. Thereafter, suitable base approaches to build our solution upon are

evaluated in Chapter 3. This includes identifying reasonable criteria the candidates have to

meet, as well as introducing the approaches, and the ﬁnal appraisal.

Then, we describe the approach for our solution in Chapter 4. We therefore determine

requirements the ﬁnal solution has to meet and then derive an approach from them. In

Chapter 5 we describe how we integrate our approach into the Titan Control Center fron-

tend. Following the implementation, we conduct a feasibility and performance evaluation

in Chapter 6. Related work is presented in Chapter 7 before we end with drawing a

conclusion and reviewing possible future work in Chapter 8.

Chapter 2

Foundations and Technologies

The following sections introduce foundations and technologies, which are used in the

thesis. Section 2.1 describes the Industrial DevOps approach and Section 2.2 details the

Titan platform. Moreover, the web framework Vue.js is introduced in Section 2.3. Then, the

visualization library D3 and the plotting library CanvasPlot are introduced in Section 2.4

and Section 2.5 respectively. Finally, we present the TypeScript programming language in

Section 2.6.

2.1 Industrial DevOps

Traditionally, the development of software is separated from its operation. This approach,

however, complicates the communication, collaboration, and integration. DevOps is the

concept of bundling development and operation of software closer [Lwakatare et al. 2015].

That helps developers and maintainers to work effectively and seamlessly.

Arising from the DevOps concept, Industrial DevOps is the same concept brought to

industrial production environments [Hasselbring et al. 2019]. It proposes to transfer the

methods and culture of DevOps and apply them to environments such as factories. To

accomplish the goals of this concept, the development process in Industrial DevOps should

be a cyclic, continuous adaptation process. Within the process, the system is monitored and

analyzed during its operation. Then, new requirements are identiﬁed based on the analysis

result. Finally, adaptions to meet the new requirements are implemented and monitored

again. Crucial to the implementation phase is continuous integration to react ﬂexibly on

new requirements [Henning et al. 2019].

2.2 The Industrial DevOps Platform Titan

The goal of the Titan research project is to build a software platform for integrating and

monitoring industrial production environments using Industrial DevOps [Hasselbring et al.

2019]. Skilled employees shall be empowered to modify the work ﬂows using ﬂow-based

programming [Morrison 2010]. Thus, no dedicated programmer is needed to apply small

changes in the system [Hasselbring et al. 2019].

2. Foundations and Technologies

<<microservice>>

Sensor

Management

Visualization I

<<microservice>>

Forecasting

<<microservice>>

Anomaly

Detection

<<microservice>>

Statistics

<<microservice>>

History

<<microservice>>

Aggregation

<<edge component>>

Integration

Visualization II

REST

Figure 2.1. The architecture of the Titan Control Center [Henning et al. 2020]

2.2.1 Titan Control Center Architecture

Within the Titan research project, the Titan Control Center is a system for integrating,

analyzing, and visualizing power consumption data from various sources within industrial

production [Henning et al. 2020]. Its architecture is presented in Figure 2.1.

The system is built using the microservice pattern [Hasselbring and Steinacker 2017].

Loosely coupled services are responsible for handling different tasks that run in isolated

containers and do not share any state. Power consumption data is imported using the

Integration component, which converts the data into a standardized format used in the

whole approach [Henning 2018].

The Aggregation service is used to group the power consumption data of multiple

devices and thus provide aggregated data. Three more services conduct their own analyses

on the data. While the Statistics service provides statistical information like trends in the

data, the Anomaly Detection service is used for monitoring possible anomalies. Boguhn

[2020] also implemented a Forecasting service.

In order to publish the recorded data within the system, the History service provides

access to the database in the form of an API. Currently, four different endpoints exist. One

is used for power consumption data of single devices and the second is used for group of

devices. The two other endpoints are used to provide temporal aggregated data for both

single and grouped devices. Those two endpoints provide minutely and hourly power

consumption data. More endpoints for other temporal aggregation levels can be added.

2.2. The Industrial DevOps Platform Titan

Figure 2.2. Screenshot of the Titan Control Center Frontend

2.2.2 Titan Control Center Frontend

In addition to the microservices used in the Titan Control Center and described in Sec-

tion 2.2.1, the system also contains two visualization components. In this thesis, we con-

centrate on the ﬁrst component, which we refer to as the Titan Control Center Frontend.

It is a web based frontend to monitor and analyze the recorded power consumption data

[Henning et al. 2019]. The single-page application built with Vue.js displays multiple charts

with data for the power consumption of production environments. A screenshot of the

Titan Control Center Frontend is shown in Figure 2.2.

The Current Architecture in the Titan Control Center

This paragraph introduces the architecture of the Titan Control Center Frontend. It consists

of multiple pages, where three of them display time series charts. Those three pages are

the landing page, a sensor details page, and a page providing a comparison chart.

The landing page and the sensor details page both delegate the chart drawing to the

SensorHistoryPlot

component, while the comparison page uses the

ComparisonPlot

compo-

nent. Both the

SensorHistoryPlot

and the

ComparisonPlot

are wrapper components for the

visualization library CanvasPlot, which is introduced in Section 2.5. The

SensorHistoryPlot

additionally uses the

MovingTimeSeriesPlot

as a manager to handle updates of the data

displayed by CanvasPlot. An excerpt of the software architecture within the Titan Control

2. Foundations and Technologies

Landing Page

Comparison Page

Sensor Details Page

SensorHistoryPlot

ComparisonPlot

CanvasPlot

MovingTimeSeriesPlot

<<use>>

Figure 2.3. Excerpt of the Control Center Frontend architecture

Center Frontend is outlined in Figure 2.3.

The Data Flow of the ComparisonPlot

Both the

SensorHistoryPlot

and the

ComparisonPlot

described in the paragraph above

handle the download of data points themselves, without any delegation to dedicated

components.

The

ComparisonPlot

fetches data whenever a new device or device group is added to the

comparison. It then issues a HTTP GET request against the backend. The exact endpoint

depends on whether the added entity is an device or a group of devices. The query contains

a ﬁlter, so that only data points from within the last hour are fetched. After the response

containing the data points arrives,

ComparisonPlot

calls CanvasPlot to let it draw the new

data points. This sequence is displayed in Figure 2.4.

2.3. The Web Frontend Framework Vue.js

Comparison

Plot

Backend

Services

CanvasPlot

Actor

HTTP Get Request

addDataSet

HTTP Response

fetchNewData

addDataSet

Add Device

Figure 2.4. Sequence diagram for adding a device to the ComparisonPlot

The Data Flow of the SensorHistoryPlot

More logic is involved in the

SensorHistoryPlot

. Loading the page, it fetches data points

similar to the

ComparisonPlot

. Which device the data is fetched for is deﬁned externally.

On the root page, aggregated data for all devices are fetched, while the device can be

set speciﬁcally on the sensor details page. When the response from the backend arrives,

SensorHistoryPlot

calls its manager

MovingTimeSeriesPlot

, which then saves the data

points and calls CanvasPlot to display the data.

In order to support real-time analyses of the data,

SensorHistoryPlot

uses a repeater

which fetches new data points every second. These queries contain a ﬁlter that forces

the backend to only respond with the data points recorded since the last fetch. The

responses on those repeated queries then invoke an update of the stored data points in the

MovingTimeSeriesPlot, which again updates CanvasPlot.

2.3 The Web Frontend Framework Vue.js

Vue.js (or just Vue) is a JavaScript framework aiming at building user interfaces [You 2014].

It does so by building single-page applications (SPA). SPAs are applications that run in

2. Foundations and Technologies

the user’s web browser in the form of a single web page. User interactions that lead to

displaying a new page are not realized by loading a new page from a server, but instead by

rendering it directly in the browser. That is why SPAs are client-side rendered applications

by deﬁnition. However, these applications can still communicate with a server. As an

example, this can be used to fetch more data to display. In sophisticated applications, the

client can even load more chunks of JavaScript that were not included in the ﬁrst response.

This enables smaller and thus faster frontends.

Vue.js encourages a component-based code structure with largely independent com-

ponents. That means, a Vue component includes its own logic as well as its own layout

and design. This encapsulation helps making the components vastly reusable and allows

a structure of composed Vue components. Components may contain other components,

which leads to a hierarchical structure with one root component containing all other

components [Ehrenstein 2019].

The framework also provides a sophisticated data ﬂow model. Data exchange and thus

communication is only allowed between a parent and a child component. A data ﬂow

between siblings is impossible to prevent cyclic dependencies. Moreover, children can only

emit events to their parent components but are not allowed to change the parent’s state.

This data ﬂow model is also used in Vue’s concept of directives. Directives allow

developers to bind logic to layout elements. Thus, the state of elements can be bound to

variables for example.

2.4 The Visualization Library D3

The Titan Control Center uses the data-visualization framework D3 for its charts [Bostock

2020]. Its approach is to bind data to a Document Object Model (DOM) and make it accessible

for data-driven transformations. By making the visualization objects accessible, D3 provides

ﬂexibility and enables the control with the full tool set of web standards such as HTML,

CSS and SVG. Beside that, D3 is also considered a very fast visualization tool [Bostock et al.

2011]. It also supports dynamic interactions, large data sets, and animations.

2.5 The Plotting Library CanvasPlot

CanvasPlot was built following a research on how to visualize large data sets efﬁciently

[Johanson et al. 2016]. It is a library that extends the visualization library D3. CanvasPlot

provides a simpliﬁed API to efﬁciently use a subset of the D3 plot types including a time

series plot. Rendering underlying elements needed to display a sophisticated chart (e.g.,

labels or a grid) are managed by the library, which further simpliﬁes its usage. Moreover it

handles user interactions and assures a performant displaying process.

The code of CanvasPlot implements the JavaScript-typical prototypal inheritance [Eshke-

vari et al. 2017]. One base function CanvasDataPlot provides base functionality. Several

2.6. The TypeScript Programming Language

Figure 2.5. Screenshot of the ComparisonPlot within the Titan Control Center Frontend

children implement different visual representations like a time series chart or a vector

diagram.

In the Titan Control Center Frontend introduced in Section 2.2.2, CanvasPlot is cur-

rently used for two different purposes. Firstly, it displays a graph representing the power

consumption of either aggregated devices or single devices, depending on what the user

enters in an external input ﬁeld. Secondly, CanvasPlot can be used to display comparisons

of either aggregated or single devices. Therefore, two or more graphs are plotted into the

same chart. A screenshot of this ComparisonPlot is shown in Figure 2.5.

2.6 The TypeScript Programming Language

TypeScript is a typed superset of JavaScript [Bierman et al. 2014]. Because of this superset

relation, every JavaScript program is a TypeScript program. TypeScript enriches JavaScript

by adding a static type system, classes, interfaces, and a module system. However, the type

system’s types are optional, in order to keep the ﬂexibility of JavaScript and to migrate

from JavaScript to TypeScript in a gradual way [Feldthaus and Møller 2014]. It is to note

that TypeScript’s static type system does not provide strict type safety at run time. This is

also to keep the system nonrestrictive.

TypeScript code can be transpiled to JavaScript code, which can run in all ecosystems

that handle JavaScript. This shallow translation process converts each TypeScript expression

into its representation in JavaScript.

Chapter 3

Evaluation of a Base Approach

The main goal of this thesis is to develop a solution for scalable and interactive real-time

visualization of time series data. In order to fulﬁll this goal, we have to assess related work

in this ﬁeld and identify one framework to build our approach upon.

For this purpose, we elaborate evaluation criteria in order to choose the best suiting

framework for our needs. Every solution is then reviewed based on these criteria.

3.1 Evaluation Criteria

We identiﬁed two groups of criteria to evaluate frameworks with. The ﬁrst group consists

of framework criteria, which allow us to assess the ability of the solutions to serve as a code

framework for our goal. The second group describes the conceptual criteria of the given

solutions. This enables us to adapt concepts from one or more previous solutions without

necessarily using their code frameworks.

3.1.1 Framework Criteria

The framework criteria are used to assess frameworks based on their ability to serve as a

base framework for our goals. The following criteria are summarized in Table 3.1.

License represents the license under which the framework is provided. This license should

enable us to use and modify the framework free of charge.

Open Source indicates whether the source code of the framework is public. This can be

useful in order to conduct minor changes, in the case of a license that allows modifying

the code base.

Language deﬁnes the language of the framework. In order to ﬁt into the code base of the

Titan Control Center, a TypeScript framework is recommended.

Release Date represents the release date of the latest corresponding scientiﬁc paper. Not

taking the software’s release date but the release date of the paper occurred to be useful,

because all of the evaluated frameworks were developed for scientiﬁc use only.

3. Evaluation of a Base Approach

Table 3.1. Evaluation criteria

Criterion Value Range Requirements

Framework Criteria

License License name non-proprietary

Open source yes / no yes

Language Language name TypeScript

Release date Date newer than 2015

Conceptual Criteria

Time series capability yes / no yes

Interactivity Description of supported actions Zooming, Panning

Real-time capability yes / no yes

Scalability Description of scalability goal Support large time series data

3.1.2 Conceptual Criteria

In order to provide a more granular ranking for the base solutions, conceptional criteria

are used in addition to the framework criteria.

Time Series Capability indicates whether the base approach supports the visualization of

time series data.

Interactivity determines the ability of the evaluated approach to provide interactive charts.

Especially the possibility to zoom through multiple resolution levels and to pan in the

visual representation of the data is a key requirement for our approach. This criterion

supports goal G1 (Section 1.2.1).

Real-Time Capability indicates, whether the solution is able to display real-time data. Visu-

alizing real-time data of power consumption is a desired feature in the Titan Control

Center, which is already implemented and shall be kept. This criterion supports Goal

G2 (Section 1.2.2).

Scalability represents the scalability of the solution under evaluation. It is to note that this

shall only decide, whether a solution aims at providing scalability. Evaluating up to

which exact limit a solution scales well is not in the scope of this thesis. Scalability is

3.2. Evaluated Approaches

here meant in terms of analyzed data, not the amount of users the system is capable to

serve concurrently. This criterion supports goal G3 (Section 1.2.3).

3.2 Evaluated Approaches

In this section we present four solutions that have been used for similar use cases.

3.2.1 The System Management Visualization Tool LiveRAC

LiveRAC is a visualization system that was built to analyze large collections of system

management time series data [McLachlan et al. 2008]. Its goal is to provide high information

density, while allowing the users to zoom into group of charts in order to obtain a

detailed insight. This zooming from condensed information into fully qualiﬁed visual

data representation is called semantic zooming. Semantic zooming is not scaling views

proportionally, but instead adding or removing properties of the visual representation. It

thus balances the view on a good level of information density, between being nearly empty

and being overloaded.

This technique is already used in the current implementation of the Titan Control Center.

When users zoom into the provided time series charts, it begins to show circles for each

data point, which disappear again on a lower zoom level. Hovering over those circles with

the mouse, boxes show the x and y value of the data point. Thus, semantic zooming allows

to provide more information when enough space is given, and prevents overloaded UI

elements.

Unfortunately, LiveRAC does not implement semantic zooming as zooming by scrolling

with a mouse or a touch pad. It allows the user to choose the shown time interval by using

two date input ﬁelds. Moreover, it is scalable in a way that the application never freezes

and always shows indicators when data is fetching. But it neither implements caching

algorithms nor other techniques to minimize data fetching time when zooming through

different detail levels.

This is why LiveRAC aims to solve the same goals as this thesis, but does not serve

well as a base framework for our solution. Additionally, the solution was built using Java,

which does not ﬁt into the TypeScript code base of the Titan Control Center.

3.2.2 Interactive Visualization Tool ATLAS for Large Time Series Data

ATLAS is a visualization tool for displaying temporal data [Chan et al. 2008]. It uses several

techniques to accomplish its main goal of enabling smooth interactions in massive time

series data sets. The corresponding paper deﬁnes smooth interactions as ﬂuid behaviour

for operations like panning and zooming.

The researchers propose a solution consisting of three components. The database system

has to store massive data sets and, more importantly, has to enable data fetching with

3. Evaluation of a Base Approach

the lowest possible latency. Every latency would impede the goal of smooth interactions.

ATLAS uses kdb+, an in-memory database with high performance in time series data

[Kapadiya 2018]. As the second component, a query distribution server handles the load

balancing of database queries. Its job is to minimize both the query time in the database

and the transfer time to the frontend.

The third component is the visual interface, which displays charts with the data fetched

from the database system. Within one chart the user can zoom in, zoom out, pan left, and

pan right.

Crucial to the ATLAS concept is its technique of predictive caching. It effectively

hides the system latency arising from querying and fetching data from the backend. The

researchers observed that interactions in the charts are likely to carry momentum, which

means, for example, that users are likely to continue panning left once started. This

observation is key to their predictive caching algorithm. Setting a maximum operation

speed for panning and zooming, the algorithm can simply calculate when the end of the

prefetched interval is reached. Based on this calculation it issues new queries to fetch data

before the user reaches areas where no data is fetched yet. ATLAS also approximates the

query time, which enables the system to fetch the data not earlier than it is needed, which

again serves well in scalability and performance.

Because ATLAS is not a framework, we can not build our solution upon it. But in-

stead we will take its predictive caching algorithm into consideration to beneﬁt from its

performance.

3.2.3 The Visual Exploration System ForeCache

ForeCache is a visualization system for exploring aggregate views of data [Battle et al.

2014]. The system consists of a three-tier architecture containing a visual interface, a

middleware layer for caching, and a backend layer with a database management system.

The corresponding paper describes how the system can be used for spatial data.

ForeCache’s core concept is the usage of so-called chunks, which represent non-

overlapping subsets of data. Multiple layers of chunks enable users to zoom through

different resolution levels. Each layer spans over the complete data set, which means that

every point in the data set is available in all possible resolution levels.

The visual interface provides a viewer for the spatial data, which allows the user to pan

and zoom. Panning is possible in the four directions up, down, left, and right. Zooming

in allows to display a higher resolution, while zooming out displays less details but at a

greater scale. When the user conducts an operation in the viewer, the frontend issues data

queries to the middleware in order to receive the data to display in the requested areas.

ForeCache’s middleware receives the data queries from the frontend and again dis-

patches the query to the backend layer. It also serves as a cache for the system to speed up

repeated queries. The backend computation layer handles the data storage of the system

using a database management system. It also computes the chunks from the raw data in

multiple layers.

3.2. Evaluated Approaches

In order to keep its interactivity, ForeCache uses prefetching to dynamically load chunks

of data. It therefore works with multiple models to predict which chunks to fetch and in

which order. The ﬁrst model is the momentum model, which assumes that a user is likely

to continue its operation in the spatial view. ATLAS uses a similar assumption [Chan et al.

2008]. Secondly, ForeCache integrates the hotspot model. This model uses tracking data

of past user sessions and identiﬁes popular chunks of data that were queried more often

than others. Using this data, a hotspot data chunk to the right of the current view could be

fetched earlier than a data chunk below of the current view, even if the momentum of the

operation points downwards.

As a third model, ForeCache applies the n-gram model. It tries to identify probable

interaction patterns by storing and analyzing past interactions as word sequences, or

n-grams. ForeCache currently uses 2-length and 3-length n-grams, which could be ("down",

"right") and ("down", "down", "right") respectively. Note that ForeCache is built to handle

spatial data, which is why the directions represent the directions of panning in the data.

The n-gram model thus tries to predict popular interaction patterns and load chunks of

data accordingly.

Two more models used by ForeCache are the normal model and the histogram model.

Their goal is to predict how users can move between clusters of chunks, which is not fully

handled by the previous models. Both of them use statistical methods to achieve this goal.

We will not discuss those models in detail because they speciﬁcally aim at spatial data and

not at time series data.

Which chunks are prefetched at a speciﬁc point of time is determined by combining the

results of all ﬁve prediction models. Each model ranks the possible chunks to load in a list.

These lists then get merged into a ﬁnal ranking, which determines the chunks to be loaded

from the backend. Users may assign weights to the models, which specify their impact on

the ﬁnal ranking. Thus, the system provides functionality to modify the prefetch process

based on personal experience.

3.2.4 The Customized D3-Wrapper CanvasPlot

CanvasPlot is introduced in Section 2.5. The scope of the following text is to evaluate it

based on the criteria described above.

Users can navigate in the displayed time series charts. Zooming by mouse wheel allows

them to analyze different temporal resolutions of data, however, data is not dynamically

fetched based on the zoom level. It also enables semantic zooming like LiveRAC (see

Section 3.2.1). In this case, CanvasPlot shows boxes with detail information on each data

point, whenever there are very few data points displayed. If there are too many data points

displayed to show detail boxes on each point, then those boxes disappear. Panning left and

right helps to navigate within one speciﬁc temporal resolution level.

Real-time functionality is added by another wrapper, built into the Titan Control Center.

The Wrapper is requesting the latest data from the server in constant intervals. This interval

is currently set to 1000 milliseconds. If new data points are available they are inserted into

3. Evaluation of a Base Approach

visualization graph. The graph then moves forward if the latest data points from before the

download were in the displayed interval. This ensures that the graph always shows the

real-time data if the user watches it, but does not move when the user analyzes historical

data. Unless otherwise stated, we refer to CanvasPlot as the core JavaScript library in

conjunction with its Titan wrapper.

3.3 Evaluation of Possible Base Approaches

Table 3.2 shows an overview of the presented frameworks and solutions. Because Canvas-

Plot is the only open source framework among the evaluated solutions, we will use it as a

base framework for our thesis.

It is then to determine, which concepts of the other solutions shall be adapted and

implemented into the Titan Control Center. Three of the four conceptual criteria are already

met by CanvasPlot. It provides time series capability as well as real-time capability. The

interactivity goals from Section 1.2.1 are also accomplished by CanvasPlot.

This leads to the conclusion that only those concepts have to be adopted that provide

scalability enhancements to the system. Therefore, the sophisticated predictive caching

models of ForeCache appear to be most-suitable for our needs. Because ForeCache uses

spatial data, we will have to modify the models use time series data. This again is very

similar to the ATLAS concepts, which can be adopted as well.

3.3. Evaluation of Possible Base Approaches

Table 3.2. Framework and conceptual comparison

Criterion LiveRAC ATLAS ForeCache CanvasPlot

License no information free to use no information

Apache License

2.0

Open

Source

no no no yes

Language Java no information

JavaScript with

D3.js

JavaScript with

D3.js

Release 2008 2008 2014 2016

Time

Series

Capability

yes yes no yes

Interactivity

Navigation by

date input ﬁelds

Navigation by

panning, zoom-

ing by mouse

wheel

Navigation by

panning, zoom-

ing by mouse

wheel

Navigation by

panning, zoom-

ing by mouse

wheel

Real-Time

Capability

yes no no yes

Scalability

Built to analyze

large time series

data collections

Built to enable

smooth interac-

tions with large

data sets

Built to analyze

large data sets

using minimal

amount of data

chunks

Built to efﬁ-

ciently display

large time series

data sets

Chapter 4

Visualization Approach

The following sections address the goals from Section 1.2. We start by presenting an

approach for loading data dynamically based on the state of the diagram. After that, a

way to maintain the real-time functionality is shown. We also describe how we manage to

migrate the code base to TypeScript.

4.1 Requirements for Dynamic Data Loading

In order to provide a valid approach for the goals of this thesis, the requirements of the

solution have to be identiﬁed. These requirements consist of both features existing in the

current Control Center, and features yet to be implemented.

4.1.1 Features to Maintain in the Titan Control Center

Section 2.2 introduces the Titan Control Center including its frontend, while Section 3.2.4

describes the features of CanvasPlot. Our solution should maintain the current features in

the Titan Control Center. In detail, this includes the device comparison chart and the chart

for single or aggregated devices. Both charts should sustain their interactivity regarding

the zoom and pan actions. This partly supports goal G1 (Section 1.2.1).

Additionally, the real-time capability of the

SensorHistoryPlot

should be preserved.

This supports goal G2 (Section 1.2.2).

4.1.2 Features yet to be Implemented into the Titan Control Center

The requirements of the paragraph above aim at maintaining the current interactivity and

real-time capability of CanvasPlot as implemented in the Titan Control Center. In order to

support the remaining goals of this thesis described in Section 1.2, more features have to

be implemented.

The main task is to enhance the interactivity of the current version of CanvasPlot,

because it only loads data for a constant period of time. Users that analyze data from larger

time periods have to set this period by hand in dedicated ﬁelds as described in Section 2.2.2.

This leads to a limited efﬁciency in analyzing the data using the current version of the

Titan Control Center.

4. Visualization Approach

A major requirement for our solution hence is to mitigate these limits of interactiveness

within the chart. Therefore, the user has to be enabled to pan through all the data available.

If this is not possible to achieve, then the user has to have at least the impression of panning

limitlessly through the data.

CanvasPlot should also allow users to view data points aggregated over time. The

current implementation already allows semantic zooming in the chart as described in

Section 3.2.4. This however only enhances the charts clear look for very few data points,

because it does not consolidate data points when possible. Especially in charts with

thousands of data points the view becomes overloaded. Therefore, the solution has to

provide a feature to choose a reasonable temporal resolution of the data displayed in

CanvasPlot.

4.2 Approach for a Scalable and Interactive Visualization

After identifying the requirements in Section 4.1, this section presents the ﬁnal approach

of the thesis. It is divided into the architectural design, the data ﬂow, an approach for

implementing the new features, and the approach to enhance the scalability.

4.2.1 Architectural Design

Chapter 2 describes the current architecture of CanvasPlot within the Titan Control Center.

This architecture is modiﬁed in order to meet the requirements made in Section 4.1. The

ComparisonPlot

is not modiﬁed and continues to serve as a visualization component for

comparisons of multiple devices.

The

SensorHistoryPlot

, however, is modiﬁed substantially. It now only serves as a parent

component for the CanvasPlot chart. A new class

TimeSeriesPlotManager

is introduced,

which handles the data ﬂow instead. This supports the idea of separating visualization

from data handling.

The new

DataSet

class manages the available time series data in a given resolution.

Multiple

DataSet

instances are stored by another new class, the

MultiResolutionData

This class thus represents a multi-resolution data set. One instance of it is held by the

TimeSeriesPlotManager in order to access the data.

4.2.2 Data Flow

The current data ﬂow related to CanvasPlot within the Titan Control Center is introduced in

Section 2.2.2. Between frontend and backend, the existing data ﬂow is maintained. Henning

et al. [2020] added new endpoints in the backend, in order to support temporal aggregated

data. This enables the frontend to request data from the minutely and hourly endpoint.

4.2. Approach for a Scalable and Interactive Visualization

4.2.3 Conceptual Approach to Meet the Requirements from Section 4.1

The goals of this paper are mainly accomplished by the new

TimeSeriesPlotManager

. Be-

cause it is only applied to the history plot and not to the comparison plot, it does not

harm the operation of the latter. This accomplishes the requirement of maintaining the

comparison plot described in Section 4.1.

In order to maintain the real-time feature of the system, the

TimeSeriesPlotManager

takes over this task from the

SensorHistoryPlot

. As the latter did before, it fetches new

data every second and applies it to the chart. It then moves the chart, in the case it was set

to show the latest data. This accomplishes the real-time requirement from Section 4.1 and

G2 from Section 1.2.2.

These two requirements were accomplished without adding new features. So far, only

refactoring took place. To meet the requirement of enabling the user to pan limitlessly

through the data, dynamic data loading is necessary. Loading all data at once would not

scale well, as evaluated in Section 1.2.3. That is why the

TimeSeriesPlotManager

enables a

listener on the chart, which is invoked on zoom and pan actions. It then fetches the data

for the displayed period of time from the backend and calls CanvasPlot to display it.

With this approach, however, CanvasPlot displays empty areas, when the user zooms

out or pans left or right. This is why the time period which data is requested for, is tripled.

Thus, the user can pan a whole screen length to the left or to the right without seeing

empty areas. This supports interactive and efﬁcient analyzes on the data.

The last feature requirement is to support multiple temporal resolutions. This is made

possible by the new temporal aggregated endpoints implemented in the backend, as men-

tioned in the paragraph above. If a zoom or pan action occurs, the

TimeSeriesPlotManager

determines a reasonable resolution level and issues a query accordingly to the backend.

4.2.4 Approach to Enhance Scalability

The features required in Section 4.1 are all accomplished with the modiﬁcations described

above. However, the goal of this thesis is not only to provide an interactive time series

visualization, but a scalable solution as well. Requesting temporal aggregated data already

reduces the amount of transferred data to an extent. This amount can be reduced even

further using caching algorithms. Those algorithms enable our system to be interactive

even with low bandwidths, while the load on the backend decreases as well. This supports

the scalability goal of this thesis described in Section 1.2.3.

The caching algorithms of our solution are responsible for two tasks. They have to store

the fetched data in a way that it is accessible for further requests. Secondly, they determine

whether requested data is already cached and can be displayed without further requests

against the backend. Moreover, they are able to identify requests for time periods of which

subperiods are already cached. Thus, the cache can modify requests to the backend in a

way that data is fetched for only those time periods, where no data is cached yet.

4. Visualization Approach

4.3 Approach for Migrating to TypeScript

TypeScript was introduced in Section 2.6. Because CanvasPlot has been written in plain

JavaScript, it does not integrate well in the code base of the Titan Control Center. That is

why we chose to migrate it to TypeScript. In order to adapt CanvasPlot to the rest of the

code base, we decided to replace its prototypal inheritance with the inheritance system of

TypeScript. It is to note that TypeScript’s class system only hides the prototypal inheritance,

because it is transpiled to exact this prototypal inheritance.

CanvasPlot was originally developed to be included in a script tag in an HTML response.

That is why its code was bundled into one single ﬁle, which should be changed as well. We

propose to split the code into two ﬁles based on the inheritance structure of CanvasPlot.

One main ﬁle should contain the base class of CanvasPlot, while other ﬁles contain the

different visual representations. Thus, the lines of code in the main ﬁle can be decreased,

while the readability increases.

Chapter 5

Implementation

One major goal of this thesis is to integrate the approach described in Chapter 4 into

the Titan Control Center. Therefore, we will present the implementation of our solution

in a bottom-up direction. First, the migration of CanvasPlot to TypeScript is described

in Section 5.1. After this, the approach for a scalable and interactive visualization from

Section 4.2 is implemented in Section 5.2.

5.1 Migration of CanvasPlot to TypeScript

The approach for migrating CanvasPlot to TypeScript id described in Section 4.3. It should

use the inheritance system of TypeScript and be split into two ﬁles for a better readability.

Therefore, three steps of modiﬁcations are necessary, which are described in the following

sections.

5.1.1 Split CanvasPlot into Two Files

The ﬁrst step of migrating CanvasPlot to TypeScript is to split its code base into two

separate ﬁles. This increases the readability and thus mitigates the probability of errors in

the following refactoring process. In this step, we also introduce TypeScript ﬁles, because

the CanvasPlot library was originally written in JavaScript. Also, the

CanvasDataPlot

has to

be imported in the ﬁle containing the CanvasTimeSeriesPlot.

The ﬁrst ﬁle contains the

CanvasDataPlot

function, while the second one contains the

CanvasTimeSeriesPlot

. Not being required for our solution, the

CanvasVectorSeriesPlot

and CanvasDataPlotGroup functions are deleted.

5.1.2 Introduce TypeScript’s Class Inheritance for CanvasPlot

Introducing TypeScript’s class inheritance is the second step in the process of migrating

CanvasPlot to TypeScript. First, the

CanvasDataPlot

function is turned into a class. Therefore,

a class wrapper is built, the variables of

CanvasDataPlot

are turned into class variables and

the function itself is adapted to serve as a constructor for its class. The same is applied to

the CanvasTimeSeriesPlot function.

5. Implementation

After this, the prototype functions of

CanvasDataPlot

and

CanvasTimeSeriesPlot

are

migrated into class functions of the according class. Finally, the

CanvasTimeSeriesPlot

set to inherit from CanvasDataPlot.

5.1.3 Add Types to CanvasPlot

The main reason to use TypeScript is its sophisticated type system. It provides conﬁdence

in developing in larger code bases by checking type safety at compile time (see Section 2.6).

Providing type safety, the TypeScript transpiler will not compile the project before we add

types to our created classes.

Therefore, types are added to every class variable and to every function parameter.

Additionally, a new type

DataPoint

is introduced in order to describe the data format

CanvasPlot uses for its data points.

5.2 Implementation of a Scalable and Interactive Solution

We present an approach for a scalable and interactive solution in Section 4.2. After migrating

to TypeScript in Section 5.1, we describe the implementation of the approach in this section.

Therefore, the proposed class structure is implemented in Section 5.2.1. Then, a manager

class responsible for fetching the data from the Titan Control Center History service is

implemented in Section 5.2.2. In order to enable dynamic data loading, the necessary zoom

handler is described in Section 5.2.3. Lastly, the implementation of the caching algorithms

is detailed in Section 5.2.4.

5.2.1 Implementation of the Class and File Structure

In Section 4.2 we describe the architectural approach for our solution. The implementation

of the proposed class structure is very similar. As stated in the approach, the

ComparisonPlot

remains unchanged. The

SensorHistoryPlot

however only serves as a visual container for

CanvasPlot and loses its internal logic to manage and update the data in CanvasPlot. This

functionality is overtaken by the

TimeSeriesPlotManager

, which is instantiated and assigned

to the CanvasPlot by SensorHistoryPlot.

The

TimeSeriesPlotManager

uses multiple other classes. Firstly, it holds a read only

reference to a

MultiResolutionData

instance, which handles the data storage for different

temporal resolutions. It secondly stores a

DownloadManager

instance, which is responsible

for the communication with the backend. The

DownloadManager

is described in detail in

Section 5.2.2. Moreover, the

DataPoint

class is used to represent data points consisting of a

date and a value. A

TimeDomain

class serves as an abstract representation of the x domain

of CanvasPlot. This structure is visualized in Figure 5.1.

The

TimeSeriesPlotManager

and its used classes are bundled in a dedicated directory.

An index ﬁle in this folder exports the

TimeSeriesPlotManager

class and the

DataPoint

class.

5.2. Implementation of a Scalable and Interactive Solution

SensorHistoryPlot

-handleZoom()

-updateRealTimeData()

-determineResolutionLevel()

-injectDataPoints()

-updateDomains()

TimeSeriesPlotManager

+getDataPoints(resolutionLevel)

+setDataPoints(resolutionLevel, dataPoints)

+injectDataPoints(resolutionLevel, dataPoints)

+getUncachedIntervals(resolutionLevel, start, end)

MultiResolutionData

+fetchNewData(resolutionLevel, from, to)

-fetchNewWindowedData(resolutionLevel, from, to)

-fetchNewRawData(from, to)

DownloadManager

+toArray()

DataPoint

-start : integer

-end : integer

+getLength()

+toArray()

+toDateArray()

+shift(timeInMs : integer)

+of(domain : TimeInterval)

TimeDomain

-dataPoints : DataPoint[]

-cachedIntervals : TimeInterval[]

+getDataPoints()

+getUncachedIntervals()

+setDataPoints()

+injectDataPoints()

DataSet

-start : integer

-end : integer

TimeInterval

CanvasPlot

0..*

1..*

provide data

manages

<<use>>

downloads with

<<use>>

Figure 5.1. Excerpt of the implemented class structure

Only these two classes are used by other components. Index ﬁles enable developers to

import components from the directory the ﬁle is in rather than from the actual component

ﬁle. Thus, maintainers of this code do not have to know about the ﬁle structure within the

TimeSeriesPlotManager directory. This supports encapsulation in our implementation.

5.2.2 Implementation of the DownloadManager

The

DownloadManager

plays a major role in our solution. It is responsible for the connection

to the backend in order to fetch data points to display. As described in Section 5.2.1, the

DownloadManager is used by the TimeSeriesPlotManager.

The class only provides one public function which is

fetchNewData

. As arguments it

takes the temporal resolution level to fetch and the time period to fetch the data for.

Based on the resolution level, the function calls either the

fetchNewRawData

function or the

fetchNewWindowedData

function. Windowed data is the term for temporal aggregated data

used in the backend. Both functions take the time period to fetch data for as an argument

and return promises containing the requested data points. Promises are JavaScript’s way of

resolving asynchronous calls.

The

fetchNewWindowedData

function additionally takes the desired resolution level as an

argument. It then determines the corresponding backend endpoint for its request using a

5. Implementation

simple dictionary, which maps the numerical resolution level to URL strings.

For the

fetchNewRawData

requests the endpoint additionally differs depending on

whether the power consumption of a single device or a group of devices is displayed. In the

current implementation, ﬁve different endpoints are used in total by the

DownloadManager

There are three resolution levels supported, which results in three different windowed

endpoints. On these endpoints, there is no separation whether single device or a device

group is displayed. Additionally, two endpoints for the raw data are supported. Those are

the endpoints for separate single devices and device groups.

5.2.3 Implementation of the Zoom Handler

The zoom handler is a key part of our solution. Despite the name, it is not only called on

zoom actions in the chart, but on pan actions as well. Thus, it can react on changed time

periods being displayed in CanvasPlot. When the

TimeSeriesPlotManager

is instantiated

and assigned to a CanvasPlot, it sets the

onZoom

callback function of CanvasPlot to its own

zoom handler function.

Because a simple pan action would cause the zoom handler to be invoked once per

frame, the function is debounced. Debouncing is a technique to limit function calls by not

calling the original function, until a certain time passed after the last event. The currently

implemented debounce time is 100 milliseconds. This prevents the zoom handler from

being called more than once per user interaction.

When the zoom handler is called, it ﬁrst determines the x domain of the current view

in CanvasPlot. It then retrieves the start and the end of the displayed time period and

calculates the span of it. By the span of the x domain, it determines the resolution to fetch.

In the current implementation, three resolution levels are supported. Up to a span of 15

minutes, the zoom handler calls the

DownloadManager

to fetch raw data. If less than 11 hours

are displayed, 15-minute windows are requested. Otherwise, hourly aggregated data is

fetched.

After the resolution level has been determined, the

DownloadManager

is called to fetch

the data. Therefore, the time period to fetch data for is tripled as a simple prefetching

algorithm. When the data is received, CanvasPlot is called to display the new data points.

5.2.4 Caching the Data

In Section 5.2.2 we presented the

DownloadManager

, which is called on pan and zoom actions

by the zoom handler from Section 5.2.3. It is now to determine, how the data points retrieved

from the backend are added to the cache. Therefore, the

injectDataPoints

function is called

when data points are received from the backend. The function takes an array of the new

data points and the requested resolution level as arguments. Responsible for storing the

data points is an

MultiResolutionData

class instance, which

TimeSeriesPlotManager

stores

a reference to.

5.2. Implementation of a Scalable and Interactive Solution

1function inject (existingData: [Date, number][], toInject: [Date, number][]) {

2existingCounter, injectCounter = 0;

3resultArray = [];

5while (<resultArray not filled with all data points>) {

6existingDate = existingData[existingCounter] OR infinity;

7injectDate = toInject[injectCounter] OR infinity;

9if (existingDate < injectDate) {

10 resultArray.append(existingData[existingCounter++]);

11 }else {

12 resultArray.append(toInject[injectCounter++]);

13 }

14 }

15 return resultArray;

16 }

Listing 5.1. Pseudocode of the inject function

An instance of this data class again stores an array of

DataSet

instances. This

DataSet

represents all stored data points for a single resolution level. Hence, the multiple data sets

stored by MultiResolutionData represent data sets for different resolution levels.

The DataSet provides functions to retrieve the stored data points, to set them to a new

array of data points, and to inject data points into the existing array. Whenever data points

are retrieved from the backend, the

MultiResolutionData

is called to inject the data points

to the data set with the same resolution level. Thus, we build a multi-dimensional caching

system representing the different resolution levels.

The injection of data points is done by a separate, pure function. Pure functions are those

that do not produce observable side-effects [Nicolay et al. 2015]. We chose to implement a

pure function here in order to ease testing, which is further described in Section 5.3.

This

inject

function takes two arrays of data points as arguments as shown in Listing 5.1.

It then merges the arrays by providing two counters iterating over the arrays and choosing

the older data point to add it to the new array. The resulting array is returned and stored

by the data set.

In addition to storing the data points, the cached time periods are tracked as well.

Therefore, the data set holds an array of time periods representing those cached periods.

After new data points are injected, the

injectInterval

function is called with the stored

cached periods as well as the period of the new data points. This function again returns an

array of non-overlapping time periods representing the cached periods.

5. Implementation

5.2.5 Retrieving Data Points from the Cache

Caching data is useless without retrieving it afterwards to speed up requests and decrease

load on the server. The caching is detailed in Section 5.2.4. Here, we present how the data

retrieval is implemented in our solution for the Titan Control Center.

Whenever the

DownloadManager

is invoked to fetch new data points, it ﬁrst calls the

getUncachedIntervals

function on the data set. This function expects the time period to

fetch data for as an argument. Based on the periods that have been cached before, it

returns an array of intervals which represent yet uncached time periods. Afterwards, the

DownloadManager

issues GET requests only for the missing data to the backend. Thus, data

is not fetched twice, which supports smooth interactions in CanvasPlot.

Because there are multiple uncached time periods possible for a single request, one

DownloadManager

call may result in multiple GET requests to the backend. All requests are

promised (see Section 5.2.2) and merged into one array of data points after they resolve.

5.3 Integration of Jest as a Test Framework

Section 5.1 and Section 5.2 describe the implementation of a scalable and interactive

real-time visualization for the Titan Control Center. While implementing this, we took

advantage of a testing framework called Jest, in order to provide a tested and thus reliable

solution.

Jest is an open-source framework built at Facebook for testing JavaScript applications

[Moroz 2019]. It consists of a test runner, an assertion library, and a mocking library.

Additionally, Jest is shipped with a test-coverage tool. Prior to our solution, no unit

tests were used in the Titan Control Center Frontend. Tests that aim at monitoring the

dependencies and static code analysis are conducted within the continuous integration

process of the Titan Control Center Frontend [Latte et al. 2019].

In our implementation, Jest is not linked to any continuous integration pipeline, but it

can be implemented easily in future work. This means that the test suites only run when

started manually, which can be done by using a newly added script and typing

npm test

In watch mode, Jest monitors ﬁle changes and re-runs the related tests accordingly. This

mode can be invoked with npm run test:watch.

We used Jest in our implementation for a test-driven development regarding the

two most complicated operations related to the

TimeSeriesPlotManager

. The ﬁrst tested

function is

injectInterval

mentioned in Section 5.2.4. The second function is called

invertedIntervalIntersections

and is used for calculating the uncached time periods

in Section 5.2.5. Both functions and their tests will be presented in the following sections.

5.3. Integration of Jest as a Test Framework

5.3.1 Testing the Injection of Non-Overlapping Intervals

We introduce the

injectIntervals

function in Section 5.2.4. It is a pure function that helps

to track cached time periods. Two parameters are expected. The ﬁrst is an array of time

periods representing the currently cached time periods. The second parameter expects a

time period representing the newly cached data. Serving as a method to track the cached

time periods, its goal is to return an array of non-overlapping time periods consisting of

the prior cached periods merged with the new period.

Five test scenarios have been identiﬁed. In all scenarios, two cached periods exist before

merging. The details of the period to be merged are determined by the speciﬁc scenario.

The new period lays before or after the existing periods, in between them, or it spans over

them depending on the test case. Thus, the

injectIntervals

function is tested with the

most common inputs.

5.3.2 Testing the Calculation of Uncached Time Periods

In Section 5.2.5 we describe, how data is retrieved from the cache of our solution. The

DownloadManager

calls

getUncachedIntervals

of the data set, which returns those time peri-

ods that no data is yet cached for. Calculating these uncached time periods (or intervals)

means inverting the cached intervals in order to retrieve the uncached intervals.

The function that inverts the cached intervals is called

invertedIntervalIntersections

because it inverts the intersections of the intervals or time periods. It was developed using

a test-driven approach, which includes evaluating test scenarios ﬁrst. For this function,

six test scenarios have been identiﬁed. As in Section 5.3.1, two cached periods are stored

and the details of the period to be requested are determined by the speciﬁc scenario. The

requested period lays before or after the existing periods, in between them, or it spans

over them depending on the test case. Thus, the

invertedIntervalIntersections

function

is tested with the most common inputs.

Chapter 6

Evaluation

In Chapter 4 we evaluate an approach for the goals of this thesis, whereafter we describe

the implementation in Chapter 5. To support the scalability goal G3, the solution and its

performance must be assessed. This is the goal of the following chapter.

Section 6.1 introduces the common experimental setup for our analyses. Section 6.2 and

Section 6.3 will then conduct feasibility and performance evaluations respectively.

6.1 Evaluation Environment

The tests in the following sections are all conducted in the same environment, speciﬁed in

Table 6.1.

We start the Titan Control Center Frontend from the local webpack development server.

The used backend services are provided by a local docker-compose cluster [Sochat 2019].

In the feasibility evaluation, sample data is used to represent power consumption data of

an industrial environment. It is accessible as temporal non-aggregated data having one

value every 10 seconds. Additionally, minutely and hourly aggregated data is provided.

In the performance evaluation, power consumption data of a newspaper printing

company located in Kiel, Germany is used. It spans over more than 4 years and is addition-

ally aggregated into hourly and daily data. The underlying data has a resolution of one

measurement per 15 minutes which we upsampled to use minutely data.

6.2 Evaluation of Feasibility

The goal of this section is to evaluate whether the requirements described in Section 4.1 are

fulﬁlled. We provide test scenarios in Section 6.2.1, where each consists of integration tests.

Those scenarios are simulated in Section 6.2.2. Afterwards, possible threats to the validity

of our results are described in Section 6.2.3.

6.2.1 Methodology

In order to test the feasibility of our approach, we conduct analyses on our implementation

based on use cases. We call them scenarios in the following. A scenario consists of one or

6. Evaluation

Table 6.1. Evaluation environment

Element Speciﬁcation

CPU Intel(R) Core(TM) i7-7700HQ

Clock Rate 2.80 GHz

Cores 4 (8 virtual)

RAM 16 GB

OS Microsoft Windows 10 Education

OS Version 10.0.18363 Build 18363

Web Browser Google Chrome Version 85.0.4183.83 (Ofﬁcial Build) (64-bit)

more subsequent user interactions, which represent typical actions in analyzing a time

series chart.

The scenarios can be split in two main parts according to the requirements described

in Section 4.1. Scenarios 1 to 3 aim at evaluating the features that existed before our

solution was implemented and were to be maintained. Scenarios 4 to 6 shall prove that the

newly implemented features fulﬁll their requirements. We assume that the Control Center

Frontend is loaded ﬁrst in every of the following scenarios and the initial data points are

fetched as well.

Scenario 1: Comparing Power Consumption Data of More Than One Device

The ﬁrst scenario handles a feature, which was implemented before and has to be main-

tained by our solution. In this scenario, we use the comparison plot to compare the power

consumption of multiple devices.

To assure that graphs can be compared for multiple single devices as well as for multiple

groups of devices, the comparison is done for one single device and one device group.

Scenario 2: Interact With the Plot

In the second scenario, the interactivity of the chart is tested. Maintaining support for zoom

and pan actions is another requirement evaluated in Section 4.1. In this scenario, the user

conducts zoom and pan actions on the chart.

We therefore navigate to the landing page of the Titan Control Center. On the upcoming

chart displaying the total power consumption of the system, zoom and pan actions are

conducted.

6.2. Evaluation of Feasibility

Scenario 3: Analyze Real-Time Power Consumption Data

In order to fulﬁll goal G2 (see Section 1.2.2), this scenario evaluates the real-time analysis

capabilities of our solution. Therefore, we navigate to the landing page of the Titan Control

Center. We then zoom into the chart displaying the total power consumption of the system,

until the plot uses the highest-possible temporal resolution. Afterwards, we pan the chart

to move the current time into view.

The plot should now move with the time passing, in order to always show the latest

data.

Scenario 4: Panning Through the Data Inﬁnitely

Scenario 4 is the ﬁrst scenario to prove the feasibility of the newly implemented features

as described in Section 4.1. To simulate this scenario, we conduct the same steps as in

scenario 3. We then pan through the data backwards in time for at least more than 2 hours

in displayed time.

This was not possible before the implementation of our solution as described in Sec-

tion 2.2.2.

Scenario 5: Analyzing Data in Multiple Temporal Resolutions

In this scenario, we analyze the power consumption data in multiple temporal resolutions.

We therefore navigate to the landing page in order to view the total power consumption

plot of the system.

The graph should display the data in the second of three different temporal resolution

levels. We then zoom into the chart in order to view the highest temporal resolution and

zoom out to see the lowest resolution subsequently.

Scenario 6: Use Prefetching to Enhance User Experience

Prefetching is a technique we described in Section 5.2.3. Scenario 6 aims at testing the

feasibility of this approach in our solution. Therefore, scenario 4 is simulated again,

including the panning backwards in time. As long as the pan span does not exceed the

width of the chart, no areas without data points should appear. This would implicate that

data of a full chart length is prefetched.

6.2.2 Results and Discussion

The following section describes which observations we make while simulating the scenarios

from Section 6.2.1. It also discusses whether those observations match our expectations for

the given scenario.

6. Evaluation

Scenario 1: Comparing Power Consumption Data of More Than One Device

In order to simulate scenario 1, we navigate to the comparison page and click on Add Plot.

As comparands, we choose the root device group My Company and a single device called

Server 1.

Graphs for both data sets are displayed properly. Panning and zooming is supported as

well. This is the behaviour we expected for scenario 1. We conclude that comparing power

consumption data of more than one device is working as expected. The design did not

change, hence the screenshot in Figure 2.5 is still valid for the comparison plot.

Scenario 2: Interact With the Plot

To simulate scenario 2, we navigate to the landing page in order to view the total power

consumption chart. We then zoom into the chart using the mouse wheel, pan left, pan right,

and zoom out again.

The chart follows all of the four interactions. While zooming, it even changes the

resolution level, when a resolution limit was exceeded. This should be tested in scenario 5,

but does not corrupt scenario 2. Hence we conclude that we successfully maintained the

interactivity of CanvasPlot.

Scenario 3: Analyze Real-Time Power Consumption Data

Analyzing real-time data is a key requirement of this thesis presented in Section 4.1. The

steps to be taken to simulate scenario 3 are described in Section 6.2.1.

The chart moves itself with the time passing, meaning that the latest data is always

displayed. This is the behaviour we expected.

Scenario 4: Panning Through the Data Inﬁnitely

In order to simulate scenario 4, we navigate to the total power consumption chart again.

We then pan through the data backwards in time.

Data is fetched and displayed dynamically which results in a limitless panning back-

wards in time. It is only restricted by the beginning of data recorded. We conclude that

inﬁnitely panning through the data is made possible by our solution.

Scenario 5: Analyzing Data in Multiple Temporal Resolutions

As in the scenarios before, we simulate this scenario using the total power consumption

chart on the landing page. Initially, the chart loads the second of three resolution levels. By

zooming in, we retrieve the highest-available temporal resolution. Zooming out gives us

the lowest resolution level, which currently is hourly aggregated. We hence conclude that

analyzing data in multiple temporal resolutions is supported by our solution.

6.2. Evaluation of Feasibility

Figure 6.1. Screenshots from scenario 5: non-aggregated and minutely data

Figure 6.1 shows screenshots of scenario 5. After zooming in, the chart displays temporal

non-aggregated data as shown in the left screenshot. On the right, minutely data is shown.

Zooming out further would result in displaying hourly data.

Scenario 6: Use Prefetching to Enhance User Experience

Scenario 6 aims at proving the feasibility of prefetching in our solution. We therefore

simulate scenario 4 again, including panning through the data.

As long as the panning span of one single pan action does not exceed the width of the

chart, no empty areas are displayed. This behaviour is limited by the frequency of pan

actions, the server latency, and by the download bandwidth. When multiple pan actions

are conducted before the prefetched data arrives, empty areas will occur. Despite the limits,

we classify this scenario as feasible.

6.2.3 Threats to Validity

For the evaluation of feasibility in this section we only used the environment described in

Section 6.1. This includes the hardware environment as well as the software components

and especially the Chrome browser. As a consequence, our results are only conditionally

transferable to other environments since those can differ in their behaviour. Especially, the

visualization was not tested in arbitrary web browsers. In general, our evaluation only

comprises central scenarios and does not cover all speciﬁc edge cases.

6. Evaluation

6.3 Evaluation of Performance

After evaluating the feasibility of our solution in Section 6.2, the next section aims at

evaluating the performance and scalability of our solution. In Section 6.3.1 we describe the

methodology of our evaluation, followed by the presentation of the results in Section 6.3.2.

Those results are then discussed in Section 6.3.3. Section 6.3.4 assesses possible threats to

the validity of the evaluation.

6.3.1 Methodology

For the performance evaluation, the experimental setup is the same as in the feasibility

evaluation. It is described in Section 6.1. However, the data set for this evaluation is different

than the one described in other chapters of this thesis. In order to evaluate scalability and

performance, we needed more data to test our solution with. Hence, power consumption

data of a single device monitored from June 2015 to August 2019 with a temporal resolution

of 15 minutes is used. We upsample the data to one measurement value per minute, thus

our sample set contains 2,635,200 data points. Hourly and daily data is provided by

temporal aggregations as well.

In our evaluation, the performance of the current implementation of the Titan Control is

measured, as well as the performance of our new solution. We therefore chose to compare

the load times for the chart with different amounts of data. Both versions are tested with

data of 2

, 2

, ..., 2

days. For each time span, the chart is loaded 100 times. Then, the

processing time is measured and stored. This measurement contains preparing the fetch

request, the response time of the backend services, the handling of the response, and the

time to draw the resulting plot. Since both the frontend and the backend services run

locally, the response time is not affected by delays in external networks.

We modiﬁed both of the frontend versions to automatically reload the chart and

to double the requested time period after 100 measurements. They also calculate the

processing time themselves by storing two timestamps. The ﬁrst is set to when the fetch is

prepared and the second timestamp stores the current time when the chart is fully drawn.

Then, the difference is taken as the ﬁnal measurement. Those measurements are then sent

to a server speciﬁcally created to receive these values and to store them in CSV ﬁles for

later analyses.

6.3.2 Results

The measurements of the evaluation are plotted in Figure 6.2 and Figure 6.3, displaying

the results of the current implementation and the new implementation respectively. These

boxplot diagrams show the processing time in milliseconds for fetching and drawing data

for the given amount of days. Each box represents the measured processing times lying

between the ﬁrst and the third quartile. Thus, it represents 50% of the sample. Moreover,

the vertical lines originating from the boxes called whiskers indicate the total range of

6.3. Evaluation of Performance

100

1000

10000

212223242526272829210

Days of displayed data

Processing time in ms

Figure 6.2. Time to fetch and draw data for given number of days - current implementation

the sample. The upper end of the lines hence marks the maximum value of the sample,

while the lower end marks the minimum value. Points above the upper whisker represent

statistical outliers. Values are considered outliers, if they are greater than the 3rd quartile

plus 1.5 times the range between the second and third quartile.

It is to note that we use logarithmic scales in the charts, in order to enhance the

visualization of exponentially growing time spans to fetch data for. While the x axis uses a

base-two logarithmic scale, the y axis opts for a factor of ten.

Figure 6.2 visualizes the results for the current version. It shows that the processing

times increases when the number of days to display data for grows. While it takes less than

100 milliseconds to fetch and draw data points for 4 days and less, the same process takes

more than 10 seconds for 1024 days. We can presume that the growth of the processing

time is linear to the days of displayed data.

The results of the new solution, however, do not show linear growth, which is shown in

Figure 6.3. For 2, 4, and 8 days of displayed data, the median processing time lies around 20

milliseconds. Processing 16, 32, and 64 days of data takes around 14 milliseconds. For more

days to display, the processing time then rises until it reaches a median of 32 milliseconds

when displaying 1024 days.

6. Evaluation

100

1000

212223242526272829210

Days of displayed data

Processing time in ms

Figure 6.3. Time to fetch and draw data for given number of days - new implementation

6.3.3 Discussion

The performance results described in Section 6.3.2 for the current implementation and

our new solution both meet our expectations. Because the current version does not fetch

temporal aggregated data, the linear growth of processing time is reasonable. Two time

periods result in the same processing time, if the time span is equal. Hence, the processing

time grows according to the time period.

The results from the new solution shown in Figure 6.3 can also be explained by the

underlying approach. For the performance evaluation, the default resolution level for the

displayed data was changed in order to handle the hourly and daily data. Therefore, the

highest available resolution is only requested if the time span to display is less than one

day. If data for more than one day is fetched but for less than two weeks, than hourly data

is requested. Otherwise, daily data is downloaded from the backend services. This change

assures that only a reasonable amount of data is fetched and displayed.

This switching between resolutions explains why the measurements are smaller than

the ones from the current implementation. Minutely data as used by the current version

is never fetched by the new solution, because we always measure the processing time for

more than one day. The decrease between 8 and 16 days of displayed data is triggered by

6.3. Evaluation of Performance

the change from hourly to daily data, because data for more than two weeks is fetched.

Since no resolution level greater than daily data is provided, the processing time for more

days then rises as expected.

The decrease of processing time from 2

to 2

, however, is surprising, because there is

no possible change in resolution levels. This can be caused by measurement error, which

again could be caused by multiple reasons. Those reasons are described in the following

section.

6.3.4 Threats to Validity

The environment we run the performance evaluation in is not speciﬁcally designed to

support a microservice architecture like the Titan Control Center. Moreover, all components

of the Control Center are executed on the same physical machine. Those components hence

compete for computational resources on this machine. Additionally, the evaluation is run

in a desktop environment, which adds even more processes to run in parallel to our test

setup. These reasons can lead to measurement errors in our evaluation.

Chapter 7

Related Work

The solution in this thesis is built upon the Titan Control Center introduced by Henning

[2018]. It is responsible for monitoring the electrical power consumption within the Titan

platform (see Section 2.2). Our solution also uses CanvasPlot by Johanson et al. [2016]

for visualizing the power-consumption using line charts. CanvasPlot itself wraps the

open-source data visualization framework D3 by Bostock [2020]. The Titan Control Center,

CanvasPlot, and D3 are introduced in Chapter 2.

The information arising from monitoring industrial environments using the Control

Center is a key component to support the Industrial DevOps approach proposed by Has-

selbring et al. [2019]. A coherent circle consisting of monitoring and analyzing, identifying

requirements, adapting the system, and automatic testing before deploying the software

into production is introduced by the authors of the paper. Additionally, a concept for an

organizational structure is presented. It shall support quick adjustments to the system by

encouraging people from different domains to work together.

In Chapter 3 we evaluate existing approaches for scalable and interactive time series

visualization in order to identify a base approach for our solution. Thus, four related

approaches are presented in Section 3.2. LiveRAC, ATLAS, ForeCache, and CanvasPlot are

visualization libraries that all aim at supporting large sets of time series data.

LiveRAC is introduced in Section 3.2.1. It is a tool providing visualizations for system

management time series data [McLachlan et al. 2008]. The strengths of this library is its use

of semantic zooming which means not to zoom proportionally but adding or removing

properties of the visual representation. This technique helps balancing the information

density on a well-deﬁned level. A US patent for semantic zoom ﬁled is abandoned but

still assigned to Microsoft Technology Licensing LLC [Pittappily 2013]. In contrast to our

approach, LiveRAC does not implement any caching algorithms or other techniques to

minimize data exchange volume. Moreover, it does not support pan actions in its charts.

ATLAS however is not built for a speciﬁc type of time series data [Chan et al. 2008].

Its goal is to maintain interactivity of time series plots while displaying large data sets. In

detail, the authors of the corresponding paper deﬁne interactivity in supporting smooth

interactions like panning and zooming. In our thesis, we adopted their goal in conjunction

with its deﬁnition.

While we focus on presenting a frontend meeting our requirements, ATLAS additionally

sets up a high-performance server architecture. The system consists of the frontend itself, a

query distribution server acting as a load balancer, and a kdb+ database speciﬁcally built

7. Related Work

for time series data. ATLAS introduces the concept of predictive caching which hides the

latency of the system by prefetching data. It therefore anticipates which data might be

viewed after the next user interaction and then requests it from the database. For this thesis,

we adopted this concept to enhance the interactivity of our solution.

ForeCache as the third of the evaluated approaches adds the concept of using multiple

prediction models in the prefetching process [Battle et al. 2014]. Those models all make

assumptions about which data is viewed next and thus deﬁne which data is prefetched ﬁrst.

For example, one model assumes that a user action like panning will continue rather than

stop immediately and thus proposes to prefetch the data that lies in the direction of the pan

action. Another model works with recorded user sessions to identify so-called hotspots in

the data which are more likely to be viewed than other spots. For our implementation we

implemented the ﬁrst model only, which proves sufﬁcient in the performance evaluation in

Section 6.3. In contrast to the other evaluated libraries and our approach, ForeCache does

not display time series data but spatial data instead. Its approach to prediction models,

however, can be valuable for time series data as well.

Chapter 8

Conclusions and Future Work

8.1 Conclusions

In this thesis we presented an approach for scalable and interactive visualizations of time

series data. We therefore ﬁrst explained our motivation for this ﬁeld of work and identiﬁed

the goals for this thesis. Thereafter, we introduced the foundations and technologies that

were used in our solution. In order to build our implementation on a reliable base approach,

we assessed four visualization libraries based on framework and conceptual criteria. In this

evaluation, we identiﬁed CanvasPlot as such a base library for our solution.

We then constructed several requirements which our implementation had to meet in

order to support all goals described in the ﬁrst chapter. Based on them, we introduced

our approach for an interactive and scalable visualization of time series data. It consists of

using CanvasPlot as a base solution, maintaining its features like real-time capability and

support for zoom and pan actions, and adding more abilities. These include the support

for temporal aggregated data, caching, and prefetching. We also presented an approach for

migrating CanvasPlot to TypeScript.

After having ﬁnalized the approach, we began with the implementation by migrating to

TypeScript. Then, we introduced the new class and ﬁle structure within the Titan Control

Center Frontend to meet the requirements. A download manager was implemented to

fetch data from the Control Center backend services. The zoom handler was introduced in

order to react on zoom and pan actions conducted by the user. To meet the scalability goal

of this thesis, we also integrated caching algorithms.

Following the implementation and integration into the frontend, an evaluation of our

solution was conducted. We therefore evaluated its feasibility in six scenarios which all

showed positive results. Then, we tested the performance of the current version of the

frontend and our new approach. Therefore, we measured the time to fetch and draw data

points for numerous time spans, which clearly showed an advantage of the new solution.

We provide our implementation as a package [Koch 2020] that contains the Titan Control

Center Frontend including our visualization solution from Chapter 5. It also includes the

versions tested in the performance evaluation, as well as the used log server. In order to

provide backend services for those versions, we append a docker-compose ﬁle to start all

necessary Titan Control Center services.

8. Conclusions and Future Work

8.2 Future Work

During our work on this thesis, we identiﬁed multiple aspects that can be considered in

future work. These aspects can be classiﬁed by their proximity to our solution and the

Titan research project.

8.2.1 Future Work within our Solution

Our solution meets all the goals we deﬁne in Section 1.2. However, the evaluated approaches

encourage to go further. Especially ForeCache’s use of multiple prediction models to

determine which data is prefetched could be interesting for future enhancements of our

approach [Battle et al. 2014]. As described in Section 3.2.3, it predicts the data to be viewed

next by consulting ﬁve different prediction models.

Most of them aim at spatial data instead of time series data, but the so-called hotspot

model could yield interesting results in our solution too. It works by recording user sessions

and storing the most often displayed data. Then it assumes that those hotspots will be

viewed more often in future user sessions and will prioritize them in the prefetching process.

Future work can integrate this feature in order to enhance the prefetching proposed in this

thesis.

8.2.2 Future Work within the Titan Control Center

The evaluation in Chapter 6 shows that the scalability of our approach depends on the

temporal resolution levels of the time series data provided by the backend services. This

leads to the need of more such endpoints when the data set grows in order to maintain the

level of performance observed in this thesis.

In the current and in the new implementation, CanvasPlot repeatedly requests the

latest data from the backend services in constant intervals. Thus, the visualization latency

includes the time until the next logical tick of the timer when the next data fetch is

conducted. This latency hence can be diminished by introducing an implementation of the

WebSocket protocol [I. Fette and A. Melnikov 2011]. Using this protocol allows server-sent

messages containing the latest data which could be sent immediately after the values are

received and processed by the backend services. After implementing this technique we can

expect a vastly decreased latency [Pimentel and Nickerson 2012].

8.2.3 Future Work within the Field of Scalable Visualizations

Outside of the Titan context, a scientiﬁc evaluation could be conducted to test the scalability

limits of a solution combining the approaches of ATLAS and ForeCache. From ATLAS one

could adopt its query distribution server and the high-performance database kdb+ which

is speciﬁcally designed to store large time series data sets. Adding ForeCache’s prediction

models could result in a system with astonishing performance benchmarks.

Bibliography

[Battle et al. 2014] L. Battle, R. Chang, and M. Stonebraker. Dynamic generation and

prefetching of data chunks for exploratory visualization. IEEE InfoVis Posters Track

(2014). (Cited on pages 16, 44, 46)

[Bierman et al. 2014] G. Bierman, M. Abadi, and M. Torgersen. Understanding TypeScript.

In: ECOOP 2014 – Object-Oriented Programming. 2014. (Cited on page 11)

[Boguhn 2020] L. Boguhn. Forecasting power consumption of manufacturing industries

using neural networks. Bachelor’s Thesis. Kiel University, Department of Computer

Science, 2020. (Cited on page 6)

[Bostock et al. 2011] M. Bostock, V. Ogievetsky, and J. Heer. D

data-driven documents.

IEEE Transactions on Visualization and Computer Graphics (2011). (Cited on page 10)

[Bostock 2020] M. Bostock. D3.js - data-driven documents. Library Catalog: d3js.org. url:

https://d3js.org/ (visited on 05/02/2020). (Cited on pages 10, 43)

[Chan et al. 2008] S.

M. Chan, L. Xiao, J. Gerth, and P. Hanrahan. Maintaining interactivity

while exploring massive time series. In: 2008 IEEE Symposium on Visual Analytics Science

and Technology. 2008. (Cited on pages 15, 17, 43)

[Ehrenstein 2019] S. Ehrenstein. Distributed sensor management for an industrial DevOps

monitoring platform. Bachelor’s Thesis. Kiel University, Department of Computer

Science, 2019. (Cited on page 10)

[Eshkevari et al. 2017] L. Eshkevari, D. Mazinanian, S. Rostami, and N. Tsantalis. JSDeodor-

ant: class-awareness for JavaScript programs. In: 2017 IEEE/ACM 39th International

Conference on Software Engineering Companion (ICSE-C). 2017. (Cited on page 10)

[Feldthaus and Møller 2014] A. Feldthaus and A. Møller. Checking correctness of TypeScript

interfaces for JavaScript libraries. In: Proceedings of the 2014 ACM International Conference

on Object Oriented Programming Systems Languages & Applications - OOPSLA ’14. 2014.

(Cited on page 11)

[Few 2006] S. Few. Information dashboard design: the effective visual communication of data.

2006. (Cited on page 1)

[Hasselbring et al. 2019] W. Hasselbring, S. Henning, B. Latte, A. Mobius, T. Richter, S.

Schalk, and M. Wojcieszak. Industrial DevOps. In: 2019 IEEE International Conference on

Software Architecture Companion (ICSA-C). 2019. (Cited on pages 1, 5, 43)

[Hasselbring and Steinacker 2017] W. Hasselbring and G. Steinacker. Microservice archi-

tectures for scalability, agility and reliability in e-commerce. In: 2017 IEEE International

Conference on Software Architecture Workshops (ICSAW). 2017. (Cited on page 6)

Bibliography

[Henning 2018] S. Henning. Prototype of a scalable monitoring infrastructure for industrial

DevOps. Master’s Thesis. Kiel University, Department of Computer Science, 2018.

(Cited on pages 6, 43)

[Henning et al. 2020] S. Henning, W. Hasselbring, H. Burmester, A. Möbius, and M. Woj-

cieszak. Goals and measures for analyzing power consumption data in manufacturing

enterprises. arXiv:2009.10369 [cs] (2020). (Cited on pages 6, 22)

[Henning et al. 2019] S. Henning, W. Hasselbring, and A. Mobius. A scalable architecture

for power consumption monitoring in industrial production environments. In: 2019

IEEE International Conference on Fog Computing (ICFC). 2019. (Cited on pages 1, 5, 7)

[I. Fette and A. Melnikov 2011] I. Fette and A. Melnikov. The WebSocket protocol. RFC 6455.

RFC Editor, 2011. (Cited on page 46)

[Johanson et al. 2016] A. Johanson, S. Flögel, C. Dullo, and W. Hasselbring. OceanTEA:

exploring ocean-derived climate data using microservices. In: Proceedings of the Sixth

International Workshop on Climate Informatics (CI 2016). 2016. (Cited on pages 10, 43)

[Kapadiya 2018] A. A. Kapadiya. Visualisation of multi-service system net-work with d3.js

& kdb+/q using websocket. Global Journal of Computer Science and Technology (2018).

(Cited on page 16)

[Koch 2020] T. Koch. Thesis Artifacts for: Scalable and Interactive Real-Time Visualization of

Time Series Data. 2020. url:https://doi.org/10.5281/zenodo.4041682. (Cited on page 45)

[Latte et al. 2019] B. Latte, S. Henning, and M. Wojcieszak. Clean code: on the use of

practices and tools to produce maintainable code for long-living software. Living Systems

(2019). (Cited on page 30)

[Lwakatare et al. 2015] L. E. Lwakatare, P. Kuvaja, and M. Oivo. Dimensions of DevOps. In:

Agile Processes in Software Engineering and Extreme Programming. 2015. (Cited on pages 1,

[McLachlan et al. 2008] P. McLachlan, T. Munzner, E. Koutsoﬁos, and S. North. LiveRAC:

interactive visual exploration of system management time-series data. In: Proceedings of

the SIGCHI Conference on Human Factors in Computing Systems. 2008. (Cited on pages 15,

43)

[Moroz 2019] B. Moroz. Test automation of a react-redux application with jest and enzyme.

Bachelor’s thesis. South-Eastern Finland University of Applied Sciences, 2019. (Cited

on page 30)

[Morrison 2010] J. P. Morrison. Flow-based programming, 2nd edition: a new approach to

application development. 2010. (Cited on page 5)

[Nicolay et al. 2015] J. Nicolay, C. Noguera, C. De Roover, and W. De Meuter. Detecting

function purity in JavaScript. In: 2015 IEEE 15th International Working Conference on

Source Code Analysis and Manipulation (SCAM). 2015. (Cited on page 29)

Bibliography

[Pimentel and Nickerson 2012] V. Pimentel and B. G. Nickerson. Communicating and

displaying real-time data with WebSocket. IEEE Internet Computing (2012). (Cited on

page 46)

[Pittappily 2013] T. Pittappily. Semantic zoom. U.S. patent 2013/0067398 A1. 2013. (Cited

on page 43)

[Saboury et al. 2017] A. Saboury, P. Musavi, F. Khomh, and G. Antoniol. An empirical

study of code smells in JavaScript projects. In: 2017 IEEE 24th International Conference on

Software Analysis, Evolution and Reengineering (SANER). 2017. (Cited on page 3)

[Sochat 2019] V. Sochat. Singularity compose: orchestration for singularity instances. Journal

of Open Source Software (2019). (Cited on page 33)

[You 2014] E. You. Vue.js. Vue.js. 2014. url:

https://vuejs.org/

(visited on 05/01/2020).

(Cited on pages 1, 9)

3 views·59 pages

Scalable and Interactive Real-Time Visualization of Time Series Data PDF Free Download

Scalable and Interactive Real-Time Visualization of Time Series Data PDF free Download. Think more deeply and widely.

Uploaded by alyons on 4/18/2026

/59

100%