Project Visual Cluster Monitoring

Prof. Dr. Bernd Fröhlich
Dr. rer. nat. Patrick Riehmann
M.Sc. Michael Völske
B.Sc. Janek Bevendorff
M.Sc. Dora Kiesel

15 Credits Medieninformatik (B.Sc.)
15 Credits Computer Science for Digital Media (M.Sc.)
15 Credits Computer Science and Media (M.Sc.)
15 Credits Human-Computer Interaction (M.Sc.)

Description:

Modern data processing and storage clusters consist of hundreds of individual nodes or computing devices. This means there are thousands of hardware components that may fail and impair the operation of the whole cluster. Monitoring all components is crucial, but it is even more important that critical failures do not get lost in the noise of regular status updates.
We aim to develop novel interactive visualization techniques for visually monitoring such large clusters, capable of presenting the specifics of thousands of hardware sensors and millions of log entries over time, both retrospectively and in real time. An appropriate depiction of such multivariate time series data provides general insights into the various dynamic aspects of operating large clusters and aids in the detection of outliers and failures.
We are going to build our views and visualizations on the open source monitoring framework Grafana (grafana.org). They will allow us to aggregate and depict, as well as interactively filter and explore, the monitoring information received from the computing and storage cluster of the Webis Group at our university, which consists of more than 5500 cores, 35 terabytes of memory and 4.5 petabytes of hard disk storage.

Assignments:

Active participation in the project, intermediate talks, final presentation