WAW 2020

Plenary Speakers

Cliff Joslyn

Pacific Northwest National Laboratory, Seattle, USA

Title: Introduction to Hypergraph Analytics and Computational Topology

Abstract: In many areas of science, from cyber to biology and ecology to social systems like bibliometrics, systems with complex networked interactions produce challenging large-scale data sets. Complex systems and network science researchers are increasingly turning to higher levels of mathematical abstraction in order to faithfully capture the properties of complex systems. Specifically, hypergraphs and abstract simplicial complexes (along with labeled, directed, attributed, and ordered versions of these) are increasingly popular to model multidimensional structures and high order interactions (i.e., more than binary relations and primary effects) frequently present in complex systems. As mathematical objects, these multi-dimensional graph structures stand poised to connect applied network science to computational topology. We can recognize traditional network science as a limiting case of a broader methodology, one which uses hypergraphs to natively represent the full complexity of the multi-way relationships present, thereby not being limited to the binary (two-place) relationships representable in graphs. While graphs dominate network analysis applications, they frequently do so at the expense of "cutting off" higher order, multidimensional interactions and dependences by limiting representations to "2-sections" and "primary effects". And as hypergrphs explicitly generalize graphs, so a burgeoning "hypernetwork science" is generalizing network science, including development of multi-dimensional versions of standard network methods like centrality, connectivity, modularity, and clustering coefficients.

The Pacific Northwest National Laboratory has been pursuing hypergraph analytics in a number of areas, including mathematics, methodology, applications, and software environments for both visualization and high performance computing. In this talk I will introduce the fundamental concepts and methods of hypernetwork science, in the context of our HyperNetX open source software capability. Scaling approaches and studies are described in the context of our Chapel Hypergraph Library (CHGL) for the high performance, data parallel programming language Chapel. Finally, we will describe another significant advantage of the hypergraph approach to complex systems analysis, namely its affordance of opportunities to analyze proper hypergraphs as topological objects, and thereby the potential for the use of the tools of higher order mathematics available in computational topology.

Bio: Dr. Cliff Joslyn is the Chief Scientist for Knowledge Sciences at the Pacific Northwest National Laboratory in Seattle, Washington, and an Adjunct Professor of Systems Science at Portland State University in Portland, Oregon. He was previously a Team Leader in the Computer Science Division at the Los Alamos National Laboratory, and an NRC Research Associate at NASA's Goddard Space Flight Center. He holds a PhD and MS in Systems Science from SUNY Binghamton, and a BS in Cognitive Science and Mathematics from Oberlin College, with High Honors in Cybernetics and the Science of Mind. For over twenty five years Dr. Joslyn has been a leader of research efforts for the US Government in information science, discrete mathematics, and computer science in a range of applications. His research has spanned from non-traditional uncertainty quantification to semantic information systems, developing novel methods from order theory and network science. Most recently he leads research efforts in computational topology and hypergraph analytics, with applications in information integration for computational biology, network defense, and open source analytics.

Web page: https://www.pnnl.gov/science/staff/staff_info.asp?staff_num=8847

Liudmila Prokhorenkova

Yandex Research and Moscow Institute of Physics and Technology, Moscow, Russia

Title: Some applications of graphs and probability theory to machine learning

Abstract: In the first half of the talk, I will discuss the problem of nearest neighbor search: given a dataset, the goal is to preprocess it in such a way that for an arbitrary forthcoming query we can quickly find its nearest neighbors in the dataset. Recently, graph-based approaches were shown to demonstrate superior performance over other types of algorithms for this task. However, there has been very little research on their theoretical guarantees. First, I will cover some popular graph-based approaches and then present our recent theoretical analysis of their performance. There are still plenty of theoretical questions to be addressed in this area.

In the second half of the talk, I will discuss validation measures for clustering. There are many cluster similarity indices used to evaluate clustering (and community detection) algorithms. Choosing a suitable index for a particular task is a fundamental problem: an inappropriate choice may result in suboptimal methods chosen and biased decisions made. I will discuss a theoretical approach to this problem: formally define a list of desirable properties and check which indices satisfy them. Among the desirable properties, there is a constant baseline requirement that is non-trivial and important but is violated by most of the widely used indices.

Bio: Liudmila Prokhorenkova (Ostroumova) received her PhD in mathematics at Lomonosov Moscow State University in 2015 with a thesis on random graph models of complex networks. Currently she is a senior researcher at Yandex Research and at Moscow Institute of Physics and Technology. Liudmila authored about 40 publications in mathematics and computer science. She also received an Outstanding Reviewer award at the conference WSDM-2018. Her research interests include random graphs and complex network analysis as well as theoretical aspects of machine learning, e.g., the analysis of gradient boosting algorithms such as CatBoost (open source gradient boosting library by Yandex).

Web page: https://research.yandex.com/people/6984

Katarzyna Rybarczyk

Adam Mickiewicz University, Poznan, Poland

Title: Random intersection graph models of complex networks

Abstract: In many networks similar objects have tendency to cluster together. It is believed that this tendency of connecting objects with similar features is responsible for many properties of complex networks. One of the models capturing it is the random intersection graph model. In random intersection graphs each of n network's nodes (vertices) selects randomly an attribute set from a given collection of m attributes and two nodes establish adjacency relation whenever they share a common attribute.

During the talk I want to discuss well known and latest results concerning random intersection graphs. I will stress those results which relate to complex networks analysis. I will discuss differences and similarities to other networks’ models. I will also mention results concerning degree distribution and clustering. Last but not least I will present latest results concerning random walks on such networks. In this context I will be interested in how much a natural clustering might influence the cover time of a random walk on networks.

Bio: Katarzyna Rybarczyk obtained her Master’s degrees in mathematics (from Adam Mickiewicz University in Poznań) and in architecture (from Poznań Technical University) in 2005. She received PhD in theoretical computer science in 2010 with a thesis concerning applications of random intersection graphs as models of complex networks. She is one of the pioneers of research in theory of random intersection graphs initiated by Karoński, Scheinerman and Singer-Cohen in `90. Some of her results concerning equivalence and connectivity of random intersection graphs are seen as classic in the field. Her research interests are far wider and include general theory of random graphs, graph algorithms, cryptography, complex networks, and mathematics in architecture.

Web page: http://kryba.home.amu.edu.pl

Francois Theberge

The Tutte Institute for Mathematics and Computing, Ottawa, Canada

Title: Advances in graph clustering: algorithms, evaluation and applications

Abstract: Graph clustering is a commonly used tool in network science with applications such as community finding, anomaly detection and visualization. In the first part of this talk, we give an overview of an Ensemble Clustering algorithm for Graphs (ECG), which is based on the modularity-optimizing Louvain algorithm. The Louvain algorithm is often one of the best performing algorithms in comparison studies, but it has a few known issues: it can be unstable, and it suffers from the resolution limit of modularity, so “ground-truth” communities are often merged. We show that ECG greatly reduces those issues and moreover, it also provides a measure of the strength of the communities it finds. Code for ECG is available for Python (igraph and networkx) and in the GPU-based RAPIDS suite from NVIDIA. Graph clustering is unsupervised, so comparing algorithms is often done via artificial benchmarks such as LFR and BTER, and using measures like NMI (normalized mutual information). Such measures are “graph-agnostic” since they only consider the clusters as sets, and the fact that those are groups of nodes on a graph is ignored. In the second part of the talk, we show why adjusted measures should be used, and introduce a new family of “graph-aware” measures. We demonstrate the complementarity of graph-agnostic and graph-aware measures via theoretical and empirical results, and use both in a large study comparing several graph clustering algorithms. We also revisit the popular LFR benchmark, proposing a similar but simpler model: ABCD (Artificial Benchmark for Community Detection). We show some advantages of ABCD including its speed, and a more natural interpretation of the “overall noise” parameter common to both LFR and ABCD. Finally, we show how graph clustering can be used along with a geometric variant of the Chung-Lu model to build a framework for comparing and ranking the quality of graph embeddings.

Bio: François Théberge holds a B.Sc. degree in applied mathematics and computer science from the University of Ottawa, a M.Sc. in telecommunications from INRS and a PhD. in electrical engineering from McGill University. He has been employed by CSE since 1996 during which he was involved in the creation of the data science team as well as the research group now known as the Tutte Institute for Mathematics and Computing. He also holds an adjunct professorial position in the Department of Mathematics and Statistics at the University of Ottawa. His current interests include relational-data mining and deep learning.