Télécom ParisTech

Actualité

Illustration de l'actualité : PhD defense Alexandre Hollocou : Novel Approaches to the Clustering of Large Graphs

PhD defense Alexandre Hollocou : Novel Approaches to the Clustering of Large Graphs

mercredi
19
décembre
2018

PhD Comics : I'm defending my thesis, Mom !

Jury

  • M. Marc LELARGE INRIA Directeur de these
  • M. Thomas BONALD Télécom ParisTech Co-directeur de these
  • M. Cheng-Shang CHANG National Tsing Hua University Rapporteur
  • M. Renaud LAMBIOTTE University of Oxford Rapporteur
  • Mme Florence D'ALCHÉ-BUC Télécom ParisTech Examinateur
  • M. Nicolas VAYATIS Ecole Normale Supérieure Paris-Saclay Examinateur
  • M. Laurent VIENNOT INRIA Examinateur

Abstract

Graphs are ubiquitous in many fields of research ranging from sociology to biology. A graph is a very simple mathematical structure that consists of a set of elements, called nodes, connected to each other by edges. It is yet able to represent complex systems such as protein-protein interaction or scientific collaborations. Graph clustering is a central problem in the analysis of graphs whose objective is to identify dense groups of nodes that are sparsely connected to the rest of the graph. These groups of nodes, called clusters, are fundamental to an in-depth understanding of graph structures. There is no universal definition of what a good cluster is, and different approaches might be best suited for different applications. Whereas most of classic methods focus on finding node partitions, i.e. on coloring graph nodes so that each node has one and only one color, more elaborate approaches are often necessary to model the complex structure of real-life graphs and to address sophisticated applications. In particular, in many cases, we must consider that a given node can belong to more than one cluster. Besides, many real-world systems exhibit multi-scale structures and one much seek for hierarchies of clusters rather than flat clusterings. Furthermore, graphs often evolve over time and are too massive to be handled in one batch so that one must be able to process stream of edges. Finally, in many applications, processing entire graphs is irrelevant or expensive, and it can be more appropriate to recover local clusters in the neighborhood of nodes of interest rather than color all graph nodes.