A biologist interested in bioclimatic habitats of species needs to find geographical areas that admit two characterizations, one in terms of their climatic profile and one in terms of the occupying species.

For a political analyst, matching the personal profiles of an election's candidates to their viewpoints on various relevant issues might help cast light on the political scene and provide insight into the different candidates' positions.

Finally, matching the terms that individuals of an ethnic group use to address one another to their genealogical relationships might give an ethnographer stepping stones for elucidating the meaning of the kinship terms they use.

These are three examples of a general problem setting in which we need to identify correspondences between data of different natures (species vs. climate, personal profiles vs. political viewpoints, and kinship terminology vs. genealogical linkage).

To identify such correspondences over binary data sets, Ramakrishnan et al. proposed redescription mining in 2004. Subsequent research has extended the problem formulation to more complex correspondences and data types, making it applicable to a wide variety of data analysis tasks.
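
As a rough illustration of the binary setting, the sketch below pairs one query from each of two views over the same entities and scores how closely their supports agree. The toy data, the names `species`, `climate`, `support` and `jaccard`, and the single-item queries are assumptions made for this example; the Jaccard coefficient is a commonly used measure of redescription accuracy, but actual algorithms search over far richer queries.

```python
# Toy data, invented for illustration: each key is a geographic area.
# Left view: species observed there. Right view: binary climate properties.
species = {
    "area1": {"moose"},
    "area2": {"moose", "wolf"},
    "area3": {"wolf"},
    "area4": set(),
}
climate = {
    "area1": {"cold_winter"},
    "area2": {"cold_winter", "wet_summer"},
    "area3": {"wet_summer"},
    "area4": {"cold_winter"},
}


def support(view, item):
    """Return the set of entities whose record in `view` contains `item`."""
    return {entity for entity, items in view.items() if item in items}


def jaccard(a, b):
    """Jaccard coefficient of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# One candidate pair of queries: "moose occurs" vs. "winters are cold".
left = support(species, "moose")
right = support(climate, "cold_winter")
score = jaccard(left, right)
print(sorted(left), sorted(right), round(score, 3))
```

A score close to 1 would mean the two queries describe nearly the same set of areas, which is the kind of correspondence the examples above are after.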

Beyond these examples, redescription mining has diverse practical applications, from detecting criminal networks to optimizing circuit designs.


Our tutorial on Redescription Mining is currently available in two flavors; a third will be presented this summer:

Formal Flavor

Originally presented at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2016), in Riva del Garda, Italy, on Friday, September 23, 2016.

  • Duration: 4h
  • Target audience: data mining researchers and students
  • Emphasis: formalization of a general framework, introduction of the problem variants

Practical Flavor

Originally presented at the SIAM International Conference on Data Mining (SDM 2017), in Houston, TX, USA, on Thursday, April 27, 2017.

  • Duration: 2h
  • Target audience: data analysts and method developers
  • Emphasis: techniques, applications, visual and interactive mining

Sapor Liberum

To be presented at the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2018), which will take place in London, UK, on the morning of Sunday, August 19, 2018.

  • Duration: 3h30
  • Target audience: data analysts and method developers
  • Emphasis: conceptual overview, algorithms, problem variants and applications
  • Based on our recently published introductory book on redescription mining.