Python – Plot dendrogram using sklearn.AgglomerativeClustering


I'm trying to build a dendrogram using the children_ attribute provided by AgglomerativeClustering, but so far I'm out of luck. I can't use scipy.cluster, since the agglomerative clustering provided in scipy lacks some options that are important to me (such as the option to specify the number of clusters). I would be really grateful for any advice out there.

    from sklearn import cluster  # makes the name `cluster` available
    clstr = cluster.AgglomerativeClustering(n_clusters=2)

Best Solution

A simple function can take a hierarchical clustering model from sklearn and plot it using the scipy dendrogram function. Plotting functions are often not directly supported in sklearn. There is an interesting discussion of this in the pull request for the plot_dendrogram code snippet.
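
A minimal sketch of such a helper follows. It is a reconstruction of the idea, not necessarily the exact snippet from the pull request. It assumes scikit-learn 0.24 or later, where compute_distances=True makes the fitted model expose distances_ even when n_clusters is fixed. The name linkage_matrix_from_model and the random data are hypothetical and used only for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram
from sklearn.cluster import AgglomerativeClustering


def linkage_matrix_from_model(model):
    """Build a scipy-style linkage matrix from a fitted AgglomerativeClustering."""
    n_samples = len(model.labels_)
    # Number of original samples under each merge node.
    counts = np.zeros(model.children_.shape[0])
    for i, merge in enumerate(model.children_):
        current_count = 0
        for child_idx in merge:
            if child_idx < n_samples:
                current_count += 1  # leaf node, i.e. an original sample
            else:
                # Internal node created by an earlier merge.
                current_count += counts[child_idx - n_samples]
        counts[i] = current_count
    # Columns: child 1, child 2, merge distance, samples in the new cluster.
    return np.column_stack(
        [model.children_, model.distances_, counts]
    ).astype(float)


def plot_dendrogram(model, **kwargs):
    """Draw the dendrogram; kwargs are passed through to scipy's dendrogram."""
    return dendrogram(linkage_matrix_from_model(model), **kwargs)


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))

# compute_distances=True fills distances_ even though n_clusters is fixed
# (assumption: scikit-learn >= 0.24).
model = AgglomerativeClustering(n_clusters=2, compute_distances=True).fit(X)
Z = linkage_matrix_from_model(model)
```

To draw the tree, call plot_dendrogram(model) and then matplotlib.pyplot.show(). Keyword arguments such as truncate_mode="level" and p=3 are forwarded to scipy's dendrogram and limit how many levels are shown.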

I'd clarify that the use case you describe (specifying the number of clusters) is available in scipy: after performing the hierarchical clustering with scipy's linkage, you can cut the hierarchy into whatever number of clusters you want using fcluster, with the number of clusters given in the t argument and criterion='maxclust'.
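
As a rough illustration of the scipy-only route, the sketch below clusters two well-separated synthetic blobs with linkage and cuts the tree into two flat clusters with fcluster. The data and the choice of Ward linkage are assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Two well-separated synthetic blobs, for illustration only.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(10, 2)),
    rng.normal(5.0, 0.5, size=(10, 2)),
])

# Build the full hierarchy (Ward linkage is an arbitrary choice here).
Z = linkage(X, method="ward")

# Cut the hierarchy so that at most t=2 flat clusters are formed.
labels = fcluster(Z, t=2, criterion="maxclust")
```

The same Z can be passed straight to scipy.cluster.hierarchy.dendrogram, so the plot and the flat cluster labels come from one hierarchy.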