{"id":10096,"date":"2019-01-24T14:58:36","date_gmt":"2019-01-24T14:58:36","guid":{"rendered":"https:\/\/wordlift.io\/blog\/en\/entity\/cluster-analysis\/"},"modified":"2021-12-16T13:00:29","modified_gmt":"2021-12-16T12:00:29","slug":"cluster-analysis","status":"publish","type":"entity","link":"https:\/\/wordlift.io\/blog\/en\/entity\/cluster-analysis\/","title":{"rendered":"Cluster analysis"},"content":{"rendered":"<p><strong>Cluster analysis or clustering<\/strong> is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis used in many fields, including <a class=\"wl-entity-page-link\" title=\"ML\" href=\"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/\" data-id=\"http:\/\/data.wordlift.io\/wl0216\/entity\/machine_learning;http:\/\/dbpedia.org\/resource\/Machine_learning;https:\/\/www.wikidata.org\/wiki\/Q2539\" >machine learning<\/a>, pattern recognition, image analysis, information retrieval, and bioinformatics.<\/p>\n<p>We can apply <strong>cluster analysis using machine learning<\/strong> to <strong>analyze queries<\/strong>. Several techniques are available for clustering queries.<\/p>\n<p>Let\u2019s review them together:<\/p>\n<ul>\n<li aria-level=\"1\"><b>Supervised text classification<\/b><span style=\"font-weight: 400;\">: used when we know the correct output class for each text in a sample dataset.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Unsupervised text classification<\/b><span style=\"font-weight: 400;\">: zero-shot classification is one of these techniques. The algorithm receives samples from classes that were not observed during training and needs to predict the class they belong to. We use it, for instance, for intent classification. 
You can try it out <\/span><a href=\"https:\/\/api-docs.wordlift.io\/swagger-ui\/?urls.primaryName=classification#\/Classifier\/classifyUsingPost\"><span style=\"font-weight: 400;\">here<\/span><\/a><span style=\"font-weight: 400;\"> using a WordLift key.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b><a class=\"wl-entity-page-link\" title=\"Named-entity recognition\" href=\"https:\/\/wordlift.io\/blog\/en\/entity\/named-entity-recognition\/\" data-id=\"http:\/\/data.wordlift.io\/wl0216\/entity\/named-entity_recognition;http:\/\/rdf.freebase.com\/ns\/m.0658pt;http:\/\/yago-knowledge.org\/resource\/Named-entity_recognition;http:\/\/dbpedia.org\/resource\/Named-entity_recognition\" >Named Entity Recognition<\/a><\/b><span style=\"font-weight: 400;\">: we can extract entities from queries, as we do on <a href=\"http:\/\/longtail.wordlift.io\">longtail.wordlift.io<\/a>.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>K-Means clustering<\/b><span style=\"font-weight: 400;\">: one of the simplest and most popular unsupervised machine learning algorithms. K-Means groups all records into a fixed number of clusters by identifying a centroid for each cluster and assigning every record to its nearest centroid. A centroid is the center (the mean) of the records in its cluster. 
We used <\/span><a href=\"https:\/\/wordlift.io\/blog\/en\/machine-learning-for-seo\/\"><span style=\"font-weight: 400;\">K-Means to analyze Google Search Console (GSC) data<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Word <a class=\"wl-entity-page-link\" title=\"word embeddings\" href=\"https:\/\/wordlift.io\/blog\/en\/entity\/what-are-embeddings\/\" data-id=\"http:\/\/data.wordlift.io\/wl0216\/entity\/what_are_embeddings_\" >Embedding<\/a>s.<\/b><span style=\"font-weight: 400;\"> Embeddings encode the meaning of words as real-valued vectors, so that words that are closer in the vector space are expected to be similar in meaning. Once meanings are turned into vectors, we can use K-Means to group them, cosine similarity to evaluate how close they are, or algorithms like PCA, UMAP and t-SNE to visualize them in a 2D or 3D space.<\/span><\/li>\n<\/ul>\n\n","protected":false},"excerpt":{"rendered":"<p>Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). 
It is a main task of exploratory data mining, and a common &hellip; <a href=\"https:\/\/wordlift.io\/blog\/en\/entity\/cluster-analysis\/\">Continued<\/a><\/p>\n","protected":false},"author":6,"featured_media":10120,"comment_status":"open","ping_status":"closed","template":"","meta":{"_acf_changed":false,"wl_entities_gutenberg":"","_wlpage_enable":"","footnotes":""},"categories":[],"wl_entity_type":[12],"coauthors":[],"class_list":["post-10096","entity","type-entity","status-publish","has-post-thumbnail","hentry","wl_entity_type-thing"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v22.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Cluster analysis - WordLift Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Cluster analysis - WordLift Blog\" \/>\n<meta property=\"og:description\" content=\"Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). 
It is a main task of exploratory data mining, and a common &hellip; Continued\" \/>\n<meta property=\"og:url\" content=\"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/\" \/>\n<meta property=\"og:site_name\" content=\"WordLift Blog\" \/>\n<meta property=\"article:modified_time\" content=\"2021-12-16T12:00:29+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2019\/01\/WordLift-for-WCEU-2018.001.jpeg\" \/>\n\t<meta property=\"og:image:width\" content=\"1920\" \/>\n\t<meta property=\"og:image:height\" content=\"1080\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 minutes\" \/>\n\t<meta name=\"twitter:label2\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data2\" content=\"Andrea Volpini\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/entity\/cluster-analysis\/\",\"url\":\"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/\",\"name\":\"Cluster analysis - WordLift 
Blog\",\"isPartOf\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2019\/01\/WordLift-for-WCEU-2018.001.jpeg\",\"datePublished\":\"2019-01-24T14:58:36+00:00\",\"dateModified\":\"2021-12-16T12:00:29+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/#primaryimage\",\"url\":\"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2019\/01\/WordLift-for-WCEU-2018.001.jpeg\",\"contentUrl\":\"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2019\/01\/WordLift-for-WCEU-2018.001.jpeg\",\"width\":1920,\"height\":1080,\"caption\":\"Clustering in Machine Learning\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Blog\",\"item\":\"https:\/\/wordlift.io\/blog\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Cluster analysis\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#website\",\"url\":\"https:\/\/wordlift.io\/blog\/en\/\",\"name\":\"WordLift Blog\",\"description\":\"AI-Powered SEO\",\"publisher\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/wordlift.io\/blog\/en\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#organization\",\"name\":\"WordLift\",\"url\":\"https:\/\/wordlift.io\/blog\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/mk0wordliftblog7j5te.kinstacdn.com\/wp-content\/uploads\/sites\/3\/2017\/04\/logo-1.png\",\"contentUrl\":\"https:\/\/mk0wordliftblog7j5te.kinstacdn.com\/wp-content\/uploads\/sites\/3\/2017\/04\/logo-1.png\",\"width\":152,\"height\":40,\"caption\":\"WordLift\"},\"image\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#\/schema\/logo\/image\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Cluster analysis - WordLift Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/","og_locale":"en_US","og_type":"article","og_title":"Cluster analysis - WordLift Blog","og_description":"Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common &hellip; Continued","og_url":"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/","og_site_name":"WordLift Blog","article_modified_time":"2021-12-16T12:00:29+00:00","og_image":[{"width":1920,"height":1080,"url":"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2019\/01\/WordLift-for-WCEU-2018.001.jpeg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"2 minutes","Written by":"Andrea Volpini"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/wordlift.io\/blog\/en\/entity\/cluster-analysis\/","url":"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/","name":"Cluster analysis - WordLift Blog","isPartOf":{"@id":"https:\/\/wordlift.io\/blog\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/#primaryimage"},"image":{"@id":"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/#primaryimage"},"thumbnailUrl":"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2019\/01\/WordLift-for-WCEU-2018.001.jpeg","datePublished":"2019-01-24T14:58:36+00:00","dateModified":"2021-12-16T12:00:29+00:00","breadcrumb":{"@id":"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/#primaryimage","url":"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2019\/01\/WordLift-for-WCEU-2018.001.jpeg","contentUrl":"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2019\/01\/WordLift-for-WCEU-2018.001.jpeg","width":1920,"height":1080,"caption":"Clustering in Machine Learning"},{"@type":"BreadcrumbList","@id":"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog","item":"https:\/\/wordlift.io\/blog\/en\/"},{"@type":"ListItem","position":2,"name":"Cluster analysis"}]},{"@type":"WebSite","@id":"https:\/\/wordlift.io\/blog\/en\/#website","url":"https:\/\/wordlift.io\/blog\/en\/","name":"WordLift Blog","description":"AI-Powered 
SEO","publisher":{"@id":"https:\/\/wordlift.io\/blog\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/wordlift.io\/blog\/en\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/wordlift.io\/blog\/en\/#organization","name":"WordLift","url":"https:\/\/wordlift.io\/blog\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/wordlift.io\/blog\/en\/#\/schema\/logo\/image\/","url":"https:\/\/mk0wordliftblog7j5te.kinstacdn.com\/wp-content\/uploads\/sites\/3\/2017\/04\/logo-1.png","contentUrl":"https:\/\/mk0wordliftblog7j5te.kinstacdn.com\/wp-content\/uploads\/sites\/3\/2017\/04\/logo-1.png","width":152,"height":40,"caption":"WordLift"},"image":{"@id":"https:\/\/wordlift.io\/blog\/en\/#\/schema\/logo\/image\/"}}]}},"_wl_alt_label":["clustering"],"wl:entity_url":"http:\/\/data.wordlift.io\/wl0216\/entity\/cluster_analysis","_links":{"self":[{"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/entities\/10096"}],"collection":[{"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/entities"}],"about":[{"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/types\/entity"}],"author":[{"embeddable":true,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/comments?post=10096"}],"version-history":[{"count":0,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/entities\/10096\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/media\/10120"}],"wp:attachment":[{"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/media?parent=10096"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/categories?post=10096"},{"taxonomy":"wl_entity_type","embeddable":true,"href":"h
ttps:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/wl_entity_type?post=10096"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/coauthors?post=10096"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}