{"id":20591,"date":"2022-02-11T11:00:00","date_gmt":"2022-02-11T10:00:00","guid":{"rendered":"https:\/\/wordlift.io\/blog\/en\/?p=20591"},"modified":"2023-04-17T10:59:50","modified_gmt":"2023-04-17T08:59:50","slug":"category-page-optimization-for-ecommerce","status":"publish","type":"post","link":"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/","title":{"rendered":"E-commerce SEO for Product Category Pages with the help of AI"},"content":{"rendered":"\r\n<h2 class=\"wp-block-heading\">Building Product Sub-categories (semi)automatically<\/h2>\r\n\r\n\r\n\r\n<p>Manually working on product knowledge graphs and improving data quality on <a class=\"wl-entity-page-link\" title=\"eCommerce SEO\" href=\"https:\/\/wordlift.io\/blog\/en\/entity\/e-commerce-seo\/\" data-id=\"http:\/\/data.wordlift.io\/wl0216\/entity\/e-commerce_seo__how_to_drive_organic_traffic_to_your_store;http:\/\/www.wikidata.org\/entity\/Q212930;http:\/\/dbpedia.org\/resource\/Online_shopping;http:\/\/rdf.freebase.com\/ns\/m.047m52;http:\/\/rdf.freebase.com\/ns\/m.019qb_;http:\/\/dbpedia.org\/resource\/Search_engine_optimization;https:\/\/wordlift.io\/blog\/en\/entity\/search-engine-optimization\/;http:\/\/no.dbpedia.org\/resource\/Nettbutikk;http:\/\/ru.dbpedia.org\/resource\/\u0418\u043d\u0442\u0435\u0440\u043d\u0435\u0442-\u043c\u0430\u0433\u0430\u0437\u0438\u043d;http:\/\/fi.dbpedia.org\/resource\/Verkkokauppa;http:\/\/pt.dbpedia.org\/resource\/Com\u00e9rcio_on-line;http:\/\/bg.dbpedia.org\/resource\/\u0418\u043d\u0442\u0435\u0440\u043d\u0435\u0442_\u043c\u0430\u0433\u0430\u0437\u0438\u043d;http:\/\/hu.dbpedia.org\/resource\/Web\u00e1ruh\u00e1z;http:\/\/uk.dbpedia.org\/resource\/\u0406\u043d\u0442\u0435\u0440\u043d\u0435\u0442-\u043c\u0430\u0433\u0430\u0437\u0438\u043d;http:\/\/sk.dbpedia.org\/resource\/Internetov\u00fd_obchod;http:\/\/id.dbpedia.org\/resource\/Belanja_daring;http:\/\/sr.dbpedia.org\/resource\/Elektronska_prodavnica;http:\/\/sv.dbpedia.org\/resource\/Webbu
tik;http:\/\/en.dbpedia.org\/resource\/Online_shopping;http:\/\/is.dbpedia.org\/resource\/Vefverslun;http:\/\/it.dbpedia.org\/resource\/Negozio_online;http:\/\/es.dbpedia.org\/resource\/Tienda_en_l\u00ednea;http:\/\/et.dbpedia.org\/resource\/Internetikaubandus;http:\/\/cs.dbpedia.org\/resource\/Internetov\u00fd_obchod;http:\/\/pl.dbpedia.org\/resource\/Sklep_internetowy;http:\/\/ro.dbpedia.org\/resource\/Magazin_virtual;http:\/\/da.dbpedia.org\/resource\/Webshop;http:\/\/nl.dbpedia.org\/resource\/Webwinkel;http:\/\/tr.dbpedia.org\/resource\/\u0130nternet_\u00fczerinden_al\u0131\u015fveri\u015f\" >e-commerce<\/a> websites is extremely time-consuming and impossible for most businesses that cannot afford an army of expert taxonomists.<\/p>\r\n\r\n\r\n\r\n<p>In this blog post, I will share <strong>how we\u2019re seeding and enriching taxonomies for e-commerce websites<\/strong> using <a class=\"wl-entity-page-link\" title=\"NLP\" href=\"https:\/\/wordlift.io\/blog\/en\/entity\/natural-language-processing\/\" 
data-id=\"http:\/\/data.wordlift.io\/wl0216\/entity\/natural_language_processing;http:\/\/rdf.freebase.com\/ns\/m.05flf;http:\/\/dbpedia.org\/resource\/Natural_language_processing;http:\/\/be.dbpedia.org\/resource\/\u0410\u043f\u0440\u0430\u0446\u043e\u045e\u043a\u0430_\u043d\u0430\u0442\u0443\u0440\u0430\u043b\u044c\u043d\u0430\u0439_\u043c\u043e\u0432\u044b;http:\/\/ru.dbpedia.org\/resource\/\u041e\u0431\u0440\u0430\u0431\u043e\u0442\u043a\u0430_\u0435\u0441\u0442\u0435\u0441\u0442\u0432\u0435\u043d\u043d\u043e\u0433\u043e_\u044f\u0437\u044b\u043a\u0430;http:\/\/pt.dbpedia.org\/resource\/Processamento_de_linguagem_natural;http:\/\/bg.dbpedia.org\/resource\/\u041e\u0431\u0440\u0430\u0431\u043e\u0442\u043a\u0430_\u043d\u0430_\u0435\u0441\u0442\u0435\u0441\u0442\u0432\u0435\u043d_\u0435\u0437\u0438\u043a;http:\/\/lt.dbpedia.org\/resource\/Nat\u016bralios_kalbos_apdorojimas;http:\/\/fr.dbpedia.org\/resource\/Traitement_automatique_du_langage_naturel;http:\/\/uk.dbpedia.org\/resource\/\u041e\u0431\u0440\u043e\u0431\u043a\u0430_\u043f\u0440\u0438\u0440\u043e\u0434\u043d\u043e\u0457_\u043c\u043e\u0432\u0438;http:\/\/id.dbpedia.org\/resource\/Pemrosesan_bahasa_alami;http:\/\/ca.dbpedia.org\/resource\/Processament_de_llenguatge_natural;http:\/\/sr.dbpedia.org\/resource\/Obrada_prirodnih_jezika;http:\/\/en.dbpedia.org\/resource\/Natural_language_processing;http:\/\/is.dbpedia.org\/resource\/M\u00e1lgreining;http:\/\/it.dbpedia.org\/resource\/Elaborazione_del_linguaggio_naturale;http:\/\/es.dbpedia.org\/resource\/Procesamiento_de_lenguajes_naturales;http:\/\/cs.dbpedia.org\/resource\/Zpracov\u00e1n\u00ed_p\u0159irozen\u00e9ho_jazyka;http:\/\/pl.dbpedia.org\/resource\/Przetwarzanie_j\u0119zyka_naturalnego;http:\/\/ro.dbpedia.org\/resource\/Prelucrarea_limbajului_natural;http:\/\/da.dbpedia.org\/resource\/Sprogteknologi;http:\/\/tr.dbpedia.org\/resource\/Do\u011fal_dil_i\u015fleme\" >natural language processing<\/a>.<\/p>\r\n\r\n\r\n\r\n<p>In general with retail products, 
improving a graph is much more difficult. We typically don\u2019t deal with clearly defined entities but rather with unbounded attributes &#8211; lenses for glasses, materials for furniture, and so on. Let\u2019s first review a few concepts and some helpful SEO best practices for creating category pages. This will help us understand the workflow and how we can fine-tune it for our needs.<\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><strong>Want to jump right to the Colab? It&#8217;s <\/strong><a href=\"https:\/\/bit.ly\/kw-clustering\"><strong>here<\/strong><\/a><strong>!<\/strong><\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-left\"><a href=\"#what-taxonomy\">What is a product taxonomy?<\/a><\/p>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\" style=\"margin-bottom:20px\"><li><a href=\"#create-product-taxonomy\">What are the main challenges in creating a product taxonomy?<\/a><\/li><li><a href=\"#how-create\">How to create a product taxonomy?<\/a><\/li><\/ul>\r\n\r\n\r\n\r\n<p><a href=\"#seo-best-practices\">SEO best practices for creating and optimizing category pages<\/a><\/p>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\">\r\n<li><a href=\"#clear-goal\">Having a clear goal in mind, SEO needs to be measurable<\/a><\/li>\r\n\r\n\r\n\r\n<li><a href=\"#number-of-products\">Setting a minimum number of products for new categories<\/a><\/li>\r\n\r\n\r\n\r\n<li><a href=\"#content\">Working on content that helps<\/a><\/li>\r\n\r\n\r\n\r\n<li><a href=\"#keyword-cannibalization\">Preventing keyword cannibalization while improving internal links<\/a><\/li>\r\n<\/ul>\r\n\r\n\r\n\r\n<p><a href=\"#workflow\">Overview of the workflow to automate the creation of subcategories<\/a><\/p>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\">\r\n<li><a href=\"#extraction-techniques\">Keyword extraction techniques<\/a><\/li>\r\n\r\n\r\n\r\n<li><a href=\"#search-demand\">Analyzing search demand<\/a>\r\n<ul class=\"wp-block-list\">\r\n<li><a 
href=\"#keybert\">KeyBERT<\/a><\/li>\r\n<\/ul>\r\n<\/li>\r\n\r\n\r\n\r\n<li><a href=\"#clustering\">Clustering queries (optional)<\/a>\r\n<ul class=\"wp-block-list\">\r\n<li><a href=\"#topic-modelling\">Topic modeling<\/a><\/li>\r\n<\/ul>\r\n<\/li>\r\n\r\n\r\n\r\n<li><a href=\"#extract-keywords\">Extract keywords<\/a>\r\n<ul class=\"wp-block-list\">\r\n<li><a href=\"#data-cleanup\">Data cleanup &#8211; the steps<\/a><\/li>\r\n<\/ul>\r\n<\/li>\r\n\r\n\r\n\r\n<li><a href=\"#list-of-products\">Analyze the list of products<\/a><\/li>\r\n<\/ul>\r\n\r\n\r\n\r\n<p><a href=\"#conclusion\">Conclusions and future work<\/a><\/p>\r\n\r\n\r\n\r\n<h2 class=\"wp-block-heading\" id=\"what-taxonomy\">What is a product taxonomy?<\/h2>\r\n\r\n\r\n\r\n<p>A taxonomy for products acts like shelves in a supermarket. It is the <strong>structure that organizes all products in the most accessible way so that customers can easily find them with the least cognitive effort <\/strong>(and the fewest clicks). Generally, a product taxonomy combines a hierarchy of categories and subcategories with product attributes that are specific to each category (e.g. <a href=\"https:\/\/www.hubspot.com\/brand-kit-generator\/color-palette-generator\" target=\"_blank\" rel=\"noreferrer noopener\">color<\/a> or size for t-shirts and frame shape for sunglasses).<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\" id=\"create-product-taxonomy\">What are the main challenges in creating a product taxonomy?<\/h3>\r\n\r\n\r\n\r\n<p>In an ideal world, you would have clean attributes for every product, but in reality this is rarely the case. Data is sparse, incomplete, and of low quality. Structure sparsity is a very common problem: you would like to have every item clearly classified, but often only a subset of the catalog has been organized correctly. In general, the knowledge domain is also highly specialized, and attributes tend to be specific to each category. 
You also need to deal with sub-types, overlapping elements, and synonyms (e.g. \u201cI call them goggles but you call them glasses\u201d).<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\" id=\"how-create\">How to create a product taxonomy?<\/h3>\r\n\r\n\r\n\r\n<p>In knowledge organization, taxonomies don\u2019t grow like plants on their own. They are the result of highly specialized human labor and can be created using different approaches: top-down, bottom-up, or a combination of both. We can either rely on domain experts who guide us through the maze of how to turn more abstract concepts like \u201caccessories\u201d into fine-grained subcategories such as \u201caccessories &gt; sunglasses &gt; sports activewear &gt; running\u201d, or we can discover these terms by ourselves and playfully combine them by learning how they relate to each other.<\/p>\r\n\r\n\r\n\r\n<p>As mentioned, we want to automate the creation of these taxonomies by leveraging <a class=\"wl-entity-page-link\" title=\"machine with learning\" href=\"https:\/\/wordlift.io\/blog\/en\/entity\/machine-learning\/\" data-id=\"http:\/\/data.wordlift.io\/wl0216\/entity\/machine_learning;http:\/\/dbpedia.org\/resource\/Machine_learning;https:\/\/www.wikidata.org\/wiki\/Q2539\" >machine learning<\/a> and our understanding of core SEO principles. In general with AI, our main focus is not on the technology that we\u2019re going to use, but rather on how we plan to involve and interact with the people in the process. We want to augment our product knowledge graphs and make them helpful to both the consumer looking for a product to buy and the merchant selling it.<\/p>\r\n\r\n\r\n\r\n<h2 class=\"wp-block-heading\" id=\"seo-best-practices\">SEO best practices for creating and optimizing category pages<\/h2>\r\n\r\n\r\n\r\n<p>Before reviewing our framework, let\u2019s highlight a few SEO best practices for creating a product taxonomy. 
They will help us understand the code and the workflow that we envisioned.<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\" id=\"clear-goal\">1. Having a clear goal in mind, SEO needs to be measurable<\/h3>\r\n\r\n\r\n\r\n<p>We have two options:<\/p>\r\n\r\n\r\n\r\n<ol class=\"wp-block-list\">\r\n<li>Work to attract new users from Google by organizing products around search intents.<\/li>\r\n\r\n\r\n\r\n<li>Improve the on-site experience by enriching the existing taxonomy with new products and by extending their list of attributes.<\/li>\r\n<\/ol>\r\n\r\n\r\n\r\n<p>We can work with both targets in mind, but the steps will be different and, in general, it greatly helps to go one step at a time. You want to measure the impact and work against measurable results. Doing too many things at once never helps.<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\" id=\"number-of-products\">2. Setting a minimum number of products for new categories<\/h3>\r\n\r\n\r\n\r\n<p>There is no golden rule here, but you can use common sense. You want to be relevant for searchers and provide a minimum number of products on each new category page. Establishing the threshold has to do with competition and user experience. If your competitors have 100 different products on average, it is unlikely that you will rank with just 10 products for the same intent. At the same time, this threshold also depends heavily on the user experience. If you are able to provide more details at first glance and can always bring to the top the products that people love, you can still win the game even with fewer products.<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\" id=\"content\">3. Working on content that helps<\/h3>\r\n\r\n\r\n\r\n<p>A textual introduction to the listing is vital. It doesn\u2019t need to be long (that would simply not work, especially on mobile), and ideally the user should be able to collapse it, but it needs to be very informative, personalized, and crystal clear. 
What are we selling? To whom are we selling (what is the target audience)? And what do we want people to remember about these products? We have also seen traction (in both traffic and sales) when adding, at the bottom of category pages, relevant question-answer pairs using FAQ markup.<\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><img decoding=\"async\" width=\"343\" height=\"245\" src=\"https:\/\/lh5.googleusercontent.com\/jyr9_7sq5u9ZR6d7H6UXWqMWhKHr34edBhN01B2edvAicHrNawxySKmdN-oyCG6-eQzs-B2x3iI0jYuBHNGPzsjoeS9eysRcxfYYHWx1mlorlR-cul3Kvrk0zYPqGDJQTwxKeqDT\"><\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><em>FAQ traffic related to a retail website in the US.<\/em><\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\" id=\"keyword-cannibalization\">4. Preventing keyword cannibalization while improving internal links<\/h3>\r\n\r\n\r\n\r\n<p>Needless to say, you don\u2019t want new pages to compete for the same traffic as existing pages. Quite the opposite: you might want to distribute traffic more evenly across all pages. This means that, on the one hand, you want to ensure that any new category does not overlap with existing pages and, on the other hand, that highly trafficked pages link to contextually relevant, less visible sub-category pages.<\/p>\r\n\r\n\r\n\r\n<p>Check out the example below. The intro text under home &gt; men &gt; bags of a famous luxury fashion store provides the user with relevant links to more specific (and less trafficked) landing pages.<\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><img decoding=\"async\" width=\"624\" height=\"123\" src=\"https:\/\/lh6.googleusercontent.com\/uVMU6KXtUF07wiQvdvVAM-KMsvz8L7N1OVR_u3l57vePalYTHsbjpyHYVDB0bpmk33gUvkjPJuSHqcpLay1_YfiFpViy2iFBqMnYhTRagP4qm1K1QVUacsV2TUzRCbmc5uRp0Eqb\"><\/p>\r\n\r\n\r\n\r\n<p>This is smart as intents are very well defined (1. 
\u201cI want to buy a designer bag for men\u201d for the main category page and 2. \u201cI want to buy a Gucci|Off-White|Saint Laurent bag for men\u201d) and we can more evenly distribute the link equity from the main page to the sub-pages.<\/p>\r\n\r\n\r\n\r\n<p>One last note addresses concerns about crawl budget. When creating new pages, people are usually afraid of exhausting it. While it is true that Google tends to be more conservative in indexing the web (also to reduce its carbon footprint), keep in mind that if you have a million web pages or fewer you shouldn\u2019t worry about crawl budget. Don\u2019t take my word for it; read instead <a href=\"https:\/\/www.searchenginejournal.com\/gary-illyes-whats-new-in-google-search-pubcon-keynote-recap\/274273\/\">what Gary Illyes said on this topic<\/a> some time ago.<\/p>\r\n\r\n\r\n\r\n<h2 class=\"wp-block-heading\" id=\"workflow\">Overview of the workflow to automate the creation of subcategories<\/h2>\r\n\r\n\r\n\r\n<p>The basic idea is the following:<\/p>\r\n\r\n\r\n\r\n<ol class=\"wp-block-list\">\r\n<li>We start from a category page and try to understand our searcher personas. We do that by <strong>diving into the queries behind the page<\/strong>. We can collect these queries from either Google Search Console or SEMrush (sometimes looking at competitors helps gain more insights). We can also choose a timeframe that makes sense for this specific business.<\/li>\r\n\r\n\r\n\r\n<li>We can (optionally) <strong>cluster the keywords to get a sense of the main intents<\/strong>. 
We will use BERTopic here.<\/li>\r\n\r\n\r\n\r\n<li>We <strong>extract the keywords behind these queries<\/strong> (using KeyBERT) and remove false positives, low-scoring terms, duplicates, oddities of all kinds, and other irrelevant terms.<\/li>\r\n\r\n\r\n\r\n<li>We <strong>re-use these keywords to analyze the titles of the products<\/strong> belonging to our target category.<\/li>\r\n<\/ol>\r\n\r\n\r\n\r\n<p>Finally, we <strong>choose the terms that will cover the highest number of products<\/strong> as sub-categories.<\/p>\r\n\r\n\r\n\r\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"776\" height=\"383\" src=\"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Schermata-2022-02-08-alle-14.13.31.png\" alt=\"\" class=\"wp-image-20597\" srcset=\"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Schermata-2022-02-08-alle-14.13.31.png 776w, https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Schermata-2022-02-08-alle-14.13.31-300x148.png 300w, https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Schermata-2022-02-08-alle-14.13.31-768x379.png 768w, https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Schermata-2022-02-08-alle-14.13.31-150x74.png 150w\" sizes=\"(max-width: 776px) 100vw, 776px\" \/><\/figure>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\" id=\"extraction-techniques\">Keyword extraction techniques<\/h3>\r\n\r\n\r\n\r\n<p>Keyword extraction is a crucial text mining task in our workflow. Given a <a class=\"wl-entity-page-link\"  href=\"https:\/\/wordlift.io\/blog\/en\/entity\/search\/\" data-id=\"http:\/\/data.wordlift.io\/wl0216\/entity\/search;http:\/\/dbpedia.org\/resource\/Search_engine_technology;http:\/\/en.dbpedia.org\/resource\/Search_engine_technology\" >search<\/a> query, the extraction algorithm should identify the set of terms that best represents the query. 
Before choosing <a href=\"https:\/\/maartengr.github.io\/KeyBERT\/index.html\">KeyBERT<\/a>, I tested YAKE!, an unsupervised keyword extraction library that supports multilingual content (you can give it a quick try <a href=\"http:\/\/yake.inesctec.pt\/demo\/user\">here<\/a>); <a href=\"https:\/\/huggingface.co\/bigscience\/T0_3B\">T Zero<\/a>, a large language model optimized for zero-shot tasks; and GPT-3. For the sake of this tutorial, we will analyze the category for Ray-Ban on the FarFetch website, a well-established British-Portuguese online luxury fashion retail brand.<\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><img decoding=\"async\" width=\"624\" height=\"340.67073643216304\" src=\"https:\/\/lh5.googleusercontent.com\/YgXOZH5inGAtT1lgrMZZEpd_Wk97IxYcAK6ET8w7hfepPsnAwfJNmc6cVvIfnjH8wJjUXfXOmYu5kUZQYFk5FZUVCUvKUU0Tl0hyWWK_S1E333Urs8SLT7wCpQ9Lw1El9_AVcSpz\"><\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><em>The category for Ray-Ban on FarFetch<\/em><\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\" id=\"search-demand\">1. 
Analyzing search demand<\/h3>\r\n\r\n\r\n\r\n<p>We always want to use <a class=\"wl-entity-page-link\" title=\"Artificial intelligence\" href=\"https:\/\/wordlift.io\/blog\/en\/entity\/artificial-intelligence\/\" data-id=\"http:\/\/data.wordlift.io\/wl0216\/entity\/artificial_intelligence;http:\/\/rdf.freebase.com\/ns\/m.0mkz;http:\/\/dbpedia.org\/resource\/Artificial_intelligence;http:\/\/data.wordlift.io\/wl0216\/entity\/artificial_intelligence_2;http:\/\/data.wordlift.io\/wl0216\/entity\/artificial_intelligence;http:\/\/data.wordlift.io\/wl0216\/entity\/artificial_intelligence_2;http:\/\/pt.dbpedia.org\/resource\/Intelig\u00eancia_artificial;http:\/\/hr.dbpedia.org\/resource\/Umjetna_inteligencija;http:\/\/hu.dbpedia.org\/resource\/Mesters\u00e9ges_intelligencia;http:\/\/id.dbpedia.org\/resource\/Kecerdasan_buatan;http:\/\/is.dbpedia.org\/resource\/Gervigreind;http:\/\/it.dbpedia.org\/resource\/Intelligenza_artificiale;http:\/\/ro.dbpedia.org\/resource\/Inteligen\u021b\u0103_artificial\u0103;http:\/\/be.dbpedia.org\/resource\/\u0428\u0442\u0443\u0447\u043d\u044b_\u0456\u043d\u0442\u044d\u043b\u0435\u043a\u0442;http:\/\/ru.dbpedia.org\/resource\/\u0418\u0441\u043a\u0443\u0441\u0441\u0442\u0432\u0435\u043d\u043d\u044b\u0439_\u0438\u043d\u0442\u0435\u043b\u043b\u0435\u043a\u0442;http:\/\/bg.dbpedia.org\/resource\/\u0418\u0437\u043a\u0443\u0441\u0442\u0432\u0435\u043d_\u0438\u043d\u0442\u0435\u043b\u0435\u043a\u0442;http:\/\/sk.dbpedia.org\/resource\/Umel\u00e1_inteligencia;http:\/\/sl.dbpedia.org\/resource\/Umetna_inteligenca;http:\/\/ca.dbpedia.org\/resource\/Intel\u00b7lig\u00e8ncia_artificial;http:\/\/sq.dbpedia.org\/resource\/Inteligjenca_artificiale;http:\/\/sr.dbpedia.org\/resource\/\u0412\u0458\u0435\u0448\u0442\u0430\u0447\u043a\u0430_\u0438\u043d\u0442\u0435\u043b\u0438\u0433\u0435\u043d\u0446\u0438\u0458\u0430;http:\/\/sv.dbpedia.org\/resource\/Artificiell_intelligens;http:\/\/cs.dbpedia.org\/resource\/Um\u011bl\u00e1_inteligence;http:\/\/da.dbpedia.org\/r
esource\/Kunstig_intelligens;http:\/\/tr.dbpedia.org\/resource\/Yapay_zek\u00e2;http:\/\/de.dbpedia.org\/resource\/K\u00fcnstliche_Intelligenz;http:\/\/lt.dbpedia.org\/resource\/Dirbtinis_intelektas;http:\/\/lv.dbpedia.org\/resource\/M\u0101ksl\u012bgais_intelekts;http:\/\/uk.dbpedia.org\/resource\/\u0428\u0442\u0443\u0447\u043d\u0438\u0439_\u0456\u043d\u0442\u0435\u043b\u0435\u043a\u0442;http:\/\/en.dbpedia.org\/resource\/Artificial_intelligence;http:\/\/es.dbpedia.org\/resource\/Inteligencia_artificial;http:\/\/et.dbpedia.org\/resource\/Tehisintellekt;http:\/\/nl.dbpedia.org\/resource\/Kunstmatige_intelligentie;http:\/\/no.dbpedia.org\/resource\/Kunstig_intelligens;http:\/\/fi.dbpedia.org\/resource\/Teko\u00e4ly;http:\/\/fr.dbpedia.org\/resource\/Intelligence_artificielle;http:\/\/pl.dbpedia.org\/resource\/Sztuczna_inteligencja\" >AI<\/a> to help humans connect and easily find what they are looking for. In other words, we need to use machines to augment human intelligence. To build a taxonomy that makes sense I envisioned the following human-AI interaction. We start by accessing queries from either the Google Search Console or SEMrush. For this tutorial, I have prepared the <a class=\"wl-entity-page-link\"  href=\"https:\/\/wordlift.io\/blog\/en\/entity\/keyword\/\" data-id=\"http:\/\/data.wordlift.io\/wl0216\/entity\/keyword;http:\/\/www.wikidata.org\/entity\/Q1289923\" >keyword<\/a> analysis behind the category page of FarFetch\u2019s Ray-Ban sunglasses in Google Drive (<a href=\"https:\/\/docs.google.com\/spreadsheets\/d\/1FJmsqqc1U0oOEkaqdykzL4Ds_Q20LqNfs-XGicVVKWI\/edit?usp=sharing\">here<\/a>). 
We will read this spreadsheet and analyze the queries using BERTopic and KeyBERT.<\/p>\r\n\r\n\r\n\r\n<h4 class=\"wp-block-heading\" id=\"keybert\">KeyBERT<\/h4>\r\n\r\n\r\n\r\n<p><img decoding=\"async\" width=\"206\" height=\"103\" src=\"https:\/\/lh3.googleusercontent.com\/Pqg-t0rnE1mqJ5bIqesIdlVd4URSXfdKokMh6V2qrw-q5WdVYZfj8miB1um0tHjDhvsEe-S_BA-zAGtW4xBbTk89rrLKUPAZNWdxlkJ9eKXhLdurSDz0OQVc5GsxOmEFUp9T-fmv\"><\/p>\r\n\r\n\r\n\r\n<p>The advantages I found in KeyBERT, for this project, are the following:<\/p>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\">\r\n<li>It\u2019s faster than T Zero and less demanding in terms of memory (running <a href=\"https:\/\/huggingface.co\/bigscience\/T0_3B\">T0_3B<\/a> on Colab is challenging and time-consuming even when you set the runtime with high memory and GPU acceleration).<\/li>\r\n\r\n\r\n\r\n<li>It\u2019s based on <a href=\"https:\/\/www.sbert.net\/\">SentenceTransformers<\/a> (SBERT), which lets us replace the underlying model with any of the <a href=\"https:\/\/www.sbert.net\/docs\/pretrained_models.html#model-overview\">models available<\/a> in the HuggingFace Model Hub. This helps because we can choose a model based on the language of the website. It also allows us to <a href=\"https:\/\/www.sbert.net\/docs\/training\/overview.html\">train our own model<\/a> to improve performance.<\/li>\r\n\r\n\r\n\r\n<li>It\u2019s simple, open source, and very easy to use. It also lets us work on both the analysis of the queries and the match between the terms in our controlled vocabulary and the titles of the products.<\/li>\r\n<\/ul>\r\n\r\n\r\n\r\n<p>If you have used SBERT, you will immediately find yourself at ease with KeyBERT. The library uses BERT embeddings and cosine similarity to extract terms from sentences (queries in our case). 
KeyBERT is a library by <a href=\"https:\/\/github.com\/MaartenGr\">Maarten Grootendorst<\/a> and you can immediately experiment with the <a href=\"https:\/\/www.charlywargnier.com\/post\/introducing-the-bert-keyword-extractor-a-streamlit-interface-for-keybert\">BERT Keyword extractor<\/a> developed by my dear friend Charly Wargnier.<\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\" id=\"clustering\">2. Clustering queries (optional)<\/h3>\r\n\r\n\r\n\r\n<p>I used BERTopic to explore the queries. In the end, you can simply skip the <a class=\"wl-entity-page-link\" title=\"Cluster analysis\" href=\"https:\/\/wordlift.io\/blog\/en\/entity\/cluster-analysis\/\" data-id=\"http:\/\/data.wordlift.io\/wl0216\/entity\/cluster_analysis;http:\/\/rdf.freebase.com\/ns\/m.031f5p;http:\/\/dbpedia.org\/resource\/Cluster_analysis;http:\/\/de.dbpedia.org\/resource\/Clusteranalyse;http:\/\/ru.dbpedia.org\/resource\/\u041a\u043b\u0430\u0441\u0442\u0435\u0440\u043d\u044b\u0439_\u0430\u043d\u0430\u043b\u0438\u0437;http:\/\/pt.dbpedia.org\/resource\/Clustering;http:\/\/lv.dbpedia.org\/resource\/Klasteru_anal\u012bze;http:\/\/hr.dbpedia.org\/resource\/Grupiranje;http:\/\/fr.dbpedia.org\/resource\/Partitionnement_de_donn\u00e9es;http:\/\/hu.dbpedia.org\/resource\/Klaszteranal\u00edzis;http:\/\/uk.dbpedia.org\/resource\/\u041a\u043b\u0430\u0441\u0442\u0435\u0440\u043d\u0438\u0439_\u0430\u043d\u0430\u043b\u0456\u0437;http:\/\/sk.dbpedia.org\/resource\/Zhlukov\u00e1_anal\u00fdza;http:\/\/sl.dbpedia.org\/resource\/Grupiranje;http:\/\/ca.dbpedia.org\/resource\/Clusteritzaci\u00f3_de_dades;http:\/\/sv.dbpedia.org\/resource\/Klusteranalys_(datavetenskap);http:\/\/en.dbpedia.org\/resource\/Cluster_analysis;http:\/\/it.dbpedia.org\/resource\/Clustering;http:\/\/es.dbpedia.org\/resource\/Algoritmo_de_agrupamiento;http:\/\/et.dbpedia.org\/resource\/Klasteranal\u00fc\u00fcs;http:\/\/cs.dbpedia.org\/resource\/Shlukov\u00e1_anal\u00fdza;http:\/\/pl.dbpedia.org\/resource\/Analiza_skupie\u0144;htt
p:\/\/nl.dbpedia.org\/resource\/Clusteranalyse\" >clustering<\/a> part and directly translate queries into keywords using KeyBERT. The advantage of going through the clustering process is that you will be able to better understand the data. If you don\u2019t have enough knowledge of the domain, I would suggest going through this step.<\/p>\r\n\r\n\r\n\r\n<p>There is another reason for clustering the data: we might want to analyze how queries change over time (this is particularly useful when products are affected by seasonality).<\/p>\r\n\r\n\r\n\r\n<h4 class=\"wp-block-heading\" id=\"topic-modelling\">Topic modeling<\/h4>\r\n\r\n\r\n\r\n<p>As mentioned, we will also use BERTopic, another well-known library by <a href=\"https:\/\/github.com\/MaartenGr\">Maarten<\/a>, to cluster the terms. BERTopic is also based on SBERT.<\/p>\r\n\r\n\r\n\r\n<p><img decoding=\"async\" width=\"191\" height=\"158\" src=\"https:\/\/lh3.googleusercontent.com\/UjvanHO5lHylqxssY9qKsSIkI56fsjMf1B-AnKTQdNGreRyMkkeObUemJLHy1BngfXaLaugEXmME5PxOePAfijJ8HAYfDqqBetmwP7WY_aFiG6nwjJQY8zXGuP0kBpfTy03QWpzq\"><\/p>\r\n\r\n\r\n\r\n<p><a href=\"https:\/\/maartengr.github.io\/BERTopic\/api\/bertopic.html\">BERTopic<\/a> is well documented and you can experiment with various settings. I ended up using the seed_topic_list parameter to have the model converge around a list of seed terms.<\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><img decoding=\"async\" width=\"382\" height=\"310\" src=\"https:\/\/lh3.googleusercontent.com\/ndBwczSufJfnQf1lQUBAvic347aGU30HxsRoKORoCvniIyqpaFVhLN6XJxl3Sga36cBGrHPYKRqfbkvVKk5j6Ue74hy7AYnLg61WzIih8Iv2tvh4no9us4jA0siWjzPEIwUZvd7x\"><\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><em>Clusters are visualized in a two-dimensional diagram<\/em>.<\/p>\r\n\r\n\r\n\r\n<p>You can use the various visualizations provided by the library to see the queries behind the page. 
If we look at the diagram below, which is not perfectly organized, we can still spot some potential subcategories worth creating.<\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><img decoding=\"async\" width=\"624\" height=\"431\" src=\"https:\/\/lh3.googleusercontent.com\/eFBzJ8UGEyhl4Hb44WhwTJFxAODF_NGEqtS7hEFs4nd_Y9Sma5thn_ROYqn60yoXBRj5Aib01Vx9ynJL_PScF325cg-HTpAEzikvwLN4SKhhUGPbdkY8OSZ8celriQr2ziAC8fZV\"><\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><em>A bar chart of the clusters.<\/em><\/p>\r\n\r\n\r\n\r\n<p>The iconic Wayfarer and the classic Aviator, as well as the shapes of the frames (round, circle glasses, etc.), seem to be valuable subcategories to create.<\/p>\r\n\r\n\r\n\r\n<figure class=\"wp-block-table\"><table><tbody><tr><td class=\"has-text-align-center\" data-align=\"center\"><img decoding=\"async\" src=\"https:\/\/lh4.googleusercontent.com\/z1fSMao6mA1z6xTaA8GhDxHjPm3MEa08rD0-p8bb4P3yBhe5GNEmJew0nFZmhWYcdMMDyWGy5x4QI2yQyZrn0WsgyXtDZD0uRUudXH0asqQPV9Svpc1UxvWjMwh5NF9_gQqUzTai\" width=\"280\" height=\"187\"><\/td><td class=\"has-text-align-center\" data-align=\"center\"><img decoding=\"async\" src=\"https:\/\/lh5.googleusercontent.com\/y2spabZHyKQ_RzMzt0uq8cDTnlXuv9A-HUTUmN6LNDHkSGxPqAb2nNXYHVadNgTJPweAdWNGbzg5LpfjlPK87xLC84_zjMJN_E4hEI8R8mg1WEDZVhKloJPSjrtZAhmOmxHZovN_\" width=\"263\" height=\"188\"><\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">James Dean wearing the Wayfarer<\/td><td class=\"has-text-align-center\" data-align=\"center\">Tom Cruise in Top Gun wearing the Aviator<\/td><\/tr><\/tbody><\/table><\/figure>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\" id=\"extract-keywords\">3. 
Extract keywords<\/h3>\r\n\r\n\r\n\r\n<p>The core of the workflow, once we have gained a first insight using BERTopic, is the extraction of keywords from the list of queries that we obtained from SEMrush.<br>We do this with the <em><strong>extract_kw<\/strong><\/em> function, which iterates through a pandas data frame and extracts keywords using KeyBERT and its <a href=\"https:\/\/maartengr.github.io\/KeyBERT\/api\/keybert.html\">useful set of parameters<\/a>. KeyBERT is relatively fast (depending on the size of the dataset), runs well on CPU, and provides the accuracy we need for this kind of task.<\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><img decoding=\"async\" width=\"624\" height=\"427\" src=\"https:\/\/lh5.googleusercontent.com\/dxjslq50yRQhBysBj9026xgoZNw8_TkZTN4UusU1YZSZRr5rY5IXpVcsBFpuyOyzenRGDX2i-POUL9qHs_iP6ByfY7bR6jZq_3-nu8vqW-pBeemY6tfzqXW_l7oAOmUtQ1qB1hvR\"><\/p>\r\n\r\n\r\n\r\n<p>We obtain from <em>extract_kw<\/em> a new data frame with two terms (<em>top_n=2<\/em>) for each query and we can also:<\/p>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\">\r\n<li>remove English stop words (<em>_stop_words=\"english\"<\/em> is the default),<\/li>\r\n\r\n\r\n\r\n<li>decide the length in words of the extracted keywords (<em>_ngram_range=(1,2)<\/em>),<\/li>\r\n\r\n\r\n\r\n<li>set the diversity of the extracted keywords (only works when <em>use_mmr=True<\/em>),<\/li>\r\n\r\n\r\n\r\n<li>use a list of terms to guide the extraction &#8211; as we did previously with BERTopic (<em>_seeded_keywords<\/em>),<\/li>\r\n\r\n\r\n\r\n<li>use a list of terms to match the keywords (<em>_candidates<\/em>); this will be particularly helpful in the next phase of our workflow.<\/li>\r\n<\/ul>\r\n\r\n\r\n\r\n<h4 class=\"wp-block-heading\" id=\"data-cleanup\">Data cleanup &#8211; the steps<\/h4>\r\n\r\n\r\n\r\n<p>Before using the keywords and analyzing the products in our category we\u2019ll 
need to do some cleanup of the extracted terms. We do this by:<\/p>\r\n\r\n\r\n\r\n<ol class=\"wp-block-list\">\r\n<li>Matching only terms above a certain threshold (prob_filter &gt;= .7)<\/li>\r\n\r\n\r\n\r\n<li>Combining all terms into a single list and removing any duplicates<\/li>\r\n\r\n\r\n\r\n<li>Excluding stop words. These are site-specific and have to do with what people search for and what is relevant for faceting the products. We\u2019ll need to remove branded queries like \u201cray ban\u201d and things like \u201ceye sunglasses\u201d, but also irrelevant keywords like \u201cdecal sunglasses\u201d.&nbsp;&nbsp;&nbsp;<\/li>\r\n<\/ol>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><img decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/ugFRdaN0vvo-T_K8Q_1HGK2peRDN3KkJ73VFosd6HaviYG7sknsjUB3xn1WcRWurMHEHHF7w4rqjsNwW85cR-Z0dFCw0yZyN96lk5A5hhko9Uk7u_8wC-iu78j2gsWTLsJMmVu4z\" width=\"624\" height=\"51\"><\/p>\r\n\r\n\r\n\r\n<h3 class=\"wp-block-heading\" id=\"list-of-products\">4. Analyze the list of products&nbsp;<\/h3>\r\n\r\n\r\n\r\n<p>We can visualize the resulting terms using a word cloud, and we can finally use them to review the list of products in our category.&nbsp;<\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><img decoding=\"async\" width=\"458\" height=\"458\" src=\"https:\/\/lh3.googleusercontent.com\/0A6bkYg6z4hETGgnHzfHJpdb1X3SH97G5UPM9O5eIFyE1cZSrSAATJPGDaoPgb8-wGy2ctvCSGwgS64eUEpheRRrPXIZU3ovSf1PgxyAjDo3BFUJx2BZTTbcNMFYG7SnCJr2rgXu\"><\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><em>A word cloud of the extracted keywords.<\/em><\/p>\r\n\r\n\r\n\r\n<p>For this tutorial, a subset of the products is listed on the same Google Sheet (<a href=\"https:\/\/docs.google.com\/spreadsheets\/d\/1FJmsqqc1U0oOEkaqdykzL4Ds_Q20LqNfs-XGicVVKWI\/edit#gid=23687057\">here<\/a>) under the tab \u201cProducts\u201d. 
We can run the extraction using KeyBERT once again, but this time we will use the <em>_candidates<\/em>&nbsp; parameter to let the model match the <a class=\"wl-entity-page-link\"  href=\"https:\/\/wordlift.io\/blog\/en\/entity\/text\/\" data-id=\"http:\/\/data.wordlift.io\/wl0216\/entity\/text;http:\/\/www.wikidata.org\/entity\/Q234460\" >text<\/a> with any of the previously identified keywords.&nbsp;<\/p>\r\n\r\n\r\n\r\n<p>We can run this analysis using either the title of the product page, the description of the product, or a combination of both.&nbsp;<\/p>\r\n\r\n\r\n\r\n<p>We will test it first on the description of a pair of aviator sunglasses with hexagonal lenses.&nbsp;<\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><img decoding=\"async\" width=\"624\" height=\"159\" src=\"https:\/\/lh3.googleusercontent.com\/DZ2XuIu_XA0PptzvsPCokgw8oWrhl6puwlO0hQnnXlOCB8VbIDlEy3wJeAi5j9crYTJNRn5ER0efYzG-EC79OPCfjqE5M_7aJjPTib2N8WySLqszvI3D8PszdxSMT14_nFCcCFRS\"><\/p>\r\n\r\n\r\n\r\n<p>The results are relevant: for this pair we obtain the following terms, which identify the product well:<\/p>\r\n\r\n\r\n\r\n<p>[(&#8216;hexagonal sunglasses&#8217;, 0.6067),<\/p>\r\n\r\n\r\n\r\n<p>&nbsp;(&#8216;classic sunglasses&#8217;, 0.529),<\/p>\r\n\r\n\r\n\r\n<p>&nbsp;(&#8216;sunglasses gold&#8217;, 0.5166)]<\/p>\r\n\r\n\r\n\r\n<p>With a single line of code we can now process all the products at once; in our example, we will analyze the title only. 
Finally, we can look at the resulting data frame and, by reviewing all the terms extracted from the product list, we will get:&nbsp;<\/p>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\">\r\n<li>Top N terms from the first extracted keyword (<em>kw_1<\/em>)\r\n<ul class=\"wp-block-list\">\r\n<li>Above a certain threshold (<em>[<span class=\"has-inline-color has-vivid-red-color\">&#8216;prob_1&#8217;<\/span>] &gt;= <span class=\"has-inline-color has-vivid-green-cyan-color\">0.5<\/span><\/em>)<\/li>\r\n\r\n\r\n\r\n<li>Sorted by count of associated products<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ul>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><img decoding=\"async\" width=\"251\" height=\"189\" src=\"https:\/\/lh5.googleusercontent.com\/CcMY_-SSok5F7IKbTvfxLQ0EJeJ5eOyoZxhv6zeOVTcSIxBTEtfotPM4SYzBKLXDh1xIFU_04k_6HlPbV6VlJXdvT1M-SZj8ujUsUdUqleiqWAUWQXqhXdqVOLOeNuJ6DQESl_MD\"><\/p>\r\n\r\n\r\n\r\n<p>While the counts are not meaningful, as we\u2019re dealing with a small subset of the products in our original category, we can immediately see that we could add the following subcategories:&nbsp;<\/p>\r\n\r\n\r\n\r\n<ol class=\"wp-block-list\">\r\n<li>Hexagonal sunglasses<\/li>\r\n\r\n\r\n\r\n<li>Wayfarer&nbsp;<\/li>\r\n\r\n\r\n\r\n<li>Aviator<\/li>\r\n\r\n\r\n\r\n<li>Octagonal sunglasses<\/li>\r\n\r\n\r\n\r\n<li>Square sunglasses<\/li>\r\n<\/ol>\r\n\r\n\r\n\r\n<p>If we look at the secondary terms being extracted we might also find it useful to add:<\/p>\r\n\r\n\r\n\r\n<p>6. Prescription glasses<\/p>\r\n\r\n\r\n\r\n<h2 class=\"wp-block-heading\" id=\"conclusion\">Conclusions and future work<\/h2>\r\n\r\n\r\n\r\n<p>While I am happy with this initial version of the workflow, there are many areas of potential improvement and iteration. Let\u2019s review them:&nbsp;<\/p>\r\n\r\n\r\n\r\n<ol class=\"wp-block-list\">\r\n<li>Differentiating between hierarchical categories (e.g. 
eyeglasses &gt; prescription glasses) and facets (product attributes like hexagonal or octagonal lenses).&nbsp;&nbsp;&nbsp;<\/li>\r\n\r\n\r\n\r\n<li>Adding key business metrics to the selection of the terms. We can weight some queries higher based on their likelihood of ranking. We can add a calculated score that factors in multiple weighted inputs (e.g. the likelihood of ranking, the products we have in stock, the expected conversion rate, etc.).<\/li>\r\n\r\n\r\n\r\n<li>Checking first how new products would fit within the existing categories. While we explained how to detect <em>new sub-categories<\/em>, we can re-use the same methodology to match new products with categories that we already have. Before adding new sub-categories we should always evaluate whether, among the existing categories, there is already a good match.&nbsp;<\/li>\r\n\r\n\r\n\r\n<li>Considering a seasonality input. We might decide, for example, to highlight a sub-category related to skiing accessories during the winter and hide it during the summer.&nbsp;&nbsp;<\/li>\r\n\r\n\r\n\r\n<li>Matching the detected terms with already existing taxonomies. A category, in a <a href=\"https:\/\/wordlift.io\/blog\/en\/product-knowledge-graph\/\">Product Knowledge Graph (PKG)<\/a>, becomes an entity. But in a PKG we already have, in most cases, entities for materials, styles and other product-related attributes. 
So in these cases, turning terms into <a class=\"wl-entity-page-link\" title=\"ience is\" href=\"https:\/\/wordlift.io\/blog\/en\/entity\/what-is-an-entity\/\" data-id=\"http:\/\/data.wordlift.io\/wl0216\/entity\/entity;http:\/\/dbpedia.org\/resource\/Entity;http:\/\/rdf.freebase.com\/ns\/m.0bl9f;https:\/\/g.co\/kg\/m\/0bl9f;https:\/\/www.wikidata.org\/wiki\/Q35120;http:\/\/data.wordlift.io\/wl0216\/entity\/entity\" >entities<\/a> will be very useful.<\/li>\r\n\r\n\r\n\r\n<li>Using these detected terms to instruct an NLG pipeline and create or improve introductory paragraphs that match the searcher\u2019s intent. If we create the sub-category for the wayfarer we will also need a compelling intro text. We can generate this text so that it best answers the queries we gathered around the wayfarer.<\/li>\r\n<\/ol>\r\n\r\n\r\n\r\n<p>Once again, always remember that the magic is not in the technology we use but in creating effective and continuous human-AI interactions.&nbsp;<\/p>\r\n\r\n\r\n\r\n<p><\/p>\r\n\r\n\r\n\r\n<p class=\"has-text-align-center\"><img decoding=\"async\" width=\"512\" height=\"512\" src=\"https:\/\/lh4.googleusercontent.com\/WfDfvuBVvl8TXDZbI-nHuWd5UYmNWoQ5xfqk0YrNnHbrLvN0QK2_05lVh32lmYBgtcJhwgfmhzG4Mi59FFgL8xIyXcI6bfjsZ60qi2b0j_5bBU7MBG7kmEyQyt8_9EjdReJD_al3\"><\/p>\r\n\r\n\r\n\r\n<p><\/p>\r\n\r\n\r\n\r\n\r\n","protected":false},"excerpt":{"rendered":"<p>Learn about category page optimization for e-commerce improving product category pages by using natural language processing. 
<\/p>\n","protected":false},"author":6,"featured_media":20609,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"wl_entities_gutenberg":"","_wlpage_enable":"","footnotes":""},"categories":[4202,8],"tags":[],"wl_entity_type":[30],"coauthors":[],"class_list":["post-20591","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-e-commerce","category-seo","wl_entity_type-article"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v22.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Category Page Optimization For E-commerce SEO - WordLift Blog<\/title>\n<meta name=\"description\" content=\"Learn about category page optimization for e-commerce improving product category pages by using natural language processing.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Category Page Optimization For E-commerce SEO\" \/>\n<meta property=\"og:description\" content=\"Learn about category page optimization for e-commerce improving product category pages by using natural language processing.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/\" \/>\n<meta property=\"og:site_name\" content=\"WordLift Blog\" \/>\n<meta property=\"article:published_time\" content=\"2022-02-11T10:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-04-17T08:59:50+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Blog-Covers-17.jpg\" \/>\n\t<meta 
property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"1200\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Andrea Volpini\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:title\" content=\"Category Page Optimization For E-commerce SEO\" \/>\n<meta name=\"twitter:description\" content=\"Learn about category page optimization for e-commerce improving product category pages by using natural language processing.\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Blog-Covers-17.jpg\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Andrea Volpini\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"16 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/\"},\"author\":{\"name\":\"Andrea Volpini\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#\/schema\/person\/574352082cc71dab8d164410f1cabe0a\"},\"headline\":\"E-commerce SEO for Product Category Pages with the help of 
AI\",\"datePublished\":\"2022-02-11T10:00:00+00:00\",\"dateModified\":\"2023-04-17T08:59:50+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/\"},\"wordCount\":2942,\"publisher\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#organization\"},\"image\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Blog-Covers-16.jpg\",\"articleSection\":[\"E-Commerce\",\"seo\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/\",\"url\":\"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/\",\"name\":\"Category Page Optimization For E-commerce SEO - WordLift Blog\",\"isPartOf\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Blog-Covers-16.jpg\",\"datePublished\":\"2022-02-11T10:00:00+00:00\",\"dateModified\":\"2023-04-17T08:59:50+00:00\",\"description\":\"Learn about category page optimization for e-commerce improving product category pages by using natural language 
processing.\",\"breadcrumb\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/#primaryimage\",\"url\":\"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Blog-Covers-16.jpg\",\"contentUrl\":\"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Blog-Covers-16.jpg\",\"width\":1200,\"height\":1200},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Blog\",\"item\":\"https:\/\/wordlift.io\/blog\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"E-commerce SEO for Product Category Pages with the help of AI\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#website\",\"url\":\"https:\/\/wordlift.io\/blog\/en\/\",\"name\":\"WordLift Blog\",\"description\":\"AI-Powered SEO\",\"publisher\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/wordlift.io\/blog\/en\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#organization\",\"name\":\"WordLift\",\"url\":\"https:\/\/wordlift.io\/blog\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/mk0wordliftblog7j5te.kinstacdn.com\/wp-content\/uploads\/sites\/3\/2017\/04\/logo-1.png\",\"contentUrl\":\"https:\/\/mk0wordliftblog7j5te.kinstacdn.com\/wp-content\/uploads\/sites\/3\/2017\/04\/logo-1.png\",\"width\":152,\"height\":40,\"caption\":\"WordLift\"},\"image\":{\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#\/schema\/person\/574352082cc71dab8d164410f1cabe0a\",\"name\":\"Andrea Volpini\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/wordlift.io\/blog\/en\/#\/schema\/person\/image\/466a1652833e48ca11c81b363eba7c25\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/6b9d3d311b50a8749201fe4b318907a8?s=96&d=mm&r=pg\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/6b9d3d311b50a8749201fe4b318907a8?s=96&d=mm&r=pg\",\"caption\":\"Andrea Volpini\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Category Page Optimization For E-commerce SEO - WordLift Blog","description":"Learn about category page optimization for e-commerce improving product category pages by using natural language processing.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/","og_locale":"en_US","og_type":"article","og_title":"Category Page Optimization For E-commerce SEO","og_description":"Learn about category page optimization for e-commerce improving product category pages by using natural language processing.","og_url":"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/","og_site_name":"WordLift Blog","article_published_time":"2022-02-11T10:00:00+00:00","article_modified_time":"2023-04-17T08:59:50+00:00","og_image":[{"width":1200,"height":1200,"url":"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Blog-Covers-17.jpg","type":"image\/jpeg"}],"author":"Andrea Volpini","twitter_card":"summary_large_image","twitter_title":"Category Page Optimization For E-commerce SEO","twitter_description":"Learn about category page optimization for e-commerce improving product category pages by using natural language processing.","twitter_image":"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Blog-Covers-17.jpg","twitter_misc":{"Written by":"Andrea Volpini","Est. 
reading time":"16 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/#article","isPartOf":{"@id":"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/"},"author":{"name":"Andrea Volpini","@id":"https:\/\/wordlift.io\/blog\/en\/#\/schema\/person\/574352082cc71dab8d164410f1cabe0a"},"headline":"E-commerce SEO for Product Category Pages with the help of AI","datePublished":"2022-02-11T10:00:00+00:00","dateModified":"2023-04-17T08:59:50+00:00","mainEntityOfPage":{"@id":"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/"},"wordCount":2942,"publisher":{"@id":"https:\/\/wordlift.io\/blog\/en\/#organization"},"image":{"@id":"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/#primaryimage"},"thumbnailUrl":"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Blog-Covers-16.jpg","articleSection":["E-Commerce","seo"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/","url":"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/","name":"Category Page Optimization For E-commerce SEO - WordLift Blog","isPartOf":{"@id":"https:\/\/wordlift.io\/blog\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/#primaryimage"},"image":{"@id":"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/#primaryimage"},"thumbnailUrl":"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Blog-Covers-16.jpg","datePublished":"2022-02-11T10:00:00+00:00","dateModified":"2023-04-17T08:59:50+00:00","description":"Learn about category page optimization for e-commerce improving product category pages by using natural language 
processing.","breadcrumb":{"@id":"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/#primaryimage","url":"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Blog-Covers-16.jpg","contentUrl":"https:\/\/wordlift.io\/blog\/en\/wp-content\/uploads\/sites\/3\/2022\/02\/Blog-Covers-16.jpg","width":1200,"height":1200},{"@type":"BreadcrumbList","@id":"https:\/\/wordlift.io\/blog\/en\/category-page-optimization-for-ecommerce\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog","item":"https:\/\/wordlift.io\/blog\/en\/"},{"@type":"ListItem","position":2,"name":"E-commerce SEO for Product Category Pages with the help of AI"}]},{"@type":"WebSite","@id":"https:\/\/wordlift.io\/blog\/en\/#website","url":"https:\/\/wordlift.io\/blog\/en\/","name":"WordLift Blog","description":"AI-Powered SEO","publisher":{"@id":"https:\/\/wordlift.io\/blog\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/wordlift.io\/blog\/en\/?s={search_term_string}"},"query-input":"required 
name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/wordlift.io\/blog\/en\/#organization","name":"WordLift","url":"https:\/\/wordlift.io\/blog\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/wordlift.io\/blog\/en\/#\/schema\/logo\/image\/","url":"https:\/\/mk0wordliftblog7j5te.kinstacdn.com\/wp-content\/uploads\/sites\/3\/2017\/04\/logo-1.png","contentUrl":"https:\/\/mk0wordliftblog7j5te.kinstacdn.com\/wp-content\/uploads\/sites\/3\/2017\/04\/logo-1.png","width":152,"height":40,"caption":"WordLift"},"image":{"@id":"https:\/\/wordlift.io\/blog\/en\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/wordlift.io\/blog\/en\/#\/schema\/person\/574352082cc71dab8d164410f1cabe0a","name":"Andrea Volpini","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/wordlift.io\/blog\/en\/#\/schema\/person\/image\/466a1652833e48ca11c81b363eba7c25","url":"https:\/\/secure.gravatar.com\/avatar\/6b9d3d311b50a8749201fe4b318907a8?s=96&d=mm&r=pg","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/6b9d3d311b50a8749201fe4b318907a8?s=96&d=mm&r=pg","caption":"Andrea 
Volpini"}}]}},"_wl_alt_label":[],"wl:entity_url":"http:\/\/data.wordlift.io\/wl0216\/post\/e-commerce-seo-for-product-category-pages-with-the-help-of-ai-20591","_links":{"self":[{"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/posts\/20591"}],"collection":[{"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/comments?post=20591"}],"version-history":[{"count":26,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/posts\/20591\/revisions"}],"predecessor-version":[{"id":24526,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/posts\/20591\/revisions\/24526"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/media\/20609"}],"wp:attachment":[{"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/media?parent=20591"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/categories?post=20591"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/tags?post=20591"},{"taxonomy":"wl_entity_type","embeddable":true,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/wl_entity_type?post=20591"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/wordlift.io\/blog\/en\/wp-json\/wp\/v2\/coauthors?post=20591"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}