Long live AI, let’s scale the generation of meta descriptions with our adorable robot [CODE IS HERE]
So here is how everything works in the code linked to this article.
- We start with a CSV that I generated using the WooRank’s crawler (here you can tweak the code and use any CSV that helps you detect where on the site MDs are missing and where it can be useful to add them); the file provided in the code has been made available on Google Drive (this way we can always look at the data before running the script).
- We analyze the data from the crawler and build a dataframe using Pandas.
- We then choose what URLs are more critical: in the code provided I basically work on the analysis of the wordlift.io website and focus only on content from the English blog that has already a ranking position. Feel free to play with the Pandas filters and to infuse your own SEO knowledge and experience to the script.
- We then crawl each page (and here you might want to define the CSS class that the site uses in the HTML to detect the body of the article – hence preventing you from analyzing menus and other unnecessary elements in the page).
- We ask BERT (with a vanilla configuration that you can fine-tune) to generate a summary for each page and to write it on a csv file.
- With the resulting CSV we can head back to our beloved CMS and find the best way to import the data (you might want to curate BERT’s suggestions before actually going live with it – once again – most of the cases we can do better then the machine).
Super easy, not too intensive in computational terms and…environmentally friendly ?
Have fun playing with it! Always remember, it is a robot friend and not a real replacement of your precious work. BERT can do the heavy lifting of reading the page and highlighting what matters the most but it might still fail in getting the right length or in adding the proper CTA (i.e. “read more to find …”).