by Alex Galea
Search engine optimization (SEO) involves a range of technical considerations, such as page titles, redirects and structured data. With Python, we can build a scalable pipeline to extract and audit this data from web pages. We’ll show how this (and more) can be done using a Jupyter Notebook!
Web scraping technologies allow us (at Ayima) to extract on-page data from our clients’ sites at scale. Over the last couple of years, we’ve built a collection of tools that we regularly use to audit large sets of pages. Often we are interested in well-known SEO data like page titles and meta descriptions, but there is plenty of other important data we look at as well. This includes meta robots tags, canonical URLs, redirects, structured data and (surprisingly) facets!
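To make the idea of on-page data concrete, here is a minimal sketch of pulling the title, meta description, meta robots tag and canonical URL out of a page's HTML. It uses only Python's standard-library `html.parser` and is not necessarily the tooling used at Ayima. The names `OnPageParser`, `extract_on_page_data` and `SAMPLE_HTML` are hypothetical, and the sample markup is invented for illustration.

```python
from html.parser import HTMLParser


class OnPageParser(HTMLParser):
    """Collect a few common on-page SEO fields from an HTML document."""

    def __init__(self):
        super().__init__()
        self.data = {
            "title": None,
            "meta_description": None,
            "meta_robots": None,
            "canonical": None,
        }
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # <meta name="description"> and <meta name="robots">
            name = (attrs.get("name") or "").lower()
            if name == "description":
                self.data["meta_description"] = attrs.get("content")
            elif name == "robots":
                self.data["meta_robots"] = attrs.get("content")
        elif tag == "link":
            # <link rel="canonical" href="...">
            rel = (attrs.get("rel") or "").lower().split()
            if "canonical" in rel:
                self.data["canonical"] = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.data["title"] = (self.data["title"] or "") + data


def extract_on_page_data(html):
    """Return a dict of on-page fields; missing fields stay None."""
    parser = OnPageParser()
    parser.feed(html)
    parser.close()
    if parser.data["title"] is not None:
        parser.data["title"] = parser.data["title"].strip()
    return parser.data


# Invented sample page, for illustration only.
SAMPLE_HTML = """
<html>
  <head>
    <title> Example Product Page </title>
    <meta name="description" content="A short summary of the page.">
    <meta name="robots" content="noindex, follow">
    <link rel="canonical" href="https://www.example.com/product">
  </head>
  <body><h1>Example</h1></body>
</html>
"""

result = extract_on_page_data(SAMPLE_HTML)
print(result)
```

In a real audit the HTML would come from a crawler or an HTTP client rather than a hard-coded string, and the resulting dictionaries (one per URL) could be loaded into a table inside the notebook for review.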
About the Author
Author website: https://medium.com/@galea