Techsians

    Automated SEO Audits with Slack + Python

By Chester | May 11, 2021

In this age of extreme SEO competition, performing scheduled SEO audits has become essential to sustaining consistent traffic and rankings.

Of course, many well-known platforms offer SEO audit tools, but they are costly and an unnecessary burden on your digital marketing budget.

At PEMAVOR, we like to share our expertise to meet your digital marketing needs.

    So, here we are with another terrific Python script that will make your life easier!

With the Python: Compare the Content script, we showed a practical way to compare your content against your strongest competitor's.

Then we opened the door a bit further: Python Autosuggest Trends and Python Semantic Keyword Clustering let you come up with keyword ideas without paying extra fees. After all, we're performance marketing experts, and we don't want you to pay extra for anything, neither PPC nor SEO.

    Now, it’s time to take a massive step into the mysterious and practical world of Python.

    I will now show you how you can set up your SEO monitoring solution in Slack using three audit scripts.

    ✔️ Add settings to Slack for notifications and file uploads

✔️ Audit Job #1: “Sitemap Status Code Checker”
Report the number of cases with status codes other than 2xx.
Attach the URL + bad status code pairs as a file to the message.

✔️ Audit Job #2: “Internal Link Checker”
Check all internal links found on the website and report the number of cases with bad status codes.
Attach a file listing, for each bad case, the URL where the link was found, the link URL, the link status code, and the link anchor text.

✔️ Audit Job #3: “Missing Meta Description Checker”
Check for missing meta descriptions on all URLs and report the number of cases.
Attach the URLs with missing meta descriptions as a file.

The running example below bundles two more SEO audit scripts alongside the first, and I believe you'll come up with many more Python SEO audit solutions as long as you're creative. You can automate almost everything with Python.
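As one creative extension in the same spirit, a fourth audit could flag pages whose title tag is missing or too long, using the same requests + BeautifulSoup pattern as the script below. The `check_title` helper here is a hypothetical sketch (the 60-character limit is an assumed rule of thumb), not part of the script that follows:

```python
from bs4 import BeautifulSoup


def check_title(html, max_len=60):
    """Return (title_text, issue) where issue is None, 'missing' or 'too_long'."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("title")
    if tag is None or not tag.get_text(strip=True):
        # no <title> tag at all, or an empty one
        return None, "missing"
    title = tag.get_text(strip=True)
    if len(title) > max_len:
        return title, "too_long"
    return title, None
```

Feeding each crawled page's HTML through this check and collecting the "missing" and "too_long" rows into a CSV would slot straight into the Slack-reporting flow shown later.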

    Monitoring App in Slack

First, you'll need a Slack workspace. Slack has a free plan, which is fine for this purpose.

    1. Now that you have Slack, go to Slack's app management page.
    2. Click “Create New App.”
    3. Choose an app name, e.g., SEO Audit, and select your Slack workspace.
    4. Your Python script needs permissions for notifications and file uploads. Go to “OAuth & Permissions.”
    5. Under “Bot Token Scopes,” add the following OAuth scopes:

    files:write
    channels:join
    chat:write

    6. Click “Install to Workspace,” and you'll see the “OAuth Access Token.” Copy it and paste it into your Python script.
    7. The Slack part is now almost finished. Choose the channel where you want your messages, click the “Add apps” menu, and look for your newly created app.
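Before wiring up the full audit, it's worth posting a one-off test message through Slack's chat.postMessage method to confirm the token and scopes work. This is a minimal sketch; the channel name and the `xoxb-` placeholder token are assumptions you must replace with your own values:

```python
import requests


def build_message_payload(slack_token, slack_channel, text):
    # chat.postMessage accepts plain form fields: token, channel, text
    return {"token": slack_token, "channel": slack_channel, "text": text}


def send_test_message(slack_token, slack_channel, text):
    """Post `text` to `slack_channel`; returns Slack's JSON response."""
    r = requests.post("https://slack.com/api/chat.postMessage",
                      data=build_message_payload(slack_token, slack_channel, text))
    # Slack replies with {"ok": true, ...} on success,
    # or {"ok": false, "error": "..."} (e.g. "invalid_auth") on failure
    return r.json()
```

If the response has `"ok": false`, the `"error"` field usually tells you exactly which scope or token detail is wrong.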

    3 Simple SEO Audits in Python

As I already mentioned, this is just a blueprint for building your own SEO audits. You can add as many routines as you wish.

    Just don’t forget to change the sitemap URL and add your own Slack OAuth Access Token, then you’re more than ready.

    Here is the Python code:

    # Pemavor.com SEO Monitoring with Slack Notifications
    # Author: Stefan Neefischer

    import time
    import warnings
    from urllib.parse import urlparse, urljoin

    import advertools as adv
    import pandas as pd
    import requests
    from bs4 import BeautifulSoup

    warnings.filterwarnings("ignore")


    def slack_notification_message(slack_token, slack_channel, message):
        """Post a text message to a Slack channel via chat.postMessage."""
        data = {
            "token": slack_token,
            "channel": slack_channel,
            "text": message,
        }
        url_chat = "https://slack.com/api/chat.postMessage"
        requests.post(url=url_chat, data=data)


    def slack_notification_file(slack_token, slack_channel, filename, filetype):
        """Upload a file to a Slack channel via the files.upload method."""
        url = "https://slack.com/api/files.upload"
        querystring = {"token": slack_token}
        payload = {"channels": slack_channel}
        file_upload = {"file": (filename, open(filename, "rb"), filetype)}
        # Don't set a multipart Content-Type header yourself --
        # requests builds the boundary for you.
        requests.post(url, data=payload, params=querystring, files=file_upload)


    def getStatuscode(url):
        """Return (status_code, description_flag) for `url`.

        description_flag is 1 if a meta description was found, else -1.
        """
        try:
            headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36"}
            r = requests.get(url, headers=headers, verify=False, timeout=25, allow_redirects=False)
            soup = BeautifulSoup(r.text, "html.parser")
            metas = soup.find_all("meta")
            description = [meta.attrs["content"] for meta in metas
                           if "name" in meta.attrs and meta.attrs["name"] == "description"]
            des = 1 if len(description) > 0 else -1
            return r.status_code, des
        except Exception:
            return -1, -1


    def is_valid(url):
        """Checks whether `url` is a valid URL."""
        parsed = urlparse(url)
        return bool(parsed.netloc) and bool(parsed.scheme)


    def get_all_website_links(url):
        """Returns all [URL, anchor text] pairs found on `url` that belong to the same website."""
        internal_urls = []
        seen = set()
        # domain name of the URL without the protocol
        domain_name = urlparse(url).netloc
        headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36"}
        r_content = requests.get(url, headers=headers, verify=False, timeout=25, allow_redirects=False).content
        soup = BeautifulSoup(r_content, "html.parser")
        for a_tag in soup.find_all("a"):
            href = a_tag.attrs.get("href")
            if href == "" or href is None:
                # empty href attribute
                continue
            # join the URL if it's relative (not an absolute link)
            href = urljoin(url, href)
            parsed_href = urlparse(href)
            # remove URL GET parameters, URL fragments, etc.
            href = parsed_href.scheme + "://" + parsed_href.netloc + parsed_href.path
            if not is_valid(href):
                # not a valid URL
                continue
            if href in seen:
                # already collected
                continue
            if domain_name not in href:
                # external link
                continue
            seen.add(href)
            internal_urls.append([href, a_tag.string])
        return internal_urls


    def get_sitemap_urls(site):
        """Return all URLs listed in the XML sitemap."""
        sitemap = adv.sitemap_to_df(site)
        sitemap_urls = sitemap["loc"].dropna().to_list()
        return sitemap_urls


    def sitemap_internallink_status_code_checker(site, SLEEP, slack_token, slack_channel):
        print("Start scraping internal links for all sitemap urls")
        sitemap_urls = get_sitemap_urls(site)
        sub_links_dict = dict()
        for url in sitemap_urls:
            sub_links_dict[url] = get_all_website_links(url)

        print("Checking status codes and meta descriptions")
        scrapped_url = dict()     # cache: url -> status code
        description_url = dict()  # cache: url -> meta description flag
        url_statuscodes = []
        for link in sub_links_dict.keys():
            for int_link in sub_links_dict[link]:
                internal_link = int_link[0]
                linktext = int_link[1]
                if internal_link in scrapped_url:
                    check = [link, internal_link, linktext,
                             scrapped_url[internal_link], description_url[internal_link]]
                else:
                    linkstatus, descriptionstatus = getStatuscode(internal_link)
                    scrapped_url[internal_link] = linkstatus
                    description_url[internal_link] = descriptionstatus
                    check = [link, internal_link, linktext, linkstatus, descriptionstatus]
                    time.sleep(SLEEP)
                url_statuscodes.append(check)
        url_statuscodes_df = pd.DataFrame(
            url_statuscodes,
            columns=["url", "internal_link", "link_text", "status_code", "description_status"])

        # check the status code for all sitemap urls
        sitemap_statuscodes = []
        for url in sitemap_urls:
            if url in scrapped_url:
                check = [url, scrapped_url[url]]
            else:
                linkstatus, descriptionstatus = getStatuscode(url)
                check = [url, linkstatus]
                time.sleep(SLEEP)
            sitemap_statuscodes.append(check)
        sitemap_statuscodes_df = pd.DataFrame(sitemap_statuscodes, columns=["url", "status_code"])

        # statistics, then send to Slack
        strstatus = ""
        df_internallink_status = url_statuscodes_df[url_statuscodes_df["status_code"] != 200]
        if len(df_internallink_status) > 0:
            df_internallink_status = df_internallink_status[["url", "internal_link", "link_text", "status_code"]]
            # bucket codes into 2XX/3XX/4XX/5XX groups
            df_internallink_status["status_group"] = (df_internallink_status["status_code"] / 100).astype(int) * 100
            for status in df_internallink_status["status_group"].unique():
                noUrls = len(df_internallink_status[df_internallink_status["status_group"] == status])
                sts = f"{status}"[:-1] + "X"   # e.g. 400 -> "40X"
                if sts == "X":                 # group 0 means the request failed
                    sts = "-1"
                strstatus = f">*{noUrls}* internal links with status code *{sts}*\n" + strstatus
            df_internallink_status = df_internallink_status[["url", "internal_link", "link_text", "status_code"]]
            df_internallink_status.to_csv("internallinks.csv", index=False)
        else:
            strstatus = ">*Great news!* There are no internal links with a bad status code\n"

        df_description = url_statuscodes_df[url_statuscodes_df["description_status"] == -1]
        if len(df_description) > 0:
            df_description = df_description[["internal_link", "status_code", "description_status"]]
            df_description = df_description.drop_duplicates(subset=["internal_link"])
            df_description.rename(columns={"internal_link": "url"}, inplace=True)
            df_description.to_csv("linksdescription.csv", index=False)
            strdescription = f">*{len(df_description)}* urls without a *meta description*.\n"
        else:
            strdescription = ">*Great news!* There are no urls without a *meta description*\n"

        sitemapstatus = ""
        df_sitemap_status = sitemap_statuscodes_df[sitemap_statuscodes_df["status_code"] != 200]
        if len(df_sitemap_status) > 0:
            df_sitemap_status = df_sitemap_status[["url", "status_code"]]
            df_sitemap_status["status_group"] = (df_sitemap_status["status_code"] / 100).astype(int) * 100
            for status in df_sitemap_status["status_group"].unique():
                noUrls = len(df_sitemap_status[df_sitemap_status["status_group"] == status])
                sts = f"{status}"[:-1] + "X"
                if sts == "X":
                    sts = "-1"
                sitemapstatus = f">*{noUrls}* urls with status code *{sts}*\n" + sitemapstatus
            df_sitemap_status = df_sitemap_status[["url", "status_code"]]
            df_sitemap_status.to_csv("sitemaplinks.csv", index=False)
        else:
            sitemapstatus = ">*Great news!* There are no urls in the sitemap with a bad status code\n"

        message = f"After analysing the {site} sitemap: \n" + strstatus + strdescription + sitemapstatus
        if (len(df_sitemap_status) + len(df_internallink_status) + len(df_description)) > 0:
            message += "For more details see the attached files."

        print("Sending Slack notifications")
        slack_notification_message(slack_token, slack_channel, message)
        if len(df_sitemap_status) > 0:
            slack_notification_file(slack_token, slack_channel, "sitemaplinks.csv", "text/csv")
        if len(df_internallink_status) > 0:
            slack_notification_file(slack_token, slack_channel, "internallinks.csv", "text/csv")
        if len(df_description) > 0:
            slack_notification_file(slack_token, slack_channel, "linksdescription.csv", "text/csv")


    # Enter your XML sitemap
    sitemap = "https://www.pemavor.com/sitemap.xml"

    # Time in seconds the script should wait between requests
    SLEEP = 0.5

    # -------------------------------------------------------------------------
    # Enter your Slack OAuth token here
    slack_token = "XXXX-XXXXXXXX-XXXXXX-XXXXXXX"

    # Change the Slack channel to your target one
    slack_channel = "SEO Monitoring"

    sitemap_internallink_status_code_checker(sitemap, SLEEP, slack_token, slack_channel)
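One detail of the script worth isolating: it reports status codes per group rather than per exact code, by integer-dividing by 100 and multiplying back, so 404 and 410 both land in the 400 bucket (reported as "40X"). A minimal pandas sketch of that step with made-up example codes:

```python
import pandas as pd

# sample status codes, purely for illustration
df = pd.DataFrame({"status_code": [200, 301, 404, 410, 503]})

# integer-divide by 100 and multiply back to bucket codes by their first digit
df["status_group"] = (df["status_code"] / 100).astype(int) * 100

print(df["status_group"].tolist())  # [200, 300, 400, 400, 500]
```

Failed requests (status code -1) fall into group 0, which is why the script maps the resulting "X" label back to "-1" before reporting.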

    Where to Run and Schedule your Scripts?

    • It’s better you host your script in the Cloud. We use Cloud Functions or Cloud Runs that are triggered by Pub/Sub
    • Or, you can simply use a small virtual server which is provided by many web hosting services. As they generally run on Linux, add your Python code there and schedule it using the good old crontab
    • RaspberryPi is also a good solution if you want to hack around a little bit. You can run your own home-based Linux server 24×7. It is pretty cheap, around 60$ and mobile, so that you can place and hide it somewhere, maybe from your wife ☺. I say it is a perfect project for Covid lockdowns!
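For the crontab route, scheduling the audit comes down to one line. The paths and schedule below are hypothetical; point them at wherever you saved the script:

```shell
# Open this user's crontab for editing on the server:
crontab -e

# Then add an entry like this one to run the audit every Monday at 07:00,
# appending stdout and stderr to a log file (paths are examples):
# 0 7 * * 1 /usr/bin/python3 /home/pi/seo_audit/seo_monitoring.py >> /home/pi/seo_audit/audit.log 2>&1
```

The five crontab fields are minute, hour, day of month, month, and day of week, so `0 7 * * 1` means 07:00 on Mondays.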

    Quite fun – not only in boring Covid-19 times: Automating your stuff with a Raspberry Pi home server
