Download Website H1 Tag and Meta Title/Meta Description - Using Python3/Beautiful Soup4 - HTML parser - Jacky Yuan | Digital Marketing Consultant

Download Website H1 Tag and Meta Title/Meta Description – Using Python3/Beautiful Soup4 – HTML parser

S: One of the SEO team's fundamental jobs is managing the website H1 tags and metadata of 80+ clients. Because our agency does not develop the clients' websites, most changes to their H1 tags and metadata are made without any notification to the SEO team, and clients complained that we were not acting proactively.

T: Monitor clients' website H1 tags and metadata every day and send an email notification whenever the data changes.

A:

I wrote Python 3 scripts that use the Beautiful Soup library to scrape the website data, deployed the code on an Amazon AWS EC2 server, and created a crontab task that runs it every morning.

  • First script: send the “Meta Data and H1 Tags” as a text file daily and send the data to the website platforms.
  • Second script: compare against the previous day's data and send an alert if anything changed.
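The scraping step behind both scripts can be sketched roughly as follows. This is a minimal illustration, not the production code: the function name `extract_seo_fields` and the returned dictionary layout are assumptions made for this example. It parses an HTML string with Beautiful Soup's built-in `html.parser` backend and pulls out the title, meta description, and all H1 tags.

```python
from bs4 import BeautifulSoup


def extract_seo_fields(html):
    """Return the title, meta description, and H1 texts found in an HTML string.

    Hypothetical helper for illustration; the original scripts are not shown in full.
    """
    soup = BeautifulSoup(html, "html.parser")

    # <title> can be missing on malformed pages, so fall back to an empty string
    title = soup.title.get_text(strip=True) if soup.title else ""

    # Look for <meta name="description" content="...">
    meta = soup.find("meta", attrs={"name": "description"})
    description = meta.get("content", "").strip() if meta else ""

    # A page may carry more than one <h1>, so collect all of them
    h1_tags = [h1.get_text(strip=True) for h1 in soup.find_all("h1")]

    return {"title": title, "description": description, "h1": h1_tags}
```

Keeping the parser separate from the download step means the same function can be run against saved HTML when checking why an alert fired.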

R:

  • Increased client satisfaction.
  • Saved the SEO team 10+ hours every week.
  • Next step: roll the scripts out to the SEM team.


First script: send the “Meta Data and H1 Tags” as a text file daily and send the data to the website platforms.


import urllib.request, urllib.parse, urllib.error
from urllib.request import urlopen
from bs4 import BeautifulSoup
import ssl
import re
import numpy
import os.path
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.base import MIMEBase
from email import encoders
# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

# Input file to read and output file for the scraped metadata
fname = "website.txt"
f = open(fname, 'r')
fout = open('website-meta-data.txt', 'w')
# For security reasons, I am unable to show the entire code. If you need the full code, you can leave your email below.
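Since the first script mails the text file out each day, here is one possible way to build such a message with the same `email.mime` classes imported above. It is only a sketch under stated assumptions: the function name `build_report_email`, the subject line, and the body text are invented for this example, and the addresses passed in are placeholders.

```python
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_report_email(sender, recipient, report_text, filename="website-meta-data.txt"):
    """Build a multipart email that carries the report as a text-file attachment.

    Hypothetical helper: the subject and body wording are placeholders.
    """
    msg = MIMEMultipart()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Daily H1 and metadata report"
    msg.attach(MIMEText("The latest H1 and metadata export is attached.", "plain"))

    # Wrap the report bytes in a generic attachment part and base64-encode it
    part = MIMEBase("application", "octet-stream")
    part.set_payload(report_text.encode("utf-8"))
    encoders.encode_base64(part)
    part.add_header("Content-Disposition", "attachment", filename=filename)
    msg.attach(part)
    return msg
```

The returned message could then be handed to `smtplib.SMTP(...).send_message(msg)`; the SMTP host, port, and credentials depend on the mail provider and are deliberately left out here.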

Second script: compare against the previous day's data and send an alert if anything changed.

import urllib.request, urllib.parse, urllib.error
from urllib.request import urlopen
from bs4 import BeautifulSoup
import ssl
import re
import numpy
import os.path
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.base import MIMEBase
from email import encoders
# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

#fname = input('Enter File:')
def first_check_url():
    ...  # body omitted

def second_check_url():
    ...  # body omitted

# For security reasons, I am unable to show the entire code. If you need the full code, you can leave your email below.
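The comparison step of the second script is not shown, but the idea can be illustrated with a small sketch. It assumes, purely for this example, that each day's scrape is held as a dictionary mapping a URL to its fields; `diff_snapshots` is a hypothetical name and not one of the functions above.

```python
def diff_snapshots(previous, current):
    """Compare two {url: {field: value}} snapshots and describe what changed.

    Hypothetical helper: the snapshot layout is an assumption for illustration.
    """
    changes = []
    for url in sorted(set(previous) | set(current)):
        old = previous.get(url)
        new = current.get(url)
        if old is None:
            changes.append(f"{url}: newly tracked")
        elif new is None:
            changes.append(f"{url}: missing from today's run")
        else:
            # Report every field whose value differs between the two days
            for field in sorted(set(old) | set(new)):
                if old.get(field) != new.get(field):
                    changes.append(
                        f"{url}: {field} changed from {old.get(field)!r} to {new.get(field)!r}"
                    )
    return changes
```

An empty list means nothing changed, so an alert email would only need to be sent when the list is non-empty.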