Posted in python
19
1:30 am, April 4, 2021

python extract title tag from url and html using regex

this will extract the title tag as text from the url and the title tag in the following python script

Python

import re
from urllib.request import urlopen
url = "http://olympus.realpython.org/profiles/dionysus"
page = urlopen(url)
html = page.read().decode("utf-8")
pattern = "<title.*?>.*?</title.*?>"
match_results = re.search(pattern, html, re.IGNORECASE)
title = match_results.group()
title = re.sub("<.*?>", "", title) # Remove HTML tags
print(title)

View Statistics
This Week
63
This Month
318
This Year
0

No Items Found.

Add Comment
Type in a Nick Name here
 
Search Code
Search Code by entering your search text above.
Welcome

This is my test area for webdev. I keep a collection of code snippits here, mostly for my reference. Also if i find a good site, i usually add it here.

Join me on Substack if you want me to send you a collection of the things i have done or found or read for the week. Or follow me on twitter if you prefer, i dont post much but i probably should!

❤👩‍💻🕹

Random Quote
All of humanity's problems stem from man's inability to sit quietly in a room alone, wrote Blaise Pascal.
Unknown