Posted in python
19
1:30 am, April 4, 2021

python extract title tag from url and html using regex

this will extract the title tag as text from the url and the title tag in the following python script

Python

import re
from urllib.request import urlopen
url = "http://olympus.realpython.org/profiles/dionysus"
page = urlopen(url)
html = page.read().decode("utf-8")
pattern = "<title.*?>.*?</title.*?>"
match_results = re.search(pattern, html, re.IGNORECASE)
title = match_results.group()
title = re.sub("<.*?>", "", title) # Remove HTML tags
print(title)

View Statistics
This Week
37
This Month
97
This Year
0

No Items Found.

Add Comment
Type in a Nick Name here
 
Search Code
Search Code by entering your search text above.
Welcome

This is my test area for webdev. I keep a collection of code snippits here, mostly for my reference. Also if i find a good site, i usually add it here.

Join me on Substack if you want me to send you a collection of the things i have done or found or read for the week. Or follow me on twitter if you prefer, i dont post much but i probably should!

❤👩‍💻🕹

Random Quote
“We ought to take outdoor walks in order that the mind may be strengthened and refreshed by the open air and much breathing.".
Seneca