Posted in python
19
1:30 am, April 4, 2021

python extract title tag from url and html using regex

this will extract the title tag as text from the url and the title tag in the following python script

Python

import re
from urllib.request import urlopen
url = "http://olympus.realpython.org/profiles/dionysus"
page = urlopen(url)
html = page.read().decode("utf-8")
pattern = "<title.*?>.*?</title.*?>"
match_results = re.search(pattern, html, re.IGNORECASE)
title = match_results.group()
title = re.sub("<.*?>", "", title) # Remove HTML tags
print(title)

View Statistics
This Week
49
This Month
264
This Year
0

No Items Found.

Add Comment
Type in a Nick Name here
 
Search Code
Search Code by entering your search text above.
Welcome

This is my test area for webdev. I keep a collection of code snippits here, mostly for my reference. Also if i find a good site, i usually add it here.

Join me on Substack if you want me to send you a collection of the things i have done or found or read for the week. Or follow me on twitter if you prefer, i dont post much but i probably should!

❤👩‍💻🕹

Random Quote
For a long time it had seemed to me that life was about to begin - real life. But there was always some obstacle in the way, something to be gotten through first, some unfinished business, time still to be served, or a debt to be paid. Then life would begin. At last it dawned on me that these obstacles were my life.
Alfred D. Souza