Posted in python
19
1:30 am, April 4, 2021

python extract title tag from url and html using regex

this will extract the title tag as text from the url and the title tag in the following python script

Python

import re
from urllib.request import urlopen
url = "http://olympus.realpython.org/profiles/dionysus"
page = urlopen(url)
html = page.read().decode("utf-8")
pattern = "<title.*?>.*?</title.*?>"
match_results = re.search(pattern, html, re.IGNORECASE)
title = match_results.group()
title = re.sub("<.*?>", "", title) # Remove HTML tags
print(title)

View Statistics
This Week
38
This Month
246
This Year
0

No Items Found.

Add Comment
Type in a Nick Name here
 
Search Code
Search Code by entering your search text above.
Welcome

This is my test area for webdev. I keep a collection of code snippits here, mostly for my reference. Also if i find a good site, i usually add it here.

Join me on Substack if you want me to send you a collection of the things i have done or found or read for the week. Or follow me on twitter if you prefer, i dont post much but i probably should!

❤👩‍💻🕹

Random Quote
Our brains are wired to find things we are looking for - if your always cynical or waiting for things to go wrong, then your life will reflect that. On the other hand, having a positive outlook on life will bring you joy and provide you with inspiration when you least expect it ☀️🍉🌴
Unknown