Posted in python
1:30 am, April 4, 2021

Python: extract the title tag from a URL's HTML using regex

This Python script downloads the HTML at a URL, finds the title tag with a regular expression, and prints the title as plain text.

Python

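import re
from urllib.request import urlopen

url = "http://olympus.realpython.org/profiles/dionysus"
# Download the page and decode the bytes to text
with urlopen(url) as page:
    html = page.read().decode("utf-8")
# DOTALL lets the title span several lines; IGNORECASE also matches <TITLE>
pattern = "<title.*?>.*?</title.*?>"
match_results = re.search(pattern, html, re.IGNORECASE | re.DOTALL)
if match_results:
    title = re.sub("<.*?>", "", match_results.group())  # Remove HTML tags
    print(title.strip())
else:
    print("No title tag found")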
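
The same regex idea can be wrapped in a function and tried on an inline HTML string, so it runs without a network connection. This is only a sketch: the name extract_title and the sample HTML are made up for illustration and are not part of the script above. It also decodes entities such as &amp; with html.unescape, which the script above does not do.

```python
import html
import re

# Capture everything between the opening and closing title tags.
# DOTALL lets the title span several lines; IGNORECASE also matches <TITLE>.
TITLE_RE = re.compile(r"<title\b[^>]*>(.*?)</title\s*>", re.IGNORECASE | re.DOTALL)


def extract_title(html_text):
    """Return the text of the first title element, or None if there is none."""
    match = TITLE_RE.search(html_text)
    if match is None:
        return None
    # Drop any nested tags, decode entities, and collapse runs of whitespace.
    text = re.sub(r"<[^>]*>", "", match.group(1))
    return " ".join(html.unescape(text).split())


# Made-up sample input for illustration.
sample = "<html><head><TITLE lang='en'>\n  Profile: Dionysus &amp; friends\n</TITLE></head></html>"
print(extract_title(sample))
print(extract_title("<p>no title here</p>"))
```

Returning None instead of raising keeps the caller in charge of what to do when a page has no title. For messy real-world pages, a proper HTML parser is usually a safer choice than a regex.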
