In ORACLE 10G, we can simply use REGEXP_REPLACE to extract all data between HTML TAGS
Examples
--------------------------------------------------------------------------------------------------------------
WITH T AS
(
SELECT
'<HTML><HEAD>THIS IS HEAD.</HEAD><BODY>THIS IS BODY.<P>THIS IS P.</P><H1>THIS IS H1.</H1></BODY></HTML>' STR
FROM
DUAL
)
SELECT REGEXP_REPLACE(STR,'<.*?>') FROM T
WITH T AS
(
SELECT
'<HTML><HEAD>THIS IS HEAD.</HEAD><BODY>THIS IS BODY.<P>THIS IS P.</P><H1>THIS IS H1.</H1></BODY></HTML>' STR
FROM
DUAL
)
SELECT REGEXP_REPLACE(STR, '<[^>]+>') FROM T
ORACLE 11G has provided us a simple way to extract text between a specified HTML tag.
Example
--------------------------------------------------------------------------------------------------------------
WITH T AS
(
SELECT
'<HTML><HEAD>THIS IS HEAD.</HEAD><BODY>THIS IS BODY.<P>THIS IS P.</P><H1>THIS IS H1.</H1></BODY></HTML>' STR
FROM
DUAL
)
SELECT XMLTYPE(STR).EXTRACT('/HTML/BODY/P/text()').GETSTRINGVAL() FROM T
Related Links:
Oracle: Extract Initials of Names
http://nimishgarg.blogspot.com/2011/01/oracle-extract-initials-of-name.html
Oracle: Extract Numbers from String (Ex: Pin from Address)
http://nimishgarg.blogspot.com/2010/04/oracle-sql-extract-numbers-from-string.html
The 11G example appears to work in 10g (10.2.0.5).
ReplyDeleteThanks for pointing this out :)
ReplyDeleteGreat blog.The code is good and easy to understandable.It help to extract between HTML tags.Thanks for wonderful posting.
ReplyDeletethank buddy :)
ReplyDeleteHi there ,
ReplyDeleteIts pretty nice blog in there . However I wanted to ask if i can use the same on a column inside a table say Projects_Projects so that the whole column is stripped off html tags without updating the data in the table .The main purpose being the data needs to be displayed in the report generated from the database
Yes you can do so
Deletehow abaut if i want select between ( and ) tags. i try replace < with ( and > with ) it did not work
ReplyDeletei did not get your issue. Please provide input string and output data
DeleteHi Nimish, I have similar issue, I'd like to replace pointy brackets with square brackets.
ReplyDeleteFROM: text text text text text
TO: text text text [BACKSPACE] text text
Thanks,
Tom
use translate
Deletetranslate('<Nimish>Hello<Garg>','<>','[]')
thanks, this is much appreciated!
ReplyDeleteHi..can you help me with same query in SQL.
ReplyDeleteHi Nimish,
ReplyDeletecan you pls help with a query to extract following tags th attributes or table border=2 from the DB Column
nice solution. thx!
ReplyDeletebookmarked!!, I like your web site!
ReplyDeleteGood article, please check my blog for more fusion hcm topics in detail
ReplyDeletehttps://www.oraclefusionhcm.com
I really loved reading your blog. It was very well authored and easy to undertand. Unlike additional blogs I have read which are really not tht good. I also found your posts very interesting. In fact after reading, I had to go show it to my friend and he ejoyed it as well! html email link
ReplyDelete