Extracting count of H2 tags matching specific word


#1

=XPathOnUrl(“www.example.com”,"count(//h2)")

I’m using the above to give me the count of all the h2 tags on a page.
This is working OK.

I looking to take this a step further but I’ve reached the limit of my knowledge.

I want to get the match count of the h2 tags that contain a specific keyword and also the total number of h2 tags.

Let’s say the keyword is “banana”

So the output I’m looking for is:

Number of h2 tags = xxx

Number of h2 tags containing banana or Banana = xxx

Any help would be much appreciated


#2

Your first example should give you the total count. As for the number of H2's containing "banana", this Xpath expression should do the trick:

count(//h2[contains(text(),'banana')])


#3

Many thanks Tim

Works perfectly, much appreciated.


#4

Also, take a look a CsQueryOnURL. You'd be surprised how much easier it can be depending on the data you need.

This short guide should point you in a very simple direction:
https://www.w3schools.com/cssref/css_selectors.asp

I'm attaching an example of a semi-long CsQueryOnURL function that returned all the data I needed into separate columns. It's a rough result, but It can give you an idea and make your own calls where you need them.

Example URL: (Might be a dead link at some point, but try it on any url from the same site)
Link to Product Page

Formula: ( Assumes URL is in cell A1, or you replace "A1" with the URL entirely.)
=Dump(CsQueryOnUrl(A1,"title, div[class~=prod-detail] h4:first-child, ul[class=unstyled specs] li, ul + h4, div[class=small] strong, b[class=item-price deal]","",,"text"),TRUE)