Occurences of multiple words on a given URL


#1

Hello,

I am looking for an easy way to check if a word is found on a given URL. This can be done by using 'IsFoundOnPage' but how do i do that if the list of words to check for is more than 100?

So on the one hand i got all the URL's of the domain and on the other hands 100+ words and need to know if anu of those 100 words is found on which URL and the number of occurrences. I thought about regex but doesn't that have a character limit?

Thnx for the help


#2

Hi,

This is not possible in an easy way, but it is a great suggestion for an upcoming feature!

It is possible to mess around with regex and various formulas but its not ideal. I would solve this with VBA, are you familiar with it?


#3

Hi,

Thanks for the quick reply!
No unfortunately i am not familiar with VBA but if you have some guideline or tips i will sure find something to make it work.


#4

If you actually mean how many times a keyword appears in the URL, this explanation should help. If you mean how many times a keyword appears within the PAGE, please reply, as that's a bit more complicated.

Depending on your version of Excel, cells can contain up to 32K characters, so that will be your potential limit for your list of target keywords. As Victor mentioned, if you need greater scale, you'll need to turn to writing VBA code.

If this limit is acceptable, SeoTools can handle this challenge with ease. Here's how...

(Feel free to substitute your own locations)

  1. List your target keywords starting in cell A5
  2. List your URLs starting in cell C5
  3. Place this formula in cell A2:
    ="\b(" & StringJoin(A5:A1005, "|", TRUE) & ")\b"
    (Expand beyond A1005 if more than 1,000 keywords)
  4. Place this formula in cell D6:
    =StringJoin( RegexpFind( C5, $A$2, 1 ), ", " )
  5. Place this formula in cell E6:
    =IFERROR( ROWS( RegexpFind( C5, $A$2, 1 ) ), "" )
  6. Copy both formulas in D6:E6 down for all URLs

I believe this should result in what you're looking for:
image

Hope this helps!
-Tim


#5

Hi WolfDen,

Its within the page itself, within the content.


#6

The simplest solution as your original thought about using IsFoundOnPage(). If you have caching turned on, simply repeat the function for each keyword in your list. Only the first instance for each URL will take time to fetch, but from there, each keyword will only take a fraction of a second.

If you need something more flexible than a basic text match, you can also use RegexpFindOnUrl() wrapped within ROWS(), similar to this:

=ROWS( RegexpFindOnUrl("https://seotoolsforexcel.com/", "(seo)", 1) )

Note the above will find instances of "seo" anywhere within the page, so if you want to search only within visible text, you can try something like this:

=ROWS( RegexpFind( XPathOnUrl("https://seotoolsforexcel.com/", "//body//div", "", , "text"), "(seo)", 1 ) )

Good luck!

-Tim


#7

HI,

Thanks for the reply. Will try it out!