Delay Doesn't Seem To Be Working For Me

I've tried adjusting the global HTTP settings several times, as in the image below:

I believe this should give me a 10-20 second delay between searches. But when I bulk scrape from Google or any other site with XPathOnURL, the requests still seem to go out almost simultaneously, which, not surprisingly, often gets me blocked from Google for several hours. Am I missing something here?

That is weird. Can you try removing the Cache setting and using If same: Host?

@diskborste
I have the same problem, but with the GA API.
When I try to get 1k+ rows, I get an error about too many concurrent connections.

My config for tests is below. Even with a 10k interval and 1 concurrent connection, the API gets 5-12 rows each second.

  <ConcurrentRequest>1</ConcurrentRequest>
  <ConcurrentHostRequests>1</ConcurrentHostRequests>
  <RunAsyncUdfsSynchronously>false</RunAsyncUdfsSynchronously>
  <GlobalSettings><IntervalBetweenRequests RandomFrom="10000" RandomTo="10100" IfSame="Host" />
  <Cache>false</Cache>

I tried what you suggested, but I am still having the same issue.

This should work (works for me: Win7 x64, Office 16/365, latest STFE):

=XPathOnUrl("http://www.google.com/search?q=test;";"//h3[@class='r']/a";"href";HttpSettings(TRUE;;;RANDBETWEEN(10000;20000)))

Note:

  • such queries can fail if you use a non-English Office and don't translate TRUE into your Office language,
  • some Office versions use , instead of ; as the argument separator.
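
For example, on a setup where Excel expects , as the separator, the same formula would presumably look like this. This is an untested sketch: only the separators between arguments are changed, the argument order is taken as-is from the formula above, and the closing parentheses are balanced:

```
=XPathOnUrl("http://www.google.com/search?q=test;","//h3[@class='r']/a","href",HttpSettings(TRUE,,,RANDBETWEEN(10000,20000)))
```

If Excel rejects the formula outright, the separator or the translated TRUE is the first thing to check.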

Thank you! I will give that a try.