Correct usage of RegExpReplace

I scrape google's search results number with the line:

=RegexpReplace(Dump(XPathOnUrl("https://www.google.de/search?q=test&num=10&hl=en&safe=off&source=lnt&tbs=cdr:1,cd_min:1/1/2016,cd_max:2/1/2016";"//*[@id=""resultStats""]";;HttpSettings(WAHR);"text"));"\d*\.\d*";"$1")

I need RegexpReplace because the GoogleSearch function is buggy, i need special dates in a query, and the scraped result is a string like "About 13.600 results", where i'm needing only the number. So i try to filter words out with RegexpReplace.... and fail. This regex \d*\.\d* finds correctly the number, but RegexpReplace makes from it instead the number About $1 results (0,33 seconds). You see - it doesn't replace the whole match with the found string, but replaces found string with $1. Why? How can i achieve the correct replacement?

Can you provide an example of how the GoogleSearch function is buggy?

Also, I would recommend RegexpFind if you want to isolatet the numbers.

i've notified you about this GoogleSearch bug yesterday - you said, aou are aware of it: Google Search: number results and search results don't work as formula

Why would you recommend RegexpFind instead of RegexpReplace?

Because you only need to isolate the number string (the dot makes it slightly more complicated) instead of replacing both "About" and "results")

nice! - thank you for advice!
Does RegexpFind let be concatenated on the fly with something like =RegexpFind(Dump(XpathOnURL(...? i would like to drive without help column

You can arrange the formula parts whatever way you like. Dump creates an array so I'm not sure if your example would work. The otherway around, Dump(regexpfind(xpath(x))) is valid and creates an array which is dumped onto a range on the sheet.

well, sounds good. test it tommorrow at work and report

This isn't quite the same, but when nesting =Dump(RegexpReplace(RegexpFindOnUrl())) like so, nothing shows up as an array anymore. Without regexreplace, the array appears fine using dump. Any idea what I'm doing wrong?

I think that is because RegexpReplace doesn't produce an array but only a single string. Perhaps change to RegexpFind and match the substrings you want to be part of your array?

That solution works. Thank you! This add-on is really incredible in what it can do. SEO is the tip of the iceberg. I'm part of a team using it for scientific text analysis and it's great because all of our Excel skills translate right into linguistic analysis. We don't have to learn Python or buy other expensive programs and services because your tool is so versatile.

2 Likes