1. Wayback Machine - Collapse 2. Xpath wayback

Hi, a few days ago I discovered SEO Tools for Excel. Amazing tool. I am testing it and looking for ways to use it. I will most likely buy the full version, but I want to start preparing my sheets already.

I have two problems. They both relate to web.archive.org.

1. Module: Wayback Machine.
Everything works fine, but I would like to remove duplicates. When I check 1000 domains at a time, I would prefer not to get duplicate URLs along with the number of copies.

There is a "Collapse" option. However, after many attempts I have not been able to get it to work. Could you give me some guidance on how to configure it so that I don't get duplicates?

2. XPath: pulling a certain item from web.archive.org.

I want to use XPath to pull a certain element from a URL like this: http://web.archive.org/web//test-website.com

Specifically, it is this text: "151 URLs have been captured for this URL prefix."

I tried these XPath expressions:


I also tried adding a crawl delay, but to no avail. Is it possible to do this? I have a different scraper that does return this value, but I would like to have it directly in Excel.

Thank you very much for your help.

Hi Andrew,

  1. Can you explain what you mean by collapse? Is this a feature of the Wayback Machine API? In any case, it is not possible to remove duplicates automatically if you check domains like this:

However, you can use the built-in "Remove duplicates" function in Excel under the Data tab.

  2. This content is loaded via JavaScript and is not available in the initial HTML page load. You should be able to get it using this internal API request:


For example:

Thank you for your response. There is an option called "Collapse" in the Wayback Machine module. That's what I was referring to.
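
For anyone reading this later: the Wayback CDX Server API has a `collapse` parameter, and the "Collapse" option may well correspond to it. That is an assumption on my part; I don't know how the SEO Tools module passes the value on. With `collapse=urlkey`, the API drops adjacent rows that share the same URL key, which should leave one row per unique URL because results are sorted by URL key. Below is a minimal Python sketch of such a request URL, outside Excel, using test-website.com from the question. The helper names `build_cdx_url` and `parse_cdx_lines` are made up for illustration.

```python
from urllib.parse import urlencode

# Public CDX endpoint of the Wayback Machine.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain, collapse="urlkey", match_type="domain",
                  fields=("original", "timestamp")):
    """Build a CDX query URL (hypothetical helper, not part of SEO Tools).

    collapse="urlkey" asks the server to keep only the first of any
    run of adjacent rows with the same URL key, i.e. one row per URL.
    """
    params = {
        "url": domain,
        "matchType": match_type,   # exact, prefix, host or domain
        "collapse": collapse,      # e.g. "urlkey", "digest", "timestamp:10"
        "fl": ",".join(fields),    # which columns to return
    }
    # Keep "," and ":" unescaped so the URL stays readable.
    return CDX_ENDPOINT + "?" + urlencode(params, safe=",:")


def parse_cdx_lines(text):
    """Split the plain-text CDX response into rows of fields.

    The default output is one capture per line, fields separated by spaces.
    """
    return [line.split(" ") for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    print(build_cdx_url("test-website.com"))
```

If the Collapse field in the module accepts the same values, entering `urlkey` there would be the first thing I would try. The number of rows returned by such a query could also serve as a rough count of unique captured URLs, although it is not guaranteed to match the figure shown on the web.archive.org page.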