Xpath On Url will not pull elements

This is the Div Class main group that all sub groups are under.

Then I have the following 4 room loop types, and I am trying to pull just the roomtypecode for each one into a cell from the website using XPathOnUrl. I need "ZZZA", "GENR", "DBDB", and "CITY" in cells A1, A2, A3, and A4 respectively. But it will only pull the first roomtypecode for any site.

<li class="guestRoomListItem clearfix" data-roomloopcount="1" data-roomtypecode="ZZZA" data-info="{"roomloopcount":"1","roomtypecode":"ZZZA"}">

<li class="guestRoomListItem clearfix" data-roomloopcount="2" data-roomtypecode="GENR" data-info="{"roomloopcount":"2","roomtypecode":"GENR"}">

<li class="guestRoomListItem clearfix" data-roomloopcount="3" data-roomtypecode="DBDB" data-info="{"roomloopcount":"3","roomtypecode":"DBDB"}">

<li class="guestRoomListItem clearfix" data-roomloopcount="4" data-roomtypecode="CITY" data-info="{"roomloopcount":"4","roomtypecode":"CITY"}">

These are the formulas I am using now for each request, but they only pull the first room type code, even though I have the index set to 1, 2, 3, and 4.

=IFERROR(XPathOnUrl($H$26,"//div[@class='guestRoomsResp']/ul/li[1]","data-roomtypecode"),"")
=IFERROR(XPathOnUrl($H$26,"//div[@class='guestRoomsResp']/ul/li[2]","data-roomtypecode"),"")
=IFERROR(XPathOnUrl($H$26,"//div[@class='guestRoomsResp']/ul/li[3]","data-roomtypecode"),"")
=IFERROR(XPathOnUrl($H$26,"//div[@class='guestRoomsResp']/ul/li[4]","data-roomtypecode"),"")

Website is:
http://www.marriott.com/hotels/hotel-rooms/nycmq-new-york-marriott-marquis/

Try:

XPathOnUrl($H$26,"//li[contains(@class,'guestRoomListItem')][3]","data-roomtypecode")

(not sure why your other xpath isn't working)
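One thing worth ruling out, since nobody in the thread confirmed the cause: an exact test like `@class='...'` only matches when the whole attribute value is identical, while `contains()` also matches when the element carries several space-separated classes. Below is a minimal Python sketch of that difference using only the standard library. The sample markup is a hypothetical, trimmed-down version of the `<li>` lines quoted above, and whether the `guestRoomsResp` div on the real page has extra classes is an assumption, not something checked here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down markup modelled on the <li> lines in the question.
SAMPLE = """
<div>
  <ul>
    <li class="guestRoomListItem clearfix" data-roomtypecode="ZZZA"/>
    <li class="guestRoomListItem clearfix" data-roomtypecode="GENR"/>
    <li class="guestRoomListItem clearfix" data-roomtypecode="DBDB"/>
    <li class="guestRoomListItem clearfix" data-roomtypecode="CITY"/>
  </ul>
</div>
"""

root = ET.fromstring(SAMPLE)

# Exact comparison: the attribute value is "guestRoomListItem clearfix",
# so testing for equality with "guestRoomListItem" matches nothing.
exact = root.findall(".//li[@class='guestRoomListItem']")

# Token test, done in Python because ElementTree has no contains():
# keep every <li> whose class list includes the wanted class name.
by_token = [
    li for li in root.iter("li")
    if "guestRoomListItem" in li.get("class", "").split()
]

codes = [li.get("data-roomtypecode") for li in by_token]
print(len(exact), codes)  # 0 ['ZZZA', 'GENR', 'DBDB', 'CITY']
```

If the div in the original formula has more than one class, the same effect would make `//div[@class='guestRoomsResp']` come back empty.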

Thanks Niels, you're the man!! It works, thank you!!!

So I was able to use your example on the URL provided above, but the same formula will not work on a different URL with the same HTML framework.

XPathOnUrl($H$26,"//li[contains(@class,'guestRoomListItem')][3]","data-roomtypecode")

This is the other URL I am trying to scrape

http://www.marriott.com/hotels/hotel-rooms/phxap-phoenix-airport-marriott/

Please help!!! Thanks :slight_smile:

Good Afternoon all,

So I am officially confused, lol. When I use the XPathOnUrl formula provided by Niels in a previous post (reference 1 below) it works, but why is it not pulling all of the room type codes? For example:

On the same URL as above (reference 2 below) I have the following HTML strings, but on my Excel sheet it's only pulling data-roomloopcount="1", "10", "11", "12", and "13". Even though I have cells requesting 1-15, it is skipping data-roomloopcount="2" through "9".

< li class="guestRoomListItem clearfix" data-roomloopcount="1" data-roomtypecode="ZZZA" data-
< li class="guestRoomListItem clearfix" data-roomloopcount="2" data-roomtypecode="GENR" data-
< li class="guestRoomListItem clearfix" data-roomloopcount="3" data-roomtypecode="DBDB" data-
< li class="guestRoomListItem clearfix" data-roomloopcount="4" data-roomtypecode="CITY" data-
< li class="guestRoomListItem clearfix" data-roomloopcount="5" data-roomtypecode="DCTY" data-
< li class="guestRoomListItem clearfix" data-roomloopcount="6" data-roomtypecode="ZZZB" data-
< li class="guestRoomListItem clearfix" data-roomloopcount="7" data-roomtypecode="DLCN" data-
< li class="guestRoomListItem clearfix" data-roomloopcount="8" data-roomtypecode="ZZZC" data-
< li class="guestRoomListItem clearfix" data-roomloopcount="9" data-roomtypecode="DSTE" data-
< li class="guestRoomListItem clearfix" data-roomloopcount="10" data-roomtypecode="ZZZD" data-
< li class="guestRoomListItem clearfix" data-roomloopcount="11" data-roomtypecode="KSTE" data-
< li class="guestRoomListItem clearfix" data-roomloopcount="12" data-roomtypecode="EXEC" data-
< li class="guestRoomListItem clearfix" data-roomloopcount="13" data-roomtypecode="CONC" data-

My Excel formula looks like =XPathOnUrl(K22,"//li[contains(@class,'guestRoomListItem')][1]","data-roomtypecode")

FYI, the weird thing is that the XPath request is asking for number 2, but it's pulling number 10:

=XPathOnUrl(K22,"//li[contains(@class,'guestRoomListItem')][2]","data-roomtypecode") but I am getting ZZZD ("10")


I can share the Excel file at your request.

  1. XPathOnUrl($H$26,"//li[contains(@class,'guestRoomListItem')][3]","data-roomtypecode")
  2. http://www.marriott.com/hotels/hotel-rooms/nycmq-new-york-marriott-marquis/
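Since each item already carries its own `data-roomloopcount`, one way to sidestep positional indexing is to select on that attribute value directly. Here is a rough Python sketch of the idea using the thirteen codes listed above. The helper name `room_code` and the self-closed sample markup are made up for illustration. The matching XPath would be `//li[@data-roomloopcount='2']`, which is standard XPath 1.0, but it has not been tested in SeoTools here.

```python
import xml.etree.ElementTree as ET

# Room type codes in the order listed in the post (loop counts 1 to 13).
CODES = ["ZZZA", "GENR", "DBDB", "CITY", "DCTY", "ZZZB", "DLCN",
         "ZZZC", "DSTE", "ZZZD", "KSTE", "EXEC", "CONC"]

# Hypothetical sample list rebuilt from those codes (self-closed for brevity).
items = "".join(
    f'<li class="guestRoomListItem clearfix" '
    f'data-roomloopcount="{n}" data-roomtypecode="{code}"/>'
    for n, code in enumerate(CODES, start=1)
)
root = ET.fromstring(f"<ul>{items}</ul>")


def room_code(tree, loop_count):
    """Return data-roomtypecode of the <li> with this data-roomloopcount, or None."""
    node = tree.find(f".//li[@data-roomloopcount='{loop_count}']")
    return None if node is None else node.get("data-roomtypecode")


print(room_code(root, 2))   # GENR
print(room_code(root, 10))  # ZZZD
print(room_code(root, 15))  # None (the list only has 13 items)
```

Selecting by value means the result no longer depends on how the tool counts or orders the matched nodes.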

Did you ever get an answer to this? I've been seeing issues where SeoTools wasn't pulling the XPath element (even when other tools with the same XPath do).

I get the code without any trouble using:

=Dump(XPathOnUrl("http://www.marriott.com/hotels/hotel-rooms/nycmq-new-york-marriott-marquis/";"//*[@id=""guest-rooms-list""]/li[2]";"data-roomtypecode";;"text"))

I'm pretty sure the problem lies in the kind of XPath: the simpler the XPath, the higher the chance of success.
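A possible reason the id-anchored path is more predictable: in XPath, a step like `li[2]` counts position among the children of each parent, so a bare `//li[...][2]` can match one element per list on the page, whereas starting from a unique id leaves only one parent to count under. The Python sketch below shows the difference with the standard library. The page structure and the codes `AAAA`/`BBBB` are hypothetical; only the id `guest-rooms-list` comes from the formula above.

```python
import xml.etree.ElementTree as ET

# Hypothetical page with two lists; only the id "guest-rooms-list"
# is taken from the thread, everything else is made up.
SAMPLE = """
<body>
  <ul id="other-list">
    <li data-roomtypecode="AAAA"/>
    <li data-roomtypecode="BBBB"/>
  </ul>
  <ul id="guest-rooms-list">
    <li data-roomtypecode="ZZZA"/>
    <li data-roomtypecode="GENR"/>
  </ul>
</body>
"""

root = ET.fromstring(SAMPLE)

# "li[2]" means "second <li> under its own parent",
# so every list on the page contributes one hit.
loose = [li.get("data-roomtypecode") for li in root.findall(".//li[2]")]

# Anchoring on the id first leaves exactly one parent, hence one hit.
anchored = [
    li.get("data-roomtypecode")
    for li in root.findall(".//*[@id='guest-rooms-list']/li[2]")
]

print(loose)     # ['BBBB', 'GENR']
print(anchored)  # ['GENR']
```

When a tool returns only one value from a multi-node result, which of those hits you see depends on the tool, which may explain surprising picks.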

That works. Is there a reason an XPath would work in a number of other tools but not in SeoTools? Is it a parsing thing?

I've been trying to scrape footer links on websites for some analysis I'm doing and the problem I have is with this page: http://motherandbabymatters.com/

What I care about is the image link at the bottom of the footer. Here are some XPaths I've used that work with other tools, but not SEOtools:

//div[@id='mtx_copyright']/a

/html/body/div[@id='wrapper']/div[@id='wrapper_container']/div[@id='footer_container']/div[@id='footer']/div[@id='mtx_copyright']/a

/html/body//div[@id='mtx_copyright']/a

//div[@id='footer_container']/div[@id='footer']/div[@id='mtx_copyright']/a

None of these seem to work with SEOtools. :unamused:

With the cited site you don't get many tries to scrape ;) I was able to scrape the footer once, but not on the next try, because the page uses security heuristics from www.sitelock.com and redirects suspicious users to a captcha-secured screen like http://easycaptures.com/3961300365

Yeah that's probably because I've been blasting it trying to find an xpath that works.

Still no luck. :frowning:

//div[@id='mtx_copyright']/a[last()]

@chilly_bang - I'm aware of the sitelock; you have to wait between requests, which just makes it more frustrating :expressionless:
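For anyone testing XPaths outside Excel against a site that blocks rapid requests, the "wait between requests" part can be automated with a small throttle. This is only a sketch: the `Throttle` class is a made-up helper, and the 30-second interval is a placeholder, since the thread does not say what delay sitelock actually tolerates.

```python
import time


class Throttle:
    """Enforce a minimum gap between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests
        self._clock = clock               # injectable for testing
        self._sleep = sleep               # injectable for testing
        self._last = None                 # time of the previous call, if any

    def wait(self):
        """Sleep until min_interval has passed since the last call.

        Returns the number of seconds slept (0.0 on the first call).
        """
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
        if delay > 0:
            self._sleep(delay)
        self._last = self._clock()
        return delay


# Placeholder interval; adjust to whatever the site tolerates.
throttle = Throttle(min_interval=30)
```

Calling `throttle.wait()` right before each fetch keeps the requests spaced out without having to time them by hand.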