Get data from a dynamic site


#1

Hello!

Please help me get data from this site:

What I tried to do:

  1. XPathOnUrl
    XPathOnUrl("site";"//div[@class='cir-item dl_pc wifi_pc']/p[@class='version_detail' and 1]")
    =empty line
    XPathOnUrl("site";"//div[8]/div/ul/li[1]/div[8]")
    =PC客户端 下载 ~ PC client download

  2. GetTextOnUrl
    GetTextOnUrl("site";HttpSettings(;;TRUE;"0|2000|Url";;"text/html";;;;;;;))
    =<div class=""cir-item dl_pc wifi_pc""><h2>PC客户端</h2><p class=""version_detail""></p>

What else can I do?

Thank you!


#2

Hi,

It appears the content you're trying to scrape is loaded with JavaScript. One way to solve this is to use the PhantomJs Connector, which allows you to make XPath requests after the site has loaded.
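
To illustrate why the plain XPath came back empty, here is a minimal Python sketch (outside SeoTools, for illustration only). It parses the static HTML fragment from post #1, with a closing </div> added so the snippet is complete. The names VersionDetailText and version_detail_text are made up for this example.

```python
from html.parser import HTMLParser

# Static HTML as served before any JavaScript runs (fragment from post #1,
# closing </div> added so the snippet is complete).
STATIC_HTML = (
    '<div class="cir-item dl_pc wifi_pc"><h2>PC客户端</h2>'
    '<p class="version_detail"></p></div>'
)


class VersionDetailText(HTMLParser):
    """Collects the text found inside <p class="version_detail">."""

    def __init__(self):
        super().__init__()
        self.inside = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "p" and dict(attrs).get("class") == "version_detail":
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.text += data


def version_detail_text(html):
    """Return the text of the version_detail paragraph, or '' if it is empty."""
    parser = VersionDetailText()
    parser.feed(html)
    parser.close()
    return parser.text


# The paragraph exists in the static HTML but has no text yet,
# so any selector that targets it returns an empty string.
print(repr(version_detail_text(STATIC_HTML)))
```

The element is there, but the script fills it in later, which is why a renderer such as PhantomJs (or the underlying API) is needed.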


#3

Thank you!
The file "C:\Program Files (x86)\SeoTools for Excel\connectors\PhantomJsCloud.xml" is available.
ApiKey is available.

But the formula =Connector("PhantomJsCloud.XPath","http://www1.miwifi.com/miwifi_download.html","//p[@class='version_detail'",,TRUE,"us") returns NullReferenceException.
And "SeoTools > Others > Scraping > PhantomJsCloud > Xpath" doesn't work either.

Other variants fail with the same error.


#4

Have you entered a valid Phantom JS API key in the connector taskpane in Settings?


#5

The page's JavaScript retrieves this info from:
http://api.miwifi.com/upgrade/log/latest?typeList=WiFiPC

If you look at the JavaScript code here, you'll see that you can replace WiFiPC at the end of the URL with other platform codes. If you inspect any of the elements on the page, you'll see something like <div class="cir-item dl_iphone router_iphone">. Find that class in the JavaScript file and you'll see that it corresponds to the R1DIP code, which you can use in place of WiFiPC at the end of the URL.

This formula will retrieve the whole info object:
=JsonPathOnUrl("http://api.miwifi.com/upgrade/log/latest?typeList=WiFiPC","$.data.list")

And if you want to get individual items, use a JsonPath like $.data.list[0].realType or $.data.list[0].url
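
For anyone who wants to check the same lookup outside Excel, here is a rough Python sketch of what those JsonPath expressions do. It makes no network call. The helper names build_url and first_item_field are hypothetical, and the sample response is an assumption: only the data.list shape and the field names realType, url and week come from this thread, and apart from week the values are placeholders.

```python
import json
from urllib.parse import urlencode

API_BASE = "http://api.miwifi.com/upgrade/log/latest"  # endpoint from this post


def build_url(type_code):
    """Return the API URL for one platform code, e.g. WiFiPC or R1DIP."""
    return API_BASE + "?" + urlencode({"typeList": type_code})


def first_item_field(payload_text, field):
    """Rough equivalent of JsonPath $.data.list[0].<field> on a JSON string."""
    return json.loads(payload_text)["data"]["list"][0][field]


# Hypothetical response body (placeholder values, see the note above).
SAMPLE = (
    '{"data": {"list": [{"realType": "WiFiPC", '
    '"url": "http://example.com/client.exe", "week": 33}]}}'
)

print(build_url("R1DIP"))
print(first_item_field(SAMPLE, "week"))
```

Swapping the argument of build_url is the same as replacing WiFiPC in the formula's URL.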


#6

diskborste, I entered a valid API key in the window that appeared when I entered the formula.
Where exactly are the settings you are talking about?

Update: on another PC
Connector("PhantomJsCloud.XPath";"http://www1.miwifi.com/miwifi_download.html";"//div[8]/div/ul/li[1]/div[8]/p";;TRUE;"us") = 版本2.5.0(3月21日更新) ~ Version 2.5.0 (updated March 21)
The necessary data is obtained. Thank you!


#7

dovydasm, JsonPathOnUrl("http://api.miwifi.com/upgrade/log/latest?typeList=WiFiPC";"$.data.list") = NullReferenceException.
In any case, this method is too complicated for me. Thank you!


#8

@AcidBurn, try removing the Accept: text/html header under Settings -> Global HTTP Settings.


#9

diskborste, JsonPathOnUrl("http://api.miwifi.com/upgrade/log/latest?typeList=WiFiPC";"$.data.list[0].week") = 33.
Thank you!