XPathOnUrl returns an empty cell when scraping name, location and experience from LinkedIn

waqassalehzada · April 5, 2016, 4:06pm

I am trying to scrape name, location, experience, etc. from LinkedIn.
I found the relevant people on LinkedIn through a Google search. At the moment I have to open each link in the Google results and copy and paste the information I need.
I am using XPathOnUrl for this, but I get an empty cell instead of results, although I am able to scrape the LinkedIn profile links from the Google search results.
Google search page url is: https://www.google.com.pk/search?q=marketing+manager+"ge+healthcare"+“Current.”+site:linkedin.com/in/+-uk+-sg&biw=1366&bih=362&noj=1&ei=BYsDV4vYEoe90gSLkoegBQ&start=440&sa=N

I used the formula =Dump(XPathOnUrl("https://www.google.com.pk/search?q=marketing+manager+%22ge+healthcare%22+%E2%80%9CCurrent.%E2%80%9D+site:linkedin.com/in/+-uk+-sg&biw=1366&bih=362&noj=1&ei=BYsDV4vYEoe90gSLkoegBQ&start=440&sa=N","//h3[@class='r']/a","href"))
It worked and returned 10 LinkedIn profile links, which is what I needed.
Now I want to use XPathOnUrl on each of these LinkedIn URLs to scrape the name, location and experience.

To scrape the name I am using: =XPathOnUrl( linkedin_url , "//*[@id='name']/h1/span/span")
It returns an empty cell. No error, just an empty cell. I have also tried replacing @id='name' with @class='name', but without success.

Any help with this is much appreciated.
@nielsbosma
Thanks
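
For anyone who wants to reproduce the symptom outside Excel: an XPath that matches no node yields an empty result rather than an error, which is consistent with the empty cell described above. Below is a minimal Python sketch using only the standard library. The markup is hypothetical and heavily simplified (the real Google and LinkedIn pages differ), and xpath_values is an illustrative helper, not part of SeoTools. ElementTree supports only a subset of XPath, so each path starts with "." here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup. The real pages are different and change often.
SEARCH_PAGE = """
<html><body>
  <h3 class="r"><a href="https://www.linkedin.com/in/example-one">Example One</a></h3>
  <h3 class="r"><a href="https://www.linkedin.com/in/example-two">Example Two</a></h3>
</body></html>
"""

PROFILE_PAGE = """
<html><body>
  <div class="profile-overview"><h1>Example One</h1></div>
</body></html>
"""


def xpath_values(markup, path, attribute=None):
    """Return the text (or the given attribute) of every node matching path."""
    root = ET.fromstring(markup.strip())
    nodes = root.findall(path)
    if attribute is None:
        return [(node.text or "").strip() for node in nodes]
    return [node.get(attribute, "") for node in nodes]


# Matches two links, so the href attribute of each is returned.
links = xpath_values(SEARCH_PAGE, ".//h3[@class='r']/a", "href")

# No element has id='name' in this markup, so the result is an empty list, not an error.
missing = xpath_values(PROFILE_PAGE, ".//*[@id='name']/h1/span/span")

# A path that follows the markup actually served does match.
found = xpath_values(PROFILE_PAGE, ".//div/h1")

print(links)
print(missing)
print(found)
```

The practical takeaway is to write the XPath against the HTML the server actually returns to the tool, which may not be the same as what the browser inspector shows.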

diskborste · April 7, 2016, 12:14pm

Try the following Xpath, it works for me:

Name =XPathOnUrl( linkedin_url ,"///div[1]/div[2]/div[1]/h2/a/span")
Location =XPathOnUrl( linkedin_url ,"///dl//span")
Bio =XPathOnUrl( linkedin_url ,"///div[1]/div[2]/div[1]/p")
Followers =XPathOnUrl( linkedin_url ,"///div//strong")
Present Job =XPathOnUrl( linkedin_url ,"///td/ol/li")
Market =XPathOnUrl( linkedin_url ,"///dl/dd[2]")

waqassalehzada · April 13, 2016, 3:29pm

Thanks a lot for the help. I really appreciate it.

Everything in this code is working now except 'Name'. By the way, I didn't see any h2 tag in the page source for the name.

Here is the page source that is shown to me: http://prntscr.com/ars55l

I can't figure it out. I also changed the XPath for the name, but it didn't work.

Thanks in advance.

waqassalehzada · April 13, 2016, 4:10pm

Well, they say try, try again and never lose hope.

I tried again and again and figured out this formula, which works for the name:
=XPathOnUrl( Linkedin_URL,"///div//h1")

In case anyone else needs it.

satsanga · July 29, 2016, 12:20pm

Thanks for the samples. I am trying to get XPathOnUrl working for the follower count on a public company page.
I had a formula that worked for a few years, but it stopped working, so I am trying to find a new version.

I had no luck with the sample above.

The old formula was:
=XPathOnUrl("https://www.linkedin.com/company/mediafiler", "//*[@id='biz-follow-mod']/div/div/p/a/strong")

I tried these new ones but they keep returning blank cells. No error messages only blank cells:

=XPathOnUrl("https://www.linkedin.com/company/mediafiler", "//*[@id='biz-follow-mod']/div/div/p")
=XPathOnUrl("https://www.linkedin.com/company/mediafiler","//*[@id='biz-follow-mod']/div/div/div/p/*[@class='followers-count']")

Any ideas? Help would be very appreciated!

Anil · January 31, 2017, 12:45pm

These work perfectly fine. However, I am unable to get the contact number and email ID.
I am trying the following:
Contact Number =XPathOnUrl(linkedinURL,"//table[2]/tbody/tr[td]")
Email ID =XPathOnUrl(linkedinURL,"//div[2]/div/div[2]/div[1]/table[1]/tbody/tr[td]")

Kindly advise how to get the contact number and email ID. I am only trying this with my own connections on LinkedIn.

Urbi · February 1, 2017, 9:24am

Can you provide URL samples?

frontalspoof · February 10, 2017, 4:02pm

I'm also having issues getting data from any LinkedIn profiles since they changed the GUI. Some time ago I was able to get name, location, title and summary, but not anymore.

silvacarl · February 22, 2018, 11:54pm

I was able to get this to work, as an example:

=Dump(XPathOnUrl("https://www.google.com/search?num=100&q=privia+health+site%3Alinkedin.com/in", "//h3[@class='r']/a", "href"))

However, what is odd is that when I use this technique and create several columns such as this:

="https://www.google.com/search?num=100&q=privia+health+site%3Alinkedin.com/in&start=200"

it never goes above 500 results found.

Is this a limitation of SeoTools, or something else?
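
The paginated search URLs described above can also be generated rather than typed into each column by hand. This is a minimal Python sketch using only the standard library; search_urls and its parameters are hypothetical names chosen for illustration, and it only builds the URLs. It makes no claim about how many results Google will actually serve.

```python
from urllib.parse import urlencode


def search_urls(query, page_size=100, pages=3):
    """Build one search URL per page, stepping the start offset by page_size."""
    base = "https://www.google.com/search"
    urls = []
    for page in range(pages):
        params = {"num": page_size, "q": query, "start": page * page_size}
        # urlencode percent-encodes reserved characters and turns spaces into '+'.
        urls.append(base + "?" + urlencode(params))
    return urls


urls = search_urls("privia health site:linkedin.com/in")
for url in urls:
    print(url)
```

Note that urlencode also encodes the slash in the query as %2F, which is equivalent to the unencoded form used in the formula above.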

diskborste · February 23, 2018, 8:30am

Do you mean changing the pagination numbers to 300, 400 etc? SeoTools scrapes whatever is found on the page so I don't see a limitation. Perhaps something has changed in the DOM for search results above 500?

gymrat · May 25, 2018, 8:12am

It seems like you can't scrape LinkedIn anymore. If that's not the case, can someone show me how to scrape JS-enabled sites with PhantomJS? I tried it on LinkedIn, but it's not working on any profile page. Any help?