XpathOnUrl - Linkedin Followers

josha · September 20, 2017, 11:00pm

Hi, I'm still not very good with XPathOnUrl. Can you help me get the number of followers on this page: https://pe.linkedin.com/company/cencosud-s-a-

I want to make a list of the top companies and see how many followers they have.

So far the place where this information is located is here:

Can you guys help me create the XPath?

I tried the following:
=Dump(XPathOnUrl("https://pe.linkedin.com/company/cencosud-s-a-","//p[@class='followers-count']",,,"text"))

Thanks.

diskborste · September 21, 2017, 7:00am

Can you post an image with the location of the follower count?

josha · September 21, 2017, 2:40pm

Sure.

This is the URL:
https://pe.linkedin.com/company/cencosud-s-a-

This is the image of what I intend to get for each company.

This is the information I got from the HTML:

219.977 seguidores

Thanks

WolfeDen · September 27, 2017, 6:27pm

Hi, Josha.

Unfortunately, Microsoft’s LinkedIn network prohibits scraping their site and they employ special techniques which prevent scraping data.

The good news is that just last month a US District Court ruled that LinkedIn must allow third-party companies to scrape data publicly posted by LinkedIn users.

However, it remains to be seen when (or if) LinkedIn will comply with that order. I'm also unclear whether the order also addresses LinkedIn's company pages or only the data posted by users.

-Tim

chilly_bang · October 4, 2017, 3:25pm

Well, as always, Google knows the number of followers. You can get them all with a search query like:

https://www.google.de/search?num=100&newwindow=1&q=site:https://www.linkedin.com/+inurl:company+-inurl:help+-inurl:pulse+"1..100000000+followers"

The follower count comes first in every snippet, as an <em> element. I got all the numbers with Screaming Frog custom extraction, using the URL above and //em as the XPath. In SeoTools, the XPathOnUrl call should look like:

=Dump(XPathOnUrl("https://www.google.de/search?num=100&newwindow=1&q=site:https://www.linkedin.com/+inurl:company+-inurl:help+-inurl:pulse+""1..100000000+followers""";"//em";;;"text"))

But it doesn't work for me. Maybe the problem lies somewhere in the HTTP settings, especially cookie saving.

In Screaming Frog I was only able to scrape after I enabled the Chrome user agent, set a language header, and allowed cookie saving.
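For anyone who wants to check the //em idea outside of SeoTools or Screaming Frog, here is a minimal Python sketch using only the standard library. It collects the text of every <em> element from a piece of HTML, which is roughly what the XPath //em selects. The class name, the function name, and the sample markup are all made up for illustration; the sample is not copied from a real results page, and the real markup may differ.

```python
from html.parser import HTMLParser


class EmCollector(HTMLParser):
    """Collect the text of every top-level <em> element (roughly XPath //em)."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # greater than 0 while inside an <em> element
        self.texts = []  # one entry per top-level <em>

    def handle_starttag(self, tag, attrs):
        if tag == "em":
            if self.depth == 0:
                self.texts.append("")
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "em" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Only keep text that appears inside an <em> element.
        if self.depth > 0:
            self.texts[-1] += data


def em_texts(html):
    """Return the text content of each <em> element found in html."""
    parser = EmCollector()
    parser.feed(html)
    parser.close()
    return parser.texts


# Hypothetical snippet markup, for illustration only.
SAMPLE = (
    "<div><span><em>219,977 followers</em> on LinkedIn.</span>"
    "<em>1,204 followers</em></div>"
)

print(em_texts(SAMPLE))
```

Unlike a true //em, this sketch folds a nested <em> into its parent instead of returning it separately, which should not matter for simple snippets.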

Lisann · April 19, 2019, 10:49pm

Any update on this? I'd love to pull follower and connection counts from profiles.

diskborste · April 20, 2019, 6:50pm

Update to this thread. The request suggested by chilly_bang works when adding the same User-Agent header used in the Google Search Connector:

The user agent is:
Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0

It can be added as an argument to XPathOnUrl via the HttpSettings button in the task pane on the left.
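The same idea outside of Excel, as a rough Python sketch with the standard library only: build the search URL from this thread and attach the User-Agent header quoted above. The function name build_search_request is an assumption made for illustration, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# User agent quoted in this thread (the one used by the Google Search Connector).
USER_AGENT = "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"

# Search query from earlier in the thread, written with spaces instead of "+".
QUERY = (
    "site:https://www.linkedin.com/ inurl:company -inurl:help "
    '-inurl:pulse "1..100000000 followers"'
)


def build_search_request(query, num=100):
    """Return a Request for the search URL with the User-Agent header set."""
    # urlencode percent-encodes the query and turns spaces into "+".
    params = urlencode({"num": num, "q": query})
    url = "https://www.google.de/search?" + params
    return Request(url, headers={"User-Agent": USER_AGENT})


req = build_search_request(QUERY)
print(req.full_url)
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

Passing req to urllib.request.urlopen would actually send it. Whether the response still contains the follower counts in <em> elements depends on what the search engine returns at that moment.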