Extracting TOC from a site

#1

Hi, I'm extracting a TOC from a site, but the data in Excel is not formatted as a TOC, which makes it almost useless. Is there any way I can get formatted data? In Google Docs, if I use importxml(), I get formatted data.

URL:https://www(dot)grandviewresearch(dot)com/industry-analysis/pv-inverters-market/toc
XPATH: =XPathOnUrl(A2,"//div[@class='report_summary full']/p[2]")

A2 is the URL. I want this data to be formatted in TOC format.

Any insight will be helpful.

Thanks,
Animesh

#2

Can you show an example of what TOC format would look like in Excel?

#3

Chapter 1. Methodology and Scope 1.1. Research Methodology 1.2. Research Scope & Assumptions 1.3. List of Data Sources Chapter 2. Executive Summary 2.1. Oil storage - Industry Summary and Key Buying Criteria, 2014 - 2025 Chapter 3. Oil Storage Industry Outlook 3.1. Market segmentation 3.2. Market size and growth prospects, 2014 - 2025 3.3. Oil storage market - Value chain analysis 3.4. Regulatory framework 3.5. Recent developments 3.5.1. R&D initiatives 3.5.2. New product development 3.6. Technology outlook 3.7. Production cost analysis 3.8. Key buyer trends 3.9. Oil storage market dynamics 3.9.1. Market driver analysis 3.9.2. Market restraint analysis 3.10. Key market opportunities - Prioritized 3.11. Industry analysis - Porter's 3.12. Oil storage market PESTEL analysis, 2015 Chapter 4. Oil Storage Product Outlook 4.1. Global oil storage market share by product, 2015 & 2025 4.2. Open Top 4.2.1. Market estimates and forecast, 2014 - 2025 4.2.2. Market estimates and forecast by region, 2014 - 2025 4.3. Fixed Roof 4.3.1. Market estimates and forecast, 2014 - 2025 4.3.2. Market estimates and forecast by region, 2014 - 2025 4.4. Floating Roof 4.4.1. Market estimates and forecast, 2014 - 2025 4.4.2. Market estimates and forecast by region, 2014 - 2025 4.5. Others 4.5.1. Market estimates and forecast, 2014 - 2025 4.5.2. Market estimates and forecast by region, 2014 - 2025 Chapter 5. Oil Storage Application Outlook 5.1. Global oil storage market share by application, 2015 & 2025 5.2. Crude oil 5.2.1. Market estimates and forecast, 2014 - 2025 5.2.2. Market estimates and forecast by region, 2014 - 2025 5.3. Middle Distillates 5.3.1. Market estimates and forecast, 2014 - 2025 5.3.2. Market estimates and forecast by region, 2014 - 2025 5.4. Gasoline 5.4.1. Market estimates and forecast, 2010-2025 5.4.2. Market estimates and forecast by region, 2014 - 2025 5.5. Aviation fuel 5.5.1. Market estimates and forecast, 2010-2025 5.5.2. Market estimates and forecast by region, 2014 - 2025 5.6. Others 5.6.1. Market estimates and forecast, 2014 - 2025 5.6.2. Market estimates and forecast by region, 2014 - 2025 Chapter 6. Oil Storage Regional Outlook 6.1. Global oil storage market share by region, 2015 & 2025 6.2. North America 6.2.1. Market estimates and forecast, 2014 - 2025 6.2.2. Market estimates and forecast by product, 2014 - 2025 6.2.3. Market estimates and forecast by application, 2014 - 2025 6.2.4. U.S. 6.2.4.1. Market estimates and forecast by product, 2014 - 2025 6.2.4.2. Market estimates and forecast by application, 2014 - 2025 6.3. Europe 6.3.1. Market estimates and forecast, 2014 - 2025 6.3.2. Market estimates and forecast by product, 2010 - 2025 6.3.3. Market estimates and forecast by application, 2014 - 2025 6.3.4. Germany 6.3.4.1. Market estimates and forecast by product, 2014 - 2025 6.3.4.2. Market estimates and forecast by application, 2014 - 2025 6.3.5. Netherlands 6.3.5.1. Market estimates and forecast by product, 2014 - 2025 6.3.5.2. Market estimates and forecast by application, 2014 - 2025 6.3.6. Belgium 6.3.6.1. Market estimates and forecast by product, 2014 - 2025 6.3.6.2. Market estimates and forecast by application, 2014 - 2025 6.3.7. Spain 6.3.7.1. Market estimates and forecast by product, 2014 - 2025 6.3.7.2. Market estimates and forecast by application, 2014 - 2025 6.4. Asia Pacific 6.4.1. Market estimates and forecast, 2014 - 2025 6.4.2. Market estimates and forecast by product, 2010 - 2025 6.4.3. Market estimates and forecast by application, 2014 - 2025 6.4.4. China 6.4.4.1. Market estimates and forecast by product, 2014 - 2025 6.4.4.2. Market estimates and forecast by application, 2014 - 2025 6.4.5. Indonesia 6.4.5.1. Market estimates and forecast by product, 2014 - 2025 6.4.5.2. Market estimates and forecast by application, 2014 - 2025 6.4.6. Singapore 6.4.6.1. Market estimates and forecast by product, 2014 - 2025 6.4.6.2. Market estimates and forecast by application, 2014 - 2025 6.5. Central & South America 6.5.1. Market estimates and forecast, 2014 - 2025 6.5.2. Market estimates and forecast by product, 2010 - 2025 6.5.3. Market estimates and forecast by application, 2014 - 2025 6.6. Middle East & Africa 6.6.1. Market estimates and forecast, 2014 - 2025 6.6.2. Market estimates and forecast by product, 2010 - 2025 6.6.3. Market estimates and forecast by application, 2014 - 2025 6.6.4. Saudi Arabia 6.6.4.1. Market estimates and forecast by product, 2014 - 2025 6.6.4.2. Market estimates and forecast by application, 2014 - 2025 6.6.5. UAE 6.6.5.1. Market estimates and forecast by product, 2014 - 2025 6.6.5.2. Market estimates and forecast by application, 2014 - 2025 Chapter 7. Competitive Landscape 7.1. Competitive Heat Map Analysis 7.2. Vendor Landscape 7.3. Competitive Environment 7.4. Strategy Framework Chapter 8. Competitive Landscape 8.1. ZCL Composites 8.1.1. Company Overview 8.1.2. Financial Performance 8.1.3. Product Benchmarking 8.1.4. Strategic Initiatives 8.2. Belco Manufacturing 8.2.1. Company Overview 8.2.2. Financial Performance 8.2.3. Product Benchmarking 8.2.4. Strategic Initiatives 8.3. Containment Solutions 8.3.1. Company Overview 8.3.2. Financial Performance 8.3.3. Product Benchmarking 8.3.4. Strategic Initiatives 8.4. Synalloy Corp. (Palmer) 8.4.1. Company Overview 8.4.2. Financial Performance 8.4.3. Product Benchmarking 8.4.4. Strategic Initiatives 8.5. L.F. Manufacturing 8.5.1. Company Overview 8.5.2. Financial Performance 8.5.3. Product Benchmarking 8.5.4. Strategic Initiatives 8.6. Zepnotek Storage 8.6.1. Company Overview 8.6.2. Financial Performance 8.6.3. Product Benchmarking 8.6.4. Strategic Initiatives 8.7. Oiltanking GmbH 8.7.1. Company Overview 8.7.2. Financial Performance 8.7.3. Product Benchmarking 8.7.4. Strategic Initiatives 8.8. Columbian Steel Tank 8.8.1. Company Overview 8.8.2. Financial Performance 8.8.3. Product Benchmarking 8.8.4. Strategic Initiatives 8.9. Sunoco Logistics 8.9.1. Company Overview 8.9.2. Financial Performance 8.9.3. Product Benchmarking 8.9.4. Strategic Initiatives 8.10. Vitol Tank Terminals 8.10.1. Company Overview 8.10.2. Financial Performance 8.10.3. Product Benchmarking 8.10.4. Strategic Initiatives 8.11. Royal Vopak NV 8.11.1. Company Overview 8.11.2. Financial Performance 8.11.3. Product Benchmarking 8.11.4. Strategic Initiatives 8.12. Vitol Tank Terminals 8.12.1. Company Overview 8.12.2. Financial Performance 8.12.3. Product Benchmarking 8.12.4. Strategic Initiatives

#4

Thanks, but I meant: how would you like it to look? For example, indentation, bold, and/or font size differences.

#5

I don't think there is an easy way with SeoTools. Here are the results after setting the XPathOnUrl 'Mode' argument to HTML, and using formulas to replace the HTML entities with spaces/line breaks and stripping tags:

I will think about a way to add a third Mode which formats the text based on the HTML it strips. Thanks for asking a great question, by the way!

#6

Thanks, can you please let me know the complete formula you used for the above formatting? I think that will work for me.

#7

Sure, URL in cell A1, XPath output in cell A2.

Formula to clean:
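=RegexpReplace(SUBSTITUTE(HtmlDecode(A2),"<br>",CHAR(10)),"<.*?>","")
(The tag inside SUBSTITUTE was rendered as a line break by the forum; "<br>" is assumed here, so change it to "<br/>" or "<br />" if that is what the page's HTML uses.)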

Formula to extract HTML:
=XPathOnUrl(A1,"//div[@class='report_summary full']/p[2]",,,"html")

Can be combined into a single formula if you prefer.
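
For anyone who wants to check the cleaning logic outside Excel, here is a minimal Python sketch of the same three steps: decode HTML entities, turn line-break tags into newlines, then strip the remaining tags. It is not part of SeoTools; the function name `clean_toc_html` and the sample fragment are made up for illustration, and the line-break pattern is an assumption that accepts `<br>`, `<br/>` and `<br />`.

```python
import html
import re


def clean_toc_html(fragment: str) -> str:
    """Mirror the cleaning formula: decode entities, break lines, strip tags."""
    # Step 1: decode HTML entities such as &amp; (the HtmlDecode step).
    text = html.unescape(fragment)
    # Step 2: replace line-break tags with newlines (the SUBSTITUTE step).
    text = re.sub(r"<br\s*/?>", "\n", text, flags=re.IGNORECASE)
    # Step 3: strip every remaining tag (the RegexpReplace step, same "<.*?>" pattern).
    text = re.sub(r"<.*?>", "", text, flags=re.DOTALL)
    return text.strip()


# Hypothetical fragment shaped like the "html" Mode output.
sample = (
    "<p>Chapter 1. Methodology and Scope<br>"
    "1.1. Research Methodology<br/>"
    "1.2. Research Scope &amp; Assumptions</p>"
)
print(clean_toc_html(sample))
```

Each TOC entry ends up on its own line, which is what CHAR(10) gives you inside a cell once "Wrap Text" is turned on.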

Edit: fixed the formula and used HtmlDecode to handle HTML entities.
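
Once every entry is on its own line, the indentation asked about earlier in the thread can be derived from the section numbers. The Python sketch below is only an illustration under that assumption, not a SeoTools feature: `indent_toc` is a hypothetical helper that indents each line by the depth of its leading number (1.1. gets one level, 3.5.1. gets two) and leaves "Chapter N." lines flush left.

```python
import re

# A leading section number such as "3.5.1. " (digits separated by dots,
# followed by a final dot and whitespace).
NUMBER = re.compile(r"^(\d+(?:\.\d+)*)\.\s")


def indent_toc(text: str, pad: str = "    ") -> str:
    """Indent each TOC line according to the depth of its section number."""
    out = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        match = NUMBER.match(line)
        # "3.5.1" contains two dots, so it sits two levels deep;
        # lines without a leading number (e.g. "Chapter 3. ...") stay at level 0.
        depth = match.group(1).count(".") if match else 0
        out.append(pad * depth + line)
    return "\n".join(out)


cleaned = (
    "Chapter 3. Oil Storage Industry Outlook\n"
    "3.5. Recent developments\n"
    "3.5.1. R&D initiatives"
)
print(indent_toc(cleaned))
```

Leading spaces survive in an Excel cell, so the indented text can be pasted back as is.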