When looping through and opening each search result, every result has a different page structure. How should the page text be collected? #638

Open
Yangyehong opened this issue Dec 23, 2024 · 0 comments

Comments

@Yangyehong

Version Info | 6.0.2

EasySpider Version: 6.0.2
System Version (Architecture):
Browser Version:
Installation method:

Issue Description

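A question for everyone: I am crawling a Baidu search results list, opening each result page in a loop and collecting the page content.
There are two problems:

  1. The pages have different structures.
  2. Some pages are static, while others are dynamic.

How should the content be collected?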
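To make the first problem concrete, here is a minimal sketch of one structure-independent approach: instead of writing a selector per site, collect every visible text node from the page's HTML and skip non-content tags. This is an assumption about what a generic solution could look like, not an EasySpider feature. The names `VisibleTextExtractor` and `extract_visible_text` are hypothetical, and only the Python standard library is used.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect visible text from an HTML page, independent of its layout.

    Hypothetical helper for illustration; not part of EasySpider.
    """

    # Tags whose contents are never shown to the reader as page text.
    SKIPPED_TAGS = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self._chunks = []     # text fragments in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            # Collapse runs of whitespace and drop empty fragments.
            text = " ".join(data.split())
            if text:
                self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def extract_visible_text(html):
    """Return the visible text of an HTML string, one fragment per line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = (
        "<html><head><title>T</title><style>p{}</style></head>"
        "<body><div><p>Hello  world</p><script>var x=1;</script>"
        "<span>Bye</span></div></body></html>"
    )
    print(extract_visible_text(sample))
```

For the second problem, the same function would apply to dynamic pages only if it is given the rendered HTML, that is, the DOM serialized after the page's scripts have run in a browser, rather than the raw HTTP response. How to obtain that rendered HTML inside an EasySpider task is not covered by this sketch.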

Steps to Reproduce

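@Yangyehong Yangyehong changed the title from "When looping through and opening each page, the page structures differ. How should the page content be collected?" to "When looping through and opening each search result, every result has a different page structure. How should the page text be collected?" Dec 23, 2024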