XPath: Scraping 58.com Second-Hand Housing Data

Posted: 2020-04-13 17:31:04
# Goal: scrape second-hand housing listings from 58.com
from lxml import etree
import requests

if __name__ == "__main__":
    # Fetch the page source
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'
    }   # spoof the User-Agent so the request looks like it comes from a browser
    url = 'https://bj.58.com/ershoufang/'
    page_text = requests.get(url=url, headers=headers).text

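    # Parse the data
    tree = etree.HTML(page_text)
    # li_list holds the <li> element objects
    li_list = tree.xpath('//ul[@class="house-list-wrap"]/li')
    print(li_list)
    fp = open('58二手房.txt', 'w', encoding='utf-8')
    for li in li_list:
        # Local parsing: the leading ./ makes the XPath relative to this <li>
        title = li.xpath('./div[2]/h2/a/text()')[0]
        print(title)
        fp.write(title + '\n')  # todo
    fp.close()  # flush the buffered titles and release the file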
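
The local-parsing step can be tried without touching the network. The sketch below runs the same two XPath expressions against a small inline HTML sample. The sample markup, the `SAMPLE_HTML` name, and the `extract_titles` helper are assumptions made for illustration and are not taken from the live 58.com page, whose structure may differ or change. It also skips any `<li>` that has no title link, so indexing with `[0]` cannot raise an `IndexError`.

```python
from lxml import etree

# Hypothetical sample markup that mimics the list structure the script expects.
SAMPLE_HTML = """
<html><body>
<ul class="house-list-wrap">
  <li>
    <div class="pic"><img src="a.jpg"/></div>
    <div class="list-info"><h2><a href="/1">Two-bedroom near subway</a></h2></div>
  </li>
  <li>
    <div class="pic"><img src="b.jpg"/></div>
    <div class="list-info"><h2><a href="/2">  Sunny studio  </a></h2></div>
  </li>
  <li>
    <div class="pic"><img src="c.jpg"/></div>
    <div class="list-info"><h2></h2></div>
  </li>
</ul>
</body></html>
"""


def extract_titles(page_text):
    """Return the listing titles found in page_text (illustrative helper)."""
    tree = etree.HTML(page_text)
    titles = []
    # Global query: every <li> directly under the listing <ul>.
    for li in tree.xpath('//ul[@class="house-list-wrap"]/li'):
        # Local query: ./ is evaluated relative to the current <li>.
        matches = li.xpath('./div[2]/h2/a/text()')
        if matches:  # skip items without a title link
            titles.append(matches[0].strip())
    return titles


titles = extract_titles(SAMPLE_HTML)
print(titles)
```

The third `<li>` in the sample has no `<a>` inside its `<h2>`, so the relative query returns an empty list and the item is skipped. Stripping whitespace is an extra choice made here, not something the original script does.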

 

Original post: https://www.cnblogs.com/huahuawang/p/12692448.html
