A simple Craigslist wrapper.

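License: MIT-Zero

Disclaimer

  • I do not work for Craigslist and have no affiliation with it.

  • This module was implemented for educational purposes. It should not be used for crawling or downloading data from Craigslist.

Installation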

pip install python-craigslist

Base class

  • CraigslistBase

Subclasses

  • CraigslistCommunity (craigslist.org > community)

  • CraigslistHousing (craigslist.org > housing)

  • CraigslistJobs (craigslist.org > jobs)

  • CraigslistForSale (craigslist.org > for sale)

  • CraigslistEvents (craigslist.org > event calendar)

  • CraigslistServices (craigslist.org > services)

  • CraigslistGigs (craigslist.org > gigs)

  • CraigslistResumes (craigslist.org > resumes)

Examples

Looking for a room in San Francisco?

from craigslist import CraigslistHousing
cl_h = CraigslistHousing(site='sfbay', area='sfc', category='roo',
                         filters={'max_price': 1200, 'private_room': True})

# You can get an approximate number of results with the following call:
print(cl_h.get_results_approx_count())

992

for result in cl_h.get_results(sort_by='newest', geotagged=True):
    print(result)

{
    'id': u'4851150747',
    'name': u'Near SFSU, UCSF and NEWLY FURNISHED - CLEAN, CONVENIENT and CLEAN!',
    'url': u'http://sfbay.craigslist.org/sfc/roo/4851150747.html',
    'datetime': u'2015-01-27 23:44',
    'price': u'$1100',
    'where': u'inner sunset / UCSF',
    'has_image': False,
    'has_map': True,
    'geotag': (37.738473, -122.494721)
}
# ...
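
The result dictionaries shown above carry the price and timestamp as strings. A minimal sketch of converting them to typed values follows. The helper name normalize_result is hypothetical (it is not part of python-craigslist), and it assumes the 'YYYY-MM-DD HH:MM' timestamp format seen in the housing and jobs examples; event results such as '1/29' would need separate handling.

```python
from datetime import datetime


def normalize_result(result):
    """Return a copy of a result dict with typed price and datetime fields.

    Hypothetical helper, not part of python-craigslist.
    """
    cleaned = dict(result)
    price = result.get('price')
    # Prices arrive as strings such as '$1100'; listings without one use None.
    cleaned['price'] = int(price.lstrip('$').replace(',', '')) if price else None
    # Assumes the 'YYYY-MM-DD HH:MM' format shown in the example output.
    cleaned['datetime'] = datetime.strptime(result['datetime'], '%Y-%m-%d %H:%M')
    return cleaned


# Sample values copied from the housing result above.
sample = {
    'id': '4851150747',
    'price': '$1100',
    'datetime': '2015-01-27 23:44',
    'geotag': (37.738473, -122.494721),
}
print(normalize_result(sample))
```

With typed fields, results can be sorted or filtered locally, for example by keeping only those whose price is below a threshold.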

Maybe a software engineering internship in Silicon Valley?

from craigslist import CraigslistJobs
cl_j = CraigslistJobs(site='sfbay', area='sby', category='sof',
                      filters={'is_internship': True, 'employment_type': ['full-time', 'part-time']})

for result in cl_j.get_results():
    print(result)

{
    'id': u'5708651182',
    'name': u'GAME DEVELOPER INTERNSHIP AT TYNKER - AVAILABLE NOW!',
    'url': u'http://sfbay.craigslist.org/pen/eng/5708651182.html',
    'datetime': u'2016-07-30 13:30',
    'price': None,
    'where': u'mountain view',
    'has_image': True,
    'has_map': True,
    'geotag': None
}
# ...

Events with free food in New York City?

from craigslist import CraigslistEvents
cl_e = CraigslistEvents(site='newyork', filters={'free': True, 'food': True})

for result in cl_e.get_results(sort_by='newest', limit=5):
    print(result)

{
    'id': u'4866178242',
    'name': u'Lituation Thursdays @ Le Reve',
    'url': u'http://newyork.craigslist.org/mnh/eve/4866178242.html',
    'datetime': u'1/29',
    'price': None,
    'where': u'Midtown East',
    'has_image': True,
    'has_map': True,
    'geotag': None
}
# ...

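Where to get filters from

Every subclass has its own set of filters. To get a list of all the filters supported by a specific subclass, use the .show_filters() class method.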

>>> from craigslist import CraigslistJobs, CraigslistForSale
>>> CraigslistJobs.show_filters()

Base filters:
* query = ...
* search_titles = True/False
* has_image = True/False
* posted_today = True/False
* bundle_duplicates = True/False
* search_distance = ...
* zip_code = ...

CraigslistJobs filters:
* is_internship = True/False
* is_nonprofit = True/False
* is_telecommuting = True/False
* employment_type = u'full-time', u'part-time', u'contract', u"employee's choice"


>>> CraigslistForSale.show_filters(category='cta')

Base filters:
* query = ...
* search_titles = True/False
* has_image = True/False
* posted_today = True/False
* bundle_duplicates = True/False
* search_distance = ...
* zip_code = ...

CraigslistForSale filters with category 'cta':
* min_price = ...
* max_price = ...
* make = ...
* model = ...
* min_year = ...
* max_year = ...
* min_miles = ...
* max_miles = ...
* min_engine_displacement = ...
* max_engine_displacement = ...
* condition = u'new', u'like new', u'excellent', u'good', u'fair', u'salvage'
* auto_cylinders = u'3 cylinders', u'4 cylinders', u'5 cylinders', u'6 cylinders', u'8 cylinders', u'10 cylinders', u'12 cylinders', u'other'
* auto_drivetrain = u'fwd', u'rwd', u'4wd'
* auto_fuel_type = u'gas', u'diesel', u'hybrid', u'electric', u'other'
* auto_paint = u'black', u'blue', u'brown', u'green', u'grey', u'orange', u'purple', u'red', u'silver', u'white', u'yellow', u'custom'
* auto_size = u'compact', u'full-size', u'mid-size', u'sub-compact'
* auto_title_status = u'clean', u'salvage', u'rebuilt', u'parts only', u'lien', u'missing'
* auto_transmission = u'manual', u'automatic', u'other'
* auto_bodytype = u'bus', u'convertible', u'coupe', u'hatchback', u'mini-van', u'offroad', u'pickup', u'sedan', u'truck', u'SUV', u'wagon', u'van', u'other'
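
Because the allowed values are printed rather than enforced in your own code, it can help to check a filters dictionary before building a query. The following is a rough sketch under stated assumptions: check_filters and JOBS_FILTERS are hypothetical names, the allowed values are copied by hand from the CraigslistJobs listing above, and the base filters (query, has_image, and so on) are deliberately left out, so they would be reported as unknown.

```python
# Job-specific filters copied from the CraigslistJobs.show_filters() output above.
# bool marks True/False filters; a set lists the accepted string values.
JOBS_FILTERS = {
    'is_internship': bool,
    'is_nonprofit': bool,
    'is_telecommuting': bool,
    'employment_type': {'full-time', 'part-time', 'contract', "employee's choice"},
}


def check_filters(filters, spec):
    """Return a list of problems found in `filters`; an empty list means none.

    Hypothetical helper, not part of python-craigslist.
    """
    problems = []
    for key, value in filters.items():
        if key not in spec:
            problems.append('unknown filter: %s' % key)
            continue
        allowed = spec[key]
        if allowed is bool:
            if not isinstance(value, bool):
                problems.append('%s expects True/False' % key)
        else:
            # Accept either a single value or a list of values.
            values = value if isinstance(value, (list, tuple)) else [value]
            bad = [v for v in values if v not in allowed]
            if bad:
                problems.append('%s has unsupported value(s): %s' % (key, bad))
    return problems


print(check_filters({'is_internship': True,
                     'employment_type': ['full-time', 'remote']}, JOBS_FILTERS))
```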

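Where to get site and area from

When initializing any of the subclasses, you need to provide the site you want to query, and optionally the area.

To get the correct site, follow these steps:

  1. Go to craigslist.org/about/sites.

  2. Find the country or city you are interested in and click on it.

  3. You will be directed to <site>.craigslist.org. The value of <site> in the URL is the one you should use.

Not all sites have areas. To check whether your site has areas, look for links next to the title of the Craigslist page, at the top center. For example, for New York you will see: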

https://user-images.githubusercontent.com/1008637/45307206-bb404d80-b51e-11e8-8e6d-edfbdbd0a6fa.png

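Click on the link you are interested in, and you will be redirected to <site>.craigslist.org/<area>. The value of <area> in the URL is the one you should use. If there are no links next to the title, your site has no areas and you can leave that argument unset.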
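
The manual steps above amount to reading two values out of a URL. Here is a small sketch of doing that in code; split_site_area is a hypothetical helper (not part of python-craigslist), and it assumes the URL includes a scheme and has the <site>.craigslist.org/<area> shape described above, so pages such as craigslist.org/about/sites are out of scope.

```python
from urllib.parse import urlparse


def split_site_area(url):
    """Return (site, area) from a Craigslist URL; area is None when absent.

    Hypothetical helper, not part of python-craigslist.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ''
    suffix = '.craigslist.org'
    if not host.endswith(suffix):
        raise ValueError('not a <site>.craigslist.org URL: %r' % url)
    site = host[:-len(suffix)]
    # The first non-empty path segment, if any, is taken to be the area.
    segments = [s for s in parsed.path.split('/') if s]
    area = segments[0] if segments else None
    return site, area


print(split_site_area('https://sfbay.craigslist.org/sfc'))
print(split_site_area('https://newyork.craigslist.org/'))
```

The returned values can then be passed as the site and area arguments when initializing a subclass.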

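Where to get categories from

You can also provide a category when initializing any of the subclasses. To get a list of all the categories supported by a specific subclass, use the .show_categories() class method.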

>>> from craigslist import CraigslistServices
>>> CraigslistServices.show_categories()

CraigslistServices categories:
* aos = automotive services
* bts = beauty services
* cms = cell phone / mobile services
* cps = computer services
* crs = creative services
* cys = cycle services
* evs = event services
* fgs = farm & garden services
* fns = financial services
* hws = health/wellness services
* hss = household services
* lbs = labor / hauling / moving
* lgs = legal services
* lss = lessons & tutoring
* mas = marine services
* pas = pet services
* rts = real estate services
* sks = skilled trade services
* biz = small biz ads
* trv = travel/vacation services
* wet = writing / editing / translation
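
If you want to search the category codes by keyword, one option is to parse a listing in the "* code = description" format shown above into a dictionary. This is only a sketch: parse_categories is a hypothetical helper, and the listing text is assumed to have been copied by hand from the output above rather than obtained through any library call.

```python
def parse_categories(listing):
    """Map category codes to descriptions from '* code = description' lines.

    Hypothetical helper, not part of python-craigslist.
    """
    categories = {}
    for line in listing.splitlines():
        line = line.strip()
        if not line.startswith('* ') or ' = ' not in line:
            continue  # skip the header line and blank lines
        code, description = line[2:].split(' = ', 1)
        categories[code.strip()] = description.strip()
    return categories


# A few lines copied from the CraigslistServices listing above.
listing = """CraigslistServices categories:
* aos = automotive services
* cps = computer services
* lss = lessons & tutoring
"""
cats = parse_categories(listing)
print([code for code, desc in cats.items() if 'computer' in desc])
```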

Is there a limit on the number of results?

Yes, Craigslist caps the results of any search at 3000.
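
When consuming results in a loop, it can be useful to make that cap explicit in your own code. The sketch below works on any iterable and does not call the library; take_capped and CRAIGSLIST_RESULT_CAP are hypothetical names, and the value 3000 is taken from the statement above.

```python
from itertools import islice

CRAIGSLIST_RESULT_CAP = 3000  # cap stated in this README


def take_capped(results, limit=None, cap=CRAIGSLIST_RESULT_CAP):
    """Return an iterator over at most min(limit, cap) items of `results`.

    Hypothetical helper, not part of python-craigslist.
    """
    effective = cap if limit is None else min(limit, cap)
    return islice(results, effective)


# Any iterable works; range() stands in for a stream of results here.
print(len(list(take_capped(range(5000)))))
print(list(take_capped(range(10), limit=3)))
```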

Support

If you find a bug or want to propose a new feature, please use the issue tracker. I'll be happy to help you! :-)

Download files

Source distribution

python-craigslist-1.1.4.tar.gz (39.3 kB)