chopper

Library for extracting HTML elements while preserving their ancestors and cleaning the CSS

Project description

Extracts HTML content while preserving the ancestors of the selected elements, and cleans the CSS to match what remains.

Compatible with Python >= 2.6, <= 3.4

Installation

pip install chopper

Usage

from chopper import Extractor

HTML = """
<html>
  <head>
    <title>Test</title>
  </head>
  <body>
    <div id="header"></div>
    <div id="main">
      <div class="iwantthis">
        HELLO WORLD
        <a href="/nope">Do not want</a>
      </div>
    </div>
    <div id="footer"></div>
  </body>
</html>
"""

CSS = """
div { border: 1px solid black; }
div#main { color: blue; }
div.iwantthis { background-color: red; }
a { color: green; }
div#footer { border-top: 2px solid red; }
"""

extractor = Extractor().keep('//div[@class="iwantthis"]').discard('//a')
html, css = extractor.extract(HTML, CSS)

The result is:

>>> html
"""
<html>
  <body>
    <div id="main">
      <div class="iwantthis">
        HELLO WORLD
      </div>
    </div>
  </body>
</html>"""
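
To make the idea of "preserving ancestors" concrete, here is a minimal sketch that reproduces the shape of the html result above using only the standard library. It is not chopper's own code: `prune`, `SAMPLE`, `keep_path` and `discard_path` are hypothetical names chosen for this illustration, and it assumes the input is well-formed XML, which real-world HTML often is not.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<html>
  <head><title>Test</title></head>
  <body>
    <div id="header"></div>
    <div id="main">
      <div class="iwantthis">
        HELLO WORLD
        <a href="/nope">Do not want</a>
      </div>
    </div>
    <div id="footer"></div>
  </body>
</html>
""".strip()


def prune(root, keep_path, discard_path=None):
    """Hypothetical helper: keep matches, their ancestors and descendants."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    kept = set()
    for match in root.findall(keep_path):
        kept.update(match.iter())      # the match and everything below it
        node = match
        while node in parent_of:       # every ancestor up to the root
            node = parent_of[node]
            kept.add(node)

    discarded = set(root.findall(discard_path)) if discard_path else set()

    # Snapshot the tree before mutating it, then drop unwanted children.
    for parent in list(root.iter()):
        for child in list(parent):
            if child not in kept or child in discarded:
                parent.remove(child)
    return root


pruned = prune(ET.fromstring(SAMPLE), './/div[@class="iwantthis"]', './/a')
remaining_tags = [el.tag for el in pruned.iter()]
```

Note that ElementTree only accepts relative paths on an element, hence `.//div` rather than `//div` in this sketch.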

>>> css
"""
div{border:1px solid black;}
div#main{color:blue;}
div.iwantthis{background-color:red;}
"""

Algorithm	Hash digest
SHA256	`fec7c008042f3202a17ebc6ec3cf319760b9a6d2f027ef5ffb64c30693e86fa7`
MD5	`331a09160ee97f144e1f92cafb5e2382`
BLAKE2b-256	`eb2998413e3139ad4521d1b51ce1a7f07765358a0c094f4102cb4e40c8d3fff2`

chopper 0.2.0