Format agnostic tabular data library (XLS, JSON, YAML, CSV)

Tablib: format-agnostic tabular dataset library

_____         ______  ___________ ______
__  /_______ ____  /_ ___  /___(_)___  /_
_  __/_  __ `/__  __ \__  / __  / __  __ \
/ /_  / /_/ / _  /_/ /_  /  _  /  _  /_/ /
\__/  \__,_/  /_.___/ /_/   /_/   /_.___/

Tablib is a format-agnostic tabular dataset library, written in Python.

Output formats supported:

  • Excel (Sets + Books)

  • JSON (Sets + Books)

  • YAML (Sets + Books)

  • CSV (Sets)

Import formats supported:

  • JSON (Sets + Books)

  • YAML (Sets + Books)

  • CSV (Sets)

Note that tablib purposefully excludes XML support. It always will.

Overview

tablib.Dataset()

A Dataset is a table of tabular data. It may or may not have a header row. Datasets can be built and manipulated as raw Python datatypes (lists of tuples or dictionaries). They can be imported from JSON, YAML, and CSV; they can be exported to Excel (XLS), JSON, YAML, and CSV.

tablib.Databook()

A Databook is a set of Datasets. The most common form of a Databook is an Excel file with multiple spreadsheets. Databooks can be imported from JSON and YAML; they can be exported to Excel (XLS), JSON, and YAML.
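One way to picture the two containers is with plain Python values. This is only a rough sketch built on the standard library: `as_dicts`, `book`, and the `title`/`data` keys are hypothetical stand-ins chosen for illustration, not Tablib's classes or its export layout.

```python
import json

# Hypothetical stand-ins for the concepts; these are not Tablib objects.
headers = ("first_name", "last_name")
rows = [("John", "Adams"), ("George", "Washington")]


def as_dicts(headers, rows):
    """Pair each row tuple with the headers, giving one dict per row."""
    return [dict(zip(headers, row)) for row in rows]


# A "book" keeps several named tables together, like sheets in a workbook.
book = [
    {"title": "presidents", "data": as_dicts(headers, rows)},
    {"title": "inventors", "data": as_dicts(headers, [("Henry", "Ford")])},
]

# Serialising the whole book at once (assumed layout, for illustration only).
book_json = json.dumps(book, indent=2)
```

The same rows can be viewed as tuples (positional) or as dicts (keyed by header), which is the duality the Dataset description refers to.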

Usage

Populate fresh data files:

import tablib

headers = ('first_name', 'last_name')

data = [
    ('John', 'Adams'),
    ('George', 'Washington')
]

data = tablib.Dataset(*data, headers=headers)

Intelligently add new rows:

>>> data.append(('Henry', 'Ford'))

Intelligently add new columns:

>>> data.append(col=('age', 90, 67, 83))

Slice rows:

>>> print data[:2]
[('John', 'Adams', 90), ('George', 'Washington', 67)]

Slice columns by header:

>>> print data['first_name']
['John', 'George', 'Henry']
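To make the row and column operations above concrete, here is a minimal sketch of the same idea on plain lists of tuples. The helpers `append_col` and `column` are hypothetical names for this illustration and are not how Tablib implements `append` or header indexing.

```python
headers = ["first_name", "last_name"]
rows = [("John", "Adams"), ("George", "Washington"), ("Henry", "Ford")]


def append_col(headers, rows, col):
    """Add a column given as (header, value, value, ...); one value per row."""
    name, *values = col
    if len(values) != len(rows):
        raise ValueError("column length must match the number of rows")
    new_rows = [row + (value,) for row, value in zip(rows, values)]
    return headers + [name], new_rows


def column(headers, rows, name):
    """Return every value stored under the given header."""
    index = headers.index(name)
    return [row[index] for row in rows]


headers, rows = append_col(headers, rows, ("age", 90, 67, 83))
first_two = rows[:2]                      # row slicing is ordinary list slicing
first_names = column(headers, rows, "first_name")
```

The length check mirrors the requirement that a new column supplies exactly one value for each existing row.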

Easily delete rows:

>>> del data[1]

Exports

Drumroll please………..

JSON!

>>> print data.json
[
  {
    "last_name": "Adams",
    "age": 90,
    "first_name": "John"
  },
  {
    "last_name": "Ford",
    "age": 83,
    "first_name": "Henry"
  }
]

YAML!

>>> print data.yaml
- {age: 90, first_name: John, last_name: Adams}
- {age: 83, first_name: Henry, last_name: Ford}

CSV…

>>> print data.csv
first_name,last_name,age
John,Adams,90
Henry,Ford,83

EXCEL!

>>> open('people.xls', 'wb').write(data.xls)

It’s that easy.
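For readers curious what such exports involve, the JSON and CSV shapes shown above can be approximated with the standard library alone. This is a sketch under that assumption; `to_json` and `to_csv` are hypothetical helpers and not Tablib's exporters.

```python
import csv
import io
import json

headers = ("first_name", "last_name", "age")
rows = [("John", "Adams", 90), ("Henry", "Ford", 83)]


def to_json(headers, rows):
    """Serialise each row as an object keyed by header."""
    return json.dumps([dict(zip(headers, row)) for row in rows], indent=2)


def to_csv(headers, rows):
    """Write the header line followed by one line per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()


json_text = to_json(headers, rows)
csv_text = to_csv(headers, rows)
```

Both functions take the same headers and rows, which is the sense in which the data is format-agnostic: the table is built once and each format is just a different rendering of it.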

Imports!

JSON

>>> data.json = '[{"last_name": "Adams","age": 90,"first_name": "John"}]'
>>> print data[0]
('John', 'Adams', 90)

YAML

>>> data.yaml = '- {age: 90, first_name: John, last_name: Adams}'
>>> print data[0]
('John', 'Adams', 90)

CSV

>>> data.csv = 'age, first_name, last_name\n90, John, Adams'
>>> print data[0]
('John', 'Adams', 90)

>>> print data.yaml
- {age: 90, first_name: John, last_name: Adams}
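The reverse direction can be sketched the same way. The helpers below are hypothetical, use only the standard library, and make their own choices (sorted header order for JSON, string values for CSV) that may differ from what Tablib does on import.

```python
import csv
import io
import json


def from_json(text):
    """Read a JSON array of objects into (headers, rows)."""
    records = json.loads(text)
    # Assumption for this sketch: take the header order from the sorted keys.
    headers = sorted(records[0]) if records else []
    rows = [tuple(record[h] for h in headers) for record in records]
    return headers, rows


def from_csv(text):
    """Read CSV text whose first line is the header row."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    headers = next(reader)
    rows = [tuple(row) for row in reader]  # CSV values arrive as strings
    return headers, rows


json_headers, json_rows = from_json(
    '[{"last_name": "Adams","age": 90,"first_name": "John"}]'
)
csv_headers, csv_rows = from_csv("age, first_name, last_name\n90, John, Adams")
```

Note that CSV carries no type information, so in this sketch the age comes back as the string `'90'`, while the JSON version keeps the integer `90`.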

Installation

To install tablib, simply:

$ pip install tablib

Or, if you absolutely must:

$ easy_install tablib

Contribute

If you’d like to contribute, simply fork the repository, commit your changes to the develop branch (or branch off of it), and send a pull request. Make sure you add yourself to AUTHORS.

Roadmap

  • Release CLI Interface

  • Auto-detect import format

  • Add possible other exports (SQL?)

  • Ability to assign types to rows (set, regex=, &c.)

History

0.8.0 (2010-09-25)

  • New format plugin system!

  • Imports! ELEGANT Imports!

  • Tests. Lots of tests.

0.7.1 (2010-09-20)

  • Reverting methods back to properties.

  • Windows bug compensated in documentation.

0.7.0 (2010-09-20)

  • Renamed DataBook to Databook for consistency.

  • Export properties changed to methods (XLS filename / StringIO bug).

  • Optional Dataset.xls(path='filename') support (for writing on Windows).

  • Added utf-8 on the worksheet level.

0.6.4 (2010-09-19)

  • Updated unicode export for XLS.

  • More exhaustive unit tests.

0.6.3 (2010-09-14)

  • Added Dataset.append() support for columns.

0.6.2 (2010-09-13)

  • Fixed Dataset.append() error on empty dataset.

  • Updated Dataset.headers property w/ validation.

  • Added Testing Fixtures.

0.6.1 (2010-09-12)

  • Packaging hotfixes.

0.6.0 (2010-09-11)

  • Public Release.

  • Export Support for XLS, JSON, YAML, and CSV.

  • DataBook Export for XLS, JSON, and YAML.

  • Python Dict Property Support.
