CRATE: clinical records anonymisation and text extraction

Project description

Purpose

  • Anonymises relational databases.

  • Performs some specific preprocessing tasks; e.g.

    • preprocesses some specific databases (e.g. Servelec RiO EMR).

    • fetches some word lists, e.g. forenames/surnames/eponyms.

  • Provides a natural language processing (NLP) pipeline.

  • Provides a web app for:

    • querying the anonymised database

    • managing a consent-to-contact process

Documentation

See https://crateanon.readthedocs.io

Licence

  • Copyright (C) 2015-2021 Rudolf Cardinal (rudolf@pobox.com).

  • Licensed under the GNU GPL v3+: see LICENSE file.

  • Some third-party libraries have slightly different licences:

    • aspects of CamAnonGatePipeline.java are based on demonstration GATE code, copyright (C) University of Sheffield, and licensed under the GNU LGPL; see https://gate.ac.uk/.
