跳转到主要内容

DBML和Fides数据集清单的互操作性

项目描述

dbml-to-fides

此工具将DBML转换为Fides数据集清单

它可选地支持将DBML的结果合并到现有的Fides数据集清单中。

结合使用,这可以用于自动化,以确保数据集与持续集成中最新的模式更改保持最新。

用法

基本用法

给定一个位于sample.dbml中的示例DBML

Table users {
  id integer [primary key]
  username varchar
  role varchar
  created_at timestamp
}

Table posts {
  id integer [primary key]
  title varchar
  body text [note: 'Content of the post']
  user_id integer
  status post_status
  created_at timestamp
}

Enum post_status {
  draft
  published
  private [note: 'visible via URL only']
}

Ref: posts.user_id > users.id // many-to-one

dbml-to-fides将输出它从DBML文件中推断出的Fides数据集

$ dbml-to-fides sample.dbml
dataset:
- name: public
  collections:
  - name: users
    description: Users
    fields:
    - name: id
      fides_meta:
        primary_key: true
    - name: username
    - name: role
    - name: created_at
  - name: posts
    description: All the content you crave
    fields:
    - name: id
      fides_meta:
        primary_key: true
    - name: title
    - name: body
      description: Content of the post
    - name: user_id
      fides_meta:
        references:
        - dataset: public
          field: users.id
          direction: to
    - name: status
    - name: created_at

与现有Fides数据集合并

如果您有一个位于.fides/sample_dataset.yml的现有Fides数据集

dataset:
- fides_key: sample_dataset
  organization_fides_key: default_organization
  name: public
  description: Sample dataset for my system
  meta: null
  data_categories: []
  data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
  retention: 30 days after account deletion
  collections:
  - name: users
    description: User information
    fields:
    - name: id
      fides_meta:
        primary_key: true
      description: User's unique ID
      data_categories:
      - user.unique_id
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
    - name: username
      description: User's username
      data_categories:
      - user.name
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
      retention: Account termination
    - name: role
      description: User's system level role/privilege
      data_categories:
      - system.operations
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
    - name: created_at
      description: User's creation timestamp
      data_categories:
      - system.operations
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
  - name: posts
    description: Post information
    fields:
    - name: id
      fides_meta:
        primary_key: true
      description: Post's unique ID
      data_categories:
      - system.operations
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
    - name: title
      description: Post's title
      data_categories:
      - system.operations
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
    - name: body
      description: Post's body
      data_categories:
      - system.operations
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
    - name: user_id
      fides_meta:
        references:
        - dataset: public
          field: users.id
          direction: to
      description: Post creator's unique User ID
      data_categories:
      - user.unique_id
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
    - name: status
      description: User's creation timestamp
      data_categories:
      - system.operations
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
    - name: created_at
      description: Post's creation timestamp
      data_categories:
      - system.operations
      data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified

dbml-to-fides可以使用--base-dataset选项合并结果。但,在这种情况下没有差异

$ diff -u .fides/sample_dataset.yml <(dbml-to-fides sample.dbml --base-dataset .fides/sample_dataset.yml)
$

如果我们对DBML进行了更改

@@ -3,6 +3,7 @@ Table users {
   username varchar
   role varchar
   created_at timestamp
+  social_security_number varchar
 }
 
 Table posts {

然后再次运行我们的diff将会将字段添加到我们的Fides数据集中

$ diff -u .fides/sample_dataset.yml <(dbml-to-fides sample.dbml --base-dataset .fides/sample_dataset.yml)
--- .fides/sample_dataset.yml	2023-05-22 15:39:24
+++ /dev/fd/63	2023-05-22 15:40:07
@@ -34,6 +34,7 @@
       data_categories:
       - system.operations
       data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
+    - name: social_security_number
   - name: posts
     description: Post information
     fields:

文件输出

如果我们想要将输出写入文件,我们将添加--output-file标志

$ dbml-to-fides sample.dbml --base-dataset .fides/sample_dataset.yml --output-file .fides/sample_dataset.yml
$ git diff
diff --git a/.fides/sample_dataset.yml b/.fides/sample_dataset.yml
index 594cee4..edc3141 100644
--- a/.fides/sample_dataset.yml
+++ b/.fides/sample_dataset.yml
@@ -34,6 +34,7 @@ dataset:
       data_categories:
       - system.operations
       data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
+    - name: social_security_number
   - name: posts
     description: Post information
     fields:

初始生成

如果您没有现有的Fides数据集,--include-fides-keys标志将创建一个包含所有键的更“完整”的Fides数据集版本。有关每个字段可以/应该用什么信息填充的详细信息,请参阅文档

$ dbml-to-fides sample.dbml --include-fides-keys
dataset:
- fides_key: null
  name: public
  description: null
  organization_fides_key: null
  meta: {}
  third_country_transfers: []
  joint_controller: []
  retention: null
  data_categories: []
  data_qualifiers: []
  collections:
  - name: users
    description: Users
    data_categories: []
    data_qualifiers: []
    retention: null
    fields:
    - name: id
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
      fides_meta:
        primary_key: true
    - name: username
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
    - name: role
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
    - name: created_at
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
    - name: social
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
  - name: posts
    description: All the content you crave
    data_categories: []
    data_qualifiers: []
    retention: null
    fields:
    - name: id
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
      fides_meta:
        primary_key: true
    - name: title
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
    - name: body
      description: Content of the post
      data_categories: []
      data_qualifier: null
      retention: null
    - name: user_id
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
      fides_meta:
        references:
        - dataset: public
          field: users.id
          direction: to
    - name: status
      description: null
      data_categories: []
      data_qualifier: null
      retention: null
    - name: created_at
      description: null
      data_categories: []
      data_qualifier: null
      retention: null

项目详情


下载文件

下载适用于您的平台的文件。如果您不确定选择哪个,请了解有关安装包的更多信息。

源代码分发

dbml-to-fides-1.0.0b1.tar.gz (17.3 kB 查看哈希)

上传时间 源代码

构建分发

dbml_to_fides-1.0.0b1-py3-none-any.whl (7.1 kB 查看哈希)

上传时间 Python 3

由以下机构支持

AWSAWS云计算和安全赞助商DatadogDatadog监控FastlyFastlyCDNGoogleGoogle下载分析MicrosoftMicrosoftPSF赞助商PingdomPingdom监控SentrySentry错误日志StatusPageStatusPage状态页面