Python Implementation of Pipelined Relational Query Language (PRQL)

PyPrql

Python implementation of PRQL.

Documentation of PRQL is at https://github.com/max-sixty/prql

from employees
filter country = "USA"
derive [
  gross_salary: salary + payroll_tax,
  gross_cost:   gross_salary + benefits_cost
]
filter gross_cost > 0
aggregate by:[title, country] [
    average salary,
    sum     salary,
    average gross_salary,
    sum     gross_salary,
    average gross_cost,
    sum_gross_cost: sum gross_cost,
    row_count: count salary
]
sort sum_gross_cost
filter row_count > 200
take 20

from pyprql import prql

# q is a string holding the PRQL query shown above
sql = prql.to_sql(q)
print(sql)

SELECT AVG(salary),
       SUM(salary),
       AVG(salary + payroll_tax),
       SUM(salary + payroll_tax),
       AVG(salary + payroll_tax + benefits_cost),
       SUM(salary + payroll_tax + benefits_cost) as sum_gross_cost,
       COUNT(salary)                             as row_count,
       salary + payroll_tax                      as gross_salary,
       (salary + payroll_tax) + benefits_cost    as gross_cost
FROM `employees` employees_e
WHERE country="USA" AND (gross_salary+benefits_cost)>0
GROUP BY title, country
HAVING row_count >200
ORDER BY sum_gross_cost
LIMIT 20
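
To make the semantics of the pipeline concrete, here is a rough plain-Python sketch that applies the same stages (filter, derive, filter, aggregate, sort, filter, take) to in-memory rows. It does not use PyPrql and is not how the library works internally. The sample rows, the `run_pipeline` function, the `min_row_count` parameter, and the `avg_salary` key are assumptions made for illustration, and only a few of the aggregates are computed.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows standing in for the employees table.
employees = [
    {"title": "Engineer", "country": "USA", "salary": 100, "payroll_tax": 10, "benefits_cost": 5},
    {"title": "Engineer", "country": "USA", "salary": 200, "payroll_tax": 20, "benefits_cost": 10},
    {"title": "Analyst", "country": "USA", "salary": 80, "payroll_tax": 8, "benefits_cost": 4},
    {"title": "Engineer", "country": "Canada", "salary": 90, "payroll_tax": 9, "benefits_cost": 3},
]


def run_pipeline(rows, min_row_count=200, limit=20):
    """Apply the stages of the example query to a list of dicts."""
    # filter country = "USA"
    rows = [r for r in rows if r["country"] == "USA"]

    # derive [gross_salary, gross_cost]
    derived = []
    for r in rows:
        gross_salary = r["salary"] + r["payroll_tax"]
        gross_cost = gross_salary + r["benefits_cost"]
        derived.append({**r, "gross_salary": gross_salary, "gross_cost": gross_cost})

    # filter gross_cost > 0
    derived = [r for r in derived if r["gross_cost"] > 0]

    # aggregate by:[title, country]
    groups = defaultdict(list)
    for r in derived:
        groups[(r["title"], r["country"])].append(r)

    result = []
    for (title, country), members in groups.items():
        result.append({
            "title": title,
            "country": country,
            "avg_salary": mean(m["salary"] for m in members),
            "sum_gross_cost": sum(m["gross_cost"] for m in members),
            # Assumes salary is never missing, so counting rows matches "count salary".
            "row_count": len(members),
        })

    # sort sum_gross_cost, then filter row_count > min_row_count, then take limit
    result.sort(key=lambda g: g["sum_gross_cost"])
    result = [g for g in result if g["row_count"] > min_row_count]
    return result[:limit]
```

Calling `run_pipeline(employees, min_row_count=0)` keeps both USA groups, ordered by `sum_gross_cost`. With the default threshold of 200 from the query, the tiny sample yields no rows.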

Differences from the spec

The parser can only parse casts in select statements when they appear inside [ ], so

select foo | as float

will fail. The cast must be wrapped in brackets as a single-item list:

select [ foo | as float ]
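
If queries are generated or written against the spec, one possible workaround is to rewrite bare cast selects before handing the query to the parser. The sketch below is a hypothetical helper (`wrap_select_cast` is not part of PyPrql) that assumes the select and its cast sit on a single line with one expression. It only does string rewriting and leaves other lines untouched.

```python
import re

# Matches "select <expr> | as <type>" on one line when the expression
# is not already wrapped in brackets.
_BARE_CAST_SELECT = re.compile(r"^(\s*)select\s+(?![\s\[])(.+?\|\s*as\s+\w+)\s*$")


def wrap_select_cast(line: str) -> str:
    """Wrap a bare cast in a select line as a single-item list."""
    match = _BARE_CAST_SELECT.match(line)
    if match is None:
        # Not a bare cast select (or already bracketed): return unchanged.
        return line
    indent, expr = match.groups()
    return f"{indent}select [ {expr} ]"


def wrap_query(query: str) -> str:
    """Apply wrap_select_cast to every line of a multi-line query."""
    return "\n".join(wrap_select_cast(line) for line in query.split("\n"))
```

For example, `wrap_select_cast("select foo | as float")` returns `select [ foo | as float ]`, while a line that is already bracketed is returned as-is.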
