-
Notifications
You must be signed in to change notification settings - Fork 6
First Tutorial Part 8: Formulas
If you can get simple logic implemented in the database, then you should for a variety of reasons, in particular, centralization of logic and performance. An example of that are alculations you need to be performed on data that can be embedded in views instead of living somewhere else (e.g., application code, APIs or ETL logic). Surprisingly, the threshold where such logic becomes too complicated to be managed in the database is pretty high!
Formulas in Tilda are very powerful and offer many advantages:
- Package nuggets of calculations as part of a view definition
- Fully exploit the power of your underlying database functionality
- Centralize calculations in the database for reuse across any client
- Removes logic from more complex system such as ETL, APIs or application code
- Help from the Tilda compiler in managing complexity including validation and composition.
- Self documenation as in, the code is the documentation and the documentation is the code.
- Automated migration
Let's expand on the example we started where we collect test data from users. With Test_XYZ_PivotView and Test_View, we would like to implement some additional fun things:
- Implement a simple flag whether the user passed the test if they answered at least 2 out of the 3 questions.
- Implement a flag whether a user took the test in a time above or below the average time taken by users on average for the test.
- Implement a flag whether passing the test is considered challenging if only half of less of the questions across all tests have been answered correctly.
To do this, we have to do the following:
- Join our two views so we can have test-level aggregates to compare with vs individual user tests
- Implement a formula to calculate the "isPassed" flag
- Implement a formula to calculate the "tookLongerThanAverage"
- Implement a formula that reuses "isPassed" to calculate "wasChallenging"
The first of these should be fairly straightforward at this time:
,{ "name": "Test_XYZ_Analytics_View"
,"description": "A view of XYZ forms with analytics"
,"columns":[
{ "sameas": "Test_XYZ_PivotView.formRefnum" }
,{ "sameas": "Test_XYZ_PivotView.formType" }
,{ "sameas": "Test_XYZ_PivotView.formFillDate" }
,{ "sameas": "Test_XYZ_PivotView.formUserRefnum" }
,{ "sameas": "Test_XYZ_PivotView.formUserEmail" }
,{ "sameas": "Test_XYZ_PivotView.countCorrect" , "name":"formCountCorrect" }
,{ "sameas": "Test_XYZ_PivotView.timeMillisTotal" , "name":"formTimeMillisTotal" }
,{ "sameas": "Test_View.testCount" }
,{ "sameas": "Test_View.answerCountCorrect" , "name":"testAnswerCountCorrect" }
,{ "sameas": "Test_View.timeMillisAvg" , "name":"testTimeMillisAvg" }
]
,"joins": [
{ "object": "Test_View"
, "joinType": "INNER"
, "on": [ { "db":"*", "clause":"Test_View.\"type\" = Test_XYZ_PivotView.\"formType\"" }
]
}
]
}
The only new thing that we haven't done before is add an explicit "join": while joining views, the Tilda compiler won't necessarily know how you need to join things for your needs and as such, you'll need to specify that yourself. In our case here, we want to join on the Form types. Because Test_XYZ_PivotView already filters for 'XYZ' tests only, we don't need to further specify that for Test_View or this combined view.
Next, let's work on our 3 formulas:
- isPassed: we need to calculate if "formCountCorrect" >= 2.
- tookLongerThanAverage: we need to calculate if "formTimeMillisTotal" > "testTimeMillisAvg"
- wasChallenging: we need to calculate if "isPassed"=1 AND "testAnswerCountCorrect" < "testCount"*3/2
,"formulaColumns":[
{ "name": "isPassed"
,"type": "INTEGER"
,"formula":[
"formCountCorrect >= 2"
]
,"title": "Test Passed"
,"description":[
"Whether the test was passed or not by answering at least 2 out of the 3 questions."
]
}
,{ "name": "tookLongerThanAverage"
,"type": "INTEGER"
,"formula":[
"formTimeMillisTotal > testTimeMillisAvg"
]
,"title": "Test Took Longer Than Average"
,"description":[
"Whether the test took longer that the average time spent across all tests."
]
}
,{ "name": "wasChallenging"
,"type": "INTEGER"
,"formula":[
" isPassed=1"
,"AND testAnswerCountCorrect < testCount*3/2"
]
,"title": "Test Was Challenging"
,"description":[
"Whether the test was challenging in that:<LI>"
," <LI>it was passed,</LI>"
," <LI>and overall, less than a third of answers across all tests were answered correctly.</LI>"
,"</UL>"
]
,"values":[
{ "value":"0", "description":"The test was not challenging" }
,{ "value":"1", "description":"The test was challenging" }
]
}
]
The elements are:
- name: The mandatory name of the formula that will result in a column definition, so the regular Tilda Naming Convention applies.
- measure: true|false, optional, false by default. Whether the formula is considered a measure. If specified to true, some additional meta-data will be generated to support querying measure information at the database level.
- formula: An array of string defining the formula in plain SQL as per any syntax your database allows as part of a view column definition. Note that columns referenced do not need to be quoted.
- title: The mandatory title for the formula.
- description: The mandatory array of strings for a possibly multi-line description of the formula.
- values: An optional list of values defining the expected outcomes of the formula if enumerated, i.e., a count for example is a value, but a flag often is null, 0 or 1, which means its values are explicitly enumated. This is optional and is used to support documentation where a formula's possible values can be enumerated.
🎈 NOTE: A formula's description can include HTML tags for formatting in the final generated HTML documentation. Tilda doesn't check if that HTML is valid or not.
🎈 NOTE: A formula's implementation is not explicitly checked by Tilda at build-time (during Gen) since it's open to any syntax supported by the underlying database. As such, it is only checked during Migration when the formula is incorporated into the view and the view is declared in the database. Tilda defines a few common database utility functions (see Common Helper Database Functions).
🎈 NOTE: Most formulas we have seen are somewhat simple, like the examples above. That being said, it's possible to write a formula that makes a view significantly more expensive, for example, something calling a stored procedure that runs multiple queries, or even a non-optimized sub-select. As always, check your performance for the generated view and optimize as needed.
The SQL generated, once again should be rather straightforward:
create or replace view TILDATEST.Test_XYZ_Analytics_View as
select /*DoFormulasSuperView*/
"formRefnum" -- COLUMN
, "formType" -- COLUMN
, "formFillDateTZ" -- COLUMN
, "formFillDate" -- COLUMN
, "formUserRefnum" -- COLUMN
, "formUserEmail" -- COLUMN
, "formCountCorrect" -- COLUMN
, "formTimeMillisTotal" -- COLUMN
, "testCount" -- COLUMN
, "testAnswerCountCorrect" -- COLUMN
, "testTimeMillisAvg" -- COLUMN
-- Whether the test was passed or not by answering at least 2 out of the 3 questions.
, (coalesce("formCountCorrect", 0) >= 2)::integer as "isPassed"
-- Whether the test took longer that the average time spent across all tests.
, (coalesce("formTimeMillisTotal", 0) > coalesce("testTimeMillisAvg", 0))::integer as "tookLongerThanAverage"
-- Whether the test was challenging in that:<LI>
-- <LI>it was passed,</LI>
-- <LI>and overall, less than a third of answers across all tests were answered correctly.</LI>
-- </UL>
, ((coalesce("formCountCorrect", 0) >= 2)::integer=1 AND coalesce("testAnswerCountCorrect", 0) < coalesce("testCount", 0)*3/2)::integer as "wasChallenging"
from (
-- 'A view of XYZ forms with analytics'
select TILDATEST.Test_XYZ_PivotView."formRefnum" as "formRefnum" -- The primary key for this record
, TILDATEST.Test_XYZ_PivotView."formType" as "formType" -- Form template type
, TILDATEST.Test_XYZ_PivotView."formFillDateTZ" as "formFillDateTZ" -- Generated helper column to hold the time zone ID for 'formFillDate'.
, TILDATEST.Test_XYZ_PivotView."formFillDate" as "formFillDate" -- The date the form was filled
, TILDATEST.Test_XYZ_PivotView."formUserRefnum" as "formUserRefnum" -- The primary key for this record
, TILDATEST.Test_XYZ_PivotView."formUserEmail" as "formUserEmail" -- The user's email
, TILDATEST.Test_XYZ_PivotView."countCorrect" as "formCountCorrect" -- Whether the answer is correct or not
, TILDATEST.Test_XYZ_PivotView."timeMillisTotal" as "formTimeMillisTotal" -- Time in milliseconds for the time spent answering the question
, TILDATEST.Test_View."testCount" as "testCount" -- The primary key for this record
, TILDATEST.Test_View."answerCountCorrect" as "testAnswerCountCorrect" -- Whether the answer is correct or not
, TILDATEST.Test_View."timeMillisAvg" as "testTimeMillisAvg" -- Time in milliseconds for the time spent answering the question
from TILDATEST.Test_XYZ_PivotView
inner join TILDATEST.Test_View on Test_View."type" = Test_XYZ_PivotView."formType"
) as T
;
COMMENT ON VIEW TILDATEST.Test_XYZ_Analytics_View IS E'-- DDL META DATA VERSION 2019-01-09\ncreate or replace view TILDATEST.Test_XYZ_Analytics_View as ...';
COMMENT ON COLUMN TILDATEST.Test_XYZ_Analytics_View."formRefnum" IS E'The primary key for this record';
COMMENT ON COLUMN TILDATEST.Test_XYZ_Analytics_View."formType" IS E'Form template type';
COMMENT ON COLUMN TILDATEST.Test_XYZ_Analytics_View."formFillDateTZ" IS E'Generated helper column to hold the time zone ID for ''formFillDate''.';
COMMENT ON COLUMN TILDATEST.Test_XYZ_Analytics_View."formFillDate" IS E'The date the form was filled';
COMMENT ON COLUMN TILDATEST.Test_XYZ_Analytics_View."formUserRefnum" IS E'The primary key for this record';
COMMENT ON COLUMN TILDATEST.Test_XYZ_Analytics_View."formUserEmail" IS E'The user''s email';
COMMENT ON COLUMN TILDATEST.Test_XYZ_Analytics_View."formCountCorrect" IS E'Whether the answer is correct or not';
COMMENT ON COLUMN TILDATEST.Test_XYZ_Analytics_View."formTimeMillisTotal" IS E'Time in milliseconds for the time spent answering the question';
COMMENT ON COLUMN TILDATEST.Test_XYZ_Analytics_View."testCount" IS E'The primary key for this record';
COMMENT ON COLUMN TILDATEST.Test_XYZ_Analytics_View."testAnswerCountCorrect" IS E'Whether the answer is correct or not';
COMMENT ON COLUMN TILDATEST.Test_XYZ_Analytics_View."testTimeMillisAvg" IS E'Time in milliseconds for the time spent answering the question';
COMMENT ON COLUMN TILDATEST.Test_XYZ_Analytics_View."isPassed" IS E'The calculated formula: Whether the test was passed or not by answering at least 2 out of the 3 questions.';
COMMENT ON COLUMN TILDATEST.Test_XYZ_Analytics_View."tookLongerThanAverage" IS E'The calculated formula: Whether the test took longer that the average time spent across all tests.';
COMMENT ON COLUMN TILDATEST.Test_XYZ_Analytics_View."wasChallenging" IS E'The calculated formula: Whether the test was challenging in that:<LI>\n <LI>it was passed,</LI>\n <LI>and overall, less than a third of answers across all tests were answered correctly.</LI>\n</UL>';
Again, this should be straightforward to understand. Each formula is tracked as a view column, and validation, migration and documentation are automated. The only tricks are:
- The columns used in the formulas, if they match any of the known columns from the tables and subviews used, will be escaped. If you were to embed a subselect for example to another query, you will want to quote identifiers as appropriate.
- The compiler knows that you can convert a boolean (the type of the expression) to an integer, and since the formula was defined as an Integer, the generated code does that automatically (using Postgres :: type conversation notation).