Minimum requirements for Source data
Unique Row Identifier (for example, a primary key)
Every source table must have a unique row identifier. This is needed to trace individual rows through the pipeline and to efficiently detect new and updated records.
The Unique Row Identifier can be a “synthetic” column (such as a concatenation or a hash of some other columns). The only requirement is that it is unique and unchanging.
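As an illustration of a synthetic identifier, the sketch below hashes a chosen set of columns into a single stable value. It is a minimal example, not part of Prequel: the function name `synthetic_row_id`, the column names, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def synthetic_row_id(row: dict, key_columns: list[str]) -> str:
    """Build a stable identifier by hashing the chosen columns (hypothetical helper)."""
    tokens = []
    for col in key_columns:
        value = row[col]
        if value is None:
            # Mark NULLs explicitly so they never collide with a real string.
            tokens.append("N")
        else:
            text = str(value)
            # Length-prefix each value so ("ab", "c") and ("a", "bc") hash differently.
            tokens.append(f"S{len(text)}:{text}")
    joined = "|".join(tokens)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Example with hypothetical columns: the same inputs always yield the same id.
row = {"tenant_id": "acme", "order_number": 1042, "amount": 19.99}
row_id = synthetic_row_id(row, ["tenant_id", "order_number"])
```

The identifier stays unchanging only if the columns fed into it never change for a given row, so mutable columns (such as `amount` above) should be left out of `key_columns`.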
Updated At timestamp (required for Fact tables)
For source tables under ~10 million total rows, Prequel can detect data changes without the use of an explicit Updated At timestamp.
Optional enhancements
Updated At timestamp (optional for Dimension tables)
An Updated At column makes change detection more efficient and supports volumes greater than 10m total rows.
Fact tables require an Updated At column as a best practice (semantically, this would be synonymous with Created At). If a fact table does not have an Updated At column, it can be treated as a dimension table, but at the cost of lower efficiency.
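The reason an Updated At column helps can be illustrated with a high-water-mark query: only rows modified since the previous sync need to be read. This is a minimal sketch of the general technique, not Prequel's implementation; the `orders` table, its columns, and the `fetch_changed_rows` helper are hypothetical.

```python
import sqlite3


def fetch_changed_rows(conn: sqlite3.Connection, last_synced_at: str) -> list[tuple]:
    """Return rows updated after the previous high-water mark (hypothetical helper)."""
    query = (
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at"
    )
    return conn.execute(query, (last_synced_at,)).fetchall()


# In-memory demo table. Timestamps are ISO 8601 strings in one fixed format,
# so plain string comparison matches chronological order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, amount INTEGER, updated_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("a", 10, "2024-01-01T00:00:00Z"),
        ("b", 20, "2024-01-02T00:00:00Z"),
        ("c", 30, "2024-01-03T00:00:00Z"),
    ],
)

changed = fetch_changed_rows(conn, "2024-01-01T12:00:00Z")
# The latest timestamp seen becomes the mark for the next sync.
new_mark = max(row[2] for row in changed)
```

Without such a column, detecting changes generally means comparing every row against its previously seen state, which is why the approach scales less well as total row counts grow.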
Compatibility grid
| Dataset Type | Unique Row Identifier | Updated At Column | Volume Limits |
|---|---|---|---|
| Fact | Required | Required | 100m rows per day |
| Dimension | Required | Optional | 100m rows per day with Updated At, 10m rows total without Updated At |
Volume limits are based on current performance characteristics and may be revised in the future.
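The grid can also be expressed as a small pre-flight check. The sketch below is one possible encoding, assuming the limits listed in the table above; the function `check_source_table` and its parameters are hypothetical and not part of any Prequel API.

```python
# Limits copied from the compatibility grid; they may be revised in the future.
ROWS_PER_DAY_LIMIT = 100_000_000
TOTAL_ROWS_LIMIT_WITHOUT_UPDATED_AT = 10_000_000


def check_source_table(
    dataset_type: str,
    has_unique_id: bool,
    has_updated_at: bool,
    total_rows: int,
    rows_per_day: int,
) -> list[str]:
    """Return a list of problems; an empty list means the table fits the grid."""
    if dataset_type not in ("fact", "dimension"):
        raise ValueError(f"unknown dataset type: {dataset_type!r}")

    problems = []
    if not has_unique_id:
        problems.append("missing Unique Row Identifier")
    if dataset_type == "fact" and not has_updated_at:
        problems.append("Fact tables require an Updated At column")

    if dataset_type == "fact" or has_updated_at:
        # Fact tables, and Dimension tables with Updated At, are limited per day.
        if rows_per_day > ROWS_PER_DAY_LIMIT:
            problems.append("exceeds 100m rows per day")
    elif total_rows > TOTAL_ROWS_LIMIT_WITHOUT_UPDATED_AT:
        # Dimension tables without Updated At are limited by total size.
        problems.append("exceeds 10m total rows without Updated At")
    return problems


small_dimension = check_source_table("dimension", True, False, 5_000_000, 0)
large_dimension = check_source_table("dimension", True, False, 20_000_000, 0)
```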