Hello,
I have access to a SQL Server database with transactional data, and I am working on a process to upload this to a TB cube.
At month end, these records may be updated with new base values. This is captured on-premises by a stored procedure that updates a "master" table and increments a column called "VarID" (variation ID). Following the guide, my master table has a unique identifier, a timestamp, and the VarID column.
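For context, the update procedure looks roughly like the sketch below (table, column, and procedure names are simplified placeholders, not the real ones):

```sql
-- Rough sketch of the month-end update procedure (names are illustrative).
-- When a base value changes, the row is updated and VarID is incremented
-- so the downstream data flows can detect the change.
CREATE OR ALTER PROCEDURE dbo.usp_UpdateMaster
    @RecordID     UNIQUEIDENTIFIER,
    @NewBaseValue DECIMAL(18, 2)
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE dbo.MasterTable
    SET BaseValue   = @NewBaseValue,
        VarID       = VarID + 1,         -- variation ID increments on every update
        LastUpdated = SYSUTCDATETIME()   -- timestamp used by the incremental load
    WHERE RecordID = @RecordID;
END;
```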
Next, I am using three data flows:
1. LoadCubeMaster - this triggers a stored procedure to upload the master table.
2. CubeMasterToDL - this uses a JSON document to send the master table to the data lake.
3. Master - this runs a relational modelling script that loads said JSON document into a relational table, then runs a mapping to the cube.
The problem I am having is related to the variation ID. When one of the records is updated, its VarID increments and the data flow uploads that record. However, once the data flow has completed, if another record is updated, the flow no longer picks it up unless I select the active connection point and rewind the incremental again. The goal is for this to be handled without any user input.
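On the SQL side I could imagine a high-water-mark query along these lines, where a control table remembers what was last loaded so each run returns newly updated rows, but I am not sure how (or whether) that maps onto the data flow's incremental connection point (the control table and all names here are hypothetical):

```sql
-- Hypothetical high-water-mark extraction (names are illustrative).
-- A control table remembers the timestamp of the last successful load,
-- so every run picks up only rows updated since the previous run.
DECLARE @LastLoaded DATETIME2 =
    (SELECT LastLoadedAt FROM dbo.LoadControl WHERE FlowName = 'LoadCubeMaster');

SELECT RecordID, VarID, BaseValue, LastUpdated
FROM dbo.MasterTable
WHERE LastUpdated > @LastLoaded;   -- all rows changed since the last run

-- After a successful load, advance the watermark:
UPDATE dbo.LoadControl
SET LastLoadedAt = SYSUTCDATETIME()
WHERE FlowName = 'LoadCubeMaster';
```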
What is the recommended way to handle updated records?
How will this affect future periods?
I am relatively new to using the data lake so any advice would be greatly appreciated.