I'm just starting with the ION ETL tool. So far I have all the GHR tables being transferred to my local SQL database (TEST); I'll work on FSM next. The initial download went well, except for running the JVM out of heap space. Hopefully I've fixed that.
I missed implementing logging when I set it up, so now I'm going back, making those changes, and re-running the transformation to validate it. I'm noticing that even though no data has changed in my Test DataLake, every single record originally downloaded is being processed again. For instance, EmployeeFieldHistory is still running after 2.5 hours and has processed over 7 million records.
I checked sys.dm_db_index_usage_stats, and sure enough it shows over 7 million "user_updates":
-- Check how many updates the last run actually wrote to the table
SELECT OBJECT_NAME(object_id) AS TableName,
       last_user_update,
       user_updates,
       *
FROM sys.dm_db_index_usage_stats
WHERE database_id = DB_ID('INFORDWHR_SRC')
  AND object_id = OBJECT_ID('EmployeeFieldHistory');
Is this just a one-off, or will this full scan occur every time I run it? If it happens every time, then it is of no use to me.
Why don't we use RepSet_Variation_ID instead of the new DLDocDate field? I would think the initial query would take the max RepSet_Variation_ID already downloaded locally and pull only the rows with a greater value from the DataLake (see the rough sketch below).
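Just to illustrate what I mean, here is a rough sketch of the incremental pull I have in mind. The object names are placeholders from my own setup, and it assumes RepSet_Variation_ID only ever increases as changes are replicated, so treat it as an idea rather than the template's actual logic:

-- Sketch only: assumes RepSet_Variation_ID increases with every replicated change
DECLARE @LastVariation BIGINT;

-- Highest variation ID already present in my local copy
SELECT @LastVariation = MAX(RepSet_Variation_ID)
FROM dbo.EmployeeFieldHistory;          -- local table in INFORDWHR_SRC

-- Pull only rows newer than what I already have
SELECT *
FROM DataLake_EmployeeFieldHistory      -- placeholder for the DataLake source query
WHERE RepSet_Variation_ID > ISNULL(@LastVariation, 0);

That way only new or changed rows would come across, instead of all 7 million every run.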
What am I missing here? It seems so simple to me, yet it is not the standard in your template.
I will admit I have my own set of tables in the DataLake that I use; the tables are not updated by anything other than replication, and all fields are downloaded to my SQL database with no modifications.
Thank you