Good Day!
This is ultimately a Compass / driver question, but here is some framing of the problem:
In one of the latest Birst releases, a new Connect option was enabled that reportedly uses the Compass JDBC driver to connect to the Infor Data Lake, for Multi-tenant (non-GovCloud) customers.
- In theory, this is a great way to limit initial extract sizes, because it allows filtering beyond the partition/high-level indexing. Today we generally extract on the lastmodified datestamp and filter further in subsequent scripting.
- In practice, we are hitting the cached / optimized layer provided on the data lake, which is not necessarily up to date.
- I understand the data lake is semi-relational/non-relational data storage, and that the querying is more SQL-like.
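To make the first point concrete, here is a minimal sketch of the kind of filtered extract query we would like to push down, instead of pulling everything and filtering in script. The table name `HypotheticalOrders`, the helper `build_incremental_query`, and the timestamp literal format are all assumptions for illustration; I have not verified this syntax against Compass.

```python
from datetime import datetime, timezone


def build_incremental_query(table: str, watermark: datetime,
                            column: str = "lastmodified") -> str:
    """Build a generic SQL-like SELECT that keeps rows changed after the watermark.

    Hypothetical helper: the column name and literal format are assumptions,
    not confirmed Compass syntax.
    """
    # Normalize the watermark to UTC and render it as a plain timestamp literal.
    stamp = watermark.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    return f"SELECT * FROM {table} WHERE {column} > '{stamp}'"


# Example: only rows modified since the start of 2024 (UTC).
query = build_incremental_query(
    "HypotheticalOrders", datetime(2024, 1, 1, tzinfo=timezone.utc)
)
print(query)
```

If the optimized layer is stale, a query like this would silently miss recent rows, which is why the refresh timing matters to us.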
A few related questions:
- What is the frequency of the rebuilds of this optimized layer for Compass? (e.g. nightly, weekly, every 5th run)
- Is there query syntax available through Compass to avoid hitting the cache layer?
- Even a less-than-optimal pull of the data should be faster than writing all the records for a given time frame, especially for smaller (active records) sources that are modeled as full refresh.
- I assume we are on a read-only connection, so running EXEC commands through Birst won't work.
- Is there a setting in ION Desk / Data Fabric that controls when these intermediate layers are rebuilt?
If we don't know when the data is refreshed, we cannot tell users accurately how current their data is.
Any clarification of my understanding of the functionality or the use case, any unsolicited but relevant information, or answers to the questions above would be greatly appreciated!
Ben
