Data Services enables Data citizens to efficiently use Data from across their Data estates for BI purposes. It does so by providing an integrated set of tools to acquire, transform, and ultimately serve that Data.
In this release, IFS Data Services is a combination of the main services described below.
These are multi-tenant services. The main IFS Cloud Web functionalities launched in this release make it possible to load Data into the Data Lake and then enrich, cleanse, and transform it via a Data Pipeline for specific use cases. Data Services functionalities depend on the IFS.ai Platform.
1) Data Lake
The main data store for Data Services will be a Data Lake (ADLS Gen2). This Data Lake will hold Data for different requirements (Analytics-based solutions, documents for indexing). Raw Data is ingested into the Data Lake, and the Data is then enriched and transformed via a Data Pipeline for the specific use cases of ESG/Copilot.
2) Data Pump
The Data Pump performs the actual data movement and Parquet file generation. It reads the Data from the Oracle database and creates a Parquet file, which is sent to the Data Lake Service and then on to the specific Data Lake.
3) Data Lake Service
Data Lake Service can be used to upload specified files to, or download them from, Cloud Storage; to add, update, and get metadata-related details in Cloud Storage; and to list the storage hierarchy for a given container and path within Cloud Storage. Currently, only Azure Data Lake Storage is supported. Tenant information is determined by the service.
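The operations listed above can be pictured with a small in-memory model. This is only an illustrative sketch: the class `InMemoryLakeStore` and all of its method names are hypothetical, it is not the Data Lake Service API or an Azure SDK, and tenant resolution is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryLakeStore:
    """Hypothetical stand-in for cloud storage; not an IFS or Azure API."""

    # Both dictionaries are keyed by (container, path).
    files: dict = field(default_factory=dict)
    metadata: dict = field(default_factory=dict)

    def upload(self, container, path, data):
        """Store the file bytes and make sure a metadata entry exists."""
        self.files[(container, path)] = data
        self.metadata.setdefault((container, path), {})

    def download(self, container, path):
        """Return the bytes previously uploaded at this location."""
        return self.files[(container, path)]

    def set_metadata(self, container, path, **entries):
        """Add new metadata keys and update existing ones."""
        self.metadata[(container, path)].update(entries)

    def get_metadata(self, container, path):
        """Return a copy of the metadata for this file."""
        return dict(self.metadata[(container, path)])

    def list_hierarchy(self, container, prefix=""):
        """List the paths in a container that start with the given prefix."""
        return sorted(
            path
            for (name, path) in self.files
            if name == container and path.startswith(prefix)
        )
```

A caller would, for instance, upload a file to a container, tag it with `set_metadata`, and then call `list_hierarchy` with a path prefix to browse what is stored under it.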
4) Data Pipeline Service
Data Pipeline Service is used to start a Data Pipeline (Workflow) that can orchestrate several scripts. Tenant information is determined by the Data Pipeline Service, which passes the Data Lake and connection information to the Workflow.
5) Workflow
Data Services uses Workflow to orchestrate the execution of the scripts.
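As a rough illustration of the two sections above, a Workflow can be thought of as an ordered list of scripts that all receive the same Data Lake and connection information. The sketch below is an assumption made for illustration only: `PipelineContext`, `run_workflow`, and the two sample scripts are hypothetical names, and the real Workflow engine is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineContext:
    """Hypothetical bundle of the information passed to the Workflow."""

    tenant_id: str
    data_lake: str
    connection: str


def run_workflow(context, scripts):
    """Run each script in order, feeding it the state left by the previous one."""
    state = {}
    for script in scripts:
        state = script(context, state)
    return state


def cleanse(context, state):
    """Sample script: records that a cleansing step ran."""
    return {**state, "steps": state.get("steps", []) + ["cleanse"]}


def enrich(context, state):
    """Sample script: records an enrichment step and the Data Lake it used."""
    return {
        **state,
        "steps": state.get("steps", []) + ["enrich"],
        "lake": context.data_lake,
    }
```

Calling `run_workflow(context, [cleanse, enrich])` runs the two sample scripts in sequence against the same context, which is the orchestration idea in miniature.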
6) Workload Job Definitions
The Workload Job Definitions page in IFS Cloud Web can be used to load Data into the Data Lake and to start a Data Pipeline via the Data Pipeline Service. A Workload Job Definition consists of Data Sources and Workflows (Actions).
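The structure of a Workload Job Definition can be sketched as a simple record with two lists. The class and field names below are hypothetical, and the ordering used in `plan` (all Data Source loads first, then the Workflow starts) is an assumption made for the example rather than documented behavior.

```python
from dataclasses import dataclass, field


@dataclass
class WorkloadJobDefinition:
    """Hypothetical model: a named job with Data Sources and Workflows (Actions)."""

    name: str
    data_sources: list = field(default_factory=list)
    workflows: list = field(default_factory=list)

    def plan(self):
        """Return the steps in the assumed order: loads first, then workflow starts."""
        loads = [f"load:{source}" for source in self.data_sources]
        starts = [f"start:{workflow}" for workflow in self.workflows]
        return loads + starts
```

For a job with one Data Source and one Workflow, `plan()` returns a two-item list: the load step followed by the workflow start step.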
Further features available in IFS Cloud Web include: