Hi Team,
Sharing a detailed update on the progress of ECS onboarding for DG:
- We have successfully refactored the ingestion job for the database application by removing all dependencies on HDFS. This ensures better compatibility with the ECS environment and aligns with the target architecture.
- The refactored application is now able to run on ECS using spark-submit in local mode as well as in client mode. This confirms that the basic setup and execution flow are working as expected.
- The job connects to the source database, reads the required data, and creates audit entries, which validates the end-to-end ingestion flow.
- Additionally, we have validated the setup for the CIRAS application on ECS. At the moment, we are reading a sample set of 10 records and printing them to the console to verify correctness and connectivity.
- So far, data ingestion has been tested successfully with MySQL Server and Oracle databases, confirming that the system can handle multiple source types.
- As next steps, we plan to:
  - Test ingestion with larger data volumes to evaluate performance and stability.
  - Onboard the remaining data sources one at a time, validating that data can be read from each before moving to the next.
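
For reference, the two run modes mentioned above (local and client) could be launched roughly as sketched below. This is only an illustration and not our actual launch script: the application file name `ingestion_job.py` and the master URL `spark://spark-master.example:7077` are hypothetical placeholders, and the real values depend on how the ECS environment is set up. The only spark-submit options used are `--master` and `--deploy-mode`.

```python
import shlex


def build_spark_submit(app_path, master, deploy_mode=None, app_args=()):
    """Return the spark-submit argument list for one run of the job."""
    cmd = ["spark-submit", "--master", master]
    # --deploy-mode is optional; spark-submit defaults to client mode.
    if deploy_mode is not None:
        cmd += ["--deploy-mode", deploy_mode]
    cmd.append(app_path)
    cmd.extend(app_args)
    return cmd


# Hypothetical placeholders, not the real job name or cluster address.
APP = "ingestion_job.py"
CLUSTER_MASTER = "spark://spark-master.example:7077"

# Local mode: driver and executors run inside a single local JVM.
local_cmd = build_spark_submit(APP, "local[*]")

# Client mode: the driver runs on the submitting host, executors on the cluster.
client_cmd = build_spark_submit(APP, CLUSTER_MASTER, deploy_mode="client")

# Print shell-safe command lines that can be copied into a terminal.
print(shlex.join(local_cmd))
print(shlex.join(client_cmd))
```

Building the command as a list keeps the arguments safe to pass to `subprocess.run` later, without relying on shell quoting.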