ETL / Data Testing Engineer
Experience: 2-4 years
Salary Range: Up to 8 LPA
Job Overview:
The ETL and Data Testing Engineer will be responsible for ensuring the accuracy, completeness, and integrity of data processed through Extract, Transform, and Load (ETL) pipelines. This role involves close collaboration with ETL developers, data engineers, and business stakeholders to validate that ETL processes meet both technical and business requirements. The ideal candidate will apply their expertise in ETL testing, data validation, and performance tuning to ensure seamless data integration and quality across systems.
Key Responsibilities:
- Design, develop, and execute test cases for ETL processes to validate data accuracy, transformation logic, and source-to-target mappings.
- Validate data migration from source to destination (data warehouses, data lakes, etc.) for consistency, completeness, and accuracy, ensuring data integrity across systems.
- Use SQL and other data verification tools to test and validate large data sets, optimizing SQL queries, stored procedures, and views for performance improvement.
- Automate repetitive ETL and data validation tasks using appropriate tools (e.g., Python, PySpark, Pandas, or ETL-specific tools), reducing manual effort.
- Validate data models and entity relationship diagrams to ensure alignment with business requirements and technical specifications.
- Perform data profiling, quality checks, and statistical analysis on structured and unstructured data sets to identify patterns, trends, or anomalies that impact data quality.
- Apply data masking methodologies to protect sensitive data during the testing process.
- Collaborate with ETL developers, data engineers, and other stakeholders to troubleshoot issues, identify root causes, and resolve data and pipeline problems.
- Document and track defects in ETL processes, communicating testing progress, issues, and results clearly to cross-functional teams and ensuring timely resolution.
- Leverage knowledge of metadata management tools to support data governance and ensure comprehensive documentation of data assets.
- Analyze and resolve data discrepancies, providing recommendations to optimize ETL processes and improve overall system performance.
- Multi-task and adapt quickly to changes while maintaining urgency in completing assigned tasks.
- Stay updated on ETL best practices, tools, and technologies, applying them to enhance testing efficiency, data integration, and quality across the organization.
Skill Sets:
Mandatory Skills:
Programming Languages: Python, SQL, PySpark, Pandas
Data Warehousing: Snowflake, Redshift, Google BigQuery, RDS (any one)
ETL Tools: Azure Data Factory, AWS Glue, Informatica (any one)
Other Skills: Strong communication skills, team handling experience
Add-On Skills:
Data Validation Tools: Informatica Data Validation
Cloud Services: AWS Lambda
Data Transformation Tools: dbt (Data Build Tool)
BI Tools: Power BI, Tableau (or any other relevant tool)
If this excites you, please fill out the form to start the application process and we will be in touch.