Become Amazon Certified with updated Data-Engineer-Associate exam questions and correct answers
A company specializing in geospatial data analysis needs to regularly import, store, and query large sets of geospatial data. Their current traditional SQL database struggles with the performance of complex spatial queries and with storing the data efficiently. Their Data Engineering Team is considering moving to a cloud-based solution that can optimize the storage and querying of geospatial data, particularly for location-based queries and spatial analysis.
Which AWS solution would best meet their needs?
A data engineer has two datasets that contain sales information for multiple cities and states. One dataset is named reference, and the other dataset is named primary. The data engineer needs a solution to determine whether a specific set of values in the city and state columns of the primary dataset exactly match the same specific values in the reference dataset. The data engineer wants to use Data Quality Definition Language (DQDL) rules in an AWS Glue Data Quality job. Which rule will meet these requirements?
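For orientation, DQDL rules are single-line expressions of the form RuleType plus its parameters. A cross-dataset check like the one described would use DQDL's ReferentialIntegrity rule type; the sketch below is an illustrative fragment based on the documented shape of that rule, where "reference" is the alias of the second dataset and the = 1.0 threshold requires a 100 percent match:

```
Rules = [
    ReferentialIntegrity "city,state" "reference.{city,state}" = 1.0
]
```

Here the first argument names the columns in the primary dataset, the second names the corresponding columns in the reference dataset, and the threshold expresses the required overlap fraction.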
A data engineer wants to orchestrate a set of extract, transform, and load (ETL) jobs that run on AWS. The ETL jobs contain tasks that must run Apache Spark jobs on Amazon EMR, make API calls to Salesforce, and load data into Amazon Redshift. The ETL jobs need to handle failures and retries automatically. The data engineer needs to use Python to orchestrate the jobs. Which service will meet these requirements?
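The key requirement here, automatic failure handling and retries around each task, is something Python-based orchestrators express declaratively (for example, per-task retry policies in Apache Airflow). As a minimal, framework-free sketch of that retry behavior, the helper below (all names are illustrative, not from any AWS SDK) retries a task callable a fixed number of times before giving up:

```python
import time


def with_retries(task, attempts=3, delay=0.0):
    """Call task(), retrying up to `attempts` times on any exception."""
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == attempts:
                raise  # exhausted retries: surface the failure
            time.sleep(delay)  # back off before the next attempt


# Hypothetical flaky task that succeeds on the third call.
calls = {"n": 0}


def flaky_load():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "loaded"


result = with_retries(flaky_load)
print(result)  # → loaded
```

An orchestrator does essentially this per task, but also records each attempt, supports backoff schedules, and resumes downstream tasks only after success.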
A company has a JSON file that contains personally identifiable information (PII) data and non-PII data. The company needs to make the data available for querying and analysis. The non-PII data must be available to everyone in the company. The PII data must be available only to a limited group of employees. Which solution will meet these requirements with the LEAST operational overhead?
A company receives test results from testing facilities that are located around the world. The company stores the test results in millions of 1 KB JSON files in an Amazon S3 bucket. A data engineer needs to process the files, convert them into Apache Parquet format, and load them into Amazon Redshift tables. The data engineer uses AWS Glue to process the files, AWS Step Functions to orchestrate the processes, and Amazon EventBridge to schedule jobs. The company recently added more testing facilities. The time required to process files is increasing. The data engineer must reduce the data processing time. Which solution will MOST reduce the data processing time?
© Copyrights DumpsCertify 2025. All Rights Reserved