Browse DBS Questions
Study all 100 questions at your own pace with detailed explanations
Total: 100 questions · Page 9 of 10
Question 81 of 100
An organization is designing an application architecture. The application will have over 100 TB of data and will support transactions that arrive at rates from hundreds per second to tens of thousands per second, depending on the day of the week and time of the day. All transaction data must be durably and reliably stored. Certain read operations must be performed with strong consistency. Which solution meets these requirements?
A. Use Amazon DynamoDB as the data store and use strongly consistent reads when necessary.
B. Use an Amazon Relational Database Service (RDS) instance sized to meet the maximum anticipated transaction rate and with the High Availability option enabled.
C. Deploy a NoSQL data store on top of an Amazon Elastic MapReduce (EMR) cluster, and select the HDFS High Durability option.
D. Use Amazon Redshift with synchronous replication to Amazon Simple Storage Service (S3) and row-level locking for strong consistency.
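For illustration, a minimal boto3 sketch of the kind of strongly consistent read option A refers to; the table name and key schema are hypothetical:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Transactions")  # hypothetical table name

# Default reads are eventually consistent; ConsistentRead=True returns the
# most recent committed value, at roughly double the read capacity cost.
response = table.get_item(
    Key={"TransactionId": "txn-0001"},  # hypothetical key schema
    ConsistentRead=True,
)
print(response.get("Item"))
```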
Question 82 of 100
A data engineer wants to use Amazon Elastic MapReduce (EMR) for an application and needs to make sure the application complies with regulatory requirements. The auditor must be able to confirm, at any point, which servers are running and which network access controls are deployed. Which action should the data engineer take to meet this requirement?
A. Provide the auditor IAM accounts with the SecurityAudit policy attached to their group.
B. Provide the auditor with SSH keys for access to the Amazon EMR cluster.
C. Provide the auditor with CloudFormation templates.
D. Provide the auditor with access to AWS Direct Connect to use their existing tools.
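A minimal sketch of attaching the AWS managed SecurityAudit policy to an auditors' group with boto3; the group name is hypothetical:

```python
import boto3

iam = boto3.client("iam")

# Hypothetical group for the auditors.
iam.create_group(GroupName="Auditors")

# SecurityAudit is an AWS managed policy granting read-only access to
# security-relevant configuration (running instances, security groups, etc.).
iam.attach_group_policy(
    GroupName="Auditors",
    PolicyArn="arn:aws:iam::aws:policy/SecurityAudit",
)
```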
Question 83 of 100
You are using IoT sensors to monitor the movement of a group of hikers on a three-day trek, sending the information into a Kinesis stream. Each hiker has a sensor in their shoe, and you know for certain that there is no problem with mobile coverage, so all the data is reaching the stream. You have used the default settings for the stream. At the end of the third day the data is sent to an S3 bucket, but when you go to interpret the data in S3 there is only data for the last day and nothing for the first two days. Which of the following is the most probable cause?
A. Temporary loss of mobile coverage; although mobile coverage was good in the area, even a temporary loss of data will stop the streaming.
B. You cannot send Kinesis data to the same bucket on consecutive days if you do not have versioning enabled on the bucket. Without versioning you would need to define three different buckets, or else the data is overwritten each day.
C. Data records are only accessible for a default of 24 hours from the time they are added to a stream.
D. A sensor probably stopped working on the second day. If one sensor fails, no data is sent to the stream until that sensor is fixed.
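By default, Kinesis records expire after 24 hours. A minimal boto3 sketch of extending the retention period so earlier days remain readable; the stream name and retention value are illustrative:

```python
import boto3

kinesis = boto3.client("kinesis")

# Default retention is 24 hours; extend it before the trek so records from
# day 1 are still readable on day 3. Stream name is hypothetical.
kinesis.increase_stream_retention_period(
    StreamName="hiker-telemetry",
    RetentionPeriodHours=72,
)
```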
Question 84 of 100
A research scientist is planning for the one-time launch of an Elastic MapReduce cluster and is encouraged by her manager to minimize the costs. The cluster is designed to ingest 200TB of genomics data with a total of 100 Amazon EC2 instances and is expected to run for around four hours. The resulting data set must be stored temporarily until archived into an Amazon RDS Oracle instance. Which option will help save the most money while meeting requirements?
A. Store ingest and output files in Amazon S3. Deploy on-demand for the master and core nodes and spot for the task nodes.
B. Optimize by deploying a combination of on-demand, RI, and spot pricing models for the master, core, and task nodes. Store ingest and output files in Amazon S3 with a lifecycle policy that archives them to Amazon Glacier.
C. Store the ingest files in Amazon S3 RRS and store the output files in S3. Deploy Reserved Instances for the master and core nodes and on-demand for the task nodes.
D. Deploy on-demand master, core, and task nodes and store ingest and output files in Amazon S3 RRS.
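A hedged sketch of the instance-market mix described in option A, using boto3's run_job_flow; instance types, counts, roles, log bucket, and the bid price are placeholders:

```python
import boto3

emr = boto3.client("emr")

# One-time cluster: on-demand master/core, Spot task nodes, 100 instances total.
emr.run_job_flow(
    Name="genomics-one-time",
    ReleaseLabel="emr-6.15.0",
    LogUri="s3://my-emr-logs/",          # hypothetical bucket
    ServiceRole="EMR_DefaultRole",
    JobFlowRole="EMR_EC2_DefaultRole",
    Instances={
        "KeepJobFlowAliveWhenNoSteps": False,  # terminate when the work is done
        "InstanceGroups": [
            {"InstanceRole": "MASTER", "Market": "ON_DEMAND",
             "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"InstanceRole": "CORE", "Market": "ON_DEMAND",
             "InstanceType": "r5.4xlarge", "InstanceCount": 20},
            {"InstanceRole": "TASK", "Market": "SPOT",
             "InstanceType": "r5.4xlarge", "InstanceCount": 79,
             "BidPrice": "0.50"},              # illustrative bid
        ],
    },
)
```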
Question 85 of 100
A company needs to deploy a data lake solution for their data scientists in which all company data is accessible and stored in a central S3 bucket. The company segregates the data by business unit, using specific prefixes. Scientists can only access the data from their own business unit. The company needs a single sign-on identity and management solution based on Microsoft Active Directory (AD) to manage access to the data in Amazon S3. Which method meets these requirements?
A. Use AWS IAM Federation functions and specify the associated role based on the users' groups in AD.
B. Create bucket policies that only allow access to the authorized prefixes based on the users' group name in Active Directory.
C. Deploy the AD Synchronization service to create AWS IAM users and groups based on AD information.
D. Use Amazon S3 API integration with AD to impersonate the users on access in a transparent manner.
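One way to express per-business-unit access is an IAM policy scoped to a prefix, attached to the role that federated AD users assume; one such policy would exist per business unit. This is a sketch with a hypothetical bucket, prefix, and policy name:

```python
import json
import boto3

# Allow listing and reading only the "finance/" prefix of the central bucket.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::company-data-lake",
            "Condition": {"StringLike": {"s3:prefix": ["finance/*"]}},
        },
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::company-data-lake/finance/*",
        },
    ],
}

iam = boto3.client("iam")
iam.create_policy(
    PolicyName="finance-data-lake-access",
    PolicyDocument=json.dumps(policy_document),
)
```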
Question 86 of 100
A company has several teams of analysts. Each team of analysts has their own cluster. The teams need to run SQL queries using Hive, Spark-SQL, and Presto with Amazon EMR. The company needs to enable a centralized metadata layer to expose the Amazon S3 objects as tables to the analysts. Which approach meets the requirement for a centralized metadata layer?
A. EMRFS consistent view with a common Amazon DynamoDB table
B. Bootstrap action to change the Hive metastore to an Amazon RDS database
C. S3DistCp with the output manifest option to generate RDS DDL
D. Naming scheme support with automatic partition discovery from Amazon S3
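The exam-era answer points to a bootstrap action; on current EMR releases the equivalent is usually a hive-site configuration classification pointing every cluster at the same external metastore (or the AWS Glue Data Catalog). A sketch with placeholder endpoint, credentials, and database name:

```python
# Hive metastore pointed at an external MySQL-compatible RDS instance so all
# teams' clusters resolve the same S3-backed tables. Values are placeholders.
hive_metastore_config = [
    {
        "Classification": "hive-site",
        "Properties": {
            "javax.jdo.option.ConnectionURL":
                "jdbc:mysql://hive-metastore.example.us-east-1.rds.amazonaws.com:3306/hive?createDatabaseIfNotExist=true",
            "javax.jdo.option.ConnectionDriverName": "org.mariadb.jdbc.Driver",
            "javax.jdo.option.ConnectionUserName": "hive",
            "javax.jdo.option.ConnectionPassword": "example-password",
        },
    }
]
# Passed as Configurations=hive_metastore_config to emr.run_job_flow(...)
# for each team's cluster.
```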
Question 87 of 100
A company has two batch processing applications that consume financial data about the day's stock transactions. Each transaction needs to be stored durably, with a guarantee that a record is delivered to each application so the audit and billing batch processing applications can process the data. However, the two applications run separately, several hours apart, and need access to the same transaction information. After the transaction information for the day has been reviewed, it no longer needs to be stored. What is the best way to architect this application? Choose the correct answer from the options below.
A. Use SQS for storing the transaction messages. When the billing batch process consumes each message, have the application create an identical message and place it in a different SQS queue for the audit application to use several hours later.
B. Use SQS for storing the transaction messages. The billing batch process runs first and consumes each message; write the code so that it does not remove the message after it is consumed, leaving it available for the audit application several hours later. The audit application can consume the SQS message and remove it from the queue when completed.
C. Store the transaction information in a DynamoDB table. The billing application can read the rows, while the audit application will read the rows and then remove the data.
D. Use Kinesis to store the transaction information. The billing application will consume data from the stream, and the audit application can consume the same data several hours later.
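A minimal boto3 sketch of the Kinesis approach: each record is written once, and each application reads it independently with its own shard iterator within the retention window (24 hours by default, which covers runs several hours apart). The stream name and payload are hypothetical, and real code would page through all shards and iterators:

```python
import json
import boto3

kinesis = boto3.client("kinesis")
STREAM = "stock-transactions"  # hypothetical stream name

# Producer: each transaction is written to the stream exactly once.
kinesis.put_record(
    StreamName=STREAM,
    Data=json.dumps({"ticker": "AMZN", "qty": 100, "price": 181.25}).encode("utf-8"),
    PartitionKey="AMZN",
)

# Consumer (billing or audit): each application keeps its own shard iterator,
# so both can read the same records independently.
shard_id = kinesis.describe_stream(StreamName=STREAM)["StreamDescription"]["Shards"][0]["ShardId"]
iterator = kinesis.get_shard_iterator(
    StreamName=STREAM, ShardId=shard_id, ShardIteratorType="TRIM_HORIZON"
)["ShardIterator"]
records = kinesis.get_records(ShardIterator=iterator)["Records"]
```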
Question 88 of 100
Your website serves on-demand training videos to your workforce. Videos are uploaded monthly in high-resolution MP4 format. Your workforce is distributed globally, often on the move, and uses company-provided tablets that require the HTTP Live Streaming (HLS) protocol to watch a video. Your company has no video transcoding expertise and, if required, you might need to pay for a consultant. How do you implement the most cost-efficient architecture without compromising high availability and quality of video delivery?
A. Elastic Transcoder to transcode original high-resolution MP4 videos to HLS. S3 to host videos, with lifecycle management to archive original files to Glacier after a few days. CloudFront to serve HLS transcoded videos from S3.
B. A video transcoding pipeline running on EC2, using SQS to distribute tasks and Auto Scaling to adjust the number of nodes depending on the length of the queue. S3 to host videos, with lifecycle management to archive all files to Glacier after a few days. CloudFront to serve HLS transcoded videos from Glacier.
C. Elastic Transcoder to transcode original high-resolution MP4 videos to HLS. EBS volumes to host videos and EBS snapshots to incrementally back up original files after a few days. CloudFront to serve HLS transcoded videos from EC2.
D. A video transcoding pipeline running on EC2, using SQS to distribute tasks and Auto Scaling to adjust the number of nodes depending on the length of the queue. EBS volumes to host videos and EBS snapshots to incrementally back up original files after a few days. CloudFront to serve HLS transcoded videos from EC2.
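A sketch of the lifecycle part of option A, archiving the original MP4 uploads to Glacier after a few days while the HLS renditions stay in S3 for CloudFront; the bucket, prefix, and day count are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Transition only the original high-resolution uploads to Glacier.
s3.put_bucket_lifecycle_configuration(
    Bucket="training-videos",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-originals",
                "Filter": {"Prefix": "originals/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 7, "StorageClass": "GLACIER"}],
            }
        ]
    },
)
```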
Question 89 of 100
You need to create an Amazon Machine Learning model to predict how many inches of rain will fall in an area based on the historical rainfall data. What type of modeling will you use?
A. Categorical
B. Binary
C. Regression
D. Unsupervised
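In the (now legacy) Amazon Machine Learning service this question targets, a numeric target such as inches of rainfall maps to a REGRESSION model type. A sketch with placeholder IDs and an assumed pre-existing training datasource:

```python
import boto3

ml = boto3.client("machinelearning")

# The datasource (historical rainfall data) is assumed to exist already.
ml.create_ml_model(
    MLModelId="rainfall-regression-001",
    MLModelName="rainfall-inches-predictor",
    MLModelType="REGRESSION",
    TrainingDataSourceId="ds-rainfall-history",
)
```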
Question 90 of 100
A company has launched an EMR cluster to support their big data analytics requirements. The company has multiple data sources built on S3, SQL databases, MongoDB, Redis, RDS, and other file systems. They are looking for a web application to create and share documents that contain live code, equations, visualizations, and narrative text. Which EMR Hadoop ecosystem application fulfils the requirements?
A. Apache Hive
B. Apache Hue
C. Jupyter Notebook
D. Apache Presto
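For context, the kind of cell a Jupyter notebook on EMR supports: live code plus an inline visualization, with equations and narrative in surrounding markdown cells. The bucket and column names below are made up:

```python
# Requires pandas, matplotlib, and s3fs for direct s3:// reads.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("s3://company-analytics/daily_metrics.csv")  # hypothetical object
df.groupby("business_unit")["revenue"].sum().plot(kind="bar")
plt.title("Revenue by business unit")
plt.show()
```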