[2021] Verified MLS-C01 Dumps Q&As - 1 Year Free & Quick Updates [Q12-Q29]

Latest 2021 Realistic Verified MLS-C01 Dumps - 100% Free MLS-C01 Exam Dumps


How are exam results reported for AWS Certified Machine Learning - Specialty?

The examination is a pass or fail exam. The examination is scored against a minimum standard established by AWS professionals who are guided by certification industry best practices and guidelines. Your results for the examination are reported as a score from 100 to 1,000, with a minimum passing score of 750. Your score shows how you performed on the examination as a whole and whether or not you passed. Scaled scoring models are used to equate scores across multiple exam forms that may have slightly different difficulty levels. Your score report contains a table of classifications of your performance at each section level. This information is designed to provide general feedback concerning your examination performance. The examination uses a compensatory scoring model, which means that you do not need to pass the individual sections, only the overall examination. Each section of the examination has a specific weighting, so some sections have more questions than others.
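As a rough illustration of what "compensatory" means, the sketch below compares a weighted overall result with a stricter rule that would require passing every section. The section names, weights, per-section results, and pass mark are all made-up values for illustration; AWS does not publish its scaling formula, so a plain weighted average stands in for it here.

```python
# Hypothetical weights and fraction-correct results (not real exam data).
sections = {
    "Section 1": {"weight": 0.25, "score": 0.60},
    "Section 2": {"weight": 0.25, "score": 0.80},
    "Section 3": {"weight": 0.30, "score": 0.90},
    "Section 4": {"weight": 0.20, "score": 0.70},
}
PASS_MARK = 0.75  # hypothetical cut-off on a 0-1 scale


def overall_score(sections):
    """Weighted average of the per-section results."""
    return sum(s["weight"] * s["score"] for s in sections.values())


# Compensatory rule: only the overall result has to clear the pass mark.
passes_compensatory = overall_score(sections) >= PASS_MARK

# Stricter alternative (not the rule described above): every section must pass.
passes_every_section = all(s["score"] >= PASS_MARK for s in sections.values())

print(round(overall_score(sections), 3), passes_compensatory, passes_every_section)
```

In this toy case, strong results in the heavier sections offset two weak sections, so the candidate passes overall even though the per-section rule would fail them.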

 

NEW QUESTION 12
A Machine Learning Specialist deployed a model that provides product recommendations on a company's website. Initially, the model was performing very well and resulted in customers buying more products on average. However, within the past few months, the Specialist has noticed that the effect of product recommendations has diminished and customers are starting to return to their original habits of spending less. The Specialist is unsure of what happened, as the model has not changed from its initial deployment over a year ago.
Which method should the Specialist try to improve model performance?

  • A. The model's hyperparameters should be periodically updated to prevent drift
  • B. The model needs to be completely re-engineered because it is unable to handle product inventory changes
  • C. The model should be periodically retrained using the original training data plus new data as product inventory changes
  • D. The model should be periodically retrained from scratch using the original data while adding a regularization term to handle product inventory changes

Answer: C
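A toy sketch of the idea behind option C, assuming a deliberately simple popularity-based recommender (the product names, purchase logs, and helper functions are hypothetical and not from the question). A model fitted only on the original data has nothing useful to say about newer inventory, while one refitted on the original plus new data does.

```python
from collections import Counter


def train(purchases):
    """Fit a toy popularity model: count how often each product was bought."""
    return Counter(purchases)


def recommend(model, in_stock, k=2):
    """Return the k most-purchased products that are currently in stock."""
    ranked = [item for item, _ in model.most_common() if item in in_stock]
    return ranked[:k]


# Hypothetical purchase logs: the inventory changed after the first deployment.
original_data = ["boots", "boots", "scarf", "boots", "scarf", "hat"]
new_data = ["sandals"] * 4 + ["sunglasses"] * 3 + ["hat"]
in_stock_now = {"sandals", "sunglasses", "hat"}

stale_model = train(original_data)                 # never retrained
retrained_model = train(original_data + new_data)  # original plus new data

stale_recs = recommend(stale_model, in_stock_now)
fresh_recs = recommend(retrained_model, in_stock_now)
print(stale_recs, fresh_recs)
```

The stale model can only suggest the one old product that is still stocked, whereas the retrained model ranks the newer products that customers are actually buying.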

 

NEW QUESTION 13
During mini-batch training of a neural network for a classification problem, a Data Scientist notices that training accuracy oscillates.
What is the MOST likely cause of this issue?

  • A. The learning rate is very high
  • B. The class distribution in the dataset is imbalanced
  • C. The batch size is too big
  • D. Dataset shuffling is disabled

Answer: A
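The effect described in option A can be seen in a minimal sketch: plain gradient descent on the one-dimensional function f(x) = x², which is a stand-in chosen for illustration and not the neural network from the question. With a small step size the iterate shrinks steadily; with a step size that is too large it overshoots the minimum on every step and bounces from one side to the other.

```python
def descend(lr, steps=6, x=1.0):
    """Run plain gradient descent on f(x) = x**2 and return the iterates."""
    path = [x]
    for _ in range(steps):
        grad = 2.0 * x        # derivative of x**2
        x = x - lr * grad     # standard gradient step
        path.append(x)
    return path


def sign_flips(path):
    """Count how often consecutive iterates land on opposite sides of 0."""
    return sum(1 for a, b in zip(path, path[1:]) if a * b < 0)


small = descend(lr=0.1)    # each step multiplies x by 0.8
large = descend(lr=0.95)   # each step multiplies x by -0.9, so the sign flips

print(sign_flips(small), sign_flips(large))
```

In a real network the loss surface is far more complicated, but the same overshooting is a common reason for a training metric that swings up and down between mini-batches.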

 

NEW QUESTION 14
A Machine Learning Specialist uploads a dataset to an Amazon S3 bucket protected with server-side encryption using AWS KMS.
How should the ML Specialist define the Amazon SageMaker notebook instance so it can read the same dataset from Amazon S3?

  • A. Configure the Amazon SageMaker notebook instance to have access to the VPC. Grant permission in the KMS key policy to the notebook's KMS role.
  • B. Define security group(s) to allow all HTTP inbound/outbound traffic and assign those security group(s) to the Amazon SageMaker notebook instance.
  • C. Assign the same KMS key used to encrypt data in Amazon S3 to the Amazon SageMaker notebook instance.
  • D. Assign an IAM role to the Amazon SageMaker notebook with S3 read access to the dataset. Grant permission in the KMS key policy to that role.

Answer: D

Explanation:
https://docs.aws.amazon.com/sagemaker/latest/dg/encryption-at-rest.html
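Option D describes two separate permissions: S3 read access on the notebook's IAM role, and a grant to that role in the KMS key policy. A minimal sketch of what the two policy fragments might look like is shown below as Python dictionaries. The bucket name, account ID, and role name are hypothetical placeholders, and a real key policy would contain further statements (for example, for key administration).

```python
import json

# Hypothetical placeholders, not values from the question.
BUCKET = "example-dataset-bucket"
ROLE_ARN = "arn:aws:iam::111122223333:role/ExampleNotebookRole"

# Identity policy attached to the notebook's IAM role: S3 read access.
role_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        },
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": f"arn:aws:s3:::{BUCKET}",
        },
    ],
}

# Statement added to the KMS key policy: lets that role decrypt the objects.
key_policy_statement = {
    "Sid": "AllowNotebookRoleToDecrypt",
    "Effect": "Allow",
    "Principal": {"AWS": ROLE_ARN},
    "Action": ["kms:Decrypt"],
    "Resource": "*",
}

print(json.dumps(role_policy, indent=2))
print(json.dumps(key_policy_statement, indent=2))
```

Both pieces are needed because S3 has to call KMS on the caller's behalf to decrypt an SSE-KMS object; S3 access alone is not enough.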

 

NEW QUESTION 15
A data scientist has developed a machine learning translation model for English to Japanese by using Amazon SageMaker's built-in seq2seq algorithm with 500,000 aligned sentence pairs. While testing with sample sentences, the data scientist finds that the translation quality is reasonable for an example as short as five words. However, the quality becomes unacceptable if the sentence is 100 words long.
Which action will resolve the problem?

  • A. Change preprocessing to use n-grams.
  • B. Adjust hyperparameters related to the attention mechanism.
  • C. Choose a different weight initialization type.
  • D. Add more nodes to the recurrent neural network (RNN) than the largest sentence's word count.

Answer: B

 

NEW QUESTION 16
A gaming company has launched an online game where people can start playing for free, but they need to pay if they choose to use certain features. The company needs to build an automated system to predict whether or not a new user will become a paid user within 1 year. The company has gathered a labeled dataset from 1 million users. The training dataset consists of 1,000 positive samples (from users who ended up paying within 1 year) and 999,000 negative samples (from users who did not use any paid features). Each data sample consists of 200 features including user age, device, location, and play patterns. Using this dataset for training, the Data Science team trained a random forest model that converged with over 99% accuracy on the training set. However, the prediction results on a test dataset were not satisfactory.
Which of the following approaches should the Data Science team take to mitigate this issue? (Select TWO.)

  • A. Change the cost function so that false positives have a higher impact on the cost value than false negatives
  • B. Add more deep trees to the random forest to enable the model to learn more features.
  • C. Change the cost function so that false negatives have a higher impact on the cost value than false positives
  • D. Include a copy of the samples in the test dataset in the training dataset.
  • E. Generate more positive samples by duplicating the positive samples and adding a small amount of noise to the duplicated data.

Answer: C,E
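The class counts in the question explain why 99% accuracy is misleading, and options C and E can be sketched in a few lines. The example below uses only the 1,000 / 999,000 split from the question; the cost weights, the noise scale, and the two-feature sample vectors are hypothetical values chosen for illustration.

```python
import random

n_pos, n_neg = 1_000, 999_000

# A model that always predicts "negative" is right on every negative sample,
# so it scores 99.9% accuracy while finding none of the paying users.
accuracy_all_negative = n_neg / (n_pos + n_neg)
recall_all_negative = 0 / n_pos


def weighted_errors(fn, fp, fn_weight=50.0, fp_weight=1.0):
    """Option C in spirit: a missed payer (false negative) costs more than a false alarm."""
    return fn_weight * fn + fp_weight * fp


def oversample_with_noise(samples, copies, scale=0.01, seed=0):
    """Option E in spirit: duplicate each feature vector, adding small Gaussian jitter."""
    rng = random.Random(seed)
    out = list(samples)
    for vec in samples:
        for _ in range(copies):
            out.append([v + rng.gauss(0.0, scale) for v in vec])
    return out


positives = [[25.0, 3.5], [31.0, 1.2]]   # made-up two-feature examples
augmented = oversample_with_noise(positives, copies=3)

print(accuracy_all_negative, recall_all_negative, len(augmented))
```

Copying test samples into the training set, by contrast, would only leak the test data and inflate the reported metrics without improving the model.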

 

NEW QUESTION 17
A company has raw user and transaction data stored in Amazon S3, a MySQL database, and Amazon Redshift. A Data Scientist needs to perform an analysis by joining the three datasets from Amazon S3, MySQL, and Amazon Redshift, and then calculating the average of a few selected columns from the joined data.
Which AWS service should the Data Scientist use?

  • A. Amazon Athena
  • B. AWS Glue
  • C. Amazon QuickSight
  • D. Amazon Redshift Spectrum

Answer: A

 

NEW QUESTION 18
A Machine Learning Specialist working for an online fashion company wants to build a data ingestion solution for the company's Amazon S3-based data lake.
The Specialist wants to create a set of ingestion mechanisms that will enable future capabilities comprised of:
* Real-time analytics
* Interactive analytics of historical data
* Clickstream analytics
* Product recommendations
Which services should the Specialist use?

  • A. AWS Glue as the data catalog; Amazon Kinesis Data Streams and Amazon Kinesis Data Analytics for real-time data insights; Amazon Kinesis Data Firehose for delivery to Amazon ES for clickstream analytics; Amazon EMR to generate personalized product recommendations
  • B. Amazon Athena as the data catalog; Amazon Kinesis Data Streams and Amazon Kinesis Data Analytics for near-real-time data insights; Amazon Kinesis Data Firehose for clickstream analytics; AWS Glue to generate personalized product recommendations
  • C. AWS Glue as the data catalog; Amazon Kinesis Data Streams and Amazon Kinesis Data Analytics for historical data insights; Amazon Kinesis Data Firehose for delivery to Amazon ES for clickstream analytics; Amazon EMR to generate personalized product recommendations
  • D. Amazon Athena as the data catalog; Amazon Kinesis Data Streams and Amazon Kinesis Data Analytics for historical data insights; Amazon DynamoDB streams for clickstream analytics; AWS Glue to generate personalized product recommendations

Answer: A

 

NEW QUESTION 19
IT leadership wants to transition a company's existing machine learning data storage environment to AWS as a temporary ad hoc solution. The company currently uses a custom software process that heavily leverages SQL as a query language and exclusively stores generated CSV documents for machine learning. The ideal state for the company would be a solution that allows it to continue to use the current workforce of SQL experts. The solution must also support the storage of CSV and JSON files, and be able to query over semi-structured data. The following are high priorities for the company:
* Solution simplicity
* Fast development time
* Low cost
* High flexibility
What technologies meet the company's requirements?

  • A. Amazon Redshift and AWS Glue
  • B. Amazon S3 and Amazon Athena
  • C. Amazon DynamoDB and DynamoDB Accelerator (DAX)
  • D. Amazon RDS and Amazon ES

Answer: B

 

NEW QUESTION 20
A Machine Learning Specialist kicks off a hyperparameter tuning job for a tree-based ensemble model using Amazon SageMaker with Area Under the ROC Curve (AUC) as the objective metric. This workflow will eventually be deployed in a pipeline that retrains and tunes hyperparameters each night to model click-through on data that goes stale every 24 hours. With the goal of decreasing the amount of time it takes to train these models, and ultimately to decrease costs, the Specialist wants to reconfigure the input hyperparameter range(s).
Which visualization will accomplish this?

  • A. A scatter plot showing the performance of the objective metric over each training iteration.
  • B. A scatter plot with points colored by target variable that uses t-Distributed Stochastic Neighbor Embedding (t-SNE) to visualize the large number of input variables in an easier-to-read dimension.
  • C. A scatter plot showing the correlation between maximum tree depth and the objective metric.
  • D. A histogram showing whether the most important input feature is Gaussian.

Answer: C

 

NEW QUESTION 21
A Machine Learning Specialist discovers the following statistics while experimenting on a model.

What can the Specialist conclude from the experiments?

  • A. The model in Experiment 1 had a high bias error that was reduced in Experiment 3 by regularization. Experiment 2 shows that there is minimal variance error in Experiment 1.
  • B. The model in Experiment 1 had a high variance error that was reduced in Experiment 3 by regularization. Experiment 2 shows that there is minimal bias error in Experiment 1.
  • C. The model in Experiment 1 had a high random noise error that was reduced in Experiment 3 by regularization. Experiment 2 shows that random noise cannot be reduced by increasing layers and neurons in the model.
  • D. The model in Experiment 1 had a high bias error and a high variance error that were reduced in Experiment 3 by regularization. Experiment 2 shows that high bias cannot be reduced by increasing layers and neurons in the model.

Answer: D

 

NEW QUESTION 22
A Machine Learning Specialist is using an Amazon SageMaker notebook instance in a private subnet of a corporate VPC. The ML Specialist has important data stored on the Amazon SageMaker notebook instance's Amazon EBS volume, and needs to take a snapshot of that EBS volume. However, the ML Specialist cannot find the Amazon SageMaker notebook instance's EBS volume or Amazon EC2 instance within the VPC.
Why is the ML Specialist not seeing the instance visible in the VPC?

  • A. Amazon SageMaker notebook instances are based on AWS ECS instances running within AWS service accounts.
  • B. Amazon SageMaker notebook instances are based on EC2 instances running within AWS service accounts.
  • C. Amazon SageMaker notebook instances are based on the Amazon ECS service within customer accounts.
  • D. Amazon SageMaker notebook instances are based on the EC2 instances within the customer account, but they run outside of VPCs.

Answer: B

Explanation:
https://docs.aws.amazon.com/sagemaker/latest/dg/gs-setup-working-env.html

 

NEW QUESTION 23
A company's Machine Learning Specialist needs to improve the training speed of a time-series forecasting model using TensorFlow. The training is currently implemented on a single-GPU machine and takes approximately 23 hours to complete. The training needs to be run daily.
The model accuracy is acceptable, but the company anticipates a continuous increase in the size of the training data and a need to update the model on an hourly, rather than a daily, basis. The company also wants to minimize coding effort and infrastructure changes.
What should the Machine Learning Specialist do to the training solution to allow it to scale for future demand?

  • A. Move the training to Amazon EMR and distribute the workload to as many machines as needed to achieve the business goals.
  • B. Change the TensorFlow code to implement a Horovod distributed framework supported by Amazon SageMaker. Parallelize the training to as many machines as needed to achieve the business goals.
  • C. Do not change the TensorFlow code. Change the machine to one with a more powerful GPU to speed up the training.
  • D. Switch to using a built-in Amazon SageMaker DeepAR model. Parallelize the training to as many machines as needed to achieve the business goals.

Answer: B

 

NEW QUESTION 24
An insurance company needs to automate claim compliance reviews because human reviews are expensive and error-prone. The company has a large set of claims and a compliance label for each.
Each claim consists of a few sentences in English, many of which contain complex related information. Management would like to use Amazon SageMaker built-in algorithms to design a machine learning supervised model that can be trained to read each claim and predict if the claim is compliant or not.
Which approach should be used to extract features from the claims to be used as inputs for the downstream supervised task?

  • A. Apply Amazon SageMaker Object2Vec to claims in the training set. Send the derived features space as inputs for the downstream supervised task.
  • B. Apply Amazon SageMaker BlazingText in Word2Vec mode to claims in the training set. Send the derived features space as inputs for the downstream supervised task.
  • C. Apply Amazon SageMaker BlazingText in classification mode to labeled claims in the training set to derive features for the claims that correspond to the compliant and non-compliant labels, respectively.
  • D. Derive a dictionary of tokens from claims in the entire dataset. Apply one-hot encoding to tokens found in each claim of the training set. Send the derived features space as inputs to an Amazon SageMaker built-in supervised learning algorithm.

Answer: A

Explanation:
Amazon SageMaker Object2Vec generalizes the Word2Vec embedding technique for words to more complex objects, such as sentences and paragraphs. Since the supervised learning task is at the level of whole claims, for which there are labels, and no labels are available at the word level, Object2Vec needs to be used instead of Word2Vec.

 

NEW QUESTION 25
A Data Scientist needs to create a serverless ingestion and analytics solution for high-velocity, real-time streaming data.
The ingestion process must buffer and convert incoming records from JSON to a query-optimized, columnar format without data loss. The output datastore must be highly available, and Analysts must be able to run SQL queries against the data and connect to existing business intelligence dashboards.
Which solution should the Data Scientist build to satisfy the requirements?

  • A. Write each JSON record to a staging location in Amazon S3. Use the S3 Put event to trigger an AWS Lambda function that transforms the data into Apache Parquet or ORC format and writes the data to a processed data location in Amazon S3. Have the Analysts query the data directly from Amazon S3 using Amazon Athena, and connect to BI tools using the Athena Java Database Connectivity (JDBC) connector.
  • B. Write each JSON record to a staging location in Amazon S3. Use the S3 Put event to trigger an AWS Lambda function that transforms the data into Apache Parquet or ORC format and inserts it into an Amazon RDS PostgreSQL database. Have the Analysts query and run dashboards from the RDS database.
  • C. Use Amazon Kinesis Data Analytics to ingest the streaming data and perform real-time SQL queries to convert the records to Apache Parquet before delivering to Amazon S3. Have the Analysts query the data directly from Amazon S3 using Amazon Athena and connect to BI tools using the Athena Java Database Connectivity (JDBC) connector.
  • D. Create a schema in the AWS Glue Data Catalog of the incoming data format. Use an Amazon Kinesis Data Firehose delivery stream to stream the data and transform the data to Apache Parquet or ORC format using the AWS Glue Data Catalog before delivering to Amazon S3. Have the Analysts query the data directly from Amazon S3 using Amazon Athena, and connect to BI tools using the Athena Java Database Connectivity (JDBC) connector.

Answer: D

 

NEW QUESTION 26
An Amazon SageMaker notebook instance is launched into Amazon VPC. The SageMaker notebook references data contained in an Amazon S3 bucket in another account. The bucket is encrypted using SSE-KMS. The instance returns an access denied error when trying to access data in Amazon S3.
Which of the following are required to access the bucket and avoid the access denied error? (Select THREE.)

  • A. An IAM role that allows access to the specific S3 bucket
  • B. An AWS KMS key policy that allows access to the customer master key (CMK)
  • C. A SageMaker notebook subnet ACL that allows traffic to Amazon S3.
  • D. A SageMaker notebook security group that allows access to Amazon S3
  • E. An S3 bucket owner that matches the notebook owner
  • F. A permissive S3 bucket policy

Answer: A,B,C

 

NEW QUESTION 27
A Data Science team is designing a dataset repository where it will store a large amount of training data commonly used in its machine learning models. As Data Scientists may create an arbitrary number of new datasets every day, the solution has to scale automatically and be cost-effective. Also, it must be possible to explore the data using SQL.
Which storage scheme is MOST adapted to this scenario?

  • A. Store datasets as files in an Amazon EBS volume attached to an Amazon EC2 instance.
  • B. Store datasets as tables in a multi-node Amazon Redshift cluster.
  • C. Store datasets as files in Amazon S3.
  • D. Store datasets as global tables in Amazon DynamoDB.

Answer: C

 

NEW QUESTION 28
A Machine Learning Specialist is preparing data for training on Amazon SageMaker. The data is transformed into a numpy.array, which appears to be negatively affecting the speed of the training.
What should the Specialist do to optimize the data for training on SageMaker?

  • A. Use the SageMaker batch transform feature to transform the training data into a DataFrame
  • B. Use AWS Glue to compress the data into the Apache Parquet format
  • C. Transform the dataset into the RecordIO protobuf format
  • D. Use the SageMaker hyperparameter optimization feature to automatically optimize the data

Answer: C

 

NEW QUESTION 29
......

MLS-C01 Dumps PDF and Test Engine Exam Questions: https://certtree.2pass4sure.com/AWS-Certified-Specialty/MLS-C01-actual-exam-braindumps.html