Free CCA-500 Exam Dumps

No Installation Required, Instantly Prepare for the CCA-500 exam and please click the below link to start the CCA-500 Exam Simulator with a real CCA-500 practice exam questions.
Use directly our on-line CCA-500 exam dumps materials and try our Testing Engine to pass the CCA-500 which is always updated.

Exam Code: CCA-500
Exam Title: Cloudera Certified Administrator for Apache Hadoop (CCAH)
Vendor: Cloudera
Exam Questions: 60
Last Updated: April 2nd,2025

Question 1

Given:
CCA-500 dumps exhibit
You want to clean up this list by removing jobs where the State is KILLED. What command you enter?

A. Yarn application –refreshJobHistory
B. Yarn application –kill application_1374638600275_0109
C. Yarn rmadmin –refreshQueue
D. Yarn rmadmin –kill application_1374638600275_0109

Correct Answer:B
Reference:http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1-latest/bk_using-apache-hadoop/content/common_mrv2_commands.html

Question 2

Which command does Hadoop offer to discover missing or corrupt HDFS data?

A. Hdfs fs –du
B. Hdfs fsck
C. Dskchk
D. The map-only checksum
E. Hadoop does not provide any tools to discover missing or corrupt data; there is not need because three replicas are kept for each data block

Correct Answer:B
Reference:https://twiki.grid.iu.edu/bin/view/Storage/HadoopRecovery

Question 3

A user comes to you, complaining that when she attempts to submit a Hadoop job, it fails. There is a Directory in HDFS named /data/input. The Jar is named j.jar, and the driver class is named DriverClass.
She runs the command:
Hadoop jar j.jar DriverClass /data/input/data/output The error message returned includes the line:
PriviligedActionException as:training (auth:SIMPLE) cause:org.apache.hadoop.mapreduce.lib.input.invalidInputException:
Input path does not exist: file:/data/input What is the cause of the error?

A. The user is not authorized to run the job on the cluster
B. The output directory already exists
C. The name of the driver has been spelled incorrectly on the command line
D. The directory name is misspelled in HDFS
E. The Hadoop configuration files on the client do not point to the cluster

Correct Answer:A

Question 4

You have a cluster running with a FIFO scheduler enabled. You submit a large job A to the cluster, which you expect to run for one hour. Then, you submit job B to the cluster, which you expect to run a couple of minutes only.
You submit both jobs with the same priority.
Which two best describes how FIFO Scheduler arbitrates the cluster resources for job and its tasks?(Choose two)

A. Because there is a more than a single job on the cluster, the FIFO Scheduler will enforce a limit on the percentage of resources allocated to a particular job at any given time
B. Tasks are scheduled on the order of their job submission
C. The order of execution of job may vary
D. Given job A and submitted in that order, all tasks from job A are guaranteed to finish before all tasks from job B
E. The FIFO Scheduler will give, on average, and equal share of the cluster resources over the job lifecycle
F. The FIFO Scheduler will pass an exception back to the client when Job B is submitted, since all slots on the cluster are use

Correct Answer:AD

Question 5

You want to understand more about how users browse your public website. For example, you want to know which pages they visit prior to placing an order. You have a server farm of 200 web servers hosting your website. Which is the most efficient process to gather these web server across logs into your Hadoop cluster analysis?

A. Sample the web server logs web servers and copy them into HDFS using curl
B. Ingest the server web logs into HDFS using Flume
C. Channel these clickstreams into Hadoop using Hadoop Streaming
D. Import all user clicks from your OLTP databases into Hadoop using Sqoop
E. Write a MapReeeduce job with the web servers for mappers and the Hadoop cluster nodes for reducers

Correct Answer:B
Apache Flume is a service for streaming logs into Hadoop.
Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of streaming data into the Hadoop Distributed File System (HDFS). It has a simple and flexible architecture based on streaming data flows; and is robust and fault tolerant with tunable reliability mechanisms for failover and recovery.