Hello friends, if you are looking for Bigdata Hadoop MCQ with Answers | Bigdata Hadoop Multiple Choice Questions | Bigdata Hadoop Objective Type Questions | Bigdata Hadoop Accenture Test Answers, you will find them here.
Q1. What is the default replication factor to achieve fault tolerance?
Ans – 3
Q2. HBase is ______ and defines only column families.
Ans – Schema-less
Q3. Which is a distributed machine learning framework on top of Spark?
Ans – ML, MLlib
Q4. What is the language in which Hadoop framework is written?
Ans – Java
Q5. How can we load data into Hive tables?
A.From files
B.From other tables
C.From databases
D.None of the above
Ans: ac
Q6. Which of the following is not a category of NoSQL db?
Ans – Scientific
Q7. The Shuffle and Sort process will happen
A.Before the Mapper Phase
B.After the Mapper Phase
C.After the Reducer Phase
D.Before the Reducer phase
Ans: bd
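Shuffle and sort sits between the map and reduce phases: the framework groups mapper output by key and sorts the keys before any reducer runs. The sketch below imitates that ordering in plain Python for a word count. It is only an illustration, not Hadoop code, and the function names and sample lines are made up for this example.

```python
from collections import defaultdict


def map_phase(lines):
    """Mapper: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)


def shuffle_and_sort(pairs):
    """Group mapper output by key and sort the keys.

    This stands in for the step the framework performs after the mappers
    finish and before the reducers start.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(grouped):
    """Reducer: sum the values collected for each key."""
    return [(key, sum(values)) for key, values in grouped]


# Hypothetical input lines, used only for this illustration.
lines = ["big data hadoop", "hadoop spark", "big data"]
word_counts = reduce_phase(shuffle_and_sort(map_phase(lines)))
print(word_counts)
```

The reducer only ever sees keys that are already grouped and sorted, which is why the shuffle and sort step has to finish for a key before that key can be reduced.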
Q8. Which datatype can be used to implement a bounded variable length character string?
Ans – varchar
Q9. Which file(s) is/are used for storing metadata in the NameNode?
Ans – editlogs, fsimage
Q10. Which of the following are machine learning algorithms? Select all that apply. (Check Box)
A.Classification
B.Regression
C.Clustering
D.None
Ans: abc
Q11. Which of the following statements is true about Apache Kafka?
Ans – Apache Kafka retains all published messages, whether or not they have been consumed, for a configurable retention period.
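To see why consumption does not remove messages, it helps to think of a topic as an append-only log where each consumer only tracks its own read position. The toy class below is a rough sketch of that idea in plain Python. It is not the Kafka client API, and TopicLog, publish and poll are hypothetical names chosen for this example.

```python
class TopicLog:
    """Toy append-only log, a hypothetical stand-in for one Kafka partition."""

    def __init__(self):
        self.records = []  # published records; reading never removes them
        self.offsets = {}  # consumer name -> next offset to read

    def publish(self, value):
        """Append a record to the end of the log."""
        self.records.append(value)

    def poll(self, consumer):
        """Return the records this consumer has not read yet and advance its offset."""
        start = self.offsets.get(consumer, 0)
        batch = self.records[start:]
        self.offsets[consumer] = len(self.records)
        return batch


log = TopicLog()
for reading in ["r1", "r2", "r3"]:
    log.publish(reading)

first = log.poll("consumer-a")   # consumer-a reads everything
second = log.poll("consumer-b")  # consumer-b still sees the same records
retained = len(log.records)      # nothing was deleted by either read
print(first, second, retained)
```

In this sketch only the per-consumer offset changes on a read. Real Kafka removes old records according to its retention settings, not because a consumer has read them.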
Q12. You have developed an application based on a windowing operation. If you have set the batch interval to 10 seconds and the window length to 60 seconds, how many RDDs will be processed in a window?
Ans – 6
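In Spark Streaming each batch interval produces one RDD, and the window length must be a multiple of the batch interval, so the number of RDDs in a window is the window length divided by the batch interval. The arithmetic for this question can be checked with a few lines of Python (the variable names are just for illustration):

```python
batch_interval_s = 10  # one RDD is produced every 10 seconds
window_length_s = 60   # each window covers the last 60 seconds

# Spark Streaming expects the window length to be a multiple of the batch interval.
if window_length_s % batch_interval_s != 0:
    raise ValueError("window length must be a multiple of the batch interval")

rdds_per_window = window_length_s // batch_interval_s
print(rdds_per_window)  # 6
```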
Q13. What will hdfs dfs -chmod 600 filename.txt result in?
Ans – Gives the owner read and write access, with no access for the group or others
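The octal mode 600 can be decoded digit by digit: 6 (read + write) for the owner, 0 for the group and 0 for others. HDFS uses the same octal notation as POSIX, so as a quick sanity check the standard Python stat module can decode the bits. This only decodes the number; it does not touch HDFS.

```python
import stat

mode = 0o600

# S_IFREG marks a regular file so that filemode() renders a leading "-".
symbolic = stat.filemode(stat.S_IFREG | mode)

owner_can_read = bool(mode & stat.S_IRUSR)
owner_can_write = bool(mode & stat.S_IWUSR)
owner_can_execute = bool(mode & stat.S_IXUSR)
group_or_other_access = bool(mode & (stat.S_IRWXG | stat.S_IRWXO))

print(symbolic)  # -rw-------
```

The owner gets read and write but not execute, and the group and others get nothing.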
Q14. Which of the following processes schedules the tasks across nodes in a Hadoop system?
Ans – Scheduler
Q15. Which processes need to be active for successful call to Hive?
Ans – NameNode, DataNode, Secondary NameNode, ResourceManager, NodeManager
Q16. Which of the following phases occur simultaneously in Map-Reduce Framework? Choose the most appropriate option.
Ans – shuffle and sort
Q17. What are the capabilities of Kafka?
A.Kafka runs as a cluster on one or more servers that can span multiple datacenters
B.The Kafka cluster stores streams of records in categories called topics.
C.Each record consists of a key, a value, and a timestamp.
D.All of the above
Ans: d
Q18. A network of 10000 sensors has been deployed on a connected fleet of cars. Which of the following technologies can be used to ingest the generated data into the analytics platform?
Ans – Hadoop
Q19. What will happen when you execute the command hadoop fs -put emp.csv /dataset ?
Ans – The emp.csv file is copied from the local file system to the HDFS /dataset directory
Q20. Assume that we have the below Employee data in a CSV file named "EmployeeData.csv" in the HDFS folder "/SparkDemos":
101,Alex,20000
102,Adam,30000
103,Rose,40000
101,Linda,35000
102,Jack,45000
103,Sam,50000
Which of the below code fragments will help create an appropriate DataFrame named "EmployeeDF" for the same?
Ans –
EmployeeDF = spark.read.("/SparkDemos/EmployeeData.csv")
schema1 = StructType([StructField("Dept_ID", IntegerType(), True),\
StructField("EmpName", StringType(), True),\
StructField("Salary", IntegerType(), True)])
EmployeeDF = spark.read.schema(schema1).load("/SparkDemos/E…
Q21. Which APIs are available in PySpark for data processing?
A.RDD
B.DataFrame
C.DataSet
D.GraphX
E.MLlib
Ans: abe
Q22. Hadoop is a Storage as well as processing framework – True/False?
Ans – True
Q23. Which type of Programming does Python support?
A.object-oriented programming
B.structured programming
C.functional programming
D.all of the mentioned
Ans: d
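As a small illustration of the three styles, the same total can be computed in a structured, an object-oriented and a functional way. The salary figures, the function name and the Payroll class are made up for this sketch.

```python
from functools import reduce

salaries = [20000, 30000, 40000]  # sample values, for illustration only


# Structured (procedural) style: an explicit loop inside a function.
def total_structured(values):
    total = 0
    for value in values:
        total += value
    return total


# Object-oriented style: data and behaviour bundled in a class.
class Payroll:
    def __init__(self, values):
        self.values = list(values)

    def total(self):
        return sum(self.values)


# Functional style: fold the list with a pure function, no mutation.
total_functional = reduce(lambda acc, value: acc + value, salaries, 0)

print(total_structured(salaries), Payroll(salaries).total(), total_functional)
```

All three approaches give the same result; Python does not force any one of them.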
Q24. Which of the following are valid aggregate functions?
A.COUNT
B.COMPUTE
C.SUM
D.MAX
Ans: acd
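COUNT, SUM and MAX are standard SQL aggregate functions, while COMPUTE is not. A quick way to try this out is an in-memory SQLite database from the Python standard library. The employee table and its rows below are hypothetical sample data, and the exact error raised for an unknown function is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (dept_id INTEGER, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(101, "Alex", 20000), (102, "Adam", 30000), (103, "Rose", 40000)],
)

# The three valid aggregates from the question, evaluated in one query.
count, total, highest = conn.execute(
    "SELECT COUNT(*), SUM(salary), MAX(salary) FROM employee"
).fetchone()

# COMPUTE is not an aggregate function, so SQLite rejects the query.
try:
    conn.execute("SELECT COMPUTE(salary) FROM employee")
    compute_is_valid = True
except sqlite3.OperationalError:
    compute_is_valid = False

conn.close()
print(count, total, highest, compute_is_valid)
```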
Q25. Which of the following are examples of ETL tools?
Ans – Oracle
Q26. Select True or False. We can secure data in a data lake with mechanisms such as authentication and authorization.
Ans – True
Q27. Which of the following is the correct extension of the Python file?
Ans – .py
Q28. Which of the following languages is not supported by Spark?
Ans – Pascal
Q29. Which of the following is not a DDL command?
Ans – Update
Q30. Which statement is used to delete all rows in a table without having the action logged?
Ans – Truncate
Q31. What are the different types of partitioning Methods?
Ans – row and column based
Q32. Select True or False. Primary Key column supports NULL values.
Ans – False
Q33. Which Talend components are used to process input/output for delimited files?
A.tFileInputDelimited
B.tMySQLInput
C.tFileInputXML
D.tFileOutputDelimited
Ans: ad
Q34. ____ is a distributed graph processing framework on top of Spark.
Ans – GraphX
Q35. Which of the following built-in functions is used to identify the datatype of a variable in Python?
Ans – type()
Q36. What will be the value of the following Python expression? 4+3%5
Ans – 7
Q37. What will be the output of the following Python code snippet if x = 1 ? x << 2
Ans – 4
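The three Python questions above (type(), operator precedence and the left shift) are easy to verify in an interpreter. A short check, with variable names chosen only for this example:

```python
x = 1

# % binds tighter than +, so this is evaluated as 4 + (3 % 5) = 4 + 3.
precedence_result = 4 + 3 % 5

# Shifting left by 2 bits multiplies by 2**2, so 1 becomes 4.
shift_result = x << 2

# type() returns the class of the object; __name__ gives its name as a string.
type_name = type(x).__name__

print(precedence_result, shift_result, type_name)  # 7 4 int
```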
Q38. The different types of analysis which are supported through data lakes are:
A.Realtime
B.Interactive
C.Batch
D. Continuous
Ans: ac
Q39. Which of the following is not a constraint in SQL?
Ans – UNION
Q40. Identify the advantages of in-memory processing?
A.Decreasing costs
B.High performance hardware
C.Inefficient to scale up
D.Easy to scale up
E.Easy to maintain and deploy
F.Easy access to data analytics
Ans: abef