Processing of data
User friendly representation
Both A and B
None of the mentioned above
C. Both A and B
Hive
HBase
Analysis and reporting
All of the mentioned above
Algorithms
Flowchart
System flow
None of the mentioned above
Virtualization software
System software
Integrated approach
None of the mentioned above
A statement that the researcher wishes to put to the test using the information gathered during a study.
A research question that will be answered as a result of the findings.
A theory that serves as the foundation for the research.
The application of statistics to determine the extent to which the outcomes could have been caused by chance.
sqoop-list-databases
Hbase-list
hive schema
sqoop-list-columns
Robust and Scalable
Affordable and Cost Effective
Adaptive and Flexible
All of the mentioned above
HDFS Shell
DFS Shell
K Shell
FS Shell
Internally managed data
Data feeds from external sources
It provides access to every layer and component of the big data stack
All of the mentioned above
Data Node
Block Size
Data block
NameNode
Operational data source
Qualitative data source
Both A and B
None of the mentioned above
Oozie
ZooKeeper
MapReduce
All of the mentioned above
Predictive models
Descriptive models
Decision models
All of the mentioned above
Heterogeneous
Storage
Network
None of the mentioned above
Distributed computing
Cluster computing
Parallel computing
All of the mentioned above
HDFS file system is well suited for storing data associated with applications that require low latency data access.
HDFS is well-suited for storing data connected to applications that require low-latency data access to be performed.
HDFS is not suited for instances in which multiple/simultaneous writes to the same file are required.
None of the mentioned above
Worth of information
Useless data
Useless information
None of the mentioned above
Application software
System software
Operating System
None of the mentioned above
Cloud-based
Data warehouse
System ingestion
All of the mentioned above
Spark
HBase
Hive
Pig
Discrete variable
Quantitative variable
Qualitative variable
Superlative variable
Python
C++
R
Java
Query
Statement
Function
None of the mentioned above
Virtualization layer
Storage layer
Abstract layer
None of the mentioned above
Big data
Data science
Data integration
None of the mentioned above
Parallel data processing
Single channel processing
Multi data processing
None of the mentioned above
Big Data ingestion pipeline is divided into different layers
Each layer performs a particular function
Both A and B
None of the mentioned above
Structured Data
Unstructured Data
Semi-structured Data
None of the mentioned above
Maptask
Task execution
Mapper
All of the mentioned above
Meet compliance requirements
Protect the privacy
Both A and B
None of the mentioned above
Quality or fidelity of data
Large size of the data that cannot be processed
Small size of the data that can easily be processed
All of the mentioned above