What is the correct answer?

4

_____ is a platform for developing data flows for the extraction, transformation, and loading (ETL) of huge datasets, as well as for data analysis.

A. Spark

B. HBase

C. Hive

D. Pig

Correct Answer :

D. Pig


Pig is a high-level platform for processing very large datasets. It provides an abstraction over the MapReduce framework, so users do not have to write MapReduce jobs directly. It includes a high-level scripting language, known as Pig Latin, that is used to write the data analysis scripts.
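To make the idea of a data flow concrete, here is a minimal sketch of a load, filter, group, and aggregate pipeline. It is written in plain Python rather than Pig Latin, and only mirrors the shape of a typical Pig script (LOAD, FILTER, GROUP, FOREACH). The sample records, field names, and function names are all assumptions made for illustration; nothing here is Pig's actual API.

```python
from collections import defaultdict

# Hypothetical input: comma-separated lines of user, url, bytes_sent.
RAW_LINES = [
    "alice,/home,120",
    "bob,/home,300",
    "alice,/cart,80",
    "carol,/home,",  # malformed row: the byte count is missing
]


def load(lines):
    """Extract: split each line into fields (roughly what LOAD ... AS does)."""
    for line in lines:
        user, url, size = line.split(",")
        yield user, url, size


def clean(records):
    """Transform: drop rows without a numeric size and cast it to int."""
    for user, url, size in records:
        if size.isdigit():
            yield user, url, int(size)


def total_by_user(records):
    """Aggregate: sum bytes per user (roughly GROUP BY followed by SUM)."""
    totals = defaultdict(int)
    for user, _url, size in records:
        totals[user] += size
    return dict(totals)


# Each stage consumes the previous one, forming a single data flow.
result = total_by_user(clean(load(RAW_LINES)))
print(result)  # {'alice': 200, 'bob': 300}
```

In Pig, each of these stages would be a single Pig Latin statement, and the platform would translate the whole flow into MapReduce jobs instead of running it in one process.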

Related Questions

What is the correct answer?

4

Data whose size is measured in ____bytes is called Big Data.

A. Tera

B. Giga

C. Peta

D. Meta

What is the correct answer?

4

In Big Data environments, variety of data:

A. Includes multiple formats and types of data

B. Includes structured data in the form of financial transactions

C. Includes semi-structured data in the form of emails and unstructured data in the form of images

D. All of the mentioned above

What is the correct answer?

4

The _____ tool has the capability of listing all of the possible database schemas.

A. sqoop-list-databases

B. Hbase-list

C. hive schema

D. sqoop-list-columns

What is the correct answer?

4

Which of the following is/are examples of descriptive analytics?

A. Traffic and Engagement Reports

B. Financial Statement Analysis

C. Demand Trends and Aggregated Survey Results

D. All of the mentioned above

What is the correct answer?

4

Reporting and visualization enable ____.

A. Processing of data

B. User friendly representation

C. Both A and B

D. None of the mentioned above

What is the correct answer?

4

Custom extensions built in the ____ programming language are also supported by Hive.

A. Java

B. C#

C. C

D. C++

What is the correct answer?

4

Which of the following is an example of unstructured data?

A. Students roll number, age

B. Videos

C. Audio files

D. Both B and C

What is the correct answer?

4

Which of the following is/are types of predictive analytics techniques?

A. Predictive models

B. Descriptive models

C. Decision models

D. All of the mentioned above

What is the correct answer?

4

Variety describes one of the biggest challenges of ______.

A. Big data

B. Data science

C. Data integration

D. None of the mentioned above

What is the correct answer?

4

In a Big Data environment, veracity of data refers to:

A. Quality or fidelity of data

B. Large size of the data that cannot be processed

C. Small size of the data that can easily be processed

D. All of the mentioned above

What is the correct answer?

4

Which of the following is/are most suitable with reference to the data collector layer?

A. Transportation of data from the ingestion layer to the rest of the data pipeline

B. Data storage

C. Data identification

D. None of the mentioned above

What is the correct answer?

4

HQL is a query language that is used to construct the custom MapReduce framework in Hive, which is written in ______.

A. Java

B. PHP

C. C#

D. None of the mentioned above

What is the correct answer?

4

______ is best described as a programming model that is used to construct Hadoop-based applications that can be scaled up and down.

A. Oozie

B. ZooKeeper

C. MapReduce

D. All of the mentioned above

What is the correct answer?

4

Which of the following is/are true of running MongoDB?

A. High availability through built-in replication and failover

B. Management tooling for automation, monitoring, and backup

C. Fully elastic database as a service with built-in best practices

D. All of the mentioned above

What is the correct answer?

4

Which of the following is not a characteristic of HDFS?

A. HDFS file system is well suited for storing data associated with applications that require low latency data access.

B. HDFS is well-suited for storing data connected to applications that require low-latency data access to be performed.

C. HDFS is not suited for instances in which multiple/simultaneous writes to the same file are required.

D. None of the mentioned above

What is the correct answer?

4

Which of the following can be considered a main source of unstructured data?

A. Twitter

B. Facebook

C. Webpages

D. All of the mentioned above

What is the correct answer?

4

Predictive analytics relies on capturing relationships between explanatory variables and the _____.

A. Predicted variables

B. Descriptive variables

C. Prescriptive variables

D. All of the mentioned above

What is the correct answer?

4

In the given virtual architecture, name the missing layer.

A. Virtualization layer

B. Storage layer

C. Abstract layer

D. None of the mentioned above

What is the correct answer?

4

Which of the following is/are techniques used for predictive analytics?

A. Linear Regression

B. Time series analysis and forecasting

C. Data Mining

D. All of the mentioned above

What is the correct answer?

4

Scalability is prioritized over latency in jobs such as _____.

A. HBase

B. HDFS

C. Hive

D. MapReduce

What is the correct answer?

4

_____ maps input key/value pairs to a set of intermediate key/value pairs.

A. Reducer

B. Mapper

C. File system

D. All of these

What is the correct answer?

4

Prescriptive analytics utilizes business rules, artificial intelligence, and ____ to simulate various approaches to these numerous outcomes.

A. Algorithms

B. Flowchart

C. System flow

D. None of the mentioned above

What is the correct answer?

4

Which of the following is/are true with reference to a hypothesis?

A. A statement that the researcher wishes to put to the test using the information gathered during a study.

B. A research question that will be answered as a result of the findings.

C. A theory that serves as the foundation for the research.

D. The application of statistics to determine the extent to which the outcomes could have been caused by chance.

What is the correct answer?

4

Hypervisors are used for many different tasks, including ____, server management, and simply running programs.

A. Cloud computing

B. Security management

C. Integrated approach

D. None of the mentioned above

What is the correct answer?

4

Which of the following represents the use of Hadoop?

A. Robust and Scalable

B. Affordable and Cost Effective

C. Adaptive and Flexible

D. All of the mentioned above

What is the correct answer?

4

Query processing system refers to the entire process from translating a ___ to the database system.

A. Query

B. Statement

C. Function

D. None of the mentioned above

What is the correct answer?

4

_____ are two techniques used in descriptive analytics to discover historical data.

A. Data ingestion and data mining

B. Data warehouse and data storage

C. Data aggregation and data mining

D. All of the mentioned above

What is the correct answer?

4

______ involves the simultaneous execution of multiple sub-tasks that collectively comprise a larger task.

A. Parallel data processing

B. Single channel processing

C. Multi data processing

D. None of the mentioned above

What is the correct answer?

4

Operational databases with distributed systems and ___-based systems can harness the true potential of big data.

A. SQL

B. NoSQL

C. PL / SQL

D. None of the mentioned above

What is the correct answer?

4

Data warehouse modeling is the initial stage of building a data warehouse wherein the ___ is designed.

A. Schema

B. Table

C. Both A and B

D. None of the mentioned above