SAP HANA Interview Questions And Answers
1.What is the reason for going In-memory? One reason is that the
number of CPU cycles per second is increasing while the cost of processors is
decreasing. For managing data in memory there is the five-minute rule, which is
based on the observation that waiting for data to be fetched from disk can cost
more than keeping that data in memory, so the decision depends on how often the
data is fetched. For example, if a table, no matter how large, is touched by a
query at least once every 55 minutes, it is less expensive (in hardware costs)
to keep it in memory than to read it from disk each time. The more frequently
the table is accessed, the stronger the case for keeping it in memory.
2.What is a Five-minute rule? It is a rule of thumb for
deciding whether a data item should be kept in memory, or stored on disk and
read back into memory when required. The rule is usually stated as: randomly
accessed disk pages that are re-used at least every 5 minutes should be kept in
the memory cache.
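The trade-off behind the rule can be sketched as a break-even calculation. The formula below is the commonly quoted break-even form of the rule and is not part of the answer above; the hardware figures are hypothetical inputs chosen only for illustration, and the function name `break_even_seconds` is made up for this sketch.

```python
def break_even_seconds(pages_per_mb_ram, accesses_per_sec_per_disk,
                       price_per_disk, price_per_mb_ram):
    """Return the re-use interval (seconds) at which caching a page in RAM
    costs about the same as re-reading it from disk."""
    return (pages_per_mb_ram / accesses_per_sec_per_disk) * (
        price_per_disk / price_per_mb_ram
    )


def keep_in_memory(reuse_interval_seconds, break_even):
    """A page re-used more often than the break-even interval belongs in RAM."""
    return reuse_interval_seconds <= break_even


# Hypothetical hardware figures, for illustration only.
interval = break_even_seconds(
    pages_per_mb_ram=128,            # assumes 8 KB pages
    accesses_per_sec_per_disk=64,    # assumed random reads per second per disk
    price_per_disk=2000.0,           # assumed price of one disk drive
    price_per_mb_ram=15.0,           # assumed price of one MB of RAM
)
print(f"break-even interval: {interval:.0f} s (~{interval / 60:.1f} min)")
print(keep_in_memory(60, interval))     # page re-used every minute
print(keep_in_memory(3600, interval))   # page re-used once an hour
```

With these assumed inputs the break-even interval comes out at a few minutes, which is where the rule gets its name. Cheaper memory or slower disks push the interval up.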
3.What is multi-core CPU? Multiple CPUs on one chip or in one
package are called a multi-core CPU. Traditional databases for online
transaction processing (OLTP) do not use current hardware efficiently.
4.What is Stall? A stall is the time the CPU spends waiting for
data to be loaded from main memory into the CPU cache.
5.What is SAP In-Memory Appliance (SAP HANA)? HANA is an
in-memory technology for storing data that is particularly suited to handling
very large amounts of tabular, or relational, data with extraordinary performance.
Common databases store tabular data row-wise. Reorganizing the data in memory
column-wise brings a tremendous speed increase when accessing a subset of the
data in each table row.
6.What are the components or products of HANA? SAP HANA
contains the following components and administration tools:
• SAP® In-Memory Computing Engine (IMCE) Server 1.0
• SAP® IMCE Clients 1.0 – The IMCE clients are the interfaces by which the IMCE
  can communicate with other components. The following subcomponents are
  included: IMCE ODBO 1.0, IMCE ODBC 1.0, IMCE JDBC 1.0, IMCE SQLDBC 1.0
• SAP® IMCE Studio 1.0 (includes SAP HANA Modeler)
• Sybase Replication Server 15 + Sybase Enterprise Connect Data Access (ECDA)
• Sybase Replication Agent
• SAP® HANA Load Controller 1.0 (includes R3Load, RepServer De-cluster Add-On)
• SAP® Host Agent 7.20
7.What are the different editions available in HANA appliance
software? Platform, Enterprise, and Extended editions. Platform edition is
intended for customers who want to use ETL-based replication and already have
a license for SAP BO Data Services.
Enterprise edition is intended for customers who want to use either
trigger-based replication or ETL-based replication and do not already have all
of the necessary licenses for SAP BO Data Services. Extended edition is
intended for customers who want to use the full potential of all available
replication scenarios, including log-based replication.
8.What is columnar and Row-Based Data Storage?
Fig: Row and Column-based storage
A database table contains data in the form of rows and columns. Computer
memory, however, is organized as a linear structure. To store a table in linear
memory, there are two options. Row-based storage stores a table as a sequence
of records, each of which contains the fields of one row. In columnar storage
the entries of a column are stored in contiguous memory locations.
The SAP HANA database lets you specify whether a table is to
be stored column-wise or row-wise. It is also possible to alter an existing
table from columnar to row-based and vice versa. Search operations on tabular
data can be accelerated by organizing the data in columns instead of in rows.
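The two layouts can be illustrated with a minimal sketch that flattens a small table into a linear sequence both ways. The table contents and variable names are hypothetical, and a Python list only stands in for linear memory; this is not how SAP HANA stores data internally.

```python
# A small table with columns (id, country, amount). Hypothetical data.
rows = [
    (1, "DE", 100),
    (2, "US", 250),
    (3, "DE", 75),
]
n_rows, n_cols = len(rows), len(rows[0])

# Row-based storage: complete records are laid out one after another.
row_store = [field for row in rows for field in row]

# Columnar storage: all entries of one column are contiguous.
columns = list(zip(*rows))
column_store = [value for column in columns for value in column]

print(row_store)     # [1, 'DE', 100, 2, 'US', 250, 3, 'DE', 75]
print(column_store)  # [1, 2, 3, 'DE', 'US', 'DE', 100, 250, 75]

# Reading only the "amount" column (index 2):
# in the row layout the values are spread out with a stride of n_cols,
amount_from_rows = row_store[2::n_cols]
# in the column layout they form one contiguous slice.
amount_from_columns = column_store[2 * n_rows:3 * n_rows]

print(amount_from_rows == amount_from_columns)  # True
```

The contiguous slice in the column layout is what makes scanning or searching a single column cheap, while the row layout keeps all fields of one record next to each other.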
9.What are the advantages of Column based tables? Column-based
tables are advantageous when:
• Calculations are typically executed on a single column or only a few columns.
• The table is searched based on the values of a few columns.
• The table has a large number of columns.
• The table has a large number of rows and columnar operations are required
  (aggregate, scan, etc.).
• High compression rates can be achieved because the majority of the columns
  contain only a few distinct values (compared to the number of rows).
10.What are the advantages of Row-based tables? Row-based
tables are advantageous when:
• The application needs to process only a single record at a time (many selects
  and/or updates of single records).
• The application typically needs to access a complete record (or row).
• The columns contain mainly distinct values, so the compression rate would be
  low.
• Neither aggregations nor fast searching are required.
• The table has a small number of rows (e.g. configuration tables).
11.In which case should data be stored in columnar storage? To
enable fast on-the-fly aggregations and ad-hoc reporting, and to benefit from
compression mechanisms, it is recommended that transaction data be stored in
a column-based table.
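As a rough illustration of an on-the-fly aggregation over column-wise data, the sketch below groups and sums using only the two columns involved and never touches the others. The table, its column names, and the helper `sum_by` are hypothetical, and plain Python lists stand in for column storage.

```python
from collections import defaultdict

# Hypothetical transaction table stored column-wise: one list per column.
sales = {
    "order_id": [1, 2, 3, 4, 5],
    "region":   ["EMEA", "APJ", "EMEA", "AMER", "APJ"],
    "amount":   [120.0, 80.0, 200.0, 50.0, 40.0],
    "comment":  ["", "rush", "", "", "gift"],
}


def sum_by(table, key_column, value_column):
    """Sum value_column per distinct key, reading only those two columns."""
    totals = defaultdict(float)
    for key, value in zip(table[key_column], table[value_column]):
        totals[key] += value
    return dict(totals)


totals = sum_by(sales, "region", "amount")
print(totals)  # {'EMEA': 320.0, 'APJ': 120.0, 'AMER': 50.0}
```

Because the total is computed at query time from the base columns, no pre-computed summary table has to be maintained alongside the transaction data.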
12.Is it possible to join row-based tables with column-based
tables? Yes.
13.Are column-based tables always the better choice than
row-based tables? No. There are also situations in which row-based tables are
advantageous.
14.What are the advantages of Columnar tables?
• Higher Data Compression Rates
• Higher Performance for Column Operations
• Elimination of Additional Indexes
• Parallelization
• Elimination of Materialized Aggregates
15.What are the different Compression Techniques you know?
• Run-length encoding
• Cluster encoding
• Dictionary encoding
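Two of these techniques can be sketched in a few lines. This is a generic, minimal illustration of dictionary encoding followed by run-length encoding on a single column; the function names and sample values are hypothetical, cluster encoding is not shown, and nothing here is meant to describe SAP HANA's actual implementation.

```python
from itertools import groupby


def dictionary_encode(values):
    """Replace each value with its integer position in a sorted dictionary
    of the column's distinct values."""
    dictionary = sorted(set(values))
    position = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [position[value] for value in values]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical column with few distinct values.
column = ["DE", "DE", "DE", "US", "US", "DE", "FR", "FR"]

dictionary, ids = dictionary_encode(column)
runs = run_length_encode(ids)

print(dictionary)  # ['DE', 'FR', 'US']
print(ids)         # [0, 0, 0, 2, 2, 0, 1, 1]
print(runs)        # [(0, 3), (2, 2), (0, 1), (1, 2)]

# Decoding restores the original column.
restored = [dictionary[i] for i in run_length_decode(runs)]
print(restored == column)  # True
```

Both techniques pay off when a column has few distinct values relative to its number of rows: the dictionary stays small and long runs of identical IDs collapse into single pairs.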