Apache Spark

Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning workloads on single-node machines or clusters. The Moein monitoring system can monitor Spark in both modes of operation. It collects performance metrics for the master and the workers, for the executors running on the workers, JVM metrics such as heap usage and garbage collection, and RDD metrics. The performance metrics collected for Apache Spark are listed below:

General Information:

  1. Alias Name
  2. Address
  3. Port Number
  4. Node Role

Master Node Information:

  1. Master Node Address
  2. Master Number Of Cores
  3. Master Number Of Used Cores
  4. Master Total Memory
  5. Master Used Memory
  6. Total Number Of Workers
  7. Number Of Alive Workers
  8. Number Of Active Applications
  9. Number Of Completed Applications
  10. Status
  11. Master Used Memory Percentage
  12. Master Core Used Percentage
  13. Number Of Active Drivers
  14. Number Of Completed Drivers
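
The percentage and count metrics in this list can be derived from the raw totals. The following is a minimal sketch, not Moein's actual implementation. The dictionary keys (`memory`, `memoryused`, `cores`, `coresused`, `aliveworkers`, `activeapps`, and so on) are assumptions modeled on the JSON state reported by a Spark standalone master, and `used_percentage` and `summarize_master` are hypothetical helper names.

```python
def used_percentage(used, total):
    """Return used/total as a percentage; 0.0 when the total is not positive."""
    if total <= 0:
        return 0.0
    return round(100.0 * used / total, 2)


def summarize_master(state):
    """Derive the count and percentage metrics from a master state dict.

    The key names are assumptions of this sketch, not a documented Moein schema.
    """
    return {
        "total_workers": len(state.get("workers", [])),
        "alive_workers": state.get("aliveworkers", 0),
        "active_applications": len(state.get("activeapps", [])),
        "completed_applications": len(state.get("completedapps", [])),
        "active_drivers": len(state.get("activedrivers", [])),
        "completed_drivers": len(state.get("completeddrivers", [])),
        "memory_used_pct": used_percentage(
            state.get("memoryused", 0), state.get("memory", 0)
        ),
        "core_used_pct": used_percentage(
            state.get("coresused", 0), state.get("cores", 0)
        ),
        "status": state.get("status", "UNKNOWN"),
    }


# Illustrative values only; memory is assumed to be in megabytes.
sample_state = {
    "workers": [{"id": "worker-1"}, {"id": "worker-2"}],
    "aliveworkers": 2,
    "cores": 8,
    "coresused": 2,
    "memory": 16384,
    "memoryused": 4096,
    "activeapps": [{"id": "app-1"}],
    "completedapps": [],
    "activedrivers": [],
    "completeddrivers": [],
    "status": "ALIVE",
}

summary = summarize_master(sample_state)
print(summary)
```

Guarding against a zero total keeps the percentage defined when a master has no registered workers yet.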

[Figure: Spark Master overview]

Workers Metrics:

  1. Worker ID
  2. Worker Host Address
  3. Worker Port Number
  4. Worker Web UI Address
  5. Worker Number Of Cores
  6. Worker Number Of Used Cores
  7. Worker Number Of Free Cores
  8. Worker Total Memory
  9. Worker Used Memory
  10. Worker Free Memory
  11. Elapsed Time Since Last Heartbeat
  12. Worker Status
  13. Worker Used Memory Percentage
  14. Worker Core Used Percentage
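
Several of the per-worker values above (free cores, free memory, usage percentages, and the time since the last heartbeat) are derived rather than reported directly. The sketch below shows one way to compute them for a single worker record. The field names (`coresused`, `memoryused`, `lastheartbeat`, `state`) and the assumption that the heartbeat is a Unix timestamp in milliseconds are illustrative, and `worker_row` is a hypothetical helper.

```python
def worker_row(worker, now_ms):
    """Derive free resources, usage percentages, and heartbeat age for one worker."""
    cores, cores_used = worker["cores"], worker["coresused"]
    memory, memory_used = worker["memory"], worker["memoryused"]
    return {
        "worker_id": worker["id"],
        "free_cores": cores - cores_used,
        "free_memory_mb": memory - memory_used,
        "core_used_pct": round(100.0 * cores_used / cores, 2) if cores else 0.0,
        "memory_used_pct": round(100.0 * memory_used / memory, 2) if memory else 0.0,
        # Assumes lastheartbeat is a Unix timestamp in milliseconds.
        "seconds_since_heartbeat": (now_ms - worker["lastheartbeat"]) / 1000.0,
        "status": worker["state"],
    }


# Illustrative worker record; memory is assumed to be in megabytes.
sample_worker = {
    "id": "worker-1",
    "cores": 4,
    "coresused": 1,
    "memory": 8192,
    "memoryused": 2048,
    "lastheartbeat": 1_700_000_000_000,
    "state": "ALIVE",
}

row = worker_row(sample_worker, now_ms=1_700_000_012_500)
print(row)
```

Passing the current time in as `now_ms` keeps the function deterministic; a live collector would typically supply `int(time.time() * 1000)`.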

[Figure: Spark Master workers]

Applications Metrics:

  1. Application ID
  2. Application Name
  3. Application User
  4. Application Start Time
  5. Application Submit Time
  6. Application Number Of Allocated Cores
  7. Application Running Duration
  8. Application Status
  9. Application Running Status

Worker Metrics:

  1. Worker ID
  2. Master Node Address
  3. Master Web Service Address
  4. Worker Number Of Cores
  5. Worker Number Of Used Cores
  6. Worker Total Memory
  7. Worker Used Memory
  8. Worker Used Memory Percentage
  9. Worker Core Used Percentage
  10. Total Number Of Running Executors
  11. Total Number Of Finished Executors

[Figure: Spark Worker overview]

Executors in Workers Metrics:

  1. Executor ID
  2. Executor Total Memory
  3. Executor Application ID
  4. Executor Application Name
  5. Number Of Executor Application Cores
  6. Executor Application User
  7. Executor Application Memory Per Slave
  8. Executor Status

Memory Metrics:

Heap and Non-Heap Memory:

  1. Committed Heap Memory
  2. Initial Heap Memory
  3. Maximum Heap Memory
  4. Used Heap Memory
  5. Committed Non-Heap Memory
  6. Initial Non-Heap Memory
  7. Maximum Non-Heap Memory
  8. Used Non-Heap Memory
  9. Heap Memory Used Percentage
  10. Non-Heap Memory Used Percentage
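
The two percentage metrics relate used memory to a limit, but the JVM can report the maximum as undefined (`-1`), which is common for non-heap memory. The sketch below handles that case by falling back to the committed size. The fallback rule is an assumption of this example and is not stated in the source; `memory_used_percentage` is a hypothetical helper.

```python
MB = 1024 * 1024


def memory_used_percentage(used, committed, maximum):
    """Return used memory as a percentage of the maximum.

    Falls back to the committed size when the maximum is undefined (-1).
    The fallback is an assumption of this sketch.
    """
    limit = maximum if maximum > 0 else committed
    if limit <= 0:
        return 0.0
    return round(100.0 * used / limit, 2)


# Heap with a defined maximum (illustrative values).
heap_pct = memory_used_percentage(
    used=512 * MB, committed=1024 * MB, maximum=2048 * MB
)

# Non-heap memory where the JVM reports no maximum.
non_heap_pct = memory_used_percentage(
    used=96 * MB, committed=128 * MB, maximum=-1
)

print(heap_pct, non_heap_pct)
```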

Memory Pool KPIs:

  1. Memory Pool Name
  2. Memory Pool Committed Memory
  3. Memory Pool Initial Memory
  4. Memory Pool Maximum Memory
  5. Memory Pool Used Memory
  6. Memory Pool Used Percentage

GC Metrics:

  1. Garbage Collection Count
  2. Garbage Collection Rate
  3. Garbage Collection Time
  4. Average Garbage Collection Time
  5. GC Name
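
The JVM reports garbage collection count and time as cumulative totals per collector, so the rate and average have to be computed from them. The sketch below assumes that "rate" means collections per second between two samples and that "average" means total collection time divided by total collection count; neither definition is given in the source. The sample dictionaries and `gc_metrics` are hypothetical.

```python
def gc_metrics(previous, current, interval_seconds):
    """Compute GC rate and average pause from two cumulative samples.

    Each sample holds the collector name, the cumulative collection count,
    and the cumulative collection time in milliseconds.
    """
    count_delta = current["count"] - previous["count"]
    rate = count_delta / interval_seconds if interval_seconds > 0 else 0.0
    average = current["time_ms"] / current["count"] if current["count"] else 0.0
    return {
        "gc_name": current["name"],
        "collection_count": current["count"],
        "collection_time_ms": current["time_ms"],
        "collections_per_second": rate,
        "average_collection_time_ms": average,
    }


# Two illustrative samples taken 60 seconds apart.
earlier = {"name": "G1 Young Generation", "count": 100, "time_ms": 1500}
later = {"name": "G1 Young Generation", "count": 106, "time_ms": 1590}

gc = gc_metrics(earlier, later, interval_seconds=60)
print(gc)
```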

[Figure: Spark Worker memory]

RDD Metrics:

  1. File Cache Hits
  2. Discovered Files
  3. Hive Client Calls
  4. Parallel Listing Job Count
  5. Fetched Partitions
  6. Compilation Mean Time
  7. Total Number Of Compilation
  8. Generated Class Size
  9. Generated Class Count
  10. Generated Method Size
  11. Generated Method Count
  12. Source Code Size
  13. Source Code Count

[Figure: Spark Worker KPIs]

Communication Protocols:

  • REST
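
As an illustration of REST-based collection, the sketch below polls a Spark standalone master for its state as JSON. It is not a description of how Moein connects to Spark. The `/json/` path on the master web UI, the host name `spark-master.example`, port 8080, and the helper names `fetch_json` and `fetch_master_state` are all assumptions of this example. The demonstration at the end uses a canned response so that it runs without a live cluster.

```python
import io
import json
from urllib.request import urlopen


def fetch_json(url, opener=urlopen, timeout=5):
    """Send a GET request and decode the response body as JSON."""
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_master_state(base_url, opener=urlopen):
    """Fetch the master state.

    Assumption: the master web UI serves its state as JSON under /json/.
    """
    return fetch_json(base_url.rstrip("/") + "/json/", opener=opener)


# Offline demonstration: a stand-in opener returns a canned response body
# and records the URL that would have been requested.
requested_urls = []


def canned_opener(url, timeout=5):
    requested_urls.append(url)
    return io.BytesIO(b'{"status": "ALIVE", "cores": 8, "coresused": 2}')


demo_state = fetch_master_state(
    "http://spark-master.example:8080", opener=canned_opener
)
print(requested_urls[0], demo_state["status"])
```

Against a real master, omit the `opener` argument so that `urllib.request.urlopen` performs the request.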
3rd floor, No. 8, 2nd dead-end, Sadeghi St., Azadi Ave., Tehran, Iran, Postal code 1458846155
Behpaya Co. All rights reserved