Data collection and rollup calculation in Virtualization Manager

Data in Virtualization Manager are collected in intervals based on collection schedules.

The two primary types of data collection are:

  • Configuration
  • Performance

Configuration data include properties such as CPU speed, CPU or network interface count, and host name. By default, configuration data are collected in 12-hour intervals.

Performance data include properties such as total latency, CPU idle, and throughput. By default, performance data are collected in 10-minute intervals. Performance data are collected for new data sources immediately, but are not displayed until configuration data are collected.

If a calculation uses both configuration and performance data, the configuration data can be up to 12 hours older than the performance data (or as old as your configuration collection interval), which can skew the result. For example, if you add a cluster, it can take up to 12 hours (or your configuration collection interval) until its information is displayed. To collect configuration information immediately, click Run Now on the Collection Schedules page of Virtualization Manager.

To see how specific properties are collected, see the KB article about Virtualization Manager Properties.

You can change the frequency of individual collections by modifying the collection schedule. For more information, see Schedule data collection.

Sample collection in a VMware environment

The VMware API provides Virtualization Manager with recent sample values at 20-second resolution, covering approximately the previous hour. Each sample represents the average value during its 20-second period.

If the data collection interval in Virtualization Manager is set to the default 10 minutes, Virtualization Manager collects the 20-second samples from the last 10 minutes. This yields 30 values for each performance counter (600 seconds divided by 20 seconds). Depending on the type of the sample value, the average, maximum, or last value is presented in Virtualization Manager.

Most of the raw values stored in Virtualization Manager are the average values from the data collection interval, that is, the average values during the last 10 minutes by default.

Virtualization Manager also calculates peak values. A peak value is the maximum of the 20-second samples. For example, with the default 10-minute data collection interval, the peak value is the maximum of the 30 values received in the previous 10 minutes. The 20-second sample values themselves are not stored in Virtualization Manager; they are used only to compute the average or maximum values that are stored as raw values.
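The reduction described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Virtualization Manager code: the function names and the sample values are hypothetical, and the 20-second and 10-minute figures are the defaults quoted in the text.

```python
from statistics import mean

SAMPLE_PERIOD_S = 20         # VMware sample period, per the text
COLLECTION_INTERVAL_S = 600  # default 10-minute collection interval


def samples_per_interval(interval_s=COLLECTION_INTERVAL_S, period_s=SAMPLE_PERIOD_S):
    """Number of 20-second samples that fall into one collection interval."""
    return interval_s // period_s


def summarize_interval(samples):
    """Reduce the samples of one interval to average, peak, and last value.

    Hypothetical helper: only these reduced values would be kept as raw
    values; the individual samples are discarded afterwards.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    return {"average": mean(samples), "peak": max(samples), "last": samples[-1]}


# Hypothetical CPU usage samples (percent): 29 quiet samples and one spike.
samples = [40.0] * 29 + [100.0]
summary = summarize_interval(samples)
print(samples_per_interval(), summary)
```

With these made-up numbers the average stays close to the quiet level while the peak records the short spike, which is why the two values are tracked separately.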

Sample collection in a Hyper-V environment

Virtualization Manager collects two sets of samples for hosts, clusters, and data stores, with a one-minute delay between the two sample sets. The average values are calculated from the difference between the two sample sets and from the time that elapsed between them.
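As a rough sketch of this calculation, assume that each sample set contains a cumulative counter (for example, total bytes transferred so far). The average rate is then the counter difference divided by the elapsed time. The function name, the counter type, and the numbers below are assumptions for illustration only.

```python
def average_rate(first_value, second_value, elapsed_seconds):
    """Average rate between two snapshots of a cumulative counter.

    Assumes the counter only grows between the two snapshots; counter
    resets and wraparound are not handled in this sketch.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (second_value - first_value) / elapsed_seconds


# Hypothetical byte counters read one minute apart.
rate = average_rate(1_200_000, 1_800_000, 60)
print(rate)  # bytes per second
```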

For VMs, the current values are collected, that is, the values that are available at the moment of the request.

There are no peak values calculated for a Hyper-V environment.

Data rollup

Raw performance data are rolled up over time to provide hourly, daily, weekly, monthly, and quarterly averages, maximums, and other statistics. The rollup periods are based on the local server time and do not take business hours into account.

The raw and hourly performance data are discarded after a configurable amount of time. Higher level rollups are retained indefinitely to provide long-term trends in resource consumption.

Peak values are calculated according to these methods:

  • Latest Value (peak): The highest value among the samples collected from VMware during the collection interval. VMware collects raw data in 20-second intervals.
  • Hourly Rollup (peak): The highest values from the data collected during the last hour.
  • Daily Rollup (peak): The highest values from the hourly rollups during the last 24-hour period.
  • Weekly Rollup (peak): The highest values from the daily rollups during the last seven days.
  • Monthly Rollup (peak): The highest values from the weekly rollups during the last calendar month.
  • Quarterly Rollup (peak): The highest values from the monthly rollups during the last quarter.

Monthly and quarterly rollups are not calculated daily.
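The peak rollup chain in the list above amounts to taking the maximum of the maxima from the level below. The following minimal sketch shows this for hourly, daily, and weekly peaks. The helper names and the data are hypothetical, and fixed-size chunks stand in for the calendar-based periods that the product uses.

```python
def rollup_peak(values):
    """Peak of a rollup period: the highest value from the level below."""
    return max(values)


def chunk(values, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


# Hypothetical hourly peaks for two days (48 hours).
hourly_peaks = [50.0] * 48
hourly_peaks[5] = 91.0   # spike on the first day
hourly_peaks[30] = 77.0  # smaller spike on the second day

daily_peaks = [rollup_peak(day) for day in chunk(hourly_peaks, 24)]
weekly_peak = rollup_peak(daily_peaks)
print(daily_peaks, weekly_peak)
```

Because the maximum of maxima equals the overall maximum, each level can be computed from the level directly below it without going back to the raw data.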

Although average and peak are the two most common metrics, other metrics collected by the performance job use different rollups. For example, when the Powered On status is rolled up, Virtualization Manager only retains whether the system was mostly on or off during the rollup period.
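A "mostly on or off" rollup can be pictured as a majority vote over the status samples of the period. The sketch below is an assumption about how such a rule could look, not the product's actual logic; in particular, treating an exact tie as "off" is an arbitrary choice made for this example.

```python
def rollup_powered_on(status_samples):
    """Return True if the system was powered on for most of the period.

    `status_samples` is a list of booleans, one per collection.
    An exact tie is reported as off (an assumption of this sketch).
    """
    if not status_samples:
        raise ValueError("at least one status sample is required")
    on_count = sum(1 for is_on in status_samples if is_on)
    return on_count * 2 > len(status_samples)


# Hypothetical hour with six collections: on for four, off for two.
mostly_on = rollup_powered_on([True, True, False, True, False, True])
print(mostly_on)
```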

Data retention

By default, raw performance data are saved for 14 days and hourly rollups are saved for 90 days. You can change how long Virtualization Manager retains these data under Setup > Advanced Setup > System Properties. Increasing the retention period can slow down the application or the database. Any increase to the raw performance data or hourly rollup retention significantly increases the database size.

Data are retained according to this method.

Period                      Default retention period
Latest values (raw data)    14 days (configurable)
Hourly rollup               90 days (configurable)
Daily rollup                Indefinite
Weekly rollup               Indefinite
Monthly rollup              Indefinite
Yearly rollup               Indefinite
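The retention rules in the table can be expressed as a simple lookup, shown here as a hedged sketch. The dictionary keys, the function name, and the dates are hypothetical; only the 14-day and 90-day defaults and the indefinite retention of the higher-level rollups come from the table.

```python
from datetime import datetime, timedelta

# Default retention per level; None means the data are kept indefinitely.
DEFAULT_RETENTION = {
    "raw": timedelta(days=14),
    "hourly": timedelta(days=90),
    "daily": None,
    "weekly": None,
    "monthly": None,
    "yearly": None,
}


def is_expired(level, recorded_at, now, retention=DEFAULT_RETENTION):
    """Return True if a data point of the given level is past its retention."""
    keep_for = retention[level]
    if keep_for is None:
        return False
    return now - recorded_at > keep_for


now = datetime(2024, 6, 1)
recorded = datetime(2024, 5, 1)  # 31 days before `now`
raw_expired = is_expired("raw", recorded, now)
hourly_expired = is_expired("hourly", recorded, now)
daily_expired = is_expired("daily", datetime(2020, 1, 1), now)
print(raw_expired, hourly_expired, daily_expired)
```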

Data aggregation

Aggregation combines the performance data collected during the same time across the virtual environment. To calculate overall performance and load statistics for data stores, for example, Virtualization Manager collects partial data from each host and VM which accesses the data store, and then aggregates that data.

Raw data are aggregated in real time. When aggregate raw data points are available, they are stored, processed, indexed, and rolled up by Virtualization Manager.

Infrastructure aging

If no data are collected from an item in the virtual infrastructure for 24 hours, that item is considered stale, and the related data are grayed out. After 48 hours without data, the item is considered to be removed from your infrastructure, and Virtualization Manager stops displaying information about it. The data are not deleted from the database, and if the item reappears in the virtual environment, the new information is linked to the information already available in the database.
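The aging rule can be summarized as a classification based on the time since the last successful collection. The following sketch uses the 24-hour and 48-hour thresholds from the text. The state names, the function name, and the handling of the exact boundary (treated here as already stale or removed) are assumptions.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(hours=24)
REMOVED_AFTER = timedelta(hours=48)


def aging_state(last_collected, now):
    """Classify an item as 'active', 'stale', or 'removed'.

    'removed' only means the item is no longer displayed; according to
    the text, its data remain in the database.
    """
    age = now - last_collected
    if age >= REMOVED_AFTER:
        return "removed"
    if age >= STALE_AFTER:
        return "stale"
    return "active"


now = datetime(2024, 6, 3, 12, 0)
states = [
    aging_state(now - timedelta(hours=2), now),
    aging_state(now - timedelta(hours=30), now),
    aging_state(now - timedelta(hours=72), now),
]
print(states)
```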