Data Anomalies Settings
Updated over 3 weeks ago

Introduction

Data Anomalies are suspicious data modifications on a resource. Such changes can be made by a user or by malicious software. For example, if a resource in your organization is under attack, the malicious software on it can start modifying and deleting the files stored there. A resource is a device, server, or SharePoint site where data is stored.


❗ Important

  • Data Anomalies displays insights about the data protected only for the following resources:

  • Data is displayed for up to the last 30 days.


When such a potential threat manipulates the data on a resource, it is suspicious in nature and is unlike how the resource owner works with data on that resource. Since anomalies of this type often indicate issues that require attention, Druva flags any such anomalous behavior in a resource and generates an alert.

Prerequisites for VMware Data Anomalies

If you are using Data Anomalies for virtual machines, ensure that the following prerequisites are met:

  • VMware tools are installed and enabled on the virtual machine. For more information, see Install and Upgrade VMware tools.

  • Keep the Guest OS credentials handy, as you need to provide these details:

  • For Windows virtual machines: The user credentials provided must have administrator privileges or access rights

  • For Linux virtual machines: The user credentials provided must have either root privileges or sudo user access rights. For more information about configuring and managing sudo user credentials, see Manage credentials for VMware servers.

  • The Data Anomalies algorithm requires a minimum 7.0.0::r438902 proxy version to detect anomalous file actions such as bulk file creation, deletion, and modification.

  • The Data Anomalies algorithm requires a minimum 7.0.2::r518961 proxy version to detect anomalous encryption file actions. Contact support for assistance related to anomalous encryption file actions and alerts.

  • For Windows virtual machines: Enable USN journal for each drive with enough storage.

The default USN journal size for most Windows versions is 32 MB, which is insufficient for Data Anomalies on large virtual machines. Druva recommends the following USN journal sizes for different disk sizes:

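| File Count         | Disk Size | Maximum USN Journal Size |
|--------------------|-----------|--------------------------|
| Files > 10 million | 500 GB    | 2 GB                     |
| Files > 5 million  | 200 GB    | 1 GB                     |
| Files > 2 million  | 50 GB     | 512 MB                   |
| Files > 1 million  | 10 GB     | 256 MB                   |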

To increase the USN journal size manually, see the Microsoft documentation.
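As a rough aid for applying the table above, the following Python sketch maps a file count to the recommended maximum journal size in bytes. The function name and the fallback to the 32 MB default for volumes with 1 million files or fewer are assumptions made for illustration; they are not part of the product. The resulting byte value is the kind of number you would supply as the maximum size when resizing the journal (for example, with the Windows fsutil usn createjournal command; check the Microsoft documentation for the exact syntax).

```python
MB = 1024 * 1024
GB = 1024 * MB

# Recommended maximum USN journal sizes from the table, largest tier first.
# Each entry is (file-count threshold, recommended maximum size in bytes).
RECOMMENDED_USN_SIZES = [
    (10_000_000, 2 * GB),
    (5_000_000, 1 * GB),
    (2_000_000, 512 * MB),
    (1_000_000, 256 * MB),
]

# Default size quoted in the text for most Windows versions.
# Using it as the fallback below is an assumption of this sketch.
DEFAULT_USN_SIZE = 32 * MB


def recommended_usn_journal_size(file_count: int) -> int:
    """Return the recommended maximum USN journal size in bytes."""
    for threshold, size in RECOMMENDED_USN_SIZES:
        if file_count > threshold:
            return size
    return DEFAULT_USN_SIZE


if __name__ == "__main__":
    for count in (500_000, 3_000_000, 12_000_000):
        size_mb = recommended_usn_journal_size(count) // MB
        print(f"{count} files: {size_mb} MB")
```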

  • For Linux virtual machines: The inotify maximum watches limit (fs.inotify.max_user_watches) must be greater than the number of directories on the virtual machine.

  • For Linux virtual machines: The virtual machine must use one of the following file system types: xfs, ext4, or ext3.

For more information about the software requirements for VMware, see the Support matrix for VMware.
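To check the inotify prerequisite for Linux virtual machines, a script along the following lines can be run on the guest. It is a minimal sketch, not a Druva tool: the function names are hypothetical, and it assumes the kernel exposes the limit at /proc/sys/fs/inotify/max_user_watches, which is the standard location on Linux.

```python
import os
from pathlib import Path

# Standard Linux location of the per-user inotify watch limit.
INOTIFY_LIMIT_PATH = Path("/proc/sys/fs/inotify/max_user_watches")


def count_directories(root: str) -> int:
    """Count root plus every directory below it.

    Symlinked directories are counted but not descended into, and
    directories that cannot be read are skipped silently by os.walk.
    """
    total = 1  # the root directory itself
    for _dirpath, dirnames, _filenames in os.walk(root):
        total += len(dirnames)
    return total


def read_inotify_watch_limit(path: Path = INOTIFY_LIMIT_PATH) -> int:
    """Read the current inotify watch limit from procfs."""
    return int(path.read_text().strip())


def watch_limit_is_sufficient(directory_count: int, watch_limit: int) -> bool:
    """The limit must be strictly greater than the number of directories."""
    return watch_limit > directory_count


if __name__ == "__main__" and INOTIFY_LIMIT_PATH.exists():
    dirs = count_directories("/")
    limit = read_inotify_watch_limit()
    print(f"directories={dirs} limit={limit} ok={watch_limit_is_sufficient(dirs, limit)}")
```

Walking the whole file system can take a while on large virtual machines, so you may prefer to point count_directories at the data volumes only.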

  • Ensure that the following URLs are whitelisted and allowed for a successful VMware Data Anomalies scan:

*s3.amazonaws.com/*

s3-*.amazonaws.com

s3*.*.amazonaws.com

For more information, see,

Support matrix for VMware Data Anomalies

The following Windows versions are supported for VMware Data Anomalies:

  • Windows 10 (32 and 64-bit)

  • Windows Server 2012 (64-bit)

  • Windows Server 2016 (64-bit)

  • Windows Server 2019 (64-bit)

  • Windows Server 2022 (64-bit)

The following are the supported Linux (64-bit) versions for VMware Data Anomalies:

  • Red Hat Enterprise Linux (RHEL) 7.0, 7.1, 7.2, 7.3, 7.4, 7.5

  • CentOS 7.0, 7.1, 7.2, 7.3, 7.4, 7.5

  • Ubuntu 16.04, 18.04

Things to Consider

The following are a few limitations that you should know about before using Data Anomalies for VMware:

Error: Data Anomalies scan fails with the following error: Invalid pid for Guest VM execution.

Description: This error is observed in the following scenarios:

  • The glibc library version of the guest virtual machine is lower than 2.14

  • The default SELinux restriction enforced by Red Hat is in effect. This is specifically observed for the SELinux policy version selinux-policy-3.13.1-268.el7_9.2.noarch

Workaround: To resolve this issue, do the following:

  • Upgrade the glibc library version of the guest virtual machine to 2.14 or above

  • To bypass the SELinux restriction enforced by Red Hat, perform the following steps:

    1. Run the following command to check the SELinux status: # sestatus

    2. Run the following command to set the SELinux mode to permissive: # setenforce 0

    3. To make the change persist across reboots, edit the SELinux configuration file and set SELINUX=permissive: # sudo vi /etc/sysconfig/selinux

For more information, see Red Hat documentation.
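To find out whether the glibc condition applies, the version can be checked on the guest virtual machine before running a scan. The following is a minimal sketch that uses Python's standard platform.libc_ver() call; the helper names are hypothetical, and the script must run on the guest itself, since it reports the C library of the machine it runs on.

```python
import platform


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '2.17' into (2, 17)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_meets_minimum(version: str, minimum: tuple[int, int] = (2, 14)) -> bool:
    """Return True if the given glibc version is at least the minimum."""
    parsed = parse_version(version)
    return bool(parsed) and parsed >= minimum


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib == "glibc":
        print(f"glibc {version}: ok={glibc_meets_minimum(version)}")
    else:
        print("glibc was not detected on this machine")
```

Versions are compared as integer tuples rather than strings, so that 2.5 is correctly treated as older than 2.14.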

  • Data Anomalies for VMware - Linux: A Modified alert displays an event count in case of a change made only to file permissions without any modification in the file contents. You can safely ignore those events.

  • Data Anomalies for VMware - Windows: When files are deleted, the Data Anomalies scan cannot find their metadata in the USN journal or in Windows using the file ID. For such files, the scan displays its own launch timestamp as the file timestamp.

How does Druva detect Data Anomalies

Druva’s automated intelligence analyzes and monitors the data activity trend for a given resource, and after a sufficient sample size, it builds the anomaly baseline. An alert is automatically generated and reported in case of any anomalous activity.

What do we mean by baseline?

In the Data Anomalies feature context, a baseline refers to the expected pattern of data behavior over a specific period. It serves as a reference point or benchmark against which you can detect deviations or anomalies.

Step 1: Learning period: In this step, Druva performs a data backup pattern analysis. See Data backup pattern analysis period.

Step 2: Data Anomalies detection process: In this step, Druva checks the backed-up files to detect anomalous file actions such as creation, update, deletion, and encryption.


❗ Important

For VMware resources, backup and the Data Anomalies detection process run simultaneously.


Step 3: Generate and send a Data Anomalies alert: If any anomalous data activity is detected, a Data Anomalies alert is sent.

Following are the algorithm input parameters that Druva requires and uses to analyze the data activity trend and generate alerts in case of any suspicious data activity:

  • Data backup pattern analysis period for resources - Endpoints, File Server, NAS, VMware, Microsoft 365 (OneDrive and SharePoint): Displayed in Days or Snapshots

  • Number of files in a snapshot: A minimum number of files required within a snapshot to initiate Data Anomalies learning and scanning.


💡 Tip

If the total number of files in a snapshot is less than the minimum number of files, then that snapshot is not scanned for Data Anomalies detection.


  • Deviation in the files from the baseline and total files in a snapshot: Percentage deviation threshold compared to the baseline and total files in a snapshot required to qualify as anomalous data.

Recommended Data Anomalies settings

The following are the recommended Data Anomalies settings for the analysis period required to start encryption checks for a resource and for generating Data Anomalies alerts.


❗ Important

We recommend that you keep the default - Recommended Data Anomalies Settings if you are not sure about the data backup pattern of your organization.


Data backup pattern analysis period for resources

  1. 30 days (For Endpoints and OneDrive): The default and recommended setting for Endpoints and OneDrive data backup pattern analysis. The Data Anomalies detection for Endpoints and OneDrive will start only if data has been successfully backed up for the past 30 days. The permissible settings for days or snapshots are between 2 and 45.

  2. 30 days (For File Server/NAS/VMware/SharePoint): The default and recommended setting for File Server/NAS/VMware/SharePoint data backup pattern analysis. The Data Anomalies detection for File Server/NAS/VMware/SharePoint will start only if data has been successfully backed up for the past 30 days. The permissible settings for days or snapshots are between 2 and 45.

  3. 100 or more files in a snapshot: The default and recommended setting for the minimum required files in a snapshot of a resource to initiate Data Anomalies detection for resources - Endpoints, OneDrive, File Server, NAS, VMware, and SharePoint. The permissible setting for a minimum count of files is between 20 and 500.

  4. 75% of baseline in the snapshot: The default and recommended maximum setting for the file actions (Create, Update, and Delete) in a snapshot for a resource to generate a Data Anomalies alert. Data Anomalies alert is generated if the deviation is observed beyond the set baseline value. The permissible setting for baseline is between 50 and 99%.

  5. % of the total files in a snapshot: The default and recommended setting for the minimum change in the count of files out of the total files in a snapshot to generate Data Anomalies alert.

Endpoints, File Server, NAS, and OneDrive: 70% of the total files in a snapshot

VMware and SharePoint: 20% of the total files in a snapshot

The permissible setting for a minimum change in the count of files is between 5% and 90% for Endpoints, File Server, NAS, OneDrive, VMware, and SharePoint.

Both the 4th and 5th conditions should be met for Data Anomalies alert to get generated.
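The defaults and permissible ranges listed above can be captured in a small validation helper, for example to sanity-check values before entering them in the console. This Python sketch is illustrative only: the class and field names are hypothetical and do not correspond to any Druva API, and the ranges are assumed to be inclusive at both ends.

```python
from dataclasses import dataclass

# Permissible ranges as listed in the text (assumed inclusive).
PERMISSIBLE_RANGES = {
    "analysis_period": (2, 45),           # days or snapshots
    "min_files_per_snapshot": (20, 500),
    "baseline_deviation_pct": (50, 99),
    "total_files_change_pct": (5, 90),
}


@dataclass(frozen=True)
class AnomalySettings:
    """Hypothetical container; defaults are the recommended values."""
    analysis_period: int = 30
    min_files_per_snapshot: int = 100
    baseline_deviation_pct: int = 75
    # 70 for Endpoints, File Server, NAS, and OneDrive; use 20 for
    # VMware and SharePoint.
    total_files_change_pct: int = 70

    def validate(self) -> list[str]:
        """Return one message per value outside its permissible range."""
        errors = []
        for name, (low, high) in PERMISSIBLE_RANGES.items():
            value = getattr(self, name)
            if not low <= value <= high:
                errors.append(f"{name}={value} is outside {low}-{high}")
        return errors


if __name__ == "__main__":
    print(AnomalySettings().validate())
    print(AnomalySettings(analysis_period=60).validate())
```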

You can use the Data Anomalies Settings > Edit option to customize and update the Data Anomalies configuration settings as per your organizational requirements and if you are aware of the data backup patterns.


❗ Important

If you have selected snapshots as your data backup pattern learning period criteria, ensure that the learning duration is completed within 45 days.


The following table explains the Data Anomalies behavior for Endpoints, OneDrive, and File Server/NAS/VMware/SharePoint resources:


❗ Important

First backup is not considered for Data Anomalies detection.


Example

Scenario: Data Anomalies is enabled for a resource that has a total of 500 files, with the following Data Anomalies settings.

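| Backup pattern learning period | Minimum number of files required in a snapshot for Data Anomalies detection | Maximum deviation | Minimum percent of total file change |
|--------------------------------|------------------------------------------------------------------------------|-------------------|--------------------------------------|
| 5 snapshots                    | 125                                                                          | 50%               | 20%                                  |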

The following example explains the Data Anomalies behavior using the Data Anomalies settings mentioned in the table above.

For the first backup, there were 500 files backed up. Being the first backup, this will be excluded by the Data Anomalies algorithm.

Let's consider subsequent backups in the following trend:

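| Snapshot# | Created | Modified | Deleted |
|-----------|---------|----------|---------|
| 2         | 20      | 5        | 8       |
| 3         | 12      | 7        | 1       |
| 4         | 0       | 0        | 10      |
| 5         | 0       | 0        | 0       |
| 6         | 5       | 0        | 8       |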

We have a total of 520 files after the 6th backup. The learning duration of 5 snapshots is complete, so Data Anomalies detection starts and alerts can be generated if an anomaly is detected.

Now, the baseline is as follows:

  • Baseline for creation = maximum number of new files created in any snapshot of the last learning duration, that is, the maximum of 20, 12, 0, 0, 5, which is 20

  • Baseline for modification/update = maximum number of modified/updated files in any snapshot of the last learning duration, that is, the maximum of 5, 7, 0, 0, 0, which is 7

  • Baseline for deletion = maximum number of deleted files in any snapshot of the last learning duration, that is, the maximum of 8, 1, 10, 0, 8, which is 10

The baseline for creation, modification, and deletion is 20, 7, and 10 respectively.
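The baseline calculation described here is a maximum taken over a sliding window of the most recent snapshots. The following Python sketch reproduces the three values from this example; the function name is hypothetical, and the code illustrates the arithmetic only, not Druva's implementation.

```python
def rolling_baseline(counts: list[int], window: int) -> int:
    """Return the maximum count over the most recent `window` snapshots."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(counts) < window:
        raise ValueError("learning period is not complete yet")
    return max(counts[-window:])


WINDOW = 5  # learning period of 5 snapshots, as in the example

# Per-snapshot counts for snapshots 2 to 6 (the first backup is excluded).
created = [20, 12, 0, 0, 5]
modified = [5, 7, 0, 0, 0]
deleted = [8, 1, 10, 0, 8]

baselines = {
    "created": rolling_baseline(created, WINDOW),
    "modified": rolling_baseline(modified, WINDOW),
    "deleted": rolling_baseline(deleted, WINDOW),
}
print(baselines)  # {'created': 20, 'modified': 7, 'deleted': 10}
```

Because the window slides, appending the counts of the next snapshot and calling the function again drops the oldest snapshot from the calculation and gives the baseline for the snapshot after that.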

Let's proceed with the next round of backups in the following trend:

| Snapshot# | Created (Baseline for Creation)     | Modified (Baseline for Modification) | Deleted (Baseline for Deletion)  | Total files in last backup |
|-----------|-------------------------------------|--------------------------------------|----------------------------------|----------------------------|
| 7         | 10 (20)                             | 5 (7)                                | 7 (10)                           | 520                        |
| 8         | 2 (Max of 12, 0, 0, 5, 10 = 12)     | 1 (Max of 7, 0, 0, 0, 5 = 7)         | 2 (Max of 1, 10, 0, 8, 7 = 10)   | 523                        |
| 9         | 100 (Max of 0, 0, 5, 10, 2 = 10)    | 0 (Max of 0, 0, 0, 5, 1 = 5)         | 8 (Max of 10, 0, 8, 7, 2 = 10)   | 523                        |
| 10        | 80 (Max of 0, 5, 10, 2, 100 = 100)  | 4 (Max of 0, 0, 5, 1, 0 = 5)         | 10 (Max of 0, 8, 7, 2, 8 = 8)    | 615                        |
| 11        | 0 (Max of 5, 10, 2, 100, 80 = 100)  | 12 (Max of 0, 5, 1, 0, 4 = 5)        | 8 (Max of 8, 7, 2, 8, 10 = 10)   | 685                        |
| 12        | 5 (Max of 10, 2, 100, 80, 0 = 100)  | 50 (Max of 5, 1, 0, 4, 12 = 12)      | 70 (Max of 7, 2, 8, 10, 8 = 10)  | 677                        |
| 13        | 200 (Max of 2, 100, 80, 0, 5 = 100) | 0 (Max of 1, 0, 4, 12, 50 = 50)      | 0 (Max of 2, 8, 10, 8, 70 = 70)  | 612                        |

At the 9th snapshot, 100 files are created and a creation alert is generated because all three required conditions are met:

  • Total number of files > minimum number of files required, that is, 125

  • Baseline for creation = 10; number of files created > baseline * maximum deviation

  • New files created > minimum percent of total file change

Similarly, at the 12th snapshot, modification and deletion alerts are generated because all three required conditions are met for both.
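The three conditions can be combined into a single check, sketched below in Python. This is one possible reading, not Druva's actual formula: it assumes that the maximum deviation is the percentage by which a count must exceed the baseline, and that a snapshot with exactly the minimum number of files still qualifies for scanning. The function name, parameter names, and the figures in the usage lines are hypothetical and are not taken from the example tables.

```python
def should_alert(
    changed: int,
    baseline: int,
    total_files: int,
    min_files: int,
    max_deviation_pct: float,
    min_change_pct: float,
) -> bool:
    """Return True only if all three alert conditions hold."""
    # Condition 1: the snapshot has enough files to be scanned at all.
    enough_files = total_files >= min_files
    # Condition 2 (assumed reading): the count exceeds the baseline by
    # more than the configured deviation percentage.
    beyond_baseline = changed > baseline * (1 + max_deviation_pct / 100)
    # Condition 3: the count exceeds the minimum share of total files.
    large_share = changed > total_files * min_change_pct / 100
    return enough_files and beyond_baseline and large_share


# Hypothetical figures: 1000 files in total, baseline of 40 for the action.
settings = dict(min_files=125, max_deviation_pct=50, min_change_pct=20)
print(should_alert(changed=300, baseline=40, total_files=1000, **settings))  # True
print(should_alert(changed=150, baseline=40, total_files=1000, **settings))  # False
print(should_alert(changed=50, baseline=40, total_files=1000, **settings))   # False
```

In the second call the count is well beyond the baseline but stays under 20% of the total files, so no alert is raised. In the third call the count does not exceed the baseline by more than 50%.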

Administrators can take action based on the security policies of the organization to identify and isolate a possible threat and prevent additional losses.


❗ Important

Anomaly detection kicks in only after the backup job is complete and a snapshot is created. For incomplete backup jobs or interrupted backup jobs, no anomalous behavior is tracked.


View Data Anomalies alerts


📝 Note
In the case of deleted resources (devices, sites, and backupsets) you cannot view the alerts for those resources. However, you can retrieve the deleted resources and view their alerts with the Rollback Action option.


Log in to the Management Console and go to Cyber Resiliency > Posture & Observability > Data Anomalies > Anomalies tab to view Data Anomalies details.

Take action on an alert

For any Data Anomalies alert, you can do any of the following:

  • Ignore the alert: If you deem any alert as a false positive, click the resource name and select the false positive alert. Click Ignore to resolve the alert.

  • Quarantine the resource: Select an alert and click Quarantine Resource to stop the ransomware from spreading further. Before you quarantine, see Know the impact of quarantining to learn more about the effects of quarantining the resource. To learn about the options to quarantine a resource, see Quarantine Response.

  • You can also download the logs for a particular alert and use them for further inspection.


📝 Note
For each backup of all workloads, you can download logs for up to 1.5 million files.


The downloaded logs provide information about the following:

  • File Name: Name of the file

  • Full Path: Path of the file.

  • File Type: The type of file. For example, .txt

  • File Size (Bytes): Size of the file

  • File Modified Timestamp: The date and time when the file was modified

  • Operation: The operation performed on the file. For example: File created, file modified, file deleted, files encrypted

  • SHA1 Checksum (Only for Endpoints, OneDrive, and SharePoint): The SHA1 Checksum value of the file

  • File Owner (Only for Endpoints, OneDrive, and SharePoint): The details of the file owner

  • File Created Timestamp (Only for Endpoints, OneDrive, and SharePoint): The date and time when the file was created

  • File Modified By (Only for OneDrive and SharePoint): The user who last modified the file

  • Alert Reason: The reason for encryption alerts.


❗ Important

In case of encryption, the downloaded logs will contain details for a maximum of 100 encrypted files.


After you have taken an action, the status of the alert changes to Resolved.

