Fine Tuning

In This Section:

Customized Deployment

Check Point DLP provides the MultiSpect set of features. These features provide the flexibility you need to monitor and ensure accuracy of your DLP deployment. For example, if you find incidents that called for actions but should have passed without delay, you can change the Data Types and/or the rules to ensure that this does not occur again. In this way you fine-tune DLP over a relatively short amount of time to create a trustworthy implementation.

You can also include User Decisions to fine-tune Data Types and rules. How useful this information is depends on how well you communicate with users. Make sure they know that their input can influence the DLP - if they want a type of data to be sent without delay, and can explain why, you will use their logged decisions to change the rules.

MultiSpect includes:

Compound Data Type - This data type enables you to join multiple Data Types in AND and NOT checks. A rule that uses a compound data type matches transmissions that contain all the AND types and none of the NOT types.
Data Type Groups - You can group together multiple Data Types of any category. The Data Types, when used in a rule, match transmissions on an OR check.
CPcode Data Type - The CPcode syntax provides unmatched flexibility. You create the data type and its features, with all the power of an open programming language. Change the code as needed to improve accuracy, and to allow messages that user decisions tell you should be passed.
Flags for Data Types and Rules - While managing Data Types and reading the logs and analysis of DLP usage, use the flags on Data Types and on rules to help ensure accuracy. Flagged Data Types and rules are added to the Overview page for efficient management.
Placeholder Data Types - Several provided Data Types describe dictionaries and keywords that you should customize with your own lists. For example, the empty placeholder Employee Names should be replaced with your own list of employees. This Data Type is used in compound Data Types and provided rules. Placeholders are flagged with the Improve Accuracy flag out-of-the-box.

In this stage, you may decide to set some rules to Prevent. When DLP captures a Prevent incident, the data transmission is stopped completely; the user has no option to continue the send. (It is recommended that such rules include notification to data owner and to user.)

Setting Rules to Prevent

To have full Data Loss Prevention, you might think that data transmissions with protected data should all be prevented from leaving the organization. However, setting all your rules to Prevent from the start will cause so many disruptions to the mission-critical work of your organization that the protection will become worse than meaningless. The best practice is to set rules to Prevent only after users have become familiar with the Organization Guidelines and audits of your logs have shown that automated prevention of user-initiated actions is necessary - and then, only for specific Data Types, users, or other parameters.

Note - This is one reason why you might want to create a user group for new employees, so that they can learn from the UserCheck stage before having their transmissions automatically prevented.

Another user group you will probably find useful is one for terminating employees.

It is recommended that for rules set to Prevent that also have a High or Critical severity, you also set Email in the Track parameter. This will ensure that the data owners are notified by email as soon as such an incident is prevented.

To set a rule to Prevent:

Open Data Loss Prevention > Policy.
In the Action column of the rule to change, right-click and select Prevent.

Multi-Realm Authentication Support

One of the ways DLP authenticates users is by querying the Active Directory servers configured in SmartDashboard. If a legitimate user has multiple accounts on different AD servers, each account associated with a different password, the user may fail to authenticate. DLP validates the user according to the credentials supplied by the first AD server to respond. To help prevent this error, and decrease the load created by constantly querying all AD servers, you can define which AD servers DLP queries when:

A user enters credentials for the DLP portal or UserCheck agent
DLP looks up an email address extracted from SMTP traffic to identify a user

To define AD servers using GuiDBedit:

Open GuiDBedit.
On the Tables tab, open Other > authentication_objects.
In the Object Name column, select DLPSenderRealm.
In the Field Name column, double-click the ldap_au container.
The Add/Edit Element window opens.
In the Object list, select only those servers DLP must query for authentication purposes.
On a network that contains ten AD servers, perhaps only two of them must be queried. Edit the list to include only the required AD servers.

Note - These AD servers must first be defined in SmartDashboard.
Click OK.
Save the database and close GuiDBedit.
Install the updated policy on the DLP enabled gateway.

Troubleshooting DLP Related Authentication Issues

The Check Point database tool, GuiDBedit, has a number of properties that set default authentication values. These properties can be used in troubleshooting DLP related authentication issues. These objects are found under: GuiDBedit > Tables > Other > authentication_objects:

Object	Description
`DLPSenderRealm`	Controls authentication for the DLP portal and the UserCheck agent. This object contains: `Fetch_options > do_internal_fetch` True by default, meaning DLP does the email look up against user accounts in SmartDashboard. `Fetch_options > do_ldap_fetch` True by default, meaning if DLP fails to identify the user through a user account in SmartDashboard, it then queries the AD servers defined in the `ldap_au` container object. The `ldap_au` container holds objects that represent AD servers. Use `DLPSenderRealm` to solve authentication problems.
`dlp_ldap_auth_settings`	This object controls how DLP identifies users by querying the email address attribute in the Active Directory. Use this object to troubleshoot problems involving email look up in the Active Directory. The `CustomLoginAttr` string lets you enter a custom LDAP query with a specified email address. The default query is: `|(mail=<<>>)(proxyAddresses=smtp:<<>>)` By default, it searches for the user with the specified email address. To refine the query, you can add other AD attributes to the query or change existing ones. WARNING: Changing this default query might affect DLP rules that enforce a policy according to users or user groups defined by access roles. Known users may become Unknown, and the data they send may be allowed to leave the organization.
`dlp_internal_auth_settings`	This object controls how DLP identifies users by querying the email address attribute in the database of internal users defined in SmartDashboard.
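
The default query can be read as a template into which the looked-up email address is inserted. The following Python sketch illustrates that reading only. It assumes that `<<>>` marks the position of the email address, which the text above implies but does not state. The function names are hypothetical and are not part of GuiDBedit or DLP.

```python
# Illustrative sketch only: assumes "<<>>" is the placeholder for the email address.
DEFAULT_QUERY = "|(mail=<<>>)(proxyAddresses=smtp:<<>>)"


def escape_ldap_value(value: str) -> str:
    """Escape characters that are special in LDAP search filters (RFC 4515)."""
    replacements = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}
    return "".join(replacements.get(ch, ch) for ch in value)


def expand_query(template: str, email: str) -> str:
    """Insert the escaped email address at every placeholder in the template."""
    return template.replace("<<>>", escape_ldap_value(email))


if __name__ == "__main__":
    print(expand_query(DEFAULT_QUERY, "jane.doe@example.com"))
```

Expanding a refined query this way before you save it can help you check that each attribute still receives the address you expect.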

Defining Data Types

The optimal method for defining new data type representations is to use the Data Type Wizard.

First, review the predefined Data Types: you might not need to add more. If the data assets that you want to protect from leakage are not represented in the Data Types page, open the Data Type Wizard.

To add a new data type:

On the SmartDashboard, open the Data Loss Prevention tab.
Open Data Types and click New; or in Policy > Data column, double-click and in the Add Data Types window, click New.
The Data Type Wizard opens.
Enter a name for the new data type.
Choose an option that defines the type of traffic that will be checked against a rule containing this data type.
Fill in the properties as required in the next step (each step is relevant to the option selected in the previous step).
Click Finish.

Protecting Data By Keyword

You can create a list of keywords that will be matched against data transmissions. Transmissions whose data contains words from this list are matched. You define whether the match requires all of the keywords, any one of them, or a specific number of them.

To create a data type representation of specified keywords:

In the Data Type Wizard, select Keywords.
Click Next.
The next step is the Specify Keywords window.
Enter a keyword to protect.
Click Add.
Enter as many keywords or phrases as you want in this data type.
Decide whether data should be matched if all the keywords in this list are matched, if only one match is necessary, or a specific number should be matched.
For example, if you want to ensure that no one can send an email that contains any of the names of congressmen in a committee, their names would be the keywords and you would set the Threshold to At least 1. (Note that the higher the threshold, the more precise the results will be.)

If you wanted to allow emails mentioning the congressmen, but decided that all of their names in one email would be suspicious, then set Threshold to All words must appear.
Click Next.
Click Finish; or if you want to add more parameters to the data type, select the checkbox and then click Finish.
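
The threshold options in the procedure above can be sketched in Python. This is a minimal illustration, not the product's matching engine. It assumes case-insensitive substring matching, and the function names and sample names are hypothetical.

```python
def count_matched_keywords(text, keywords):
    """Return how many distinct keywords or phrases occur in the text."""
    lowered = text.lower()
    return sum(1 for keyword in keywords if keyword.lower() in lowered)


def keywords_match(text, keywords, threshold=None):
    """Model the Threshold setting.

    threshold=None models "All words must appear";
    an integer N models "At least N".
    """
    required = len(keywords) if threshold is None else threshold
    return count_matched_keywords(text, keywords) >= required


if __name__ == "__main__":
    committee = ["Alice Adams", "Bob Brown", "Carol Clark"]
    email = "Meeting with Bob Brown today"
    print(keywords_match(email, committee, threshold=1))  # any one name is enough
    print(keywords_match(email, committee))               # all names are required
```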

Protecting Documents by Template

Confidential and sensitive documents are often based on templates. A template defines the headers, footers, seals, and formatting of related documents. This is what makes all court orders, for example, look the same.

You can create a Data Type that protects documents based on a specific template. You then add the Data Type to a rule and connections that contain such a document are matched by the policy.

Important - When a template including images is attached to a DLP Template Data Type, the image file format is important. The file format used in the template must match the file format in the user document. If the file formats are different, the rule will not trigger a DLP response.

For example, if the template contains a JPG image and the user document contains the image in GIF format, there is no DLP response.

To create a Data Type representation of documents based on a template:

In the Data Type Wizard, select Documents based on corporate template.
Click Next.
Browse to the template file on your system.
This file does not have to be known as a template in the application: the template for the Data Type may be a *.doc file and does not have to be a *.dot file. Choose any file that is a basic example of documents that might be sent.
Move the Similarity slider to determine how closely a document must match the given template to be considered protected.
It is recommended that you first set this slider quite low; the higher it is, the less the rule will catch. After completing the wizard, send a test email with such a document, and check the SmartView Tracker logs to see if the document was caught. Slowly increase the Similarity level until the rule is catching the documents you want. This will be different for each template.
Click Next.
Click Finish.
To configure additional properties for the Data Type, select the Configure additional Data Type properties after clicking Finish checkbox.

Property	Description
Match empty templates	Select this option if you want DLP to match the Data Type on an empty template. An empty template is a template that is identical to the uploaded corporate template. If the option is not selected, an empty template is detected but the Data Type is not matched. The template is not considered confidential until it contains inserted private data. Note: the rule is bypassed for this document, but the document may still be matched by another DLP rule in the policy.
Consider template's images	Incorporates a template's graphic images into the matching process. Including template images increases the similarity score calculated between the template and the examined document. The higher the score, the more accurate the match. Select this option if the graphic images used in a template document suggest that the document is confidential.

Alternative to slider testing:

If you want to catch documents that match on different levels with different actions, you may try this procedure:

Create the Data Type for the template, setting the slider to 10%.
In the Policy window, create a Detect rule that tracks matching documents but does not stop them.
Create another Data Type, just like the first, but set the slider to 50%.
Create an Ask User rule that tracks matching documents and holds the transmission until the user decides whether it should be sent or is too sensitive and should be deleted.
Create a third Data Type, with the slider set to 90%.
Create a Prevent rule that tracks matching documents and blocks the transmission.
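
The three-tier procedure above can be pictured with a short Python sketch. It assumes that a template Data Type matches when the document's similarity score reaches the slider value. It also leaves open which action the gateway enforces when several rules match, because this section does not say. The names are hypothetical.

```python
# Slider value (percent) and the action of the rule that uses that Data Type.
TIERS = [(10, "Detect"), (50, "Ask User"), (90, "Prevent")]


def matching_actions(similarity_percent, tiers=TIERS):
    """Return the action of every tier whose slider value the document reaches."""
    return [action for minimum, action in tiers if similarity_percent >= minimum]


if __name__ == "__main__":
    for score in (5, 60, 95):
        print(score, matching_actions(score))
```

A document that scores 60% is tracked by the Detect rule and held by the Ask User rule, but does not reach the Prevent tier.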

Protecting Files by Attributes

Create a data type that protects files based on file type, file name, and file size. Transmissions that contain a file that matches the parameters are matched.

To create a data type representation of files:

In the Data Type Wizard, select Files.
Click Next.

Select the appropriate parameters:

Note - A file must match all the parameters that you define here, for it to be matched to the rule. Thus, the more parameters you can set here with assurance, the more accurate the results will be.

The file type is any of these types - Click the add button to select from the Add File Types window.
The file name contains - Enter a string or regular expression to match against file names.
The file size is larger than - Enter the threshold size in KB.

Click Next.
Click Finish, or if you want to add more parameters to the data type, select the checkbox and then click Finish.
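
Because a file must match every parameter that you define, the check behaves like a logical AND over the parameters that are set. The Python sketch below illustrates this. It assumes the file type is already known as a short label. How DLP determines the file type is not covered here, and all names are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FileRule:
    file_types: set = field(default_factory=set)  # empty set: parameter not defined
    name_pattern: Optional[str] = None            # string or regular expression
    larger_than_kb: Optional[int] = None          # threshold size in KB


def file_matches(rule, name, file_type, size_kb):
    """Return True only if the file satisfies every parameter that is defined."""
    if rule.file_types and file_type not in rule.file_types:
        return False
    if rule.name_pattern is not None and not re.search(rule.name_pattern, name):
        return False
    if rule.larger_than_kb is not None and size_kb <= rule.larger_than_kb:
        return False
    return True


if __name__ == "__main__":
    rule = FileRule({"xlsx"}, r"budget", 100)
    print(file_matches(rule, "budget_2024.xlsx", "xlsx", 250))
```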

Protecting Data by Pattern

You can create a regular expression that will be matched against content in data transmissions. Transmissions that contain strings that match the pattern in their data are matched.

Note - Use the Check Point supported regular expression syntax.

To create a data type representation of a pattern:

In the Data Type Wizard, select Pattern (regular expressions).
Click Next.
Enter a pattern to match against content.
Click Add.
Enter as many regular expressions as you want in this data type.
Decide whether data should match the data type if the pattern is matched even once, or only after the pattern is matched a given number of times.
For example, if you want to ensure that no one can send an email that contains a complete price-list of five products, you would set the pattern to "^[0-9]+(\.[0-9]{2})?$" and you would set the Number of occurrences to 5.
Click Next.
Click Finish; or if you want to add more parameters to the data type, select the checkbox and then click Finish.
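
The price-list example can be tried out with Python's `re` module before you enter the pattern in the wizard. This sketch assumes each price sits on its own line, so the `^` and `$` anchors are applied per line, and it reads Number of occurrences as "at least this many". Python's regular expression syntax can differ from the Check Point supported syntax, so treat the result as an approximation. The function names are hypothetical.

```python
import re

# The pattern from the example above, applied line by line.
PRICE_PATTERN = re.compile(r"^[0-9]+(\.[0-9]{2})?$", re.MULTILINE)


def count_occurrences(text, pattern=PRICE_PATTERN):
    """Count non-overlapping matches of the pattern in the text."""
    return sum(1 for _ in pattern.finditer(text))


def pattern_matches(text, min_occurrences, pattern=PRICE_PATTERN):
    """Return True when the pattern occurs at least min_occurrences times."""
    return count_occurrences(text, pattern) >= min_occurrences


if __name__ == "__main__":
    price_list = "10.00\n25\n3.50\n7\n99.99\nitem"
    print(count_occurrences(price_list), pattern_matches(price_list, 5))
```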

Defining Compound Data Types

You can create a complex data type representation. A compound data type includes multiple Data Types, which are matched either on AND (a number of Data Types are matched), or NOT (necessary Data Types are not present), or both.

For example, you can look for files or emails that contain patient records. You could create a data type that combines documents that match a patient record template, with a dictionary data type that contains a group of patient names who have not signed release forms. Now you have a single data type that will match emails or FTP transfers that contain patient records of patients who have not signed a release form.

To create a compound data type representation:

In the Data Type Wizard, select Compound.
Click Next.
In the first section, click Add and select Data Types to match on AND.
In the second section, click Add and select Data Types to match on NOT.
If a transmission is sent that matches all the Data Types of the first section and none of the Data Types in the second section, the data of the transmission is matched to the compound Data Type.
Click Next.
Click Finish; or if you want to add more parameters to the Data Type, select the checkbox and then click Finish.
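
The AND and NOT sections combine as in the following Python sketch. It assumes you already know which Data Types a transmission matched. The function name and the Data Type names are hypothetical.

```python
def compound_matches(matched_types, and_types, not_types):
    """Return True when all AND types matched and none of the NOT types matched."""
    matched = set(matched_types)
    return set(and_types) <= matched and matched.isdisjoint(not_types)


if __name__ == "__main__":
    transmission = {"Patient Record Template", "Unreleased Patient Names"}
    print(compound_matches(
        transmission,
        and_types=["Patient Record Template", "Unreleased Patient Names"],
        not_types=["Signed Release Form"],
    ))
```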

Protecting Data by Fingerprint

Many Data Types identify data by classifying it according to keywords or file attributes such as document type, name, or size. Classifications and attributes are used to describe the data. The fingerprint Data Type does not rely on a description of the data. The fingerprint Data Type identifies the data according to a unique signature known as a fingerprint. A fingerprint accurately identifies confidential files or parts of confidential files.

The Fingerprint Data Type can accurately identify files that the organization considers confidential. It matches complete files or parts of them.

Generating the unique signature

First you identify a repository. A repository is a network location that contains files that must not go outside of the organization. The DLP blade scans these data files and generates a unique signature for each file.
When a file passes through a DLP gateway, the file is scanned and a signature generated.
The signature of the file passing through the DLP gateway is compared against the signatures of files in the repository. If there is a signature match, the file scanned by the gateway is prevented from going outside of the organization.

Repository Scanning

Files in the repository are constantly changing. New files are added, existing files modified or deleted. To keep file signatures up to date, the repository must be scanned on a regular basis. By default, the repository is automatically scanned every day. If a file is added or modified after a scan, the file's signature will not be updated until the next scheduled scan occurs.

Supported file shares for repositories:

CIFS
NFS

Note - Scans of a repository that has already been scanned take less time. Unchanged files in a repository are skipped.

Filtering for Efficiency

A large repository might also contain many files that are not confidential and do not need to be scanned. The scan can be made more efficient by:

Accurately defining the location of data in the repository
Select only those folders that are known to contain confidential files. You may need help from the related department heads to do this. For example, not all the folders in the Finance department may contain confidential information. These folders do not have to be included in the scan.
Only scanning files that match specific Data Types, for example spreadsheet files or credit card numbers.
If you add Credit Card Numbers as the Data Type in the filter, all the files in the repository that contain credit card numbers are scanned and fingerprinted. If Spreadsheet file is selected as the Data Type in the filter, only spreadsheet files in the repository will be scanned and fingerprinted.

Granularity

Complete files do not have to go outside of an organization for data to be lost. Confidential data can be lost if sections from files in the repository are copied into other files, copied to email or posted to the web. A file in the repository may be saved locally and then modified in a way that it no longer matches the unique fingerprint signature. To identify such incidents, a partial match between files scanned by the DLP gateway and files in the repository can be configured. A partial match can be:

According to a percentage value
The number of text segments in the sent file is divided by the number of text segments in the repository file, and the result expressed as a percentage. A match occurs if this percentage is higher than the percentage configured on the General Properties page of the Data Type.
A number of identical text segments
A match occurs when the number of identical text segments in a scanned file and a file in the repository is higher than the number configured on the General Properties page of the Data Type.
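
The two partial-match modes can be compared with a small Python sketch. It is an interpretation, not the product's algorithm. It assumes both files are already split into text segments, and it takes the percentage to be the number of segments the sent file shares with the repository file, divided by the number of segments in the repository file. The segmentation method and the function names are hypothetical.

```python
def identical_segments(scanned_segments, repository_segments):
    """Count distinct text segments that appear in both files."""
    return len(set(scanned_segments) & set(repository_segments))


def partial_match(scanned_segments, repository_segments,
                  min_percent=None, min_segments=None):
    """Return True when the configured percentage or segment count is exceeded."""
    shared = identical_segments(scanned_segments, repository_segments)
    if min_percent is not None:
        total = len(set(repository_segments))
        percent = 100.0 * shared / total if total else 0.0
        return percent > min_percent
    if min_segments is not None:
        return shared > min_segments
    return False


if __name__ == "__main__":
    repository_file = ["a", "b", "c", "d"]
    sent_file = ["b", "c", "x"]
    print(partial_match(sent_file, repository_file, min_percent=40))
    print(partial_match(sent_file, repository_file, min_segments=2))
```

Both checks use a strict comparison, because the text above says a match occurs when the value is higher than the configured one.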

Scan Times

Large repositories might cause a scan to run all day. To prevent this, you might want to limit the scan to a specified range of hours. If a scan does not complete before the time range expires, the scan will recommence where it stopped when the next scheduled scan occurs.

Logging

Repository scans generate logs that can be viewed in SmartLog or SmartView Tracker. In SmartView Tracker, the Fingerprint Scans query shows all logs generated by a scan.

Logs are generated when:

The fingerprint Data Type is matched.
In the log:
- The Matched File field shows which file in the repository matches the scanned data.
- The Matched File Percentage field shows the percentage of segments in the scanned data that match segments from the file in the repository. A 100% match means the scanned data and the file in the repository are identical.
- The Matched File Text Segments field shows how many segments of the scanned data were matched to segments in the repository file.
A whitelist files scan has been started
A whitelist repository scan is running
A whitelist files scan has ended successfully
A repository scan has been started
A repository scan is running
A repository scan has ended successfully

Note - Running logs are generated every two hours. For a scan that lasts less than two hours, you will only see the start and finish logs.

Log Details

Fingerprint

Scan ID	A unique scan identification to distinguish between logs
Next Scheduled Scan Date	Time the scan started
Duration	How long the scan lasted
Scan Status	The status can be Running, Paused, Canceled, or Success
Number of errors	Number of errors encountered.

Fingerprint scan details

Repository root path	The upper level repository
Current directory	Current directory being scanned
Directories	The total number of directories in the repository selected in data locations.
Repository size (MB)	The size of the repository
Repository Files	The number of files in the repository
Directories scanned	The number of directories scanned so far
Scanned size (MB)	The number of MBs scanned so far
Scanned files	The number of files scanned so far
Unreachable directories	Number of sub directories in the repositories that could not be opened during the scan.
Fingerprinted files	The number of files with a fingerprint signature
Filtered files	The number of files that were not scanned because they did not meet the criteria set on the Repository Scan Filter page. For example file size, modification date, or Data Type.
Scan speed (KBs)	The speed of the scan
Progress	Percentage of the repository so far scanned
Remaining time	Estimated time to scan completion

To create a fingerprint Data Type:

In the Data Type Wizard, select Fingerprint.
Enter a name and informative comments for the Data Type.
This is the name that will show on the Data Loss Prevention > Repositories page.
Click Next.
In the Fingerprint window:
1. Click the Gateways arrow button to select gateways with the DLP blade enabled.
  By default, the DLP Blades object shows. This object represents all gateways that have the DLP blade enabled. Only gateways selected here scan the repository and enforce the fingerprint data type.
2. Define a network path to the repository.
3. If the repository defined in the network path requires a username and password to access it, enter the relevant authentication credentials.
Click Test Connectivity.
This tests that the DLP gateways selected in the Gateways list can access the repository using the (optional) assigned authentication credentials.
Click the Match Similarity arrow.
This option matches similarity between the document in the repository and the document being examined by the DLP gateway. You can specify an exact match with a document in the repository, or a partial match based on:
- A percentage value or
- Number of matched text segments.
Click Next.
Select Configure additional Data Type Properties after clicking Finish if you want to configure more properties.
Click Finish.
The New data type wizard closes. The data type shows in the list of data types and also on the Repositories page.

To configure more fingerprint properties:

In the Data Types window or Repositories window, double-click the fingerprint object to open it for editing. These properties can be configured:

General
Change the data entered in the Data Type wizard.
Data Owners
Add users or user groups that own the data. Data owners can be notified when the fingerprint data type is matched by a rule in the DLP policy.
Advanced Matching
Add CPcode scripts to apply more match criteria after the fingerprint data type is matched by a rule.
Scan Scheduling
Configure when the document repository is scanned to update the fingerprint data type. The default time object (Every-Day) has no time restrictions configured. This means that a scan runs without time restrictions after the fingerprint data type is added to a policy rule. If gateway resources and network bandwidth are an issue, limit the scan to off-peak hours.

Repository Scan Filter

This page offers more scanning criteria:

Scan files matching the following data types
This property lets you scan documents in the repository according to more data types, for example credit card numbers. If you add credit card numbers as the data type, all the files in the repository that contain credit card numbers are fingerprinted. If "spreadsheet files" are selected as the data type, only spreadsheet files in the repository are fingerprinted.
Scan files according to size
Only files between the specified minimum and maximum sizes are included in the fingerprint.

Scan files according to modification date

Only files that match the specified modification dates are included in the fingerprint.

Note - After a change to the filters (adding or removing a data type, selecting a different file size or modification date) the DLP gateway regards all files in the repository as new. In a large repository, this will result in a long scan. The fingerprint will only be enforced after this scan has ended.

Data locations
Use the Data Locations tree to include or not include repository sub-folders. If you want the fingerprint data type to prevent only one document type from leaving the organization, put that document in a folder that contains no other document. Select only that folder as the data location.

Using the Fingerprint Data Type

To use the fingerprint Data Type, you must:

Add the fingerprint Data Type to a DLP rule
Install a policy on the DLP enabled gateway
After the fingerprint Data Type is included in a policy, a scheduled scan occurs. After the scan successfully finishes, the fingerprint Data Type is enforced.

If you want to manually start a scan of the repository:
1. On the Repositories window, select the fingerprint Data Type.
2. In the summary pane for the Data Type, click Start.

NFS Repository scanning in NATed Environments

NATing, for example in a clustered environment where each member's connections are translated to the Virtual IP address of the cluster, prevents repository scanning when the repository is located on an NFS server. To enable repository scanning you must disable Hide NAT on all NFS services. The members of a cluster must be configured to send NFS related traffic using the member's IP address in the Source field of the packet, and not the Virtual IP of the cluster.

To disable Hide NAT on NFS services:

On the Security Management Server, open $FWDIR/lib/table.def for editing.
Search for the line: no_hide_services_ports.
These are the services and ports not included in Hide NAT.
Enter:
no_hide_services_ports = { <111, 17>, <111, 6>, <4046, 17>, <4046, 6> }

If a list of services and ports already exists, add these numbers to the end of the list.
Save and close the file.
Install the policy onto the ClusterXL object.
Note:
- New settings in table.def apply globally to all gateways.
- For more, see sk31832.

Advanced Data Types

The Data Type Wizard has four advanced Data Types:

Weighted Keywords
Words from a dictionary
Custom CPcode match
Message attributes

Protecting Data by Weighted Keyword

If you begin by creating a Data Type for a keyword or pattern and realize that the match is not a simple ALL or ANY check (one word is a sign of protected data in itself, while another word is suspicious only if it appears numerous times), you can define this complex data representation as a Weighted Keyword rather than a simple keyword or pattern.

Transmissions whose data contains words from this list, in the weight-sum that you define, are handled according to the action of the rules that use this Data Type.

To create a Data Type representation of weighted keywords:

In the Data Type Wizard, select Advanced and from the drop-down list, select Weighted Keywords.
Click Next.
Click the arrow of the Add button and select either Word or Phrase or Regular Expression.
(If you click the Add button instead of its sub-menu, the item will be a keyword, not a pattern.)

The Edit Word window opens, for both types of item.
Enter the keyword, phrase, or regular expression.
In the Weight area, set whether each occurrence of matching data content should be counted as 1 (default) or more, and if there is a ceiling to the weight.
- Each appearance of this word contributes the following weight - set to 1 for lowest weight, 2 for double-weight (one instance of this string will be counted as though two), and so on.
- The weight of this word is limited to - set to 0 for no limit, or set to a number higher than the weight in the previous value to set a maximum count (a ceiling) for this one word.
Click OK.
In the Specify Weighted Keywords step, set the Threshold. If data content matches any of the words in this Data Type, with a total weight surpassing this value, the data is matched to the Data Loss Prevention rule.
Click Next.
Click Finish; or if you want to add more parameters to the Data Type, select the checkbox and then click Finish.
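
The weight, ceiling, and threshold settings interact as in the following Python sketch. It handles plain words and phrases only (no regular expressions), assumes case-insensitive substring counting, and treats the Threshold as a value the total weight must exceed. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WeightedKeyword:
    word: str
    weight: int = 1  # weight contributed by each appearance
    limit: int = 0   # ceiling for this word; 0 means no limit


def total_weight(text, keywords):
    """Sum the capped weight contributions of all keywords found in the text."""
    lowered = text.lower()
    total = 0
    for keyword in keywords:
        contribution = lowered.count(keyword.word.lower()) * keyword.weight
        if keyword.limit > 0:
            contribution = min(contribution, keyword.limit)
        total += contribution
    return total


def weighted_match(text, keywords, threshold):
    """Return True when the total weight surpasses the threshold."""
    return total_weight(text, keywords) > threshold


if __name__ == "__main__":
    keywords = [
        WeightedKeyword("confidential", weight=5),
        WeightedKeyword("draft", weight=1, limit=3),
    ]
    print(weighted_match("draft draft draft draft draft", keywords, threshold=4))
    print(weighted_match("Confidential draft", keywords, threshold=4))
```

In this sample, five appearances of the low-weight word are capped at 3 and do not pass a threshold of 4, while a single appearance of the high-weight word does.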

Providing Keywords by Dictionary

If you pre-planned the keywords that should flag data as protected, you do not need to enter them one by one in a keyword data representation. Instead, you can upload the list as a dictionary. You decide how many of the items in the list have to be matched to have the data match the rule.

Note - Dictionary files should be one word or phrase per line. If the file contains non-English words, it is recommended that it be a Word document (*.doc). Dictionaries that are simple text files must be in UTF-8 format.

To create a Data Type representation of dictionary:

In the Data Type Wizard, select Advanced and from the drop-down list, select words from a Dictionary.
Click Next.
Browse to the file containing the list of terms.
In the Threshold area, set the number of terms in this list that must be in the content to have the data matched to the rule.
It is recommended that you first set this to the highest reasonable value, and then lower it after auditing the SmartView Tracker logs.

For example, if the dictionary is a list of employee names, you should not set the threshold to 1, which would catch every email that has a signature. You could set an Employee Name Dictionary Data Type to a threshold of half the number of users and its rule to Detect. If no data is caught by the rule after about a week, lower the threshold and check again. When the rule begins to detect this information being sent out, set it to Ask User, so that users have to explain why they are sending this information outside before it will be sent. With this information on hand, you can create a usable, reasonable and accurate enforcement of corporate policy.
Click Next.
Click Finish; or if you want to add more parameters to the Data Type, select the checkbox and then click Finish.
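
As an illustration of the dictionary threshold, the following Python sketch counts how many distinct dictionary terms appear in a message and compares the count to the threshold. The function names and the simple case-insensitive substring search are assumptions made for this example; they do not describe how the gateway matches terms.

```python
from typing import Iterable


def dictionary_hits(text: str, terms: Iterable[str]) -> int:
    """Count the distinct dictionary terms that appear in the text."""
    lowered = text.lower()
    # One term per line in the dictionary file; ignore empty lines.
    unique_terms = {term.strip().lower() for term in terms if term.strip()}
    return sum(1 for term in unique_terms if term in lowered)


def dictionary_matches(text: str, terms: Iterable[str], threshold: int) -> bool:
    """Match when at least `threshold` terms from the list are present."""
    return dictionary_hits(text, terms) >= threshold


employees = ["Alice Smith", "Bob Jones", "Carol White"]
signature_only = "See the attached report. Regards, Alice Smith"
name_list = "Contacts: Alice Smith, Bob Jones, Carol White"
```

With a threshold of 2, the email that only carries a signature is not matched, while the email that lists several employees is.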

Protecting Data by CPcode

CPcode is a scripting language, similar to C or Perl, specifically for Intrusion Prevention Systems. If you are familiar with this language, you can create your own complex rules. Use CPcode data types to create dynamic definitions of data to protect, or to create data type representations with custom parameters.

For example, you can create a CPcode that checks for a date that is before a public release, allowing you to create rules that stop price list releases before that date, but pass them afterwards. Other common uses of CPcode include relations between rule parameters, such as recipients (match rule to email if sent to too many domains) and protocols (match rule to HTTP if it looks like a web mail).

Note - See the R77 CPcode DLP Reference Guide.
If you write a CPcode function yourself, test it before you put it into production.

To create a Data Type representation of CPcode:

In the Data Type Wizard, select Advanced and from the drop-down list, select a Custom CPcode.
Click Next.
Browse to the CPcode script file.
Click Next.
Click Finish; or if you want to add more parameters to the Data Type, select the checkbox and then click Finish.

Example of CPcode function:

func rule_1 {
   foreach $recipient inside global:DESTS {
       foreach $comp inside COMPETITORS_DOMAIN {
          if( casesuffix( $recipient , $comp ) ) {
             set_message_to_user(cat("The mail is sent to " ,
                                     $recipient ,
                                     " which is a competitor's mail address."));
             set_track(TRACK_LOG);
             return quarantine();
          }
       }
   }
}

Defining the Message Attribute Data Type

In DLP, a message can be sent using the SMTP, HTTP, or FTP protocols.

Message attributes refer to 3 properties of the message:

The total message size in KB
Number of attachments
Total number of words in the message

To create the message attribute Data Type:

Start the Data Type Wizard
Select Advanced and from the drop-down list select Message Attributes.
The Specify Message Attributes window opens.

Configure these message attributes:

Size
The size attribute can have a:

Minimum value	Maximum value	Meaning
Yes	Yes	Messages that fall within the specified range match the message attribute.
Yes	No	A message whose size is greater than the minimum value specified here matches the attribute.
No	Yes	A message whose size is smaller than the maximum value specified here matches the attribute.

Attachments
Define the number of attachments a message can have.

Minimum value	Maximum value	Meaning
Yes	Yes	A message whose number of attachments falls within the specified range matches the message attribute.
Yes	No	A message with more than the minimum number of attachments specified here matches the attribute.
No	Yes	A message with fewer attachments than the maximum value specified here matches the attribute.

Number of words
Scan for a significant amount of text. If an email has a large binary file attached such as a graphic, and the email contains the words "your picture" the email might match the Size attribute but contain no text worth scanning. You will want the email to match a DLP rule only if the email contains enough text that could conceivably result in data loss.

Minimum value	Maximum value	Meaning
Yes	Yes	A message whose word count falls within the specified range matches the message attribute.
Yes	No	A message whose word count is greater than the minimum value specified here matches the attribute.
No	Yes	A message whose word count is lower than the maximum value specified here matches the attribute.

Click Next.

Click Finish.

If you want to add more parameters to the Data Type, select Configure additional Data Type properties after clicking Finish, and then click Finish.

Note - For a message to match the Data Type attribute, it must match the criteria for size and the number of attachments and the number of words. If the message fails to match one of the criteria, it will fail to match the attribute.
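
The AND relationship between the three attributes can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, None stands for a minimum or maximum that is not set, and treating the bounds as inclusive is an assumption.

```python
from typing import Optional, Tuple

Range = Tuple[Optional[int], Optional[int]]  # (minimum, maximum); None = not set


def in_range(value: int, bounds: Range) -> bool:
    """Check a value against an optional minimum and an optional maximum."""
    minimum, maximum = bounds
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True


def message_matches(size_kb: int, attachments: int, words: int,
                    size: Range, attachment_count: Range, word_count: Range) -> bool:
    """All three criteria must match for the message attribute to match."""
    return (in_range(size_kb, size)
            and in_range(attachments, attachment_count)
            and in_range(words, word_count))


# A large email with one picture and almost no text: size and attachments
# match, but the word count does not, so the attribute is not matched.
picture_mail = message_matches(500, 1, 2,
                               size=(100, None),
                               attachment_count=(1, 3),
                               word_count=(50, None))

# The same email with enough text matches all three criteria.
text_mail = message_matches(500, 1, 60,
                            size=(100, None),
                            attachment_count=(1, 3),
                            word_count=(50, None))
```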

Enhancing Accuracy through Statistical Analysis

A number of Data Types, such as credit card numbers, have an option called Enhance accuracy through statistical analysis on their General Properties page.

Credit cards such as Visa and Mastercard have sixteen-digit numbers arranged in four groups of four. While scanning for this Data Type, all sixteen-digit numbers in the data that pass the Luhn algorithm check are identified as credit card numbers. However, such a number is not necessarily a credit card number: it might be a spare part number, or an ordering or sales code.

The Enhance accuracy option applies statistical analysis to increase the accuracy of identifying specified Data Types, for example credit card numbers.
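
For reference, the Luhn check mentioned above can be written in a few lines of Python. This sketch shows only the standard Luhn checksum and why it can produce false positives; it does not show the statistical analysis that DLP applies, which is not described here. The function name luhn_valid is chosen for this example.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


test_card = luhn_valid("4111 1111 1111 1111")   # widely used test card number
altered = luhn_valid("4111 1111 1111 1112")     # last digit changed
all_zeros = luhn_valid("0000 0000 0000 0000")   # not a card, still passes
```

The last value shows the problem: a sixteen-digit string that is clearly not a credit card number can still pass the checksum, which is why an additional accuracy check is useful.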

To enhance accuracy through statistical analysis:

In Data Loss Prevention > Data Types select a Data Type that represents numerical data.
Open the Data Type for editing.
On the General Properties page, select Enhance accuracy through statistical analysis.
Click OK.

Note - Enabling statistical analysis does not impact gateway performance.

Adding Data Types to Rules

The data types are the building blocks of the Data Loss Prevention rule base, and the basis of the DLP policy that you install on DLP gateways - the basis of DLP functionality. Each data type defines a data asset that you want to protect.

Data Owners should be aware of the types of data that are under their responsibility and be able to tell you what type of data must be able to move outside of the organization and what data must be protected.

For example, a team leader of a programming team should know that lines of code should not be allowed to move outside the organization, and require that it be protected. A hospital administrator should have an example of a court order releasing patient records to authorized domains.

Focusing on Data

Focus on the Data Types, not on the full rules. Enable and customize Data Types to recognize data to match.
Start with the obvious - with the data that you know by experience should be kept inside the organization - lines of code, employee contact information, passwords, price lists, and so on.
Then create more complex Data Types according to the organization confidentiality and integrity procedures, after communicating with Data Owners.
After you have a Data Type, add it to a rule, and install the policy rule base on the DLP Gateways.

The Compliance Data Category

In the Data Loss Prevention Data Types window, data types are sorted according to category. An important category is the compliance category. The Data Types window lets you create data types that enforce compliance in accordance with regulatory standards.

The compliance category contains built-in data types that represent accepted standards and regulatory requirements. For example, according to Payment Card Industry (PCI) compliance standards, credit card numbers of customers must not be sent to outside sources in clear text.

The Data Loss Prevention Overview window > DLP Featured Data types toolbox lists the data types for:

Compliance
Clicking the Compliance button shows the data types in this category and how many are activated.
Business information
Personally identifiable information
Best Practice
Intellectual Property
Human Resources
Financial

In the Featured Data Types area of the toolbox, two actions are available:

Action	Use
View rule	Click View rule to see how the compliance data type is used in the DLP policy.
Add to policy	Click Add to policy to add the compliance data type to the DLP policy.

Clicking Compliance on the toolbar in the Data Types window filters out the data types that do not belong to the Compliance category. Check Point regularly adds to the number of built-in data types, but if none of the types is applicable to your needs, you can create a new data type and add it to the compliance category.

Built-in data types exist for:

EU Data Protection Directive
FERPA - Confidential Educational Records
GLBA - Personal Financial Information
HIPAA - Protected Health Information
ITAR - International Traffic in Arms Regulations
PCI DSS - Cardholder Data
PCI - Credit Card Numbers
PCI - Sensitive Authentication Data
U.S. State Laws - Personally Identifiable Information
UK Data Protection Act

To add a new data type to the compliance category:

In the Data Loss Prevention Data Types window, click New.
The Data Type Wizard opens.
Select criteria such as keywords or a corporate template.
On the last page of the wizard, select Configure additional Data Type properties after clicking Finish.
Click Finish.
The data type properties window opens on the General Properties page.
Set the category to Compliance.

Note - You cannot change the category of a built-in data type, only add new data types to one of the pre-existing categories.

Editing Data Types

After you define Data Types with the Data Type Wizard, you can fine-tune them if necessary.

Each Data Type in the General Properties window shows only its applicable fields. You only see the options that apply to the currently selected data type.

Section	Description
General Properties	Name - Name of the data type representation. Comment - Optional comments and notes. Categories - Optional assigned category tags, for grouping data types. Flag - Optional custom flag to help management of a large Data Types list. Follow Up - Use this flag as a reminder to check the tracking logs in SmartView Tracker and the analysis in SmartEvent to see if your changes are catching the expected incidents, and otherwise to follow up on maintenance and fine-tuning. Improve Accuracy - After enabling a built-in data type, use this flag as a reminder to replace placeholder data types with real dictionary files or lists, or to otherwise make built-in data types more relevant to your organization. After replacing the file with real data, remember to set this flag to Follow Up, to monitor its related incidents, or to No Flag. Description - For built-in data types, the description explains the purpose of this type of data representation. For custom-made data types, you can use this field to provide more details.
Custom CPcode	Add - Click to add CPcode scripts. The default file type is cpc. See the R77 CPcode DLP Reference Guide. View - Click to view a CPcode script in a text editor. Remove - Click to remove CPcode scripts.
Compound	Each one of these data types must be matched - All items in this list must be matched in the data, for the compound data type to match. None of these data types must be matched - If the data matches any item in this list, the compound data type does not match. Add items to a list. Edit selected item. (Changes made from here affect all compound data types and rules that use the edited data type). Remove items from a list.
Dictionary	Replace - Click to browse to a different file. View - Click to view the file. Note that any changes you make here do not affect the file that is used by the data type. Save a Copy - Click to save the file under another name. This data will be matched only if it contains at least - Set the threshold to an integer between 1 and the number of entries in the dictionary. Traffic that contains at least this many names from the dictionary will be matched. Note - If the items in the dictionary are in a language other than English, use a Word document as the dictionary file. Any text file must be in UTF-8 format.
Documents Based on a Corporate Template	Replace - Click to browse to a different file. View - Click to view the file. Note that any changes you make here do not affect the file that is used by the data type. Save a Copy - Click to save the file under another name. Match empty templates - Select this option if you want DLP to match the data type on an empty template. An empty template is a template that is identical to the uploaded corporate template. If the option is not selected, an empty template is detected but the data type is not matched. The template is not considered confidential until it contains inserted private data. Note that the rule is bypassed for this document, but the document may still be matched by another DLP rule in the policy. Consider templates images - Incorporates a template's graphic images into the matching process. Including template images increases the similarity score calculated between the template and the examined document. The higher the score, the more accurate the match. Select this option if the graphic images used in a template document suggest that the document is confidential. Similarity - Move the slider to determine how closely a document must match the given template or form to be recognized as matching the data type. This will match header and footer content, as well as boiler-plate text.
File	File - Select the conditions that should be checked on files in data transmissions (including zipped email attachments, as well as other transmissions). A transmitted file must match all selected conditions for the File data type to be matched. The file type is any of these types - Click Add, and select a file type from the list. The file name contains - Enter a string or regular expression to match against file names. The file size is larger than - Enter the threshold size in KB.
Group Members	Add - Add data types to the group. If any of the members are matched, the data is recognized as matching the group data type. In the list that opens, you can click New to create a new data type. Edit - Open the properties window of the selected data type. When you click OK or Cancel, the Data Type Group window is still open. Remove - Remove the selected data type from the group. The data type is not deleted.
Keywords or Phrases	Specify keywords or phrases to search for - Enter the words to match data content. Add - Click to add the keywords to the data type. Search List - Keywords in the data type. Edit - Modify the selected word or phrase in the list. Remove - Remove the selected word or phrase from the list. All keywords and phrases must appear - Select to match data only if all the items in the Search List are found. At least number words must appear - Enter an integer to indicate number of items in Search List to match the Keyword data type.
Pattern	Type a pattern (regular expression) - Enter the regular expression to match data content. Add - Click to add the regular expression to the data type. Pattern List - Regular expressions in the data type. Edit - Modify the selected regular expression in the list. Remove - Remove the selected regular expression from the list. Number of occurrences - Enter an integer to set how many matches between any of the patterns and the data are needed to recognize the data as matching the data type.
Similarity	Similarity - Move the slider to determine how closely a document must match the given template or form to be recognized as matching the data type. This will match header and footer content, as well as boiler-plate text.
Threshold (dictionary)	This data will be matched only if it contains at least - Enter an integer to set how many matches in the data are needed to recognize the data as matching the data type.
Threshold (occurrences)	Number of occurrences - Enter an integer to set how many matches in the data are needed to recognize the data as matching the data type.
Threshold (keywords)	This data will be matched only if it contains: All keywords and phrases - Select to match data only if all the items in the Search List are found. At least number keywords or phrases - Enter an integer to indicate number of items in Search List to match the Keyword data type.
Threshold (recipients)	This data will be matched only if the email contains: At least number internal recipients - Enter the minimum number of email addresses that are defined inside of My Organization that, along with external addresses, should cause the email to be regarded as suspicious of containing confidential information. and no more than number external recipients - If an email is sent to a large distribution list, even if it contains numerous internal recipients, it should be recognized as an email meant for people outside the organization. In this field, enter maximum number of email addresses external to My Organization, that if more external recipients are included, the email will match a rule.
Threshold (External BCC)	This data will be matched only if the email contains at least: Internal recipients - Enter the minimum number of email addresses that are defined inside of My Organization that, along with external addresses, should cause the email to be regarded as suspicious of containing confidential information. External recipients - Enter the minimum number of email addresses external to My Organization, that would cause such an email to be suspicious.
Weighted Keywords or Phrases	Keyword Text - List of current keywords or regular expressions in the list of weighted keywords. To add more, click New. To change the selected keyword or regular expression, click Edit. The Edit Word window opens. Weight - The number that represents the importance of this item in recognizing a transmission that should be matched. The higher the number, the more weight/importance the item has. Max. Weight - The number that represents the ceiling for this item. If content of a transmission matches the item (by keyword or by regular expression) to a total of this weight, no more counts of the item are added to the total weight of the transmission. (Zero means there is no maximum weight.) RegEx? - Whether the item is a regular expression. Threshold - When the weights of all items in the list are added together, if they pass this threshold, the transmission is matched.

To edit a Data Type:

On the SmartDashboard, open the Data Loss Prevention tab.
Open Data Types, select a Data Type and click Edit.
In the General Properties window, edit/fill-in the fields that apply to the Data Type.
Click Finish.

Defining Data Type Groups

You can create a Data Type representation that is a group of existing Data Types.

For example, you could create a group of Data Types that protect your organization from leaking personal contact information, to comply with privacy laws. The Data Type group would include various built-in Data Types for personal names of different countries, last names, personal email addresses, and so on. Using the Data Type group, you can create and maintain rules more efficiently.

Data Type groups are matched on OR. If data matches any of the Data Types in the group, the Data Type group is matched.
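
The difference between the OR check of a Data Type group and the AND/NOT check of a compound Data Type can be sketched in Python. The matcher functions below are hypothetical stand-ins for Data Types; they are assumptions made for this illustration and not part of the product.

```python
from typing import Callable, Iterable

Matcher = Callable[[str], bool]  # stand-in for a Data Type: data -> matched?


def group_matches(data: str, members: Iterable[Matcher]) -> bool:
    """Data Type group: matched if ANY member Data Type matches (OR)."""
    return any(member(data) for member in members)


def compound_matches(data: str,
                     must_match: Iterable[Matcher],
                     must_not_match: Iterable[Matcher]) -> bool:
    """Compound Data Type: ALL of the first list, NONE of the second."""
    return (all(m(data) for m in must_match)
            and not any(m(data) for m in must_not_match))


# Hypothetical stand-in Data Types for the example.
has_email = lambda data: "@" in data
has_phone = lambda data: "+1-555" in data
is_public = lambda data: "press release" in data.lower()

message = "Call +1-555-0100 for details"
in_group = group_matches(message, [has_email, has_phone])
in_compound = compound_matches(message, [has_email, has_phone], [is_public])
```

Here the message matches the group because one member matches, but it does not match the compound Data Type because not all of the required members match.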

To create a Data Type group:

In Data Types, click the arrow of New and select Data Type Group.
The Group Data Type window opens.
Enter a name for the group.
Click Add and select the Data Types that will be in this Data Type group.
If relevant, add Data Owners to the group.
Click OK.

Defining Advanced Matching for Keyword Data Types

You can add CPcode script files for more advanced match criteria to improve accuracy after a keyword, pattern, weighted keyword, or words from a dictionary are matched. If the CPcode script file has a corresponding value file (for constant values) or csv file, add it here.

Note - You can add more than one CPcode script. All of the scripts must match the keywords or phrases to be recognized as matching the data type.

To add an advanced matching Data Type CPcode script:

In Data Types, select a Data Type and click Edit.
The Data Type window opens.
Click the Advanced Matching node.
In Run these CPcode for each matched keyword to apply additional match criteria, add the CPcode scripts to run on each of the Data Type matches.
- Add - Click to add CPcode scripts. The default file type is cpc. See the R77 CPcode DLP Reference Guide.
- View - Click to view a CPcode script in a text editor.
- Remove - Click to remove CPcode scripts.
Click OK.

Defining Post Match CPcode for a Data Type

For all Data Type representations, you can add CPcode scripts that run after a data type is matched.

When you use CPcode scripts here as match criteria, you get a more advanced level of improved accuracy on matched data types. When you set more than one CPcode script, Data Types with specified CPcode scripts are matched on AND. If data matches all of the CPcode scripts, the Data Type is matched. If the CPcode script file has a corresponding value file (for a constant value) or csv file, add it here.

For example, you can add a CPcode script that matches Data Types that occur during work hours (09:00 - 17:00) on work days.
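
The logic of such a work-hours condition is shown below as a Python sketch, not as CPcode. The function name is hypothetical, and two details are assumptions: work days are taken to be Monday through Friday, and the end of the window (17:00) is treated as exclusive. For the actual CPcode syntax, see the R77 CPcode DLP Reference Guide.

```python
from datetime import datetime


def in_work_hours(when: datetime, start_hour: int = 9, end_hour: int = 17) -> bool:
    """Return True if `when` falls on a work day between start and end hour."""
    # Assumption: Monday (0) through Friday (4) are work days.
    is_work_day = when.weekday() < 5
    # Assumption: 09:00 is included, 17:00 is excluded.
    return is_work_day and start_hour <= when.hour < end_hour


monday_morning = in_work_hours(datetime(2024, 1, 8, 10, 30))   # Monday 10:30
saturday = in_work_hours(datetime(2024, 1, 6, 10, 30))         # Saturday 10:30
monday_evening = in_work_hours(datetime(2024, 1, 8, 17, 0))    # Monday 17:00
```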

To add a post match Data Type CPcode script:

In Data Types, select a Data Type and click Edit.
The Data Type window opens.
Click the Advanced Matching node.
In Run these CPcode scripts after this Data Type is matched to apply additional match criteria, add the CPcode scripts to run on each of the Data Type matches.
- Add - Click to add CPcode scripts. The default file type is cpc. See the R77 CPcode DLP Reference Guide.
- View - Click to view a CPcode script in a text editor.
- Remove - Click to remove CPcode scripts.
Click OK.

Recommendation - Testing Data Types

Before installing a policy that contains new Data Types, you can test them in a lab environment.

Recommendation for testing procedure:

Create a Data Type.
Create a user called Tester, with your email address.
Create a rule:
- Data = this Data Type
- Action = Detect
- Source = Tester
- Destination = Outside
Send an email (or other data transmission according to the protocols of the rule) that should be matched to the rule.
Open SmartView Tracker or SmartEvent and check that the incident was tracked with the Event Type value being the name of the Data Type.
- If the transmission was not caught, change the parameters of the Data Type. For example, if the Data Type is Document by Template, move the slider to a lower match-value.
- If the transmission was caught, change the parameters of the Data Type to be stricter, to ensure greater accuracy. For example, in a Document by Template Data Type, move the slider to a higher match-value.

After fine-tuning the parameters of the Data Type, re-send a data transmission that should be caught and check that it is.

Important - If you change the action of the rule to Ask User, to test the notifications, you must change the subject of the email if you send it a second time.

If Learning mode is active, DLP recognizes email threads. If a user answers an Ask User notification with Send, DLP will not ask again about any email in the same thread.

Send another transmission, as similar as possible, but that should be passed; check that it is passed.
For example, for a Document by Template Data Type, try to send a document that is somewhat similar to the template but contains no sensitive data.

If the acceptable transmission is not passed, adjust the Data Type parameters to increase accuracy.

Exporting Data Types

You can export to a file the Data Types that you have created or that are built-in. This allows you to share Data Types between DLP Gateways, when each is managed by a different Security Management Server.

You might want to export Data Types as a recovery measure: recover a Data Type that you or another DLP administrator deleted.

To export a Data Type:

Open Data Loss Prevention > Data Types.
Select the Data Type to export.
Click Actions > Export.
Save it as a file with the dlp_dt extension.

Importing Data Types

You can share Data Types with another Security Management Server or recover a Data Type that was deleted but previously exported. You can also obtain new Data Types from your value-added reseller or from Check Point and use this procedure to add the new Data Types to your local system.

Note - You can only export and then import Data Types on Security Management Servers that are the same version. For example, you can export and import Data Types on different R77 Security Management Servers. You cannot export Data Types from an R75 Security Management Server and then import them to an R77 Security Management Server.

To import Data Types:

Open Data Loss Prevention > Data Types.
Click Actions > Import.
Select the dlp_dt file holding the Data Type that you want.

Repositories

Repositories are network locations used for document storage. DLP has two kinds of repository:

Fingerprint
Whitelist

Fingerprint Repository

The fingerprint repository is used to store files from which the fingerprint Data Type is derived. A fingerprint repository is automatically created when you create the fingerprint Data Type. Files that exactly or partially match documents in the fingerprint repository are identified before they go outside of the organization.

Whitelist Repository

The Whitelist repository is a store of documents that are allowed to go outside of the organization. The Whitelist repository can be used to improve the accuracy of the DLP policy.

Note - For a file not to be included in the DLP match, it must exactly match a file in the whitelist repository.

Creating a Fingerprint Repository

On the Data Loss Prevention tab > Repositories click New > Fingerprint.
The Data Type wizard opens with Fingerprint selected as the Data Type.
Enter a name for the Data Type.
Click Next.
In the Fingerprint window:
1. Click the Gateways arrow button to select gateways with the DLP blade enabled.
  By default, the DLP Blades object shows. This object represents all gateways that have the DLP blade enabled. Only gateways selected here scan the repository and enforce the fingerprint data type.
2. Define a network path to the repository.
3. If the repository defined in the network path requires a username and password to access it, enter the relevant authentication credentials.
Click Test Connectivity.
This tests that the DLP gateways selected in the Gateways list can access the repository using the (optional) assigned authentication credentials.
Click the Match Similarity arrow.
This option matches similarity between the document in the repository and the document being examined by the DLP gateway. You can specify an exact match with a document in the repository, or a partial match based on:
- A percentage value or
- Number of matched text segments.
Click Next.
Select Configure additional Data Type Properties after clicking Finish if you want to configure more properties.
Click Finish.
The New data type wizard closes. The data type shows in the list of data types and also on the Repositories page.

To configure more fingerprint properties:

In the Data Types window or Repositories window, double-click the fingerprint object to open it for editing. These properties can be configured:

General
Change the data entered in the Data Type wizard.
Data Owners
Add users or user groups that own the data. Data owners can be notified when the fingerprint data type is matched by a rule in the DLP policy.
Advanced Matching
Add CPcode scripts to apply more match criteria after the fingerprint data type is matched by a rule.
Scan Scheduling
Configure when the document repository is scanned to update the fingerprint data type. The default time object (Every-Day) has no time restrictions configured. This means that a scan runs without time restrictions after the fingerprint data type is added to a policy rule. If gateway resources and network bandwidth are an issue, limit the scan to off-peak hours.

Repository Scan Filter

This page offers more scanning criteria:

Scan files matching the following data types
This property lets you scan documents in the repository according to more data types, for example credit card numbers. If you add credit card numbers as the data type, all the files in the repository that contain credit card numbers are fingerprinted. If "spreadsheet files" are selected as the data type, only spreadsheet files in the repository are fingerprinted.
Scan files according to size
Only files between the specified minimum and maximum size are included in the fingerprint.

Scan files according to modification date

Only files that match the specified modification dates are included in the fingerprint.

Data locations

Use the Data Locations tree to include or not include repository sub-folders. If you want the fingerprint data type to prevent only one document type from leaving the organization, put that document in a folder that contains no other document. Select only that folder as the data location.

Creating a Whitelist Repository

On the Data Loss Prevention tab > Repositories click New > Whitelist Repository.
The Whitelist Repository window opens.

Enter a name and informative comments for the repository type.
In the Repository section:
1. Click the Gateways arrow button to select gateways with the DLP blade enabled.
  By default, the DLP Blades object shows. This object represents all gateways that have the DLP blade enabled. Only gateways selected here scan the repository.
2. Define a network path to the repository.
3. If the repository defined in the network path requires a username and password to access it, enter the related authentication credentials (Domain/Username).
Click Test Connectivity.
This tests that the DLP gateways selected in the Gateways list can access the repository using the (optional) assigned authentication credentials.
Select the Match Similarity arrow.
Do not include a text segment in the fingerprint match if the segment is in both the fingerprint and whitelist repositories
A text segment from a file in the whitelist repository might match a text segment from a file in the fingerprint repository. Such segments can be safely ignored during the fingerprint Data Type match.
Click OK.
The Whitelist shows in the list of repositories.

To manually start a scan of the whitelist repository, click Start in the Scan now area on the summary pane.

Whitelist Policy

There are two ways to create a list of files that will never be matched by the DLP rulebase:

Manually add the files to the Whitelist Policy window in SmartDashboard.
Files in the list are uploaded to the Security Management Server and not matched against DLP rules. This option is recommended if you only have a small number of files.
Place the files in a Whitelist Repository on the network.
Files in this repository are not included in the match.

To add files to the Whitelist:

On the Data Loss Prevention tab > Whitelist Policy > click Add.
Browse to the file.
Click Open.
The file is uploaded to a folder on the Security Management Server.

Note - For a file not to be included in the DLP match, it must exactly match a file in the whitelist.

Defining Email Addresses

In DLP administration you may need to define email addresses or domains that are outside of your network security management.

For example:

Addresses to which data must be sent, or should never be sent.
Domains that are external but should be considered internal for DLP.
Domains that are internal but should be checked for unauthorized data transfer (not everyone in your organization should have access to the data of everyone else).

You can create Email Address objects. Each object holds a list of addresses or domains, or both, where the list can contain one or more items. After you create an Email Address object, you can add it to:

Rules as the Source or Destination.
Exceptions to rules.
For example, the administrator of a hospital makes an exception to a rule that prevents patient records from being sent outside the organization. The exception says to allow patient records to be sent to the email address of the social worker.

Note - All the addresses in the object are a unit. You cannot choose to use some email addresses of an object and not others.

Notes about Domains:

When adding domains, do not use the @ sign. A valid domain example is: example.com
If you add a domain, it catches all subdomains as well. For example, if the domain is example.com, email addresses such as jsmith@uk.example.com are also considered part of My Organization.
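
The domain notes above can be sketched as a small matching function. This is a minimal illustration of the described behavior, not the product's implementation. The function name and the case-insensitive comparison are assumptions.

```python
def address_in_domains(address: str, domains: list) -> bool:
    """True if the address belongs to a listed domain or to any of its subdomains.

    Domains are written without the @ sign, for example "example.com".
    """
    _, _, host = address.rpartition("@")
    host = host.lower()
    for domain in domains:
        d = domain.lower()
        # Exact domain, or a subdomain that ends with ".<domain>".
        if host == d or host.endswith("." + d):
            return True
    return False


print(address_in_domains("jsmith@uk.example.com", ["example.com"]))
print(address_in_domains("jsmith@notexample.com", ["example.com"]))
```

The leading dot in the subdomain test keeps unrelated domains that merely end with the same characters, such as notexample.com, from matching.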

To define email addresses and domains for use in rules:

Expand Additional Settings > Email Addresses.
Click New.
The Email Addresses window opens.
Enter a name for this group of email addresses or domains (even if it includes only one item).
Enter the address or domain.
Add as many email addresses and domains as needed for this list.

Watermarking

Watermarking lets you monitor outgoing Microsoft Office documents. Visible watermarks or hidden encrypted text are added to Word, Excel, or PowerPoint files created in Office 2007 (or higher). Visible watermarks work as a deterrent by making it clear that the document contains confidential data. Invisible watermarks make forensic tracking possible: users and computers that handled the document can be traced to source.

Watermarking works by introducing custom XML files that contain the watermarking data. Only documents in these Office Open XML formats can be watermarked:

docx
pptx
xlsx

Important - Older formats supported in Office 2007 and above for backward compatibility (such as doc, ppt, and xls) cannot be watermarked. Changing the file extension from doc to docx does not make the document eligible for watermarking.
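
To see why renaming a file does not help, it can be useful to check the file contents rather than the extension. The sketch below relies on the fact that Office Open XML files are ZIP packages that contain a [Content_Types].xml part, while legacy doc, ppt, and xls files are not ZIP archives. It is a heuristic for administrators and is not the check that the DLP gateway performs. The function names are hypothetical.

```python
import io
import zipfile


def looks_like_ooxml(data: bytes) -> bool:
    """Heuristic: an Office Open XML file is a ZIP package with [Content_Types].xml."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return False
    with zipfile.ZipFile(buffer) as package:
        return "[Content_Types].xml" in package.namelist()


def make_minimal_package() -> bytes:
    """Build a tiny in-memory ZIP that stands in for a docx file in this demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
    return buffer.getvalue()


# A legacy binary Office file starts with the OLE2 signature and is not a ZIP.
legacy_doc = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1" + b"\x00" * 64

print(looks_like_ooxml(make_minimal_package()))  # package structure present
print(looks_like_ooxml(legacy_doc))              # a renamed doc still fails the check
```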

To watermark documents:

In SmartDashboard, on the DLP tab:

In the Policy window, select a Data Type.
In the Action column, select a restrictive Action such as Ask User, Inform User, or Detect, plus an existing watermark profile.
DLP has 3 built-in profiles:
- Classified. Places the word Classified in the center of the page.
- Invisible only. Contains only hidden text.
- Restricted. Places the word Restricted at the bottom of the page, and these inserted fields: sender, recipient, and send date.
If there are no existing watermark profiles, click New and create one.

Note - You can also modify a built-in profile.

To create a new watermark profile:

New watermarks can be created from the Action column of a DLP rule, or from Additional Settings > Watermarks.

On the Watermarks page, click New.
The Watermark Profiles window opens.
In the General page, supply a name for the Watermark profile.
Click Advanced.
The Advanced Settings window opens.

Clear the Use the same configuration for all supported file types option to create different watermarks for Word, Excel, or PowerPoint files.

Note -

A watermark in Excel cannot exceed 255 characters. The 255-character limit includes the visible watermark text and formatting data. If you exceed the limit, the watermark feature makes a best effort to show as much text as possible.
The 255-character limit is per document.

Set if watermarks will be added to:
- All pages
- First page only
- Even pages only
- Odd pages only
The actual placement of watermarks depends on:
- If the document contains Section Breaks on the page.
- The version of MS Word used to create the document.

Watermark option	Section Break	In Word 2007	In Word 2010
All pages	Yes	All pages get watermark	All pages get watermark
All pages	No	All pages get watermark	All pages get watermark
First page only	Yes	All pages get watermark	First page only gets watermark
First page only	No	All pages get watermark	First page only gets watermark
Even pages only	Yes	All pages get watermark	All pages get watermark
Even pages only	No	Only even pages get watermark	Only even pages get watermark
Odd pages only	Yes	All pages get watermark	All pages get watermark
Odd pages only	No	Only odd pages get watermark	Only odd pages get watermark

Click OK.
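
The placement table above can be read as a lookup keyed by the watermark option, the presence of a Section Break, and the Word version. The sketch below encodes the table rows as data. The dictionary and function names are hypothetical.

```python
# (watermark option, section break present) -> result per Word version, from the table.
PLACEMENT = {
    ("All pages", True): {"2007": "All pages", "2010": "All pages"},
    ("All pages", False): {"2007": "All pages", "2010": "All pages"},
    ("First page only", True): {"2007": "All pages", "2010": "First page only"},
    ("First page only", False): {"2007": "All pages", "2010": "First page only"},
    ("Even pages only", True): {"2007": "All pages", "2010": "All pages"},
    ("Even pages only", False): {"2007": "Only even pages", "2010": "Only even pages"},
    ("Odd pages only", True): {"2007": "All pages", "2010": "All pages"},
    ("Odd pages only", False): {"2007": "Only odd pages", "2010": "Only odd pages"},
}


def effective_placement(option: str, has_section_break: bool, word_version: str) -> str:
    """Return which pages actually get the watermark for the given combination."""
    return PLACEMENT[(option, has_section_break)][word_version]


print(effective_placement("First page only", False, "2007"))
print(effective_placement("Even pages only", True, "2010"))
```

The lookup makes the two exceptions easy to spot: Word 2007 ignores First page only, and a Section Break cancels the Even pages only and Odd pages only options in both versions.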

On the General Page

Supply a name for the watermark profile.
Click inside the Watermark graphic.
The Select text location on page window opens. There are seven possible locations for visible watermark text.

Using the text-editing toolbar:

Create suitable text for each watermark.
Format it using the tools for font, font size, and color.
To put a shadow behind Watermark text in Word and PowerPoint:

(i) On the gateway, run: cpstop.

(ii) On the gateway, open for editing: $DLPDIR/config/dlp.conf.

(iii) Search for the attribute: watermark_add_shadow_text(0).

(iv) Change the value of the attribute from 0 to 1.

(v) Set percentages for watermark transparency and size, for docx and pptx files.

(vi) Save and close.

(vii) Run: cpstart.

Note: Before the changes to dlp.conf take effect, you must run cpstop and cpstart.
Use the Insert Field to insert one or more of these predefined fields:
- Action Taken
- File name
- File Size (in bytes)
- Mail Subject
- Recipient (email address)
- Recipient (full name)
- Reference ID number
(The Incident UID in SmartView Tracker, which contains the IP address of the computer which sent the file)
- Rule Name
- Rule Severity
- Send Date
- Sender (email address)
- Sender (full name)
- Sender (user name)

Optionally set the watermark at:
- A forty-five degree diagonal
  Note - Watermark rotation is only available for PowerPoint presentations in MS Office 2007 and 2010, and for Word documents in MS Office 2010.
- Seventy-percent transparency (default).

Note -

Transparency is supported for PowerPoint and Word files in MS Office 2007 and 2010.
To alter the default transparency value:
- On the gateway, run: cpstop.
- Edit $DLPDIR/config/dlp.conf on the gateway.
- Change the watermark_text_opacity_percentage property from 30 (70% transparency) to the new value.
- Run: cpstart.
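
The opacity property is the complement of the transparency percentage: 70% transparency corresponds to an opacity value of 30. The sketch below shows that conversion and a text substitution for a dlp.conf attribute. It assumes that attributes appear in the file as name(value), as in watermark_add_shadow_text(0). The exact layout of the opacity property in dlp.conf is not shown in this guide, so treat the sample text and the helper names as hypothetical. Run cpstop before you edit the file and cpstart afterward.

```python
import re


def opacity_from_transparency(transparency_percent: int) -> int:
    """Convert a transparency percentage to the opacity percentage (100 - value)."""
    if not 0 <= transparency_percent <= 100:
        raise ValueError("transparency must be between 0 and 100")
    return 100 - transparency_percent


def set_attribute(conf_text: str, name: str, value: int) -> str:
    """Replace name(<digits>) with name(value). The name(value) layout is assumed."""
    pattern = re.compile(r"(?<!\w)(" + re.escape(name) + r"\s*\()\d+(\))")
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{value}{m.group(2)}", conf_text
    )
    if count == 0:
        raise KeyError(name)  # attribute not found, nothing was changed
    return new_text


# Hypothetical excerpt of dlp.conf, used only for this demonstration.
sample = "watermark_add_shadow_text(0)\nwatermark_text_opacity_percentage(30)\n"

updated = set_attribute(
    sample, "watermark_text_opacity_percentage", opacity_from_transparency(50)
)
print(updated)
```

Working on a copy of the file and comparing it with the original before you replace it on the gateway reduces the risk of a malformed configuration.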

On the Hidden Text page:

Select Add the following hidden text to the document.
Click Add, and select which fields should be inserted as encrypted hidden text into the document.
For the purpose of forensic tracking, hidden text can be viewed using the DLP watermark viewing tool.

Click OK.

If Microsoft Office 2007 (or higher) is installed on the same computer as SmartDashboard, a preview of the watermark shows on a sample file in the preview pane.

Note - The preview pane is not available if you create or edit a watermark from the DLP policy rule base. To see a preview, create a watermark from Additional settings > Advanced > Watermarks > New.

In Additional Settings > Advanced > Watermarks section:
1. Make sure Apply watermarks on Data Loss Prevention rules is selected.
2. Set how existing watermarks are handled on documents that pass repeatedly through DLP gateways. Existing watermarks can be kept, or replaced.
  
  Note - Hidden encrypted text is not removed, only added to by each DLP gateway. Hidden text can later be used for forensic tracking.

Install the policy.

Important - If the Data Type scanned for by the DLP gateway occurs in the body of the email and not in the document, the document is not watermarked. For example, if you scan for credit card numbers and a credit card number shows in the body of an email with a document attached, the document is not watermarked. The Data Type must occur in the document itself.

Previewing Watermarks

In SmartDashboard > Data Loss Prevention tab > Additional Settings > Watermarks, watermarks are previewed in the right-hand pane on sample documents.

Preview works by downloading sample Office files from the Security Management Server and applying the watermark to them. The sample preview files are named:

example.docx
example.pptx
example.xlsx

To open a document or preview it, you must install Microsoft Office 2007 (or higher) on the computer that has SmartDashboard installed.

Watermarks can also be previewed on User-Added Files.

To view watermarks on user-added files:

Open the drop-down box in the preview pane.
The Select File window opens.
Click Add and browse to your Word, Excel, or PowerPoint file.
The Select File window is now divided into User Added Files and Sample Files.

Select your user added file to see it previewed with the watermark.

Note - When you preview a user-added file, the file is uploaded to the Security Management Server. The file will stay on the server until you remove it by selecting the file in the Select File window and clicking the red X in the top right-hand corner.

Viewing Watermarks in MS Office Documents

For Office documents that have been watermarked by a DLP gateway, view the watermarks in this way:

Office document	Go to:
Word	View > Print Layout or Full Screen Reading
Excel	View > Page layout > Print Layout
PowerPoint	PowerPoint has a number of built-in layers. The DLP watermark sits above the slide layout layer but below the slide content layer. This means that the watermark always shows below the content of a slide.

Resolving Watermark Conflicts

When scanned by the DLP gateway, an email with a document attached might match one or more DLP rules. If the rules have different and conflicting watermark profiles, then the conflict must be resolved for visible watermarks and resolved for hidden text.

Resolving Hidden Text Conflicts

If different watermark profiles specify invisible text, the text is taken from the profile attached to the DLP rule that has the highest precedence. Rule precedence is derived from the ACTION and SEVERITY priorities in the DLP Rule Base.

Action	Priority
Ask User	1
Inform User	2
Detect	3

Hidden text is taken from the watermark profile belonging to the rule that has the highest ACTION priority. If the rules have the same ACTION priority (for example, both are set to Ask User), then SEVERITY is considered:

Severity	Priority
Critical	1
High	2
Medium	3
Low	4

For example, if an email with a document attached matches these two rules:

Data	Action	Severity	Watermark Profile
Rule 1	Ask User	Low	W1
Rule 2	Detect	Critical	W2

The ACTION setting for Rule 1 has a greater priority than the ACTION setting defined for Rule 2. Rule 1 takes precedence. The hidden text configured for the W1 profile applies even though Rule 2 has a greater SEVERITY. If the rules are changed to:

Data	Action	Severity	Watermark Profile
Rule 1	Inform User	Low	W1
Rule 2	Inform User	Medium	W2

The rules have the same ACTION priority, so SEVERITY is considered. In this case Medium has a higher priority than Low. Hidden text from the W2 profile is added to the document. Rule 2 has precedence.

If the rules have the same priority for ACTION and SEVERITY, for example:

Data	Action	Severity	Watermark Profile
Rule 1	Inform User	Low	W1
Rule 2	Inform User	Low	W2

Rule precedence is decided according to an internal calculation based on the name of the rule in the data column.
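
The precedence logic in this section can be sketched as a comparison on the pair (ACTION priority, SEVERITY priority), where a lower number wins. The tie-breaking calculation based on the rule name is internal and not documented here, so the sketch reports a tie instead of guessing. The class and function names are hypothetical.

```python
from typing import List, NamedTuple, Optional

ACTION_PRIORITY = {"Ask User": 1, "Inform User": 2, "Detect": 3}
SEVERITY_PRIORITY = {"Critical": 1, "High": 2, "Medium": 3, "Low": 4}


class Rule(NamedTuple):
    name: str
    action: str
    severity: str
    profile: str


def precedence_key(rule: Rule) -> tuple:
    """Lower tuples mean higher precedence: ACTION first, then SEVERITY."""
    return (ACTION_PRIORITY[rule.action], SEVERITY_PRIORITY[rule.severity])


def hidden_text_profile(rules: List[Rule]) -> Optional[str]:
    """Return the winning watermark profile, or None if ACTION and SEVERITY tie."""
    best = min(precedence_key(rule) for rule in rules)
    winners = [rule for rule in rules if precedence_key(rule) == best]
    # A tie is resolved by an internal calculation that is not modeled here.
    return winners[0].profile if len(winners) == 1 else None


first = [Rule("Rule 1", "Ask User", "Low", "W1"), Rule("Rule 2", "Detect", "Critical", "W2")]
second = [Rule("Rule 1", "Inform User", "Low", "W1"), Rule("Rule 2", "Inform User", "Medium", "W2")]
third = [Rule("Rule 1", "Inform User", "Low", "W1"), Rule("Rule 2", "Inform User", "Low", "W2")]

print(hidden_text_profile(first))
print(hidden_text_profile(second))
print(hidden_text_profile(third))
```

The three rule sets are the three examples from this section: W1 wins on ACTION, W2 wins on SEVERITY, and the last pair is a tie.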

Resolving Visible Watermark Conflicts

An outgoing document may match one or more rules in the DLP policy. If the rules specify different watermark profiles, a conflict arises. For example, if different profiles specify dissimilar text in the center, the conflict must be resolved by merging the different watermark profiles according to rule precedence. Rule precedence is decided based on ACTION and SEVERITY priorities.

After rule precedence is decided, a merged watermark profile is built according to these criteria:

All the Visible watermarks from the rule with the highest precedence are added to the document.
Visible watermarks from the rule with the second highest precedence are added to the document only if they do not conflict with watermarks from the first.
Visible watermarks from the rule with the third highest precedence are added to the document only if they do not conflict with watermarks added by the previous two rules.
The procedure repeats until all watermarks are added to the merged profile. For example, if you have three DLP rules, each with a custom Watermark Profile, and an email matches all three of these rules:

DLP Data Rule	Precedence	Watermark Profile Name	In graphic
Rule_A	1	W1	1
Rule_B	2	W2	2
Rule_C	3	W3	3

Rule_A has greater precedence than Rule_B and Rule_C.
Rule_B has greater precedence than Rule_C.

The merged profile (4) is built by taking elements from all the profiles.

All the watermarks from W1 are added to the merged profile (4).
Only the center watermark from W2 is added to the merged profile.
(The watermark in the top right corner will not overwrite the watermark placed there by W1, which has higher precedence.)
Only the bottom right corner watermark from W3 is added to the merged profile.
(The watermark for the top center location is already taken by W1, which has greater precedence.)

Naming the Merged Profile

If the merged profile takes elements from existing profiles (hidden text or visible watermarks), then the names of those profiles are integrated into the name of the merged profile. In the above example, the name of the merged profile will be W1;W2;W3, with a semicolon separating the individual profile names. This is the name that shows in the DLP Watermark Profile column in SmartView Tracker.
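
The merge procedure and the naming rule can be sketched together. Each profile is modeled as a mapping from a page location to its watermark text, and profiles are processed from highest to lowest precedence. The location names and texts below are hypothetical values chosen to mirror the W1, W2, and W3 example. Hidden text is not modeled.

```python
def merge_profiles(profiles: list) -> tuple:
    """Merge visible watermarks by precedence.

    profiles: list of (profile_name, {location: text}), highest precedence first.
    Returns (merged_name, {location: text}).
    """
    merged = {}
    contributors = []
    for name, watermarks in profiles:
        added = False
        for location, text in watermarks.items():
            # A location already filled by a higher-precedence profile is kept.
            if location not in merged:
                merged[location] = text
                added = True
        if added:
            contributors.append(name)
    # Names of contributing profiles are joined with semicolons.
    return ";".join(contributors), merged


profiles = [
    ("W1", {"top right": "Restricted (W1)", "top center": "Internal (W1)"}),
    ("W2", {"center": "Classified (W2)", "top right": "Draft (W2)"}),
    ("W3", {"bottom right": "Finance (W3)", "top center": "Copy (W3)"}),
]

name, merged = merge_profiles(profiles)
print(name)
for location, text in merged.items():
    print(location, "->", text)
```

Only the center watermark of W2 and the bottom right watermark of W3 survive the merge, because W1 already occupies the top right and top center locations.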

Turning Watermarking On and Off

Watermarking can be turned off in a number of ways:

In GuiDBedit:
- Search for the enable_watermarking_feature property
- Set the value of the property to FALSE.
In DLP > Additional Settings > Advanced > Watermarks section, clear Apply watermarks on DLP rules.
In the DLP rule base, the warning Watermarks are not applied on the DLP policy shows at the bottom of the policy table.

Clicking Apply opens the Advanced Settings Window where you can once more add watermarks in the DLP rules.

Using the DLP Watermark Viewing Tool

For forensic tracking, hidden text can be decrypted and read using the DLP watermark viewing tool.

To view hidden text on a watermarked document:

Copy the document, or a folder of documents, to the DLP gateway.
On the gateway, run: dlp_watermark_viewer
Enter the name of one file or the path to a directory that contains a number of files.
The output shows the hidden fields included in the profile.

Note - Only the hidden text is shown by the tool, not the document's content.

Keys used for decrypting hidden text are stored on the Security Management Server and downloaded to the Security Gateway. DLP gateways managed by the same Security Management Server share the same keys and a common (random) ID. The random ID identifies the Security Management Server that installed the DLP policy on the gateway. The viewing tool will only show text added by gateways managed by the same Security Management Server. For example, for a document that has passed through three DLP gateways, each managed by a different Security Management Server, you must copy the file to each gateway and run the tool on each. The tool will only show the hidden text added by that gateway, and not the text added by gateways managed by other Security Management Servers.

Important - If you reinstall a Security Gateway, the keys and random ID are downloaded again from the server. The new gateway can be used to decrypt hidden text added by the old one. But if you reinstall the Security Management Server the random ID is lost. The random ID added to the document by the gateway will not match the ID of the new Security Management Server. The DLP viewer will not show the document's hidden text.

Fine Tuning Source and Destination

In the rule base, you can change the default Source (My Organization) and the default Destination (Outside My Org) to any network object, user, or group that is defined in SmartDashboard, and you can fine tune user definitions specifically for DLP.

Note - SMTP only matches users, groups, and email addresses. HTTP and FTP only match Network objects. If needed, you can add a network and a user group to a rule.

From version R75.20 and higher, you can also use these objects as the Destination of the rule:

My Organization - When the system is configured to work with the Exchange Security Agent, use this object to define the entire internal organization including emails from users in the Source object.
Any - When the system is configured to work with the Exchange Security Agent, use this object to define any destination. This includes:
- All users in the internal organization.
- Any destination outside of the organization.
Domain - Defines a domain used in HTTP and FTP posts. For example, to examine Facebook posts that contain company confidential source code, create a rule with:
- Source = My Organization
- Destination = .facebook.com (domain object)
- Data Type = Source Code (built-in Data Type)

Note - These objects are not enforced in rules installed on gateway versions before R75.20. In such cases, policy installation might fail with warnings and errors. To avoid such errors, make sure to specify gateway versions that are R75.20 and higher in the Install On column.

To create a domain object:

Open the Firewall tab > Network Objects tree > New > Domain.
Enter the URL of the domain and click OK.

Creating Different Rules for Different Departments

You can set the Source of a rule to be any defined user, group, host, network, or VPN. You can then set the Destination to be Outside. The rule will inspect data transmissions from the source to any destination outside of the source. This will create DLP rules specific to one group of users.

Note the difference between Outside Source (external to a source that is a subset of My Organization) and Outside of My Org (external to My Organization).

To enable use of Outside Source, the DLP gateway must be functioning in front of the servers that handle the data transmission protocols. For example, to use Outside on SMTP transmissions, the DLP gateway must inspect the emails before the Mail Server does.

Alternatively, the Destination of the rule could be another user, group, host, etc. This would create DLP rules to inspect and control the data transmissions between two groups of users.
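
The difference between Outside Source and Outside My Org can be sketched with simple set membership. This is a conceptual model only. The group contents, addresses, and function name are hypothetical.

```python
def destination_matches(setting: str, source_group: set, my_org: set, recipient: str) -> bool:
    """Decide whether a recipient falls under the Destination setting of a rule."""
    if setting == "Outside Source":
        # Anyone who is not in the source group, inside or outside the organization.
        return recipient not in source_group
    if setting == "Outside My Org":
        # Only recipients that are external to the whole organization.
        return recipient not in my_org
    raise ValueError(f"unknown destination setting: {setting}")


finance = {"alice@example.com"}
my_org = {"alice@example.com", "bob@example.com"}

# Bob is an employee, but not in Finance.
print(destination_matches("Outside Source", finance, my_org, "bob@example.com"))
print(destination_matches("Outside My Org", finance, my_org, "bob@example.com"))
```

A rule with Outside Source therefore also inspects transmissions that stay inside the organization, which is why the DLP gateway must see the traffic before the internal servers handle it.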

Examples:

DLP rule to prevent the Finance Department from leaking salary information to employees.
- Source = Finance (define a group that includes the users, groups, or networks that make up the Finance Department)
- Destination = Outside Source (any destination outside of Finance, internal or external to My Organization)
- Data Type = Salary Reports (define a Data Type Group that matches spreadsheets OR regular expressions for salaries in dollars - ([0-9]*),[0-9][0-9][0-9].[0-9][0-9] and employee names)
Data	Source	Destination	Action
Salary Reports	Finance	Outside Source	Prevent
DLP rule to prevent permanent employees from sending customer lists to temporary employees.
- Source = My Organization
- Destination = Temps (define a group of temporary employee user accounts)
- Data Type = Customer Names (built-in Data Type customized with your dictionary of customer names)
Data	Source	Destination	Action
Customer Names	My Organization	Temps	Prevent
Different DLP rules for different departments.
The Legal Department sends confidential legal documents to your legal firm. They need to be able to send to that firm, but never to leak to anyone else, either inside the organization or outside.

HR needs to send legal contracts to all employees, but not to leak to anyone outside the organization.

All other departments should have no reason to send legal documents based on your corporate template to anyone, with the exception of sending back the contracts to HR.

The first rule would be:
- Source = Legal (a group that you define to include your Legal Department)
- Destination = Outside Source (to prevent these documents from being leaked to other departments as well as outside the organization)
- Data = built-in Legal Documents
- Exception = allow the data to be sent to your lawyer's email address
- Action = Ask User
The second rule would be:
- Source = HR
- Destination = Outside My Org
- Data = built-in Legal Documents
- Action = Ask User
The third rule would be:
- Source = selection of all groups excluding Legal and HR
- Destination = Outside Source (to prevent users from sharing confidential contracts)
- Data = built-in Legal Documents
- Exception = allow the data to be sent to HR
- Action = Ask User

Note - In this rule, you would have to exclude the two groups if you want to ensure that the previous rules are applied. If you chose My Organization as the source of the third rule, it would apply to the users in Legal and HR and thus negate the other rules.
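
Before you rely on a regular expression such as the salary pattern in the first example, it helps to test it against sample text. The sketch below uses Python's re module, which may differ in details from the regular expression engine used by DLP. It shows that the unescaped period in the pattern as written matches any character, and it compares the pattern with a variant in which the period is escaped. The sample strings are hypothetical.

```python
import re

# The salary pattern as written in the first example.
AS_WRITTEN = re.compile(r"([0-9]*),[0-9][0-9][0-9].[0-9][0-9]")

# Variant with the period escaped, so that only a literal "." is accepted.
ESCAPED = re.compile(r"([0-9]*),[0-9][0-9][0-9]\.[0-9][0-9]")

samples = ["Base pay: 85,000.00", "Part ref 85,000x00", "No amounts here"]

for text in samples:
    print(
        text,
        "| as written:", bool(AS_WRITTEN.search(text)),
        "| escaped:", bool(ESCAPED.search(text)),
    )
```

The second sample is a false positive for the pattern as written, but not for the escaped variant. Testing candidate patterns in this way before you add them to a Data Type can reduce the number of incidents that need manual review.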

Isolating the DMZ

To ensure that data transmissions to the DMZ are checked by Data Loss Prevention, define the DMZ as being outside of My Organization.

For example, the PCI DSS Requirement 1.4.1 requires that a DMZ be included in the environment to prevent direct Internet traffic to and from secured internal data access points.

To ensure traffic from My Organization to the DMZ is checked for Data Loss Prevention:

Make sure that the DLP gateway configuration includes a definition of the DMZ hosts and networks.
In SmartDashboard, open the Data Loss Prevention tab.
Click My Organization.
In the Networks area, make sure that:
- Anything behind the internal interfaces of my DLP gateways is selected.
- Anything behind interfaces which are marked as leading to the DMZ is not selected
Click OK.

Defining Strictest Security

You may choose to define the strictest environment possible. Using these settings ensures that data transmissions are always checked for Data Loss Prevention, even if the transmission is from and within your secured environment. For example:

If your organization includes a large number of temporary users and small number of permanent users and machines
If system administration has been known to take time to remove terminated aliases
If your domain is being changed

Important - You must ensure that legitimate transmissions are not blocked and that Data Owners are not overwhelmed with numerous email notifications. If you do use the settings explained here, set the actions of rules to Detect until you are sure that you have included all legitimate destinations in this strict definition of what is the internal My Organization.

To define a strict My Organization:

In SmartDashboard, open the Data Loss Prevention tab.
Click My Organization.
In the Email Addresses area, remove any defined items.
In the VPN area, select All VPN traffic and then click Exclusions.
In the VPN Communities window that opens, add the communities whose communications should not be checked by DLP.
In the Networks area select These networks and hosts only and then click Edit.
In the Networks and Hosts window, select the defined Check Point network objects that you want to include in My Organization.
In the Users area, select These users, user groups and LDAP groups only and then click Edit.
In the User Groups and Users window, select the defined users, user groups, and LDAP groups that you want to include in My Organization.

Data transmissions among the internal objects and users will be passed unchecked if the Source of the rule is My Organization. Everything else will go through Data Loss Prevention.

Defining Protocols of DLP Rules

Each rule in the Data Loss Prevention policy has a definition for the protocols of the data transmission. The default setting for Protocols is Any: DLP will scan transmissions over all enabled protocols.

You can control which protocols are supported by DLP in general, or by each gateway, or for each rule.

To define supported protocols for DLP:

Open Additional Settings > Protocols.
Select the protocols that you want DLP to be able to support, in general.
For example, if performance becomes an issue, you could clear the HTTP checkbox here, without making any other change in the policy. HTTP posts and web mail would go through without Data Loss Prevention inspection.

To define supported protocols for individual DLP Gateways:

Open Additional Settings > Protocols.
In the Protocol Settings on DLP Blades area, select a DLP gateway.
Click Edit.
The properties window of the gateway opens.
Open the Data Loss Prevention page of the gateway properties.
Select Apply the DLP policy to these protocols only and select the protocols that you want this DLP gateway to support.

To define supported protocols for a rule:

In the Policy view, click the Protocol column plus button.
If this column is not visible, right-click a column header. In the list of possible columns that appears, select Protocols.
Select the protocols for this rule.
Traffic that matches the other parameters of the rule, but is sent over another protocol, is not inspected.

Fine Tuning for Protocol

When you choose a specific source or destination for a DLP rule, you can optimize the rule for the selected protocol.

By default, rules use all supported protocols, or the default protocols selected for the gateway (in the Check Point gateway window).

If you specify that a rule should use only mail sending protocols, such as SMTP, the source and destination can be users (including user groups and LDAP Account Units) or email addresses (including specific email or domains).

If you specify that a rule should use only HTTP or FTP or both, the rule will ignore any source or destination that is not recognized by IP address.

If the rule uses all supported protocols, HTTP and FTP will recognize only sources and destinations that can be defined by IP address. SMTP will recognize and enforce the rule for sources and destinations based on users and emails.
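
The behavior described in this section can be sketched as a filter on the kinds of objects that each protocol recognizes. The kind labels below are hypothetical and are not SmartDashboard identifiers. The sketch only models the distinction between identity-based objects (SMTP) and IP-based objects (HTTP and FTP).

```python
# Hypothetical labels for the kinds of objects used as Source or Destination.
IDENTITY_KINDS = {"user", "user_group", "ldap_account_unit", "email_address", "email_domain"}
IP_KINDS = {"network", "host"}


def enforced_objects(protocol: str, objects: list) -> list:
    """Return the names of the objects that a rule can enforce for the protocol.

    objects: list of (name, kind) tuples.
    """
    protocol = protocol.upper()
    if protocol == "SMTP":
        allowed = IDENTITY_KINDS
    elif protocol in ("HTTP", "FTP"):
        allowed = IP_KINDS
    else:
        raise ValueError(f"protocol not covered by this sketch: {protocol}")
    return [name for name, kind in objects if kind in allowed]


rule_sources = [
    ("Finance_Users", "user_group"),
    ("Finance_Net", "network"),
    ("example.com", "email_domain"),
]

print(enforced_objects("SMTP", rule_sources))
print(enforced_objects("HTTP", rule_sources))
```

For a rule that must cover both email and web traffic from the same department, this suggests adding both a user group and a network object to the Source.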

Configuring More HTTP Ports

To scan transmissions on HTTP running on any port other than the standard HTTP ports (80, 8080), you must define the non-standard ports to be included in the HTTP protocol.

To add ports to HTTP:

In SmartDashboard, select Manage > Services.
The Services window opens.
Click New > TCP.
The TCP Service Properties window opens.
Provide a name for the web service.
Provide the port or port range.
Click Advanced.
The Advanced TCP Service Properties window opens.
Leave Source Port blank.
In the Protocol Type list, select HTTP.
Click OK.