Detector Engine
Query Structure
Detectors are composed of rules, or queries, that are compiled into an efficient detector and run by the Code Security engine against files.
Each query is a group of patterns, called a pattern_group, and is hierarchical (a pattern group can contain more pattern groups, and so on).
A pattern group is a collection of patterns with an aggregate relation.
pattern_group:
  aggregate: or | and | append
  patterns:
    - pattern: "(:?key|token|secret|password|pwd|passwd)=(.*)" # assignment
      pattern_type: single
    - pattern: "hello" # assignment
      pattern_type: multi

pattern_group
The aggregate relation determines how the patterns in a group are combined:
- and: Bail out on the first mismatch.
- or: Try any of the matches.
- append: Try any of the matches and collect matches from those that matched.
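The three aggregate relations can be illustrated with a minimal Python sketch. It is not the engine's implementation: `run_group` is a hypothetical helper, it handles a flat list of patterns only (no nested groups), and it assumes "or" stops at the first pattern that matches.

```python
import re
from typing import List


def run_group(aggregate: str, patterns: List[str], text: str) -> List[str]:
    """Return the collected matches; an empty list means the group did not match."""
    collected: List[str] = []
    for pattern in patterns:
        match = re.search(pattern, text)
        if aggregate == "and":
            if match is None:
                return []  # bail out on the first mismatch
            collected.append(match.group(0))
        elif aggregate == "or":
            if match is not None:
                return [match.group(0)]  # assumption: the first match wins
        elif aggregate == "append":
            if match is not None:
                collected.append(match.group(0))  # keep every pattern that matched
        else:
            raise ValueError(f"unknown aggregate: {aggregate}")
    return collected


print(run_group("append", ["key=", "nope", "hello"], "key=1 hello"))
```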

pattern
Each pattern is of a type (pattern_type):
- single: Match once per file.
- multi: Match many times per file.
Both accept a performant, binary and text traversing regex.
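The difference between the two pattern types can be sketched in Python. This is an illustration only: `apply_pattern` is a hypothetical helper, and it assumes that "match once per file" means taking the first match.

```python
import re
from typing import List


def apply_pattern(pattern: str, pattern_type: str, text: str) -> List[str]:
    """Apply a pattern to file content according to its pattern_type."""
    regex = re.compile(pattern)
    if pattern_type == "single":
        first = regex.search(text)  # assumption: only the first match is reported
        return [first.group(0)] if first else []
    if pattern_type == "multi":
        return [m.group(0) for m in regex.finditer(text)]  # every match in the file
    raise ValueError(f"unknown pattern_type: {pattern_type}")


content = "pwd=a\npwd=b\n"
print(apply_pattern(r"pwd=\w+", "single", content))
print(apply_pattern(r"pwd=\w+", "multi", content))
```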
Prematch Testers
A prematch tester is a test that runs before in-depth matching and detection logic is applied. For example, it is more appropriate to bail out of detection early for a small binary file or for a file classified as documentation.

test_content_prematch
This meta-tester uses the content classification and inference engine. It is a collection of testers that help decide whether a certain file is worth a deep dive.
By testing for content, you can:
- Filter for an unexpected binary file.
- Ensure a non-empty file goes through for further detection.
- Be able to run on classes of files, where Code Security has classified those by their content nature.
Content class | Example |
---|---|
code/infra | Ruby, Python |
data/infra | SQL, JSON |
binary | Binary files |
docs | Markdown, Text |
tests | Unit tests, other test code |
examples | Example code, demo code and others |
vendor | 3rd party code sitting in node_modules and others |
files | A general file class not fitting a single class. |
Usage
pattern: ".*"
test_content_prematch:
  binary: false
  minlen: 20
  maxlen: 2000
  content_classes:
    - code/infra # our own classification engine results
  # content_classes_not:
  #   - Code # the inverse of content_classes
  content_types:
    - Python # a programming language *name* (if you want an extension, there are ways for that too)
  content_types_not:
    # - Ruby # the inverse of content_types
Test positive
<a Python file, size at 20-2000 bytes>
Test negative
<an SQL file, or a small file, or a binary file, etc.>
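The size and binary checks of the usage above can be sketched as follows. This is a rough illustration: `content_prematch` is a hypothetical helper, the NUL-byte check is a crude stand-in for real binary detection, and content classes and content types are left out because they rely on the engine's own classification.

```python
def content_prematch(data: bytes, binary: bool = False,
                     minlen: int = 20, maxlen: int = 2000) -> bool:
    """Decide whether a file is worth passing on to in-depth matching."""
    looks_binary = b"\x00" in data  # assumption: a NUL byte marks binary content
    if looks_binary != binary:
        return False  # wrong kind of file, skip it
    return minlen <= len(data) <= maxlen  # too small or too large files are skipped


print(content_prematch(b"x" * 50))    # within 20-2000 bytes, text
print(content_prematch(b"short"))     # too small
print(content_prematch(b"\x00" * 50)) # binary
```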

test_regex_prematch
You can test for a specific pre-match structure before Code Security deep dives into further matching.
By testing for Regex prematch, you can:
- Make sure a certain file structure exists before applying further testing, such as variable assignments.
- Verify that a certain 'sentinel' word exists in a large file by applying a generic word lookup, before applying a more specific matching.
Usage
pattern: "pass:(.*)"
test_regex_prematch:
  - on: 0 # on full text
    pattern: "aws\\.amazon\\.com"
Test positive
<large documentation file>
Here is how to connect to our database
1. Log into AWS console (console.aws.amazon.com)
2. Use following details:
DB pass: shazam123
Test negative
<Big file, not containing any mention of AWS detail>
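The idea of a cheap sentinel check before the specific match can be sketched in Python. `scan_with_prematch` is a hypothetical helper, not part of the product; it only illustrates the ordering of the two steps.

```python
import re
from typing import Optional


def scan_with_prematch(text: str, prematch: str, pattern: str) -> Optional[str]:
    """Run the specific pattern only if the sentinel regex is found in the full text."""
    if re.search(prematch, text) is None:
        return None  # sentinel missing: skip the more specific matching
    match = re.search(pattern, text)
    return match.group(1) if match else None


doc = (
    "1. Log into AWS console (console.aws.amazon.com)\n"
    "2. Use following details:\n"
    "DB pass: shazam123\n"
)
print(scan_with_prematch(doc, r"aws\.amazon\.com", r"pass:(.*)"))
```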
Content Testers

test_fingerprints
Code Security can create one-way fingerprints for you to use when you want to detect pieces of information you cannot reveal.
By using test_fingerprints, you can:
- Detect credit cards
- Find classified or private domains or hosts
First, you must generate your fingerprint. It is done locally on your machine using a secure and salted one-way hash:
$ $HOME/.spectral/spectral fingerprint --text <your private text>
< fingerprint >
Then, copy the resulting fingerprint.
Usage
pattern: "host=([a-zA-Z0-9_.-]+)"
test_fingerprints:
  - on: 1
    with: "<your fingerprint>"
    is: true
Note that by specifying the character class and narrowing it down, we give some useful information to attackers looking to brute-force private information. Always make sure that your character classes and secrets are wide enough.
Test positive
<private host>
Test negative
<any other text>
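The general idea of matching against a salted one-way hash can be sketched in Python. This is not the product's fingerprint format or algorithm, which is not described here: the SHA-256 construction, the salt handling, and both helper names are assumptions made for illustration.

```python
import hashlib
import hmac


def make_fingerprint(secret: str, salt: bytes) -> str:
    """Hypothetical fingerprint: SHA-256 over salt + secret, hex encoded."""
    return hashlib.sha256(salt + secret.encode("utf-8")).hexdigest()


def matches_fingerprint(candidate: str, salt: bytes, fingerprint: str) -> bool:
    """Hash the captured candidate and compare it with the stored fingerprint."""
    return hmac.compare_digest(make_fingerprint(candidate, salt), fingerprint)


salt = b"example-salt"
stored = make_fingerprint("db.internal.acme.corp", salt)  # only the hash is kept
print(matches_fingerprint("db.internal.acme.corp", salt, stored))
print(matches_fingerprint("www.example.com", salt, stored))
```

The detector only ever sees the hash, so the private value itself never has to appear in the detector file.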

test_from_env
Instead of encoding secrets as fingerprints, you can collect them from your environment (ENV) and still search for them in your code. Code Security fetches them from your ENV and relays them to the detector.
By using test_from_env, you can:
- Detect secrets that you already have in your environment (local machine or CI) without exposing them.
- Find secrets that you do not want to expose in a persistent way.
To test, make sure to export the variable first:
$ SOME_SECRET_VAR=shazam $HOME/.spectral/spectral scan --nosend
Usage
pattern: "host=(.*)"
test_from_env:
  - on: 1
    with: "SOME_SECRET_VAR"
    is: true
Test positive
shazam
Test negative
foobar
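The comparison against an environment variable can be sketched as follows. `matches_env_secret` is a hypothetical helper, and it assumes an exact comparison between the captured value and the variable's value.

```python
import hmac
import os


def matches_env_secret(candidate: str, var_name: str) -> bool:
    """Return True if the captured candidate equals the secret held in the environment."""
    secret = os.environ.get(var_name)
    if not secret:
        return False  # variable not exported: nothing to compare against
    return hmac.compare_digest(candidate.encode("utf-8"), secret.encode("utf-8"))


os.environ["SOME_SECRET_VAR"] = "shazam"  # stands in for the exported variable
print(matches_env_secret("shazam", "SOME_SECRET_VAR"))
print(matches_env_secret("foobar", "SOME_SECRET_VAR"))
```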

test_luhn
The Luhn algorithm is the checksum used for credit card numbers and for several national identification numbers, such as the Canadian Social Insurance Number and the Israeli ID number.
By testing for Luhn, you can:
- Ensure a number is a valid credit card number.
- Verify that a matched string passes as a valid identification number, which helps distinguish real numbers from fake or test strings.
Usage
pattern: "account=([0-9]+)"
test_luhn:
  - on: 1
    is: true
Test positive
79927398713
Test negative
79927398710
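For reference, the Luhn checksum itself is short enough to sketch in Python. The function name `luhn_valid` is chosen for illustration; the two sample numbers are the test strings from this section.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    if not (number.isascii() and number.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("79927398713"))  # True
print(luhn_valid("79927398710"))  # False
```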

test_number
Available from: v1.4.2
Test for a generic representation of a number.
By testing for numbers, you can rule out a value that is supposed to be a password or a token.
Usage
pattern: "key=(.*)"
test_number:
  - on: 1
    is: false
Test positive
key=<random token>
Note that by returning false and is: false, test_number provides a positive outcome.
Test negative
key=0.1234
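One way to recognize a generic number is a full-match regex, sketched below. Which number formats the tester actually accepts is not specified here, so the regex (optional sign, decimals, optional exponent) and the name `is_number` are assumptions.

```python
import re

# Assumption: integers, decimals and scientific notation count as numbers.
_NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def is_number(value: str) -> bool:
    """Return True if the whole value reads as a number."""
    return _NUMBER.fullmatch(value.strip()) is not None


print(is_number("0.1234"))
print(is_number("48SfRa4idxxUVyPAejafXxwjkreyj8MoJkjV"))
```

With is: false, a detector continues only when this kind of check returns False.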

test_base64, test_base64bin
Verify that a text is base64 encoded, or that it is base64-encoded binary data. All common encoding variants (URL safe and others) are supported.
By testing for base64, you can:
- Ensure that a match is base64 and fail fast in a sequence of tests when you are looking for a token.
- Validate that a string is base64 encoded given you suspect that it may contain sensitive information.
Usage
pattern: "account_encoded='([[:alnum:]/+]+[=]{0,2})'"
test_base64:
  - on: 1
    is: true
Test positive
account_encoded='eyAiYWNjb3VudCI6ICJzZWNyZXQtbnVtYmVyIiB9'
Test negative
account_encoded='replace_me'
The binary variant first decodes the base64 encoded string, and then tests whether it is binary or not:
pattern: "account_encoded='([0-9]+)'"
test_base64bin:
  - on: 1
    is: true
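Both checks can be sketched with Python's standard library. This is an approximation, not the engine's logic: the helpers are hypothetical, padding is required (so unpadded variants are rejected), and "binary" is taken to mean that the decoded bytes are not valid UTF-8 or contain a NUL byte.

```python
import base64
import binascii
from typing import Optional


def decode_base64(value: str) -> Optional[bytes]:
    """Strictly decode standard or URL-safe base64; return None if invalid."""
    normalized = value.replace("-", "+").replace("_", "/")  # map URL-safe alphabet
    try:
        return base64.b64decode(normalized, validate=True)
    except (binascii.Error, ValueError):
        return None


def is_base64(value: str) -> bool:
    return bool(value) and decode_base64(value) is not None


def is_base64_binary(value: str) -> bool:
    """Decode first, then decide whether the payload looks binary."""
    decoded = decode_base64(value)
    if not decoded:
        return False
    try:
        decoded.decode("utf-8")
    except UnicodeDecodeError:
        return True  # not valid text
    return b"\x00" in decoded


print(is_base64("eyAiYWNjb3VudCI6ICJzZWNyZXQtbnVtYmVyIiB9"))
print(is_base64("replace_me"))
```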

test_binary
As Code Security detectors are binary-aware, you can test for binary matches in any capturing expression.
By testing for binary data, you can flag and avoid false matches that contain no text.
Usage
pattern: "token=(.*)"
test_binary:
  - on: 1
    is: false
Test positive
Note - is: false is a positive outcome if the candidate is not binary.
token=48SfRa4idxxUVyPAejafXxwjkreyj8MoJkjV
Test negative
<BINARY DATA>token=<BINARY_DATA>

test_maxlen, test_minlen
Test for content size, minimum or maximum.
By testing for content size, you can:
- Fail fast on very short strings or very large content, and skip the match.
- Validate that, on top of the various structural captures that you have done, you end up with a reasonably sized match.
Usage
pattern: "account_encoded='(.*)'"
test_minlen:
  - on: 1
    score: 2
Test positive
account_encoded='eyAiYWNjb3VudCI6ICJzZWNyZXQtbnVtYmVyIiB9'
Test negative
account_encoded='XX'
In the same way, you can use test_maxlen:
pattern: "account_encoded='(.*)'"
test_maxlen:
  - on: 1
    score: 2000
Structural Testers

test_jwt
JWT (JSON Web Token) is a proposed Internet standard for creating data with an optional signature and/or optional encryption, whose payload holds JSON that asserts claims. It is often used for service-to-service authentication. This tester checks whether a match is a JWT.
By testing for JWT, you can:
- Make sure the key structure fits a standard JWT.
- Verify that a certain JWT is semantically valid (header is valid).
Usage
pattern: "token=(\\S+)"
test_jwt:
  - on: 1
    is: true
Test positive
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJsb2dnZWRJbkFzIjoiYWRtaW4iLCJpYXQiOjE0MjI3Nzk2Mzh9.gzSraSYS8EXBxLN_oWnFSRgCzcmJmMjLiuyu5CSpyHI
Test negative
bad_token
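A structural JWT check can be sketched in Python as follows. It does not verify signatures and is not the tester's actual logic: `looks_like_jwt` is a hypothetical helper that assumes a valid token has three dot-separated segments, a JSON header containing "alg", and a JSON payload.

```python
import base64
import binascii
import json


def _b64url_decode(segment: str) -> bytes:
    """Decode an unpadded base64url segment by restoring the padding first."""
    padded = segment + "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(padded)


def looks_like_jwt(token: str) -> bool:
    parts = token.split(".")
    if len(parts) != 3 or not all(parts[:2]):
        return False  # need header.payload.signature with non-empty header and payload
    try:
        header = json.loads(_b64url_decode(parts[0]))
        json.loads(_b64url_decode(parts[1]))  # payload must also be JSON
    except (binascii.Error, ValueError):
        return False
    return isinstance(header, dict) and "alg" in header


token = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9."
    "eyJsb2dnZWRJbkFzIjoiYWRtaW4iLCJpYXQiOjE0MjI3Nzk2Mzh9."
    "gzSraSYS8EXBxLN_oWnFSRgCzcmJmMjLiuyu5CSpyHI"
)
print(looks_like_jwt(token))
print(looks_like_jwt("bad_token"))
```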

test_uri
A URI or URL parsing test. A given string is tested to be a valid URI.
By testing for URI, you can:
- Isolate URLs that are sensitive before applying further matching logic.
- Detect various kinds of authentication, such as Bearer, Basic and more, given a URL request structure (for example, curl'ing URLs).
Usage
pattern: "curl\\s.*(http.*)"
test_uri:
  - on: 1
    is: true
Test positive
curl -L -o https://dev.acme.corp/secure/credentials.json -H"Authorization: Bearer <token>"
Test negative
sh curl.sh arg1 arg2

test_tvar
Test for various template variables, common in configuration and IaC files.
By testing for template variables, you can filter for legitimate configuration that was built with proper template variables instead of hardcoded secrets.
Usage
pattern: "DB_PASS=(.*)"
test_tvar:
  - on: 1
    is: false
Test positive
Note - is: false is a positive outcome if the candidate does not contain a template variable.
DB_PASS=my-secret-password
Test negative
DB_PASS={{.Env.DBPass}}

test_changeme
Available from: v1.4.2
Test for various changeme values. Developers sometimes mark a value to be replaced with commonly known idioms, such as fixme and XXX. These are collectively known as changeme values.
By testing for changeme, you can:
- Filter for mock values, or "TODO: replace this" values.
- Combine it with other testers to create a powerful detector.
Usage
pattern: "DB_PASS=(.*)"
test_changeme:
  - on: 1
    is: false
Test positive
Note - is: false is a positive outcome if the candidate does not contain a changeme value.
DB_PASS="<real password>"
Test negative
DB_PASS="XXX"

test_assignment
Available from: v1.4.2
Test for an assignment structure.
By testing for assignment, you can:
- Set the scene for detectors which are only interested in one part of an assignment clause.
- Combine an expected assignment with another tester to create a more powerful detector.
Usage
pattern: "DB_PASS(.*)"
test_assignment:
  - on: 0 # on the complete expression
    is: true
test_token:
  - on: 1
    is: true
Test positive
DB_PASS=<random token>
Test negative
DB_PASS, foo, bar

test_uuid
Test if a given string is a UUID. It supports all UUID types and formats (with or without hyphens, and with or without a prefix).
By testing for UUID, you can ignore suspect strings that are randomly generated but in fact are IDs (database IDs or other).
Usage
pattern: "key=(.*)"
test_uuid:
  - on: 1
    is: false
Test positive
Note - is: false is a positive outcome if the candidate does not contain a UUID:
key=my-secret-key
Test negative
key=<UUID representing a DB table primary key>

test_regex, test_regex_not
A test_regex is a tester that verifies a structural form after a match is found, so that you can verify the match further.
By using test_regex, you can:
- Apply a clearer set of validations that is readable and maintainable.
- Split verification into stages to spell out a specific use case:
  - Capture something vague. For example, Bearer (.*).
  - Run a semantic tester. For example, test_token on the token part of the bearer.
  - Run a structural tester. For example, "it should look like a curl request" with test_regex.
- Apply verification that is beyond the capabilities of a single regex DFA. For example, matching that would otherwise need more aggressive backtracking can stay performant by running two separate regexes and combining the results later.
As an array-based tester, it creates an AND relation between its elements and applies short-circuiting (failing fast):
- test_regex: all must apply; fail if one does not apply.
- test_regex_not: all must not apply; fail if one applies.
Usage
pattern: "token=(.+)"
test_regex:
  - on: 1
    pattern: "([0-9].*){2}" # the value includes at least 2 digits
  - on: 1
    pattern: "([a-zA-Z].*){2}" # the value includes at least 2 letters
Test positive
token=48SfRa4idxxUVyPAejafXxwjkreyj8MoJkjV
Test negative
token=env.get('token')
Usage (test_regex_not)
pattern: "token=(\\S+)"
test_regex_not:
  - on: 1
    pattern: "[$][a-zA-Z0-9_-]+" # the value includes a valid template variable
  - on: 1
    pattern: "(?i)(example|test|fake|1234|abcde|xxxx|foobar)" # the value includes a word or pattern that indicates a token placeholder
Test positive
token=48SfRa4idxxUVyPAejafXxwjkreyj8MoJkjV
Test negative
token=$my_token
token=48SfRa4idxxUVyPAejafXxwjkreyjEXAMPLE
token=testRa4idxxUVyPAejafXxwjkreyj8MoJkjV
token=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
token=test1234abcdefoobarfake
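The AND relation and the fail-fast behavior of the two testers can be sketched in Python. The helper names are hypothetical; `all` and `any` over a generator stop at the first deciding element, which mirrors the short-circuiting.

```python
import re
from typing import List


def regex_all_apply(value: str, patterns: List[str]) -> bool:
    """test_regex semantics: every pattern must be found; stops at the first miss."""
    return all(re.search(p, value) is not None for p in patterns)


def regex_none_apply(value: str, patterns: List[str]) -> bool:
    """test_regex_not semantics: no pattern may be found; stops at the first hit."""
    return not any(re.search(p, value) is not None for p in patterns)


must = [r"([0-9].*){2}", r"([a-zA-Z].*){2}"]  # at least 2 digits and 2 letters
must_not = [r"[$][a-zA-Z0-9_-]+"]             # template variable

print(regex_all_apply("48SfRa4idxxUVyPAejafXxwjkreyj8MoJkjV", must))
print(regex_all_apply("env.get('token')", must))
print(regex_none_apply("$my_token", must_not))
```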
Semantic Testers

test_cword
Test for the percentage of common words in a given string. The test is based on a unique and massive dictionary model of tech-related common words.
By testing for common words, you can:
- Rule out non-machine generated keys.
- Validate that a given match passes as a machine generated secret.
Usage
pattern: "pass=(.*)"
test_cword:
  - on: 1
    from: 0.0 # defines a range of accepted percentage
    to: 0.2 # low percentage of common words (up to 20%)
Test positive
zx28821a{_)
Test negative
hello

test_zx
Test for password strength based on the popular zxcvbn library.
By testing for zx (abbreviated), you can:
- Detect strong passwords among fake ones.
- Apply existing policies for enforcing password strength.
Usage
pattern: "pass=(.*)"
test_zx:
  - on: 1
    score: 4.0 # same standard score scale (0-4) from zxcvbn
Test positive
zxHELLOyw{_)
Test negative
foobar

test_pass
Test for password strength (own model). Pick a threshold on a scale of 0-100.0. A password with strength > 80 is considered strong.
By using test_pass, you can detect strong passwords among fake ones.
Usage
pattern: "pass=(.*)"
test_pass:
  - on: 1
    score: 80.0 # scale: 0-100
Test positive
zxHELLOyw{_)
Test negative
foobar

test_token
Test for tokens, keys, and machine-generated secrets (own model).
By using test_token, you can:
- Detect real tokens, keys, and secrets.
- Verify that a machine generated token is secret by model attributes.
Usage
- pattern: "token=(.*)"
  pattern_type: multi
  test_token:
    - on: 1
      score: 0.6 # True if the score is greater than 0.6
      # max is 1, min is 0
Test positive
token=48SfRa4idxxUVyPAejafXxwjkreyj8MoJkjV
Test negative
token=AnotherVariableOfClientData[0];

test_entropy
A normalized entropy test. We do not recommend this test as entropy is a metric not optimized for finding secrets and sensitive information. You can use entropy if you use legacy infrastructure and policies.
Usage
- pattern: "token=(.*)"
  pattern_type: multi
  test_entropy:
    - on: 1
      score: 4.0 # True if the entropy of the value is greater than 4
      # max is 5, min is 0
Test positive
token=48SfRa4idxxUVyPAejafXxwjkreyj8MoJkjV
Test negative
token=G6q5oRa4idxxxxxxxxxxxxxwjkreyj8MoJkjV
token=FooBarFooBarFooBarFooBarFooBarFooBar
token=asdfdsafdsfasdfadsfadsfdasfasdfsdafsdf
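To see why the examples above fall on different sides of the threshold, character-level Shannon entropy can be computed with a short Python sketch. The exact normalization the tester applies is not specified here, so plain entropy in bits per character is an assumption, as is the helper name.

```python
import math
from collections import Counter


def shannon_entropy(value: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not value:
        return 0.0
    total = len(value)
    counts = Counter(value)
    # Sum p * log2(1/p) over the distinct characters.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


random_looking = "48SfRa4idxxUVyPAejafXxwjkreyj8MoJkjV"
repetitive = "FooBarFooBarFooBarFooBarFooBarFooBar"
print(round(shannon_entropy(random_looking), 2))
print(round(shannon_entropy(repetitive), 2))
```

The repetitive string uses only five distinct characters, so its entropy stays well below that of the random-looking token.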
Testing Detector
To test, you can selectively include your new detectors by using --just-ids and/or --just-tags. You can use these flags with any of the common Code Security commands.
If you want to run your new rule on your entire GitHub org:
$HOME/.spectral/spectral github ... --just-ids PRV001
Alternatively, to scan only your current repo:
$HOME/.spectral/spectral run ... --just-tags acme-security
Submit the Detector for Review
Contact Check Point Support Center to review your detector. Make sure to redact sensitive information in the detector before you submit it. Check Point can help you build it and give you a free detector building session.