Whitelisting

In the previous examples we’ve looked at the metrics and details of internal facing service like failed SSH logins. Those types aren’t prone to a lot of false positives. But other sources, like web access logs, can be.

False Positives

You’ll recall that when looking at metrics that a high number of ‘Lines unparsed’ is normal. They were simply entries that didn’t match any specific events the parser was looking for. Parsed lines however, are ‘poured’ to a bucket. A bucket being a potential attack type.

sudo cscli metrics

Acquisition Metrics:
╭────────────────────────────────┬────────────┬──────────────┬────────────────┬────────────────────────╮
│             Source             │ Lines read │ Lines parsed │ Lines unparsed │ Lines poured to bucket │
├────────────────────────────────┼────────────┼──────────────┼────────────────┼────────────────────────┤
│ file:/var/log/auth.log         │ 69         │ -            │ 69             │ -                      │
│ file:/var/log/caddy/access.log │ 21         │ 21           │ -              │ 32                     │ <--Notice the high number in the 'poured' column
│ file:/var/log/syslog           │ 2          │ -            │ 2              │ -                      │
╰────────────────────────────────┴────────────┴──────────────┴────────────────┴────────────────────────╯

In the above example, “lines poured” is bigger than the number parsed. This is because some lines can match more than one scenario and end up in multiple buckets, like a malformed user agent asking for a page that doesn’t exist. Sometimes, that’s OK. Action isn’t taken until a given bucket meets a threshold. That’s in scenarios so let’s take a look there.

Scenario Metrics:
╭──────────────────────────────────────┬───────────────┬───────────┬──────────────┬────────┬─────────╮
│                Scenario              │ Current Count │ Overflows │ Instantiated │ Poured │ Expired │
├──────────────────────────────────────┼───────────────┼───────────┼──────────────┼────────┼─────────┤
│ crowdsecurity/http-crawl-non_statics │ -             │ -         │ 2            │ 17     │ 2       │
│ crowdsecurity/http-probing           │ -             │ 1         │ 2            │ 15     │ 1       │
╰──────────────────────────────────────┴───────────────┴───────────┴──────────────┴────────┴─────────╯

It appears the scenario ‘http-crawl-non_statics’ is designed to allow some light web-crawling. Of the 32 events ‘poured’ above, 17 of them went into it’s bucket and it ‘Instantiated’ tracking against 2 IPs, but neither ‘Overflowed’, which would cause an action to be taken.

However, ‘http-probing’ did. Assuming this is related to a web application you’re trying to use, you just got blocked. So let’s see what that scenario is looking for and what we can do about it.

sudo cscli hub list | grep http-probing

  crowdsecurity/http-probing                        ✔️  enabled  0.4      /etc/crowdsec/scenarios/http-probing.yaml

sudo cat  /etc/crowdsec/scenarios/http-probing.yaml
...
...
filter: "evt.Meta.service == 'http' && evt.Meta.http_status in ['404', '403', '400']
capacity: 10
reprocess: true
leakspeed: "10s"
blackhole: 5m
...
...

You’ll notice that it’s simply looking for a few status codes, notably ‘404’. If you get more than 10 in 10 seconds, you get black-holed for 5 min. The next thing is to find out what web requests are triggering it. We could just look for 404s in the web access log, but we can also ask CrowdSec itself to tell is. This will be more important when the triggers are more subtle, so let’s give it a try now.

# Grep some 404 events from the main log to a test file
sudo grep 404 /var/log/caddy/access.log | tail  >  ~/test.log

# cscli explain with -v for more detail
sudo cscli explain -v --file ./test.log --type caddy

  ├ s00-raw
  | ├ 🟢 crowdsecurity/non-syslog (first_parser)
  | └ 🔴 crowdsecurity/syslog-logs
  ├ s01-parse
  | └ 🟢 crowdsecurity/caddy-logs (+19 ~2)
  |   └ update evt.Stage : s01-parse -> s02-enrich
  |   └ create evt.Parsed.request : /0/icon/Smith
  |   ...
  |   └ create evt.Meta.http_status : 404
  |   ...
  ├-------- parser success 🟢
  ├ Scenarios
    ├ 🟢 crowdsecurity/http-crawl-non_statics
    └ 🟢 crowdsecurity/http-probing

In this case, the client is asking for the file /0/icon/Smith and it doesn’t exist. Turns out, the web client is asking just in case and accepting the 404 without complaint in the background. That’s fine for the app, but matches two things under the Scenarios section; that of someone crawling the server, and or someone probing it. To fix this, we’ll need to create a whitelist definition for the app.

You can also work it from the alerts side and inspect what happened (assuming you’ve caused an alert).

sudo cscli alert list

# This is an actual attack, and not something to be whitelisted, but it's a good example of how the inspection works.

╭─────┬──────────────────────────┬────────────────────────────────────────────┬─────────┬──────────────────────────────────────┬───────────┬─────────────────────────────────────────╮
│  ID │           value          │                   reason                   │ country │                  as                  │ decisions │                created_at               │
├─────┼──────────────────────────┼────────────────────────────────────────────┼─────────┼──────────────────────────────────────┼───────────┼─────────────────────────────────────────┤
951 │ Ip:165.22.253.118        │ crowdsecurity/http-probing                 │ SG      │ 14061 DIGITALOCEAN-ASN               │ ban:1     │ 2025-02-26 13:53:08.589118208 +0000 UTC │


sudo cscli alerts inspect 951 -d

################################################################################################

 - ID           : 951
 - Date         : 2025-02-26T13:53:14Z
 - Machine      : 0e4a17d2f5d44270b7d543ac29c1dd4eWv2ozxHsRqoJWmRL
 - Simulation   : false
 - Remediation  : true
 - Reason       : crowdsecurity/http-probing
 - Events Count : 11
 - Scope:Value  : Ip:165.22.253.118
 - Country      : SG
 - AS           : DIGITALOCEAN-ASN
 - Begin        : 2025-02-26 13:53:08.589118208 +0000 UTC
 - End          : 2025-02-26 13:53:13.990699814 +0000 UTC
 - UUID         : eb454114-bc1e-455d-bfcc-f4772803e8bf


 - Context  :
╭────────────┬──────────────────────────────────────────────────────────────╮
│     Key    │                             Value                            │
├────────────┼──────────────────────────────────────────────────────────────┤
│ method     │ GET                                                          │
│ status     │ 403│ target_uri │ /                                                            │
│ target_uri │ /wp-includes/wlwmanifest.xml                                 │
│ target_uri │ /xmlrpc.php?rsd                                              │
│ target_uri │ /blog/wp-includes/wlwmanifest.xml                            │
│ target_uri │ /web/wp-includes/wlwmanifest.xml                             │
│ target_uri │ /wordpress/wp-includes/wlwmanifest.xml                       │
│ target_uri │ /website/wp-includes/wlwmanifest.xml                         │
│ target_uri │ /wp/wp-includes/wlwmanifest.xml                              │
│ target_uri │ /news/wp-includes/wlwmanifest.xml                            │
│ target_uri │ /2018/wp-includes/wlwmanifest.xml                            │
│ target_uri │ /2019/wp-includes/wlwmanifest.xml                            │
│ user_agent │ Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 │
│            │ (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36       │
╰────────────┴──────────────────────────────────────────────────────────────╯

Whitelist

To whitelist an app, we create a file with an expression that matches the behavior we see above, such as the apps attempts to load a file that doesn’t exist, and exempts it. You can only add these to the s02 stage folder and the name element but be unique for each.

sudo vi /etc/crowdsec/parsers/s02-enrich/some-app-whitelist.yaml

This example uses the startsWith expression and assumes that all requests start the same

name: you/some-app
description: "Whitelist 404s for icon requests" 
whitelist: 
  reason: "icon request" 
  expression:   
    - evt.Parsed.request startsWith '/0/icon/'

If it’s less predictable, you can use a regular expression instead and combine with other expressions like a site match. In general, the more specific the better.

name: you/some-app-whitelist
description: "Whitelist 404s for icon requests" 
whitelist: 
  reason: "icon request" 
  expression:   
    - evt.Parsed.request matches '^/[0-9]/icon/.*' && evt.Meta.target_fqdn == "some-app.you.org"

Now you can reload crowdsec and test

sudo systemctl restart crowdsec.service

sudo cscli explain -v --file ./test.log --type caddy
 ├ s00-raw
 | ├ 🔴 crowdsecurity/syslog-logs
 | └ 🟢 crowdsecurity/non-syslog (+5 ~8)
 |  └ update evt.ExpectMode : %!s(int=0) -> 1
 |  └ update evt.Stage :  -> s01-parse
...
 ├ s02-enrich
 | ├ 🟢 you/some-app-whitelist (~2 [whitelisted])
 |  ├ update evt.Whitelisted : %!s(bool=false) -> true
 |  ├ update evt.WhitelistReason :  -> some icon request
 | ├ 🟢 crowdsecurity/dateparse-enrich (+2 ~2)
...
...
 | ├ 🟢 crowdsecurity/http-logs (+7)
 | └ 🟢 crowdsecurity/whitelists (unchanged)
 └-------- parser success, ignored by whitelist (audioserve icon request) 🟢

You’ll see in the above example, we successfully parsed the entry, but it was ‘ignored’ and didn’t go on to the Scenario setion.

Regular Checking

You’ll find yourself doing this fairly regularly at first.

# Look for an IP on the ban list
sudo cscli alerts list

# Pull out the last several log entries for that IP
sudo grep SOME.IP.FROM.ALERTS /var/log/caddy/access.log | tail -10 > test.log 

# See what it was asking for
cat test.log | jq '.request'
cat test.log | jq '.request.uri'

# Ask caddy why it had a problem
sudo cscli explain -v --file ./test.log --type caddy

Troubleshooting

New Whitelist Has No Effect

If you have more than one whitelist, check the name you gave it on the first line. If that’s not unique, the whole thing will be silently ignore.

Regular Expression Isn’t Matching

CrowdSec uses the go-centric expr-lang. You may be used to unix regex where you’d escape slashes, for example. A tool like https://www.akto.io/tools/regex-tester is helpful.