CrowdSec

1: Installation
2: Detailed Activity
3: Whitelisting
4: Custom Parser
5: On Alpine
6: Cloudflare Proxy

1 - Installation

Overview

CrowdSec has two main parts: detection and interdiction.

Detection is handled by the main CrowdSec binary. You tell it what files to keep an eye on, how to parse those files, and what something ‘bad’ looks like. It then keeps a list of IPs that have done bad things.

Interdiction is handled by any number of plugins called ‘bouncers’, so named because they block access or kick out bad IPs. They run independently and keep an eye on the list, to do things like edit the firewall to block access for a bad IP.

There is also the ‘crowd’ part. The CrowdSec binary downloads IPs of known bad-actors from the cloud for your bouncers to keep out and submits alerts from your systems.

Installation

With Debian, you can simply add the repo via their script and install with a couple of lines.

curl -s https://packagecloud.io/install/repositories/crowdsec/crowdsec/script.deb.sh | sudo bash
sudo apt install crowdsec
sudo apt install crowdsec-firewall-bouncer-nftables

This installs both the detection (crowdsec) and the interdiction (crowdsec-firewall-bouncer) parts. Assuming everything went well, crowdsec will check in with the cloud and download a baseline list of known bad-actors, the firewall-bouncer will set up a basic drop list in the firewall, and crowdsec will start watching your syslog for intrusion attempts.

# Check out the very long drop list
sudo nft list ruleset | less

Note - if there are no rules, you may need to sudo systemctl restart nftables.service or possibly reboot (as I’ve found in testing)

Configuration

CrowdSec comes pre-configured to watch for ssh brute-force attacks. If you have specific services to watch you can add those as described below.

Add a Service

You probably want to watch a specific service, like a web server. Take a look at https://hub.crowdsec.net/ to see all the available components. For example, browse the collections and search for caddy. The more info link will show you how to install the collection:

sudo cscli collections list -a
sudo cscli collections install crowdsecurity/caddy

Tell CrowdSec where Caddy’s log files are.

sudo tee -a /etc/crowdsec/acquis.yaml << EOF

---
filenames:
 - /var/log/caddy/*.log
labels:
  type: caddy
---
EOF

Reload crowdsec for these changes to take effect

sudo systemctl reload crowdsec

Operation

DataFlow

CrowdSec works by pulling in data from the Acquisition files, Parsing the events, comparing to Scenarios, and then Deciding if action should be taken.

Acquisition of data from log files is based on entries in the acquis.yaml file, and each event is given the label defined in that file.

Those events feed the Parsers. There are a handful by default, but only the ones specifically interested in a given label will see it. They look for keywords like ‘FAILED LOGIN’ and then extract the IP.

Successfully parsed lines are fed to the Scenarios to see if what happened matters. The scenarios look for things like 10 FAILED LOGINs in 1 min. This separates the accidental bad password entry from a brute force attempt.

Matching a scenario gets the IP added to the Decision List, i.e. the list of bad IPs. These have a configurable expiration, so that if you really guess wrong 10 times in a row, you’re not banned forever.

The bouncers use this list to take action, like a firewall block, and will unblock you after the expiration.
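
To make the thresholding concrete, here’s a rough sketch of a leaky bucket in shell. It’s an illustration of the idea, not CrowdSec’s actual code: the function name bucket_overflows, the numbers used, and the rule that the bucket overflows once it holds more than its capacity are all assumptions for the example.

```shell
# bucket_overflows CAPACITY LEAK_SECONDS   (hypothetical helper, for illustration only)
# Reads one event timestamp per line on stdin (whole seconds, ascending order).
# The bucket gains one event per line and drains one event every LEAK_SECONDS.
# Prints "overflow at <timestamp>" when it holds more than CAPACITY events,
# otherwise "no overflow".
bucket_overflows() {
  awk -v cap="$1" -v leak="$2" '
    {
      if (level > 0) {
        # Drain one event for every full leak interval that has passed.
        drained = int(($1 - last) / leak)
        level -= drained
        last += drained * leak
        if (level < 0) level = 0
      }
      # An empty bucket restarts its leak clock at the next event.
      if (level == 0) last = $1
      level++
      if (level > cap) { print "overflow at " $1; found = 1; exit }
    }
    END { if (!found) print "no overflow" }
  '
}

# Four quick events against a bucket that holds three
printf '%s\n' 0 1 2 3 | bucket_overflows 3 10

# The same four events spread out, one every 10 seconds
printf '%s\n' 0 10 20 30 | bucket_overflows 3 10
```

The first call reports an overflow at the fourth event, while the second never fills up because each event has drained away before the next one arrives. That’s the difference between a burst and the occasional typo.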

Collections

Parsers and Scenarios work best when they work together so they are usually distributed together as a Collection. You can have collections of collections as well. For example, the base installation comes with the linux collection that includes a few parsers and the sshd collection.

To see what Collections, Parsers and Scenarios are running, use the cscli command line interface.

sudo cscli collections list
sudo cscli collections inspect crowdsecurity/linux
sudo cscli collections inspect crowdsecurity/sshd

Inspecting the collection will tell you what parsers and scenarios it contains, as well as some metrics. To learn more about a collection and its components, you can check out its page:

https://hub.crowdsec.net/author/crowdsecurity/collections/linux

The metrics are a bit confusing until you learn that the ‘Unparsed’ column doesn’t mean unparsed so much as it means a non-event. These are just normal logfile lines that don’t have one of the keywords the parser was looking for, like ‘LOGIN FAIL’.

Status

Is anyone currently attacking you? The decisions list shows you any current bad actors and the alerts list shows you a summary of past decisions. If you are just getting started this is probably none, but if you’re open to the internet this will grow quickly.

sudo cscli decisions list
sudo cscli alerts list

But you are getting events from the cloud and you can check those with the -a option. You’ll notice that every 2 hours the community-blocklist is updated.

sudo cscli alerts list -a

After a while of this collection running, you’ll start to see these kinds of alerts

sudo cscli alerts list
╭────┬───────────────────┬───────────────────────────────────────────┬─────────┬────────────────────────┬───────────┬─────────────────────────────────────────╮
│ ID │       value       │                  reason                   │ country │           as           │ decisions │               created_at                │
├────┼───────────────────┼───────────────────────────────────────────┼─────────┼────────────────────────┼───────────┼─────────────────────────────────────────┤
│ 27 │ Ip:18.220.128.229 │ crowdsecurity/http-bad-user-agent         │ US      │ 16509 AMAZON-02        │ ban:1     │ 2023-03-02 13:12:27.948429492 +0000 UTC │
│ 26 │ Ip:18.220.128.229 │ crowdsecurity/http-path-traversal-probing │ US      │ 16509 AMAZON-02        │ ban:1     │ 2023-03-02 13:12:27.979479713 +0000 UTC │
│ 25 │ Ip:18.220.128.229 │ crowdsecurity/http-probing                │ US      │ 16509 AMAZON-02        │ ban:1     │ 2023-03-02 13:12:27.9460075 +0000 UTC   │
│ 24 │ Ip:18.220.128.229 │ crowdsecurity/http-sensitive-files        │ US      │ 16509 AMAZON-02        │ ban:1     │ 2023-03-02 13:12:27.945759433 +0000 UTC │
│ 16 │ Ip:159.223.78.147 │ crowdsecurity/http-probing                │ SG      │ 14061 DIGITALOCEAN-ASN │ ban:1     │ 2023-03-01 23:03:06.818512212 +0000 UTC │
│ 15 │ Ip:159.223.78.147 │ crowdsecurity/http-sensitive-files        │ SG      │ 14061 DIGITALOCEAN-ASN │ ban:1     │ 2023-03-01 23:03:05.814690037 +0000 UTC │
╰────┴───────────────────┴───────────────────────────────────────────┴─────────┴────────────────────────┴───────────┴─────────────────────────────────────────╯

You may even need to unblock yourself

sudo cscli decisions list
sudo cscli decisions delete --id XXXXXXX

Next Steps

You’re now taking advantage of the crowd part of CrowdSec and have added your own service. If you don’t have any alerts though, you may be wondering how well it’s actually working.

Take a look at the detailed activity if you want to look more closely at what’s going on.

2 - Detailed Activity

Inspecting Metrics

Data comes in through the parsers. To see what they are doing, let’s take a look at the Acquisition and Parser metrics.

sudo cscli metrics

Most of the ‘Acquisition Metrics’ lines will be read and unparsed. This is because normal events are dropped. It only considers lines parsed if they were passed on to a scenario. The ‘bucket’ column refers to event scenarios and is also blank as there were no parsed lines to hand off.

Acquisition Metrics:
╭────────────────────────┬────────────┬──────────────┬────────────────┬────────────────────────╮
│         Source         │ Lines read │ Lines parsed │ Lines unparsed │ Lines poured to bucket │
├────────────────────────┼────────────┼──────────────┼────────────────┼────────────────────────┤
│ file:/var/log/auth.log │ 216        │ -            │ 216            │ -                      │
│ file:/var/log/syslog   │ 143        │ -            │ 143            │ -                      │
╰────────────────────────┴────────────┴──────────────┴────────────────┴────────────────────────╯

The ‘Parser Metrics’ will show the individual parsers - but not all of them. Only parsers that have at least one ‘hit’ are shown. In this example, only the syslog parser shows up. It’s a low-level parser that doesn’t look for matches, so every line is a hit.

Parser Metrics:
╭─────────────────────────────────┬──────┬────────┬──────────╮
│             Parsers             │ Hits │ Parsed │ Unparsed │
├─────────────────────────────────┼──────┼────────┼──────────┤
│ child-crowdsecurity/syslog-logs │ 359  │ 359    │ -        │
│ crowdsecurity/syslog-logs       │ 359  │ 359    │ -        │
╰─────────────────────────────────┴──────┴────────┴──────────╯

However, try a couple of failed SSH login attempts and you’ll see them and how they feed up to the Acquisition Metrics.


Acquisition Metrics:
╭────────────────────────┬────────────┬──────────────┬────────────────┬────────────────────────╮
│         Source         │ Lines read │ Lines parsed │ Lines unparsed │ Lines poured to bucket │
├────────────────────────┼────────────┼──────────────┼────────────────┼────────────────────────┤
│ file:/var/log/auth.log │ 242        │ 3            │ 239            │ -                      │
│ file:/var/log/syslog   │ 195        │ -            │ 195            │ -                      │
╰────────────────────────┴────────────┴──────────────┴────────────────┴────────────────────────╯

Parser Metrics:
╭─────────────────────────────────┬──────┬────────┬──────────╮
│             Parsers             │ Hits │ Parsed │ Unparsed │
├─────────────────────────────────┼──────┼────────┼──────────┤
│ child-crowdsecurity/sshd-logs   │ 61   │ 3      │ 58       │
│ child-crowdsecurity/syslog-logs │ 442  │ 442    │ -        │
│ crowdsecurity/dateparse-enrich  │ 3    │ 3      │ -        │
│ crowdsecurity/geoip-enrich      │ 3    │ 3      │ -        │
│ crowdsecurity/sshd-logs         │ 8    │ 3      │ 5        │
│ crowdsecurity/syslog-logs       │ 442  │ 442    │ -        │
│ crowdsecurity/whitelists        │ 3    │ 3      │ -        │
╰─────────────────────────────────┴──────┴────────┴──────────╯

Lines poured to bucket, however, is still empty. That means the action didn’t match a scenario defining a hack attempt. In fact, you may notice the ‘whitelists’ parser was triggered. Let’s ask crowdsec to explain what’s going on.

Detailed Parsing

To see which parsers got involved and what they did, you can ask.

sudo cscli explain --file /var/log/auth.log --type syslog

Here’s an SSH example of a failed login. The numbers, such as (+9 ~1), mean that the parser added 9 elements it parsed from the raw event, and updated 1. Notice the whitelists parser at the end. It’s catching this event and dropping it, hence the ‘parser failure’. The failure message is a red herring, as this is how it’s supposed to work. It short-circuits as soon as it thinks something should be white-listed.

line: Mar  1 14:08:11 www sshd[199701]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=192.168.1.16  user=allen
        ├ s00-raw
        |       └ 🟢 crowdsecurity/syslog-logs (first_parser)
        ├ s01-parse
        |       └ 🟢 crowdsecurity/sshd-logs (+9 ~1)
        ├ s02-enrich
        |       ├ 🟢 crowdsecurity/dateparse-enrich (+2 ~1)
        |       ├ 🟢 crowdsecurity/geoip-enrich (+9)
        |       └ 🟢 crowdsecurity/whitelists (~2 [whitelisted])
        └-------- parser failure 🔴

But why exactly did it get whitelisted? Let’s ask for a verbose report.

sudo cscli explain -v --file /var/log/auth.log --type syslog

line: Mar  1 14:08:11 www sshd[199701]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=192.168.1.16  user=someGuy
        ├ s00-raw
        |       └ 🟢 crowdsecurity/syslog-logs (first_parser)
        ├ s01-parse
        |       └ 🟢 crowdsecurity/sshd-logs (+9 ~1)
        |               └ update evt.Stage : s01-parse -> s02-enrich
        |               └ create evt.Parsed.sshd_client_ip : 192.168.1.16
        |               └ create evt.Parsed.uid : 0
        |               └ create evt.Parsed.euid : 0
        |               └ create evt.Parsed.pam_type : unix
        |               └ create evt.Parsed.sshd_invalid_user : someGuy
        |               └ create evt.Meta.service : ssh
        |               └ create evt.Meta.source_ip : 192.168.1.16
        |               └ create evt.Meta.target_user : someGuy
        |               └ create evt.Meta.log_type : ssh_failed-auth
        ├ s02-enrich
        |       ├ 🟢 crowdsecurity/dateparse-enrich (+2 ~1)
        |               ├ create evt.Enriched.MarshaledTime : 2023-03-01T14:08:11Z
        |               ├ update evt.MarshaledTime :  -> 2023-03-01T14:08:11Z
        |               ├ create evt.Meta.timestamp : 2023-03-01T14:08:11Z
        |       ├ 🟢 crowdsecurity/geoip-enrich (+9)
        |               ├ create evt.Enriched.Longitude : 0.000000
        |               ├ create evt.Enriched.ASNNumber : 0
        |               ├ create evt.Enriched.ASNOrg : 
        |               ├ create evt.Enriched.ASNumber : 0
        |               ├ create evt.Enriched.IsInEU : false
        |               ├ create evt.Enriched.IsoCode : 
        |               ├ create evt.Enriched.Latitude : 0.000000
        |               ├ create evt.Meta.IsInEU : false
        |               ├ create evt.Meta.ASNNumber : 0
        |       └ 🟢 crowdsecurity/whitelists (~2 [whitelisted])
        |               └ update evt.Whitelisted : %!s(bool=false) -> true
        |               └ update evt.WhitelistReason :  -> private ipv4/ipv6 ip/ranges
        └-------- parser failure 🔴

Turns out that private IP ranges are whitelisted by default so you can’t lock yourself out from inside. The parser crowdsecurity/whitelists has updated the property ’evt.Whitelisted’ to true and gave it a reason. That property appears to be a built-in that flags events to be dropped.

If you want to change the ranges, you can edit the logic by editing the yaml file. A sudo cscli hub list will show you what file that is. Add or remove entries from the lists it checks the ‘ip’ and ‘cidr’ values against. Any match causes whitelisted to become true.
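
If you’re unsure whether an address would fall inside one of those ‘cidr’ entries, a small shell sketch like this can check it. The helper names (ip_to_int, in_cidr, is_private) are made up for this example, and the three ranges are the standard RFC 1918 private IPv4 blocks, which is an assumption about what your whitelist file contains. Check the actual yaml for the real list. It’s IPv4 only.

```shell
# ip_to_int A.B.C.D -> prints the IPv4 address as a 32-bit integer
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr IP NETWORK/PREFIX -> exit status 0 if IP is inside the range
in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# is_private IP -> exit status 0 if IP is in an RFC 1918 range (assumed list)
is_private() {
  for range in 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16; do
    in_cidr "$1" "$range" && return 0
  done
  return 1
}

for addr in 192.168.1.16 18.220.128.229; do
  if is_private "$addr"; then
    echo "$addr is in a private range"
  else
    echo "$addr is not in a private range"
  fi
done
```

With these ranges, 192.168.1.16 from the explain output above is reported as private and the public address is not, which matches why the failed SSH login never reached a scenario.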

3 - Whitelisting

In the previous examples we’ve looked at the metrics and details of an internal-facing service, like failed SSH logins. Those types aren’t prone to a lot of false positives. But other sources, like web access logs, can be.

False Positives

You’ll recall from looking at the metrics that a high number of ‘Lines unparsed’ is normal. They were simply entries that didn’t match any specific events the parser was looking for. Parsed lines, however, are ‘poured’ to a bucket, a bucket being a potential attack type.

sudo cscli metrics

Acquisition Metrics:
╭────────────────────────────────┬────────────┬──────────────┬────────────────┬────────────────────────╮
│             Source             │ Lines read │ Lines parsed │ Lines unparsed │ Lines poured to bucket │
├────────────────────────────────┼────────────┼──────────────┼────────────────┼────────────────────────┤
│ file:/var/log/auth.log         │ 69         │ -            │ 69             │ -                      │
│ file:/var/log/caddy/access.log │ 21         │ 21           │ -              │ 32                     │ <--Notice the high number in the 'poured' column
│ file:/var/log/syslog           │ 2          │ -            │ 2              │ -                      │
╰────────────────────────────────┴────────────┴──────────────┴────────────────┴────────────────────────╯

In the above example, “lines poured” is bigger than the number parsed. This is because some lines can match more than one scenario and end up in multiple buckets, like a malformed user agent asking for a page that doesn’t exist. Sometimes, that’s OK. Action isn’t taken until a given bucket meets a threshold. That’s in scenarios so let’s take a look there.

Scenario Metrics:
╭──────────────────────────────────────┬───────────────┬───────────┬──────────────┬────────┬─────────╮
│                Scenario              │ Current Count │ Overflows │ Instantiated │ Poured │ Expired │
├──────────────────────────────────────┼───────────────┼───────────┼──────────────┼────────┼─────────┤
│ crowdsecurity/http-crawl-non_statics │ -             │ -         │ 2            │ 17     │ 2       │
│ crowdsecurity/http-probing           │ -             │ 1         │ 2            │ 15     │ 1       │
╰──────────────────────────────────────┴───────────────┴───────────┴──────────────┴────────┴─────────╯

It appears the scenario ‘http-crawl-non_statics’ is designed to allow some light web-crawling. Of the 32 events ‘poured’ above, 17 of them went into its bucket and it ‘Instantiated’ tracking against 2 IPs, but neither ‘Overflowed’, which would cause an action to be taken.

However, ‘http-probing’ did. Assuming this is related to a web application you’re trying to use, you just got blocked. So let’s see what that scenario is looking for and what we can do about it.

sudo cscli hub list | grep http-probing

  crowdsecurity/http-probing                        ✔️  enabled  0.4      /etc/crowdsec/scenarios/http-probing.yaml

sudo cat  /etc/crowdsec/scenarios/http-probing.yaml
...
...
filter: "evt.Meta.service == 'http' && evt.Meta.http_status in ['404', '403', '400']"
capacity: 10
reprocess: true
leakspeed: "10s"
blackhole: 5m
...
...

You’ll notice that it’s simply looking for a few status codes, notably ‘404’. The bucket holds 10 events and leaks one every 10 seconds, so a burst of more than 10 of these trips it. The blackhole setting then keeps the same IP from triggering the scenario again for 5 min. The next thing is to find out what web requests are triggering it. We could just look for 404s in the web access log, but we can also ask CrowdSec itself to tell us. This will be more important when the triggers are more subtle, so let’s give it a try now.

# Grep some 404 events from the main log to a test file
sudo grep 404 /var/log/caddy/access.log | tail  >  ./test.log

# cscli explain with -v for more detail
sudo cscli explain -v --file ./test.log --type caddy

  ├ s00-raw
  | ├ 🟢 crowdsecurity/non-syslog (first_parser)
  | └ 🔴 crowdsecurity/syslog-logs
  ├ s01-parse
  | └ 🟢 crowdsecurity/caddy-logs (+19 ~2)
  |   └ update evt.Stage : s01-parse -> s02-enrich
  |   └ create evt.Parsed.request : /0/icon/Smith
  |   ...
  |   └ create evt.Meta.http_status : 404
  |   ...
  ├-------- parser success 🟢
  ├ Scenarios
    ├ 🟢 crowdsecurity/http-crawl-non_statics
    └ 🟢 crowdsecurity/http-probing

In this case, the client is asking for the file /0/icon/Smith and it doesn’t exist. Turns out, the web client is asking just in case and accepting the 404 without complaint in the background. That’s fine for the app, but it matches two things under the Scenarios section: someone crawling the server, and/or someone probing it. To fix this, we’ll need to create a whitelist definition for the app.

You can also work it from the alerts side and inspect what happened (assuming you’ve caused an alert).

sudo cscli alerts list

# This is an actual attack, and not something to be whitelisted, but it's a good example of how the inspection works.

╭─────┬──────────────────────────┬────────────────────────────────────────────┬─────────┬──────────────────────────────────────┬───────────┬─────────────────────────────────────────╮
│  ID │           value          │                   reason                   │ country │                  as                  │ decisions │                created_at               │
├─────┼──────────────────────────┼────────────────────────────────────────────┼─────────┼──────────────────────────────────────┼───────────┼─────────────────────────────────────────┤
│ 951 │ Ip:165.22.253.118        │ crowdsecurity/http-probing                 │ SG      │ 14061 DIGITALOCEAN-ASN               │ ban:1     │ 2025-02-26 13:53:08.589118208 +0000 UTC │


sudo cscli alerts inspect 951 -d

################################################################################################

 - ID           : 951
 - Date         : 2025-02-26T13:53:14Z
 - Machine      : 0e4a17d2f5d44270b7d543ac29c1dd4eWv2ozxHsRqoJWmRL
 - Simulation   : false
 - Remediation  : true
 - Reason       : crowdsecurity/http-probing
 - Events Count : 11
 - Scope:Value  : Ip:165.22.253.118
 - Country      : SG
 - AS           : DIGITALOCEAN-ASN
 - Begin        : 2025-02-26 13:53:08.589118208 +0000 UTC
 - End          : 2025-02-26 13:53:13.990699814 +0000 UTC
 - UUID         : eb454114-bc1e-455d-bfcc-f4772803e8bf


 - Context  :
╭────────────┬──────────────────────────────────────────────────────────────╮
│     Key    │                             Value                            │
├────────────┼──────────────────────────────────────────────────────────────┤
│ method     │ GET                                                          │
│ status     │ 403                                                          │
│ target_uri │ /                                                            │
│ target_uri │ /wp-includes/wlwmanifest.xml                                 │
│ target_uri │ /xmlrpc.php?rsd                                              │
│ target_uri │ /blog/wp-includes/wlwmanifest.xml                            │
│ target_uri │ /web/wp-includes/wlwmanifest.xml                             │
│ target_uri │ /wordpress/wp-includes/wlwmanifest.xml                       │
│ target_uri │ /website/wp-includes/wlwmanifest.xml                         │
│ target_uri │ /wp/wp-includes/wlwmanifest.xml                              │
│ target_uri │ /news/wp-includes/wlwmanifest.xml                            │
│ target_uri │ /2018/wp-includes/wlwmanifest.xml                            │
│ target_uri │ /2019/wp-includes/wlwmanifest.xml                            │
│ user_agent │ Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 │
│            │ (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36       │
╰────────────┴──────────────────────────────────────────────────────────────╯

Whitelist

To whitelist an app, we create a file with an expression that matches the behavior we see above, such as the app’s attempts to load a file that doesn’t exist, and exempts it. You can only add these to the s02 stage folder and the name element must be unique for each.

sudo vi /etc/crowdsec/parsers/s02-enrich/some-app-whitelist.yaml

This example uses the startsWith expression and assumes that all requests start the same

name: you/some-app
description: "Whitelist 404s for icon requests" 
whitelist: 
  reason: "icon request" 
  expression:   
    - evt.Parsed.request startsWith '/0/icon/'

If it’s less predictable, you can use a regular expression instead and combine with other expressions like a site match. In general, the more specific the better.

name: you/some-app-whitelist
description: "Whitelist 404s for icon requests" 
whitelist: 
  reason: "icon request" 
  expression:   
    - evt.Parsed.request matches '^/[0-9]/icon/.*' && evt.Meta.target_fqdn == "some-app.you.org"
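
Before reloading, it can save a round trip to try the pattern against a few request paths. This sketch uses grep -E, which is not the regex engine CrowdSec uses, so treat it as a rough check only. For a pattern as simple as this one the two should agree. The function name matches_icon_path and the sample paths are made up for the example.

```shell
# matches_icon_path PATH -> exit status 0 if PATH matches the whitelist regex
# (hypothetical helper, approximates the 'matches' expression with grep -E)
matches_icon_path() {
  printf '%s\n' "$1" | grep -Eq '^/[0-9]/icon/.*'
}

for path in /0/icon/Smith /7/icon/ /10/icon/Smith /wp-login.php; do
  if matches_icon_path "$path"; then
    echo "whitelisted: $path"
  else
    echo "not matched: $path"
  fi
done
```

Note that /10/icon/Smith is not matched, because [0-9] allows exactly one digit. If your app uses longer IDs you’d want [0-9]+ instead.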

Now you can restart crowdsec and test

sudo systemctl restart crowdsec.service

sudo cscli explain -v --file ./test.log --type caddy

 ├ s00-raw
 | ├ 🔴 crowdsecurity/syslog-logs
 | └ 🟢 crowdsecurity/non-syslog (+5 ~8)
 |  └ update evt.ExpectMode : %!s(int=0) -> 1
 |  └ update evt.Stage :  -> s01-parse
...
 ├ s02-enrich
 | ├ 🟢 you/some-app-whitelist (~2 [whitelisted])
 |  ├ update evt.Whitelisted : %!s(bool=false) -> true
 |  ├ update evt.WhitelistReason :  -> some icon request
 | ├ 🟢 crowdsecurity/dateparse-enrich (+2 ~2)
...
...
 | ├ 🟢 crowdsecurity/http-logs (+7)
 | └ 🟢 crowdsecurity/whitelists (unchanged)
 └-------- parser success, ignored by whitelist (audioserve icon request) 🟢

You’ll see in the above example, we successfully parsed the entry, but it was ‘ignored’ and didn’t go on to the Scenario section.

Regular Checking

You’ll find yourself doing this fairly regularly at first.

# Look for an IP on the ban list
sudo cscli alerts list

# Pull out the last several log entries for that IP
sudo grep SOME.IP.FROM.ALERTS /var/log/caddy/access.log | tail -10 > test.log 

# See what it was asking for
cat test.log | jq '.request'
cat test.log | jq '.request.uri'

# Ask CrowdSec why it had a problem
sudo cscli explain -v --file ./test.log --type caddy
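
If jq isn’t handy, a quick tally of status codes from the extracted lines is often enough to see what tripped the scenario. This is a minimal sketch that assumes Caddy’s compact JSON output, where each line contains a field like "status":404. The function name count_statuses and the sample log lines are invented for the example, and real entries carry many more fields.

```shell
# count_statuses FILE -> prints "<status> <count>" for each HTTP status seen
# (hypothetical helper, assumes compact JSON with a "status":NNN field)
count_statuses() {
  grep -o '"status":[0-9]*' "$1" |
    awk -F: '{ n[$2]++ } END { for (s in n) print s, n[s] }' |
    sort
}

# Made-up sample lines standing in for test.log
sample=$(mktemp)
cat > "$sample" << 'EOF'
{"request":{"remote_ip":"203.0.113.7","uri":"/0/icon/Smith"},"status":404}
{"request":{"remote_ip":"203.0.113.7","uri":"/0/icon/Jones"},"status":404}
{"request":{"remote_ip":"203.0.113.7","uri":"/index.html"},"status":200}
EOF

count_statuses "$sample"
rm -f "$sample"
```

A pile of 404s for paths your own app requests points at a whitelist candidate. A pile of 403s for things like wp-includes is the real thing.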

Troubleshooting

New Whitelist Has No Effect

If you have more than one whitelist, check the name you gave it on the first line. If that’s not unique, the whole thing will be silently ignored.

Regular Expression Isn’t Matching

CrowdSec uses the go-centric expr-lang. You may be used to unix regex where you’d escape slashes, for example. A tool like https://www.akto.io/tools/regex-tester is helpful.

4 - Custom Parser

When checking out the detailed metrics you may find that log entries aren’t being parsed. Maybe the log format has changed or you’re logging additional data the author didn’t anticipate. The best thing is to add your own parser.

Types of Parsers

There are several types of parsers and they are used in stages. Some are designed to work with the raw log entries while others are designed to take pre-parsed data and add to or enrich it. This way you can do branching and not every parser needs to know how to read a syslog message.

Their Local Path will tell you what stage they kick in at. Use sudo cscli parsers list to display the details. s00-raw works with the ‘raw’ files while s01 and s02 work further down the pipeline. Currently, you can only create s00 and s01 level parsers.

Integrating with Scenarios

Useful parsers supply data that Scenarios are interested in. You can create a parser that watches the system logs for ‘FOOBAR’ entries, extracts the ‘FOOBAR-LEVEL’, and passes it on. But if nothing is looking for ‘FOOBARs’ then nothing will happen.

Let’s say you’ve added the Caddy collection. It’s pulled in a bunch of Scenarios you can view with sudo cscli scenarios list. If you look at one of the associated files you’ll see a filter section where they look for ’evt.Meta.http_path’ and ’evt.Parsed.verb’. They are all different though, so how do you know what data to supply?

Your best bet is to take an existing parser and modify it.

Examples

Note - CrowdSec is pretty awesome and after talking in the discord they’ve already accommodated both these scenarios within a release cycle or two. So these two examples are solved. I’m sure you’ll find new ones, though ;-)

A Web Example

Let’s say that you’ve installed the Caddy collection, but you’ve noticed basic auth login failures don’t trigger the parser. So let’s add a new file and edit it.

sudo cp /etc/crowdsec/parsers/s01-parse/caddy-logs.yaml /etc/crowdsec/parsers/s01-parse/caddy-logs-custom.yaml

You’ll notice two top level sections where the parsing happens, nodes and statics, along with some grok pattern matching.

Nodes allow you to try multiple patterns and if any match, the whole section is considered successful. I.e. if the log could have either the standard HTTPDATE or a CUSTOMDATE, as long as it has one it’s good and the matching can move on. Statics just goes down the list extracting data. If any fail the whole event is considered a fail and dropped as unparseable.

All the parsed data gets attached to the event as ’evt.Parsed.something’ and some of the statics move it to the evt values the Scenarios will be looking for. Caddy logs are JSON formatted and so basically already parsed, and this example makes use of the JsonExtract method quite a bit.

# We added the caddy logs in the acquis.yaml file with the label 'caddy' and so we use that as our filter here
filter: "evt.Parsed.program startsWith 'caddy'"
onsuccess: next_stage
# debug: true
name: caddy-logs-custom
description: "Parse custom caddy logs"
pattern_syntax:
 CUSTOMDATE: '%{DAY:day}, %{MONTHDAY:monthday} %{MONTH:month} %{YEAR:year} %{TIME:time} %{WORD:tz}'
nodes:
  - nodes:
    - grok:
        pattern: '%{NOTSPACE} %{NOTSPACE} %{NOTSPACE} \[%{HTTPDATE:timestamp}\]%{DATA}'
        expression: JsonExtract(evt.Line.Raw, "common_log")
        statics:
          - target: evt.StrTime
            expression: evt.Parsed.timestamp
    - grok:
        pattern: "%{CUSTOMDATE:timestamp}"
        expression: JsonExtract(evt.Line.Raw, "resp_headers.Date[0]")
        statics:
          - target: evt.StrTime
            expression: evt.Parsed.day + " " + evt.Parsed.month + " " + evt.Parsed.monthday + " " + evt.Parsed.time + ".000000" + " " + evt.Parsed.year
    - grok:
        pattern: '%{IPORHOST:remote_addr}:%{NUMBER}'
        expression: JsonExtract(evt.Line.Raw, "request.remote_addr")
    - grok:
        pattern: '%{IPORHOST:remote_ip}'
        expression: JsonExtract(evt.Line.Raw, "request.remote_ip")
    - grok:
        pattern: '\["%{NOTDQUOTE:http_user_agent}\"]'
        expression: JsonExtract(evt.Line.Raw, "request.headers.User-Agent")
statics:
  - meta: log_type
    value: http_access-log
  - meta: service
    value: http
  - meta: source_ip
    expression: evt.Parsed.remote_addr
  - meta: source_ip
    expression: evt.Parsed.remote_ip
  - meta: http_status
    expression: JsonExtract(evt.Line.Raw, "status")
  - meta: http_path
    expression: JsonExtract(evt.Line.Raw, "request.uri")
  - target: evt.Parsed.request #Add for http-logs enricher
    expression: JsonExtract(evt.Line.Raw, "request.uri")
  - parsed: verb
    expression: JsonExtract(evt.Line.Raw, "request.method")
  - meta: http_verb
    expression: JsonExtract(evt.Line.Raw, "request.method")
  - meta: http_user_agent
    expression: evt.Parsed.http_user_agent
  - meta: target_fqdn
    expression: JsonExtract(evt.Line.Raw, "request.host")
  - meta: sub_type
    expression: "JsonExtract(evt.Line.Raw, 'status') == '401' && JsonExtract(evt.Line.Raw, 'request.headers.Authorization[0]') startsWith 'Basic ' ? 'auth_fail' : ''"

The very last line is where a status 401 is checked. It looks for a 401 and a request for Basic auth. However, this misses events where someone asks for a resource that is protected and the server responds telling you Basic is needed, i.e. when a bot is poking at URLs on your server and ignoring the prompts to log in. You can look at the log entries more easily with this command, which follows the log and decodes it while you recreate failed attempts.

sudo tail -f /var/log/caddy/access.log | jq

To change this, update the expression so the else branch has a second ?: condition that also checks the response header.

    expression: "JsonExtract(evt.Line.Raw, 'status') == '401' && JsonExtract(evt.Line.Raw, 'request.headers.Authorization[0]') startsWith 'Basic ' ? 'auth_fail' : JsonExtract(evt.Line.Raw, 'status') == '401' && JsonExtract(evt.Line.Raw, 'resp_headers.Www-Authenticate[0]') startsWith 'Basic ' ? 'auth_fail' : ''"

Syslog Example

Let’s say you’re using dropbear and failed logins are not being picked up by the ssh parser.

To see what’s going on, you use the crowdsec command line interface. The shell command is cscli and you can ask it about its metrics to see how many lines it’s parsed and if any of them are suspicious. Since we just restarted, you may not have any syslog lines yet, so let’s add some and check.

ssh [email protected]
logger "This is an innocuous message"

cscli metrics
INFO[28-06-2022 02:41:33 PM] Acquisition Metrics:
+------------------------+------------+--------------+----------------+------------------------+
|         SOURCE         | LINES READ | LINES PARSED | LINES UNPARSED | LINES POURED TO BUCKET |
+------------------------+------------+--------------+----------------+------------------------+
| file:/var/log/messages | 1          | -            | 1              | -                      |
+------------------------+------------+--------------+----------------+------------------------+

Notice that the line we just read is unparsed and that’s OK. That just means it wasn’t an entry the parser cared about. Let’s see if it responds to an actual failed login.

dbclient some.remote.host

# Enter some bad passwords and then exit with a Ctrl-C. Remember, localhost attempts are whitelisted so you must be remote.
[email protected]'s password:
[email protected]'s password:

cscli metrics
INFO[28-06-2022 02:49:51 PM] Acquisition Metrics:
+------------------------+------------+--------------+----------------+------------------------+
|         SOURCE         | LINES READ | LINES PARSED | LINES UNPARSED | LINES POURED TO BUCKET |
+------------------------+------------+--------------+----------------+------------------------+
| file:/var/log/messages | 7          | -            | 7              | -                      |
+------------------------+------------+--------------+----------------+------------------------+

Well, no luck. We will need to adjust the parser.

sudo cp /etc/crowdsec/parsers/s01-parse/sshd-logs.yaml /etc/crowdsec/parsers/s01-parse/sshd-logs-custom.yaml

Take a look at the logfile and copy an example line over to https://grokdebugger.com/. Use a pattern like

Bad PAM password attempt for '%{DATA:user}' from %{IP:source_ip}:%{INT:port}

Assuming you get the pattern worked out, you can then add a section to the bottom of the custom parser file you created.

  - grok:
      name: "SSHD_AUTH_FAIL"
      pattern: "Login attempt for nonexistent user from %{IP:source_ip}:%{INT:port}"
      apply_on: message
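
For a quick local check before (or alongside) the grok debugger, a plain regex can stand in for the grok pattern. This is only a sketch: the regex approximates %{DATA}, %{IP} (IPv4 only) and %{INT}, the sample log line is invented to fit the pattern shown earlier, and parse_auth_fail is a hypothetical helper name.

```shell
#!/bin/sh
# Rough regex stand-in for the grok pattern in the text:
#   %{DATA:user} -> [^']*   %{IP:source_ip} -> dotted IPv4 only   %{INT:port} -> digits
AUTH_FAIL_RE="Bad PAM password attempt for '([^']*)' from (([0-9]{1,3}\.){3}[0-9]{1,3}):([0-9]+)"

# Print "user ip port" when the line matches, nothing otherwise.
parse_auth_fail() {
    printf '%s\n' "$1" | sed -nE "s/.*${AUTH_FAIL_RE}.*/\1 \2 \4/p"
}

# Invented example line, shaped like the pattern above.
parse_auth_fail "Bad PAM password attempt for 'root' from 203.0.113.7:51234"
```

If a real line from your logfile prints nothing here, the literal part of the pattern probably differs from what dropbear actually writes.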

5 - On Alpine

Install

There are some packages available, but (as of 2022) they are a bit behind and don’t include the config and service files. So let’s download the latest binaries from CrowdSec and create our own.

Download the current release

Note: Download the static versions. Alpine uses a different libc than other distros.

cd /tmp
wget https://github.com/crowdsecurity/crowdsec/releases/latest/download/crowdsec-release-static.tgz
wget https://github.com/crowdsecurity/cs-firewall-bouncer/releases/latest/download/crowdsec-firewall-bouncer.tgz

tar xzf crowdsec-firewall*
tar xzf crowdsec-release*
rm *.tgz

Install Crowdsec and Register with The Central API

You cannot use the wizard’s default install as it expects systemd and doesn’t support OpenRC. Instead, follow the binary install steps from CrowdSec’s instructions and run the wizard in Docker mode.

sudo apk add bash newt envsubst
cd /tmp/crowdsec-v*

# Docker mode skips configuring systemd
sudo ./wizard.sh --docker-mode

sudo cscli hub update
sudo cscli machines add -a
sudo cscli capi register

# A collection is just a bunch of parsers and scenarios bundled together for convenience
sudo cscli collections install crowdsecurity/linux

Install The Firewall Bouncer

We need a netfilter tool so install nftables. If you already have iptables installed you can skip this step and set FW_BACKEND to that below when generating the API keys.

sudo apk add nftables

Now we install the firewall bouncer. There is no static build of the firewall bouncer yet from CrowdSec, but you can get one from Alpine testing (if you don’t want to compile it yourself)

# Change from 'edge' to other versions as needed
echo "http://dl-cdn.alpinelinux.org/alpine/edge/testing" | sudo tee -a /etc/apk/repositories
sudo apk update
sudo apk add cs-firewall-bouncer

Now configure the bouncer. We will once again do this manually because the install script has no support for non-systemd Linuxes. But cribbing from their install script, we see we can:

cd /tmp/crowdsec-firewall*

BIN_PATH_INSTALLED="/usr/local/bin/crowdsec-firewall-bouncer"
BIN_PATH="./crowdsec-firewall-bouncer"
sudo install -v -m 755 -D "${BIN_PATH}" "${BIN_PATH_INSTALLED}"

CONFIG_DIR="/etc/crowdsec/bouncers/"
sudo mkdir -p "${CONFIG_DIR}"
sudo install -m 0600 "./config/crowdsec-firewall-bouncer.yaml" "${CONFIG_DIR}crowdsec-firewall-bouncer.yaml"

Generate The API Keys

Note: If you used the APK, just do the first two lines to get the API_KEY (echo $API_KEY) and manually edit the file (vim /etc/crowdsec/bouncers/crowdsec-firewall-bouncer.yaml)

cd /tmp/crowdsec-firewall*
CONFIG_DIR="/etc/crowdsec/bouncers/"

SUFFIX=`tr -dc A-Za-z0-9 </dev/urandom | head -c 8`
API_KEY=`sudo cscli bouncers add cs-firewall-bouncer-${SUFFIX} -o raw`
FW_BACKEND="nftables"
API_KEY=${API_KEY} BACKEND=${FW_BACKEND} envsubst < ./config/crowdsec-firewall-bouncer.yaml | sudo install -m 0600 /dev/stdin "${CONFIG_DIR}crowdsec-firewall-bouncer.yaml"
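
Since the config is produced by substituting ${...} placeholders, one way to sanity-check the result is to look for placeholders that were left unfilled. This is a minimal sketch: has_placeholders is a hypothetical helper, and the two demo files only mimic a couple of keys rather than the full bouncer config.

```shell
#!/bin/sh
# Hypothetical check: succeed (exit 0) if the file still contains ${NAME}-style
# placeholders, i.e. the substitution step did not fill something in.
has_placeholders() {
    grep -qE '\$\{[A-Za-z_][A-Za-z0-9_]*\}' "$1"
}

# Demo on two throwaway files; the keys are illustrative, not the full config.
raw=$(mktemp)
rendered=$(mktemp)
printf 'mode: ${BACKEND}\napi_key: ${API_KEY}\n' > "$raw"
printf 'mode: nftables\napi_key: 0123456789abcdef\n' > "$rendered"

has_placeholders "$raw"      && echo "template: placeholders present"
has_placeholders "$rendered" || echo "rendered: no placeholders left"
rm -f "$raw" "$rendered"
```

The installed file is mode 0600 and owned by root, so the real check needs to run with root privileges.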

Create RC Service Files

sudo touch /etc/init.d/crowdsec
sudo chmod +x /etc/init.d/crowdsec
sudo rc-update add crowdsec

sudo vim /etc/init.d/crowdsec

#!/sbin/openrc-run

command=/usr/local/bin/crowdsec
command_background=true

pidfile="/run/${RC_SVCNAME}.pid"

depend() {
   need localmount
   need net
}

Note: If you used the package from Alpine testing above it came with a service file. Just rc-update add cs-firewall-bouncer and skip this next step.

sudo touch /etc/init.d/cs-firewall-bouncer
sudo chmod +x /etc/init.d/cs-firewall-bouncer
sudo rc-update add cs-firewall-bouncer

sudo vim /etc/init.d/cs-firewall-bouncer

#!/sbin/openrc-run

command=/usr/local/bin/crowdsec-firewall-bouncer
command_args="-c /etc/crowdsec/bouncers/crowdsec-firewall-bouncer.yaml"
pidfile="/run/${RC_SVCNAME}.pid"
command_background=true

depend() {
  after firewall
}

Start The Services and Observe The Results

Start up the services and view the logs to see that everything started properly

sudo service crowdsec start
sudo service cs-firewall-bouncer start

sudo tail /var/log/crowdsec.log
sudo tail /var/log/crowdsec-firewall-bouncer.log

# The firewall bouncer should tell you about how it's inserting decisions it got from the hub

sudo cat /var/log/crowdsec-firewall-bouncer.log

time="28-06-2022 13:10:05" level=info msg="backend type : nftables"
time="28-06-2022 13:10:05" level=info msg="nftables initiated"
time="28-06-2022 13:10:05" level=info msg="Processing new and deleted decisions . . ."
time="28-06-2022 14:35:35" level=info msg="100 decisions added"
time="28-06-2022 14:35:45" level=info msg="1150 decisions added"
...
...

# If you are curious about what it's blocking
sudo nft list table crowdsec
...
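
To get a single number out of the bouncer log rather than reading it line by line, the "decisions added" entries can be summed. This is a small sketch that assumes the log format shown in the excerpt; sum_decisions is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: add up every "N decisions added" entry read on stdin.
sum_decisions() {
    awk 'match($0, /msg="[0-9]+ decisions added"/) {
             # Skip the 5 characters of msg=" and let awk read the leading digits.
             total += substr($0, RSTART + 5, RLENGTH - 5) + 0
         }
         END { print total + 0 }'
}

# The two entries from the log excerpt.
sum_decisions <<'EOF'
time="28-06-2022 14:35:35" level=info msg="100 decisions added"
time="28-06-2022 14:35:45" level=info msg="1150 decisions added"
EOF
```

On a live system, feed it the real file: sudo cat /var/log/crowdsec-firewall-bouncer.log | sum_decisions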

6 - Cloudflare Proxy

Cloudflare offers an excellent reverse proxy and they filter most bad actors for you. But not all. Here’s a sample of what makes it through:

allen@www:~/$ sudo cscli alert list    
╭─────┬────────────────────┬───────────────────────────────────┬─────────┬────────────────────────┬───────────┬─────────────────────────────────────────╮
│  ID │        value       │               reason              │ country │           as           │ decisions │                created_at               │
├─────┼────────────────────┼───────────────────────────────────┼─────────┼────────────────────────┼───────────┼─────────────────────────────────────────┤
│ 221 │ Ip:162.158.49.136  │ crowdsecurity/jira_cve-2021-26086 │ IE      │ 13335 CLOUDFLARENET    │ ban:1     │ 2025-01-22 15:14:34.554328601 +0000 UTC │
│ 187 │ Ip:128.199.182.152 │ crowdsecurity/jira_cve-2021-26086 │ SG      │ 14061 DIGITALOCEAN-ASN │ ban:1     │ 2025-01-19 20:50:45.822199509 +0000 UTC │
│ 186 │ Ip:46.101.1.225    │ crowdsecurity/jira_cve-2021-26086 │ GB      │ 14061 DIGITALOCEAN-ASN │ ban:1     │ 2025-01-19 20:50:41.699518104 +0000 UTC │
│ 181 │ Ip:162.158.108.104 │ crowdsecurity/http-bad-user-agent │ SG      │ 13335 CLOUDFLARENET    │ ban:1     │ 2025-01-19 12:39:20.468268327 +0000 UTC │
│ 180 │ Ip:172.70.208.61   │ crowdsecurity/http-bad-user-agent │ SG      │ 13335 CLOUDFLARENET    │ ban:1     │ 2025-01-19 12:38:36.664997131 +0000 UTC │
╰─────┴────────────────────┴───────────────────────────────────┴─────────┴────────────────────────┴───────────┴─────────────────────────────────────────╯

You can see that CrowdSec took action, but it was the wrong one. It’s blocking the Cloudflare exit nodes, which removes access for everyone coming through them.

What we want is:

Identify the actual attacker
Block that somewhere effective (the firewall-bouncer can’t selectively block proxied traffic)

Identifying The Attacker

We could replace the CrowdSec Caddy log parser and use a different header, but there’s a hint in the CrowdSec parser that suggests using the trusted_proxies module.

##Caddy now sets client_ip to the value of X-Forwarded-For if users sets trusted proxies

Additionally, we can choose the CF-Connecting-IP header like francislavoie suggests, as X-Forwarded-For is easily spoofed.

Add a Trusted Proxy

To set Cloudflare as a trusted proxy we must identify all the Cloudflare exit node IPs to trust them. That would be hard to manage, but happily, there’s a handy caddy-cloudflare-ip module for that. Many thanks to WeidiDeng!

sudo caddy add-package github.com/WeidiDeng/caddy-cloudflare-ip

sudo vi /etc/caddy/Caddyfile

#
# Global Options Block
#
{
        servers {             
                trusted_proxies cloudflare  
                client_ip_headers CF-Connecting-IP  
        }    
}

After restarting Caddy, we can see the client_ip field change in the log.

sudo head /var/log/caddy/access.log  | jq '.request'
sudo tail /var/log/caddy/access.log  | jq '.request'

Before

  "remote_ip": "172.68.15.223",
  "client_ip": "172.68.15.223",

After

  "remote_ip": "172.71.98.114",
  "client_ip": "109.206.128.45",

And when consulting crowdsec, we can see it’s using the client_ip information.

sudo tail /var/log/caddy/access.log > test.log
sudo cscli explain -v --file ./test.log --type caddy

 ├ s01-parse
 | └ 🟢 crowdsecurity/caddy-logs (+14 ~2)
 |  └ update evt.Stage : s01-parse -> s02-enrich
 |  └ create evt.Parsed.remote_ip : 109.206.128.45 <-- Your Actual IP
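
To check many log lines rather than eyeballing the remote_ip and client_ip fields one at a time, the comparison can be scripted. This is a crude sketch: it pulls the fields out with sed instead of a real JSON parser, it assumes compact JSON log lines, and json_field and classify_request are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: print the string value of key $1 from a compact JSON
# line on stdin. A crude sed match, not a JSON parser.
json_field() {
    sed -nE "s/.*\"$1\":\"([^\"]*)\".*/\1/p"
}

# Print "resolved" when client_ip differs from remote_ip (the proxy header was
# honoured), "unresolved" when both still show the same address.
classify_request() {
    remote=$(printf '%s\n' "$1" | json_field remote_ip)
    client=$(printf '%s\n' "$1" | json_field client_ip)
    if [ "$remote" != "$client" ]; then echo "resolved"; else echo "unresolved"; fi
}

# Trimmed-down lines using the Before and After addresses from the text.
classify_request '{"request":{"remote_ip":"172.68.15.223","client_ip":"172.68.15.223"}}'
classify_request '{"request":{"remote_ip":"172.71.98.114","client_ip":"109.206.128.45"}}'
```

A visitor who reaches the server directly, without going through the proxy, would also show up as "unresolved", so treat the result as a hint rather than proof.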

And when launching a probe we can see it show up with the correct IP.

# Ask for lots of pages that don't exist to simulate an HTTP probe
for X in {1..100}; do curl -D - https://www.some.org/$X;done


sudo cscli decisions list   
╭─────────┬──────────┬───────────────────┬────────────────────────────┬────────┬─────────┬───────────────┬────────┬────────────┬──────────╮
│    ID   │  Source  │    Scope:Value    │           Reason           │ Action │ Country │       AS      │ Events │ expiration │ Alert ID │
├─────────┼──────────┼───────────────────┼────────────────────────────┼────────┼─────────┼───────────────┼────────┼────────────┼──────────┤
│ 2040067 │ crowdsec │ Ip:109.206.128.45 │ crowdsecurity/http-probing │ ban    │ US      │ 600 BADNET-AS │ 11     │ 3h32m5s    │ 235      │
╰─────────┴──────────┴───────────────────┴────────────────────────────┴────────┴─────────┴───────────────┴────────┴────────────┴──────────╯

This doesn’t do anything on its own (because traffic is proxied) but we can make it work if we change bouncers.

Changing Bouncers

The ideal approach would be to tell Cloudflare to stop forwarding traffic from the bad actors. There is a cloudflare-bouncer to do just that. It’s rate limited however, and only suitable for premium clients. There is also the CrowdSec Cloudflare Worker. It’s better, but still suffers from limits for non-premium clients.

Caddy Bouncer

Instead, we’ll use the caddy-crowdsec-bouncer. As configured here, it acts at layer 7 (application level). It works inside Caddy and will block IPs based on the client_ip from the HTTP request.

Generate an API key for the bouncer with the bouncer add command - this doesn’t actually install anything, just generates a key.

sudo cscli bouncers add caddy-bouncer

Add the module to Caddy (which is the actual install).

sudo caddy add-package github.com/hslatman/caddy-crowdsec-bouncer

Configure Caddy

#
# Global Options Block
#
{
        
        crowdsec {
                api_key ABIGLONGSTRING
        }
        # Make sure to add the order statement
        order crowdsec first
}
www.some.org {

    crowdsec 

    root * /var/www/www.some.org
    file_server
}

And restart.

sudo systemctl restart caddy.service

Testing Remediation

Let’s test that probe again. Initially, you’ll get a 404 (not found), but after a while it should switch to 403 (access denied).

for X in {1..100}; do curl -D - --silent https://www.some.org/$X | grep HTTP;done

HTTP/2 404 
HTTP/2 404 
...
...
HTTP/2 403 
HTTP/2 403
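
Scrolling through a hundred status lines gets tedious, so the output can be tallied instead. This is a small sketch with a hypothetical helper name (tally_status), and it uses canned input in place of the curl loop so that it runs without touching the network.

```shell
#!/bin/sh
# Hypothetical helper: count responses per status code from "HTTP/2 404"-style
# status lines on stdin.
tally_status() {
    awk '$1 ~ /^HTTP\// { count[$2]++ }
         END { for (code in count) print code, count[code] }' | sort
}

# Canned input standing in for the curl loop.
printf 'HTTP/2 404 \nHTTP/2 404 \nHTTP/2 403 \nHTTP/2 403 \nHTTP/2 403 \n' | tally_status
```

With the real probe, pipe the whole loop into it: for X in {1..100}; do curl -D - --silent https://www.some.org/$X; done | tally_status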

Conclusion

Congrats! After much work you’ve traded 404s for 403s. Was it worth it? Probably. If an adversary’s probe had a chance to find something, it has less of a chance now.

Bonus Section

I mentioned earlier that the X-Forwarded-For header could be spoofed. Let’s take a look at that. Here’s an example.

# Comment out 'client_ip_headers CF-Connecting-IP' from your Caddy config, and restart.

for X in {1..100}; do curl -D - --silent -H "X-Forwarded-For: 192.168.0.2" https://www.some.org/$X | grep HTTP;done

HTTP/2 404 
HTTP/2 404 
...
...
HTTP/2 404 
HTTP/2 404

No remediation happens. Turns out Cloudflare appends by default, giving you:

sudo tail -f /var/log/caddy/www.some.org.log | jq

    "client_ip": "192.168.0.2",

      "X-Forwarded-For": [
        "192.168.0.2,109.206.128.45"
      ],

Caddy takes the first value (rather trusting, but canonically correct), uses it as the client_ip, and CrowdSec acts on that.
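
To make the difference concrete, here is a small sketch that splits the header value from the log excerpt into its first and last entries. The helper names xff_first and xff_last are hypothetical. This only illustrates the string handling and is not how Caddy itself is implemented.

```shell
#!/bin/sh
# Hypothetical helpers: pick the first or last entry of a comma-separated
# X-Forwarded-For value, dropping any spaces around it.
xff_first() { printf '%s\n' "${1%%,*}" | tr -d ' '; }
xff_last()  { printf '%s\n' "${1##*,}" | tr -d ' '; }

# The header value from the log excerpt: forged entry first, real visitor last.
xff="192.168.0.2,109.206.128.45"
echo "first: $(xff_first "$xff")"
echo "last:  $(xff_last "$xff")"
```

The first entry is whatever the client chose to send, while the last one was appended by the proxy, which is why the last entry is the one worth trusting.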

Adjusting Cloudflare

You don’t need to, but you can configure Cloudflare to “Remove visitor IP headers”. This is counterintuitive, but the notes say “…Cloudflare will only keep the IP address of the last proxy”. In testing, it keeps the last value in the X-Forwarded-For string, and that’s what we’re after. It works for normal and forged headers.

Log in to the Cloudflare dashboard and select your website
Go to Rules > Overview
Select “Manage Request Header Transform Rules”
Select “Managed Transforms”
Enable Remove visitor IP headers

The Overview page may look different depending on your plan, so you may have to hunt around for this setting.

Now when you test, you’ll get access denied regardless of your header.

for X in {1..100}; do curl -D - --silent -H "X-Forwarded-For: 192.168.0.2" https://www.some.org/$X | grep HTTP;done

HTTP/2 404 
HTTP/2 404 
...
...
HTTP/2 403 
HTTP/2 403

Bonus Ending

You’ve added an extra layer of protection - but it’s not clear if it’s worth it. It may add to the proxy time, so use at your own discretion.