
Documentation

This is the documentation root. Use the left-hand nav bar to descend taxonomically, or use the search to find what you are after.

1 - Email

Email is a commodity service, but critical for many things - so you can get it anywhere, but you better not mess it up.

Your options, in increasing order of complexity, are:

Forwarding

Email sent to [email protected] is simply forwarded to someplace like gmail. It’s free and easy, and you don’t need any infrastructure. Most registrars like GoDaddy, NameCheap, CloudFlare, etc, will handle it.

You can even reply from [email protected] by integrating with SendGrid or a similar provider.

Remote-Hosting

If you want more, Google and Microsoft have full productivity suites. Just edit your DNS records, import your users, and pay them $5 a head per month. You still have to ‘do email’ but it’s a little less work than if you ran the whole stack. In most cases, companies that specialize in email do it better than you can.

Self-Hosting

If you are considering local email, let me paraphrase Kenji López-Alt. The first step is, don’t. The big guys can do it cheaper and better. But if it’s a matter of philosophy, control, or funding, press on.

A Note About Cost

Most of the cost is user support. Hosting means someone else gets to purchase and patch a server farm, but you still have to talk to users. My (anecdotal) observation is that fully hosting saves about 10% in overall costs and smooths out expenses. The more users you have, the more that 10% starts to matter.

1.1 - Forwarding

This is the best solution for a small number of users. You configure it at your registrar and rely on Google (or someone similar) to do all the work for free.

If you want your out-bound emails to come from your domain name (and you do), add an out-bound relay. This is also free for minimal use.

Registrar Configuration

This is different per registrar, but normally involves creating an address and its destination.

Cloudflare

  • Log in (this assumes you use Cloudflare as your registrar) and select the domain in question.
  • Select Email, then Email Routing.
  • Under Routes, select Create address.

Once validated, email will begin arriving at the destination.

Configure Relaying

The registrar is only forwarding email, not sending it. To get your sent mail to come from your domain, you must integrate with a mail service such as SendGrid.

SendGrid

  • Create a free account and login
  • Authenticate your domain name (via DNS)
  • Create an API key (Settings -> API Keys -> Restricted Access, Defaults)

Gmail

  • Settings -> Accounts -> Send Mail as
  • Add your domain email
  • Configure the SMTP server with:
    • SMTP server: “smtp.sendgrid.net”
    • username: “apikey”
    • password: (the key you created above)

After validating the code Gmail sends you, there will be a drop down in the From field of new emails.

1.2 - Remote Hosting

This is more in the software-as-a-service category. You get an admin dashboard and are responsible for managing users and mail flow. The hosting provider will help you with basic things, but you’re doing most of the work yourself.

Having managed 100K+ user mail systems and migrated from on-prem sendmail to Exchange and then O365 and Google, I can confidently say the infrastructure and even platform amounts to less than 10% of the cost of providing the service.

The main advantage to hosting is that you’re not managing the platform, installing patches and replacing hardware. The main disadvantage is that you have little control, and sometimes things are broken and you can’t do anything about it.

Medium-sized organizations benefit most from hosting. You probably need a productivity suite anyway, and email is usually wrapped up in that. It saves you from having to specialize someone in email and the infrastructure associated with it.

But if controlling access to your data is paramount, then be aware that you have lost that and treat email as a public conversation.

1.3 - Self Hosting

When you self-host, you develop expertise in email itself, arguably a commodity service where such expertise has small return. But, you have full control and your data is your own.

The generally accepted best practice is to install Postfix and Dovecot. This is the simplest path and what I cover here. But there are some pretty decent all-in-one packages such as Mailu, Modoboa, etc. These usually wrap Postfix and Dovecot to spare you the details and improve your quality of life, at the cost of not really knowing how they work.

You’ll also need to configure a relay. Many ISPs block basic mail protocol and many recipient servers are rightly suspicious of random emails from unknown IPs in cable modem land.

  1. Postfix
  2. Dovecot
  3. Relay

1.3.1 - Postfix

This is the first step - having a server that accepts and sends mail. After installing, you’ll be able to check messages at the console. Remote client access (such as with Thunderbird) comes later.

Preparation

You need:

  • Linux Server
  • Firewall Port-Forward
  • Public DNS

We use Debian Bookworm (12) in this example but any derivative will be similar. You’ll forward port 25 and add DNS entries after the installation.

Installation

Some configuration is done at install time by the package so you must make sure your hostname is correct. We use the hostname ‘mail’ in this example.

# Correct internal hostnames as needed. 'mail' and 'mail.home.lan' are good suggestions.
cat /etc/hostname /etc/hosts

# Set the external host name and run the package installer
EXTERNAL="mail.your.org"
sudo debconf-set-selections <<< "postfix postfix/mailname string $EXTERNAL"
sudo debconf-set-selections <<< "postfix postfix/main_mailer_type string 'Internet Site'"
sudo apt install --assume-yes postfix

# Add the main domain to the destinations as well
DOMAIN="your.org"
sudo sed -i "s/^mydestination = \(.*\)/mydestination = $DOMAIN \1/"  /etc/postfix/main.cf

Test with telnet - use your unix system ID for the rcpt address below.

telnet localhost 25
ehlo localhost
mail from: <[email protected]>
rcpt to: <[email protected]>
data
Subject: Wish List

Red Ryder BB Gun
.
quit

Assuming that ‘you’ matches your shell account, Postfix will have accepted the message and used its Local Delivery Agent to store it in the local message store. That’s in /var/mail.

cat /var/mail/YOU 

Configuration

DNS

At a minimum, you need a server entry. This allows mail sent directly to your server ([email protected]) to reach it. You should also create a special entry for your domain root. That way, mail sent to [email protected] will get routed to the right server. Use your actual IP address and take advantage of dynamic DNS when possible.

Name Type Value
mail A 20.236.44.162
@ MX mail
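
Once the records have propagated, you can sanity-check them from any machine with dig:

# Confirm the A and MX records resolve as expected
dig +short A mail.your.org
dig +short MX your.org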

Encryption

Postfix will use the untrusted “snakeoil” certificate that Debian supplies by default to opportunistically encrypt communication between it and other mail servers. Surprisingly, most other servers will accept this cert (or fall back to non-encrypted). So let’s proceed with it for now until we generate a trusted one later.

Spam Protection

The default config is secured so that it won’t relay messages, but it will accept messages from Santa, and is subject to backscatter and a few other things. Let’s tighten it up.

sudo tee -a /etc/postfix/main.cf << EOF

# Tighten up formatting
smtpd_helo_required = yes
disable_vrfy_command = yes
strict_rfc821_envelopes = yes

# Error codes instead of bounces
invalid_hostname_reject_code = 554
multi_recipient_bounce_reject_code = 554
non_fqdn_reject_code = 554
relay_domains_reject_code = 554
unknown_address_reject_code = 554
unknown_client_reject_code = 554
unknown_hostname_reject_code = 554
unknown_local_recipient_reject_code = 554
unknown_relay_recipient_reject_code = 554
unknown_virtual_alias_reject_code = 554
unknown_virtual_mailbox_reject_code = 554
unverified_recipient_reject_code = 554
unverified_sender_reject_code = 554
EOF

sudo systemctl reload postfix.service

Postfix has some recommendations as well.

sudo tee -a /etc/postfix/main.cf << EOF

# PostFix Suggestions
smtpd_helo_restrictions = 
	reject_unknown_helo_hostname
smtpd_sender_restrictions = 
	reject_unknown_sender_domain
smtpd_recipient_restrictions = 
	permit_mynetworks, 
	permit_sasl_authenticated,
	reject_unauth_destination,
	reject_rbl_client zen.spamhaus.org,
	reject_rhsbl_reverse_client dbl.spamhaus.org,
	reject_rhsbl_helo dbl.spamhaus.org,
	reject_rhsbl_sender dbl.spamhaus.org
smtpd_relay_restrictions = 
	permit_mynetworks, 
	permit_sasl_authenticated,
	reject_unauth_destination
smtpd_data_restrictions = 
	reject_unauth_pipelining
EOF

sudo systemctl reload postfix.service

If you test a message from Santa now, Postfix will do some checks and realize it’s bogus.

550 5.7.27 [email protected]: Sender address rejected: Domain northpole.org does not accept mail (nullMX)

Header Cleanup

Postfix will attach a Received: header to outgoing emails that has details of your internal network and mail client. That’s information you don’t need to broadcast. You can remove that with a “cleanup” step as the message is sent.

# Insert a header check after the 'cleanup' line in the smtp section of the master file and create a header_checks file
sudo sed -i '/^cleanup.*/a    -o header_checks=regexp:/etc/postfix/header_checks' /etc/postfix/master.cf
echo "/^Received:/ IGNORE" | sudo tee -a /etc/postfix/header_checks

Note - there is some debate about whether this triggers a higher spam score. You may want to replace the header instead.
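
A minimal sketch of that, using header_checks’ REPLACE action (the replacement text here is only an example):

# Replace the Received: header rather than strip it - note this overwrites the IGNORE rule from above
echo '/^Received:/ REPLACE Received: from localhost (localhost [127.0.0.1])' | sudo tee /etc/postfix/header_checks
sudo systemctl reload postfix.service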

Testing

Incoming

You can now receive mail to [email protected] and [email protected]. Try this to make sure you’re getting messages. Feel free to install mutt if you’d like a better client at the console.

Outgoing

You usually can’t send mail. Many ISPs block outgoing port 25 to keep a lid on spam bots. This prevents you from sending any messages. You can test that by trying to connect to gmail on port 25 from your server.

nc -zv gmail-smtp-in.l.google.com 25

Also, many mail servers will reverse-lookup your IP to see who it belongs to. That request will go to your ISP (who owns the IPs) and show their DNS name instead of yours. You’re often blocked at this step, though some providers will work with you. Put in a ticket to find out.
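
You can see what name your address currently reverse-resolves to with a PTR lookup (substitute your actual public IP):

# Reverse (PTR) lookup of your public IP
dig +short -x 203.0.113.50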

Even if you’re not blocked and your ISP has given you a static IP with a matching reverse-lookup, you will suffer from a lower reputation score as you’re not a well-known email provider. This can cause your sent messages to be delayed while being considered for spam.

To solve these issues, relay your email through an email provider. This will improve your reputation score (used to judge spam), ease setup of the additional security layers such as SPF, DKIM, and DMARC, and is usually free at small volume.

Postfix even calls this using a ‘Smarthost’.

Next Step

Troubleshooting

When adding Postfix’s anti-spam suggestions, we left off the smtpd_client_restrictions and smtpd_end_of_data_restrictions as they created problems during testing.

You may get a warning from Postfix that one of the settings you’ve added is overriding one of the earlier settings. Simply delete the first instance. These are usually default settings that we’re overriding.

Sources

https://serverfault.com/questions/143968/automate-the-installation-of-postfix-on-ubuntu
https://willem.com/blog/2019-09-10_fighting-backscatter-spam-at-server-level
https://www.linuxbabe.com/mail-server/setup-basic-postfix-mail-sever-ubuntu

Misc

DNS and Network

You need two DNS records to ensure email reaches you. A mail exchange (MX) record for the root domain indicating what server accepts mail, and a hostname (A) for that server so it can be found.

Type Name Value
MX @ mail
A mail (firewall IP Address)

At your firewall, port-forward TCP 25 to your internal server and adjust any local firewalls as needed.

Mail Addresses

Postfix only accepts messages for users in the “local recipient table”, which is built from the unix password file and the aliases file.

Postfix doesn’t know about your root domain yet. Append that to the mydestination line in the main.cf file. You should also configure the HELO name. Otherwise, the server will go around identifying itself as its internal hostname.

sudo sed -i 's/^mydestination.*/&, your.org/' /etc/postfix/main.cf
sudo sed -i '/^mydestination.*/a smtp_helo_name = mail.your.org' /etc/postfix/main.cf
sudo systemctl restart postfix.service

The “Postmaster” address goes to root by default. Direct root’s email to yourself by updating the mail aliases so you see those messages.

echo "root:   $USER" | sudo tee -a /etc/aliases
sudo newaliases

1.3.2 - Relay

A relay is simply another mail server that you give your outgoing mail to, rather than try to deliver it yourself.

There are many companies that specialize in this. Sign up for a free account and they give you the block of text to add to your postfix config. Some popular ones are:

  • SendGrid
  • MailGun
  • Sendinblue

They allow anywhere between 50 and 300 messages a day for free.

SendGrid

Relay Setup

SendGrid’s free plan gives you 50 emails a day. Create an account, verify your email address ([email protected]), and follow the instructions.

https://docs.sendgrid.com/for-developers/sending-email/postfix
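
The gist of those instructions is a relayhost plus SASL credentials. A rough sketch is below - follow the linked doc for the authoritative version, and substitute YOUR_API_KEY with the key you created earlier:

# Point Postfix at SendGrid as a smarthost
sudo postconf -e 'relayhost = [smtp.sendgrid.net]:587'
sudo postconf -e 'smtp_tls_security_level = encrypt'
sudo postconf -e 'smtp_sasl_auth_enable = yes'
sudo postconf -e 'smtp_sasl_security_options = noanonymous'
sudo postconf -e 'smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd'

# The username is literally 'apikey'; the password is your actual API key
echo '[smtp.sendgrid.net]:587 apikey:YOUR_API_KEY' | sudo tee /etc/postfix/sasl_passwd
sudo postmap /etc/postfix/sasl_passwd
sudo chmod 600 /etc/postfix/sasl_passwd /etc/postfix/sasl_passwd.db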

Restart Postfix and use mutt to send an email. It works! The only thing you’ll notice is that your message has an “On Behalf Of” notice letting you know it came from SendGrid. Follow the section below to change that.

Domain Integration

To integrate your domain fully, add DNS records for SendGrid using these instructions.

https://docs.sendgrid.com/ui/account-and-settings/how-to-set-up-domain-authentication

This will require you to login and go to:

  • Settings -> Sender Authentication -> Domain Authentication

Stick with the defaults that include automatic security and SendGrid will give you three CNAME records. Add those to your DNS and your email will check out.

Technical Notes

DNS

If you’re familiar with email domain-based security, you’ll see that two of the records SendGrid gives you are links to DKIM keys so SendGrid can sign emails as you. The other record (emXXXX) is the host SendGrid will use to send email. The SPF record for that host will include a SendGrid SPF record that includes multiple pools of IPs so that SPF checks will pass. They use CNAMEs on your side so they can rotate keys and pool addresses without changing your DNS entries.

If none of this makes sense to you, then that’s really the point. You don’t have to know any of it - they take care of it for you.

Next Steps

Your server can now send email too. All shell users on your server rejoice!

To actually use your mail server, you’ll want to add some remote client access.

1.3.3 - Dovecot

Dovecot is an IMAP (Internet Message Access Protocol) server that allows remote clients to access their mail. There are other protocols and servers, but Dovecot holds roughly 75% of the IMAP server market and is a good choice.

Installation

sudo apt install dovecot-imapd
sudo apt install dovecot-submissiond

Configuration

Storage

Both Postfix and Dovecot use mbox storage format by default. This is one big file with all your mail in it and doesn’t scale well. Switch to the newer maildir format where your messages are stored as individual files.

# Change where Postfix delivers mail.
sudo postconf -e "home_mailbox = Maildir/"
sudo systemctl reload postfix.service

# Change where Dovecot looks for mail.
sudo sed -i 's/^mail_location.*/mail_location = maildir:~\/Maildir/' /etc/dovecot/conf.d/10-mail.conf
sudo systemctl reload dovecot.service

Encryption

Dovecot comes with its own default cert. This isn’t trusted, but Thunderbird will prompt you and you can choose to accept it. This will be fine for now. We’ll generate a valid cert later.

Credentials

Dovecot checks passwords against the local unix system by default and no changes are needed.
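
You can confirm that without involving a mail client by asking Dovecot directly (doveadm ships with the Dovecot packages; substitute your own username and it will prompt for the password):

doveadm auth test YOUR-USERNAME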

Outgoing Mail

One potential surprise is that IMAP is only for retrieving mail. It’s remote file access only.

To send mail, your client would traditionally relay messages to your mail server. But we have relaying turned off, as we don’t want just anyone relaying messages.

The solution is to enable authentication, and by convention this is done by a separate process on its own port, called the Submission Server.

We’ve installed Dovecot’s submission server as it’s newer and easier to set up. Postfix even suggests considering it, rather than theirs. The only configuration needed is to set localhost as the relay.

# Set the relay as localhost where postfix runs
sudo sed -i 's/#submission_relay_host =/submission_relay_host = localhost/' /etc/dovecot/conf.d/20-submission.conf
sudo systemctl reload dovecot.service

Port Forwarding

Forward ports 143 and 587 to your mail server and test that you can connect from both inside and outside your LAN.

nc -zv mail.your.org 143
nc -zv mail.your.org 587

If it’s working from outside your network, but not inside, you may need to enable reflection (aka hairpin NAT). This will be different per firewall vendor, but in OPNsense it’s:

Firewall -> Settings -> Advanced

 # Enable these settings
Reflection for port forwards
Reflection for 1:1
Automatic outbound NAT for Reflection

Clients

Thunderbird and others will successfully discover the correct ports and services when you provide your email address of [email protected].

Notes

Dovecot defaults to port 587 for the submission service, which is the older standard using explicit TLS (STARTTLS). Current RFC guidance (RFC 8314) recommends implicit TLS on port 465 instead.

You can enable both and your clients will pick their favorite. Thunderbird defaults to 465 when both are available.

vi /etc/dovecot/conf.d/10-master.conf

# Change the default of

service submission-login {
  inet_listener submission {
    #port = 587
  }
}

to 

service submission-login {
  inet_listener submission {
    #port = 587
  }
  inet_listener submissions {
    port = 465
    ssl = yes
  }
}

Next Steps

Now that you’ve got the basics working, let’s secure things a little more.

Sources

https://dovecot.org/list/dovecot/2019-July/116661.html

1.3.4 - Security

Certificates

We should use valid certificates. The best way to do that is with the certbot utility.

Certbot

Certbot automates the process of getting and renewing certs, and only requires a brief connection to port 80 as proof it’s you. Renewals only happen every 60 days or so, so there is little risk of exploit.

Forward Port 80

You probably already have a web server using port 80 at your firewall. To make it work with certbot, add a name-based virtual host proxy.

# Here is a caddy example. Add this block to your Caddyfile
http://mail.your.org {
        reverse_proxy * mail.internal.lan
}

# You can also use a well-known URL if you're already using that vhost
http://mail.your.org {
   handle /.well-known/acme-challenge/ {
     reverse_proxy mail.internal.lan
   }
 }

Install Certbot

Once the port forwarding is in place, you can install certbot and request a certificate. Note the --deploy-hook argument. This reloads services after a cert is obtained or renewed. Else, they’ll keep using an expired one.

sudo apt install certbot
sudo certbot certonly --standalone --domains mail.your.org --non-interactive --agree-tos -m [email protected] --deploy-hook "service postfix reload; service dovecot reload"
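
The Debian package also installs a systemd timer that handles renewals. A dry run against the staging servers is a quick way to confirm the port 80 path keeps working:

sudo certbot renew --dry-run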

Postfix

Tell Postfix about the cert by using the postconf utility. This will warn you about any potential configuration errors.

sudo postconf -e 'smtpd_tls_cert_file = /etc/letsencrypt/live/mail.your.org/fullchain.pem'
sudo postconf -e 'smtpd_tls_key_file = /etc/letsencrypt/live/mail.your.org/privkey.pem'
sudo postfix reload

Dovecot

Change Dovecot to use the cert as well.

sudo sed -i 's/^ssl_cert = .*/ssl_cert = <\/etc\/letsencrypt\/live\/MAIL.YOUR.ORG\/fullchain.pem/' /etc/dovecot/conf.d/10-ssl.conf
sudo sed -i 's/^ssl_key = .*/ssl_key = <\/etc\/letsencrypt\/live\/MAIL.YOUR.ORG\/privkey.pem/' /etc/dovecot/conf.d/10-ssl.conf
sudo dovecot reload

Verifying

You can view the certificates with the commands:

openssl s_client -connect mail.server.org:143 -starttls imap -servername mail.server.org
openssl s_client -starttls smtp -showcerts -connect mail.server.org:587 -servername mail.server.org

Intrusion Prevention

In my testing it takes less than an hour before someone discovers and attempts to break into your mail server. You may wish to GeoIP block or otherwise limit connections. You can also use crowdsec.

Crowdsec

Crowdsec is an open-source IPS that monitors your log files and blocks suspicious behavior.

Install as per their instructions.

curl -s https://packagecloud.io/install/repositories/crowdsec/crowdsec/script.deb.sh | sudo bash
sudo apt install -y crowdsec
sudo apt install crowdsec-firewall-bouncer-nftables
sudo cscli collections install crowdsecurity/postfix

Postfix

Most services now log to the system journal rather than a file. You can view them with the journalctl command

# What is the exact service unit name?
sudo systemctl status | grep postfix

# Anything having to do with that service unit
sudo journalctl --unit [email protected]

# Zooming into just the identifiers smtp and smtpd
sudo journalctl --unit [email protected] -t postfix/smtp -t postfix/smtpd

Crowdsec accesses the system journal by adding a block to its log acquisition directives.

sudo tee -a /etc/crowdsec/acquis.yaml << EOF
source: journalctl
journalctl_filter:
  - "[email protected]"
labels:
  type: syslog
---
EOF

sudo systemctl reload crowdsec

Dovecot

Install the dovecot collection as well.

sudo cscli collections install crowdsecurity/dovecot
sudo tee -a /etc/crowdsec/acquis.yaml << EOF
source: journalctl
journalctl_filter:
  - "_SYSTEMD_UNIT=dovecot.service"
labels:
  type: syslog
---
EOF

sudo systemctl reload crowdsec

Is it working? You won’t see anything at first unless you’re actively under attack. But after 24 hours you may see some examples of attempts to relay spam.

allen@mail:~$ sudo cscli alerts list
╭────┬────────────────────┬────────────────────────────┬─────────┬──────────────────────────────────────────────┬───────────┬─────────────────────────────────────────╮
│ ID │       value        │           reason           │ country │                      as                      │ decisions │               created_at                │
├────┼────────────────────┼────────────────────────────┼─────────┼──────────────────────────────────────────────┼───────────┼─────────────────────────────────────────┤
│ 60 │ Ip:187.188.233.58  │ crowdsecurity/postfix-spam │ MX      │ 17072 TOTAL PLAY TELECOMUNICACIONES SA DE CV │ ban:1     │ 2023-05-24 06:33:10.568681233 +0000 UTC │
│ 54 │ Ip:177.229.147.166 │ crowdsecurity/postfix-spam │ MX      │ 13999 Mega Cable, S.A. de C.V.               │ ban:1     │ 2023-05-23 20:17:49.912754687 +0000 UTC │
│ 53 │ Ip:177.229.154.70  │ crowdsecurity/postfix-spam │ MX      │ 13999 Mega Cable, S.A. de C.V.               │ ban:1     │ 2023-05-23 20:15:27.964240044 +0000 UTC │
│ 42 │ Ip:43.156.25.237   │ crowdsecurity/postfix-spam │ SG      │ 132203 Tencent Building, Kejizhongyi Avenue  │ ban:1     │ 2023-05-23 01:15:43.87577867 +0000 UTC  │
│ 12 │ Ip:167.248.133.186 │ crowdsecurity/postfix-spam │ US      │ 398722 CENSYS-ARIN-03                        │ ban:1     │ 2023-05-20 16:03:15.418409847 +0000 UTC │
╰────┴────────────────────┴────────────────────────────┴─────────┴──────────────────────────────────────────────┴───────────┴─────────────────────────────────────────╯

If you’d like to get into the details, take a look at the Crowdsec page.

Next Steps

Now that you’ve got the inside secured, let’s secure the ‘outside’ parts of it so people trust the email you’re sending.

Sources

https://discourse.crowdsec.net/t/crowdsec-container-and-journald-matches-issues/953

1.3.5 - Authentication

Email authentication prevents forgery. People can still send unsolicited email, but they can’t fake who it’s from. If you set up a Relay for Postfix, the relayer is doing it for you. But otherwise, proceed onward to prevent your outgoing mail being flagged as spam.

You need three things:

  • SPF: Server IP addresses - which specific servers have authorization to send email.
  • DKIM: Server Secrets - email is signed so you know it’s authentic and unchanged.
  • DMARC: Verifies the address in the From: aligns with the domain sending the email, and what to do if not.

SPF

SPF, or Sender Policy Framework, is the oldest component. It’s a DNS TXT record that lists the servers authorized to send email for a domain.

A receiving server looks at a message’s return path (aka RFC5321.MailFrom header) to see what domain the email purports to be from. It then looks up that domain’s SPF record and if the server that sent the email isn’t included, the email is considered forged.

Note - this doesn’t check the From: header the user sees. Messages can appear (to the user) to be from anywhere. So it’s mostly a low-level check to prevent spambots.

The DNS record for your Postfix server should look like:

Type: "TXT"
NAME: "@"
Value: "v=spf1 a:mail.your.org -all"

The value above shows the list of authorized servers (a:) contains mail.your.org. Mail from all other servers is considered forged (-all).

To have your Postfix server check SPF for incoming messages add the SPF policy agent.

sudo apt install postfix-policyd-spf-python

sudo tee -a /etc/postfix/master.cf << EOF

policyd-spf  unix  -       n       n       -       0       spawn
    user=policyd-spf argv=/usr/bin/policyd-spf
EOF

sudo tee -a /etc/postfix/main.cf << EOF

policyd-spf_time_limit = 3600
smtpd_recipient_restrictions =
   permit_mynetworks,
   permit_sasl_authenticated,
   reject_unauth_destination,
   check_policy_service unix:private/policyd-spf
EOF

sudo systemctl restart postfix

DKIM

DKIM, or DomainKeys Identified Mail, signs the emails as they are sent ensuring that the email body and From: header (the one you see in your client) hasn’t been changed in transit and is vouched for by the signer.

Receiving servers see the DKIM header that includes who signed it, then use DNS to check it. Unsigned mail simply isn’t checked. (There is no could-but-didn’t in the standard).

Note - There is no connection between the domain that signs the message and what the user sees in the From: header. Messages can have a valid DKIM signature and still appear to be from anywhere. DKIM is mostly to prevent man-in-the-middle attacks from altering the message.

For Postfix, this requires installing OpenDKIM and connecting it to Postfix as detailed here. Make sure to sign with the domain root.

https://tecadmin.net/setup-dkim-with-postfix-on-ubuntu-debian/

Once you’ve done that, create the following DNS entry.

Type: "TXT"
NAME: "default._domainkey"
Value: "v=DKIM1; h=sha256; k=rsa; p=MIIBIjANBgkq..."

DMARC

Having a DMARC record is the final piece that instructs servers to check the From: header the user sees against the domain return path from the SPF and DKIM checks, and what to do on a fail.

This means mail “From: [email protected]” sent through mail.your.org mail servers will be flagged as spam.

The DNS record should look like:

Type: "TXT"
NAME: "_dmarc"
Value: "v=DMARC1; p=reject; adkim=s; aspf=r;"
  • p=reject: Reject messages that fail
  • adkim=s: Use strict DKIM alignment
  • aspf=r: Use relaxed SPF alignment

Reject (p=reject) indicates that email servers should “reject” emails that fail DKIM or SPF tests, and skip quarantine.

Strict DKIM alignment (=s) means that the SPF Return-Path domain or the DKIM signing domain must be an exact match with the domain in the From: address. A DKIM signature from your.org would exactly match [email protected].

Relaxed SPF alignment (=r) means subdomains of the From: address are acceptable. I.e. the server mail.your.org from the SPF test aligns with an email from: [email protected].

You can also choose quarantine mode (p=quarantine) or report-only mode (p=none) where the email will be accepted and handled as such by the receiving server, and a report sent to you like below.

v=DMARC1; p=none; rua=mailto:[email protected]

DMARC is an ‘or’ test. In the first example, if either the SPF or DKIM domains pass, then DMARC passes. You can choose to test one, both, or none at all (meaning nothing can pass DMARC) as in the second DMARC example.

To implement DMARC checking in Postfix, you can install OpenDMARC and configure a mail filter as described below.

https://www.linuxbabe.com/mail-server/opendmarc-postfix-ubuntu

Next Steps

Now that you are handling email securely and authentically, let’s help ease client connections.

Autodiscovery

1.3.6 - Autodiscovery

In most cases you don’t need this. Thunderbird, for example, will use a shotgun approach and may find your server using ‘common’ server names based on your email address.

But there is an RFC and other clients may need help.

DNS SRV

This takes advantage of the RFC with an entry for IMAP and one for SMTP Submission.

Type Name Service Protocol TTL Priority Weight Port Target
SRV @ _imap TCP auto 10 5 143 mail.your.org
SRV @ _submission TCP auto 10 5 465 mail.your.org
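
If your DNS provider wants zone-file syntax rather than a form, the same two entries look roughly like this:

_imap._tcp          IN SRV 10 5 143 mail.your.org.
_submission._tcp    IN SRV 10 5 465 mail.your.org.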

Web Autoconfig

  • Create a DNS entry for autoconfig.your.org
  • Create a vhost and web root for that with the file mail/config-v1.1.xml
  • Add the contents below to that file
<?xml version="1.0"?>
<clientConfig version="1.1">
    <emailProvider id="your.org">
      <domain>your.org</domain>
      <displayName>Example Mail</displayName>
      <displayShortName>Example</displayShortName>
      <incomingServer type="imap">
         <hostname>mail.your.org</hostname>
         <port>143</port>
         <socketType>STARTTLS</socketType>
         <username>%EMAILLOCALPART%</username>
         <authentication>password-cleartext</authentication>
      </incomingServer>
      <outgoingServer type="smtp">
         <hostname>mail.your.org</hostname>
         <port>587</port>
         <socketType>STARTTLS</socketType> 
         <username>%EMAILLOCALPART%</username> 
         <authentication>password-cleartext</authentication>
         <addThisServer>true</addThisServer>
      </outgoingServer>
    </emailProvider>
    <clientConfigUpdate url="https://www.your.org/config/mozilla.xml" />
</clientConfig>
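
If you’re already running Caddy (as in the certificate section), a minimal vhost to serve that file might look like the following, assuming the XML lives at /var/www/autoconfig/mail/config-v1.1.xml:

http://autoconfig.your.org {
        root * /var/www/autoconfig
        file_server
}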

Note

It’s traditional to match server names to protocols and we would have used “imap.your.org” and “smtp.your.org”. But using ‘mail’ is popular now and it simplifies setup at several levels.

Thunderbird will try to guess at your server names, attempting to connect to smtp.your.org for example. But many Postfix configurations have spam prevention that interferes.

Sources

https://cweiske.de/tagebuch/claws-mail-autoconfig.htm
https://www.hardill.me.uk/wordpress/2021/01/24/email-autoconfiguration/

2 - Media

2.1 - Signage

2.1.1 - Anthias (Screenly)

Overview

Anthias (AKA Screenly) is a simple, open-source digital signage system that runs well on a Raspberry Pi. When plugged into a monitor, it displays images, video or web sites in slideshow fashion. It’s managed directly through a web interface on the device and there are fleet and support options.

Preparation

Use the Raspberry Pi Imager to create a 64 bit Raspberry Pi OS Lite image. Select the gear icon at the bottom right to enable SSH, create a user, configure networking, and set the locale. Use SSH to continue configuration.

setterm --cursor on

sudo raspi-config nonint do_change_locale en_US.UTF-8
sudo raspi-config nonint do_configure_keyboard us
sudo raspi-config nonint do_wifi_country US
sudo timedatectl set-timezone America/New_York
  
sudo raspi-config nonint do_hostname SOMENAME

sudo apt update;sudo apt upgrade -y

sudo reboot

Enable automatic updates and reboots

sudo apt -y install unattended-upgrades

# Remove the leading slashes from some of the updates and set to true
sudo sed -i 's/^\/\/\(.*origin=Debian.*\)/  \1/' /etc/apt/apt.conf.d/50unattended-upgrades
sudo sed -i 's/^\/\/\(Unattended-Upgrade::Remove-Unused-Kernel-Packages \).*/  \1"true";/' /etc/apt/apt.conf.d/50unattended-upgrades
sudo sed -i 's/^\/\/\(Unattended-Upgrade::Remove-New-Unused-Dependencies \).*/  \1"true";/' /etc/apt/apt.conf.d/50unattended-upgrades
sudo sed -i 's/^\/\/\(Unattended-Upgrade::Remove-Unused-Dependencies \).*/  \1"true";/' /etc/apt/apt.conf.d/50unattended-upgrades
sudo sed -i 's/^\/\/\(Unattended-Upgrade::Automatic-Reboot \).*/  \1"true";/' /etc/apt/apt.conf.d/50unattended-upgrades

Installation

bash <(curl -sL https://www.screenly.io/install-ose.sh)

Operation

Adding Content

Navigate to the Web UI at the IP address of the device. You may wish to enter the settings and add authentication and change the device name.

You may add common graphic types, mp4s, web pages and YouTube links. It will let you know if it fails to download the YouTube video. Some heavy web pages fail to render correctly, but most do.

Images must be sized for the screen. In most cases this is 1080. Larger images are scaled down, but smaller images are not scaled up. For example, PowerPoint is often used to create slides, but it exports at 720, which creates black borders on a 1080 screen. You can change the resolution on the Pi with raspi-config or add a registry key to Windows to change PowerPoint’s output size.

Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\Microsoft\Office\16.0\PowerPoint\Options]
"ExportBitmapResolution"=dword:00000096

Schedule the Screen

You may want to turn off the display during non-operation hours. The vcgencmd command can turn off video output and some displays will choose to enter power-savings mode.

sudo tee /etc/cron.d/screenpower << EOF

# m h dom mon dow user command

# Turn monitor on
30 7  * * 1-5 root /usr/bin/vcgencmd display_power 1

# Turn monitor off
30 19 * * 1-5 root /usr/bin/vcgencmd display_power 0

# Weekly Reboot just in case
0 7 * * 1 root /sbin/shutdown -r +10 "Monday reboot in 10 minutes"
EOF

Troubleshooting

YouTube Fail

You may find you must download the video manually and then upload to Anthias. Use the utility yt-dlp to list and then download the mp4 version of a video

yt-dlp --list-formats https://www.youtube.com/watch?v=YE7VzlLtp-4
yt-dlp --format 22 https://www.youtube.com/watch?v=YE7VzlLtp-4

WiFi Disconnect

Some variants of the OS do not automatically reconnect to WiFi should the Access Point reboot. You may want to add the following script that checks for that and reconnects.

sudo touch /usr/local/bin/checkwifi
sudo chmod +x /usr/local/bin/checkwifi
sudo vim.tiny /usr/local/bin/checkwifi
#!/bin/bash

# Exit if eth0 is connected
grep -q 1 /sys/class/net/eth0/carrier && exit

# Exit if WiFi isn't configured
grep -q ssid /etc/wpa_supplicant/wpa_supplicant.conf || exit 

GATEWAY=$(ip route list | grep default | awk '{print $3}')

ping -c4 $GATEWAY > /dev/null

if [ $? != 0 ]
then
  logger checkwifi fail `date`
  service wpa_supplicant restart
  service dhcpcd restart
fi
sudo tee /etc/cron.d/checkwifi << EOF
# Check WiFi connection
*/5 * * * * root /usr/local/bin/checkwifi >> /dev/null 2>&1
EOF

Hidden WiFi

If you didn’t set up WiFi during imaging, you can use raspi-config after boot, but you must add a line if it’s a hidden network, and reboot.

sudo sed -i '/psk/a\        scan_ssid=1' /etc/wpa_supplicant/wpa_supplicant.conf

Wrong IP on Splash Screen

This seems to be captured during installation and then resides statically in this file. Adjust as needed.

vi ./screenly/docker-compose.yml

2.1.2 - Anthias Deployment

If you do regular deployments you can create an image. A reasonable approach is to:

  • Shrink the last partition
  • Zero fill the remaining free space
  • Find the end of the last partition
  • DD that to a file
  • Use raspi-config to resize after deploying

Or you can use PiShrink to script all that.

Installation

wget https://raw.githubusercontent.com/Drewsif/PiShrink/master/pishrink.sh
chmod +x pishrink.sh
sudo mv pishrink.sh /usr/local/bin

Operation

# Capture and shrink the image
sudo dd if=/dev/mmcblk0 of=anthias-raw.img bs=1M
sudo pishrink.sh anthias-raw.img anthias.img

# Copy to a new card
sudo dd if=anthias.img of=/dev/mmcblk0 bs=1M

If you need to modify the image after creating it you can mount it via loop-back.

sudo losetup --find --partscan anthias.img
sudo mount /dev/loop0p2 /mnt/

# After you've made changes

sudo umount /mnt
sudo losetup --detach-all

Manual Steps

If you have access to a graphical desktop environment, use GParted. It will resize the filesystem and partitions for you quite easily.

# Mount the image via loopback and open it with GParted
sudo losetup --find --partscan anthias-raw.img

# Grab the right side of the last partition with your mouse and 
# drag it as far to the left as you can, apply and exit
sudo gparted /dev/loop0

Now you need to find the last sector and truncate the file after that location. Since the truncate utility operates on bytes, you convert sectors to bytes with multiplication.

# Find the end of the last partition. In the below example, it's sector 9812664
$ sudo fdisk -lu /dev/loop0

Units: sectors of 1 * 512 = 512 bytes

Device       Boot  Start     End Sectors  Size Id Type
/dev/loop0p1        8192  532479  524288  256M  c W95 FAT32 (LBA)
/dev/loop0p2      532480 9812664 9280185  4.4G 83 Linux


sudo losetup --detach-all

sudo truncate --size=$[(9812664+1)*512] anthias-raw.img

Very Manual Steps

If you don’t have a GUI, you can do it with a combination of commands.

# Mount the image via loopback
sudo losetup --find --partscan anthias-raw.img

# Check and resize the file system
sudo e2fsck -f /dev/loop0p2
sudo resize2fs -M /dev/loop0p2

... The filesystem on /dev/loop0p2 is now 1149741 (4k) blocks long

# Now you can find the end of the resized filesystem by:

# Finding the number of sectors.
#     Bytes = Num of blocks * block size
#     Number of sectors = Bytes / sector size
echo $[(1149741*4096)/512]

# Finding the start sector (532480 in the example below)
sudo fdisk -lu /dev/loop0

Device       Boot  Start      End  Sectors  Size Id Type
/dev/loop0p1        8192   532479   524288  256M  c W95 FAT32 (LBA)
/dev/loop0p2      532480 31116287 30583808 14.6G 83 Linux

# Adding the number of sectors to the start sector. Add 1 because you want to end AFTER the end sector
echo $[532480 + 9197928 + 1]

# And resize the part to that end sector (ignore the warnings)
sudo parted /dev/loop0 resizepart 2 9730409s

Great! Now you can follow the remainder of the GParted steps to find the new last sector and truncate the file.

Extra Credit

It’s handy to compress the image. xz is pretty good for this

xz anthias-raw.img

xzcat anthias-raw.img | sudo dd of=/dev/mmcblk0

In these procedures, we make a copy of the SD card before we do anything. Another strategy is to resize the SD card directly, and then use dd to read in just that number of sectors rather than read it all in and then truncate it. A bit faster, if a bit less recoverable in the event of a mistake.

2.1.3 - API

The API docs on the web refer to Screenly. Anthias uses an older API. However, you can access the API docs for the version you’re working with at:

http://sign.your.domain/api/docs/

You’ll have to correct the swagger form with the correct URL, but after that you can see what you’re working with.

3 - Monitoring

Time series vs event data.

3.1 - Metrics

3.1.1 - Prometheus

Overview

Prometheus is a time series database, meaning it’s optimized to store and work with data organized in time order. It includes in its single binary:

  • Database engine
  • Collector
  • Simple web-based user interface

This allows you to collect and manage data with fewer tools and less complexity than other solutions.

Data Collection

End-points normally expose metrics to Prometheus by making a web page available that it can poll. This is done by including an instrumentation library (provided by Prometheus) or simply adding a listener on a high port that spits out some text when asked.

For systems that don’t support Prometheus natively, there are a few add-on services to translate. These are called ’exporters’ and translate things such as SNMP into a web format Prometheus can ingest.

Alerting

You can also alert on the data collected. This is through the Alert Manager, a second package that works closely with Prometheus.

Visualization

You still need a dashboard tool like Grafana to handle visualizations, but you can get started quite quickly with just Prometheus.

3.1.1.1 - Installation

Prometheus doesn’t offer a repo. Debian does, but the stable release is about a year behind. So you’ll want to install from backports or testing. We’ll use the latter as it’s moderately up to date, and pin the rest of the release.

# Backports
echo 'deb http://deb.debian.org/debian bullseye-backports main' | sudo tee -a /etc/apt/sources.list.d/backports.list

# Testing
echo 'deb http://deb.debian.org/debian testing main' | sudo tee -a /etc/apt/sources.list.d/testing.list

# Pin the current level so you don't get a surprise upgrade
sudo tee -a /etc/apt/preferences.d/not-bookworm << EOF
Package: *
Pin: release n=bookworm
Pin-Priority: 50
EOF

# Living Dangerously with test
sudo apt update
sudo apt install -t testing prometheus

Configuration

Use this for your starting config.

cat /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]

This says every 15 seconds, run down the job list. And there is one job - to check out the system at ‘localhost:9090’ which happens to be itself.

For every target listed, the scraper makes a web request for /metrics/ and stores the results. It ingests all the data presented and stores them for 15 days. You can choose to ignore certain elements or retain differently by adding config, but by default it takes everything given.
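
For example, one way to skip metrics you don’t care about is a metric_relabel_configs block on the job. This sketch drops the Go runtime metrics before they’re stored:

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]
    metric_relabel_configs:
      # Discard anything whose name starts with go_
      - source_labels: [__name__]
        regex: 'go_.*'
        action: drop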

You can see this yourself by just asking like Prometheus would. Hit it up directly in your browser. For example, Prometheus is making metrics available at /metrics

http://some.server:9090/metrics

Operation

User Interface

You can access the Web UI at:

http://some.server:9090

At the top, select Graph (you should be there already) and in the Console tab click the dropdown labeled “insert metric at cursor”. There you will see all the data being exposed. This is mostly about the Go language it’s written in, and not super interesting. A simple Graph tab is available as well.

Data Composition

Data can be simple, like:

go_gc_duration_seconds_sum 3

Or it can be dimensional which is accomplished by adding labels. In the example below, the value of go_gc_duration_seconds has 5 labeled sub-sets.

go_gc_duration_seconds{quantile="0"} 4.5697e-05
go_gc_duration_seconds{quantile="0.25"} 7.814e-05
go_gc_duration_seconds{quantile="0.5"} 0.000103396
go_gc_duration_seconds{quantile="0.75"} 0.000143687
go_gc_duration_seconds{quantile="1"} 0.001030941

In this example, the value of net_conntrack_dialer_conn_failed_total has several.

net_conntrack_dialer_conn_failed_total{dialer_name="alertmanager",reason="refused"} 0
net_conntrack_dialer_conn_failed_total{dialer_name="alertmanager",reason="resolution"} 0
net_conntrack_dialer_conn_failed_total{dialer_name="alertmanager",reason="timeout"} 0
net_conntrack_dialer_conn_failed_total{dialer_name="alertmanager",reason="unknown"} 0
net_conntrack_dialer_conn_failed_total{dialer_name="default",reason="refused"} 0
net_conntrack_dialer_conn_failed_total{dialer_name="default",reason="resolution"} 0
net_conntrack_dialer_conn_failed_total{dialer_name="default",reason="timeout"} 0
net_conntrack_dialer_conn_failed_total{dialer_name="default",reason="unknown"} 0
net_conntrack_dialer_conn_failed_total{dialer_name="snmp",reason="refused"} 0
net_conntrack_dialer_conn_failed_total{dialer_name="snmp",reason="resolution"} 0
net_conntrack_dialer_conn_failed_total{dialer_name="snmp",reason="timeout"} 0
net_conntrack_dialer_conn_failed_total{dialer_name="snmp",reason="unknown"} 0

How is this useful? It allows you to do aggregations - such as looking at all the net_conntrack failures, and also looking at just the failures that were refused. All with the same data.
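
In the query box, that looks something like this:

# Total failures across every dialer and reason
sum(net_conntrack_dialer_conn_failed_total)

# Just the refused connections, broken out per dialer
sum by (dialer_name) (net_conntrack_dialer_conn_failed_total{reason="refused"})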

Removing Data

You may have a target you want to remove. Such as a typo hostname that is now causing a large red bar on a dashboard. You can remove that mistake by enabling the admin API and issuing a delete

sudo sed -i 's/^ARGS.*/ARGS="--web.enable-admin-api"/' /etc/default/prometheus

sudo systemctl reload prometheus

curl -s -X POST -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={instance="badhost.some.org:9100"}'

The default retention is 15 days. You may want less than that and you can configure --storage.tsdb.retention.time=1d similar to above. ALL data has the same retention, however. If you want historical data you must have a separate instance or use VictoriaMetrics.
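
Following the same pattern as above - note this replaces the whole ARGS line, so combine it with any other flags you’re using:

sudo sed -i 's/^ARGS.*/ARGS="--storage.tsdb.retention.time=1d"/' /etc/default/prometheus
sudo systemctl restart prometheus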

Next Steps

Let’s get something interesting to see by adding some OS metrics

Troubleshooting

If you can’t start the prometheus server, it may be an issue with the init file. Test and Prod repos use different defaults. Add some values explicitly to get new versions running

sudo vi /etc/default/prometheus

ARGS="--config.file="/etc/prometheus/prometheus.yml  --storage.tsdb.path="/var/lib/prometheus/metrics2/"

3.1.1.2 - Node Exporter

This is a service you install on your end-points that makes CPU/memory/etc. metrics available to Prometheus.

Installation

On each device you want to monitor, install the node exporter with this command.

sudo apt install prometheus-node-exporter

Do a quick test to make sure it’s responding to scrapes.

curl localhost:9100/metrics

Configuration

Back on your Prometheus server, add these new nodes as a job in the prometheus.yml file. Feel free to drop the initial job where Prometheus was scraping itself.

global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'servers'
    static_configs:
    - targets:
      - some.server:9100
      - some.other.server:9100
      - and.so.on:9100
sudo systemctl reload prometheus.service

Operation

You can check the status of your new targets at:

http://some.server:9090/classic/targets

A lot of data is collected by default. On some low power systems you may want less. For just the basics, customize the config to disable the defaults and only enable specific collectors.

In the case below we reduce collection to just CPU, memory, and hardware metrics. When scraping a Pi 3B, this reduces the scrape duration from 500 to 50ms.

sudo sed -i 's/^ARGS.*/ARGS="--collector.disable-defaults --collector.hwmon --collector.cpu --collector.meminfo"/' /etc/default/prometheus-node-exporter
sudo systemctl restart prometheus-node-exporter

The available collectors are listed on the page:

https://github.com/prometheus/node_exporter

3.1.1.3 - SNMP Exporter

SNMP is one of the most prevalent (and clunky) protocols still widely used on network-attached devices. But it’s a good general-purpose way to get data from lots of different makes of products in a similar way.

But Prometheus doesn’t understand SNMP. The solution is a translation service that acts as a middle-man and ‘exports’ data from those devices in a way Prometheus can ingest.

Installation

Assuming you’ve already installed Prometheus, install some SNMP tools and the exporter. If you have an error installing the mibs-downloader, check troubleshooting at the bottom.

sudo apt install snmp snmp-mibs-downloader
sudo apt install -t testing prometheus-snmp-exporter

Change the SNMP tools config file to allow use of installed MIBs.

sudo sed -i 's/^mibs/# &/' /etc/snmp/snmp.conf

Preparation

We need a target, so assuming you have a switch somewhere and can enable SNMP on it, let’s query the switch for its name, AKA sysName. Here we’re using version “2c” of the protocol with the read-only password “public”. Pretty standard.

snmpwalk -v 2c -c public some.switch.address sysName

SNMPv2-MIB::sysName.0 = STRING: Some-Switch

Note: If you get back an error or just the ‘iso’ prefixed value, double check your MIBs are installed.

Configuration

To add this switch to the Prometheus scraper, add a new job to the prometheus.yml file. This job will include the targets as normal, but also the path (since it’s different than default) and an optional parameter called module that is specific to the SNMP exporter. It also does something confusing - a relabel_config.

This is because Prometheus isn’t actually talking to the switch, it’s talking to the local SNMP exporter service. So we put all the targets in normally, and then at the bottom say ‘oh, by the way, do a switcheroo’. This allows Prometheus to display all the data normally with no one the wiser.

...
...
scrape_configs:
  - job_name: 'snmp'
    static_configs:
      - targets:
        - some.switch.address    
    metrics_path: /snmp
    params:
      module: [if_mib]
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9116  # The SNMP exporter's real hostname:port.

Operation

No configuration on the exporter side is needed. Reload the config and check the target list. Then examine data in the graph section. Add additional targets as needed and the exporter will pull in the data.

http://some.server:9090/classic/targets

These metrics are considered well known and so will appear in the database under names like sysUpTime and upsBasicBatteryStatus, not prefixed with snmp_ like you might expect.

Next Steps

If you have something non-standard, or you simply don’t want that huge amount of data in your system, look at the link below to customize the SNMP collection with the Generator.

SNMP Exporter Generator Customization

Troubleshooting

The snmp-mibs-downloader is just a handy way to download a bunch of default MIBs so when you use the tools, all the cryptic numbers, like “1.3.6.1.2.1.17.4.3.1” are translated into pleasant names.

If you can’t find the mibs-downloader, it’s probably because it’s in the non-free repo and that’s not enabled by default. Change your apt sources file like so:

sudo vi /etc/apt/sources.list

deb http://deb.debian.org/debian/ bullseye main contrib non-free
deb-src http://deb.debian.org/debian/ bullseye main contrib non-free

deb http://security.debian.org/debian-security bullseye-security main contrib non-free
deb-src http://security.debian.org/debian-security bullseye-security main contrib non-free

deb http://deb.debian.org/debian/ bullseye-updates main contrib non-free
deb-src http://deb.debian.org/debian/ bullseye-updates main contrib non-free

It may be that you only need to change one line.

3.1.1.4 - SNMP Generator

Installation

There is no need to install the Generator as it comes with the SNMP exporter. But if you have a device that supplies its own MIB (and many do), you should add that to the default location.

# Mibs are often named SOMETHING-MIB.txt
sudo cp -n *MIB.txt /usr/share/snmp/mibs/

Preparation

You must identify the values you want to capture. Using snmpwalk is a good way to see what’s available, but it helps to have a little context.

The data is arranged like a folder structure that you drill down through. The folder names are all numeric, with ‘.’ instead of slashes. So if you wanted to get a device’s sysName you’d click down through 1.3.6.1.2.1.1.5 and look in the file 0.
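
For example, you can fetch that single value directly with snmpget rather than walking the tree:

# sysName lives at 1.3.6.1.2.1.1.5, instance 0
snmpget -v 2c -c public some.switch.address 1.3.6.1.2.1.1.5.0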

When you use snmpwalk it starts wherever you tell it and then starts drilling-down, printing out everything it finds.

How do you know that’s where sysName is at? A bunch of folks got together (the ISO folks) and decided everything in advance. Then they made some handy files (MIBs) and passed them out so you didn’t have to remember all the numbers.

They allow vendors to create their own sections as well, for things that might not fit anywhere else.

A good place to start is looking at what the vendor made available. You see this by walking their section and including their MIB so you get descriptive names - only the ISO System MIB is included by default.

# The sysObjectID identifies the vendor section
# Note use of the MIB name without the .txt
$ snmpwalk -m +SOMEVENDOR-MIB -v 2c -c public some.address sysObjectID

SNMPv2-MIB::sysObjectID.0 = OID: SOMEVENDOR-MIB::somevendoramerica

# Then walk the vendor section using the name from above
$ snmpwalk -m +SOMEVENDOR-MIB -v 2c -c public some.address somevendoramerica

SOMEVENDOR-MIB::model.0 = STRING: SOME-MODEL
SOMEVENDOR-MIB::power.0 = INTEGER: 0
...
...

# Also check out the general System section
$ snmpwalk -m +SOMEVENDOR-MIB -v 2c -c public some.address system

# You can also walk the whole ISO tree. In some cases,
# there are thousands of entries and it's indecipherable
$ snmpwalk -m +SOMEVENDOR-MIB -v 2c -c public some.system iso

This can be a lot of information and you’ll need to do some homework to see what data you want to collect.

Configuration

The exporter’s default configuration file is snmp.yml and contains about 57,000 lines of config. It’s designed to pull data from whatever you point it at. Basically, it doesn’t know what device it’s talking to, so it tries to cover all the bases.

This isn’t a file you should edit by hand. Instead, you create instructions for the generator and it looks through the MIBs and creates one for you. Here’s an example for a Samlex inverter.

vim ~/generator.yml
modules:
  samlex:
    walk:
      - sysLocation
      - inverterMode
      - power
      - vin
      - tempDD
      - tempDA
prometheus-snmp-generator generate
sudo cp /etc/prometheus/snmp.yml /etc/prometheus/snmp.yml.orig
sudo cp ~/snmp.yml /etc/prometheus
sudo systemctl reload prometheus-snmp-exporter.service

Configuration in Prometheus remains the same - but since we picked a new module name we need to adjust that.

    ...
    ...
    params:
      module: [samlex]
    ...
    ...
sudo systemctl reload prometheus.service

Adding Data Prefixes

By default, the names are all over the place. The SNMP Exporter devs leave it this way because there are a lot of pre-built dashboards on downstream systems that expect the existing names.

If you are building your own downstream systems you can prefix (as is best practice) as you like with a post-generation step. This example causes them all to be prefixed with samlex_.

prometheus-snmp-generator generate
sed -i 's/name: /name: samlex_/' snmp.yml

Combining MIBs

You can combine multiple systems in the generator file to create one snmp.yml file, and refer to them by the module name in the Prometheus file.

modules:
  samlex:
    walk:
      - sysLocation
      - inverterMode
      - power
      - vin
      - tempDD
      - tempDA
  ubiquiti:
    walk:
      - something
      - somethingElse  

Operation

As before, you can get a preview directly from the exporter (using a link like below). This data should show up in the Web UI too.

http://some.server:9116/snmp?module=samlex&target=some.device

Sources

https://github.com/prometheus/snmp_exporter/tree/main/generator

3.2 - Events

3.3 - Visualization

3.3.1 - Grafana

4 - Network

4.1 - VPN

4.1.1 - Wireguard

Wireguard is a new, light-weight VPN that is both faster and simpler than its predecessors. With a small code-base and modern cryptography, it’s the future of VPNs.

Concepts

Wireguard is a layer 3 VPN and as such, only works with IPv4/6. It doesn’t provide DHCP, bridging, or other low-level features.

Participants authenticate using public-key cryptography, use UDP as a transport and do not respond to unauthenticated connection attempts.

Every participant is considered a peer. Each defines their own IP address, routing rules, and decides from whom they will accept traffic. Every peer must exchange public keys with every other peer. There is no central authority.

Traffic is sent directly between configured peers but can also be relayed through central nodes if so configured by routing rules on the participants.

Scenarios

The way you deploy depends on what you’re doing, but in general you’ll either connect directly point-to-point or create a central server for remote access or management.

Central Server and Remote Access

This is the classic setup where remote systems connect to the network through one central point. Configure a wireguard server as that central point and then your clients (remote peers) to connect.

Central Server and Remote Management

Another common use is to have a fleet of devices ‘phone-home’ so you can reach them easily.

Point to Point

You can also have peers talk directly to each other. This is often used with routers to connect networks across the internet.

4.1.1.1 - Central Server

A central server gives remote devices a reachable target, allowing them to traverse firewalls and NAT and connect. Let’s create a server and generate and add your first remote peer.

Preparation

You’ll need:

  • Public Domain Name or Static IP
  • Linux Server
  • Ability to port-forward UDP 51820

A dynamic domain name will work and is reasonably priced (usually free). You just need something for the peers to connect to, though a static IP is best; if your IP changes while your peers are connected (or they have the old IP cached), connectivity can break.

We use Debian in this example and derivatives should be similar. UDP 51820 is the standard port but you can choose another if desired.

You must also choose a VPN network that doesn’t overlap with your existing networks. We use 192.168.100.0/24 in this example.

Installation

sudo apt install wireguard-tools

Configuration

All the server needs is a single config file and it will look something like this:

[Interface]
Address = 192.168.100.1/24
ListenPort = 51820
PrivateKey = sGp9lWqfBx+uOZO8V5NPUlHQ4pwbvebg8xnfOgR00Gw=

We picked .1 as our server address (pretty standard), created a private key with the wg tool, and put that in the file /etc/wireguard/wg0.conf. Here’s the commands to do that.

# As root
cd /etc/wireguard/
umask 077

wg genkey > server_privatekey
wg pubkey < server_privatekey > server_publickey

read PRIV < server_privatekey

cat << EOF > wg0.conf
[Interface]
Address = 192.168.100.1/24
ListenPort = 51820
PrivateKey = $PRIV
EOF

Operation

The VPN operates by creating a network interface and loading a kernel module. You use the linux ip command to add a network interface of type wireguard (which automatically loads the kernel module) or use the wg-quick command to do it for you. Name the interface wg0 and it will pull in the config wg0.conf.

Test the Interface

wg-quick up wg0

ping 192.168.100.1

wg-quick down wg0

Enable The Service

For normal use, employ systemctl to create a service using the installed service file.

systemctl enable --now wg-quick@wg0

Administration

The most common procedure is adding new clients. Each must have a unique key and IP, as the keys are hashed and used as part of the internal routing.

Create a Client

Let’s create a client config by generating their key and assigning them an IP on the server. Generating the client’s private key for them isn’t ideal security-wise, but it is pragmatic.

wg genkey > client_privatekey # Generates and saves the client private key
wg pubkey < client_privatekey # Displays the client's public key

Add the client’s public key and IP to your server’s wg0.conf and reload. For the IP, it’s fine to just increment. Note the /32, meaning we will only accept that IP from this peer.

[Interface]
Address = 192.168.100.1/24
ListenPort = 51820
PrivateKey = XXXXXX

##  Some Client  ##
[Peer]
PublicKey = XXXXXX
AllowedIPs = 192.168.100.2/32
wg-quick down wg0 &&  wg-quick up wg0

Send The Client Config

A client config file should look similar to this. The [Interface] is about the client and the [Peer] is about the server.

[Interface]
PrivateKey = THE-CLIENT-PRIVATE-KEY
Address = 192.168.100.2/32

[Peer]
PublicKey = YOUR-SERVERS-PUBLIC-KEY
AllowedIPs = 192.168.100.0/24
Endpoint = your.server.org:51820

Put in the keys and domain name, zip it up and send it on to your client as securely as possible. One neat trick is to display a QR code right in the shell. Devices that have a camera can import from that.

qrencode -t ANSIUTF8 < client-wg0.conf

Test The Client

You should be able to ping the server from the client. If not, take a look at the troubleshooting steps.

Next Steps

We haven’t enabled forwarding yet or set up firewall rules as those depend on what role your central peer will play. Proceed on to Remote Access or Remote Management as desired.

Troubleshooting

When something is wrong, you don’t get an error message, you just get nothing. You bring up the client interface but you can’t ping the server 192.168.100.1. But you can turn on log messages on the server with this command.

echo module wireguard +p > /sys/kernel/debug/dynamic_debug/control
dmesg

# When done, send a '-p'

Key Errors

wg0: Invalid handshake initiation from 205.133.134.15:18595

In this case, you should check your keys and possibly take the server interface down and up.

Typos

ifconfig: ioctl 0x8913 failed: No such device

Check that your conf is named /etc/wireguard/wg0.conf and look for any typos.

Firewall Issues

If you see no wireguard error messages, you should suspect your firewall. Since it’s UDP you can’t test the port directly, but you can use netcat.

nc -ulp 51820  # On the server

nc -u some.server 51820 # On the client. Type and see if it shows up on the server

4.1.1.2 - Remote Access

This is the classic setup where remote peers initiate a connection to a server that is reachable through the internet, that then forwards their traffic onward.

Traffic Handling

The main choice is route or masquerade.

Routing

If you route, the client’s VPN IP address is what other devices see. This is generally preferred as it allows you to log who was doing what at the individual servers. But you must update your network equipment to treat the central server as a router.

Masquerading

Masquerading causes the server to translate all the traffic. This makes everything look like it’s coming from the server. It’s less secure, but less complicated and much quicker to implement.

For this first example, we will masquerade traffic from the server.

Central Server Config

Enable Forwarding and Masquerade

Use sysctl to enable forwarding on the server and nft to add masquerade.

# as root
sysctl -w net.ipv4.ip_forward=1

nft flush ruleset
nft add table nat
nft add chain nat postrouting { type nat hook postrouting priority 100\; }
nft add rule nat postrouting masquerade

Persist Rules

It’s best if we add our new rules onto the defaults and enable the nftables service.

# as root
echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf

nft list ruleset >> /etc/nftables.conf

systemctl enable --now  nftables.service 

Client Config

Your remote peer - the one you created when setting up the server - needs its AllowedIPs adjusted so it knows to send more traffic through the tunnel.

Full Tunnel

This sends all traffic from the client over the VPN.

AllowedIPs = 0.0.0.0/0

Split Tunnel

The most common config is to send only specific networks through the tunnel. This keeps netflix and such off the VPN.

AllowedIPs = 192.168.100.0/24, 192.168.XXX.XXX, 192.168.XXX.YYY

DNS

In some cases, you’ll need the client to use your internal DNS server to resolve private domain names. Make sure this server is in the AllowedIPs above.

[Interface]
PrivateKey = ....
Address = ...
DNS = 192.168.1.1

Firewall Rules

You may want to apply some controls to your clients, such as preventing them from talking to each other while still letting them ping the server and having an ‘admin’ station. You can do this by adding rules to the forward chain.

# Allow an 'admin' peer at .2 full access to others and accept their replies
sudo nft add rule inet filter forward iifname "wg0" ip saddr 192.168.100.2 accept
sudo nft add rule inet filter forward ct state {established, related} accept
# Reject any other traffic between peers
sudo nft add rule inet filter forward iifname "wg0" oifname "wg0" reject with icmp type admin-prohibited

You can persist this change by editing your /etc/nftables.conf file to look like this.

sudo vi /etc/nftables.conf
#!/usr/sbin/nft -f

flush ruleset

table inet filter {
        chain input {
                type filter hook input priority 0;
        }
        chain forward {
                type filter hook forward priority 0;

                # Accept admin traffic
                iifname "wg0" ip saddr 192.168.100.2 accept
                iifname "wg0" ct state {established, related} accept

                # Reject other traffic between peers
                iifname "wg0" oifname "wg0" reject with icmp type admin-prohibited
        }
        chain output {
                type filter hook output priority 0;
        }
}
table ip nat {
        chain postrouting {
                type nat hook postrouting priority srcnat; policy accept;
                masquerade
        }
}

Note: The syntax of the file is slightly different than the command. You can use nft list ruleset to see how nft commands translate into running rules. If you get desperate, you can install iptables to add old examples (they get translated on the fly into nft) and list the rules to see how they turn out.

4.1.1.3 - Remote Mgmt

In this scenario peers initiate connections to the central server, making their way through NAT and firewalls, but you don’t want to forward their traffic.

Central Server Config

No forwarding or masquerade is desired, so there is no additional configuration to the central server.

Client Config

The remote peer - the one you created when setting up the server - is already set up, with one exception: a keep-alive.

When the remote peer establishes its connection to the central server, intervening firewalls allow the server to talk back, as they assume it’s a response. However, the firewall will eventually ‘close’ this window unless the client keeps sending occasional traffic to ‘keep alive’ the connection.

# Add this to the bottom of your client's conf file
PersistentKeepalive = 20

Firewall Rules

You should apply some controls to your clients to prevent them from talking to each other (and possibly the server). You also need a rule for the admin station. You can do this by adding rules to the forward chain.

# Allow an 'admin' peer at .2 full access to others and accept their replies
sudo nft add rule inet filter forward iifname "wg0" ip saddr 192.168.100.2 accept
sudo nft add rule inet filter forward ct state {established, related} accept
# Reject any other traffic between peers
sudo nft add rule inet filter forward iifname "wg0" oifname "wg0" reject with icmp type admin-prohibited

You can persist this change by editing your /etc/nftables.conf file to look like this.

sudo vi /etc/nftables.conf
#!/usr/sbin/nft -f

flush ruleset

table inet filter {
        chain input {
                type filter hook input priority 0;
        }
        chain forward {
                type filter hook forward priority 0;

                # Accept admin traffic
                iifname "wg0" ip saddr 192.168.100.2 accept
                iifname "wg0" ct state {established, related} accept

                # Reject other traffic between peers
                iifname "wg0" oifname "wg0" reject with icmp type admin-prohibited
        }
        chain output {
                type filter hook output priority 0;
        }
}
table ip nat {
        chain postrouting {
                type nat hook postrouting priority srcnat; policy accept;
                masquerade
        }
}

4.1.1.4 - Routing

Rather than masquerade, your wireguard server can forward traffic with the VPN addresses intact. You must handle that on your network in one of the following ways.

Symmetric Routing

Classically, you’d treat the wireguard server like any other router. You’d create a management interface and/or a routing interface and advertise routes appropriately.

On a small network, you would simply overlay an additional IP range on top of the existing one by adding a second IP address to your router, and put your wireguard server on that network. Your local servers will see the VPN-addressed clients and send traffic to the router, which will pass it to the wireguard server.

Asymmetric Routing

In a small network you might have the central peer on the same network as the other servers. In this case, it will be acting like a router and forwarding traffic, but the other servers won’t know about it and so will send replies back to their default gateway.

To remedy this, add a static route at the gateway for the VPN range that sends traffic back to the central peer. Asymmetry is generally frowned upon, but it gets the job done with one less hop.

Host Static Routing

You can also configure the servers in question with a static route for VPN traffic so they know to send it directly back to the Wireguard server. This is fastest, but you have to visit every host (though you can use DHCP to distribute this route in some cases).
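
As a rough sketch, the static route looks the same whether it lives on the gateway or on an individual host. Here 192.168.1.50 is a hypothetical LAN address for the wireguard server and 192.168.100.0/24 is the VPN range from earlier.

ip route add 192.168.100.0/24 via 192.168.1.50

# On a Debian-style host you could persist it in /etc/network/interfaces with
#   post-up ip route add 192.168.100.0/24 via 192.168.1.50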

4.1.1.5 - LibreELEC

LibreELEC and CoreELEC are Linux-based open source software appliances for running the Kodi media player. These can be used as kiosk displays and you can remotely manage them with wireguard.

Create a Wireguard Service

These systems have wireguard support, but use connman, which lacks split-tunnel ability. This forces all traffic through the VPN and so is unsuitable for remote management. To enable split-tunnel, create a wireguard service instead.

Create a service unit file

vi /storage/.config/system.d/wg0.service
[Unit]
Description=start wireguard interface

# The network-online service isn't guaranteed to work on *ELEC
#Requires=network-online.service

After=time-sync.target
Before=kodi.service

[Service]
Type=oneshot
RemainAfterExit=true
StandardOutput=journal

# Need to check DNS is responding before we proceed
ExecStartPre=/bin/bash -c 'until nslookup google.com; do sleep 1; done'

ExecStart=ip link add dev wg0 type wireguard
ExecStart=ip address add dev wg0 10.1.1.3/24
ExecStart=wg setconf wg0 /storage/.config/wireguard/wg0.conf
ExecStart=ip link set up dev wg0

ExecStop=ip link set down dev wg0
ExecStop=ip address del dev wg0 10.1.1.3/24
ExecStop=ip link del dev wg0

[Install]
WantedBy=multi-user.target

Create a Wireguard Config File

Note: This isn’t exactly the same file wg-quick uses, just close enough to confuse.

vi /storage/.config/wireguard/wg0.conf
[Interface]
PrivateKey = XXXXXXXXXXXXXXX

[Peer]
PublicKey = XXXXXXXXXXXXXXX
AllowedIPs = 10.1.1.0/24
Endpoint = endpoint.hostname:31194
PersistentKeepalive = 25

Enable and Test

systemctl enable --now wg0.service
ping 10.1.1.1

Create a Cron Check

When using a DNS name for the endpoint you may become disconnected. To catch this, use a cron job

# Use the internal wireguard IP address of the peer you are connecting to. .1 in this case
crontab -e
*/5 * * * * ping -c1 -W5 10.1.1.1 || ( systemctl stop wg0; sleep 5; systemctl start wg0 )

4.1.1.6 - TrueNAS Scale

You can remotely manage TrueNAS Scale via Wireguard by adding it as a service.

Wireguard is installed by default, though not exposed in the GUI. To add a wg interface, create a config file and add a wg-quick service via systemd. Add iptables port-forwarding to access containerized apps.

Configuration

Add a basic peer config as when setting up a Central Server and save the file on the client as /etc/wireguard/wg1.conf. It’s rumored that wg0 is reserved for the TrueNAS cloud service. Once the config is in place, use the wg-quick up wg1 command to test, then enable as below.

nano /etc/wireguard/wg1.conf

systemctl enable --now wg-quick@wg1

If you use a domain name for the remote peer this service will fail at system boot rather than wait for DNS. Add a pre-start to the service file to specifically test name resolution.

vi /lib/systemd/system/[email protected]

[Service] 
...
...
ExecStartPre=/bin/bash -c 'until host google.com; do sleep 1; done'

Note: Don’t include a DNS server in your wireguard settings or everything on the NAS will attempt to use your remote DNS and fail if the link goes down.

Accessing Hosted Apps

You can access the TrueNAS web interface via the wg interface, but hosted apps seem specifically bound to the physical NIC address by the way Kubernetes is forwarding traffic. Selecting host as the network type in the app doesn’t seem to help, but you can add a command like this

iptables -t nat -A PREROUTING --dst 192.168.30.11 -p tcp --dport 20910 -j DNAT --to-destination 192.168.1.129:20910

You may want to make this permanent^[4].
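
One hedged way to do that is to keep the rule in a small script on a pool and run it post-init via System Settings -> Advanced -> Init/Shutdown Scripts. The path below is hypothetical.

# Hypothetical script location on a pool that survives upgrades
cat << 'EOF' > /mnt/tank/scripts/app-portforward.sh
#!/bin/bash
iptables -t nat -A PREROUTING --dst 192.168.30.11 -p tcp --dport 20910 -j DNAT --to-destination 192.168.1.129:20910
EOF
chmod +x /mnt/tank/scripts/app-portforward.sh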

Troubleshooting

Fall-Back Cron Job

If the service proves unreliable, it’s possible to add a cron job as a fall-back.

crontab -e

*/5 * * * * ping -c1 -W5 10.0.0.1 || ( cp /root/wg1.conf /etc/wireguard/ ; wg-quick down wg1 ; wg-quick up wg1 )

The cp command is in case an upgrade removes the config. However, upgrades also remove cron jobs so some other method should be devised.

Cronjob Fails

cronjob kills interface when it can’t ping

or

/usr/local/bin/wg-quick: line 32: resolvconf: command not found

Calling wg-quick via cron causes a resolvconf issue, even though it works at the command line. One solution is to remove any DNS config from your wg conf file so it doesn’t try to register the remote DNS server.

Nov 08 08:23:59 truenas wg-quick[2668]: Name or service not known: `some.server.org:port'
Nov 08 08:23:59 truenas wg-quick[2668]: Configuration parsing error …
Nov 08 08:23:59 truenas systemd[1]: Failed to start WireGuard via wg-quick(8) for wg1.

The DNS service isn’t available (yet), despite Requires=network-online.target nss-lookup.target already in the service unit file. One way to solve this is a pre-exec in the Service section of the unit file^[3]. This is hacky, but none of the normal directives worked.

The cron job above will bring the service up eventually, but it’s nice to have it at boot.

Upgrade Kills Connection

The Bluefin upgrade seems to have removed or disabled existing cronjobs and wireguard configs. This might be due to not putting them in through the GUI. You may be able to put a copy of the wg.conf on a pool and use the GUI to add a more persistent cronjob

https://www.truenas.com/docs/scale/scaletutorials/systemsettings/advanced/managecronjobsscale/

Notes

https://www.truenas.com/docs/core/coretutorials/network/wireguard/
https://www.truenas.com/community/threads/no-internet-connection-with-wireguard-on-truenas-scale-21-06-beta-1.94843/#post-693601
[3]: https://serverfault.com/questions/867830/systemd-start-service-only-after-dns-is-available
[4]: https://serverfault.com/questions/1046065/how-to-port-forward-from-enp7s0-to-localhost80

4.1.1.7 - Proxmox

Proxmox is frequently used for its ability to mix Linux Containers and Virtual Machines. Containers are ideal for their low overhead, but they use the host’s kernel, so you must load the wireguard module there.

Prepare Proxmox

The Wireguard kernel module is now available on proxmox, so all you need to do is load it.

apt install wireguard
modprobe wireguard
# This step may no longer be required, but haven't tested
echo "wireguard" >> /etc/modules-load.d/modules.conf

Edit the container’s config

Add the TUN device to the container’s lxc conf file on the host

lxc.mount.entry = /dev/net/tun dev/net/tun none bind create=file
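
For example, if the container ID were 101 (hypothetical), the host-side file is /etc/pve/lxc/101.conf; append the line and restart the container.

# On the Proxmox host - 101 is a hypothetical container ID
echo "lxc.mount.entry = /dev/net/tun dev/net/tun none bind create=file" >> /etc/pve/lxc/101.conf
pct stop 101 && pct start 101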

5 - Security

5.1 - CrowdSec

5.1.1 - Installation

Overview

CrowdSec has two main parts; detection and interdiction.

Detection is handled by the main CrowdSec binary. You tell it what files to keep an eye on, how to parse those files, and what something ‘bad’ looks like. It then keeps a list of IPs that have done bad things.

Interdiction is handled by any number of plugins called ‘bouncers’, so named because they block access or kick out bad IPs. They run independently and keep an eye on the list, to do things like edit the firewall to block access for a bad IP.

There is also the ‘crowd’ part. The CrowdSec binary downloads IPs of known bad-actors from the cloud for your bouncers to keep out and submits alerts from your systems.

Installation

With Debian, you can simply add the repo via their script and install with a couple lines.

curl -s https://packagecloud.io/install/repositories/crowdsec/crowdsec/script.deb.sh | sudo bash
sudo apt install crowdsec
sudo apt install crowdsec-firewall-bouncer-nftables

This installs both the detection (crowdsec) and the interdiction (crowdsec-firewall-bouncer) parts. Assuming everything went well, crowdsec will check in with the cloud and download a baseline list of known bad-actors, the firewall-bouncer will set up a basic drop list in the firewall, and crowdsec will start watching your syslog for intrusion attempts.

# Check out the very long drop list
sudo nft list ruleset | less

Configuration

CrowdSec comes pre-configured to watch for ssh brute-force attacks. If you have specific services to watch you can add those as described below.

Add a Service

You probably want to watch a specific service, like a web server. Take a look at https://hub.crowdsec.net/ to see all the available components. For example, browse the collections and search for caddy. The more info link will show you how to install the collection:

sudo cscli collections list -a
sudo cscli collections install crowdsecurity/caddy

Tell CrowdSec where Caddy’s log files are.

sudo tee -a /etc/crowdsec/acquis.yaml << EOF

---
filenames:
 - /var/log/caddy/*.log
labels:
  type: caddy
---
EOF

Reload crowdsec for these changes to take effect

sudo systemctl reload crowdsec

Operation

DataFlow

CrowdSec works by pulling in data from the Acquisition files, Parsing the events, comparing to Scenarios, and then Deciding if action should be taken.

Acquisition of data from log files is based on entries in the acquis.yaml file, and the events given a label as defined in that file.

Those events feed the Parsers. There are a handful by default, but only the ones specifically interested in a given label will see it. They look for keywords like ‘FAILED LOGIN’ and then extract the IP.

Successfully parsed lines are fed to the Scenarios to see if what happened matters. The scenarios look for things like 10 FAILED LOGINs in 1 min. This separates the accidental bad password entry from a brute force attempt.

Matching a scenario gets the IP added to the Decision List, i.e. the list of bad IPs. These have a configurable expiration, so that if you really do guess wrong 10 times in a row, you’re not banned forever.

The bouncers use this list to take action, like a firewall block, and will unblock you after the expiration.

Collections

Parsers and Scenarios work best when they work together so they are usually distributed together as a Collection. You can have collections of collections as well. For example, the base installation comes with the linux collection that includes a few parsers and the sshd collection.

To see what Collections, Parsers and Scenarios are running, use the cscli command line interface.

sudo cscli collections list
sudo cscli collections inspect crowdsecurity/linux
sudo cscli collections inspect crowdsecurity/sshd

Inspecting the collection will tell you what parsers and scenarios it contains, as well as some metrics. To learn more about a collection and its components, you can check out its page:

https://hub.crowdsec.net/author/crowdsecurity/collections/linux

The metrics are a bit confusing until you learn that the ‘Unparsed’ column doesn’t mean unparsed so much as it means a non-event. These are just normal logfile lines that don’t have one of the keywords the parser was looking for, like ‘LOGIN FAIL’.

Status

Is anyone currently attacking you? The decisions list shows you any current bad actors and the alerts list shows you a summary of past decisions. If you are just getting started this is probably none, but if you’re open to the internet this will grow quickly.

sudo cscli decisions list
sudo cscli alerts list

But you are getting events from the cloud and you can check those with the -a option. You’ll notice that every 2 hours the community-blocklist is updated.

sudo cscli alerts list -a

After this collection has been running for a while, you’ll start to see these kinds of alerts

sudo cscli alerts list
╭────┬───────────────────┬───────────────────────────────────────────┬─────────┬────────────────────────┬───────────┬─────────────────────────────────────────╮
│ ID │       value       │                  reason                   │ country │           as           │ decisions │               created_at                │
├────┼───────────────────┼───────────────────────────────────────────┼─────────┼────────────────────────┼───────────┼─────────────────────────────────────────┤
│ 27 │ Ip:18.220.128.229 │ crowdsecurity/http-bad-user-agent         │ US      │ 16509 AMAZON-02        │ ban:1     │ 2023-03-02 13:12:27.948429492 +0000 UTC │
│ 26 │ Ip:18.220.128.229 │ crowdsecurity/http-path-traversal-probing │ US      │ 16509 AMAZON-02        │ ban:1     │ 2023-03-02 13:12:27.979479713 +0000 UTC │
│ 25 │ Ip:18.220.128.229 │ crowdsecurity/http-probing                │ US      │ 16509 AMAZON-02        │ ban:1     │ 2023-03-02 13:12:27.9460075 +0000 UTC   │
│ 24 │ Ip:18.220.128.229 │ crowdsecurity/http-sensitive-files        │ US      │ 16509 AMAZON-02        │ ban:1     │ 2023-03-02 13:12:27.945759433 +0000 UTC │
│ 16 │ Ip:159.223.78.147 │ crowdsecurity/http-probing                │ SG      │ 14061 DIGITALOCEAN-ASN │ ban:1     │ 2023-03-01 23:03:06.818512212 +0000 UTC │
│ 15 │ Ip:159.223.78.147 │ crowdsecurity/http-sensitive-files        │ SG      │ 14061 DIGITALOCEAN-ASN │ ban:1     │ 2023-03-01 23:03:05.814690037 +0000 UTC │
╰────┴───────────────────┴───────────────────────────────────────────┴─────────┴────────────────────────┴───────────┴─────────────────────────────────────────╯

You may even need to unblock yourself

sudo cscli decisions list
sudo cscli decisions delete --id XXXXXXX

Next Steps

You’re now taking advantage of the crowd part of CrowdSec and have added your own service. If you don’t have any alerts though, you may be wondering how well it’s actually working.

Take a look at the detailed activity if you want to look more closely at what’s going on.

5.1.2 - Detailed Activity

Inspecting Metrics

Data comes in through the parsers. To see what they are doing, let’s take a look at the Acquisition and Parser metrics.

sudo cscli metrics

Most of the ‘Acquisition Metrics’ lines will be read and unparsed. This is because normal events are dropped. It only considers lines parsed if they were passed on to a scenario. The ‘bucket’ column refers to event scenarios and is also blank as there were no parsed lines to hand off.

Acquisition Metrics:
╭────────────────────────┬────────────┬──────────────┬────────────────┬────────────────────────╮
│         Source         │ Lines read │ Lines parsed │ Lines unparsed │ Lines poured to bucket │
├────────────────────────┼────────────┼──────────────┼────────────────┼────────────────────────┤
│ file:/var/log/auth.log │ 216        │ -            │ 216            │ -                      │
│ file:/var/log/syslog   │ 143        │ -            │ 143            │ -                      │
╰────────────────────────┴────────────┴──────────────┴────────────────┴────────────────────────╯

The ‘Parser Metrics’ will show the individual parsers - but not all of them. Only parsers that have at least one ‘hit’ are shown. In this example, only the syslog parser shows up. It’s a low-level parser that doesn’t look for matches, so every line is a hit.

Parser Metrics:
╭─────────────────────────────────┬──────┬────────┬──────────╮
│             Parsers             │ Hits │ Parsed │ Unparsed │
├─────────────────────────────────┼──────┼────────┼──────────┤
│ child-crowdsecurity/syslog-logs │ 359  │ 359    │ -        │
│ crowdsecurity/syslog-logs       │ 359  │ 359    │ -        │
╰─────────────────────────────────┴──────┴────────┴──────────╯

However, try a couple failed SSH login attempts and you’ll see them, and how they feed up to the Acquisition Metrics.


Acquisition Metrics:
╭────────────────────────┬────────────┬──────────────┬────────────────┬────────────────────────╮
│         Source         │ Lines read │ Lines parsed │ Lines unparsed │ Lines poured to bucket │
├────────────────────────┼────────────┼──────────────┼────────────────┼────────────────────────┤
│ file:/var/log/auth.log │ 242        │ 3            │ 239            │ -                      │
│ file:/var/log/syslog   │ 195        │ -            │ 195            │ -                      │
╰────────────────────────┴────────────┴──────────────┴────────────────┴────────────────────────╯

Parser Metrics:
╭─────────────────────────────────┬──────┬────────┬──────────╮
│             Parsers             │ Hits │ Parsed │ Unparsed │
├─────────────────────────────────┼──────┼────────┼──────────┤
│ child-crowdsecurity/sshd-logs   │ 61   │ 3      │ 58       │
│ child-crowdsecurity/syslog-logs │ 442  │ 442    │ -        │
│ crowdsecurity/dateparse-enrich  │ 3    │ 3      │ -        │
│ crowdsecurity/geoip-enrich      │ 3    │ 3      │ -        │
│ crowdsecurity/sshd-logs         │ 8    │ 3      │ 5        │
│ crowdsecurity/syslog-logs       │ 442  │ 442    │ -        │
│ crowdsecurity/whitelists        │ 3    │ 3      │ -        │
╰─────────────────────────────────┴──────┴────────┴──────────╯

Lines poured to bucket, however, is still empty. That means the scenarios decided it wasn’t a hack attempt. With SSH timeouts it’s actually hard to trigger without a tool. Plus, you may notice the ‘whitelists’ parser was triggered. Private IP ranges are whitelisted by default so you can’t lock yourself out from inside.

Let’s ask crowdsec to explain what’s going on

Detailed Parsing

To see which parsers got involved and what they did, you can ask.

sudo cscli explain --file /var/log/auth.log --type syslog

Here’s an ssh example of a failed login. The numbers, such as (+9 ~1), mean that the parser added 9 elements it parsed from the raw event, and updated 1. Notice the whitelists parser at the end. It’s catching this event and dropping it, hence the ‘parser failure’.

line: Mar  1 14:08:11 www sshd[199701]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=192.168.1.16  user=allen
        ├ s00-raw
        |       └ 🟢 crowdsecurity/syslog-logs (first_parser)
        ├ s01-parse
        |       └ 🟢 crowdsecurity/sshd-logs (+9 ~1)
        ├ s02-enrich
        |       ├ 🟢 crowdsecurity/dateparse-enrich (+2 ~1)
        |       ├ 🟢 crowdsecurity/geoip-enrich (+9)
        |       └ 🟢 crowdsecurity/whitelists (~2 [whitelisted])
        └-------- parser failure 🔴

Why exactly did it get whitelisted? Let’s ask for a verbose report.

sudo cscli explain -v --file /var/log/auth.log --type syslog
line: Mar  1 14:08:11 www sshd[199701]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=192.168.1.16  user=someGuy
        ├ s00-raw
        |       └ 🟢 crowdsecurity/syslog-logs (first_parser)
        ├ s01-parse
        |       └ 🟢 crowdsecurity/sshd-logs (+9 ~1)
        |               └ update evt.Stage : s01-parse -> s02-enrich
        |               └ create evt.Parsed.sshd_client_ip : 192.168.1.16
        |               └ create evt.Parsed.uid : 0
        |               └ create evt.Parsed.euid : 0
        |               └ create evt.Parsed.pam_type : unix
        |               └ create evt.Parsed.sshd_invalid_user : someGuy
        |               └ create evt.Meta.service : ssh
        |               └ create evt.Meta.source_ip : 192.168.1.16
        |               └ create evt.Meta.target_user : someGuy
        |               └ create evt.Meta.log_type : ssh_failed-auth
        ├ s02-enrich
        |       ├ 🟢 crowdsecurity/dateparse-enrich (+2 ~1)
        |               ├ create evt.Enriched.MarshaledTime : 2023-03-01T14:08:11Z
        |               ├ update evt.MarshaledTime :  -> 2023-03-01T14:08:11Z
        |               ├ create evt.Meta.timestamp : 2023-03-01T14:08:11Z
        |       ├ 🟢 crowdsecurity/geoip-enrich (+9)
        |               ├ create evt.Enriched.Longitude : 0.000000
        |               ├ create evt.Enriched.ASNNumber : 0
        |               ├ create evt.Enriched.ASNOrg : 
        |               ├ create evt.Enriched.ASNumber : 0
        |               ├ create evt.Enriched.IsInEU : false
        |               ├ create evt.Enriched.IsoCode : 
        |               ├ create evt.Enriched.Latitude : 0.000000
        |               ├ create evt.Meta.IsInEU : false
        |               ├ create evt.Meta.ASNNumber : 0
        |       └ 🟢 crowdsecurity/whitelists (~2 [whitelisted])
        |               └ update evt.Whitelisted : %!s(bool=false) -> true
        |               └ update evt.WhitelistReason :  -> private ipv4/ipv6 ip/ranges
        └-------- parser failure 🔴

This shows the actual data and at the bottom, parser crowdsecurity/whitelists has updated the property ’evt.Whitelisted’ to true and gave it a reason. That property appears to be a built-in that flags events to be dropped.

If you want to change the ranges, you can edit the logic by editing the yaml file. A sudo cscli hub list will show you which file that is. Add or remove entries from the lists it checks the ‘ip’ value and ‘cidr’ value against. Any match causes whitelist to become true.
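
As a sketch, the stock file looks something like this (your installed copy may differ slightly):

name: crowdsecurity/whitelists
description: "Whitelist events from private ipv4 addresses"
whitelist:
  reason: "private ipv4/ipv6 ip/ranges"
  ip:
    - "127.0.0.1"
    - "::1"
  cidr:
    - "192.168.0.0/16"
    - "10.0.0.0/8"
    - "172.16.0.0/12"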

False Positives

You may see a high percentage of ‘Lines poured to bucket’ relative to ‘Lines read’, like in this example where almost all are. Some lines trigger two scenarios, which is when the ‘bucket’ count is greater than the number ‘parsed’.

Acquisition Metrics:
╭────────────────────────────────┬────────────┬──────────────┬────────────────┬────────────────────────╮
│             Source             │ Lines read │ Lines parsed │ Lines unparsed │ Lines poured to bucket │
├────────────────────────────────┼────────────┼──────────────┼────────────────┼────────────────────────┤
│ file:/var/log/auth.log         │ 69         │ -            │ 69             │ -                      │
│ file:/var/log/caddy/access.log │ 21         │ 21           │ -              │ 32                     │
│ file:/var/log/syslog           │ 2          │ -            │ 2              │ -                      │
╰────────────────────────────────┴────────────┴──────────────┴────────────────┴────────────────────────╯

Sometimes that’s OK, as not all scenarios are designed to take instant action. The ‘http-crawl-non_statics’ scenario had 17 events and was considering action against 2 IPs, but never ‘Overflowed’, aka took action.

The http-probing scenario did, however, and one of the two IPs had action taken against them.

Bucket Metrics:
╭──────────────────────────────────────┬───────────────┬───────────┬──────────────┬────────┬─────────╮
│                Bucket                │ Current Count │ Overflows │ Instantiated │ Poured │ Expired │
├──────────────────────────────────────┼───────────────┼───────────┼──────────────┼────────┼─────────┤
│ crowdsecurity/http-crawl-non_statics │ -             │ -         │ 2            │ 17     │ 2       │
│ crowdsecurity/http-probing           │ -             │ 1         │ 2            │ 15     │ 1       │
╰──────────────────────────────────────┴───────────────┴───────────┴──────────────┴────────┴─────────╯

You can ask crowdsec to explain what’s going on with a -v and see that clients are asking for things that don’t exist.

  ├ s00-raw
  | ├ 🟢 crowdsecurity/non-syslog (first_parser)
  | └ 🔴 crowdsecurity/syslog-logs
  ├ s01-parse
  | └ 🟢 crowdsecurity/caddy-logs (+19 ~2)
  |   └ update evt.Stage : s01-parse -> s02-enrich
  |   └ create evt.Parsed.request : /0/icon/Forman,%20M.L.%20
  |   ...
  |   └ create evt.Meta.http_status : 404
  |   ...
  ├-------- parser success 🟢
  ├ Scenarios
    ├ 🟢 crowdsecurity/http-crawl-non_statics
    └ 🟢 crowdsecurity/http-probing

If you look at the rules (sudo cscli hub list) for http-probing, you’ll see it looks for 404s (file not found). If you get more than 10 in 10 seconds, it ‘overflows’ and the IP gets banned.

Whitelist

The trouble is, some web apps generate a lot of 404s as they probe for page elements that may not exist, and that generates bans. In this case, we must whitelist the application with an expression that checks whether it was an icon request, like above.

sudo vi /etc/crowdsec/parsers/s02-enrich/some-app-whitelist.yaml
name: crowdsecurity/whitelists 
description: "Whitelist 404s for icon requests" 
whitelist: 
  reason: "icon request" 
  expression:   
    - evt.Parsed.request startsWith '/0/icon/'

5.1.3 - Custom Parser

When checking out the detailed metrics you may find that log entries aren’t being parsed. Maybe the log format has changed or you’re logging additional data the author didn’t anticipate. The best thing is to add your own parser.

Types of Parsers

There are several types of parsers and they are used in stages. Some are designed to work with the raw log entries while others are designed to take pre-parsed data and add to or enrich it. This way you can do branching and not every parser needs to know how to read a syslog message.

Their Local Path will tell you what stage they kick in at. Use sudo cscli parsers list to display the details. s00-raw works with the ‘raw’ files while s01 and s02 work further down the pipeline. Currently, you can only create s00 and s01 level parsers.

Integrating with Scenarios

Useful parsers supply data that Scenarios are interested in. You can create a parser that watches the system logs for ‘FOOBAR’ entries, extracts the ‘FOOBAR-LEVEL’, and passes it on. But if nothing is looking for ‘FOOBARs’ then nothing will happen.

Let’s say you’ve added the Caddy collection. It’s pulled in a bunch of Scenarios you can view with sudo cscli scenarios list. If you look at one of the associated files you’ll see a filter section where they look for ’evt.Meta.http_path’ and ’evt.Parsed.verb’. They are all different though, so how do you know what data to supply?
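
For illustration only - this is not the actual Caddy scenario, just the general shape - a scenario file pairs a filter with leaky-bucket settings:

# Illustrative only; the name, filter fields, and thresholds here are made up
type: leaky
name: example/http-404-probing
description: "Ban IPs that generate a burst of 404s"
filter: "evt.Meta.service == 'http' && evt.Meta.http_status == '404'"
groupby: evt.Meta.source_ip
capacity: 10
leakspeed: "10s"
blackhole: 5m
labels:
  remediation: true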

Your best bet is to take an existing parser and modify it.

Examples

Note - CrowdSec is pretty awesome, and after talking in the discord they’ve already accommodated both these scenarios within a release cycle or two. So these two examples are solved. I’m sure you’ll find new ones, though ;-)

A Web Example

Let’s say that you’ve installed the Caddy collection, but you’ve noticed basic auth login failures don’t trigger the parser. So let’s add a new file and edit it.

sudo cp /etc/crowdsec/parsers/s01-parse/caddy-logs.yaml /etc/crowdsec/parsers/s01-parse/caddy-logs-custom.yaml

You’ll notice two top-level sections where the parsing happens, nodes and statics, along with some grok pattern matching.

Nodes allow you to try multiple patterns and if any match, the whole section is considered successful. I.e. if the log could have either the standard HTTPDATE or a CUSTOMDATE, as long as it has one it’s good and the matching can move on. Statics just go down the list extracting data. If any fail, the whole event is considered a fail and dropped as unparseable.

All the parsed data gets attached to the event as ’evt.Parsed.something’ and some of the statics move it into the evt values the Scenarios will be looking for. Caddy logs are JSON formatted, and so basically already parsed, and this example makes use of the JsonExtract method quite a bit.

# We added the caddy logs in the acquis.yaml file with the label 'caddy' and so we use that as our filter here
filter: "evt.Parsed.program startsWith 'caddy'"
onsuccess: next_stage
# debug: true
name: caddy-logs-custom
description: "Parse custom caddy logs"
pattern_syntax:
 CUSTOMDATE: '%{DAY:day}, %{MONTHDAY:monthday} %{MONTH:month} %{YEAR:year} %{TIME:time} %{WORD:tz}'
nodes:
  - nodes:
    - grok:
        pattern: '%{NOTSPACE} %{NOTSPACE} %{NOTSPACE} \[%{HTTPDATE:timestamp}\]%{DATA}'
        expression: JsonExtract(evt.Line.Raw, "common_log")
        statics:
          - target: evt.StrTime
            expression: evt.Parsed.timestamp
    - grok:
        pattern: "%{CUSTOMDATE:timestamp}"
        expression: JsonExtract(evt.Line.Raw, "resp_headers.Date[0]")
        statics:
          - target: evt.StrTime
            expression: evt.Parsed.day + " " + evt.Parsed.month + " " + evt.Parsed.monthday + " " + evt.Parsed.time + ".000000" + " " + evt.Parsed.year
    - grok:
        pattern: '%{IPORHOST:remote_addr}:%{NUMBER}'
        expression: JsonExtract(evt.Line.Raw, "request.remote_addr")
    - grok:
        pattern: '%{IPORHOST:remote_ip}'
        expression: JsonExtract(evt.Line.Raw, "request.remote_ip")
    - grok:
        pattern: '\["%{NOTDQUOTE:http_user_agent}\"]'
        expression: JsonExtract(evt.Line.Raw, "request.headers.User-Agent")
statics:
  - meta: log_type
    value: http_access-log
  - meta: service
    value: http
  - meta: source_ip
    expression: evt.Parsed.remote_addr
  - meta: source_ip
    expression: evt.Parsed.remote_ip
  - meta: http_status
    expression: JsonExtract(evt.Line.Raw, "status")
  - meta: http_path
    expression: JsonExtract(evt.Line.Raw, "request.uri")
  - target: evt.Parsed.request #Add for http-logs enricher
    expression: JsonExtract(evt.Line.Raw, "request.uri")
  - parsed: verb
    expression: JsonExtract(evt.Line.Raw, "request.method")
  - meta: http_verb
    expression: JsonExtract(evt.Line.Raw, "request.method")
  - meta: http_user_agent
    expression: evt.Parsed.http_user_agent
  - meta: target_fqdn
    expression: JsonExtract(evt.Line.Raw, "request.host")
  - meta: sub_type
    expression: "JsonExtract(evt.Line.Raw, 'status') == '401' && JsonExtract(evt.Line.Raw, 'request.headers.Authorization[0]') startsWith 'Basic ' ? 'auth_fail' : ''"

The very last line is where a status 401 is checked. It looks for a 401 and a request for Basic auth. However, this misses events where someone asks for a resource that is protected and the server responds telling them Basic is needed, i.e. when a bot is poking at URLs on your server while ignoring the prompts to login. You can look at the log entries more easily with this command to follow the log and decode it while you recreate failed attempts.

sudo tail -f /var/log/caddy/access.log | jq

To change this, update the expression to also check the response header with an additional || (or) condition.

    expression: "JsonExtract(evt.Line.Raw, 'status') == '401' && (JsonExtract(evt.Line.Raw, 'request.headers.Authorization[0]') startsWith 'Basic ' || JsonExtract(evt.Line.Raw, 'resp_headers.Www-Authenticate[0]') startsWith 'Basic ') ? 'auth_fail' : ''"

Syslog Example

Let’s say you’re using dropbear and failed logins are not being picked up by the ssh parser

To see what’s going on, you use the crowdsec command line interface. The shell command is cscli and you can ask it about its metrics to see how many lines it has parsed and if any of them are suspicious. Since we just restarted, you may not have any syslog lines yet, so let’s add some and check.

ssh [email protected]
logger "This is an innocuous message"

cscli metrics
INFO[28-06-2022 02:41:33 PM] Acquisition Metrics:
+------------------------+------------+--------------+----------------+------------------------+
|         SOURCE         | LINES READ | LINES PARSED | LINES UNPARSED | LINES POURED TO BUCKET |
+------------------------+------------+--------------+----------------+------------------------+
| file:/var/log/messages | 1          | -            | 1              | -                      |
+------------------------+------------+--------------+----------------+------------------------+

Notice that the line we just read is unparsed and that’s OK. That just means it wasn’t an entry the parser cared about. Let’s see if it responds to an actual failed login.

dbclient some.remote.host

# Enter some bad passwords and then exit with a Ctrl-C. Remember, localhost attempts are whitelisted so you must be remote.
[email protected]'s password:
[email protected]'s password:

cscli metrics
INFO[28-06-2022 02:49:51 PM] Acquisition Metrics:
+------------------------+------------+--------------+----------------+------------------------+
|         SOURCE         | LINES READ | LINES PARSED | LINES UNPARSED | LINES POURED TO BUCKET |
+------------------------+------------+--------------+----------------+------------------------+
| file:/var/log/messages | 7          | -            | 7              | -                      |
+------------------------+------------+--------------+----------------+------------------------+

Well, no luck. We will need to adjust the parser

sudo cp /etc/crowdsec/parsers/s01-parse/sshd-logs.yaml /etc/crowdsec/parsers/s01-parse/sshd-logs-custom.yaml

Take a look at the logfile and copy an example line over to https://grokdebugger.com/. Use a pattern like

Bad PAM password attempt for '%{DATA:user}' from %{IP:source_ip}:%{INT:port}

Assuming you get the pattern worked out, you can then add a section to the bottom of the custom parser file you created.

  - grok:
      name: "SSHD_AUTH_FAIL"
      pattern: "Login attempt for nonexistent user from %{IP:source_ip}:%{INT:port}"
      apply_on: message

5.1.4 - On Alpine

Install

There are some packages available, but (as of 2022) they are a bit behind and don’t include the config and service files. So let’s download the latest binaries from CrowdSec and create our own.

Download the current release

Note: Download the static versions. Alpine uses a different libc than other distros.

cd /tmp
wget https://github.com/crowdsecurity/crowdsec/releases/latest/download/crowdsec-release-static.tgz
wget https://github.com/crowdsecurity/cs-firewall-bouncer/releases/latest/download/crowdsec-firewall-bouncer.tgz

tar xzf crowdsec-firewall*
tar xzf crowdsec-release*
rm *.tgz

Install Crowdsec and Register with The Central API

You cannot use the wizard as it expects systemd and doesn’t support OpenRC. Follow the Binary Install steps from CrowdSec’s binary instructions.

sudo apk add bash newt envsubst
cd /tmp/crowdsec-v*

# Docker mode skips configuring systemd
sudo ./wizard.sh --docker-mode

sudo cscli hub update
sudo cscli machines add -a
sudo cscli capi register

# A collection is just a bunch of parsers and scenarios bundled together for convenience
sudo cscli collections install crowdsecurity/linux 

Install The Firewall Bouncer

We need a netfilter tool so install nftables. If you already have iptables installed you can skip this step and set FW_BACKEND to that below when generating the API keys.

sudo apk add nftables

Now we install the firewall bouncer. There is no static build of the firewall bouncer yet from CrowdSec, but you can get one from Alpine testing (if you don’t want to compile it yourself)

# Change from 'edge' to other versions as needed
echo "http://dl-cdn.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories
apk update
apk add cs-firewall-bouncer

Now configure the bouncer. We will once again do this manually because there is no support for non-systemd Linuxes with the install script. But cribbing from their install script, we see we can:

cd /tmp/crowdsec-firewall*

BIN_PATH_INSTALLED="/usr/local/bin/crowdsec-firewall-bouncer"
BIN_PATH="./crowdsec-firewall-bouncer"
sudo install -v -m 755 -D "${BIN_PATH}" "${BIN_PATH_INSTALLED}"

CONFIG_DIR="/etc/crowdsec/bouncers/"
sudo mkdir -p "${CONFIG_DIR}"
sudo install -m 0600 "./config/crowdsec-firewall-bouncer.yaml" "${CONFIG_DIR}crowdsec-firewall-bouncer.yaml"

Generate The API Keys

Note: If you used the APK, just do the first two lines to get the API_KEY (echo $API_KEY) and manually edit the file (vim /etc/crowdsec/bouncers/crowdsec-firewall-bouncer.yaml)

cd /tmp/crowdsec-firewall*
CONFIG_DIR="/etc/crowdsec/bouncers/"

SUFFIX=`tr -dc A-Za-z0-9 </dev/urandom | head -c 8`
API_KEY=`sudo cscli bouncers add cs-firewall-bouncer-${SUFFIX} -o raw`
FW_BACKEND="nftables"
API_KEY=${API_KEY} BACKEND=${FW_BACKEND} envsubst < ./config/crowdsec-firewall-bouncer.yaml | sudo install -m 0600 /dev/stdin "${CONFIG_DIR}crowdsec-firewall-bouncer.yaml"

Create RC Service Files

sudo touch /etc/init.d/crowdsec
sudo chmod +x /etc/init.d/crowdsec
sudo rc-update add crowdsec

sudo vim /etc/init.d/crowdsec
#!/sbin/openrc-run

command=/usr/local/bin/crowdsec
command_background=true

pidfile="/run/${RC_SVCNAME}.pid"

depend() {
   need localmount
   need net
}

Note: If you used the package from Alpine testing above it came with a service file. Just rc-update add cs-firewall-bouncer and skip this next step.

sudo touch /etc/init.d/cs-firewall-bouncer
sudo chmod +x /etc/init.d/cs-firewall-bouncer
sudo rc-update add cs-firewall-bouncer

sudo vim /etc/init.d/cs-firewall-bouncer
#!/sbin/openrc-run

command=/usr/local/bin/crowdsec-firewall-bouncer
command_args="-c /etc/crowdsec/bouncers/crowdsec-firewall-bouncer.yaml"
pidfile="/run/${RC_SVCNAME}.pid"
command_background=true

depend() {
  after firewall
}

Start The Services and Observe The Results

Start up the services and view the logs to see that everything started properly

sudo service crowdsec start
sudo service cs-firewall-bouncer status

sudo tail /var/log/crowdsec.log
sudo tail /var/log/crowdsec-firewall-bouncer.log

# The firewall bouncer should tell you about how it's inserting decisions it got from the hub

sudo cat /var/log/crowdsec-firewall-bouncer.log

time="28-06-2022 13:10:05" level=info msg="backend type : nftables"
time="28-06-2022 13:10:05" level=info msg="nftables initiated"
time="28-06-2022 13:10:05" level=info msg="Processing new and deleted decisions . . ."
time="28-06-2022 14:35:35" level=info msg="100 decisions added"
time="28-06-2022 14:35:45" level=info msg="1150 decisions added"
...
...

# If you are curious about what it's blocking
sudo nft list table crowdsec
...

6 - Web

6.1 - Content Mgmt

There are many ways to manage and produce web content. Traditionally, you’d use a large application with roles and permissions.

A more modern approach is to use a distributed version control system, like git, and a site generator.

Static Site Generators are gaining popularity as they produce static HTML with javascript and CSS that can be deployed to any Content Delivery Network without need for server-side processing.

Astro is great, as is Hugo, with the latter being around longer and having more resources.

6.1.1 - Hugo

Hugo is a Static Site Generator (SSG) that turns Markdown files into static web pages that can be deployed anywhere.

Like WordPress, you apply a ’theme’ to style your content. But rather than use a web interface to create content, you directly edit the content in markdown files. This lends itself well to managing the content as code and appeals to those who prefer editing text.

However, unlike other SSGs, you don’t have to be a front-end developer to get great results and you can jump in with a minimal investment of time.

6.1.1.1 - Hugo Install

Requirements

I use Debian in this example, but any apt-based distro will be similar.

Preparation

Enable and pin the Debian Backports and Testing repos so you can get recent versions of Hugo and needed tools.

–> Enable and Pin

Installation

Hugo requires git and go

# Assuming you have enabled backports as per above
sudo apt install -t bullseye-backports git
sudo apt install -t bullseye-backports golang-go

For a recent version of Hugo you’ll need to go to the testing repo. The extended version is recommended by Hugo and it’s chosen by default.

# This pulls in a number of other required packages, so take a close look at the messages for any conflicts. It's normally fine, though. 
sudo apt install -t testing  hugo

Configuration

A quick test right from the quickstart page to make sure everything works

hugo new site quickstart
cd quickstart
git init
git submodule add https://github.com/theNewDynamic/gohugo-theme-ananke themes/ananke
echo "theme = 'ananke'" >> config.toml
hugo server

Open up a browser to http://localhost:1313/ and you’ll see the default ananke-themed site.

Next Steps

The ananke theme you just deployed is nice, but a much better theme is Docsy. Go give that a try.

–> Deploy Docsy on Hugo

6.1.1.2 - Docsy Install

Docsy is a good-looking Hugo theme that provides a landing page, blog, and documentation sub-site using bootstrap CSS.

The documentation site in particular lets you turn a directory of text files into a documentation tree with relative ease. It even has a collapsible left nav bar. That is harder to find than you’d think.

Preparation

Docsy requires Hugo. Install that if you haven’t already. It also needs a few other things; postcss, postcss-cli, and autoprefixer from the Node.JS ecosystem. These should be installed in the project directory as version requirements change per theme.

mkdir some.site.org
cd some.site.org
sudo apt install -t testing nodejs npm
npm install -D autoprefixer 
npm install -D postcss
npm install -D postcss-cli

Installation

Deploy Docsy as a Hugo module and pull in the example site so we have a skeleton to work with. We’re using git, but we’ll keep it local for now.

git clone https://github.com/google/docsy-example.git .
hugo server

Browse to http://localhost:1313 and you should see the demo “Goldydocs” site.

Now you can proceed to configure Docsy!

Updating

The Docsy theme gets regular updates. To incorporate those you only have to run this command. Do this now, actually, to get any theme updates the example site hasn’t incorporated yet.

cd /path/to/my-existing-site
hugo mod get -u github.com/google/docsy

Troubleshooting

hugo

Error: Error building site: POSTCSS: failed to transform “scss/main.css” (text/css)>: Error: Loading PostCSS Plugin failed: Cannot find module ‘autoprefixer’

And then when you try to install the missing module

The following packages have unmet dependencies: nodejs : Conflicts: npm npm : Depends: node-cacache but it is not going to be installed

You may already have installed Node.JS. Skip trying to install it from the OS’s repo and see if npm works. Then proceed with the postcss install and such.

6.1.1.3 - Docsy Config

Let’s change the basics of the site in the config.toml file. I put some quick sed commands here, but you can edit by hand as well. Of note is the Github integration. We prepopulate it here for future use, as it allows quick edits in your browser down the road.

SITE=some.site.org
GITHUBID=someUserID
sed -i "s/Goldydocs/$SITE/" config.toml
sed -i "s/The Docsy Authors/$SITE/" config.toml
sed -i "s/example.com/$SITE/" config.toml
sed -i "s/example.org/$SITE/" config.toml
sed -i "s/google\/docsy-example/$GITHUBID\/$SITE/" config.toml 
sed -i "s/USERNAME\/REPOSITORY/$GITHUBID\/$SITE/" config.toml 
sed -i "s/https:\/\/policies.google.com//" config.toml
sed -i "s/https:\/\/github.com\/google\/docsy/https:\/\/github.com\/$GITHUBID/" config.toml
sed -i "s/github_branch/#github_branch/" config.toml

If you don’t plan to translate your site into different languages, you can dispense with some of the extra languages as well.

# Delete the 20 or so lines starting at "[languages]" and stopping at the "[markup]" section,
# including the english section.
vi config.toml

# Delete the folders from 'content/' as well, leaving 'en'
rm -rf content/fa content/no

You should also set a default meta description, or the engine will put in the bootstrap default and google will summarize all your pages with that.

vi config.toml
[params]
copyright = "some.site.org"
privacy_policy = "/privacy"
description = "My personal website to document what I know and how I did it"

Keep an eye on the site in your browser as you make changes. When you’re ready to start with the main part of adding content, take a look at the next section.

Docsy Operation

Notes

You can’t dispense with the en folder yet, as it breaks some github linking functionality you may want to take advantage of later

6.1.1.4 - Docsy Operate

This is a quick excerpt from the Docsy Content and Customization docs. Definitely spend time with those after reading the overview here.

Directory Layout

Content is, appropriately enough, in the content directory, and its subdirectories line up with the top-level navigation bar of the web site. About, Documentation, etc correspond to content/about, content/docs and so on.

The directories and files you create will be the URL that you get, with one important exception: filenames are converted to a ‘slug’, mimicking how index files work. For example, if you create the file docs/tech/mastadon.md the URL will be /docs/tech/mastadon/. This is for SEO (Search Engine Optimization).

The other thing you’ll see are _index.html files. In the example above, the URL /docs/tech/ has no content, as it’s a folder. But you can add a _index.md or .html to give it some. Avoid creating index.md or tech.md (a file that matches the name of a subdirectory). Either of those will block Hugo from generating content for any subdirectories.
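
Putting that together, a hypothetical layout and the URLs it produces:

content/
├── _index.html           # the landing page, served at /
├── about/
│   └── _index.md         # /about/
└── docs/
    ├── _index.md         # /docs/
    └── tech/
        ├── _index.md     # /docs/tech/
        └── mastadon.md   # /docs/tech/mastadon/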

The Landing Page and Top Nav Pages

The landing page itself is the content/_index.html file and the background is featured-background.jpg. The other top-nav pages are in the content folders with _index files. You may notice the special header variable “menu: main: weight:” and that is what flags a specific page as worthy of being in the top menu. Removing it, or adding it (and a linkTitle), will change the top nav.
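
For example, front matter like this (values illustrative) is what puts a page in the top nav:

---
title: "Documentation"
linkTitle: "Docs"
menu:
  main:
    weight: 20
---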

The Documentation Page and Left Nav Bar

One of the most important features of the Docsy template is the well designed documentation section that features a Section menu, or left nav bar. This menu is built automatically from the files you put in the docs folder, as long as you give them a title. (See Front Matter, below). They are ordered by date but you can add a weight to change that.

It doesn’t collapse by default and if you have a lot of files, you’ll want to enable that.

# Search and set in your config.toml
sidebar_menu_compact = true

Front Matter

The example files have a section at the top like this. It’s not strictly required, but you must have at least a title or the page won’t show up in the left nav tree.

---
title: "Examples"
---

Page Content and Short Codes

In addition to normal markdown or html, you’ll see frequent use of ‘shortcodes’ that do things normal markdown can’t. These are built in to Hugo or added by themes, and look like this;

{{% blocks/lead color="dark" %}}
Some Important Text
{{% /blocks/lead %}}

Diagrams

Docsy supports several tools for creating illustrations from code, such as KaTeX, Mermaid, Diagrams.net, PlantUML, and MarkMap. Simply use a codeblock.

```mermaid
graph LR
 one --> two
```

Generate the Website

Once you’re satisfied with what you’ve got, tell hugo to generate the static files and it will populate the folder we configured earlier.

hugo

Publish the Web Site

Everything you need is in the public folder and all you need do is copy it to a web server. You can even use git, which I advise since we’re already using it to pull in and update the module.

--> Local Git Deployment

Bonus Points

If you have a large directory structure full of markdown files already, you can kick-start the process of adding frontmatter like this;

find . -type f -name '*.md' | \
while read -r X
do
  TITLE=$(basename "${X%.*}")
  # Keep the \n literal so sed (GNU) expands it in the replacement
  FRONTMATTER="---\ntitle: ${TITLE}\n---"
  sed -i "1s/^/${FRONTMATTER}\n/" "$X"
done

6.1.1.5 - Docsy Github

You may have noticed the links on the right, like “Edit this page”, that take you to Github. Let’s set those up.

On Github

Go to github and create a new repository. Use the name of your site for the repo name, such as “some.site.org”. If you want to use something else, you can edit your config.toml file to adjust.
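The setting in question should be the github_repo parameter (assuming a stock Docsy config.toml; the URL below is just an example):

vi config.toml
[params]
# example URL - point this at your own repo
github_repo = "https://github.com/yourID/some.site.org"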

Locally

You may have noticed that Github suggested some next steps with a remote add using the name “origin”. Docsy is already using that, however, from when you cloned it. So we’ll have to pick a new name.

cd /path/to/my-existing-site
git remote add github https://github.com/yourID/some.site.org

Let’s change our default branch to “main” to match Github’s defaults.

git branch -m main

Now we can add, commit and push it up to Github

git add --all
git commit -m "first commit of new site"
git push github

You’ll notice something interesting when you go back to look at Github: all the contributors on the right. That’s because you’re dealing with a clone of Docsy, and you can still pull in updates and changes from the original project.

In hindsight, it may have been simpler to clone it via Github in the first place, so that “origin” already pointed at your own repo.

6.2 - Content Deployment

Automating deployment as part of a general continuous integration strategy is best practice these days. Web content should be treated the same way: version controlled and deployed with git.

6.2.1 - Local Git Deployment

Overview

Let’s create a two-tiered system that goes from dev to prod using a post-commit trigger

```mermaid
graph LR
  Development -- git / rsync --> Production
```

The Development system is your workstation. git commit will trigger a build and rsync.

The Production system is a web server. Any web server will do as long as you have SSH access and can update a web-root folder.

I use Hugo in this example, but any system that has an output (or build) folder works similarly.

Configuration

The first thing we need to know is where we are going, so let’s prepare production first.

Production System

This server probably uses folders like /var/www/XXXXX for its web root. Use that or create a new folder and make yourself the owner.

sudo mkdir /var/www/some.site.org
sudo chown -R $USER /var/www/some.site.org
echo "Hello" > /var/www/some.site.org/index.html

Edit your web server’s config to make sure you can view that web page. Also check that rsync is available from the command line.
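A quick sanity check (run the curl from anywhere; run which on the production server itself):

curl http://some.site.org     # should return "Hello"
which rsync                   # should print a path such as /usr/bin/rsync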

Development System

Hugo builds static html in a public directory. To generate the HTML, simply type hugo

cd /path/to/my-existing-site
hugo
ls public

We don’t actually want this folder in git, and most themes (if you’re using Hugo) already exclude it. Look for a .gitignore file and create or add to it if needed.

# Notice /public is at the top of the git ignore file
cat .gitignore

/public
package-lock.json
.hugo_build.lock
...

Assuming you have some content, let’s add and commit it.

git add --all
git commit -m "Initial Commit"

Note: All of these git commands work because pulling in a theme initialized the directory. If you’re doing something else you’ll need to git init.

The last step is to create a hook that will build and deploy after a commit.

cd /path/to/my-existing-site
touch .git/hooks/post-commit
chmod +x .git/hooks/post-commit
vi .git/hooks/post-commit
#!/bin/sh
hugo --cleanDestinationDir
rsync --recursive --delete public/ you@some.site.org:/var/www/some.site.org

This script ensures that the remote directory matches your local directory. When you’re ready to update the remote site:

git add --all
git commit --allow-empty -m "trigger update"

If you mess up the production files, you can just call the hook manually.

cd /path/to/my-existing-site
.git/hooks/post-commit

Troubleshooting

bash: line 1: rsync: command not found

Double check that the remote host has rsync.
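If the production host is a Debian-flavored system (an assumption; adjust for your distro), that’s a one-liner:

sudo apt install rsync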

6.3 - Content Delivery

6.3.1 - Cloudflare

  • Cloudflare acts as a reverse proxy to hide your server’s IP address
  • Takes over your DNS and directs requests to the closest site
  • Injects JavaScript analytics
    • If the browser’s “do not track” is on, JS isn’t injected.
  • Can use a tunnel and remove encryption overhead

6.4 - Servers

6.4.1 - Caddy

Caddy is a web server that runs SSL by default by automatically grabbing a cert from Let’s Encrypt. It comes as a stand-alone binary, written in Go, and makes a decent reverse proxy.

6.4.1.1 - Installation

Installation

Caddy recommends “using our official package for your distro” and for debian flavors they include the basic instructions you’d expect.
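For reference, the Debian/Ubuntu steps look roughly like this (quoted from memory of their published instructions — verify the current repo URLs and key on the Caddy download page):

# From Caddy's Debian install instructions - confirm URLs before running
sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https curl
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | sudo tee /etc/apt/sources.list.d/caddy-stable.list
sudo apt update
sudo apt install caddy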

Configuration

The easiest way to configure Caddy is by editing the Caddyfile

sudo vi /etc/caddy/Caddyfile
sudo systemctl reload caddy.service

Sites

You define websites with a block that includes a root and the file_server directive. Once you reload, and assuming you already have the DNS in place, Caddy will reach out to Let’s Encrypt, acquire a certificate, and automatically redirect from port 80 to 443.

site.your.org {        
    root * /var/www/site.your.org
    file_server
}

Authentication

You can add basicauth to a site by creating a hash and adding a directive to the site.

caddy hash-password
site.your.org {        
    root * /var/www/site.your.org
    file_server
    basicauth { 
        allen SomeBigLongStringFromTheCaddyHashPasswordCommand
    }
}

Reverse Proxy

Caddy also makes a decent reverse proxy.

site.your.org {        
    reverse_proxy * http://some.server.lan:8080
}

You can also take advantage of path-based reverse proxying. Note the rewrite to accommodate a potentially missing trailing slash.

site.your.org {
    rewrite /audiobooks /audiobooks/
    handle_path /audiobooks/* {
        uri strip_prefix /audiobooks/
        reverse_proxy * http://some.server.lan:8080
    }
}

Include Blocks

You can define common elements at the top and include them on multiple sites. This helps when you have many sites.

(logging) {
    log {
        output file /var/log/caddy/access.log
    }
}
site.your.org {
    import logging     
    reverse_proxy * http://some.server.lan:8080
}

Modules

Caddy is a single binary, so when adding a new module (aka feature) you are essentially downloading a new version that has it compiled in. You can find the list of packages at their download page.

Do this at the command line with caddy itself.

sudo caddy add-package github.com/mholt/caddy-webdav
sudo systemctl restart caddy

Security

Drop Unknown Domains

Caddy will accept connections on port 80, announce that it’s a Caddy web server, and redirect you to https before realizing it doesn’t have a site or cert for you. Configure this directive at the bottom so such requests are dropped immediately.

http:// {
    abort
}

Crowdsec

Caddy runs as its own user and is fairly memory-safe. But installing Crowdsec helps identify some types of intrusion attempts.

[TODO]

Troubleshooting

You can test your config file and look at the logs like so

caddy validate --config /etc/caddy/Caddyfile
journalctl --no-pager -u caddy

6.4.1.2 - WebDAV

Caddy can also serve WebDAV requests with the appropriate module. This is important because for many clients, such as Kodi, WebDAV is significantly faster.

sudo caddy add-package github.com/mholt/caddy-webdav
sudo systemctl restart caddy
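# Then, in /etc/caddy/Caddyfile (the global block must come first):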
{   # Custom modules require order of precedence be defined
    order webdav last
}
site.your.org {
    root * /var/www/site.your.org
    webdav * 
}

You can combine WebDAV and Directory Listing - highly recommended - so you can browse the directory contents with a normal web browser as well. Since the WebDAV-specific operations don’t use the GET method, you can use the @get matcher to route GET requests to the file_server module so it can serve up indexes via the browse argument.

site.your.org {
    @get method GET
    root * /var/www/site.your.org
    webdav *
    file_server @get browse        
}

Sources

https://github.com/mholt/caddy-webdav
https://marko.euptera.com/posts/caddy-webdav.html

6.4.1.3 - MFA

The package caddy-security offers a suite of auth functions. Among these are MFA and a portal for end-user management of tokens.

Installation

# Install a version of caddy with the security module 
sudo caddy add-package github.com/greenpau/caddy-security
sudo systemctl restart caddy

Configuration

/var/lib/caddy/.local/caddy/users.json

caddy hash-password

Troubleshooting

journalctl --no-pager -u caddy