Crowdsec, a collaborative behavior detection engine

Posted May 22, 2021 by Adrian Wyssmann ‐ 11 min read

Last Wednesday I attended DevOps Connect: DevSecOps at RSAC 2021, where a cool project was presented: Crowdsec, a collaborative behavior detection engine coupled with a global IP reputation network.

What is Crowdsec?

The GitHub project says:

Crowdsec is an open-source, lightweight software, detecting peers with aggressive behaviors to prevent them from accessing your systems.

and

CrowdSec is a free, modern & collaborative behavior detection engine, coupled with a global IP reputation network. It stacks on fail2ban’s philosophy but is IPV6 compatible and 60x faster (Go vs Python), uses Grok patterns to parse logs and YAML scenario to identify behaviors.

The interesting part of the project is the crowd aspect: the behavior scenarios, and the bad IPs detected on the basis of them, are shared with the community.

How does crowdsec work?

Crowdsec has two main elements:

  • Crowdsec-agent is an open-source and lightweight software that allows you to detect peers with malevolent behaviors and block them from accessing your systems at various levels (infrastructural, system, applicative)
  • Bouncers are standalone software pieces in charge of acting upon a decision taken by crowdsec: block an IP, present a captcha, enforce MFA on a given user, etc.
Crowdsec Architecture (c) https://doc.crowdsec.net/Crowdsec/v1/

How does crowdsec work?

The following diagram shows exactly how it works:

Processing illustrated (c) https://github.com/crowdsecurity/crowdsec
  1. The crowdsec-agent consumes and parses logs from different data sources. This is done by a parser, which is a YAML configuration file that describes how a string is being parsed. The string can be a log line, or a field extracted from a previous parser. Most parsers use the GROK processor but can use additional modules.

  2. The log data may be enriched by so-called enrichers, which add extra context.

    An example is the geoip-enrich that adds origin country, origin AS and origin IP range to an event.

  3. The agent then passes each Event, i.e. a parsed/normalized log line, on to be matched against the scenarios.

    • Scenarios are YAML files that describe the set of events characterizing a scenario.
    • Scenarios can be of different types (leaky, trigger, counter), and are based on various factors, such as
      • the speed/frequency of the leaky bucket
      • the capacity of the leaky bucket
      • the characteristic(s) of eligible event(s) : “log type XX with field YY set to ZZ”
      • various filters/directives that can alter the bucket’s behavior, such as groupby, distinct or blackhole
  4. Based on this, the event is qualified using the leaky bucket

    The leaky bucket is an algorithm based on an analogy of how a bucket with a constant leak will overflow if either the average rate at which water is poured in exceeds the rate at which the bucket leaks or if more water than the capacity of the bucket is poured in all at once. It can be used to determine whether some sequence of discrete events conforms to defined limits on their average and peak rates or frequencies, e.g. to limit the actions associated to these events to these rates or delay them until they do conform to the rates. It may also be used to check conformance or limit to an average rate alone, i.e. remove any variation from the average.

    The crowdsec-agent creates one or more buckets for the events with matching characteristics. When any of these buckets overflows, the scenario has been triggered (an attack was detected).

  5. A postoverflow will be applied to overflows (scenario results) before the decision is written to the local DB or pushed to the API.

    This is a parser used for “expensive” enrichment/parsing processes that you do not want to perform on all incoming events, but rather on decisions that are about to be taken. An example could be a slack/mattermost enrichment plugin that requires human confirmation before applying the decision, or reverse-dns lookup operations.

  6. After that, the crowdsec-agent generates an alert and possibly one or more associated Decisions:

    • An alert is the runtime representation of a bucket overflow and serves as information
    • A Decision is the representation of the consequence of a bucket overflow: a decision against an IP, a range, an AS, a Country, a User, a Session etc. See also Decision object documentation
  7. This information (the signal and the associated decisions) is then sent to crowdsec’s local API and stored in the database

  8. The Bouncers will consume the data via the local API and act upon a decision taken by crowdsec: block an IP, present a captcha, enforce MFA on a given user, etc.

    Bouncers can be found in the crowdsec-hub and installed via the cscli
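
To make the last step a bit more tangible, here is a minimal sketch of what a bouncer conceptually does with the decisions it gets from the local API. This is not the code of any real bouncer: the function names are made up for illustration, and the payload shape (a "new" and a "deleted" list whose entries carry "scope", "value" and "type") is my assumption about what the local API hands out, so check the API documentation before relying on it.

```python
import ipaddress


def apply_decisions(banned: set, payload: dict) -> set:
    """Return an updated set of banned IPs after one decisions payload.

    ASSUMPTION: `payload` looks like {"new": [...], "deleted": [...]},
    and either list may be missing or None when there is nothing to report.
    """
    updated = set(banned)
    # Decisions that expired or were deleted: lift the ban.
    for decision in payload.get("deleted") or []:
        updated.discard(decision["value"])
    # Fresh decisions: this sketch only handles "ban" on a single IP.
    for decision in payload.get("new") or []:
        if decision.get("scope", "").lower() == "ip" and decision.get("type") == "ban":
            updated.add(decision["value"])
    return updated


def is_allowed(client_ip: str, banned: set) -> bool:
    """What a bouncer in front of a service would check per request."""
    # ip_address() validates and normalizes the textual form of the address.
    return str(ipaddress.ip_address(client_ip)) not in banned


banned = apply_decisions(
    set(),
    {"new": [{"scope": "Ip", "value": "221.181.185.19", "type": "ban"}], "deleted": None},
)
print(is_allowed("221.181.185.19", banned))  # False
print(is_allowed("192.0.2.1", banned))       # True
```

A real bouncer would of course fetch the payload over HTTP in a loop and translate the set into firewall rules, a captcha page or similar; the sketch only shows the bookkeeping in between.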

How to install?

The installation is pretty simple:

  • install the agent either from repo (Debian and Ubuntu only) or from tarball
  • install one or multiple bouncers as documented in the crowdsec-hub

After that you might install further Collections - these are bundles of parsers, scenarios, postoverflows that form a coherent package and are present in /etc/crowdsec/collections/

Installing the agent also gives you cscli, the CLI to manage crowdsec.

I installed crowdsec on a new test server, which also has nginx running. Thus, I also installed the cs-firewall-bouncer and the cs-nginx-bouncer.

We can see which scenarios are there - by default not that many:

cscli scenarios list
--------------------------------------------------------------------------------
 NAME                  📦 STATUS   VERSION  LOCAL PATH
--------------------------------------------------------------------------------
 crowdsecurity/ssh-bf  ✔️  enabled  0.1      /etc/crowdsec/scenarios/ssh-bf.yaml
--------------------------------------------------------------------------------

So let us install additional scenarios by installing suitable collections - in our case we would like to have the stuff for nginx as well:

# cscli collections install crowdsecurity/nginx
INFO[21-05-2021 07:28:24 PM] crowdsecurity/nginx-logs : OK
INFO[21-05-2021 07:28:24 PM] Enabled parsers : crowdsecurity/nginx-logs
INFO[21-05-2021 07:28:24 PM] crowdsecurity/http-logs : OK
INFO[21-05-2021 07:28:24 PM] Enabled parsers : crowdsecurity/http-logs
INFO[21-05-2021 07:28:24 PM] crowdsecurity/http-crawl-non_statics : OK
INFO[21-05-2021 07:28:24 PM] Enabled scenarios : crowdsecurity/http-crawl-non_statics
INFO[21-05-2021 07:28:24 PM] crowdsecurity/http-probing : OK
INFO[21-05-2021 07:28:24 PM] Enabled scenarios : crowdsecurity/http-probing
INFO[21-05-2021 07:28:24 PM] crowdsecurity/http-bad-user-agent : OK
INFO[21-05-2021 07:28:24 PM] downloading data 'https://raw.githubusercontent.com/crowdsecurity/sec-lists/master/web/bad_user_agents.txt' in '/var/lib/crowdsec/data/bad_user_agents.txt'
INFO[21-05-2021 07:28:25 PM] Enabled scenarios : crowdsecurity/http-bad-user-agent
INFO[21-05-2021 07:28:25 PM] crowdsecurity/http-path-traversal-probing : OK
INFO[21-05-2021 07:28:25 PM] downloading data 'https://raw.githubusercontent.com/crowdsecurity/sec-lists/master/web/path_traversal.txt' in '/var/lib/crowdsec/data/http_path_traversal.txt'
INFO[21-05-2021 07:28:25 PM] crowdsecurity/http-sensitive-files : OK
INFO[21-05-2021 07:28:25 PM] downloading data 'https://raw.githubusercontent.com/crowdsecurity/sec-lists/master/web/sensitive_data.txt' in '/var/lib/crowdsec/data/sensitive_data.txt'
INFO[21-05-2021 07:28:25 PM] Enabled scenarios : crowdsecurity/http-sensitive-files
INFO[21-05-2021 07:28:25 PM] crowdsecurity/http-sqli-probing : OK
INFO[21-05-2021 07:28:25 PM] downloading data 'https://raw.githubusercontent.com/crowdsecurity/sec-lists/master/web/sqli_probe_patterns.txt' in '/var/lib/crowdsec/data/sqli_probe_patterns.txt'
INFO[21-05-2021 07:28:26 PM] Enabled scenarios : crowdsecurity/http-sqli-probing
INFO[21-05-2021 07:28:26 PM] crowdsecurity/http-xss-probing : OK
INFO[21-05-2021 07:28:26 PM] downloading data 'https://raw.githubusercontent.com/crowdsecurity/sec-lists/master/web/xss_probe_patterns.txt' in '/var/lib/crowdsec/data/xss_probe_patterns.txt'
INFO[21-05-2021 07:28:26 PM] Enabled scenarios : crowdsecurity/http-xss-probing
INFO[21-05-2021 07:28:26 PM] crowdsecurity/http-backdoors-attempts : OK
INFO[21-05-2021 07:28:26 PM] downloading data 'https://raw.githubusercontent.com/crowdsecurity/sec-lists/master/web/backdoors.txt' in '/var/lib/crowdsec/data/backdoors.txt'
INFO[21-05-2021 07:28:26 PM] Enabled scenarios : crowdsecurity/http-backdoors-attempts
INFO[21-05-2021 07:28:27 PM] ltsich/http-w00tw00t : OK
INFO[21-05-2021 07:28:27 PM] Enabled scenarios : ltsich/http-w00tw00t
INFO[21-05-2021 07:28:27 PM] crowdsecurity/http-generic-bf : OK
INFO[21-05-2021 07:28:27 PM] Enabled scenarios : crowdsecurity/http-generic-bf
INFO[21-05-2021 07:28:27 PM] crowdsecurity/base-http-scenarios : OK
WARN[21-05-2021 07:28:27 PM] crowdsecurity/base-http-scenarios : overwrite
INFO[21-05-2021 07:28:27 PM] Enabled collections : crowdsecurity/base-http-scenarios
INFO[21-05-2021 07:28:27 PM] crowdsecurity/nginx : OK
INFO[21-05-2021 07:28:27 PM] /etc/crowdsec/collections/base-http-scenarios.yaml already exists.
INFO[21-05-2021 07:28:27 PM] Enabled collections : crowdsecurity/nginx
INFO[21-05-2021 07:28:27 PM] Enabled crowdsecurity/nginx
INFO[21-05-2021 07:28:27 PM] Run 'sudo systemctl reload crowdsec' for the new configuration to be effective.

Now this looks better:

#cscli scenarios list
--------------------------------------------------------------------------------------------------------------------------
 NAME                                       📦 STATUS   VERSION  LOCAL PATH                                               
--------------------------------------------------------------------------------------------------------------------------
 ltsich/http-w00tw00t                       ✔️  enabled  0.1      /etc/crowdsec/scenarios/http-w00tw00t.yaml               
 crowdsecurity/http-crawl-non_statics       ✔️  enabled  0.2      /etc/crowdsec/scenarios/http-crawl-non_statics.yaml      
 crowdsecurity/http-sensitive-files         ✔️  enabled  0.2      /etc/crowdsec/scenarios/http-sensitive-files.yaml        
 crowdsecurity/http-generic-bf              ✔️  enabled  0.1      /etc/crowdsec/scenarios/http-generic-bf.yaml             
 crowdsecurity/http-probing                 ✔️  enabled  0.2      /etc/crowdsec/scenarios/http-probing.yaml                
 crowdsecurity/http-sqli-probing            ✔️  enabled  0.2      /etc/crowdsec/scenarios/http-sqli-probing.yaml           
 crowdsecurity/ssh-bf                       ✔️  enabled  0.1      /etc/crowdsec/scenarios/ssh-bf.yaml                      
 crowdsecurity/http-bad-user-agent          ✔️  enabled  0.4      /etc/crowdsec/scenarios/http-bad-user-agent.yaml         
 crowdsecurity/http-xss-probing             ✔️  enabled  0.2      /etc/crowdsec/scenarios/http-xss-probing.yaml            
 crowdsecurity/http-backdoors-attempts      ✔️  enabled  0.2      /etc/crowdsec/scenarios/http-backdoors-attempts.yaml     
 crowdsecurity/http-path-traversal-probing  ✔️  enabled  0.2      /etc/crowdsec/scenarios/http-path-traversal-probing.yaml 
--------------------------------------------------------------------------------------------------------------------------

Finally, to understand the basics, let’s have a look at a scenario - for example /etc/crowdsec/scenarios/ssh-bf.yaml:

# ssh bruteforce
type: leaky
name: crowdsecurity/ssh-bf
description: "Detect ssh bruteforce"
filter: "evt.Meta.log_type == 'ssh_failed-auth'"
leakspeed: "10s"
references:
  - http://wikipedia.com/ssh-bf-is-bad
capacity: 5
groupby: evt.Meta.source_ip
blackhole: 1m
reprocess: true
labels:
 service: ssh
 type: bruteforce
 remediation: true
---
# ssh user-enum
type: leaky
name: crowdsecurity/ssh-bf_user-enum
description: "Detect ssh user enum bruteforce"
filter: evt.Meta.log_type == 'ssh_failed-auth'
groupby: evt.Meta.source_ip
distinct: evt.Meta.target_user
leakspeed: 10s
capacity: 5
blackhole: 1m
labels:
 service: ssh
 type: bruteforce
 remediation: true

Without going into the details, this scenario checks for failed ssh authentications and will ban the offending IP.

The remediation label, if set to true, indicates that the originating IP should be banned.
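
To get a feeling for the numbers in the ssh-bf scenario, here is a small simulation of a leaky bucket with capacity 5 and leakspeed 10s, with one bucket per source IP (the groupby). It is only a sketch of how I read these parameters, not crowdsec's actual implementation: I assume that one event leaks out every 10 seconds, that the bucket overflows as soon as it holds more than 5 events, and I model the leak as continuous for simplicity. The blackhole and distinct directives are left out, and the class and function names are made up.

```python
from dataclasses import dataclass


@dataclass
class LeakyBucket:
    """One bucket per groupby key (here: one per source IP)."""
    capacity: int = 5             # `capacity: 5` from the scenario
    leak_interval: float = 10.0   # `leakspeed: 10s`, read as "one event leaks every 10 s"
    level: float = 0.0            # how many events currently sit in the bucket
    last_seen: float = 0.0        # timestamp of the previous event

    def pour(self, now: float) -> bool:
        """Add one event at time `now`; return True if the bucket overflows."""
        # Drain whatever leaked out since the previous event (simplified: continuous leak).
        leaked = (now - self.last_seen) / self.leak_interval
        self.level = max(0.0, self.level - leaked)
        self.last_seen = now
        self.level += 1
        if self.level > self.capacity:
            self.level = 0.0      # start over after an overflow
            return True
        return False


def detect(events, capacity=5, leak_interval=10.0):
    """events: iterable of (timestamp_seconds, source_ip, log_type) tuples."""
    buckets = {}
    overflows = []
    for ts, ip, log_type in events:
        if log_type != "ssh_failed-auth":   # the scenario's `filter`
            continue
        if ip not in buckets:
            buckets[ip] = LeakyBucket(capacity, leak_interval, last_seen=ts)
        if buckets[ip].pour(ts):
            overflows.append((ts, ip))
    return overflows


# Six failed logins within six seconds: the sixth one overflows the bucket.
burst = [(t, "192.0.2.10", "ssh_failed-auth") for t in range(6)]
# One failed login every 15 seconds: the bucket drains faster than it fills.
slow = [(t * 15, "192.0.2.20", "ssh_failed-auth") for t in range(20)]

print(detect(burst))  # [(5, '192.0.2.10')]
print(detect(slow))   # []
```

So under these assumptions a quick burst of six failures trips the scenario, while anything slower than one failure per 10 seconds never fills the bucket.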

After letting the server run through the night, we can see that some ssh bruteforce attacks have already happened, for example the alert with ID=18:

#cscli alerts list
+----+---------------------+----------------------+---------+--------------------------------+-----------+--------------------------------+
| ID |        VALUE        |        REASON        | COUNTRY |               AS               | DECISIONS |           CREATED AT           |
+----+---------------------+----------------------+---------+--------------------------------+-----------+--------------------------------+
| 19 | Community blocklist | update : +70/-0 IPs  |         |                                | ban:70    | 2021-05-22 09:31:40.986177268  |
|    |                     |                      |         |                                |           | +0200 +0200                    |
| 18 | Ip:221.181.185.19   | crowdsecurity/ssh-bf | CN      |  China Mobile communications   | ban:1     | 2021-05-22 09:27:15.27054985   |
|    |                     |                      |         | corporation                    |           | +0200 +0200                    |
| 17 | Community blocklist | update : +70/-0 IPs  |         |                                | ban:70    | 2021-05-22 07:31:40.826655134  |
|    |                     |                      |         |                                |           | +0200 +0200                    |
| 16 | Ip:117.111.4.187    | crowdsecurity/ssh-bf | KR      |  LGTELECOM                     | ban:1     | 2021-05-22 06:45:08.860810037  |
|    |                     |                      |         |                                |           | +0200 +0200                    |
| 15 | Community blocklist | update : +70/-0 IPs  |         |                                | ban:70    | 2021-05-22 05:31:41.218482708  |
|    |                     |                      |         |                                |           | +0200 +0200                    |
| 14 | Ip:87.241.1.186     | crowdsecurity/ssh-bf | IT      |  COLT Technology Services      | ban:1     | 2021-05-22 04:31:30.738582084  |
|    |                     |                      |         | Group Limited                  |           | +0200 +0200                    |
| 13 | Community blocklist | update : +70/-0 IPs  |         |                                | ban:70    | 2021-05-22 03:31:40.959522765  |
|    |                     |                      |         |                                |           | +0200 +0200                    |
| 12 | Community blocklist | update : +70/-0 IPs  |         |                                | ban:70    | 2021-05-22 01:31:40.989139871  |
|    |                     |                      |         |                                |           | +0200 +0200                    |
| 11 | Ip:152.243.39.117   | crowdsecurity/ssh-bf | BR      |  TELEFÔNICA BRASIL S.A         | ban:1     | 2021-05-21 23:59:15.896610819  |
|    |                     |                      |         |                                |           | +0200 +0200                    |
| 10 | Community blocklist | update : +70/-0 IPs  |         |                                | ban:70    | 2021-05-21 23:31:41.232447733  |
|    |                     |                      |         |                                |           | +0200 +0200                    |
|  9 | Community blocklist | update : +70/-0 IPs  |         |                                | ban:70    | 2021-05-21 21:31:41.040828362  |
|    |                     |                      |         |                                |           | +0200 +0200                    |
|  8 | Community blocklist | update : +100/-0 IPs |         |                                | ban:100   | 2021-05-21 19:31:39.717115305  |
|    |                     |                      |         |                                |           | +0200 +0200                    |
|  7 | Community blocklist | update : +100/-0 IPs |         |                                | ban:100   | 2021-05-21 19:14:41.613088159  |
|    |                     |                      |         |                                |           | +0200 +0200                    |
|  6 | Ip:122.54.199.123   | crowdsecurity/ssh-bf | PH      |  Philippine Long Distance      | ban:1     | 2021-05-21 18:51:39.40700368   |
|    |                     |                      |         | Telephone Company              |           | +0200 +0200                    |
|  5 | Ip:122.54.199.123   | crowdsecurity/ssh-bf | PH      |  Philippine Long Distance      | ban:1     | 2021-05-21 18:50:25.838205635  |
|    |                     |                      |         | Telephone Company              |           | +0200 +0200                    |
|  4 | Ip:122.54.199.123   | crowdsecurity/ssh-bf | PH      |  Philippine Long Distance      | ban:1     | 2021-05-21 18:49:05.51782191   |
|    |                     |                      |         | Telephone Company              |           | +0200 +0200                    |
|  3 | Ip:122.54.199.123   | crowdsecurity/ssh-bf | PH      |  Philippine Long Distance      | ban:1     | 2021-05-21 18:47:55.211302934  |
|    |                     |                      |         | Telephone Company              |           | +0200 +0200                    |
|  2 | Ip:122.54.199.123   | crowdsecurity/ssh-bf | PH      |  Philippine Long Distance      | ban:1     | 2021-05-21 18:46:42.387240727  |
|    |                     |                      |         | Telephone Company              |           | +0200 +0200                    |
|  1 | Community blocklist | update : +100/-0 IPs |         |                                | ban:100   | 2021-05-21 18:46:43.586154854  |
|    |                     |                      |         |                                |           | +0200 +0200                    |
+----+---------------------+----------------------+---------+--------------------------------+-----------+--------------------------------+

You can already see that the IP 221.181.185.19 was banned, but you can check further details:

cscli alerts inspect 18

################################################################################################

 - ID         : 18
 - Date       : 2021-05-22T09:28:32+02:00
 - Machine    : 143077311f7740eca31f1b88932cf8b2D94OFapOYkL9k8Tg
 - Simulation : false
 - Reason     : crowdsecurity/ssh-bf
 - Events Count : 13
 - Scope:Value: Ip:221.181.185.19
 - Country    : CN
 - AS         : China Mobile communications corporation

 - Active Decisions  :
+-----+-------------------+--------+--------------------+---------------------------+
| ID  |    SCOPE:VALUE    | ACTION |     EXPIRATION     |        CREATED AT         |
+-----+-------------------+--------+--------------------+---------------------------+
| 729 | Ip:221.181.185.19 | ban    | 2h39m15.380012907s | 2021-05-22T09:28:32+02:00 |
+-----+-------------------+--------+--------------------+---------------------------+

The crowd aspect

As mentioned in the beginning, the cool thing about this project is the crowd aspect. For every alert with its associated decisions, meta information about the alert is shared with the central API:

  • The source ip that triggered the alert
  • The scenario that was triggered
  • The timestamp of the attack

After the information has been processed centrally, the relevant blocklists are redistributed to all participants.
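
To illustrate how little is shared, here is a sketch that reduces a local alert to just the three pieces of meta information listed above. The Signal class, the function and all field names are hypothetical and only mirror the columns of the cscli output shown earlier; the real payload sent to the central API may look different.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Signal:
    """Hypothetical shape of what leaves the machine."""
    source_ip: str   # the source IP that triggered the alert
    scenario: str    # the scenario that was triggered
    timestamp: str   # the timestamp of the attack


def signal_from_alert(alert: dict) -> Signal:
    """Keep only the three shared fields; everything else stays local."""
    return Signal(
        source_ip=alert["value"],
        scenario=alert["reason"],
        timestamp=alert["created_at"],
    )


# Alert 18 from above, written as a plain dict (key names are assumptions).
alert = {
    "id": 18,
    "value": "221.181.185.19",
    "reason": "crowdsecurity/ssh-bf",
    "created_at": "2021-05-22T09:28:32+02:00",
    "country": "CN",
    "events_count": 13,
}

print(json.dumps(asdict(signal_from_alert(alert))))
```

Note that neither the log lines themselves nor the event count end up in the signal in this sketch.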

What else?

There is much more to it: for example, there is a dashboard, metrics that can be managed with Prometheus, and a forensic mode. There are also a lot more interesting scenarios, especially in conjunction with e.g. nginx. But that’s it for now; you should already have a pretty good idea of how cool this project is.