Web crawlers
Configuration to ban malicious web crawlers. The idea is that most attackers first scan a server to find out what they can attack.
We stick to paths that no well-meaning human would request on their own.
List:
/.env
/password.txt
/passwords.txt
/config\.json
- Rationale: .env, password(s).txt and config.json are often searched for by bots, as they can contain sensitive information such as database credentials. Do not include the config.json path if a client must retrieve a config.json file.
/info\.php
- Rationale: info.php is a file often written for debugging purposes, which contains <?php phpinfo() ?>. This function exposes far too much information about the PHP environment, which is very useful to anyone looking for security holes.
/wp-login\.php
/wp-includes
- Rationale: WordPress default login and include paths. Do not include if you use WordPress.
/owa/auth/logon.aspx
- Rationale: Outlook authentication path. Do not include if Outlook is in use on your infrastructure.
/auth.html
/auth1.html
- Rationale: I don't know what it is, but it has been tried by numerous bots on my webserver. Do not include if you use this path on your infrastructure.
/dns-query
- Rationale: DoH (DNS over HTTPS) standard path. Do not include if you have a DoH server on your infrastructure.
(Feel free to add your own discoveries to this list!)
By adding (?:[^/" ]*/)* at the beginning of each regex, we also cover all subpaths.
As a pattern, we'll use ip. See here.
Example:
{
  streams: {
    nginx: {
      cmd: ['...'], // see ./nginx.md
      filters: {
        slskd: {
          regex: [
            // (?:[^/" ]*/)* is a "non-capturing group" regex that allows for subpath(s)
            // example: /code/.env should be matched as well as /.env
            //           ^^^^^
            // Literal dots are escaped so that they only match a "."
            @'^<ip>.*"GET /(?:[^/" ]*/)*\.env ',
            @'^<ip>.*"GET /(?:[^/" ]*/)*password\.txt ',
            @'^<ip>.*"GET /(?:[^/" ]*/)*passwords\.txt ',
            @'^<ip>.*"GET /(?:[^/" ]*/)*config\.json ',
            @'^<ip>.*"GET /(?:[^/" ]*/)*info\.php ',
            @'^<ip>.*"GET /(?:[^/" ]*/)*wp-login\.php',
            @'^<ip>.*"GET /(?:[^/" ]*/)*wp-includes',
            @'^<ip>.*"GET /(?:[^/" ]*/)*owa/auth/logon\.aspx ',
            @'^<ip>.*"GET /(?:[^/" ]*/)*auth\.html ',
            @'^<ip>.*"GET /(?:[^/" ]*/)*auth1\.html ',
            @'^<ip>.*"GET /(?:[^/" ]*/)*dns-query ',
          ],
          action: banFor('720h'),
        },
      },
    },
  },
}
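Since every regex shares the same prefix, the list could also be built with a Jsonnet array comprehension. This is only a sketch: the names `prefix` and `paths` are introduced here for illustration and are not part of the configuration above. It shows the `regex` field alone, to be placed inside the filter.

```jsonnet
// Hypothetical helper names, introduced for this sketch only.
// Shared prefix: start of line, the <ip> pattern, then any number of subpath segments.
local prefix = @'^<ip>.*"GET /(?:[^/" ]*/)*';

// One entry per banned path. A trailing space anchors the end of the path,
// entries without it also match longer paths.
local paths = [
  @'\.env ',
  @'password\.txt ',
  @'passwords\.txt ',
  @'config\.json ',
  @'info\.php ',
  @'wp-login\.php',
  @'wp-includes',
  @'owa/auth/logon\.aspx ',
  @'auth\.html ',
  @'auth1\.html ',
  @'dns-query ',
];

{
  // Produces one regex per path, each starting with the shared prefix.
  regex: [prefix + path for path in paths],
}
```

Adding one of your own discoveries then only requires a new line in `paths`, and the subpath prefix cannot be forgotten or mistyped.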