Blocking IPs from IPSet file sources

Hello,
My OpenWrt router of 7 years just died today and for the time being I’m forced to plug in the ethernet cable directly.

Hosts blocking works fine with the hblock package, which I recommend to anyone looking for an easy hosts file manager.

Now, with blocking IPs, I’m a bit stuck. I have heard about various tools that do this: ipset, iprange, the firehol script (which no longer works), and instructions such as these: linux - How to import multiple ip’s to Ipset? - Unix & Linux Stack Exchange ; Country block/allow in Linux with ipset, iptables and systemd | anotherday7

The thing is, I need to add IP sets from files found at various links. So I need a script to fetch the files from the Internet and add the IPs from each of those files to the firewall via ipset (?)

Can you guys help me with a script for that?

Thank you very much

It probably depends heavily on what you are trying to scrape from and how it's formatted.
Most likely tools like curl, wget, grep, sed and awk would be helpful.

But a lot of what you want is probably here:
https://wiki.archlinux.org/index.php/Ipset

Indeed, several other sources suggest that the restore feature of ipset is the better way to batch-block the 100,000+ IPs I need.

Here is the configuration file of OpenWrt’s BanIP, it shows exactly how it formats the text for those particular sources of IPsets:

option ban_src_desc 'List of public DoH providers (DNS over HTTPS) (IPv4/IPv6)'
option ban_src_rset '/^(([0-9]{1,3}\.){3}[0-9]{1,3}(\/[0-9]{1,2})?)([[:space:]]|$)/{print "add DoH "$1}'
option ban_src_rset_6 '/^([0-9a-fA-F]{0,4}:){1,7}[0-9a-fA-F]{0,4}(:\/[0-9]{1,2})?([[:space:]]|$)/{print "add DoH_6 "$1}'

I managed to get it working with the simplest commands possible, as I need both standalone IP addresses and IP ranges blocked.

Right now it uses the normal add feature instead of the import (restore) feature, which is slow. If someone can help me use the import feature I would be grateful!

Also, it has one fault: it reads ALL lines from the ipset files, not just the lines containing IP addresses. Although it works fine, it spams the logs with some parsing errors. How can I build in that filter?

sudo ipset -N myset nethash
sudo ipset add myset 198.54.126.120
curl https://iplists.firehol.org/files/firehol_level1.netset > /home/hitman/.data/firehol_level1.netset
curl https://iplists.firehol.org/files/firehol_level2.netset > /home/hitman/.data/firehol_level2.netset
for ip in $(cat /home/hitman/.data/firehol_level1.netset /home/hitman/.data/firehol_level2.netset); do sudo ipset -A myset $ip;done
sudo iptables -I INPUT -m set --match-set myset src -j DROP

There are other ways to do this … but to just remove all lines beginning with # you could:

sudo ipset -N myset nethash
sudo ipset add myset 198.54.126.120
curl https://iplists.firehol.org/files/firehol_level1.netset > /home/hitman/.data/firehol_level1.netset
curl https://iplists.firehol.org/files/firehol_level2.netset > /home/hitman/.data/firehol_level2.netset
sed -i '/^#/d' ~/.data/firehol_level{1,2}.netset  ## << this is my addition
for ip in $(cat /home/hitman/.data/firehol_level1.netset /home/hitman/.data/firehol_level2.netset); do sudo ipset -A myset $ip;done
sudo iptables -I INPUT -m set --match-set myset src -j DROP

Thank you. That sed command removed the parsing errors while leaving the rest of the data intact, which is exactly what I need.

I integrated it and completed my blocklists, here it is:

sudo ipset -N myset nethash
sudo ipset add myset 198.54.126.120
curl https://iplists.firehol.org/files/firehol_level1.netset > ~/.data/firehol_level1.netset
curl https://iplists.firehol.org/files/firehol_level2.netset > ~/.data/firehol_level2.netset
curl https://iplists.firehol.org/files/firehol_level3.netset > ~/.data/firehol_level3.netset
curl https://raw.githubusercontent.com/dibdot/DoH-IP-blocklists/master/doh-ipv4.txt > ~/.data/doh-ipv4.txt
sed -i ‘/^#/d’ ~/.data/firehol_level{1,2,3}.netset ~/.data/doh-ipv4.txt
for ip in $(cat ~/.data/firehol_level1.netset ~/.data/firehol_level2.netset ~/.data/firehol_level3.netset ~/.data/doh-ipv4.txt); do sudo ipset -A myset $ip;done
sudo iptables -I INPUT -m set --match-set myset src -j DROP

It works perfectly, but there is one problem now… There are now 50,000 addresses and the process takes around 4 minutes, which is a bit slow.

I’ve heard here that I can use the restore command to import the IP lists much faster, from minutes to seconds: https://unix.stackexchange.com/a/411805

But, in order to use this feature I have to do a bit of a hack: I have to join the files with cat, then insert a beginning into the big file (in my case is “create myset hash:net family inet hashsize 16384 maxelem 65536”), and have the term "add myset " in front of every IP.

I know how to do the first two, but inserting something in front of the IP’s, I don’t know.

The resulting text file should end up looking like this:

create myset hash:net family inet hashsize 16384 maxelem 65536
add myset 23.160.208.250
add myset 183.62.197.115

And it will be fed to the command “ipset restore < (file)”. In theory, this method will take 5-10 seconds instead of the pretty uncomfortable 4 minutes.

sudo ipset -N myset nethash
sudo ipset add myset 198.54.126.120
curl https://iplists.firehol.org/files/firehol_level1.netset > ~/.data/firehol_level1.netset
curl https://iplists.firehol.org/files/firehol_level2.netset > ~/.data/firehol_level2.netset
curl https://iplists.firehol.org/files/firehol_level3.netset > ~/.data/firehol_level3.netset
curl https://raw.githubusercontent.com/dibdot/DoH-IP-blocklists/master/doh-ipv4.txt > ~/.data/doh-ipv4.txt
cat ~/.data/firehol_level{1,2,3}.netset ~/.data/doh-ipv4.txt > /tmp/iplist.txt
sed -i '/^#/d' /tmp/iplist.txt
awk '{print "add myset " $0;}' /tmp/iplist.txt > /tmp/ipmasterraw.txt
sed '1s/^/create myset hash:net family inet hashsize 16384 maxelem 65536\n/' /tmp/ipmasterraw.txt > ~/.data/ipmaster.txt
sudo ipset restore < ~/.data/ipmaster.txt

I decided to introduce awk and use /tmp/ for everything except the downloaded lists themselves and the final master file, which still end up in ~/.data/

Anyways … think that should work :wink:

PS … you can also make use of -! (short for -exist), e.g.: sudo ipset restore -! < ~/.data/ipmaster.txt
Which will ignore errors about sets or entries that already exist, such as duplicate addresses.

PPS … I almost forgot to mention … please be careful with your ‘single quotes’ … I noticed in your copied code above the sed command was rendered inoperable because of using

sed -i ‘/^#/d’

instead of

sed -i '/^#/d'

What a delight! It works perfectly!

I am blown away, it takes less than a second, half a second at most :smiley:

For anyone who might need this: I actually had to use the -! option. Otherwise, for whatever reason, the resulting ipset would be incomplete (probably some other bad line in my sources besides the # ones); with -! the import continues instead of stopping.

Also, the first command (sudo ipset -N…) needs to be removed, because importing with restore actually creates the set (I didn’t know that). Otherwise there will be an error saying that it already exists.

Thank you very, very much sir! I owe you a cold one :beer:

Oh yeah, oops, just copied the whole thing.
That’ll teach us :laughing:

Glad it works.
You can probably optimize it a bit more by hacking at the separate operations … and if you don't need any of the generated files … just do everything in /tmp/, which will also perform better (thanks to tmpfs).

Cheers :slight_smile:


Heh. I got bored … try this out:

mkdir -p /tmp/iplists
curl https://raw.githubusercontent.com/dibdot/DoH-IP-blocklists/master/doh-ipv4.txt https://iplists.firehol.org/files/firehol_level{1,2,3}.netset >> /tmp/iplists/iplist.txt
sed -ni '/^[0-9]/p' /tmp/iplists/iplist.txt
awk '{print "add myset " $0;}' /tmp/iplists/iplist.txt > /tmp/iplists/ipmaster.txt
sed -i '1s/^/create myset hash:net family inet hashsize 16384 maxelem 65536\n/' /tmp/iplists/ipmaster.txt
sudo ipset restore -! < /tmp/iplists/ipmaster.txt
sudo ipset add myset 198.54.126.120

(I checked and I think the reason for errors without -! is because of duplicate lines … you could use sort | uniq to remove them first … but it might just be faster to ‘force-feed’ it without cleaning)

I have a feeling this version would be even faster :wink:

[edit - fixed placement of ipset]
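
On the duplicate lines mentioned above, here is a rough sketch of the clean-up step. The sample addresses and the dedupe_list function name are made up for illustration and are not taken from the real lists; sort -u does the job of sort | uniq in one step:

```shell
#!/bin/sh
# Hypothetical helper: read address lines on stdin, print each distinct line once.
dedupe_list() {
    # sort -u sorts the input and drops repeated lines, like `sort | uniq`.
    sort -u
}

# Made-up sample data with one repeated entry.
sample='10.0.0.1
192.0.2.0/24
10.0.0.1'

deduped=$(printf '%s\n' "$sample" | dedupe_list)
printf '%s\n' "$deduped"
```

In the script above, this would sit between the sed filter and the awk step, if you decide the cleaning is worth it at all.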


Hehe, putting it on a diet nicely :wink: Yeah it’s even faster now, 1 second instead of 3 for the download bits, less than 0.5 seconds for the merge and import. Not bad!

Didn’t even know curl was able to merge multiple URLs into one file… Nice!

And yeah, indeed using tmpfs is better. My original idea was to keep backups so I could see if there were any problems (data hoarder me), but the sed -ni will throw anything else away anyway.

Even nicer, both IPv4 and IPv6 addresses (with letters) are being kept by this sed -ni '/^[0-9]/p'. What does it do more precisely? At some point in the future I will have to use IPv6 for work, but I'm not using it right now.

Interesting thing. I have now tried sort /tmp/iplist.txt | uniq > /tmp/test.txt (is this what you meant?) and it executed instantly. So I will probably include it as well, as just another precaution.

I appreciate you looking at this.

LE:

That one goes at the end :stuck_out_tongue: Ran it by mistake as is, and it needs to have an imported ipset first before adding something else. Weird things these, I have to agree :))

A few ways are outlined here … you could probably do it with sort alone …

In the previous iteration I just removed any line beginning with ‘#’ … this version now removes any line that doesn't begin with a digit ‘[0-9]’.
(or that's what should have been happening)

Funny you mention the difference for ipv4/6 between the 2 …
maybe it was actually parsed buggy in some way and we accidentally cleared it up :blush:
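
To make the behaviour of that filter concrete, here is a small sketch on made-up sample lines (the addresses and the keep_digit_lines name are just for illustration). It keeps only lines whose first character is a digit, so comment lines go away, but so would an IPv6 address that happens to start with a letter:

```shell
#!/bin/sh
# Hypothetical wrapper around the same sed filter used in the script.
keep_digit_lines() {
    # -n suppresses default output; /^[0-9]/p prints only lines starting with a digit.
    sed -n '/^[0-9]/p'
}

# Made-up sample: a comment, two IPv4 entries, two IPv6 entries.
sample='# comment header
192.0.2.10
2001:db8::1
fe80::1
198.51.100.0/24'

kept=$(printf '%s\n' "$sample" | keep_digit_lines)
printf '%s\n' "$kept"
```

In this sample 2001:db8::1 survives because it starts with a digit, while fe80::1 is dropped, so the filter is not a real IPv6 check.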


I settled on a “sort | uniq” without the -u argument, tested it right now and it adds zero delay, so it goes in too :slight_smile:

Got it. We are getting to something very solid here sir.

To make it perfect, this is how OpenWrt’s BanIP filters IP addresses from the files. For IPv4 it’s number fields separated by dots, and for IPv6 it’s groups of hex characters separated by colons. It also appears to add the specific text required before each IP address (which means OpenWrt is also using the import feature we’re using here, nice).

How do we integrate it into the filter?

I think those correspond to functions defined elsewhere in the code. We don't know what it's doing with those strings.
But yes, if I am reading that right, it works by checking for the correct format of the entries (by defining digits/letters, separators, etc.) and only sourcing those lines.

Yes I think that has no use here, it’s a custom command for their software. We need just the filter part of it, after the first quote.

Is that a sed or awk command?

LE: Found a problem now with the sed -ni '/^[0-9]/p': it doesn’t remove a # that comes after an IP address. For some reason the lists sometimes have that too. How messy are those damn source files? :smiley:

ipset v7.6: Error in line 20022: Syntax error: cannot parse 176.221.42.32#: resolving to IPv4 address failed
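
One possible way around this, as a sketch only: strip everything from the first # onward (plus any whitespace before it) and drop lines that end up empty, before any other filtering. The strip_comments name and the sample lines are hypothetical; only the 176.221.42.32# case is taken from the error above.

```shell
#!/bin/sh
# Hypothetical helper: remove full-line and trailing '#' comments from stdin.
strip_comments() {
    # First expression: delete optional whitespace, a '#', and the rest of the line.
    # Second expression: delete lines that are now empty.
    sed -e 's/[[:space:]]*#.*$//' -e '/^$/d'
}

# Made-up sample, except for the entry from the ipset error message.
sample='# full comment line
176.221.42.32#
192.0.2.7 # trailing note
198.51.100.0/24'

cleaned=$(printf '%s\n' "$sample" | strip_comments)
printf '%s\n' "$cleaned"
```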

Heh. yes, it looked like awk to me … but it seems to need some extra options or something.

As to the lists … well … yeah, I guess that's a good reason to go about it their way.
(there are other approaches of course)


It does indeed. You know scripting much better than I do, it says: ^ backslash not last character on line when trying to run

awk '/^(([0-9]{1,3}\.){3}[0-9]{1,3}(\/[0-9]{1,2})?)([[:space:]]|$)/{print \"add bogon \"\$1}'

Where could it fail?

////

LE: I removed the backslashes surrounding the print command and now it worked. Weird. Why were those backslashes there?

awk '/^(([0-9]{1,3}\.){3}[0-9]{1,3}(\/[0-9]{1,2})?)([[:space:]]|$)/{print "add bogon "$1}'

The resulting file looks fine too

Nice. We’re trimming this script even further! This also makes the first sed command unnecessary.

Nice.
It seems like they were just there to escape characters so they are taken literally? Not sure.
(like sometimes you may want a literal character such as - in a text, but it would be interpreted by the command, so you write \-)
You might also check if the -! is still needed.
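
On the backslashes, a likely explanation, offered as a guess rather than something confirmed from the BanIP source: they are shell escapes. If the awk program is stored inside a double-quoted shell string, " and $ have to be written as \" and \$ so the shell passes them through. Inside single quotes the shell leaves everything alone, so the same backslashes reach awk literally and trigger the error. A small sketch with a simplified pattern and a made-up set name demo:

```shell
#!/bin/sh
# Made-up sample input: two address lines and a comment.
sample='192.0.2.1 first
# comment
198.51.100.2 second'

# Single quotes: the shell passes the program to awk untouched, no escapes needed.
single=$(printf '%s\n' "$sample" | awk '/^[0-9]/{print "add demo "$1}')

# Double quotes: \" and \$ stop the shell from ending the string or expanding $1.
double=$(printf '%s\n' "$sample" | awk "/^[0-9]/{print \"add demo \"\$1}")

printf '%s\n' "$single"
[ "$single" = "$double" ] && echo "both forms produce the same output"
```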


Done it.
I researched awk more and basically managed to integrate both filters to parse both IPv4 and IPv6 (I also found out where the backslashes came from: they are part of BanIP's own script), along with printing the first line needed and importing my custom IP address, all in a single command.

Here it is. Nice and final. Optimized to the point of starvation :slight_smile:

EDIT 1: family inet6 and ip6tables are required in order to decode some ipv6 addresses, so I made the necessary adjustments. It grew in size a bit since some commands have to be duplicated. Also I included a few more lists.

curl https://raw.githubusercontent.com/dibdot/DoH-IP-blocklists/master/doh-ipv4.txt https://raw.githubusercontent.com/dibdot/DoH-IP-blocklists/master/doh-ipv6.txt https://www.blocklist.de/downloads/export-ips_all.txt https://www.team-cymru.org/Services/Bogons/fullbogons-ipv4.txt https://www.team-cymru.org/Services/Bogons/fullbogons-ipv6.txt https://iplists.firehol.org/files/firehol_level{1,2,3}.netset >> /tmp/ips.txt
awk 'NR==1{print "create ipmaster hash:net family inet hashsize 64 maxelem 262144"}''NR==2{print "add ipmaster 198.54.126.120"}''/^(([0-9]{1,3}\.){3}[0-9]{1,3}(\/[0-9]{1,2})?)([[:space:]]|$)/{print "add ipmaster "$1}' /tmp/ips.txt > /tmp/ipmaster.txt && awk 'NR==1{print "create ipmaster6 hash:net family inet6 hashsize 64 maxelem 262144"}''/^([0-9a-fA-F]{0,4}:){1,7}[0-9a-fA-F]{0,4}(:\/[0-9]{1,2})?([[:space:]]|$)/{print "add ipmaster6 "$1}' /tmp/ips.txt > /tmp/ipmaster6.txt
sudo ipset restore -! < /tmp/ipmaster.txt && sudo ipset restore -! < /tmp/ipmaster6.txt
sudo iptables -I INPUT -m set --match-set ipmaster src -j DROP && sudo ip6tables -I INPUT -m set --match-set ipmaster6 src -j DROP

I still have the -! argument because even if I use the strictest sorting, ipset will still throw a parsing error from time to time and leave the ip set incomplete. Just as you said yesterday, to ‘force-feed’ it without cleaning :slight_smile:
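
For anyone who prefers something easier to read than the one-liners, here is a rough sketch of the same split into an IPv4 and an IPv6 restore file, run on made-up sample data instead of the downloaded lists. The build_restore function, the simple ':' test for telling the families apart, and the sample addresses are my own assumptions; the set names and create parameters are the ones from the script above. It only prints the restore text, so nothing touches ipset or the firewall.

```shell
#!/bin/sh
# build_restore SETNAME FAMILY: read one address per line on stdin and
# print text in the format expected by `ipset restore`.
build_restore() {
    printf 'create %s hash:net family %s hashsize 64 maxelem 262144\n' "$1" "$2"
    # sort -u drops duplicates; sed prefixes each remaining line with "add SETNAME ".
    sort -u | sed "s/^/add $1 /"
}

# Made-up, already cleaned sample list with one duplicate.
sample='192.0.2.10
2001:db8::/32
198.51.100.0/24
192.0.2.10'

# Assumption: after cleaning, any line containing ':' is IPv6, the rest IPv4.
v4=$(printf '%s\n' "$sample" | grep -v ':' | build_restore ipmaster inet)
v6=$(printf '%s\n' "$sample" | grep ':' | build_restore ipmaster6 inet6)

printf '%s\n' "$v4" "$v6"
```

The output of each call could then be written to a file and fed to sudo ipset restore -! as in the script above.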

Thank you again for your help! If I find within these 3 days any problem with this setup (script and filtering system) I will return, but so far it works great.

