Hello,
My OpenWrt router of 7 years just died today and for the time being I’m forced to plug in the ethernet cable directly.
Hosts blocking works fine with the hblock package, which I recommend to anyone looking for an easy hosts file manager.
Now, at blocking IPs, I’m a bit stuck. I have heard about various packages that do this: ipset, iprange, the firehol script which no longer works, and instructions such as these: linux - How to import multiple ip’s to Ipset? - Unix & Linux Stack Exchange ; Country block/allow in Linux with ipset, iptables and systemd | anotherday7
The thing is, I need to add IP sets from files that are published at various links. So I need a script to fetch the files from the Internet and add the IPs from each of those files to the firewall via ipset (?)
It probably depends heavily on what you are trying to scrape from and how it's formatted.
Most likely tools like curl, wget, grep, sed and awk would be helpful.
I managed to use the simplest command possible, as I need both standalone IP addresses and IP sets blocked.
Right now it uses the normal add feature instead of the import feature, which is slow. If someone can help me use the import feature I would be grateful!
Also, it has one fault: it reads ALL lines from the ipset files, not just the lines containing IP addresses. Although it works fine, it spams the logs with parsing errors. How can I build in that filter?
sudo ipset -N myset nethash
sudo ipset add myset 198.54.126.120
curl https://iplists.firehol.org/files/firehol_level1.netset > /home/hitman/.data/firehol_level1.netset
curl https://iplists.firehol.org/files/firehol_level2.netset > /home/hitman/.data/firehol_level2.netset
for ip in $(cat /home/hitman/.data/firehol_level1.netset /home/hitman/.data/firehol_level2.netset); do sudo ipset -A myset $ip;done
sudo iptables -I INPUT -m set --match-set myset src -j DROP
There are other ways to do this … but to just remove all lines beginning with # you could:
sudo ipset -N myset nethash
sudo ipset add myset 198.54.126.120
curl https://iplists.firehol.org/files/firehol_level1.netset > /home/hitman/.data/firehol_level1.netset
curl https://iplists.firehol.org/files/firehol_level2.netset > /home/hitman/.data/firehol_level2.netset
sed -i '/^#/d' ~/.data/firehol_level{1,2}.netset ## << this is my addition
for ip in $(cat /home/hitman/.data/firehol_level1.netset /home/hitman/.data/firehol_level2.netset); do sudo ipset -A myset $ip;done
sudo iptables -I INPUT -m set --match-set myset src -j DROP
But in order to use the import feature I have to do a bit of a hack: join the files with cat, insert a header line at the top of the big file (in my case “create myset hash:net family inet hashsize 16384 maxelem 65536”), and put "add myset " in front of every IP.
I know how to do the first two, but I don’t know how to insert something in front of each IP.
The resulting text file should end up looking like this:
It will then be fed to the command “ipset restore < (file)”. In theory this method will take 5-10 seconds instead of the rather uncomfortable 4 minutes.
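One way to get the "add myset " prefix in front of every entry is awk. The following is only a sketch under assumptions: the set name and header line are the ones quoted above, the sample entries are placeholder documentation ranges standing in for the downloaded .netset files, and the temporary paths are made up for illustration.

```shell
#!/bin/sh
# Sketch: turn a plain list of IPs/CIDRs into an "ipset restore" file.
# The sample entries are placeholders, not real blocklist data.
list=$(mktemp)
restore=$(mktemp)

cat > "$list" <<'EOF'
# comment line, as found at the top of the .netset files
192.0.2.0/24
198.51.100.7
EOF

# BEGIN prints the header once. Every line that is not a comment and
# not empty gets "add myset " put in front of its first field.
awk 'BEGIN { print "create myset hash:net family inet hashsize 16384 maxelem 65536" }
     !/^#/ && NF { print "add myset " $1 }' "$list" > "$restore"

cat "$restore"

# Loading the result needs root and ipset installed:
#   sudo ipset restore < "$restore"
```

Because the header line already contains "create", the set does not have to exist before the restore.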
I decided to introduce awk and use /tmp/ for most things, apart from the original curl downloads and the final master file, which will still end up in ~/.data/
Anyways … think that should work
PS … you can also make use of -!, ex: sudo ipset restore -! < ~/.data/ipmaster.txt
Which will ignore errors.
PPS … I almost forgot to mention … please be careful with your ‘single quotes’ … I noticed in your copied code above the sed command was rendered inoperable because of using
I am blown away, it takes less than a second, half a second at most.
For anyone that might need this: I actually had to use the -! option. Otherwise, for whatever reason, the resulting ipset would be incomplete (probably some other bad line in my sources besides the # ones), whereas with -! it continues to parse instead of stopping.
Also, the first command (sudo ipset -N…) needs to be removed, because importing the ipset actually creates it (I didn’t know that). Otherwise there will be an error saying that it already exists.
Thank you very, very much sir! I owe you a cold one
Oh yeah, oops, just copied the whole thing.
That’ll teach us
Glad it works.
You can probably optimize it a bit more by hacking at the separate operations … and if you don't need any of the generated files … just do everything in /tmp/, which will also perform better (thanks to tmpfs).
(I checked and I think the reason for the errors without -! is duplicate lines … you could use sort | uniq to remove them first … but it might just be faster to ‘force-feed’ it without cleaning)
I have a feeling this version would be even faster
Hehe, putting it on a diet nicely. Yeah, it’s even faster now: 1 second instead of 3 for the download bits, less than 0.5 seconds for the merge and import. Not bad!
Didn’t even know curl was able to just merge multiple URLs into a file… Nice!
And yeah, using tmpfs is indeed better. My original idea was to keep backups so I could see if there are any problems (data hoarder me), but the sed -ni will throw anything else away anyway.
Even nicer, both IPv4 and IPv6 addresses (with letters) are being kept by this sed -ni '/^[0-9]/p'. What does it do, more precisely? At some point in the future I will have to use IPv6 for work; I'm not using it right now.
Interesting thing: I have now tried sort | uniq /tmp/iplist.txt > /tmp/test.txt (is this what you meant?) and it executed instantly. So I will probably include it as well, as just another precaution.
I appreciate you looking at this.
Later edit:
That one goes at the end. I ran it by mistake as is, and it needs an imported ipset to exist before adding anything else. Weird things, these, I have to agree :))
On the previous iteration I just removed any line beginning with ‘#’ … this version now removes any line that doesn't begin with a number ‘[0-9]’.
(or that's what should have been happening)
Funny you mention the difference for ipv4/6 between the 2 …
maybe it was actually parsed buggy in some way and we accidentally cleared it up
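To make the "keep only lines that begin with a digit" filter concrete, here is a small sketch written as sed -n '/^[0-9]/p'. It assumes that form of the command; the sample addresses are placeholders from documentation ranges, and the in-place flag -i is left out so nothing on disk is modified.

```shell
#!/bin/sh
# Placeholder input: two IPv4 entries, two IPv6 entries, one comment.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# header comment
192.0.2.10
2001:db8::1
fe80::1
198.51.100.0/24 # trailing comment
EOF

# -n turns off sed's default printing of every line.
# /^[0-9]/p prints only the lines whose first character is a digit.
kept=$(sed -n '/^[0-9]/p' "$sample")
printf '%s\n' "$kept"
```

With this input the comment line and fe80::1 are dropped, because neither starts with a digit, while 2001:db8::1 survives only because it happens to start with one. Anything after the address, such as a trailing comment, is kept as is.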
I settled down for a “sort | uniq” without the -u argument, tested it right now and it adds zero delay, so it goes in too
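For reference, a short sketch of why the -u argument matters here. The entries are placeholders; the point is only the difference between the two pipelines.

```shell
#!/bin/sh
# Placeholder list with one duplicated entry.
list=$(mktemp)
printf '%s\n' 198.51.100.7 192.0.2.0/24 198.51.100.7 > "$list"

# sort | uniq keeps one copy of every line (same result as sort -u).
deduped=$(sort "$list" | uniq)

# uniq -u prints only lines that occur exactly once, so an entry that
# was duplicated disappears from the output completely.
only_unique=$(sort "$list" | uniq -u)

printf 'sort | uniq:\n%s\n' "$deduped"
printf 'sort | uniq -u:\n%s\n' "$only_unique"
```

For a blocklist, losing every duplicated entry is not what you want, so plain sort | uniq (or sort -u) is the safer choice.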
Got it. We are getting to something very solid here sir.
To make it perfect, this is how OpenWrt's banIP filters IP addresses from the files. For IPv4 it’s number fields separated by dots, and for IPv6 it’s strings of characters separated by colons. It also appears to add the specific text required before each IP address (which means OpenWrt is also using the import feature we’re using here, nice).
I think those correspond to functions defined elsewhere in the code. We don't know what it's doing with those strings.
But yes, if I am reading that right, it works by checking for the correct format of the entries (by defining digits/letters, separators, etc.) and only sourcing those lines.
Yes I think that has no use here, it’s a custom command for their software. We need just the filter part of it, after the first quote.
Is that a sed or awk command?
Later edit: Found a problem now with the sed -ni '/^[0-9]/p': it doesn’t remove a # that comes after an IP address. For some reason the lists sometimes have that too. How messy are those damn source files?
ipset v7.6: Error in line 20022: Syntax error: cannot parse 176.221.42.32#: resolving to IPv4 address failed
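A possible way to deal with a # after the address is to cut everything from the first # to the end of the line before filtering. This is only a sketch: the first sample entry reproduces the one from the error message, the others are placeholders.

```shell
#!/bin/sh
sample=$(mktemp)
cat > "$sample" <<'EOF'
# full-line comment
176.221.42.32#
192.0.2.10   # note after the address
198.51.100.0/24
EOF

# 1) s/#.*//            delete from the first '#' to the end of the line
# 2) s/[[:space:]]*$//  trim whitespace left behind at the end
# 3) /^[0-9]/p          print only lines that still start with a digit
clean=$(sed -n -e 's/#.*//' -e 's/[[:space:]]*$//' -e '/^[0-9]/p' "$sample")
printf '%s\n' "$clean"
```

Full-line comments become empty after the first step and are then dropped by the last one.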
Nice.
It seems like they were just there to mark literal characters? Not sure.
(like sometimes you may want to use a character like - but it would be interpreted in the command, so you write \-)
You might also check if the -! is still needed.
Done it.
I researched awk more and basically managed to integrate both filters, to parse both IPv4 and IPv6 (I also found where the backslashes came from: they are part of that proprietary script), print the first line needed, and import my custom IP address, all in a single command.
Here it is. Nice and final. Optimized to the point of starvation
EDIT 1: family inet6 and ip6tables are required in order to decode some ipv6 addresses, so I made the necessary adjustments. It grew in size a bit since some commands have to be duplicated. Also I included a few more lists.
I still have the -! argument because even if I use the strictest filtering, ipset will still throw a parsing error from time to time and leave the IP set incomplete. Just as you said yesterday: ‘force-feed’ it without cleaning.
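For readers who want the general shape of such a combined filter, here is a rough sketch, not the exact command from this thread. The set names myset4 and myset6 are hypothetical, the addresses are placeholders, and the IPv6 pattern is deliberately loose (hex digits and colons only), so ipset still does the real validation.

```shell
#!/bin/sh
# Sketch: one awk pass that writes separate restore files for IPv4 and IPv6.
list=$(mktemp)
v4=$(mktemp)
v6=$(mktemp)

cat > "$list" <<'EOF'
# comment
192.0.2.0/24
198.51.100.7 # inline comment
2001:db8::/32
not-an-address
EOF

awk -v v4="$v4" -v v6="$v6" '
BEGIN {
    print "create myset4 hash:net family inet hashsize 16384 maxelem 65536"  > v4
    print "create myset6 hash:net family inet6 hashsize 16384 maxelem 65536" > v6
}
{ sub(/#.*/, "") }   # strip comments, full-line or trailing
# IPv4: four dot-separated numbers with an optional /prefix
$1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+(\/[0-9]+)?$/ { print "add myset4 " $1 > v4; next }
# IPv6 (loose): hex digits and colons, at least one colon, optional /prefix
$1 ~ /^[0-9A-Fa-f:]*:[0-9A-Fa-f:]*(\/[0-9]+)?$/ { print "add myset6 " $1 > v6 }
' "$list"

cat "$v4" "$v6"

# Loading and matching need root, ipset, iptables and ip6tables:
#   sudo ipset restore -! < "$v4"
#   sudo ipset restore -! < "$v6"
#   sudo iptables  -I INPUT -m set --match-set myset4 src -j DROP
#   sudo ip6tables -I INPUT -m set --match-set myset6 src -j DROP
```

Keeping the two families in separate sets follows from the point above that family inet6 and ip6tables are needed for the IPv6 entries.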
Thank you again for your help! If I find within these 3 days any problem with this setup (script and filtering system) I will return, but so far it works great.