~branko's tildelog

random thoughts from ~branko... just ignore

Extract top blocked domains from pi-hole using bash

February 24, 2019 — ~branko

So, I wanted to look around and see what the top blocked domains on my local network are. As far as I know, the pi-hole web interface only offers to show you the top N blocked domains. I wanted them all.

So, once I figured out that it is all in the log files in /var/log/, it all boiled down to figuring out how to detect blocked domains, and then it was just one simple script:

grep ' is 0\.0\.0\.0$' pihole.log | sed -e 's/.*\s\(.*\)\sis 0\.0\.0\.0$/\1/' | sort | uniq -c | sort -nr

For me, the output of the script looks like this:

790 mobile.pipe.aria.microsoft.com
116 winatp-gw-eus.microsoft.com
 52 us-v20.events.data.microsoft.com
 50 tracker.grepler.com
 50 ads.viber.com
 47 v10c.events.data.microsoft.com
 46 settings-win.data.microsoft.com
 40 tracker.trackerfix.com
 16 nexusrules.officeapps.live.com
  6 browser.pipe.aria.microsoft.com
  4 watson.telemetry.microsoft.com
  3 v10.events.data.microsoft.com
  2 nexus.officeapps.live.com
  2 az416426.vo.msecnd.net
  1 vortex.data.microsoft.com

Yes, there are a lot of Windows machines on my network :) But since I use an ad blocker, you cannot see many blocked ads here.
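
As a side note, the same counting can probably be done in a single awk pass instead of the grep/sed/uniq chain. This is only a rough sketch: the function name top_blocked is made up for this post, and it assumes that a blocked reply shows up in the log as a line ending in "<domain> is 0.0.0.0", which is what the one-liner above relies on too.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count blocked domains in a pi-hole log file.
# Assumption: a blocked lookup is logged as a line ending in
# "<domain> is 0.0.0.0", so the domain is the third field from the end.
top_blocked() {
  local log_file="$1"
  awk '
    # Keep only lines whose last two fields are "is" and "0.0.0.0".
    $NF == "0.0.0.0" && $(NF-1) == "is" { count[$(NF-2)]++ }
    # Print "<count> <domain>" for every domain seen.
    END { for (d in count) print count[d], d }
  ' "$log_file" |
    # Highest count first; ties are ordered by domain name.
    sort -k1,1nr -k2,2
}
```

Calling it as top_blocked pihole.log should give the same kind of list as above, just without the left padding that uniq -c adds.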

Here is an example from another of my pi-hole servers. This one:

  • is outside of my network,
  • has a lot more lists, obtained from this awesome aggregator,
  • is used by more than just family members, and
  • is used in conjunction with OpenVPN, so I am covered on my mobile too

So, here it is:

2516 mobile.pipe.aria.microsoft.com
  54 app-measurement.com
  31 reports.crashlytics.com
   8 www.googletagmanager.com
   8 www.google-analytics.com
   4 www.googletagservices.com
   4 static.chartbeat.com
   4 js-agent.newrelic.com
   4 googleads.g.doubleclick.net
   4 dev.visualwebsiteoptimizer.com
   3 sb.scorecardresearch.com
   3 cdn.optimizely.com
   2 static.doubleclick.net
   2 settings.crashlytics.com
   2 secure-us.imrworldwide.com
   2 s.webtrends.com
   2 realtime.services.disqus.com
   2 pagead2.googlesyndication.com
   2 logs-01.loggly.com
   2 load.sumome.com
   2 d1z2jf7jlzjs58.cloudfront.net
   2 cdn.segment.io
   2 c1.rfihub.net
   2 c.amazon-adsystem.com
   2 ads.mopub.com
   1 www.zergnet.com
   1 video.adaptv.advertising.com
   1 stats.wp.com
   1 ssl.google-analytics.com
   1 secure.quantserve.com
   1 s.skimresources.com
   1 s.adroll.com
   1 referrer.disqus.com
   1 platform.tumblr.com
   1 pixel.wp.com
   1 p.typekit.net
   1 nexusrules.officeapps.live.com
   1 mads.amazon-adsystem.com
   1 live.sekindo.com
   1 hello.myfonts.net
   1 experience.contextly.com
   1 events.redditmedia.com
   1 engine.adzerk.net
   1 device-metrics-us.amazon.com
   1 d3ezl4ajpp2zy8.cloudfront.net
   1 cx.atdmt.com
   1 collector-medium.lightstep.com
   1 cdn.simpleanalytics.io
   1 cdn.onesignal.com
   1 api.branch.io
   1 ads.adthrive.com
   1 ad-delivery.net

Stay safe online, friends;)

tags: bash, bash-magic, pihole, pi-hole