
I applaud the DDG spirit, but without their own crawl data their future is doomed.

If their data sources ever cut them off, it's over.

They need to build their own crawlers like gigablast.




They aren't collecting much if anything.


>They need to build their own crawlers like gigablast.

Unfortunately the opportunity for that is pretty slim. Many webmasters block crawlers that aren't from the top search engines. :-(


Can't duckduckbot just present itself to the webserver as GoogleBot?


In our company we do a forward-confirmed reverse DNS lookup to verify whether a bot is who it claims to be.

Oddly enough we have blocked legit googlebot/bing/baidu servers, because they fail to properly configure their servers...
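
For anyone curious, a forward-confirmed reverse DNS check can be sketched roughly like this in Python. This is only an illustration, not our production code: the function names, the injectable resolver parameters, and the suffix list are all assumptions made for the example, and it only handles IPv4.

```python
import socket

# Assumed suffix list, for illustration only; check each search
# engine's own documentation for the domains it actually uses.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip):
    """PTR lookup: return the hostname for ip, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # socket.herror / socket.gaierror are subclasses
        return None


def forward_lookup(hostname):
    """A lookup: return the IPv4 addresses for hostname (empty on failure)."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def is_verified_bot(ip, suffixes=TRUSTED_SUFFIXES,
                    reverse=reverse_lookup, forward=forward_lookup):
    """Hypothetical helper: True only if ip passes all three steps."""
    # Step 1: reverse-resolve the connecting IP to a hostname.
    hostname = reverse(ip)
    if not hostname:
        return False
    hostname = hostname.rstrip(".").lower()
    # Step 2: the hostname must sit under a domain we trust.
    if not hostname.endswith(tuple(suffixes)):
        return False
    # Step 3: forward-resolve the hostname; it must map back to the same IP.
    return ip in forward(hostname)
```

The resolvers are passed in as parameters so the logic can be exercised without touching real DNS. A bot with no PTR record, or whose PTR name resolves to a different address, fails the check, which is how a legitimate but oddly configured crawler can end up blocked.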


Their servers are probably configured fine. They likely have a pool of servers with no reverse DNS to try and catch sites serving different content to Googlebot.


I'm pretty confident that a server at 1.2.3.4 whose reverse DNS returns crawl-5-6-7-8-googlebot.com is a badly configured one.



