Hidden Treasures of TLDs: How I Scraped Hackernews for Domain Names

silentg

Hidden Treasures of TLDs: How I Scraped Hackernews for Domain Names – Klaus Breyer > CTO writing about Code, Business, Product & Engineering Orgs.

Background Story

As somebody who frequently starts new projects, I often need to think about project names and domain names.

Derek Sivers once posted about how to find a good (and free) .com domain, and I found it very inspiring. However, in some cases the project name is defined first, or you want to make a good play on words with the name and domain of the project. In those cases, you need to have a suitable TLD.

But finding such domain names is tricky. If you go through Wikipedia, you end up with more than 1.2k TLDs. (Trust me, I did it.)

So I needed to narrow it down. I did so by running the list through the filter of a bubble that a) seems relevant to me and b) is large enough: people who read and post on Hackernews. So I had my Raspberry Pi scrape the Hackernews API for about 3 weeks (because of rate limits), and the results are what you find up there.

I ended up with a database full of HN stories going back to the very beginning, which added up to ~1GB.
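
For anyone who wants to try the same idea, here is a minimal sketch of the counting step in Python. It is not the article's actual code. It assumes each story is a dict shaped like the JSON returned by the public Hacker News item endpoint (`type` and `url` fields), and it treats the last label of the hostname as the TLD. The names `fetch_item`, `tld_of` and `count_tlds` are made up for this example.

```python
import json
from collections import Counter
from urllib.parse import urlparse
from urllib.request import urlopen

# Public Hacker News API item endpoint.
HN_ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{id}.json"


def fetch_item(item_id):
    """Fetch one item as a dict (network call; pace your requests politely)."""
    with urlopen(HN_ITEM_URL.format(id=item_id), timeout=10) as resp:
        return json.load(resp)


def tld_of(url):
    """Return the last hostname label in lowercase, or None if there is none."""
    host = urlparse(url).hostname
    if not host:
        return None
    host = host.rstrip(".")
    if "." not in host:
        return None
    last = host.rsplit(".", 1)[-1].lower()
    # Assumption: an all-digit last label means an IPv4 address, not a TLD.
    if last.isdigit():
        return None
    return last


def count_tlds(items):
    """Count TLDs over an iterable of HN item dicts, ignoring non-stories."""
    counts = Counter()
    for item in items:
        if not item or item.get("type") != "story":
            continue
        tld = tld_of(item.get("url", ""))
        if tld:
            counts[tld] += 1
    return counts
```

With the items stored locally, `count_tlds(stored_items).most_common(50)` would give a ranked list of the TLDs that show up most in submitted links.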
 
Seems like a neat method if you're looking for dictionary word domains. It actually reminds me of ScrapeBox, because you can do something very similar on pretty much any domain with that software. It's kind of an old method that SEO guys used for finding domains with strong backlinks pointing to them from sites like CNN, CBC, Yahoo, HuffingtonPost, etc. Open ScrapeBox and use the Google scraper to collect the pages Google has indexed from a site like CNN.com (or any other domain: HackerNews, CBC, HuffPost, etc.). Then use the domain scraper to pull the external domains off each of the pages you initially scraped. Finally, run them through a bulk domain availability checker to check their status. You might be surprised how many domains linked to from articles on sites like CNN, MSNBC, CBC, HackerNews, and others are sitting out there available to register.
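
If you don't have ScrapeBox, the "scrape external domains off a page" step can be roughly sketched in plain Python. This is only an illustration of the idea, not how ScrapeBox does it. It assumes you already have the page HTML as a string, the names `LinkCollector` and `external_domains` are made up for this example, and the same-site check (strip a leading `www.`, skip subdomains) is a naive assumption. The availability check is left out because it depends on whichever registrar or WHOIS service you use.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def strip_www(host):
    """Drop a leading 'www.' so www.example.com and example.com match."""
    return host[4:] if host.startswith("www.") else host


def external_domains(html, page_url):
    """Return the set of hostnames linked from html that are not on page_url's site."""
    site = strip_www(urlparse(page_url).hostname or "")
    parser = LinkCollector()
    parser.feed(html)
    domains = set()
    for href in parser.hrefs:
        parsed = urlparse(urljoin(page_url, href))  # resolve relative links
        if parsed.scheme not in ("http", "https") or not parsed.hostname:
            continue  # skip mailto:, javascript:, etc.
        host = strip_www(parsed.hostname)
        # Assumption: the site itself and its subdomains count as internal.
        if host == site or host.endswith("." + site):
            continue
        domains.add(host)
    return domains
```

Run it over each page you scraped, merge the resulting sets, and feed the merged list to your bulk availability checker.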
 
