Introduction
In this video walk-through, we covered using Google search operators to perform advanced searches for information gathering. This was part of the TryHackMe Google Dorking room.
Google is arguably the most famous example of a “Search Engine”. I mean, who remembers Ask Jeeves? (shudders)
Now it might be rather patronising to explain how these “Search Engines” work, but there’s a lot more going on behind the scenes than what we see. More importantly, we can leverage this to our advantage to find all sorts of things that a wordlist wouldn’t. Research as a whole, especially in the context of cybersecurity, encapsulates almost everything you do as a pentester. MuirlandOracle has created a fantastic room on the attitudes to bring to research and exactly what information you can gain from it.
“Search Engines” such as Google are huge indexers – specifically, indexers of content spread across the World Wide Web.
Google has a lot of websites crawled and indexed. Your average Joe uses Google to look up Cat pictures (I’m more of a Dog person myself…). Whilst Google will have many Cat pictures indexed ready to serve to Joe, this is a rather trivial use of the search engine in comparison to what it can be used for.
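To make the idea of an "indexer" a little more concrete, here is a toy sketch in Python. It is not how Google works internally; it only illustrates the general idea of mapping keywords to the pages that contain them. The page URLs and text in PAGES are made up for the example, and build_index and search are hypothetical helper names.

```python
import re
from collections import defaultdict

# Hypothetical pages standing in for crawled content (assumption: not real sites).
PAGES = {
    "https://example.com/cats": "Cute cat pictures and cat facts",
    "https://example.com/dogs": "Dog pictures and training tips",
}


def tokenize(text):
    """Split text into lowercase alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each keyword to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every keyword in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(PAGES)
print(sorted(search(index, "cat pictures")))  # ['https://example.com/cats']
```

Joe's search for "cat pictures" only matches the page that contains both keywords, which is roughly what "indexed and ready to serve" means at a very small scale.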
You can get a curated list of Google Dorks and the rules used to form a query by joining my channel membership here.
Room Answers
Name the key term of what a “Crawler” is used to do
What is the name of the technique that “Search Engines” use to retrieve this information about websites?
What is an example of the type of contents that could be gathered from a website?
Where would “robots.txt” be located on the domain “ablog.com”?
If a website was to have a sitemap, where would that be located?
How would we prevent a “Crawler” from indexing the directory “/dont-index-me/”?
What is the extension of a Unix/Linux system configuration file that we might want to hide from “Crawlers”?
What is the typical file structure of a “Sitemap”?
What real life example can “Sitemaps” be compared to?
Name the keyword for the path taken for content on a website
What would be the format used to query the site bbc.co.uk about flood defences?
What term would you use to search by file type?
What term can we use to look for login pages?
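Two of the ideas behind the questions above, robots.txt rules and search operators, can be tried out locally. The sketch below is only an illustration under a few assumptions: the ROBOTS_TXT content is a made-up file for the hypothetical site ablog.com, the helper names is_allowed, build_dork and search_url are my own, and the operators shown (site:, filetype:, intitle:) are the commonly documented Google ones. It uses Python's standard urllib.robotparser to check how a well-behaved crawler would interpret a Disallow rule.

```python
from urllib.parse import quote_plus
from urllib.robotparser import RobotFileParser

# Assumed example of a robots.txt served from the root of the site,
# i.e. ablog.com/robots.txt (the content here is invented for illustration).
ROBOTS_TXT = """\
User-agent: *
Disallow: /dont-index-me/
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if a crawler obeying robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def build_dork(terms="", **operators):
    """Join operator:value pairs and free-text terms into one query string."""
    parts = [f"{name}:{value}" for name, value in operators.items()]
    if terms:
        parts.append(terms)
    return " ".join(parts)


def search_url(query):
    """Build a Google search URL for the query (nothing is sent)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


print(is_allowed(ROBOTS_TXT, "http://ablog.com/dont-index-me/page.html"))
print(is_allowed(ROBOTS_TXT, "http://ablog.com/blog/post-1"))

print(build_dork("flood defences", site="bbc.co.uk"))
print(build_dork(filetype="pdf", site="bbc.co.uk"))
print(build_dork(intitle="login"))
print(search_url(build_dork("flood defences", site="bbc.co.uk")))
```

Keep in mind that robots.txt is a request rather than an access control: it tells cooperative crawlers what to skip, but it does not stop anyone from visiting the listed paths directly.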
Video Walk-Through