Introduction
In this video walk-through, we covered using Google operators to perform advanced searches for information gathering. This was part of TryHackMe Google Dorking.
Google is arguably the most famous example of a “Search Engine”. I mean, who remembers Ask Jeeves? (shudders)
Now it might be rather patronising explaining how these “Search Engines” work, but there’s a lot more going on behind the scenes than what we see. More importantly, we can leverage this to our advantage to find all sorts of things that a wordlist wouldn’t. Researching as a whole, especially in the context of Cybersecurity, encapsulates almost everything you do as a pentester. MuirlandOracle has created a fantastic room on learning the right attitude towards research, and exactly what information you can gain from it.
“Search Engines” such as Google are huge indexers – specifically, indexers of content spread across the World Wide Web.
Google has a lot of websites crawled and indexed. Your average Joe uses Google to look up Cat pictures (I’m more of a Dog person myself…). Whilst Google will have many Cat pictures indexed ready to serve to Joe, this is a rather trivial use of the search engine in comparison to what it can be used for.
Finding Exactly What You Want
Sometimes, you need to find a very specific phrase. The best way to do this is by putting your search term in double quotes. For example, if you search for "penetration testing training", Google will only show you results that contain that exact phrase in that exact order. This is super useful for narrowing down your search and getting rid of irrelevant results.
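As a rough illustration, here is a minimal Python sketch of how an exact-phrase query could be turned into a search URL. The function name exact_phrase_url is made up for this example, and the sketch assumes the usual https://www.google.com/search?q=... URL layout; it only builds the link and does not send any request.

```python
from urllib.parse import urlencode


def exact_phrase_url(phrase: str) -> str:
    """Return a Google search URL for an exact-phrase query (hypothetical helper)."""
    # Wrapping the phrase in double quotes asks for those words in that order.
    query = f'"{phrase}"'
    # urlencode percent-encodes the quotes (%22) and turns spaces into '+'.
    # Assumption: the search box value travels in the 'q' query parameter.
    return "https://www.google.com/search?" + urlencode({"q": query})


print(exact_phrase_url("penetration testing training"))
# https://www.google.com/search?q=%22penetration+testing+training%22
```

Pasting the printed link into a browser should behave the same as typing the quoted phrase into the search box.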
Scoping Your Search to a Specific Site
If you want to search for something only on a particular website, you can use the site: operator. For instance, site:tryhackme.com will only show you results from the TryHackMe website. This is a great first step when you’re starting to gather information on a target, as it gives you a good overview of what Google has indexed for that site.
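To make the idea of "scoping" concrete, the sketch below mimics the filter locally: given some URLs, it keeps only those whose hostname is the target domain or one of its subdomains. This is an approximation of how site: appears to behave, not Google's actual logic, and the function name in_site_scope and the sample URLs are invented for illustration.

```python
from urllib.parse import urlparse


def in_site_scope(url: str, domain: str) -> bool:
    """Return True if the URL's host is the domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    domain = domain.lower()
    # Compare hostnames, not raw strings, so a domain name that merely
    # appears in the path of another site does not count as a match.
    return host == domain or host.endswith("." + domain)


# Made-up sample URLs, purely for demonstration.
results = [
    "https://tryhackme.com/some/page",
    "https://blog.tryhackme.com/a-post",
    "https://example.com/tryhackme.com",
]
print([u for u in results if in_site_scope(u, "tryhackme.com")])
```

A check like this is handy when you collect URLs from several sources and want to confirm that everything in your notes really belongs to the target.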
Hunting for Specific File Types
You can also hunt for specific types of files using the filetype: operator. Let’s say you’re looking for PDF documents about penetration testing. You could search for filetype:pdf pen testing. This can be incredibly powerful. For example, you could look for configuration files, like filetype:php wp-config, which might expose sensitive information.
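If you want to repeat the same search across several file types, a small helper can generate the queries for you. This is only a sketch: filetype_queries is a hypothetical name, and the extensions in the example are arbitrary choices rather than a list from the room.

```python
def filetype_queries(keywords: str, extensions: list[str]) -> list[str]:
    """Build one 'filetype:' query per extension for the same keywords."""
    # lstrip('.') lets callers pass either 'pdf' or '.pdf'.
    return [f"filetype:{ext.lstrip('.')} {keywords}" for ext in extensions]


for query in filetype_queries("pen testing", ["pdf", ".docx"]):
    print(query)
# filetype:pdf pen testing
# filetype:docx pen testing
```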
Searching Within Titles and URLs
Two of my favorite operators are intitle: and inurl:.
intitle: lets you search for keywords that appear in the title of a webpage. A classic example is intitle:"index of". This often reveals websites that have directory listing enabled, which means you can browse their file structure freely. You can get even more specific, like intitle:"index of" "parent directory" or intitle:"index of" admin to find directories that might contain administrative files.
inurl: searches for keywords within the URL itself. This is another great way to find sensitive files. For example, a search like filetype:php inurl:wp-config could lead you directly to a WordPress configuration file (wp-config.php), which often contains database credentials.
By combining these simple operators, you can create some incredibly powerful search queries to uncover information that’s not meant to be public.
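Here is one way the combining step could look in code: a minimal, hypothetical Dork builder (the class and method names are my own, not part of any library) that chains operators and joins them with spaces. It only assembles the query string, so you would still paste the result into Google yourself.

```python
from dataclasses import dataclass, field


@dataclass
class Dork:
    """Collects search operators and joins them into one query string."""
    parts: list = field(default_factory=list)

    def _op(self, name: str, value: str) -> "Dork":
        # Values containing spaces are quoted so they stay attached to the operator.
        if " " in value:
            value = f'"{value}"'
        self.parts.append(f"{name}:{value}")
        return self

    def site(self, domain: str) -> "Dork":
        return self._op("site", domain)

    def filetype(self, ext: str) -> "Dork":
        return self._op("filetype", ext)

    def intitle(self, text: str) -> "Dork":
        return self._op("intitle", text)

    def inurl(self, text: str) -> "Dork":
        return self._op("inurl", text)

    def phrase(self, text: str) -> "Dork":
        # Exact-phrase match: always wrapped in double quotes.
        self.parts.append(f'"{text}"')
        return self

    def term(self, word: str) -> "Dork":
        # Plain keyword with no operator.
        self.parts.append(word)
        return self

    def build(self) -> str:
        return " ".join(self.parts)


print(Dork().intitle("index of").phrase("parent directory").build())
# intitle:"index of" "parent directory"
print(Dork().site("tryhackme.com").filetype("pdf").term("training").build())
# site:tryhackme.com filetype:pdf training
```

Returning self from each method is what allows the calls to be chained, which keeps a long query readable as a sequence of small steps.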
TryHackMe Google Dorking Room Answers
Name the key term of what a “Crawler” is used to do.
What is the name of the technique that “Search Engines” use to retrieve this information about websites?
What is an example of the type of contents that could be gathered from a website?
Where would “robots.txt” be located on the domain “ablog.com”?
If a website was to have a sitemap, where would that be located?
How would we prevent a “Crawler” from indexing the directory “/dont-index-me/”?
What is the extension of a Unix/Linux system configuration file that we might want to hide from “Crawlers”?
What is the typical file structure of a “Sitemap”?
What real life example can “Sitemaps” be compared to?
Name the keyword for the path taken for content on a website.
What would be the format used to query the site bbc.co.uk about flood defences?
What term would you use to search by file type?
What term can we use to look for login pages?
Video WalkThrough