Premise
In this video walkthrough, we covered how to hunt for and identify an advanced persistent threat (APT) with Splunk by correlating events and reconstructing how the incident happened. We also solved the TryHackMe Boss of the SOC V1 room.
- The video builds on the introductory “Splunk 101” session.
- It is part of the Blue Primer series and focuses on analyzing logs and investigating incidents using Splunk.
- Task 5, “Investigating Advanced Persistent Threats (APT),” is the main focus.
Challenge Introduction
This room is part of the Blue Primer series and teaches you how to use Splunk to search through massive amounts of information.
The first section of this room consists of a quiz on Splunk. I recommend attempting the quiz while the machine loads, as it can take some time. If the VM fails to load, a direct link to the OVA file (Splunk) can be found here. You can also build this manually using the data and instructions found at this link.
Scenario Details
- The exercise uses Splunk to analyze a dataset simulating an APT attack.
- The attack involves a web server compromise that leads to defacement and further exploitation.
- The investigation employs the cyber kill chain model to track the attacker’s actions:
- Reconnaissance: Initial information gathering.
- Weaponization & Delivery: Deploying and delivering malicious payloads.
- Exploitation: Executing the payload and compromising the system.
- Post-exploitation: Maintaining access and lateral movement.
Investigation Process
1. Reconnaissance Phase
- Objective: Identify the IP address scanning the web server.
- Steps:
- Use Splunk’s index search to filter through the uploaded datasets.
- Query: index="botsv1" imreallynotbatman.com
- Examine logs for source IP addresses associated with high traffic or scanning activity.
- Cross-reference logs to identify signatures (e.g., SQL injection, cross-site scripting) indicating scanning activity.
- Confirm suspicious activity from an external IP (e.g., 40.80.148.42) by matching it with attack signatures in Intrusion Detection System (IDS) logs (e.g., Suricata).
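The first steps above can be sketched as a single search. This is a minimal sketch, not the exact query from the video: it assumes the dataset lives in an index named `botsv1`, that HTTP traffic is stored under the `stream:http` sourcetype, and that the client address is in a field called `src_ip`. Check the fields sidebar in your own instance and adjust the names if they differ.

```spl
index="botsv1" imreallynotbatman.com sourcetype="stream:http"
| stats count by src_ip
| sort - count
```

An address that accounts for a disproportionate share of requests is a candidate scanner, but it should still be confirmed against the IDS alerts before being treated as the answer.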
2. Attack Tool Identification
- Objective: Determine the tool used for scanning.
- Steps:
- Analyze HTTP headers in IDS logs for user-agent strings or other identifiers.
- Switch between source types (stream:http, suricata) to locate relevant data fields like user-agent or content-type.
- Identify tools such as web scanners or bots through HTTP header analysis.
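One way to surface the scanning tool is to group the suspect address's requests by user-agent string. The search below is a sketch under assumptions: the index name `botsv1`, the `stream:http` sourcetype, and the field name `http_user_agent` are not taken from the video and may differ in your instance. The IP is the scanning address given in the room answers.

```spl
index="botsv1" sourcetype="stream:http" src_ip="40.80.148.42"
| stats count by http_user_agent
| sort - count
```

Scanners often announce themselves in this header or in other request headers, so it is worth opening a few raw events as well rather than relying on the aggregated view alone.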
Splunk Techniques Demonstrated
- Using Indexes:
- Locate specific datasets containing relevant logs (e.g., botsv1 for simulated APT data).
- Field Narrowing:
- Select specific fields (e.g., source_ip, signature) to refine searches.
- Source Type Filtering:
- Switch between sources like stream:http (web traffic logs) and suricata (IDS logs) for context-specific analysis.
- Signature Analysis:
- Examine triggered IDS signatures to categorize malicious activity (e.g., SQL injection, cross-site scripting).
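The four techniques above combine naturally in one search: pick the index, filter on a sourcetype, narrow to a field value, and group by signature. This is a hedged sketch, assuming the index is `botsv1`, the IDS sourcetype is `suricata`, and that alerts expose the fields `event_type`, `src_ip`, and `alert.signature`, as Suricata's JSON output usually does.

```spl
index="botsv1" sourcetype="suricata" event_type="alert" src_ip="40.80.148.42"
| stats count by alert.signature
| sort - count
```

Signature names that mention SQL injection or cross-site scripting probes from a single source are a strong hint that the traffic is automated scanning rather than normal browsing.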
Key Learnings
- Cyber Kill Chain Context:
- Each Splunk query and analysis step maps to stages of the cyber kill chain.
- Log Sources:
- Splunk aggregates various log types (firewall, HTTP, IDS) into searchable datasets, making it a powerful tool for incident investigation.
- Practical Skills:
- The video emphasizes real-world skills such as log filtering, signature identification, and tool detection.
Room Answers | TryHackMe Boss of the SOC V1
Splunk queries always begin with this command implicitly unless otherwise specified. What command is this? When performing additional queries to refine received data this command must be added at the start. This is a prime example of a slight trick question.
search
When searching for values, it’s fairly typical within security to look for uncommon events. What command can we include within our search to find these?
rare
What about the inverse? What if we want the most common security event?
top
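As a quick illustration of the two commands, the search below lists the five most frequent client addresses in the HTTP data. The index `botsv1`, the sourcetype `stream:http`, and the field `src_ip` are assumptions about how the data is named in your instance.

```spl
index="botsv1" sourcetype="stream:http"
| top limit=5 src_ip
```

Replacing `top` with `rare` on the same line returns the five least frequent values instead, which is often the more interesting view when hunting.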
When we import data into Splunk, what is it stored under?
index
We can create ‘views’ that allow us to consistently pull up the same search over and over again; what are these called?
dashboard
Importing data doesn’t always go as planned and we can sometimes end up with multiple copies of the same data, what command do we include in our search to remove these copies?
dedup
Splunk can be used for more than just a SIEM and it’s commonly used in marketing to track things such as how long a shopping trip on a website lasts from start to finish. What command can we include in our search to track how long these event pairs take?
transaction
In a manner similar to Linux, we can ‘pipe’ search results into further commands, what character do we use for this?
|
In performing data analytics with Splunk (ironically what the tool is at its core) it’s useful to track occurrences of events over time, what command do we include to plot this?
timechart
What about if we want to gather general statistical information about a search?
stats
Data imported into Splunk is categorized into columns called what?
fields
When we import data into Splunk we can view its point of origination, what is this called? I’m looking for the machine aspect of this here.
host
When we import data into Splunk we can view its point of origination from within a system, what is this called?
source
We can classify these points of origination and group them all together, viewing them as their specific type. What is this called? Use the syntax found within the search query rather than the proper name for this.
sourcetype
When performing functions on data we are searching through we use a specific command prior to the evaluation itself, what is this command?
eval
Love it or hate it, regular expressions are a massive component of Splunk. What command do we use to specify regex within a search?
rex
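For a concrete feel of `rex`, the sketch below pulls the submitted password out of login form posts into a new field. Several names here are assumptions rather than facts from the video: the index `botsv1`, the sourcetype `stream:http`, the fields `http_method` and `form_data`, and the form parameter `passwd`. The field name `userpassword` is made up for this example.

```spl
index="botsv1" sourcetype="stream:http" http_method="POST" form_data="*passwd*"
| rex field=form_data "passwd=(?<userpassword>\w+)"
| table _time, src_ip, userpassword
```

The named capture group becomes a regular search-time field, so it can be passed to `stats`, `eval`, or any later command in the pipeline.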
It’s fairly common to create subsets and specific views for less technical Splunk users, what are these called?
pivot table
What is the proper name of the time date field in Splunk?
_time
How do I specifically include only the first few values found within my search?
head
More useful than you would otherwise imagine, how do I flip the order that results are returned in?
reverse
When viewing search results, it’s often useful to rename fields using user-provided tables of values. What command do we include within a search to do this?
lookup
We can collect events into specific time frames to be used in further processing. What command do we include within a search to do just that?
bucket
We can also define data into specific sections of time to be used within chart commands, what command do we use to set these lengths of time? This is different from the previous question as we are no longer collecting for further processing.
span
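The difference between the last two answers is easier to see in a search. In this sketch, `bucket` rounds each event's timestamp down to the hour so a later command can group on it. The index `botsv1`, sourcetype `stream:http`, and field `src_ip` are assumed names.

```spl
index="botsv1" sourcetype="stream:http"
| bucket _time span=1h
| stats count by _time, src_ip
```

When the goal is a chart rather than further processing, the interval is set directly on the chart command, for example `| timechart span=1h count by src_ip`.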
When producing statistics regarding a search it’s common to number the occurrences of an event, what command do we include to do this?
count
Last but not least, what is the website where you can find the Splunk apps at?
splunkbase.splunk.com
We can also add new features into Splunk, what are these called?
apps
What does SOC stand for?
security operations center
What does SIEM stand for?
security information and event management
How about BOTS?
boss of the soc
And CIM?
common information model
What is the website where you can find the Splunk forums?
community.splunk.com
What IP is scanning our web server?
40.80.148.42
What web scanner scanned the server?
Acunetix
What is the IP address of our web server?
192.168.250.70
What content management system is imreallynotbatman.com using?
Joomla
What address is performing the brute-forcing attack against our website?
23.22.63.114
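A brute-force source can be found by counting login form submissions per client address. This is a sketch, assuming the index `botsv1`, the sourcetype `stream:http`, the fields `dest_ip`, `http_method`, `form_data`, and `src_ip`, and a login form that posts `username` and `passwd` parameters. The destination is the web server address from the earlier answer.

```spl
index="botsv1" sourcetype="stream:http" dest_ip="192.168.250.70" http_method="POST" form_data="*username*passwd*"
| stats count by src_ip
| sort - count
```

A single address responsible for hundreds of login posts in a short window stands out immediately in the result.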
What was the first password attempted in the attack?
12345678
One of the passwords in the brute force attack is James Brodsky’s favorite Coldplay song. Which six-character song is it?
yellow
What was the correct password for admin access to the content management system running imreallynotbatman.com?
batman
What was the average password length used in the password brute forcing attempt rounded to closest whole integer?
6
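One possible way to arrive at an average like this is to extract each attempted password, measure its length, and average the result. The names below are assumptions: index `botsv1`, sourcetype `stream:http`, fields `http_method` and `form_data`, and the `passwd` form parameter. `userpassword`, `pwlen`, and `avg_len` are invented for the example.

```spl
index="botsv1" sourcetype="stream:http" http_method="POST" form_data="*username*passwd*"
| rex field=form_data "passwd=(?<userpassword>\w+)"
| eval pwlen=len(userpassword)
| stats avg(pwlen) as avg_len
| eval avg_len=round(avg_len, 0)
```

The final `eval` rounds to the nearest whole number, matching the way the question asks for the answer.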
How many seconds elapsed between the time the brute force password scan identified the correct password and the compromised login rounded to 2 decimal places?
92.17
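The `transaction` command from the quiz section fits this question, since it groups related events and adds a `duration` field in seconds. A hedged sketch, assuming the index `botsv1`, sourcetype `stream:http`, fields `http_method` and `form_data`, and the `passwd` form parameter, with `userpassword` as a made-up field name:

```spl
index="botsv1" sourcetype="stream:http" http_method="POST" form_data="*username*passwd*"
| rex field=form_data "passwd=(?<userpassword>\w+)"
| search userpassword="batman"
| transaction userpassword
| eval duration=round(duration, 2)
| table duration
```

This relies on the correct password appearing in exactly two events, once during the brute-force run and once at the attacker's real login. If more events match, the duration covers the span from the first to the last of them.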
How many unique passwords were attempted in the brute force attempt?
412
What is the name of the executable uploaded by P01s0n1vy?
3791.exe
What is the MD5 hash of the executable uploaded?
AAE3F5A29935E6ABCC2C2754D12A9AF0
What is the name of the file that defaced the imreallynotbatman.com website?
poisonivy-is-coming-for-you-batman.jpeg
This attack used dynamic DNS to resolve to the malicious IP. What fully qualified domain name (FQDN) is associated with this attack?
prankglassinebracket.jumpingcrab.com
What IP address has P01s0n1vy tied to domains that are pre-staged to attack Wayne Enterprises?
23.22.63.114
Based on the data gathered from this attack and common open source intelligence sources for domain names, what is the email address that is most likely associated with P01s0n1vy APT group?
lillian.rose@po1s0n1vy.com
GCPD reported that common TTPs (Tactics, Techniques, Procedures) for the P01s0n1vy APT group if initial compromise fails is to send a spear phishing email with custom malware attached to their intended target. This malware is usually connected to P01s0n1vy’s initial attack infrastructure. Using research techniques, provide the SHA256 hash of this malware.
9709473ab351387aab9e816eff3910b9f28a7a70202e250ed46dba8f820f34a8
What special hex code is associated with the customized malware discussed in the previous question?
53 74 65 76 65 20 42 72 61 6e 74 27 73 20 42 65 61 72 64 20 69 73 20 61 20 70 6f 77 65 72 66 75 6c 20 74 68 69 6e 67 2e 20 46 69 6e 64 20 74 68 69 73 20 6d 65 73 73 61 67 65 20 61 6e 64 20 61 73 6b 20 68 69 6d 20 74 6f 20 62 75 79 20 79 6f 75 20 61 20 62 65 65 72 21 21 21
What does this hex code decode to?
Steve Brant’s Beard is a powerful thing. Find this message and ask him to buy you a beer!!!
What was the most likely IP address of we8105desk on 24AUG2016?
192.168.250.100
What is the name of the USB key inserted by Bob Smith?
MIRANDA_PRI
After the USB insertion, a file execution occurs that is the initial Cerber infection. This file execution creates two additional processes. What is the name of the file?
Miranda_Tate_unveiled.dotm
During the initial Cerber infection a VB script is run. The entire script from this execution, pre-pended by the name of the launching .exe, can be found in a field in Splunk. What is the length in characters of this field?
4490
Bob Smith’s workstation (we8105desk) was connected to a file server during the ransomware outbreak. What is the IP address of the file server?
192.168.250.20
What was the first suspicious domain visited by we8105desk on 24AUG2016?
solidaritedeproximite.org
The malware downloads a file that contains the Cerber ransomware cryptor code. What is the name of that file?
mhtr.jpg
What is the parent process ID of 121214.tmp?
3968
Amongst the Suricata signatures that detected the Cerber malware, which signature ID alerted the fewest number of times?
2816763
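A search along the following lines can rank the Cerber-related signatures by how often they fired. It is a sketch under assumptions: the index `botsv1`, the sourcetype `suricata`, the fields `alert.signature` and `alert.signature_id`, and signature names that contain the word "cerber".

```spl
index="botsv1" sourcetype="suricata" alert.signature="*cerber*"
| stats count by alert.signature, alert.signature_id
| sort count
```

Sorting ascending puts the least frequent signature on the first row.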
The Cerber ransomware encrypts files located in Bob Smith’s Windows profile. How many .txt files does it encrypt?
406
How many distinct PDFs did the ransomware encrypt on the remote file server?
257
What fully qualified domain name (FQDN) does the Cerber ransomware attempt to direct the user to at the end of its encryption phase?
cerberhhyed5frqa.xmfir0.win
Video Walkthrough