Basic File Identification
Before doing anything else, determine what kind of file you are dealing with. File extensions in CTFs are often missing or intentionally misleading.
Identifying File Types
# General file type identification based on magic bytes
file target_file
# Check for specific magic bytes/headers
xxd target_file | head
hexdump -C target_file | head
# TrID - Identifies file types from their binary signatures
trid target_file
Common Magic Bytes
- Windows PE (EXE/DLL): 4D 5A (MZ)
- ELF (Linux Binary): 7F 45 4C 46 (.ELF)
- PDF: 25 50 44 46 (%PDF)
- JPEG: FF D8 FF E0 (or FF E1, FF E8)
- PNG: 89 50 4E 47 0D 0A 1A 0A
- ZIP: 50 4B 03 04 (PK..)
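The signatures listed above can also be checked programmatically. The following is a minimal Python sketch that covers only those six signatures; the `MAGIC` table and the `identify` / `identify_path` helpers are illustrative names, not part of any existing tool.

```python
# Hypothetical helper: map leading bytes to a format name.
# Only the signatures from the list above are covered; tools such as
# `file` know far more formats and inspect deeper structure.
MAGIC = [
    (bytes.fromhex("89504E470D0A1A0A"), "PNG"),
    (bytes.fromhex("7F454C46"), "ELF"),
    (bytes.fromhex("25504446"), "PDF"),
    (bytes.fromhex("504B0304"), "ZIP"),
    (bytes.fromhex("FFD8FF"), "JPEG"),
    (bytes.fromhex("4D5A"), "Windows PE (MZ)"),
]


def identify(data: bytes) -> str:
    """Return the first format whose signature is a prefix of `data`."""
    for signature, name in MAGIC:
        if data.startswith(signature):
            return name
    return "unknown"


def identify_path(path: str) -> str:
    """Read only the first 16 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))
```

A result of "unknown" for a file that should be an image or archive is a hint that the header may have been tampered with.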
Hex Editors and Magic Byte Manipulation
Sometimes a file's magic bytes are intentionally corrupted and need to be repaired.
# View hex dump interactively
hexeditor target_file
# Command-line hex dumping
xxd target_file | less
# Convert file to hex dump, edit the text file, then convert back
xxd target_file > file.hex
nano file.hex
xxd -r file.hex > restored_file.bin
# Dump only raw hex (no offsets or ASCII representation)
xxd -p target_file > raw_hex.txt
# Revert raw hex back to binary
xxd -r -p raw_hex.txt > bin_file
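When the damage is limited to the first few bytes, the repair can be scripted instead of edited by hand. This sketch assumes the header bytes were overwritten in place (not deleted or inserted) and uses the PNG signature as the default; `repair_header` and `repair_file` are hypothetical helper names.

```python
PNG_SIGNATURE = bytes.fromhex("89504E470D0A1A0A")


def repair_header(data: bytes, signature: bytes = PNG_SIGNATURE) -> bytes:
    """Overwrite the first len(signature) bytes with the expected magic.

    Assumption: the corrupted header has the same length as the real one,
    so the rest of the file keeps its original offsets.
    """
    if len(data) < len(signature):
        raise ValueError("input is shorter than the signature")
    return signature + data[len(signature):]


def repair_file(src: str, dst: str, signature: bytes = PNG_SIGNATURE) -> None:
    """Write a repaired copy of `src` to `dst`, leaving `src` untouched."""
    with open(src, "rb") as handle:
        data = handle.read()
    with open(dst, "wb") as handle:
        handle.write(repair_header(data, signature))
```

If the repaired file still does not open, the corruption probably extends past the signature, and a structure-aware check is the next step.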
Strings & Text Extraction
Searching binary files for embedded text is one of the quickest ways to find flags or clues.
Strings Commands
# Basic strings extraction (default is >= 4 chars)
strings target_file
# Extract strings with minimum length of 10
strings -n 10 target_file
# Extract 16-bit little-endian strings (common in Windows & Registry)
strings -el target_file
# Extract strings and show their byte offset in decimal
strings -t d target_file
# Scan the entire file instead of only the loaded data sections (already the default in recent GNU binutils versions)
strings -a target_file
Searching for Flags
# Search for specific flag formats (e.g., flag{...}, HTB{...})
strings target_file | grep -iE "flag\{|picoCTF\{|HTB\{"
# Extract the exact flag pattern using regex (-o outputs only the match)
strings target_file | grep -oE "flag\{[^}]+\}"
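The same idea can be sketched in Python when `strings` is unavailable or when ASCII and 16-bit little-endian text should be searched in one pass. This is an approximation of the commands above, not a reimplementation of `strings`; the function names and the `FLAG_RE` pattern are assumptions chosen for illustration.

```python
import re

# Flag prefixes taken from the grep example above; adjust per CTF.
FLAG_RE = re.compile(rb"(?:flag|picoCTF|HTB)\{[^}]+\}", re.IGNORECASE)


def ascii_strings(data: bytes, min_len: int = 4) -> list:
    """Runs of printable ASCII of at least min_len, roughly `strings -n`."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)


def utf16le_strings(data: bytes, min_len: int = 4) -> list:
    """Printable characters each followed by a NUL, roughly `strings -el`."""
    runs = re.findall(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len, data)
    return [run[::2] for run in runs]  # drop the interleaved NUL bytes


def find_flags(data: bytes) -> list:
    """Return every flag-shaped match found in either encoding."""
    candidates = ascii_strings(data) + utf16le_strings(data)
    return [m.group(0) for c in candidates for m in FLAG_RE.finditer(c)]
```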
Metadata Extraction
Metadata often contains hidden clues, passwords, or coordinates.
ExifTool (Universal Metadata)
# Basic metadata extraction
exiftool target_file
# Short output: print tag names instead of descriptions (makes unusual or custom tags easier to spot)
exiftool -s target_file
# Show all tags, including unknown/custom tags, organized by group
exiftool -a -u -g1 target_file
# Extract embedded thumbnails or images
exiftool -b -ThumbnailImage target_file > thumbnail.jpg
Other Metadata Tools
# Document metadata anonymization toolkit (reveals what metadata exists)
mat2 -s target_file
# PDF specific metadata
pdfinfo file.pdf
Image Analysis & Steganography
Images are among the most common media for hidden data in CTFs.
Basic Image Inspection
# Check PNG structure and report corruption/anomalies (very useful for corrupted chunks)
pngcheck -v target_file.png
# Read QR codes, Barcodes, etc. from images
zbarimg target_file.png
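To see what a PNG structure check involves, the chunk layout can be walked by hand: each chunk is a 4-byte big-endian length, a 4-byte type, the data, and a CRC-32 over type plus data. The sketch below follows that layout; `walk_png_chunks` and `make_chunk` are hypothetical helpers, and the code does not validate chunk ordering or contents the way `pngcheck` does.

```python
import struct
import zlib

PNG_SIGNATURE = bytes.fromhex("89504E470D0A1A0A")


def make_chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Build a chunk with a freshly computed CRC (useful after a manual fix)."""
    crc = zlib.crc32(chunk_type + body)
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


def walk_png_chunks(data: bytes):
    """Yield (type, length, crc_ok) for each chunk after the signature."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("missing PNG signature")
    offset = len(PNG_SIGNATURE)
    while offset + 12 <= len(data):
        (length,) = struct.unpack(">I", data[offset:offset + 4])
        chunk_type = data[offset + 4:offset + 8]
        body = data[offset + 8:offset + 8 + length]
        stored = data[offset + 8 + length:offset + 12 + length]
        if len(body) < length or len(stored) < 4:
            break  # truncated chunk: stop instead of guessing
        crc_ok = zlib.crc32(chunk_type + body) == struct.unpack(">I", stored)[0]
        yield chunk_type.decode("latin-1"), length, crc_ok
        offset += 12 + length
```

A chunk with `crc_ok` set to False, or an unexpected chunk type, is a good place to start looking for tampering.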
LSB & Bit-Level Steganography
# zsteg: Detects LSB steganography in PNG/BMP images
zsteg target_file.png
zsteg -a target_file.png # Try all methods (slow)
# Stegsolve (GUI Tool): Essential for viewing bit planes, color channels, and analyzing LSBs
java -jar stegsolve.jar
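The core of LSB steganography fits in a few lines. This sketch assumes the pixel data has already been decoded into a flat sequence of channel values (for example with an imaging library) and that bits are packed most-significant-bit first, one bit per sample; real challenges vary channel order, bit order, and bit plane, which is the search space `zsteg -a` walks through. Both function names are illustrative.

```python
def embed_lsb(samples: bytes, message: bytes) -> bytes:
    """Hide `message` in the lowest bit of successive samples (MSB first)."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("message does not fit in the cover data")
    out = bytearray(samples)
    for index, bit in enumerate(bits):
        out[index] = (out[index] & 0xFE) | bit  # clear the LSB, then set it
    return bytes(out)


def extract_lsb(samples: bytes, n_bytes: int) -> bytes:
    """Rebuild `n_bytes` bytes from the lowest bit of each sample."""
    if len(samples) < n_bytes * 8:
        raise ValueError("not enough samples for the requested length")
    out = bytearray()
    for start in range(0, n_bytes * 8, 8):
        value = 0
        for sample in samples[start:start + 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)
```

Because each sample changes by at most 1, the modification is invisible to the eye, which is why bit-plane viewers are needed to spot it.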
Password-Protected Steganography
# Steghide: Extracts hidden data from JPEG, BMP, WAV, and AU files (prompts for a passphrase, which may be empty)
steghide extract -sf target_file.jpg
# Stegseek: Extremely fast password cracker for steghide
stegseek target_file.jpg /usr/share/wordlists/rockyou.txt
Audio Analysis
Audio files can hide data in metadata, LSBs, or visually in the spectrogram.
Tools & Techniques
- Spectrograms: Open the file in Audacity or Sonic Visualiser. Change the view from "Waveform" to "Spectrogram". Flags are often drawn into the frequency content and only become readable in this view.
- Morse Code: Listen for beeps or look at the waveform. Short/long pulses usually mean morse code.
# Strings can still apply to audio files!
strings target_audio.wav | grep -i flag
# Check for hidden files appended to the audio file
binwalk target_audio.wav
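For the Morse code case mentioned above, once the pulses have been transcribed to dots and dashes by ear or from the waveform, decoding is a table lookup. The sketch assumes letters are separated by spaces and words by "/", which is a common transcription convention rather than anything the audio itself dictates.

```python
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z", "-----": "0", ".----": "1", "..---": "2",
    "...--": "3", "....-": "4", ".....": "5", "-....": "6", "--...": "7",
    "---..": "8", "----.": "9",
}


def decode_morse(text: str, word_sep: str = "/") -> str:
    """Decode space-separated Morse symbols; unknown symbols become '?'."""
    words = []
    for word in text.strip().split(word_sep):
        letters = [MORSE.get(symbol, "?") for symbol in word.split()]
        words.append("".join(letters))
    return " ".join(words)
```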
Archive Analysis & Password Cracking
Extracting and cracking archives is a staple of file analysis.
Decompression
# Standard formats
unzip target.zip
tar -xvf target.tar.gz
7z x target.7z
Cracking Archive Passwords
# ZIP files
zip2john target.zip > zip.hash
john --wordlist=/usr/share/wordlists/rockyou.txt zip.hash
# Alternative for ZIPs (fcrackzip handles only legacy ZipCrypto encryption, not AES)
fcrackzip -u -D -p /usr/share/wordlists/rockyou.txt target.zip
# RAR files
rar2john target.rar > rar.hash
john --wordlist=/usr/share/wordlists/rockyou.txt rar.hash
File Carving & Extraction
When files are embedded inside other files (like a firmware image or a document), you need to carve them out.
Binwalk
# Scan file for embedded signatures
binwalk target_file
# Automatically extract known file types
binwalk -e target_file
# Run an entropy analysis (does not extract anything; high-entropy regions suggest compressed or encrypted data)
binwalk -E target_file
# Extract specific signatures only (e.g., zip)
binwalk -D 'zip archive:zip' target_file
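The entropy measurement behind `binwalk -E` is Shannon entropy computed over fixed-size blocks. A minimal sketch, assuming a block size of 1024 bytes (an arbitrary choice here, not binwalk's setting):

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform."""
    if not block:
        return 0.0
    total = len(block)
    return -sum((n / total) * math.log2(n / total) for n in Counter(block).values())


def entropy_profile(data: bytes, block_size: int = 1024) -> list:
    """Return [(offset, entropy)] for each consecutive block of `data`."""
    return [
        (offset, shannon_entropy(data[offset:offset + block_size]))
        for offset in range(0, len(data), block_size)
    ]
```

Blocks close to 8 bits per byte usually indicate compression or encryption, while a sudden jump between neighbouring blocks often marks the boundary of an embedded file.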
Foremost & Scalpel
These tools recover files based on their headers, footers, and internal data structures, bypassing filesystem analysis.
# Carve all known file types using foremost
foremost -i target_file -o output_dir/
# Carve specific file types (e.g., jpg, pdf) using foremost
foremost -t jpg,pdf -i target_file -o output_dir/
# Scalpel is highly customizable based on /etc/scalpel/scalpel.conf
scalpel -c /etc/scalpel/scalpel.conf -o output_dir/ target_file
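Header/footer carving can be illustrated with JPEG, whose data starts with FF D8 FF and ends with FF D9. This is a deliberately naive sketch: it stops at the first footer after each header, so a JPEG that embeds its own thumbnail may be cut short, which dedicated carvers handle more carefully. The `carve` function and its `max_size` limit are assumptions for illustration.

```python
JPEG_HEADER = b"\xff\xd8\xff"
JPEG_FOOTER = b"\xff\xd9"


def carve(data: bytes, header: bytes = JPEG_HEADER, footer: bytes = JPEG_FOOTER,
          max_size: int = 10 * 1024 * 1024) -> list:
    """Return [(offset, blob)] for every header..footer span in `data`."""
    results = []
    start = data.find(header)
    while start != -1:
        # Look for the footer only within max_size bytes of the header.
        end = data.find(footer, start + len(header), start + max_size)
        if end != -1:
            stop = end + len(footer)
            results.append((start, data[start:stop]))
            start = data.find(header, stop)
        else:
            start = data.find(header, start + 1)
    return results
```

Each carved blob can be written to its own file and then checked with the identification steps from the first section.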
Document Analysis
Malicious documents and PDFs often hide macros or scripts.
PDF Analysis
# High-level overview of PDF structures (looks for /JS, /JavaScript, /OpenAction)
pdfid target_file.pdf
# Show object statistics, then search indirect objects for a keyword (use --searchstream to search inside streams)
pdf-parser -a target_file.pdf
pdf-parser --search javascript target_file.pdf
# Interactive/Deep PDF analysis
peepdf -i target_file.pdf
# peepdf useful commands: info, tree, object <id>, extract <id>
Microsoft Office Documents (OLE)
# Analyze OLE files (old Office formats like .doc, .xls) for macros
oleid target_file.doc
# Extract and analyze VBA macros
olevba target_file.doc
olevba target_file.docx
# Modern Office formats (.docx, .xlsx) are simply ZIP archives!
unzip target_file.docx -d output_dir/
# Search the extracted internal XML files (e.g., word/document.xml) for flags
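Since these formats are ZIP containers, the search can be done without extracting to disk. The sketch below scans every archive member for a flag pattern; `search_ooxml` and `FLAG_RE` are hypothetical names. One caveat: Word may split visible text across several XML runs, so a flag typed into the document body can be missed by a plain regex over the raw XML.

```python
import re
import zipfile

FLAG_RE = re.compile(rb"flag\{[^}]+\}", re.IGNORECASE)


def search_ooxml(source, pattern=FLAG_RE) -> list:
    """Return [(member_name, match)] for every match in every archive member.

    `source` may be a path or a binary file-like object.
    """
    hits = []
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            content = archive.read(info)
            hits.extend((info.filename, m.group(0)) for m in pattern.finditer(content))
    return hits
```

Members outside the usual `word/`, `xl/`, and `docProps/` folders deserve a closer look, since extra files are an easy hiding place.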
Executables & Binaries (PE / ELF)
While full reverse engineering is a separate field, basic static analysis is a critical first step.
# Detailed binary information (architecture, OS, dynamically linked, etc)
rabin2 -I target_binary
readelf -h target_elf
# Look at the symbol tables (readelf -s) and the dynamic symbols, including imports (objdump -T)
readelf -s target_elf
objdump -T target_binary
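Much of what `readelf -h` prints comes from the first 20 bytes of the file. The following sketch decodes the class, byte order, type, and machine fields; the lookup tables cover only a handful of common values, and `parse_elf_ident` is an illustrative name.

```python
import struct


def parse_elf_ident(data: bytes) -> dict:
    """Decode a few ELF header fields from the first 20 bytes."""
    if len(data) < 20 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    endian = "<" if ei_data == 1 else ">"
    # e_type and e_machine are 16-bit fields at offsets 16 and 18.
    e_type, e_machine = struct.unpack(endian + "HH", data[16:20])
    return {
        "bits": {1: 32, 2: 64}.get(ei_class),
        "endianness": {1: "little", 2: "big"}.get(ei_data),
        "type": {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}.get(e_type, hex(e_type)),
        "machine": {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}.get(
            e_machine, hex(e_machine)
        ),
    }
```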
Data Manipulation & Decoding
CTFs often involve decoding layers of Base64, Hex, or custom encodings.
Command Line Decoding
# Base64 Decode
echo "ZmxhZ3thYmNkZX0=" | base64 -d
# URL Decode
echo "%66%6c%61%67" | python3 -c "import sys, urllib.parse as ul; print(ul.unquote_plus(sys.stdin.read().strip()))"
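When several encodings are stacked, peeling them one at a time gets tedious. This sketch repeatedly tries hex and then Base64 until neither applies. The detection is a heuristic and an assumption of this example: short plain words made only of Base64 characters would be decoded one step too far, so the reported layer list should be checked by eye.

```python
import base64
import binascii
import re

HEX_RE = re.compile(rb"(?:[0-9a-fA-F]{2})+")
B64_RE = re.compile(rb"[A-Za-z0-9+/]+={0,2}")


def peel_layers(data: bytes, max_layers: int = 10):
    """Return (decoded_bytes, [layer names]) after stripping hex/Base64 layers."""
    layers = []
    for _ in range(max_layers):
        text = data.strip()
        if HEX_RE.fullmatch(text):
            data, name = binascii.unhexlify(text), "hex"
        elif len(text) % 4 == 0 and B64_RE.fullmatch(text):
            try:
                data, name = base64.b64decode(text, validate=True), "base64"
            except binascii.Error:
                break  # looked like Base64 but did not decode cleanly
        else:
            break
        layers.append(name)
    return data, layers
```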
CyberChef / KEYSEC
Web-based tools (or local alternatives) are incredibly versatile.
- Data formats: Base64, Hex, Binary, Octal.
- Operations: XOR, AES Decryption, Bit Shifting.
- File operations: ExtractFiles, Magic (attempts to auto-decode based on known patterns).
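As a local counterpart to the XOR operation listed above, a single-byte key can be brute-forced by trying all 256 values and keeping the outputs that contain a known fragment (a "crib"). The crib `flag{` and both function names are assumptions for illustration.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR `data` with a repeating key."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def brute_single_byte_xor(data: bytes, crib: bytes = b"flag{") -> list:
    """Try all 256 one-byte keys; return [(key, plaintext)] containing the crib."""
    hits = []
    for key in range(256):
        plain = xor_bytes(data, bytes([key]))
        if crib in plain:
            hits.append((key, plain))
    return hits
```

Because XOR is its own inverse, `xor_bytes` both encrypts and decrypts, so a recovered key can be reused on other files from the same challenge.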
Summary Checklist for Unknown Files
- Run file and trid to see what the file format actually is.
- Check the raw bytes with xxd | head to verify magic bytes.
- Run strings and grep for the flag format (grep -iE "flag\{").
- Run exiftool to check for hidden metadata or comments.
- Run binwalk -e to check for and extract hidden files.
- If it's an image, check zsteg and stegsolve.
- If it's a document, treat .docx/.xlsx as ZIPs and unzip them.
- If it's an audio file, open it in Audacity and view the Spectrogram.