[📦 EXECUTABLE FILE FORMATS]
// Dissect the anatomy of executables. Master PE, ELF, and Mach-O formats to understand how programs are structured, loaded, and executed on different operating systems.
[🎯 LEARNING_OBJECTIVES]
> learning_outcomes.list
- [✓] Understand PE, ELF, and Mach-O file structure and headers
- [✓] Parse executable sections, imports, and exports
- [✓] Identify packed/encrypted executables
- [✓] Extract and analyze embedded resources
> prerequisites.cfg
- Completed Lesson 2: Assembly Basics
- Understanding of hexadecimal
- Basic file system knowledge
[📋 YARA_INTRODUCTION]
🔗 Why YARA in a File Formats Lesson?
File format analysis and YARA go hand-in-hand! Once you understand PE, ELF, and Mach-O structures, you need a way to automatically detect and classify files based on these characteristics.
🔍 Detection Use Cases:
- • Find files with suspicious PE imports
- • Detect ELF files with rootkit characteristics
- • Identify Mach-O files with code injection
- • Classify malware families by structure
🎯 Practical Applications:
- • Scan thousands of files automatically
- • Create detection rules for security tools
- • Build threat intelligence databases
- • Automate malware triage processes
💡 Think of it this way: You're learning to read the "DNA" of files (PE/ELF/Mach-O), and YARA is your tool to search for specific "genetic patterns" across entire file systems!
🤔 What is YARA?
YARA is like a "super-powered search tool" for files. Think of it as:
- • Google search but for binary files and malware
- • Pattern matching with advanced logic capabilities
- • Detective tool that helps identify suspicious files
- • Industry standard used by security professionals worldwide
Why Learn YARA?
- 🔍 Malware Detection: Find hidden threats
- 🎯 Threat Hunting: Search for attack patterns
- 🚨 Incident Response: Quickly classify files
- 📊 Research: Categorize malware families
🚀 Quick Start Guide
1. Installation (macOS)
2. Basic Usage
3. Rule Structure
📚 Learning Resources
Beginner Materials:
Learning Progression:
[🛠️ ANALYSIS_TOOLS]
$ Command Line Arsenal
🔍 file / hexdump
Quick file identification and hex analysis
⚡ objdump / readelf
Deep section and header analysis
🗂️ strings / nm
Extract strings and symbol tables
💎 Professional Tools
🔥 PE-bear / ELF Parser
Specialized parsers for each format
🔬 CFF Explorer
Complete PE file editor and analyzer
⚡ HxD / 010 Editor
Advanced hex editors with templates
[🏢 PORTABLE_EXECUTABLE_(PE)]
🎯 Why PE Format Mastery is Critical:
Windows Dominance
Most malware targets Windows, making PE analysis essential for threat hunters
Hiding Techniques
Packers, crypters, and rootkits manipulate PE structure to evade detection
Import Analysis
API imports reveal malware capabilities before dynamic analysis
Resource Extraction
Embedded payloads and configs are often hidden in the resource section (.rsrc); Authenticode certificates are stored separately, in the certificate table
📋 PE File Structure
PE files have a layered structure designed for efficient loading and execution. Understanding this hierarchy is key to effective analysis.
PE File Layout
DOS Header: Legacy compatibility
DOS Stub: "This program cannot be run..."
PE Header: PE signature + COFF Header
Optional Header: Essential execution info
Section Table: Map of all sections
.text: Executable code
.data: Initialized data
.rsrc: Resources (icons, strings)
Critical Headers
DOS Header (IMAGE_DOS_HEADER)
COFF Header
Optional Header
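The three headers listed above can be read with a few fixed-offset lookups. The following is a minimal Python sketch, not a full parser: the function name `parse_pe_headers` and the returned dictionary keys are illustrative choices, and only a handful of fields are decoded. The offsets used (the `e_lfanew` pointer at 0x3C, the 20-byte COFF header after the `PE\0\0` signature, and `AddressOfEntryPoint` at offset 16 of the Optional Header) follow the published PE layout.

```python
import struct

def parse_pe_headers(data: bytes) -> dict:
    """Decode a few key fields from the DOS, COFF, and Optional headers."""
    # DOS header: must start with "MZ"; e_lfanew at 0x3C points to the PE signature.
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")

    # COFF header (IMAGE_FILE_HEADER): 20 bytes right after the signature.
    (machine, num_sections, _timestamp, _symtab, _nsyms,
     _opt_size, characteristics) = struct.unpack_from("<HHIIIHH", data, e_lfanew + 4)

    # Optional header: starts after the COFF header. Magic 0x10B = PE32, 0x20B = PE32+.
    opt_off = e_lfanew + 24
    (opt_magic,) = struct.unpack_from("<H", data, opt_off)
    (entry_rva,) = struct.unpack_from("<I", data, opt_off + 16)

    return {
        "machine": machine,                  # e.g. 0x14C (x86), 0x8664 (x64)
        "num_sections": num_sections,
        "characteristics": characteristics,
        "format": {0x10B: "PE32", 0x20B: "PE32+"}.get(opt_magic, "unknown"),
        "entry_point_rva": entry_rva,
    }
```

A typical call would be `parse_pe_headers(open(path, "rb").read())`, where `path` is whatever sample you are inspecting. A malformed or truncated file raises `ValueError` or `struct.error` rather than returning partial data.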
🔗 Import/Export Analysis
Import Analysis
Imports reveal which APIs the malware uses - often the first clue to its capabilities.
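As a rough illustration of how an import list turns into a capability guess, the sketch below groups a few well-known Windows API names by the behavior they commonly indicate. The `CAPABILITIES` table and the `summarize_imports` helper are hypothetical and deliberately tiny; real triage lists are far longer, and a matching import is a hint, not proof of malicious intent.

```python
# Hypothetical, abbreviated mapping used only for illustration.
CAPABILITIES = {
    "process injection": {"VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"},
    "download": {"URLDownloadToFileA", "InternetOpenUrlA"},
    "registry persistence": {"RegCreateKeyExA", "RegSetValueExA"},
    "anti-debugging": {"IsDebuggerPresent", "CheckRemoteDebuggerPresent"},
}

def summarize_imports(imports):
    """Map a list of imported API names to the capabilities they suggest."""
    names = set(imports)
    summary = {}
    for capability, apis in CAPABILITIES.items():
        hits = sorted(names & apis)   # APIs from this group that the file imports
        if hits:
            summary[capability] = hits
    return summary
```

Feeding it the import names recovered from a sample (for example from string output) gives a quick first read on what the file may be able to do before any dynamic analysis.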
Export Analysis
Exports show functions that other programs can call - useful for DLL analysis.
Analysis Workflow:
- 1. Check for suspicious export names
- 2. Cross-reference with import analysis
- 3. Look for ordinal-only exports (obfuscation)
- 4. Identify callback functions and hooks
[🐧 EXECUTABLE_LINKABLE_FORMAT_(ELF)]
🐧 Why ELF Analysis Matters:
Linux Dominance
Servers, IoT devices, Android - ELF is everywhere in modern infrastructure
Rootkit Analysis
Linux rootkits manipulate ELF structures for stealth and persistence
Symbol Stripping
Malware often strips symbols, making ELF header analysis crucial
Library Injection
Dynamic linking allows sophisticated injection and hooking techniques
📋 ELF File Structure
ELF File Layout
ELF Header: File identification & layout
Program Header Table: Segment info for loader
Section Header Table: Section info for linking
.text: Executable code
.data: Initialized data
.bss: Uninitialized data
.symtab: Function/variable names
.strtab: String storage
ELF Header Deep Dive
Magic Numbers & Identification
Critical Fields
⚠️ Malware Indicators
- • Modified e_entry pointing to shellcode
- • Unusual e_machine values for target platform
- • Corrupted program/section header counts
- • Non-standard ELF magic variations
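The fields behind these indicators can be pulled out of the first 52 or 64 bytes of a file. The sketch below is a minimal Python reader for the ELF header, assuming the standard layout: a 16-byte `e_ident` followed by fields whose address width depends on `EI_CLASS`. The function name `parse_elf_header` and its dictionary keys are illustrative, and judging whether a value is "unusual" is left to the analyst.

```python
import struct

def parse_elf_header(data: bytes) -> dict:
    """Decode the ELF header fields most relevant to triage."""
    if data[:4] != b"\x7fELF":
        raise ValueError("bad ELF magic")
    ei_class, ei_data = data[4], data[5]      # 1/2 = 32/64-bit, 1/2 = little/big endian
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("invalid EI_CLASS or EI_DATA")

    endian = "<" if ei_data == 1 else ">"
    addr = "I" if ei_class == 1 else "Q"      # e_entry, e_phoff, e_shoff width
    fmt = f"{endian}HHI{addr}{addr}{addr}IHHHHHH"
    (e_type, e_machine, _version, e_entry, e_phoff, e_shoff, _flags,
     _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = struct.unpack_from(fmt, data, 16)

    return {
        "bits": 32 if ei_class == 1 else 64,
        "endian": "little" if ei_data == 1 else "big",
        "e_type": e_type,          # 2 = ET_EXEC, 3 = ET_DYN
        "e_machine": e_machine,    # e.g. 3 = x86, 62 = x86-64, 183 = AArch64
        "e_entry": e_entry,
        "e_phoff": e_phoff,
        "e_shoff": e_shoff,
        "e_phnum": e_phnum,
        "e_shnum": e_shnum,
    }
```

Comparing `e_entry` against the address ranges of the loadable segments, or `e_machine` against the platform the file was found on, is one way to act on the indicators listed above.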
🔗 Dynamic Linking & Dependencies
Library Dependencies
Normal Dependencies
- • libc.so.6 (standard C library)
- • libm.so.6 (math library)
- • libpthread.so.0 (threading)
Suspicious Dependencies
- • Unknown .so files in /tmp
- • Libraries with random names
- • No NEEDED entries at all (static linking; common in malware, but also in legitimate Go and embedded binaries)
- • Unusual library paths
Symbol Analysis
Symbol Types
High-Risk Functions
- • system(), execve() - Command execution
- • socket(), connect() - Network activity
- • dlopen(), dlsym() - Dynamic loading
- • ptrace() - Anti-debugging/injection
[🍎 MACH_O_FORMAT]
🍎 Mach-O in Modern Security:
macOS Malware Rise
Growing macOS user base attracts more sophisticated malware targeting Mach-O
iOS/Mobile Security
iPhone apps use Mach-O - critical for mobile malware analysis
Code Signing
Apple's code signatures are embedded in the Mach-O file (located via the LC_CODE_SIGNATURE load command) - bypass techniques exist
Dylib Hijacking
Library search-path weaknesses in the dyld loader (@rpath, weak dylibs) - the macOS counterpart of DLL hijacking on Windows
📋 Mach-O File Structure
Mach-O Layout
Header: CPU type, file type, load commands count
Load Commands: Instructions for loader/linker
Data: Actual segment contents
Mach-O Magic Numbers
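The first four bytes of a Mach-O file encode both the word size and the byte order. As a small Python sketch, the table below maps the on-disk byte sequences to their meaning; the values come from the `MH_MAGIC`, `MH_MAGIC_64`, and `FAT_MAGIC` constants in Apple's headers, while the `classify_macho` helper name is an illustrative choice.

```python
# On-disk byte sequences (first 4 bytes of the file) and what they indicate.
MACHO_MAGICS = {
    bytes.fromhex("feedface"): ("Mach-O", 32, "big-endian"),
    bytes.fromhex("cefaedfe"): ("Mach-O", 32, "little-endian"),
    bytes.fromhex("feedfacf"): ("Mach-O", 64, "big-endian"),
    bytes.fromhex("cffaedfe"): ("Mach-O", 64, "little-endian"),
    # Universal ("fat") header: always stored big-endian, wraps several thin binaries.
    bytes.fromhex("cafebabe"): ("universal binary", None, "big-endian"),
}

def classify_macho(data: bytes):
    """Return (kind, bits, byte order) for a Mach-O file, or None if no magic matches."""
    return MACHO_MAGICS.get(data[:4])
```

Note that `CA FE BA BE` is also the magic number of Java class files, so a match on the universal-binary entry needs a second check before the file is treated as Mach-O.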
Load Commands Deep Dive
Load commands tell the system how to load and link the executable. Understanding these is key to Mach-O analysis.
Essential Load Commands
Security Implications
- • LC_LOAD_DYLIB: Library hijacking vectors
- • LC_RPATH: Runtime search path manipulation
- • Modified segments: Code injection points
- • Missing signatures: Unsigned/tampered binaries
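To see where these signals come from, the load commands can be walked directly. The Python sketch below assumes a thin, little-endian, 64-bit Mach-O (magic `0xFEEDFACF`) and only decodes `LC_LOAD_DYLIB`, `LC_RPATH`, and `LC_CODE_SIGNATURE`; the function name `scan_load_commands` and the result keys are illustrative. The presence of `LC_CODE_SIGNATURE` only shows that a signature blob exists, not that it is valid.

```python
import struct

LC_REQ_DYLD = 0x80000000
LC_LOAD_DYLIB = 0xC
LC_RPATH = 0x1C | LC_REQ_DYLD
LC_CODE_SIGNATURE = 0x1D

def scan_load_commands(data: bytes) -> dict:
    """List dylib dependencies and rpaths; note whether a signature command exists."""
    # mach_header_64: eight 32-bit fields, 32 bytes in total.
    magic, _cpu, _sub, _ftype, ncmds, _sizeofcmds, _flags, _rsv = struct.unpack_from("<8I", data, 0)
    if magic != 0xFEEDFACF:
        raise ValueError("not a thin little-endian 64-bit Mach-O")

    result = {"dylibs": [], "rpaths": [], "has_code_signature": False}
    offset = 32
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmdsize < 8:
            raise ValueError("corrupt load command size")
        if cmd in (LC_LOAD_DYLIB, LC_RPATH):
            # Both store a string offset (relative to the command start) at byte 8.
            (str_off,) = struct.unpack_from("<I", data, offset + 8)
            raw = data[offset + str_off:offset + cmdsize]
            text = raw.split(b"\x00", 1)[0].decode("utf-8", "replace")
            result["dylibs" if cmd == LC_LOAD_DYLIB else "rpaths"].append(text)
        elif cmd == LC_CODE_SIGNATURE:
            result["has_code_signature"] = True
        offset += cmdsize
    return result
```

Library paths outside the usual system locations, writable `@rpath` directories, or a missing signature command are the kinds of findings this output is meant to surface for manual review.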
Analysis Commands
[🔬 HANDS_ON_WORKSHOP]
Time to put your knowledge to work! We'll analyze simulated samples of each format, identifying key structures and potential security issues.
YARA Integration in Practice
As you work through each exercise, think about how the patterns you discover could become YARA rules. The suspicious APIs, file structures, and strings you identify are exactly what YARA searches for!
⚠️ Important: These are text-based educational simulations that contain the patterns and strings of real malware without being executable. Tools like objdump, readelf, and otool won't work. Focus on strings and grep for pattern analysis. Some magic numbers may not be present at the exact file offsets - the educational value is in learning to recognize the structural patterns and suspicious strings!
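If you want to see what the strings and grep combination does under the hood, the following Python sketch approximates it. It is an assumption-laden stand-in, not a replacement for the real tools: `extract_strings` only finds runs of printable ASCII (no UTF-16), and `grep_strings` does a plain case-insensitive substring match.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII at least min_len bytes long, in file order."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

def grep_strings(strings, keywords):
    """Keep only the strings that contain any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [s for s in strings if any(k in s.lower() for k in lowered)]
```

A typical use is `grep_strings(extract_strings(sample_bytes), ["http", "ptrace", "CreateRemoteThread"])`, where `sample_bytes` holds the contents of the file under analysis and the keyword list is whatever you are hunting for in that exercise.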
1. PE Analysis: Suspected Trojan
🛡️ Safe Analysis Environment
This sample is a safe educational simulation designed to demonstrate PE structure analysis without any harmful behavior.
Step 1: Basic Information
Step 2: String Analysis (Import Patterns)
Step 3: Structure Analysis
Step 4: String Analysis
📊 Analysis Summary
Malware Category
Dropper/Downloader Trojan
Risk Level
High - Downloads additional payloads
Persistence
Registry Run key modification
2. ELF Analysis: Suspected Rootkit
🛡️ Educational Sample
Safe rootkit simulation for learning ELF analysis techniques. Contains no actual malicious functionality.
Step 1: File Identification
Step 2: String Analysis (Symbol Patterns)
Step 3: Library Dependencies
Step 4: Section Analysis
🔍 Rootkit Indicators Found
- • Anti-debugging: ptrace() usage to detect analysis
- • Dynamic injection: dlopen() for runtime library loading
- • Syscall hooking: Functions to intercept system calls
- • RWX sections: Self-modifying code capability
- • Temp libraries: Malicious shared objects in /tmp
3. Mach-O Analysis: macOS Malware
🛡️ macOS Sample
Educational Mach-O simulation demonstrating dylib hijacking and code signing bypass patterns. Completely safe for analysis.
Step 1: File Type & Architecture
Step 2: Load Commands
Step 3: Code Signing Analysis
Step 4: Dynamic Libraries
🍎 macOS Malware Characteristics
Bypass Techniques
Unsigned code, dylib hijacking
Persistence
Framework injection, @rpath manipulation
Stealth
Exploits code signing gaps
[🏆 MASTER_CHALLENGE]
🎯 Multi-Format Analysis Challenge
You've been given a suspicious file that appears to be cross-platform malware - it ships in a different executable format for each target system. Your mission: analyze all three variants.
Phase 1: Format Detection
- • Identify which format each sample uses
- • Extract magic numbers and signatures
- • Determine target architectures
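Phase 1 can be approximated with a short magic-number check. The Python sketch below is one possible approach, assuming the signature sits at the start of the file (which, as noted for the workshop simulations, may not hold for every sample); the `detect_format` name and its return strings are illustrative.

```python
MACHO_MAGICS = {
    bytes.fromhex("feedface"), bytes.fromhex("cefaedfe"),   # 32-bit, both byte orders
    bytes.fromhex("feedfacf"), bytes.fromhex("cffaedfe"),   # 64-bit, both byte orders
}

def detect_format(data: bytes) -> str:
    """Guess the executable format from the leading magic bytes."""
    if data[:4] == b"\x7fELF":
        return "ELF"
    if data[:4] in MACHO_MAGICS:
        return "Mach-O"
    if data[:2] == b"MZ":
        # A real PE also has "PE\0\0" at the offset stored in e_lfanew (0x3C).
        if len(data) >= 0x40:
            e_lfanew = int.from_bytes(data[0x3C:0x40], "little")
            if data[e_lfanew:e_lfanew + 4] == b"PE\x00\x00":
                return "PE"
        return "MZ (no PE signature found)"
    return "unknown"
```

Checking the PE signature in addition to `MZ` avoids misclassifying plain DOS executables or text files that merely begin with those two letters.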
Phase 2: Cross-Platform Analysis
- • Compare import/symbol tables across formats
- • Find common functionality indicators
- • Map equivalent APIs between platforms
Phase 3: Advanced Techniques
- • Extract embedded payloads from each format
- • Identify packing/obfuscation techniques
- • Create YARA rules for detection
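One common way to spot packing or encryption in Phase 3 is to measure byte entropy: compressed or encrypted data looks close to random. The Python sketch below computes Shannon entropy in bits per byte; the `shannon_entropy` helper is illustrative, and any cutoff you apply (values above roughly 7 out of 8 are often treated as suspicious) is a rule of thumb rather than a definitive test.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, from 0.0 (constant) to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)   # how often each byte value occurs
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Running it per section or per embedded blob, rather than over the whole file, usually gives a clearer picture, since a small encrypted payload can be hidden inside an otherwise ordinary binary.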
🛠️ Tools You'll Need
Command Arsenal
📦 Challenge Materials
Download all samples and the step-by-step analysis guide to complete the master challenge.