Rust Binary Analysis
PhD Research on Rust Binary Analysis and Malware Detection
Comprehensive guide to analyzing Rust binaries for malware detection and reverse engineering.
Overview
Rust binaries present unique challenges for reverse engineering due to:
- Monomorphization: Generic code expanded at compile time
- Name Mangling: Complex symbol naming scheme
- Trait Objects: Dynamic dispatch via vtables
- Zero-cost Abstractions: High-level code compiled to efficient machine code
- Standard Library Inlining: Heavy inlining of std library code
This guide covers the essential concepts, tools, and techniques for analyzing Rust binaries.
Rust Compilation Pipeline
Compilation Stages
Source Code (.rs)
↓
AST (Abstract Syntax Tree)
↓
HIR (High-level IR)
↓
MIR (Mid-level IR)
↓
LLVM IR
↓
Machine Code
Key Transformations
- Macro Expansion: Macros expanded before type checking
- Type Checking: Rust’s borrow checker and type system
- Monomorphization: Generic types instantiated for each concrete type
- Optimization: LLVM optimization passes
- Code Generation: Final machine code generation
Unique Rust Features
1. Monomorphization
What is it?
- Generic functions/types are specialized for each concrete type used
- Results in code duplication but zero runtime overhead
Impact on Analysis:
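fn process<T>(value: T) { /* ... */ }
// Each concrete type produces a separate function in the binary.
// With legacy mangling the copies share one path and differ only in the hash
// ("example" is a placeholder crate name):
process::<i32>(42); // _ZN7example7process17h<hash A>E
process::<String>(s); // _ZN7example7process17h<hash B>E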
Detection Strategy:
- Look for similar code patterns with different types
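- Analyze symbol names for type parameters (v0 symbols starting with _R encode generic arguments; legacy _ZN symbols usually distinguish instantiations only by their hash)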
- Track vtable usage patterns
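One way to spot monomorphized copies is to group demangled symbols by path after removing the hash suffix. The sketch below assumes the input names were demangled with the hash kept, in the form path::h followed by 16 hex digits. The symbol names and hashes are made up for illustration, and strip_hash and group_instances are hypothetical helper names.

```rust
use std::collections::BTreeMap;

/// Strip a trailing legacy hash component ("::h" + 16 hex digits), if present.
fn strip_hash(demangled: &str) -> &str {
    if let Some(idx) = demangled.rfind("::h") {
        let tail = &demangled[idx + 3..];
        if tail.len() == 16 && tail.bytes().all(|b| b.is_ascii_hexdigit()) {
            return &demangled[..idx];
        }
    }
    demangled
}

/// Count how many symbols share the same path once the hash is removed.
fn group_instances<'a>(symbols: &[&'a str]) -> BTreeMap<&'a str, usize> {
    let mut groups = BTreeMap::new();
    for &sym in symbols {
        *groups.entry(strip_hash(sym)).or_insert(0) += 1;
    }
    groups
}

fn main() {
    // Hypothetical demangled symbols; the hashes are invented.
    let symbols = [
        "example::process::h0123456789abcdef",
        "example::process::hfedcba9876543210",
        "example::main::h1111111111111111",
    ];
    for (path, count) in group_instances(&symbols) {
        println!("{count} x {path}");
    }
}
```

A path with a count above one is a candidate generic instantiation. The count alone does not say which concrete types were used.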
2. Trait Objects and Vtables
What are they?
- Dynamic dispatch mechanism in Rust
- Similar to C++ virtual functions
- Runtime polymorphism through fat pointers
Structure:
Trait Object (Fat Pointer)
├── Data Pointer → Actual object
└── Vtable Pointer → Virtual method table
Vtable Layout:
Vtable
├── Drop function pointer
├── Size
├── Alignment
└── Method pointers
Analysis Techniques:
- Locate vtable structures in .data or .rdata sections
- Trace fat pointer usage (2-word pointers)
- Map vtable methods to trait implementations
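The fat pointer and vtable layout described above can be observed from inside a Rust program. This is a minimal sketch that assumes the layout current compilers use in practice: data pointer first, vtable pointer second, and drop, size and alignment in the first three vtable words. Rust does not guarantee this layout, so treat the code as an exploration aid. Shape, Circle and the helper functions are hypothetical names.

```rust
use std::mem::{align_of, size_of, transmute};

trait Shape {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

/// Split a trait-object reference into (data pointer, vtable pointer).
/// ASSUMPTION: relies on the current, unguaranteed fat-pointer layout.
fn split_fat_pointer(obj: &dyn Shape) -> (*const (), *const usize) {
    let parts: [*const (); 2] = unsafe { transmute(obj) };
    (parts[0], parts[1] as *const usize)
}

/// Read the size and alignment slots (words 1 and 2) of a vtable.
/// ASSUMPTION: the vtable starts with [drop, size, align, methods...].
fn vtable_size_align(vtable: *const usize) -> (usize, usize) {
    unsafe { (*vtable.add(1), *vtable.add(2)) }
}

fn main() {
    let circle = Circle { radius: 1.0 };
    let obj: &dyn Shape = &circle;
    let (data, vtable) = split_fat_pointer(obj);
    let (size, align) = vtable_size_align(vtable);
    println!("fat pointer size: {} bytes", size_of::<&dyn Shape>());
    println!("data = {data:p}, vtable = {vtable:p}");
    println!("vtable says size = {size}, align = {align}");
    println!("compiler says size = {}, align = {}", size_of::<Circle>(), align_of::<Circle>());
    println!("area = {}", obj.area());
}
```

In a disassembler the same three header words help confirm that a candidate table in a read-only section is a vtable: a code pointer (or null) followed by two small integers.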
3. Name Mangling
Rust Name Mangling Scheme:
_ZN{namespace_len}{namespace}{fn_len}{function}...E
Example:
_ZN4core3fmt9Arguments6new_v117h1234567890abcdefE
_ZN                   Mangling prefix
4core                 Crate name (length 4)
3fmt                  Module name (length 3)
9Arguments            Type name (length 9)
6new_v1               Function name (length 6)
17h1234567890abcdef   Hash (length 17: "h" plus 16 hex digits)
E                     End marker
Demangling Tools:
- rustfilt: Command-line demangler
- IDA Pro: Built-in Rust demangling
- Ghidra: Rust demangler script
- c++filt: Can demangle some Rust symbols
# Using rustfilt
echo "_ZN4core3fmt9Arguments6new_v117h1234567890abcdefE" | rustfilt
# Output: core::fmt::Arguments::new_v1
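To make the length-prefixed structure concrete, here is a minimal sketch of a legacy-scheme demangler. It handles only plain components and the trailing hash. It does not decode escape sequences such as $LT$ or the v0 (_R) scheme, so prefer a real demangler like rustfilt for actual work. demangle_legacy is a hypothetical helper name.

```rust
/// Demangle a legacy Rust symbol into a `::`-separated path.
/// Returns None if the input does not follow the `_ZN<len><name>...E` shape.
fn demangle_legacy(symbol: &str) -> Option<String> {
    let mut rest = symbol.strip_prefix("_ZN")?.strip_suffix('E')?;
    let mut parts: Vec<&str> = Vec::new();
    while !rest.is_empty() {
        // Each component is a decimal length followed by that many bytes.
        let digits = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
        let len: usize = rest[..digits].parse().ok()?;
        let end = digits.checked_add(len)?;
        parts.push(rest.get(digits..end)?);
        rest = &rest[end..];
    }
    // Drop the trailing hash component: 'h' followed by 16 hex digits.
    let has_hash = parts.last().map_or(false, |p| {
        p.len() == 17 && p.starts_with('h') && p[1..].bytes().all(|b| b.is_ascii_hexdigit())
    });
    if has_hash {
        parts.pop();
    }
    if parts.is_empty() {
        None
    } else {
        Some(parts.join("::"))
    }
}

fn main() {
    let symbol = "_ZN4core3fmt9Arguments6new_v117h1234567890abcdefE";
    match demangle_legacy(symbol) {
        Some(path) => println!("{path}"),
        None => println!("not a legacy Rust symbol"),
    }
}
```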
4. Error Handling (Result and Option)
Pattern Recognition:
// Result<T, E> enum
enum Result<T, E> {
Ok(T),
Err(E),
}
// Option<T> enum
enum Option<T> {
Some(T),
None,
}
Binary Signatures:
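- Discriminant checks (tag values 0/1; niche-optimized types such as Option<&T> are tested against a null pointer instead of a separate tag)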
- Panic unwinding code
- Error propagation patterns
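The effect of the discriminant on data layout can be checked with size_of. The sketch below compares a type that needs a separate tag with one where the compiler reuses the null pointer value as the None case. classify is a hypothetical helper, and the exact size of Option<u32> is left to the compiler.

```rust
use std::mem::size_of;

/// A match on Result compiles down to a check of the discriminant.
fn classify(value: Result<u32, u8>) -> &'static str {
    match value {
        Ok(_) => "ok",
        Err(_) => "err",
    }
}

fn main() {
    // u32 has no spare bit pattern, so Option<u32> needs extra space for a tag.
    println!("u32:         {} bytes", size_of::<u32>());
    println!("Option<u32>: {} bytes", size_of::<Option<u32>>());
    // References are never null, so None is encoded as the null pointer.
    println!("&u8:         {} bytes", size_of::<&u8>());
    println!("Option<&u8>: {} bytes", size_of::<Option<&u8>>());
    println!("{} / {}", classify(Ok(1)), classify(Err(2)));
}
```

When reading disassembly, a comparison of a loaded byte or word against 0 or 1 suggests a tagged enum, while a null test on a pointer may be an Option around a reference or Box.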
5. Ownership and Borrowing
Binary Implications:
- No garbage collector → predictable memory operations
- RAII patterns → destructors called at scope exit
- Move semantics → explicit data transfers
Analysis Impact:
- Clear memory allocation/deallocation patterns
- Predictable stack frame cleanup
- Less heap fragmentation
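The points about RAII and moves can be made visible with a type that logs when its destructor runs. This is a small sketch: Guard and drop_order are hypothetical names, and the log only records the order of events.

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        // Runs wherever the compiler inserts the destructor call.
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _first = Guard { name: "first", log: Rc::clone(&log) };
        let second = Guard { name: "second", log: Rc::clone(&log) };
        let moved = second; // a move transfers ownership; no destructor runs here
        log.borrow_mut().push("after move");
        drop(moved); // explicit early drop
        log.borrow_mut().push("scope end");
    } // `_first` is dropped here, at scope exit
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in drop_order() {
        println!("{event}");
    }
}
```

In a binary, these destructor calls typically appear as drop_in_place calls on every exit path of a function, including the unwinding path.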
Binary Analysis Techniques
Static Analysis
1. String Analysis
# Extract strings
strings binary.exe | grep -i "rust"
# Look for Rust-specific strings
- "panicked at"
- "src/lib.rs"
- Cargo package names
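When the strings utility is not available, the same triage step can be approximated in a few lines of Rust. This sketch extracts printable ASCII runs and keeps those that mention the indicators listed above. ascii_strings is a hypothetical helper, and the minimum length of 6 is an arbitrary choice.

```rust
/// Extract printable ASCII runs of at least `min_len` bytes, similar to `strings`.
fn ascii_strings(data: &[u8], min_len: usize) -> Vec<String> {
    let mut out = Vec::new();
    let mut current: Vec<u8> = Vec::new();
    for &b in data {
        if b == b' ' || b.is_ascii_graphic() {
            current.push(b);
        } else {
            if current.len() >= min_len {
                out.push(String::from_utf8_lossy(&current).into_owned());
            }
            current.clear();
        }
    }
    // Flush a run that reaches the end of the data.
    if current.len() >= min_len {
        out.push(String::from_utf8_lossy(&current).into_owned());
    }
    out
}

fn main() -> std::io::Result<()> {
    let path = match std::env::args().nth(1) {
        Some(p) => p,
        None => {
            eprintln!("usage: pass the path of the binary to scan");
            return Ok(());
        }
    };
    let data = std::fs::read(path)?;
    for s in ascii_strings(&data, 6) {
        if s.contains("panicked at") || s.contains("src/lib.rs") {
            println!("{s}");
        }
    }
    Ok(())
}
```

Rust string literals are not NUL-terminated, so adjacent literals often appear merged into one long run in the output.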
2. Symbol Analysis
# List symbols
nm binary.exe
# Demangle Rust symbols
nm binary.exe | rustfilt
# Find Rust core functions
nm binary.exe | grep "_ZN" | rustfilt | grep "core::"
3. Section Analysis
# Analyze binary sections
readelf -S binary.elf
# Check for Rust-specific sections
- .text (large due to monomorphization)
- .rodata (strings, vtables)
- .data.rel.ro (vtables)
Dynamic Analysis
1. Debugging
# GDB with Rust support
gdb binary
(gdb) set language rust
(gdb) break main
(gdb) run
2. Tracing
# System call tracing
strace -f binary
# Library call tracing
ltrace binary
3. Memory Analysis
- Monitor heap allocations (the system allocator is the default since Rust 1.32; jemalloc appears only when the binary opts in)
- Track ownership transfers
- Identify panic/unwinding behavior
Tooling
Disassemblers
IDA Pro
- Strong Rust support
- Automatic symbol demangling
- Trait object recognition
- FLIRT signatures for std library
Configuration:
Options → Compiler:
- Set compiler to "Visual C++" or "GNU C++"
- Enable Rust name demangling
Ghidra
- Community Rust analysis scripts
- Custom type libraries for Rust
- Vtable analysis capabilities
Plugins:
- ghidra-rust-demangler
- Rust type libraries
Binary Ninja
- Rust demangling support
- API for custom Rust analysis
- Community plugins
Decompilers
Challenges:
- Heavy inlining makes decompilation verbose
- Monomorphization creates similar code blocks
- Name mangling obscures function purpose
Best Practices:
- Demangle symbols first
- Start with exported/public functions
- Use cross-references to understand code flow
- Look for panic handlers to identify error paths
Specialized Tools
- cargo-binutils: Analyze Rust binaries
  cargo install cargo-binutils
  cargo objdump -- -d
  cargo nm
  cargo size
- twiggy: Code size profiler
  cargo install twiggy
  twiggy top binary.wasm
- cargo-bloat: Find what takes space
  cargo install cargo-bloat
  cargo bloat --release
Identifying Rust Binaries
Static Indicators
✅ Strong Indicators:
- Rust panic strings: “panicked at”, “attempt to”
- Source paths: “src/lib.rs”, “.cargo/registry”
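- Mangled symbols starting with _ZN and ending in a 17h<hash>E component (legacy scheme), or starting with _R (v0 scheme); _ZN alone also matches C++ symbols
- Cargo package metadata strings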
✅ Moderate Indicators:
- Large .text section (monomorphization)
- Many similar functions (generic instantiations)
- Specific allocator patterns (jemalloc, when the binary opts in; the system allocator is the default since Rust 1.32)
- No C++ RTTI structures
Automated Detection
Using file/strings:
# Check for Rust indicators
strings binary.exe | grep -E "(rust|cargo|panicked)"
# Look for source paths
strings binary.exe | grep "\.rs"
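The same check can be scripted directly against the raw bytes. The sketch below searches for the strong string indicators listed earlier in this section and reports which ones were found. It is a heuristic only: a stripped or packed binary may contain none of them. rust_indicators and contains are hypothetical helper names.

```rust
/// Naive byte-substring search; fine for a handful of short needles.
fn contains(haystack: &[u8], needle: &[u8]) -> bool {
    !needle.is_empty() && haystack.windows(needle.len()).any(|w| w == needle)
}

/// Return the indicator strings that occur in the binary image.
fn rust_indicators(data: &[u8]) -> Vec<&'static str> {
    const INDICATORS: [&str; 4] = [
        "panicked at",
        "attempt to",
        "src/lib.rs",
        ".cargo/registry",
    ];
    INDICATORS
        .iter()
        .copied()
        .filter(|indicator| contains(data, indicator.as_bytes()))
        .collect()
}

fn main() -> std::io::Result<()> {
    let path = match std::env::args().nth(1) {
        Some(p) => p,
        None => {
            eprintln!("usage: pass the path of the binary to check");
            return Ok(());
        }
    };
    let hits = rust_indicators(&std::fs::read(path)?);
    println!("{} of 4 indicators found: {:?}", hits.len(), hits);
    Ok(())
}
```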
Using cargo-modules (if source available):
cargo modules generate graph --lib | dot -Tpng > graph.png
Common Patterns
1. Main Function
Rust main:
; Typical Rust main wrapper
push rbp
mov rbp, rsp
call _ZN3std2rt10lang_start ; std::rt::lang_start
2. Panic Handler
Panic signature:
; Panic location structure
; [filename_ptr, filename_len, line, column]
lea rdi, [rip + panic_loc]
call _ZN4core9panicking9panic_fmt
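The location structure passed to the panic machinery corresponds to std::panic::Location in source code. This sketch obtains one with #[track_caller] and prints its fields, which shows why the file path is stored as a separate string next to integer line and column values. caller_location is a hypothetical helper name.

```rust
use std::panic::Location;

/// Return the source location of whoever called this function.
#[track_caller]
fn caller_location() -> &'static Location<'static> {
    Location::caller()
}

fn main() {
    let loc = caller_location();
    // The file is a string slice; line and column are plain integers,
    // so the "file:line:col" text is only assembled when a panic is printed.
    println!("file   = {}", loc.file());
    println!("line   = {}", loc.line());
    println!("column = {}", loc.column());
}
```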
3. String Slices
&str representation:
; [pointer, length]
mov rdi, string_ptr
mov rsi, string_len
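The pointer and length pair can be inspected from Rust itself. This sketch shows that a subslice is just a different pointer and length into the same buffer, which is why Rust strings in a binary have no NUL terminator to rely on. str_parts is a hypothetical helper name.

```rust
use std::mem::size_of;

/// Return the two words that make up a &str: data pointer and byte length.
fn str_parts(s: &str) -> (*const u8, usize) {
    (s.as_ptr(), s.len())
}

fn main() {
    let text = "hello, world";
    let word = &text[7..]; // "world", sharing the same backing bytes

    let (text_ptr, text_len) = str_parts(text);
    let (word_ptr, word_len) = str_parts(word);

    println!("&str is {} bytes wide", size_of::<&str>());
    println!("text: ptr = {text_ptr:p}, len = {text_len}");
    println!("word: ptr = {word_ptr:p}, len = {word_len}");
    println!("offset = {}", word_ptr as usize - text_ptr as usize);
}
```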
Analysis Workflow
1. Initial Triage
- Identify as Rust binary
- Extract strings and symbols
- Demangle symbol names
- Map imports/exports
2. Control Flow Analysis
- Locate main function
- Identify panic handlers
- Map trait object usage
- Trace error propagation
3. Data Flow Analysis
- Track string operations
- Monitor memory allocations
- Analyze crypto usage
- Identify network I/O
4. Behavioral Analysis
- System calls
- File operations
- Network connections
- Registry/config access
Advanced Topics
1. Async/Await Analysis
- State machines from async functions
- Future trait implementations
- Tokio/async-std runtime detection
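To get a feel for what an async state machine looks like, the sketch below writes one by hand. The enum is the state, and poll dispatches on it and advances it. Compiler-generated futures differ in detail, so this is an analogue rather than actual compiler output. AddLater, NoopWake and run_to_completion are hypothetical names.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hand-written analogue of an async function with one suspension point.
enum AddLater {
    Start { a: u32, b: u32 },
    Waiting { sum: u32 },
    Done,
}

impl Future for AddLater {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let this = self.get_mut();
        // The match on the state tag is the dispatch an analyst sees in poll.
        match *this {
            AddLater::Start { a, b } => {
                *this = AddLater::Waiting { sum: a + b };
                cx.waker().wake_by_ref();
                Poll::Pending
            }
            AddLater::Waiting { sum } => {
                *this = AddLater::Done;
                Poll::Ready(sum)
            }
            AddLater::Done => panic!("polled after completion"),
        }
    }
}

/// Waker that does nothing; enough for a busy-polling demo loop.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll a future until it completes and report how many polls it took.
fn run_to_completion<F: Future + Unpin>(mut fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(value) = Pin::new(&mut fut).poll(&mut cx) {
            return (value, polls);
        }
    }
}

fn main() {
    let (value, polls) = run_to_completion(AddLater::Start { a: 2, b: 3 });
    println!("result = {value} after {polls} polls");
}
```

In a real binary the runtime (for example Tokio) plays the role of run_to_completion, and each async function contributes one such poll routine with a jump on the stored state.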
2. Macro-Generated Code
- Procedural macro output
- Derive macros (Debug, Serialize, etc.)
- Pattern recognition in generated code
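Derive macros are a useful source of recoverable names. As a small illustration, the derived Debug implementation below has to embed the struct and field names as string data in order to print them. Config and render are hypothetical names, and whether the strings survive in a given binary depends on whether the implementation is used and on optimization settings.

```rust
#[derive(Debug)]
struct Config {
    host: String,
    port: u16,
}

/// Format a value through its derived Debug implementation.
fn render(cfg: &Config) -> String {
    format!("{:?}", cfg)
}

fn main() {
    let cfg = Config {
        host: String::from("localhost"),
        port: 8080,
    };
    // The type name and field names in the output come from string
    // literals emitted by the derive macro.
    println!("{}", render(&cfg));
}
```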
3. Unsafe Code
- Raw pointer operations
- FFI boundaries
- Inline assembly
Case Studies
Analyzing a Simple Rust Binary
See Basic PL Concepts for a detailed walkthrough.
Real-World Malware
- BlackCat/ALPHV ransomware (Rust-based)
- TrickBot (Rust modules)
- Rustbucket macOS malware
Resources
Official Documentation
Research Papers
- “Rust Binary Analysis, Feature by Feature” - Check Point Research
- rust-re-tour
Tools
- rustfilt - Symbol demangler
- cargo-binutils
- cargo-asm - View assembly output