YARA (Yet Another Ridiculous Acronym) is the standard language for identifying and classifying malware samples based on textual or binary patterns. If MITRE ATT&CK describes behavior, YARA describes characteristics.
It is the primary tool used by CTI analysts to hunt for malware families across datasets like VirusTotal.
Anatomy of a YARA Rule
A YARA rule consists of three main sections: meta, strings, and condition.
rule Suspicious_Powershell_Downloader {
meta:
description = "Detects PowerShell script downloading content"
author = "CTI Analyst"
tlp = "WHITE"
strings:
$s1 = "Net.WebClient" nocase
$s2 = "DownloadString" nocase
$hex_pattern = { 4D 5A 90 } // HEX for PE File Header
condition:
all of them
}
1. Meta
Contains descriptive information. This does not affect detection but is crucial for organization and context.
2. Strings
Defines the patterns to look for.
-
Text Strings: Plain text (e.g.,
"password"). -
Hex Strings: Byte sequences, useful for binary analysis.
-
Regex: Regular expressions for flexible matching.
3. Condition
Boolean logic that determines when the rule matches.
-
any of them: Matches if at least one string is found. -
$s1 and $s2: Matches only if both are found. -
filesize < 500KB: Matches only small files.
Best Practices
-
Be Specific: Generic strings like "Microsoft" will generate too many false positives.
-
Use Hex: Malware authors can change variable names, but changing compiled bytecode is harder. This targets the "Tools" level of the Pyramid of Pain.