Update Scanner class to provide options to init() #445

Draft: wants to merge 4 commits into base: master
8 changes: 4 additions & 4 deletions .pre-commit-config.yaml
@@ -1,7 +1,7 @@
 ---
 repos:
   - repo: https://github.com/psf/black
-    rev: "22.6.0"
+    rev: "24.2.0"
     hooks:
       - id: black
   - repo: https://github.com/pre-commit/pre-commit-hooks
@@ -17,16 +17,16 @@ repos:
         args:
           - -b main
   - repo: https://github.com/PyCQA/flake8
-    rev: "4.0.1"
+    rev: "7.0.0"
     hooks:
       - id: flake8
   - repo: https://github.com/PyCQA/isort
-    rev: "5.11.5"
+    rev: "5.13.2"
     hooks:
       - id: isort
         args: ["--profile", "black", "--filter-files"]
 # - repo: https://github.com/pre-commit/mirrors-mypy
-# rev: v0.961
+# rev: "1.9.0"
 # hooks:
 # - id: mypy
 # additional_dependencies:
19 changes: 17 additions & 2 deletions docs/README.md
@@ -767,6 +767,21 @@ This option is set to false by default.
## Scanners
Each scanner parses files of a specific flavor and performs data collection and/or file extraction on them. Scanners are typically named after the type of file they are intended to scan (e.g. "ScanHtml", "ScanPe", "ScanRar") but may also be named after the type of function or tool they use to perform their tasks (e.g. "ScanExiftool", "ScanHeader", "ScanOcr").

+Scanners implement the following functions:
+
+- init(options)
+  - Executes when Strelka starts
+- scan(data, file, options, expire_at)
+  - Does the work of the scanner when a file is submitted
+
+Scanners have access to the following helpers:
+- emit_file(data, name, flavors)
+  - Submits data extracted in a scanner as a new file
+
+Scanners should use the following standard event fields:
+- flags
+  - Highlights error conditions or important attributes, e.g. "bad_format", "encrypted_headers"
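
The contract described above can be sketched as follows. This is a minimal illustration, not Strelka's implementation: the `Scanner` base class here is a simplified, hypothetical stand-in for `strelka.Scanner` so the example runs on its own, and `ScanUpper`, its `length` option, and the `empty_file` flag are invented for the example. It assumes the base class calls `init(options)` once at startup and that `flags` is a list on the scanner.

```python
class Scanner:
    """Hypothetical, simplified stand-in for strelka.Scanner."""

    def __init__(self, options=None):
        self.event = {}    # metadata collected by scan()
        self.flags = []    # standard field for error conditions / attributes
        self.emitted = []  # files handed to emit_file()
        # Assumption: init() receives the scanner's options once, at startup.
        self.init(options or {})

    def init(self, options):
        pass

    def emit_file(self, data, name="", flavors=None):
        # Record extracted data as a new child file.
        self.emitted.append({"data": data, "name": name, "flavors": flavors or []})


class ScanUpper(Scanner):
    """Hypothetical scanner: logs an upper-cased header and emits the rest."""

    def init(self, options):
        # One-time setup driven by options instead of per-scan work.
        self.length = options.get("length", 50)

    def scan(self, data, file, options, expire_at):
        if not data:
            self.flags.append("empty_file")
            return
        self.event["header"] = data[: self.length].upper()
        self.emit_file(data[self.length:], name="remainder")
```

Passing `options` to `init()` lets a scanner do option-dependent setup once rather than on every call to `scan()`.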

### Scanner List
The table below describes each scanner and its options. Each scanner has the hidden option "scanner_timeout" which can override the distribution scanner_timeout.
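
As a rough illustration of the override described above, a per-scanner `scanner_timeout` can be resolved against a distribution-wide value with a plain dictionary lookup. The function name and the default of 150 seconds are assumptions made for this sketch, not values taken from Strelka.

```python
# Hypothetical distribution-wide default, in seconds.
DISTRIBUTION_SCANNER_TIMEOUT = 150


def effective_timeout(options, distribution_timeout=DISTRIBUTION_SCANNER_TIMEOUT):
    """Return the scanner's own timeout if set, else the distribution value."""
    return options.get("scanner_timeout", distribution_timeout)
```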

@@ -789,7 +804,7 @@ The table below describes each scanner and its options. Each scanner has the hid
| ScanFalconSandbox | Sends files to an instance of Falcon Sandbox | `server` -- URL of the Falcon Sandbox API interface <br>`priority` -- Falcon Sandbox priority assigned to the task (defaults to `3`)<br>`timeout` -- amount of time (in seconds) to wait for the task to upload (defaults to `60`)<br>`envID` -- list of numeric environment IDs that tells Falcon Sandbox which sandbox to submit a sample to (defaults to `[100]`)<br>`api_key` -- API key used for authenticating to Falcon Sandbox (defaults to None, optionally read from environment variable "FS_API_KEY")<br>`api_secret` -- API secret key used for authenticating to Falcon Sandbox (defaults to None, optionally read from environment variable "FS_API_SECKEY") |
| ScanFooter | Collects file footer | `length` -- number of footer characters to log as metadata (defaults to `50`) <br> `encodings` -- list of output encodings, any of `classic`, `raw`, `hex`, `backslash` |
| ScanGif | Extracts data embedded in GIF files | N/A |
-| ScanGzip | Decompresses gzip files | N/A
+| ScanGzip | Decompresses gzip files | N/A
| ScanHash | Calculates file hash values | N/A |
| ScanHeader | Collects file header | `length` -- number of header characters to log as metadata (defaults to `50`) <br> `encodings` -- list of output encodings, any of `classic`, `raw`, `hex`, `backslash` |
| ScanHtml | Collects metadata and extracts embedded files from HTML files | `parser` -- sets the HTML parser used during scanning (defaults to `html.parser`) <br> `max_links` -- Maximum amount of links to output in hyperlinks field (defaults to `50`) |
@@ -839,7 +854,7 @@ The table below describes each scanner and its options. Each scanner has the hid
| ScanXml | Log metadata and extract files from XML files | `extract_tags` -- list of XML tags that will have their text extracted as child files (defaults to empty list)<br>`metadata_tags` -- list of XML tags that will have their text logged as metadata (defaults to empty list) |
| ScanYara | Scans files with YARA rules | `location` -- location of the YARA rules file or directory (defaults to `/etc/strelka/yara/`)<br>`compiled` -- Enable use of compiled YARA rules, as well as the path.<br>`store_offset` -- Stores file offset for YARA match<br>`offset_meta_key` -- YARA meta key that must exist in the YARA rule for the offset to be stored.<br>`offset_padding` -- Amount of data to be stored before and after offset for additional context.<br>`category_key` -- Metadata key used to extract categories for YARA matches.<br>`categories` -- List of categories to organize YARA rules, which can be individually toggled to show metadata.<br>`show_meta` -- Toggles whether to show metadata for matches in each category.<br>`meta_fields` -- Specifies which metadata fields should be extracted for display.<br>`show_all_meta` -- Displays all metadata for each YARA rule match when enabled. |
| ScanZip | Extracts files from zip archives | `limit` -- maximum number of files to extract (defaults to `1000`)<br>`limit_metadata` -- stop adding file metadata when `limit` is reached (defaults to true)<br>`size_limit` -- maximum size for extracted files (defaults to `250000000`)<br>`crack_pws` -- use a dictionary to crack encrypted files (defaults to false)<br>`log_pws` -- log cracked passwords (defaults to true)<br>`password_file` -- location of passwords file for zip archives (defaults to `/etc/strelka/passwords.dat`) |
-| ScanZlib | Decompresses gzip files | N/A
+| ScanZlib | Decompresses gzip files | N/A

## Tests
As Strelka consists of many scanners and dependencies for those scanners, pytest tests are particularly valuable for verifying the ongoing functionality of Strelka and its scanners. Tests allow users to write test cases that verify the correct behavior of Strelka scanners to ensure that the scanners remain reliable and accurate. Additionally, using pytest can help streamline the development process, allowing developers to focus on writing new features and improvements for the scanners. Strelka contains a set of standard test fixture files that represent the types of files Strelka ingests. Test fixtures can also be loaded remotely with the helper functions `get_remote_fixture` and `get_remote_fixture_archive` for scanner tests that need malicious samples.
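
The general shape of such a scanner test can be sketched as below. This is an illustration only: `FakeScanLength` and `run_scan` are hypothetical stand-ins invented for the example and are not Strelka's scanners or test helpers. The idea is to run a scanner over known fixture bytes and compare the resulting event with an expected dictionary.

```python
class FakeScanLength:
    """Hypothetical stand-in scanner: records the size of the data it is given."""

    def __init__(self):
        self.event = {}
        self.flags = []

    def scan(self, data, file, options, expire_at):
        self.event["size"] = len(data)
        if not data:
            self.flags.append("empty")


def run_scan(scanner_cls, data, options=None):
    """Run one scan over in-memory fixture bytes and return the event."""
    scanner = scanner_cls()
    scanner.scan(data, None, options or {}, 0)
    return {**scanner.event, "flags": scanner.flags}


def test_scan_length_matches_expected_event():
    # pytest collects functions named test_*; a bare assert is sufficient.
    expected = {"size": 4, "flags": []}
    assert run_scan(FakeScanLength, b"test") == expected
```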
3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_antiword.py
@@ -12,6 +12,9 @@ class ScanAntiword(strelka.Scanner):
             Defaults to '/tmp/'.
     """

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         tmp_directory = options.get("tmp_directory", "/tmp/")

3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_base64.py
@@ -6,6 +6,9 @@
 class ScanBase64(strelka.Scanner):
     """Decodes base64-encoded file."""

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         decoded = base64.b64decode(data)

3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_base64_pe.py
@@ -8,6 +8,9 @@
 class ScanBase64PE(strelka.Scanner):
     """Decodes base64-encoded file."""

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         with io.BytesIO(data) as encoded_file:
             extract_data = b""
2 changes: 1 addition & 1 deletion src/python/strelka/scanners/scan_batch.py
@@ -13,7 +13,7 @@ class ScanBatch(strelka.Scanner):
         lexer: Pygments lexer ('batch') used to parse the file.
     """

-    def init(self):
+    def init(self, options):
         self.lexer = lexers.get_lexer_by_name("batch")

     def scan(self, data, file, options, expire_at):
3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_bmp_eof.py
@@ -7,6 +7,9 @@ class ScanBmpEof(strelka.Scanner):
     the expected marker.
     """

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         expectedSize = int.from_bytes(data[2:6], "little")
         actualSize = len(data)
3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_bzip2.py
@@ -7,6 +7,9 @@
 class ScanBzip2(strelka.Scanner):
     """Decompresses bzip2 files."""

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         with io.BytesIO(data) as bzip2_io:
             with bz2.BZ2File(filename=bzip2_io) as bzip2_obj:
3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_ccn.py
@@ -22,6 +22,9 @@ def digits_of(n):
     def is_luhn_valid(self, card_number):
         return self.luhn_checksum(card_number) == 0

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         # re_amex = re.compile(rb"[^0-9](3[47][0-9]{13})[^0-9]")
         # re_disc = re.compile(rb"[^0-9](6[0-9]{15})[^0-9]")
2 changes: 1 addition & 1 deletion src/python/strelka/scanners/scan_cuckoo.py
@@ -33,7 +33,7 @@ class ScanCuckoo(strelka.Scanner):
         password: See description above.
     """

-    def init(self):
+    def init(self, options):
         self.username = None
         self.password = None
         self.auth_check = False
3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_delay.py
@@ -6,6 +6,9 @@
 class ScanDelay(strelka.Scanner):
     """Delays scanner execution."""

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         delay = options.get("delay", 5.0)

3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_dmg.py
@@ -13,6 +13,9 @@ class ScanDmg(strelka.Scanner):

     EXCLUDED_ROOT_DIRS = ["[SYSTEM]"]

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         file_limit = options.get("limit", 1000)
         tmp_directory = options.get("tmp_file_directory", "/tmp/")
3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_docx.py
@@ -16,6 +16,9 @@ class ScanDocx(strelka.Scanner):
             Defaults to False.
     """

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         extract_text = options.get("extract_text", False)
         with io.BytesIO(data) as docx_io:
3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_donut.py
@@ -10,6 +10,9 @@
 class ScanDonut(strelka.Scanner):
     """Extracts configs and modules from donut payloads"""

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         tmp_directory = options.get("tmp_directory", "/tmp/")

3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_elf.py
@@ -9,6 +9,9 @@
 class ScanElf(strelka.Scanner):
     """Collects metadata from ELF files."""

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         elf = ELF.parse(raw=list(data))

3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_email.py
@@ -32,6 +32,9 @@ class ScanEmail(strelka.Scanner):
     including inline images.
     """

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         """
         Processes the email, extracts metadata and attachments, and optionally generates a thumbnail.
3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_encrypted_doc.py
@@ -119,6 +119,9 @@ class ScanEncryptedDoc(strelka.Scanner):
             Defaults to /etc/strelka/passwords.dat.
     """

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         jtr_path = options.get("jtr_path", "/jtr/")
         tmp_directory = options.get("tmp_file_directory", "/tmp/")
3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_encrypted_zip.py
@@ -109,6 +109,9 @@ class ScanEncryptedZip(strelka.Scanner):
             Defaults to /etc/strelka/passwords.dat.
     """

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         jtr_path = options.get("jtr_path", "/jtr/")
         tmp_directory = options.get("tmp_file_directory", "/tmp/")
3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_entropy.py
@@ -6,5 +6,8 @@
 class ScanEntropy(strelka.Scanner):
     """Calculates entropy of files."""

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         self.event["entropy"] = entropy.shannon_entropy(data)
2 changes: 1 addition & 1 deletion src/python/strelka/scanners/scan_exception.py
@@ -14,7 +14,7 @@ class ScanException(strelka.Scanner):
             Defaults to 0 (unlimited).
     """

-    def init(self):
+    def init(self, options):
         pass

     def scan(self, data, file, options, expire_at):
3 changes: 3 additions & 0 deletions src/python/strelka/scanners/scan_exiftool.py
@@ -16,6 +16,9 @@ class ScanExiftool(strelka.Scanner):
             Defaults to '/tmp/'.
     """

+    def init(self, options):
+        pass
+
     def scan(self, data, file, options, expire_at):
         tmp_directory = options.get("tmp_directory", "/tmp/")
