Merge pull request #85 from mitre/feature-celery
So many commits, much celery, wow
Drewsif authored Mar 12, 2018
2 parents 387a3a9 + cc4ccb2 commit 09004f7
Showing 155 changed files with 35,633 additions and 2,294 deletions.
22 changes: 21 additions & 1 deletion .gitignore
@@ -11,13 +11,21 @@ report.json
__pycache__/
*.py[cod]
*.swp
*.swo

# C extensions
*.so
*.dll

#PyCharm
.idea
#Keys dir

# VSCode
.vscode/

# Keys dir
keys/

# Distribution / packaging
.Python
env/
@@ -34,14 +42,17 @@ var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
.pytest_cache
htmlcov/
@@ -50,16 +61,25 @@ htmlcov/
.cache
nosetests.xml
coverage.xml
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Sqlite DB
sqlite.db
task_db
testing.db

# Tmp Upload Dir
utils/tmp/
12 changes: 12 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,12 @@
- repo: git@github.com:pre-commit/pre-commit-hooks
  sha: v1.2.0
  hooks:
  - id: end-of-file-fixer
  - id: trailing-whitespace
  - id: check-merge-conflict
  - id: detect-private-key
  - id: mixed-line-ending
  - id: flake8
    args:
    - --ignore=E126,E127,E128,E402
    - --max-line-length=120
6 changes: 6 additions & 0 deletions .travis.yml
@@ -5,6 +5,12 @@ python:
  - "3.4"
  - "3.5"
  - "3.6"
env:
  - MOZ_HEADLESS=1
addons:
  firefox: latest
before_install:
  - npm install -g geckodriver
install:
  - yes "" | sudo -HE ./install.sh
  - pip install -r requirements.txt
1 change: 0 additions & 1 deletion LICENSE
@@ -364,4 +364,3 @@ Exhibit B - "Incompatible With Secondary Licenses" Notice
This Source Code Form is "Incompatible
With Secondary Licenses", as defined by
the Mozilla Public License, v. 2.0.

172 changes: 88 additions & 84 deletions README.md
@@ -1,84 +1,88 @@
MultiScanner
============
[![Build Status](https://travis-ci.org/mitre/multiscanner.svg)](https://travis-ci.org/mitre/multiscanner)

Introduction
------------
MultiScanner is a file analysis framework that assists the user in evaluating a set
of files by automatically running a suite of tools for the user and aggregating the output.
Tools can be custom built python scripts, web APIs, software running on another machine, etc.
Tools are incorporated by creating modules that run in the MultiScanner framework.

Modules are designed to be quickly written and easily incorporated into the framework.
Currently written and maintained modules are related to malware analytics, but the framework is not limited to that
scope. For a list of modules you can look in [modules](modules), descriptions and config
options can be found in [docs/modules.md](docs/modules.md)

Requirements
------------
Python 3.6 is recommended. Compatibility with 2.7+ and
3.4+ is supported but not as thoroughly maintained and tested. Please submit an issue
or a pull request fixing any issues found with other versions of Python.


An installer script is included in the project [install.sh](<install.sh>), which
installs the prerequisites on most systems.

Installation
------------
### MultiScanner ###
If you're running on a RedHat or Debian based linux distribution you should try and run
[install.sh](<install.sh>). Otherwise the required python packages are defined in
[requirements.txt](<requirements.txt>).

MultiScanner must have a configuration file to run. Generate the MultiScanner default
configuration by running `python multiscanner.py init` after cloning the repository.
This command can be used to rewrite the configuration file to its default state or,
if new modules have been written, to add their configuration to the configuration
file.

### Analytic Machine ###
Default modules have the option to be run locally or via SSH. The development team
runs MultiScanner on a Linux host and hosts the majority of analytical tools on
a separate Windows machine. The SSH server used in this environment is freeSSHd
from <http://www.freesshd.com/>.

A network share accessible to both the MultiScanner and the Analytic Machines is
required for the multi-machine setup. Once configured, the network share path must
be identified in the configuration file, config.ini. To do this, set the `copyfilesto`
option under `[main]` to be the mount point on the system running MultiScanner.
Modules can have a `replacement path` option, which is the network share mount point
on the analytic machine.

Module Writing
--------------
Modules are intended to be quickly written and incorporated into the framework.
A finished module must be placed in the modules folder before it can be used. The
configuration file does not need to be manually updated. See [docs/module\_writing.md](<docs/module_writing.md>)
for more information.

Module Configuration
--------------------
Modules are configured within the configuration file, config.ini. See
[docs/modules.md](<docs/modules.md>) for more information.

Python API
----------
MultiScanner can be incorporated as a module in another projects. Below is a simple
example of how to import MultiScanner into a Python script.

``` python
import multiscanner
output = multiscanner.multiscan(FileList)
Results = multiscanner.parse_reports(output, python=True)
```

Results is a dictionary object where each key is a filename of a scanned file.

`multiscanner.config_init(filepath)` will create a default configuration file at
the location defined by filepath.

Other Reading
-------------
For more information on module configuration or writing modules check the
[docs](<docs>) folder.
MultiScanner
============
[![Build Status](https://travis-ci.org/mitre/multiscanner.svg)](https://travis-ci.org/mitre/multiscanner)

Introduction
------------
MultiScanner is a file analysis framework that assists the user in evaluating a set
of files by automatically running a suite of tools for the user and aggregating the output.
Tools can be custom built python scripts, web APIs, software running on another machine, etc.
Tools are incorporated by creating modules that run in the MultiScanner framework.

Modules are designed to be quickly written and easily incorporated into the framework.
Currently written and maintained modules are related to malware analytics, but the framework is not limited to that
scope. For a list of modules, look in [modules](modules); descriptions and config
options can be found in [docs/modules.md](docs/modules.md).

Requirements
------------
Python 3.6 is recommended. Compatibility with 2.7+ and
3.4+ is supported but not as thoroughly maintained and tested. Please submit an issue
or a pull request fixing any issues found with other versions of Python.


An installer script is included in the project [install.sh](<install.sh>), which
installs the prerequisites on most systems.

Installation
------------
### MultiScanner ###
If you're running on a RedHat- or Debian-based Linux distribution, try running
[install.sh](<install.sh>). Otherwise, the required Python packages are defined in
[requirements.txt](<requirements.txt>).

MultiScanner must have a configuration file to run. Generate the MultiScanner default
configuration by running `python multiscanner.py init` after cloning the repository.
This command can be used to rewrite the configuration file to its default state or,
if new modules have been written, to add their configuration to the configuration
file.

### Analytic Machine ###
Default modules have the option to be run locally or via SSH. The development team
runs MultiScanner on a Linux host and hosts the majority of analytical tools on
a separate Windows machine. The SSH server used in this environment is freeSSHd
from <http://www.freesshd.com/>.

A network share accessible to both the MultiScanner and the Analytic Machines is
required for the multi-machine setup. Once configured, the network share path must
be identified in the configuration file, config.ini. To do this, set the `copyfilesto`
option under `[main]` to be the mount point on the system running MultiScanner.
Modules can have a `replacement path` option, which is the network share mount point
on the analytic machine.
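
As a rough sketch of the multi-machine settings described above, the snippet below builds the relevant parts of config.ini in memory with Python's standard `configparser`. Only `[main]`, `copyfilesto`, and `replacement path` come from this README; the module section name `ExampleModule` and both paths are hypothetical placeholders.

```python
import configparser
import io

# Build an illustrative configuration in memory; nothing is written to disk.
config = configparser.ConfigParser()

# [main] / copyfilesto: mount point of the network share on the host
# running MultiScanner. The path is a hypothetical placeholder.
config["main"] = {"copyfilesto": "/mnt/multiscanner_share"}

# "ExampleModule" is a hypothetical module section name.
# replacement path: mount point of the same share on the analytic machine.
config["ExampleModule"] = {"replacement path": "X:\\"}

# Render the INI text that would end up in config.ini.
buffer = io.StringIO()
config.write(buffer)
config_text = buffer.getvalue()
print(config_text)
```

The real section names and any additional options for each module are listed in [docs/modules.md](<docs/modules.md>); the values above only illustrate how the two mount points relate to each other.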

Module Writing
--------------
Modules are intended to be quickly written and incorporated into the framework.
A finished module must be placed in the modules folder before it can be used. The
configuration file does not need to be manually updated. See [docs/module\_writing.md](<docs/module_writing.md>)
for more information.

Module Configuration
--------------------
Modules are configured within the configuration file, config.ini. See
[docs/modules.md](<docs/modules.md>) for more information.

Python API
----------
MultiScanner can be incorporated as a module in other projects. Below is a simple
example of how to import MultiScanner into a Python script.

``` python
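import multiscanner
# FileList: paths of the files to scan (hypothetical example paths)
FileList = ['/tmp/sample1.bin', '/tmp/sample2.bin']
output = multiscanner.multiscan(FileList)
Results = multiscanner.parse_reports(output, python=True)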
```

Results is a dictionary object where each key is a filename of a scanned file.

`multiscanner.config_init(filepath)` will create a default configuration file at
the location defined by filepath.
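
To make the shape of `Results` more concrete, here is a minimal sketch that walks such a dictionary. It assumes each filename maps to a dictionary keyed by module name, which this README does not spell out. The `summarize` helper and the sample data are hypothetical stand-ins so that the snippet runs without MultiScanner installed.

```python
def summarize(results):
    """Return one line per scanned file listing the modules that reported on it."""
    lines = []
    for filename in sorted(results):
        # Assumption: each value is a dict of per-module output.
        modules = sorted(results[filename])
        lines.append("{}: {}".format(filename, ", ".join(modules) or "no results"))
    return lines


# Hand-written stand-in for the dictionary returned by
# multiscanner.parse_reports(output, python=True).
sample_results = {
    "/tmp/a.exe": {"ModuleA": "value", "ModuleB": ["x", "y"]},
    "/tmp/b.doc": {},
}

for line in summarize(sample_results):
    print(line)
```

In a real script, `sample_results` would be replaced by the `Results` object from the example above; the per-module values depend on which modules are enabled.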

Distributed MultiScanner
------------------------
MultiScanner is also part of a distributed, scalable file analysis framework, complete with distributed task management, web interface, REST API, and report storage. Please see [Distributed MultiScanner](<docs/distributed_multiscanner.md>) for more details. Additionally, we distribute a standalone Docker container with the base set of features (web UI, REST API, ElasticSearch node) as an introduction to the capabilities of this Distributed MultiScanner. See [here](<docs/docker_standalone.md>) for more details. (*Note*: this standalone container should not be used in production; it is simply a primer on what a full installation would look like.)

Other Reading
-------------
For more information on module configuration or writing modules check the
[docs](<docs>) folder.
26 changes: 13 additions & 13 deletions TODO.md
@@ -1,13 +1,13 @@
This is a list of things that are wanted features

# Feature Updates #
- **Better output** - Printing json to the console is not super pretty. Maybe making an HTML output available for an analyst?
- **Module logging** - Create an API that allows modules to log errors and messages to a file.
- **Multiprocessing** - Currently modules are only run as threads, giving modules access to a multiprocessing pool for cpu intensive modules would be good.
- **Ability for modules to submit files** - Having modules be able to extract files that should be scanned and included in the report could be helpful in some use cases.
- **Maliciousness Weight** - Allow an analyst to define custom weights to results to priorities what to look at. Also having a "is malicious" flag if a file breaches a threshold
- **REST API** - Creating a script that provides a web api to submit files and pull reports.

# New Modules #
- OPSWAT Metascan
- PEframe https://github.com/guelfoweb/peframe
This is a list of things that are wanted features

# Feature Updates #
- **Better output** - Printing json to the console is not super pretty. Maybe making an HTML output available for an analyst?
- **Module logging** - Create an API that allows modules to log errors and messages to a file.
- **Multiprocessing** - Currently modules are only run as threads, giving modules access to a multiprocessing pool for cpu intensive modules would be good.
- **Ability for modules to submit files** - Having modules be able to extract files that should be scanned and included in the report could be helpful in some use cases.
- **Maliciousness Weight** - Allow an analyst to define custom weights for results to prioritize what to look at. Also have an "is malicious" flag if a file breaches a threshold.
- **REST API** - Creating a script that provides a web api to submit files and pull reports.

# New Modules #
- OPSWAT Metascan
- PEframe https://github.com/guelfoweb/peframe
7 changes: 5 additions & 2 deletions __init__.py
@@ -1,7 +1,10 @@
import sys
import os
import sys

sys.path.insert(0, os.path.dirname(__file__))
from . import multiscanner, storage

from . import multiscanner

common = multiscanner.common
multiscan = multiscanner.multiscan
parse_reports = multiscanner.parse_reports