WebSecScanner

Passive web application security analyzer for HAR files and raw HTTP captures — zero network calls, CI-ready, SARIF-native.

The problem

Most web application security issues are not exotic — they're misconfigurations sitting in plain sight in every HTTP response: a missing Content-Security-Policy, a session cookie without HttpOnly, a CORS endpoint that reflects the request origin with credentials, a stack trace leaking the framework version, a form that posts over http://.

Active scanners (the kind that fire payloads at a live target) are powerful but heavy: they need authorization, they generate traffic, they can disrupt production, and you cannot run them in an air-gapped CI job.

WebSecScanner takes the opposite approach. It is passive: you capture traffic you already have access to — a HAR export from browser DevTools, a proxy log, or plain curl -i output — and the scanner audits it offline. No packets are sent. Nothing leaves your machine. That makes it safe to drop into any CI pipeline and run on every pull request.

What it checks

| Analyzer | Rules | Examples of what it catches |
| --- | --- | --- |
| Security headers | `WSS-HDR-001..007` | Missing `X-Content-Type-Options`, clickjacking exposure, weak `Referrer-Policy`, missing/`report-only` CSP, legacy XSS auditor, missing COOP/Permissions-Policy |
| Content-Security-Policy | `WSS-CSP-001..007` | `unsafe-inline` / `unsafe-eval`, wildcard `*` sources, plaintext `http:` sources, missing `object-src 'none'` / `base-uri` / `default-src` |
| Cookies | `WSS-CK-001..005` | Missing `Secure` / `HttpOnly` / `SameSite`, `SameSite=None` without `Secure`, `__Host-`/`__Secure-` prefix violations |
| CORS | `WSS-CORS-001..004` | Wildcard origin with credentials, trusted `null` origin, reflected-origin-with-credentials, wildcard allowed-headers |
| Information disclosure | `WSS-INFO-001..005` | Version banners, stack traces / SQL errors, directory listings, leaked secrets (AWS/GCP/GitHub/Slack/Stripe/JWT/PEM), Luhn-valid card numbers |
| Transport | `WSS-TRANS-001..005` | Cleartext HTTP, missing/weak HSTS, active mixed content, insecure form actions |

Every finding carries a severity, CWE, OWASP mapping, evidence (redacted where sensitive), and concrete remediation guidance.

$ websec-scanner rules        # full machine + human readable rule catalogue
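
To give a feel for what a passive rule looks at, here is a minimal sketch of a cookie-attribute check in the spirit of the Cookies analyzer. It is not the scanner's implementation: the function name `missing_cookie_attributes` and the `REQUIRED` tuple are hypothetical, and the real rules also cover cases (such as prefix violations) that this sketch ignores.

```python
# Illustrative only: a hand-rolled check, not WebSecScanner's own code.
REQUIRED = ("Secure", "HttpOnly", "SameSite")


def missing_cookie_attributes(set_cookie: str) -> list[str]:
    """Return the hardening attributes absent from one Set-Cookie value."""
    present = set()
    # The first ';'-separated part is name=value; the rest are attributes.
    for part in set_cookie.split(";")[1:]:
        name = part.split("=", 1)[0].strip().lower()  # attribute names are case-insensitive
        if name:
            present.add(name)
    return [attr for attr in REQUIRED if attr.lower() not in present]


print(missing_cookie_attributes("SESSIONID=abc; Path=/"))
# ['Secure', 'HttpOnly', 'SameSite']
print(missing_cookie_attributes("id=1; Secure; HttpOnly; SameSite=Lax"))
# []
```

The check needs only the captured header text, which is why this class of issue can be audited offline.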

Architecture

flowchart LR
    A[HAR / raw .http] --> B[Parsers]
    B -->|HTTPResponse[]| C[Scanner]
    C --> D{Analyzers}
    D --> E[Findings]
    E --> F[Scoring 0-100 + verdict]
    F --> G[Reporters: console / json / csv / sarif / html]

See docs/architecture.md for the full design, scoring model, and verdict bands.

Installation

# From source (recommended while iterating)
git clone https://github.com/srini-cybersec/websec-scanner.git
cd websec-scanner
python -m venv .venv && source .venv/bin/activate
pip install -e .

# Or bootstrap everything (venv + dev tools + tests) in one shot
./scripts/setup.sh

Requires Python 3.11+. Runtime dependencies are minimal: click, rich, jinja2, pyyaml.

Usage

Scan a HAR export

websec-scanner scan capture.har

╭───────────────────────────── WebSecScanner ─────────────────────────────╮
│ Source: capture.har                                                      │
│ Responses analysed: 2                                                    │
│ Total findings: 14                                                       │
│ Risk score: 100/100                                                      │
│ Verdict: CRITICAL                                                        │
╰──────────────────────────────────────────────────────────────────────────╯
Critical: 0  High: 4  Medium: 5  Low: 3  Info: 2
                                  Findings
┏━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Severity ┃ Rule          ┃ Title                   ┃ Location              ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━┩
│ High     │ WSS-CK-001    │ Cookie missing Secure   │ entry[0] …/v1/profile │
│ High     │ WSS-CORS-003  │ CORS reflects Origin    │ entry[0] …/v1/profile │
│ …        │ …             │ …                       │ …                     │
└──────────┴───────────────┴─────────────────────────┴───────────────────────┘

Scan a raw HTTP response

A raw capture is a status line, headers, a blank line, and then the body. Optional # @url: and # @method: annotations let a standalone response carry its request context:

# @url: https://shop.example.com/account
HTTP/2 200
content-type: text/html
set-cookie: SESSIONID=abc; Path=/
content-security-policy: default-src 'self'; script-src 'self' 'unsafe-inline' *

<!doctype html>...

websec-scanner scan response.http
curl -is https://example.com | websec-scanner scan /dev/stdin --url https://example.com
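
The layout of an annotated raw capture can be sketched as a small parser. This is an assumption-laden illustration, not the project's parser: `parse_raw_response` and the returned dictionary keys are hypothetical, and it assumes annotations appear as comment lines before the status line, as in the example above.

```python
SAMPLE = """\
# @url: https://shop.example.com/account
HTTP/2 200
content-type: text/html
set-cookie: SESSIONID=abc; Path=/

<!doctype html>...
"""


def parse_raw_response(text: str) -> dict:
    """Split an annotated raw HTTP response into its parts (illustrative only)."""
    lines = text.splitlines()
    annotations: dict[str, str] = {}
    i = 0

    # Leading "# @key: value" comment lines carry the request context.
    while i < len(lines) and lines[i].startswith("#"):
        comment = lines[i][1:].strip()
        if comment.startswith("@") and ":" in comment:
            key, value = comment[1:].split(":", 1)  # split once so URLs keep their colons
            annotations[key.strip().lower()] = value.strip()
        i += 1

    if i >= len(lines):
        raise ValueError("no status line found")

    # Status line, e.g. "HTTP/2 200" or "HTTP/1.1 404 Not Found".
    version, status, *_reason = lines[i].split(maxsplit=2)
    i += 1

    # Headers run until the first blank line; keep duplicates (e.g. set-cookie).
    headers: list[tuple[str, str]] = []
    while i < len(lines) and lines[i].strip():
        name, _sep, value = lines[i].partition(":")
        headers.append((name.strip().lower(), value.strip()))
        i += 1

    return {
        "url": annotations.get("url"),
        "method": annotations.get("method"),
        "version": version,
        "status": int(status),
        "headers": headers,
        "body": "\n".join(lines[i + 1:]),  # everything after the blank line
    }


parsed = parse_raw_response(SAMPLE)
print(parsed["url"], parsed["status"], len(parsed["headers"]))
```

Headers are kept as a list of pairs rather than a dict because a response may legitimately repeat a header such as set-cookie.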

Output formats

websec-scanner scan capture.har -f json   -o report.json
websec-scanner scan capture.har -f csv    -o findings.csv
websec-scanner scan capture.har -f html   -o report.html
websec-scanner scan capture.har -f sarif  -o results.sarif   # GitHub Code Scanning
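
A SARIF report can be post-processed with nothing but the standard library. The sketch below counts results per level using only fields defined by SARIF 2.1.0 (`runs`, `results`, `ruleId`, `level`). The sample document is hand-written for illustration; how WebSecScanner maps its severities onto SARIF levels is not stated here, so the levels shown are assumptions.

```python
import json
from collections import Counter


def count_by_level(sarif: dict) -> Counter:
    """Count SARIF results per level across all runs."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Results without an explicit level are counted as "warning",
            # the SARIF fallback when no rule configuration overrides it.
            counts[result.get("level", "warning")] += 1
    return counts


# Hand-written stand-in for results.sarif; levels are illustrative.
sample = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "WebSecScanner"}},
            "results": [
                {"ruleId": "WSS-CK-001", "level": "error",
                 "message": {"text": "Cookie missing Secure"}},
                {"ruleId": "WSS-CORS-003", "level": "error",
                 "message": {"text": "CORS reflects Origin"}},
                {"ruleId": "WSS-HDR-004",
                 "message": {"text": "Header finding without explicit level"}},
            ],
        }
    ],
}

print(json.dumps(count_by_level(sample)))
```

To run it against a real report, load the file with `json.load(open("results.sarif"))` and pass the result to `count_by_level`.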

CI gating

Fail the build when any finding at or above a given severity is present:

websec-scanner scan capture.har --fail-on high      # exit 1 on HIGH/CRITICAL
websec-scanner scan ./captures/ --min-severity medium --exclude-rule WSS-HDR-004

scan accepts a single file or a directory (every *.har, *.http, *.txt, *.raw, *.resp is parsed and analysed together).
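
The gating rule amounts to an ordered comparison of severities. The sketch below shows one way to express it; the `Severity` enum and `exit_code` helper are hypothetical names, and only the severity levels and the exit code of 1 come from this README.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels in ascending order, so they compare naturally."""
    INFO = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def exit_code(found: list[Severity], fail_on: Severity) -> int:
    """Return 1 when any finding is at or above the threshold, else 0."""
    return 1 if any(sev >= fail_on for sev in found) else 0


findings = [Severity.LOW, Severity.MEDIUM, Severity.HIGH]
print(exit_code(findings, Severity.HIGH))      # 1: a HIGH finding meets the threshold
print(exit_code(findings, Severity.CRITICAL))  # 0: nothing reaches CRITICAL
```

Using an integer-backed enum keeps the "at or above" test to a single comparison and avoids string matching on severity names.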

Programmatic API

from websec_scanner.core.scanner import Scanner
from websec_scanner.parsers import parse_path

scanner = Scanner()
result = scanner.scan(parse_path("capture.har"), source="capture.har")

print(result.risk_score, result.verdict.value)
for finding in result.findings:
    print(finding.severity.label, finding.rule_id, finding.title)

A runnable version lives in examples/demo.py.

Configuration

Precedence (highest first): CLI flags → environment (WEBSEC_*) → .websec-scanner.yml → defaults.

# .websec-scanner.yml
min_severity: INFO
fail_on: HIGH
redact: true
max_body_bytes: 2000000
excluded_rules:
  - WSS-HDR-004

export WEBSEC_FAIL_ON=critical
export WEBSEC_EXCLUDED_RULES="WSS-HDR-004,WSS-HDR-005"
export WEBSEC_REDACT=true
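
The precedence chain can be sketched with `collections.ChainMap`, where the first mapping that defines a key wins. This is an illustration of the layering idea, not the scanner's loader: `env_overrides`, `resolve`, and the values in `DEFAULTS` are hypothetical, the YAML file is represented by an already-parsed dict, and the real tool may normalize values (for example severity case) differently.

```python
from collections import ChainMap

# Placeholder defaults for the sketch; the tool's actual defaults may differ.
DEFAULTS = {
    "min_severity": "INFO",
    "fail_on": "HIGH",
    "redact": True,
    "max_body_bytes": 2_000_000,
    "excluded_rules": [],
}


def env_overrides(environ: dict[str, str]) -> dict:
    """Translate WEBSEC_* variables into config keys with basic type coercion."""
    out: dict = {}
    for key, raw in environ.items():
        if not key.startswith("WEBSEC_"):
            continue
        name = key[len("WEBSEC_"):].lower()
        if name == "excluded_rules":
            out[name] = [r.strip() for r in raw.split(",") if r.strip()]
        elif name == "redact":
            out[name] = raw.strip().lower() in {"1", "true", "yes"}
        elif name == "max_body_bytes":
            out[name] = int(raw)
        else:
            out[name] = raw
    return out


def resolve(cli: dict, environ: dict[str, str], file_cfg: dict) -> dict:
    """Merge layers, highest precedence first: CLI, environment, file, defaults."""
    return dict(ChainMap(cli, env_overrides(environ), file_cfg, DEFAULTS))


config = resolve(
    cli={"fail_on": "high"},  # only flags the user actually passed
    environ={"WEBSEC_FAIL_ON": "critical",
             "WEBSEC_EXCLUDED_RULES": "WSS-HDR-004,WSS-HDR-005"},
    file_cfg={"fail_on": "HIGH", "excluded_rules": ["WSS-HDR-004"]},
)
print(config["fail_on"], config["excluded_rules"])
```

Note that the CLI layer should contain only explicitly supplied flags; an unset flag stored as `None` would otherwise mask the lower layers.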

Docker

docker build -t websec-scanner .

# Mount the directory with your captures read-only at /data
docker run --rm -v "$PWD/examples:/data:ro" websec-scanner scan /data/sample.har -f sarif

The image is hardened with a multi-stage build, a non-root user (UID 10001), and no shell entrypoint. docker-compose.yml additionally sets read_only, cap_drop: ALL, and no-new-privileges.

Development

./scripts/setup.sh                 # venv + deps + tests
black src/ tests/ --line-length 88
ruff check src/ tests/
mypy src/ --ignore-missing-imports --no-strict-optional
bandit -r src/ -ll
pytest tests/ --cov=src/websec_scanner --cov-report=term-missing

122 tests, 94% coverage
black / ruff / mypy clean
bandit — 0 findings

Security considerations

Passive by design. WebSecScanner makes zero network connections and emits no telemetry. It only reads the files you give it.
Evidence redaction. Detected secrets and card numbers are redacted by default (AKIA…MPLE); disable with redact: false only when handling the output securely.
Body-size cap. Response bodies are truncated to max_body_bytes before analysis to bound memory on pathological captures.
Heuristic findings. Passive analysis cannot confirm exploitability the way an active test can (e.g. CORS reflection is inferred from a single captured Origin). Treat findings as high-quality leads, and validate before filing.
Captures may contain secrets. HAR files frequently embed cookies, tokens, and request bodies. Handle them like any sensitive artifact; don't commit them.
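
The redaction and body-cap behaviour described above can be sketched in a few lines. This is a guess at the shape, not the scanner's code: `redact` and `cap_body` are hypothetical names, and the "keep four characters at each end" rule is inferred from the AKIA…MPLE example rather than documented.

```python
def redact(secret: str, keep: int = 4) -> str:
    """Keep a short prefix and suffix, hide the middle (assumed format)."""
    if len(secret) <= keep * 2:
        return "…"  # too short to reveal any part safely
    return f"{secret[:keep]}…{secret[-keep:]}"


def cap_body(body: bytes, max_body_bytes: int = 2_000_000) -> bytes:
    """Truncate a response body so analysis memory stays bounded."""
    return body[:max_body_bytes]


# AKIAIOSFODNN7EXAMPLE is the placeholder access key id used in AWS documentation.
print(redact("AKIAIOSFODNN7EXAMPLE"))  # AKIA…MPLE
print(len(cap_body(b"x" * 10, max_body_bytes=4)))  # 4
```

Keeping a few leading characters preserves the key-type prefix, which helps triage without exposing the full secret.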

Contributing

Contributions are welcome! Please:

Fork and create a feature branch.
Add tests for new rules/behaviour (keep coverage ≥ 85%).
Ensure black, ruff, mypy, bandit, and pytest all pass.
Open a pull request describing the change and the threat it addresses.

New detection rules should include a stable WSS-<DOMAIN>-NNN id, a CWE/OWASP mapping, and remediation text.

License

Disclaimer. WebSecScanner is intended for defensive security, security education, and authorized assessment of systems and traffic you own or are permitted to analyze. You are responsible for how you use it.
