srini-cybersec/websec-scanner

WebSecScanner

Passive web application security analyzer for HAR files and raw HTTP captures — zero network calls, CI-ready, SARIF-native.


The problem

Most web application security issues are not exotic — they're misconfigurations sitting in plain sight in every HTTP response: a missing Content-Security-Policy, a session cookie without HttpOnly, a CORS endpoint that reflects the request origin with credentials, a stack trace leaking the framework version, a form that posts over http://.

Active scanners (the kind that fire payloads at a live target) are powerful but heavy: they need authorization, they generate traffic, they can disrupt production, and you cannot run them in an air-gapped CI job.

WebSecScanner takes the opposite approach. It is passive: you capture traffic you already have access to — a HAR export from browser DevTools, a proxy log, or plain curl -i output — and the scanner audits it offline. No packets are sent. Nothing leaves your machine. That makes it safe to drop into any CI pipeline and run on every pull request.

What it checks

| Analyzer | Rules | Examples of what it catches |
| --- | --- | --- |
| Security headers | WSS-HDR-001..007 | Missing X-Content-Type-Options, clickjacking exposure, weak Referrer-Policy, missing/report-only CSP, legacy XSS auditor, missing COOP/Permissions-Policy |
| Content-Security-Policy | WSS-CSP-001..007 | unsafe-inline / unsafe-eval, wildcard * sources, plaintext http: sources, missing object-src 'none' / base-uri / default-src |
| Cookies | WSS-CK-001..005 | Missing Secure / HttpOnly / SameSite, SameSite=None without Secure, __Host-/__Secure- prefix violations |
| CORS | WSS-CORS-001..004 | Wildcard origin with credentials, trusted null origin, reflected-origin-with-credentials, wildcard allowed-headers |
| Information disclosure | WSS-INFO-001..005 | Version banners, stack traces / SQL errors, directory listings, leaked secrets (AWS/GCP/GitHub/Slack/Stripe/JWT/PEM), Luhn-valid card numbers |
| Transport | WSS-TRANS-001..005 | Cleartext HTTP, missing/weak HSTS, active mixed content, insecure form actions |

Every finding carries a severity, CWE, OWASP mapping, evidence (redacted where sensitive), and concrete remediation guidance.

$ websec-scanner rules        # full rule catalogue, machine- and human-readable
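
To make the cookie rules concrete, here is a minimal sketch of a passive check on a single Set-Cookie header value. The function name `missing_cookie_attributes` and its return shape are illustrative assumptions, not the scanner's actual implementation, and the sketch ignores the prefix and SameSite=None rules.

```python
def missing_cookie_attributes(set_cookie: str) -> list[str]:
    """Return the hardening attributes absent from one Set-Cookie header value."""
    # Everything after the first ';' is an attribute; names are case-insensitive.
    parts = [p.strip() for p in set_cookie.split(";")[1:]]
    present = {p.split("=", 1)[0].strip().lower() for p in parts if p}
    return [name for name in ("Secure", "HttpOnly", "SameSite")
            if name.lower() not in present]


print(missing_cookie_attributes("SESSIONID=abc; Path=/"))
# ['Secure', 'HttpOnly', 'SameSite']
print(missing_cookie_attributes("id=1; Secure; HttpOnly; SameSite=Lax"))
# []
```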

Architecture

flowchart LR
    A[HAR / raw .http] --> B[Parsers]
    B -->|HTTPResponse[]| C[Scanner]
    C --> D{Analyzers}
    D --> E[Findings]
    E --> F[Scoring 0-100 + verdict]
    F --> G[Reporters: console / json / csv / sarif / html]

See docs/architecture.md for the full design, scoring model, and verdict bands.
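
The flow in the diagram can be sketched in a few lines. Everything here is an assumption made for illustration: the `Finding` dataclass, the toy `analyze_headers` analyzer, the placeholder rule ids, and especially the severity weights. The real scoring model and verdict bands are the ones documented in docs/architecture.md.

```python
from dataclasses import dataclass

# Hypothetical weights; the real scoring model lives in docs/architecture.md.
WEIGHTS = {"critical": 40, "high": 20, "medium": 10, "low": 3, "info": 0}


@dataclass(frozen=True)
class Finding:
    rule_id: str
    severity: str


def analyze_headers(headers: dict[str, str]) -> list[Finding]:
    """Toy analyzer: flag two missing headers (the rule ids are placeholders)."""
    names = {k.lower() for k in headers}
    findings = []
    if "content-security-policy" not in names:
        findings.append(Finding("EXAMPLE-001", "medium"))
    if "x-content-type-options" not in names:
        findings.append(Finding("EXAMPLE-002", "low"))
    return findings


def risk_score(findings: list[Finding]) -> int:
    """Sum the per-severity weights and cap the total at 100."""
    return min(100, sum(WEIGHTS[f.severity] for f in findings))


# Two parsed responses, each reduced to its header mapping.
responses = [
    {"content-type": "text/html"},
    {"Content-Security-Policy": "default-src 'self'"},
]
findings = [f for r in responses for f in analyze_headers(r)]
print(len(findings), risk_score(findings))  # 3 16
```

Capping the sum is one simple way to keep a score inside 0-100 when many findings accumulate; other aggregation schemes are equally plausible.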

Installation

# From source (recommended while iterating)
git clone https://github.com/srini-cybersec/websec-scanner.git
cd websec-scanner
python -m venv .venv && source .venv/bin/activate
pip install -e .

# Or bootstrap everything (venv + dev tools + tests) in one shot
./scripts/setup.sh

Requires Python 3.11+. Runtime dependencies are minimal: click, rich, jinja2, pyyaml.

Usage

Scan a HAR export

websec-scanner scan capture.har
╭───────────────────────────── WebSecScanner ─────────────────────────────╮
│ Source: capture.har                                                      │
│ Responses analysed: 2                                                    │
│ Total findings: 14                                                       │
│ Risk score: 100/100                                                      │
│ Verdict: CRITICAL                                                        │
╰──────────────────────────────────────────────────────────────────────────╯
Critical: 0  High: 4  Medium: 5  Low: 3  Info: 2
                                  Findings
┏━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Severity ┃ Rule          ┃ Title                   ┃ Location              ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━┩
│ High     │ WSS-CK-001    │ Cookie missing Secure   │ entry[0] …/v1/profile │
│ High     │ WSS-CORS-003  │ CORS reflects Origin    │ entry[0] …/v1/profile │
│ …        │ …             │ …                       │ …                     │
└──────────┴───────────────┴─────────────────────────┴───────────────────────┘
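
For orientation, a HAR file is JSON with a `log.entries` array, and each entry holds a `request` and a `response` object (HAR 1.2). The sketch below walks that structure with the standard library only. The helper name `iter_har_responses` and the sample data are made up for illustration; the scanner's own parser may differ.

```python
import json


def iter_har_responses(har_text: str):
    """Yield (index, url, status, headers) for each entry in a HAR document."""
    har = json.loads(har_text)
    for index, entry in enumerate(har["log"]["entries"]):
        response = entry["response"]
        # HAR stores headers as a list of {"name": ..., "value": ...} objects,
        # so repeated headers such as Set-Cookie are preserved.
        headers = [(h["name"].lower(), h["value"]) for h in response["headers"]]
        yield index, entry["request"]["url"], response["status"], headers


# Made-up sample capture with a single entry.
sample = json.dumps({"log": {"entries": [{
    "request": {"method": "GET", "url": "https://example.com/account"},
    "response": {"status": 200, "headers": [
        {"name": "Set-Cookie", "value": "SESSIONID=abc; Path=/"},
        {"name": "Content-Type", "value": "application/json"},
    ]},
}]}})

for index, url, status, headers in iter_har_responses(sample):
    print(f"entry[{index}] {status} {url} ({len(headers)} headers)")
# entry[0] 200 https://example.com/account (2 headers)
```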

Scan a raw HTTP response

A raw capture is a status line + headers + body. Optional # @url: / # @method: annotations let a standalone response carry its context:

# @url: https://shop.example.com/account
HTTP/2 200
content-type: text/html
set-cookie: SESSIONID=abc; Path=/
content-security-policy: default-src 'self'; script-src 'self' 'unsafe-inline' *

<!doctype html>...

websec-scanner scan response.http
curl -is https://example.com | websec-scanner scan /dev/stdin --url https://example.com
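
The raw format described above is simple enough to parse by hand. The following is a rough sketch, assuming a single response per file and `# @key: value` annotations only at the top; `parse_raw_response` is a hypothetical helper, not the scanner's parser.

```python
def parse_raw_response(text: str) -> dict:
    """Split a raw capture into annotations, status, headers, and body."""
    lines = text.splitlines()
    annotations, i = {}, 0
    # Leading "# @key: value" lines carry context such as the URL or method.
    while i < len(lines) and lines[i].startswith("# @"):
        key, _, value = lines[i][3:].partition(":")
        annotations[key.strip()] = value.strip()
        i += 1
    # Status line, e.g. "HTTP/2 200" or "HTTP/1.1 404 Not Found".
    status = int(lines[i].split()[1])
    i += 1
    headers = []
    # Headers run until the first blank line; duplicates are kept in order.
    while i < len(lines) and lines[i].strip():
        name, _, value = lines[i].partition(":")
        headers.append((name.strip().lower(), value.strip()))
        i += 1
    body = "\n".join(lines[i + 1:])
    return {"annotations": annotations, "status": status,
            "headers": headers, "body": body}


raw = """# @url: https://shop.example.com/account
HTTP/2 200
content-type: text/html
set-cookie: SESSIONID=abc; Path=/

<!doctype html>"""

parsed = parse_raw_response(raw)
print(parsed["annotations"]["url"], parsed["status"], len(parsed["headers"]))
# https://shop.example.com/account 200 2
```

Output from `curl -i` that follows redirects contains several response blocks back to back; this sketch does not handle that case.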

Output formats

websec-scanner scan capture.har -f json   -o report.json
websec-scanner scan capture.har -f csv    -o findings.csv
websec-scanner scan capture.har -f html   -o report.html
websec-scanner scan capture.har -f sarif  -o results.sarif   # GitHub Code Scanning
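
For readers unfamiliar with SARIF, this is roughly the shape GitHub Code Scanning expects: a SARIF 2.1.0 log with one run, a tool driver, and a list of results. The sketch is an assumption-laden illustration. The severity-to-level mapping, the finding dictionary keys, and the function name `to_sarif` are not taken from the scanner.

```python
import json

# Assumed mapping from scanner severities to SARIF result levels.
SARIF_LEVEL = {"critical": "error", "high": "error", "medium": "warning",
               "low": "note", "info": "note"}


def to_sarif(findings: list[dict]) -> dict:
    """Wrap findings in a minimal SARIF 2.1.0 log with one run."""
    results = [{
        "ruleId": f["rule_id"],
        "level": SARIF_LEVEL[f["severity"]],
        "message": {"text": f["title"]},
        "locations": [{"physicalLocation": {
            "artifactLocation": {"uri": f["source"]}}}],
    } for f in findings]
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": "WebSecScanner"}},
                  "results": results}],
    }


doc = to_sarif([{"rule_id": "WSS-CK-001", "severity": "high",
                 "title": "Cookie missing Secure", "source": "capture.har"}])
print(json.dumps(doc, indent=2))
```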

CI gating

Fail the build when any finding at or above a given severity is present:

websec-scanner scan capture.har --fail-on high      # exit 1 on HIGH/CRITICAL
websec-scanner scan ./captures/ --min-severity medium --exclude-rule WSS-HDR-004

scan accepts a single file or a directory (every *.har, *.http, *.txt, *.raw, *.resp is parsed and analysed together).
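
The gating logic amounts to a threshold comparison over an ordered severity scale. A minimal sketch, assuming the five severities shown in the console summary and a hypothetical `exit_code` helper:

```python
# Severities ordered from least to most severe.
ORDER = ["info", "low", "medium", "high", "critical"]


def exit_code(severities: list[str], fail_on: str) -> int:
    """Return 1 if any finding is at or above the fail_on threshold, else 0."""
    threshold = ORDER.index(fail_on.lower())
    return int(any(ORDER.index(s.lower()) >= threshold for s in severities))


found = ["high", "medium", "low", "info"]
print(exit_code(found, "high"))      # 1
print(exit_code(found, "critical"))  # 0
```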

Programmatic API

from websec_scanner.core.scanner import Scanner
from websec_scanner.parsers import parse_path

scanner = Scanner()
result = scanner.scan(parse_path("capture.har"), source="capture.har")

print(result.risk_score, result.verdict.value)
for finding in result.findings:
    print(finding.severity.label, finding.rule_id, finding.title)

A runnable version lives in examples/demo.py.

Configuration

Precedence (highest first): CLI flags → environment (WEBSEC_*) → .websec-scanner.yml → defaults.

# .websec-scanner.yml
min_severity: INFO
fail_on: HIGH
redact: true
max_body_bytes: 2000000
excluded_rules:
  - WSS-HDR-004

export WEBSEC_FAIL_ON=critical
export WEBSEC_EXCLUDED_RULES="WSS-HDR-004,WSS-HDR-005"
export WEBSEC_REDACT=true
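
The precedence chain can be pictured as a layered dictionary merge in which later layers win. This sketch uses plain dicts instead of reading YAML, and the `DEFAULTS` values and the `resolve_config` helper are assumptions for illustration (the README does not list the real defaults).

```python
# Hypothetical defaults; the scanner's real defaults are not stated here.
DEFAULTS = {"min_severity": "INFO", "fail_on": None, "redact": True}


def resolve_config(file_cfg: dict, env: dict, cli: dict) -> dict:
    """Merge layers so that higher-priority sources override lower ones."""
    # Map WEBSEC_FAIL_ON -> fail_on, etc.; values stay as raw strings here.
    env_cfg = {k[len("WEBSEC_"):].lower(): v
               for k, v in env.items() if k.startswith("WEBSEC_")}
    # CLI options left unset (None) must not mask lower layers.
    cli_cfg = {k: v for k, v in cli.items() if v is not None}
    return {**DEFAULTS, **file_cfg, **env_cfg, **cli_cfg}


config = resolve_config(
    file_cfg={"fail_on": "HIGH"},
    env={"WEBSEC_FAIL_ON": "critical", "PATH": "/usr/bin"},
    cli={"fail_on": None, "min_severity": "medium"},
)
print(config)
# {'min_severity': 'medium', 'fail_on': 'critical', 'redact': True}
```

In this run the environment value beats the file value for `fail_on` because no CLI flag was given, while the CLI value wins for `min_severity`. In practice `env` would be `os.environ`, and string values such as "true" would still need type coercion.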

Docker

docker build -t websec-scanner .

# Mount the directory with your captures read-only at /data
docker run --rm -v "$PWD/examples:/data:ro" websec-scanner scan /data/sample.har -f sarif

The image is hardened with a multi-stage build, a non-root user (UID 10001), and no shell entrypoint; docker-compose.yml additionally sets read_only, cap_drop: ALL, and no-new-privileges.

Development

./scripts/setup.sh                 # venv + deps + tests
black src/ tests/ --line-length 88
ruff check src/ tests/
mypy src/ --ignore-missing-imports --no-strict-optional
bandit -r src/ -ll
pytest tests/ --cov=src/websec_scanner --cov-report=term-missing

  • 122 tests, 94% coverage
  • black / ruff / mypy clean
  • bandit: 0 findings

Security considerations

  • Passive by design. WebSecScanner makes zero network connections and emits no telemetry. It only reads the files you give it.
  • Evidence redaction. Detected secrets and card numbers are redacted by default (AKIA…MPLE); disable with redact: false only when handling the output securely.
  • Body-size cap. Response bodies are truncated to max_body_bytes before analysis to bound memory on pathological captures.
  • Heuristic findings. Passive analysis cannot confirm exploitability the way an active test can (e.g. CORS reflection is inferred from a single captured Origin). Treat findings as high-quality leads, and validate before filing.
  • Captures may contain secrets. HAR files frequently embed cookies, tokens, and request bodies. Handle them like any sensitive artifact; don't commit them.
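
As an illustration of the redaction format shown above (AKIA…MPLE), one plausible rule is to keep the first and last four characters and elide the middle. That rule is inferred from the single example, so treat `redact` and its `keep` parameter as assumptions rather than the scanner's documented behaviour.

```python
def redact(secret: str, keep: int = 4) -> str:
    """Keep the first and last `keep` characters and elide the middle."""
    if len(secret) <= 2 * keep:
        # Too short to reveal any part safely; mask the whole value.
        return "…"
    return f"{secret[:keep]}…{secret[-keep:]}"


# AWS's published example access key id, not a real credential.
print(redact("AKIAIOSFODNN7EXAMPLE"))  # AKIA…MPLE
```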

Contributing

Contributions are welcome! Please:

  1. Fork and create a feature branch.
  2. Add tests for new rules/behaviour (keep coverage ≥ 85%).
  3. Ensure black, ruff, mypy, bandit, and pytest all pass.
  4. Open a pull request describing the change and the threat it addresses.

New detection rules should include a stable WSS-<DOMAIN>-NNN id, a CWE/OWASP mapping, and remediation text.
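
One way to picture those requirements is as a small metadata record that validates itself. The `RuleMeta` class, the id pattern, and the WSS-DEMO-001 rule below are hypothetical; the project's real rule definitions may be structured differently.

```python
import re
from dataclasses import dataclass

# Assumed id shape: WSS-<DOMAIN>-NNN with an uppercase domain and three digits.
RULE_ID = re.compile(r"^WSS-[A-Z]+-\d{3}$")


@dataclass(frozen=True)
class RuleMeta:
    rule_id: str
    cwe: str
    owasp: str
    remediation: str

    def __post_init__(self) -> None:
        # Reject ids that do not follow the stable naming scheme.
        if not RULE_ID.match(self.rule_id):
            raise ValueError(f"bad rule id: {self.rule_id!r}")
        # Every rule must ship with its mappings and remediation text.
        if not (self.cwe and self.owasp and self.remediation):
            raise ValueError("cwe, owasp, and remediation are required")


# Illustrative rule only; the mappings are examples, not the project's own.
rule = RuleMeta(
    rule_id="WSS-DEMO-001",
    cwe="CWE-614",
    owasp="A05:2021",
    remediation="Set the Secure attribute on the cookie.",
)
print(rule.rule_id)  # WSS-DEMO-001
```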

License

MIT © 2026 srini-cybersec


Disclaimer. WebSecScanner is intended for defensive security, security education, and authorized assessment of systems and traffic you own or are permitted to analyze. You are responsible for how you use it.
