Security Architecture
Vaultex is designed with one mandate: sensitive financial data never reaches an external AI model in raw form. This page documents every control in the request lifecycle.
Every outbound prompt is passed to a local Microsoft Presidio Analyzer running the en_core_web_lg spaCy model. Presidio identifies 14+ entity types using a combination of Named Entity Recognition (NER) and rule-based pattern matching.
Detected entities include: PERSON, SSN (Social Security Number), EMAIL_ADDRESS, PHONE_NUMBER, ACCOUNT_NUMBER (ACC- prefix pattern), LOAN_ID (LOAN-YYYY-NNNNNN pattern), DATE_OF_BIRTH, CREDIT_CARD, IBAN_CODE, PASSPORT, DRIVER_LICENSE, IP_ADDRESS, URL, and MEDICAL_LICENSE.
The Presidio engine runs entirely on your infrastructure. No prompt text is ever sent to a third-party NER API.
Each detected PII span is replaced with a deterministic token: {{ENTITY_TYPE_N}}. Determinism is guaranteed within a session — the same person gets {{PERSON_1}} in every row of a CSV upload, enabling analytics that span multiple records.
The token ↔ original value mapping is stored in a Redis hash, scoped to the session UUID and Fernet-encrypted at rest. The Redis instance runs in your infrastructure — never in a shared cloud.
Financial values (balances, credit scores, interest rates, risk flags, DPD counts) are explicitly preserved. They are never added to the tokenization list regardless of their proximity to PII.
The sanitized prompt (with all PII replaced by tokens) is forwarded to your chosen LLM provider: Anthropic Claude, OpenAI GPT, or a local Ollama model. Provider API keys are stored in Redis as Fernet-encrypted blobs — never in plaintext.
The gateway supports per-request model selection via the X-Provider and X-Model headers. The system prompt prepended to every call instructs the LLM to treat tokens as stable identifiers and never request clarification about their meaning.
The LLM response is passed back through the detokenization layer. Before any original value is substituted back in, the gateway checks the user's JWT role claim against the RBAC entity permission table.
Junior Analyst — no detokenization (sees all tokens).
Senior Analyst — PERSON and EMAIL are restored.
VP Risk — all personal entities are restored.
Admin — full PII restored + access to audit console.
This means role-level data access is enforced at the network layer, not the application layer. It cannot be bypassed by modifying client code.
Every request is logged with: timestamp, user ID, JWT role, correlation ID, provider + model used, number of PII entities detected by type, and request/response latency. No raw PII is written to the log.
Logs are append-only. There is no API endpoint that allows a log record to be modified or deleted. Log retention defaults to 30 days on Starter and 90 days on Professional. Enterprise plans configure custom retention.
The admin dashboard exposes log search, export (CSV), and regulator-ready summary reports.
Vaultex ships as a single Docker image. All components (FastAPI gateway, Presidio NER engine, Redis) run in your network. There is no telemetry, no usage reporting, and no callback to Vaultex infrastructure in the Starter tier.
Professional and Enterprise plans connect to the Vaultex licensing API for seat validation only — no prompt data, no customer data, no PII.
The full source code is available under MIT license. Your security team can audit every line.
GLBA (Gramm–Leach–Bliley Act): The Safeguards Rule requires financial institutions to implement administrative, technical, and physical safeguards for customer information. Vaultex's tokenization layer ensures that customer NPI (Nonpublic Personal Information) is not transmitted in plaintext to third-party services.
GDPR (General Data Protection Regulation): Tokenization reduces the personal-data footprint of AI processing by ensuring that data sent to a third-party LLM provider does not constitute "personal data" under GDPR Article 4(1) when the LLM cannot reasonably re-identify the data subject from tokens alone.
Disclaimer: Final compliance determinations rest with your legal team, DPO, and applicable regulators. Vaultex provides technical controls — not legal compliance guarantees.