Description
Is your feature request related to a problem? Please describe.
New Presidio users often don’t know where to start when choosing between spaCy, transformers, or LLM-augmented pipelines. This slows adoption and makes customization harder than necessary.
Describe the solution you'd like
Introduce three high-level, opinionated starter modes that users can immediately run and compare:
- Fast (spaCy) – Minimal, lightweight, low-latency configuration.
- Balanced (Transformers) – Uses a transformer NER model for improved accuracy with moderate latency.
- Accurate (Transformers + LLM) – Hybrid pipeline prioritizing accuracy/recall, optionally calling an LLM for difficult cases.
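
To make the three modes concrete, here is a minimal sketch of what the copy/paste configs could look like. The mode names come from this proposal; the model names (`en_core_web_sm`, `dslim/bert-base-NER`) are only illustrative choices, and `use_llm_fallback` is a hypothetical flag, not an existing Presidio option. The `nlp_configuration` dicts follow the shape that Presidio's `NlpEngineProvider` accepts, as far as I understand it.

```python
from copy import deepcopy

# Transformer-based NLP config shared by the "balanced" and "accurate" modes.
# Model names are illustrative assumptions, not recommendations.
_TRANSFORMERS_NLP = {
    "nlp_engine_name": "transformers",
    "models": [
        {
            "lang_code": "en",
            "model_name": {
                "spacy": "en_core_web_sm",
                "transformers": "dslim/bert-base-NER",
            },
        }
    ],
}

STARTER_MODES = {
    "fast": {
        "nlp_configuration": {
            "nlp_engine_name": "spacy",
            "models": [{"lang_code": "en", "model_name": "en_core_web_sm"}],
        },
        # Hypothetical flag: not an existing Presidio setting.
        "use_llm_fallback": False,
    },
    "balanced": {
        "nlp_configuration": _TRANSFORMERS_NLP,
        "use_llm_fallback": False,
    },
    "accurate": {
        "nlp_configuration": _TRANSFORMERS_NLP,
        # Hypothetical: the hybrid mode would optionally call an LLM
        # for difficult cases; how that is wired up is left open here.
        "use_llm_fallback": True,
    },
}


def starter_config(mode: str) -> dict:
    """Return an independent copy of the config for one starter mode."""
    if mode not in STARTER_MODES:
        known = ", ".join(sorted(STARTER_MODES))
        raise ValueError(f"Unknown mode {mode!r}; expected one of: {known}")
    return deepcopy(STARTER_MODES[mode])
```

A user would then pass `starter_config("fast")["nlp_configuration"]` to `NlpEngineProvider(nlp_configuration=...)`, call `create_engine()`, and hand the result to `AnalyzerEngine(nlp_engine=..., supported_languages=["en"])`. Switching modes would be a one-word change.
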
The proposal includes:
- A small config for each mode that users can copy and paste.
- A single notebook demonstrating all three modes on the same dataset.
- Brief notes on expected performance/latency trade-offs.
This gives users a simple, structured starting point before diving into domain-specific customization.
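
The comparison notebook could be built around a small harness like the sketch below. It runs every mode over the same texts and records mean latency and the number of detected entities. The function name `compare_modes` and the report keys are hypothetical; each analyzer is assumed to be any callable that takes a text and returns a sequence of results.

```python
import time
from statistics import mean
from typing import Callable, Dict, List, Sequence


def compare_modes(
    analyzers: Dict[str, Callable[[str], Sequence]],
    texts: List[str],
) -> Dict[str, dict]:
    """Run each analyzer over the same texts and summarize the results."""
    report = {}
    for name, analyze in analyzers.items():
        latencies = []
        entities_found = 0
        for text in texts:
            start = time.perf_counter()
            results = analyze(text)
            latencies.append(time.perf_counter() - start)
            entities_found += len(results)
        report[name] = {
            # Mean wall-clock seconds per text (0.0 if no texts were given).
            "mean_latency_s": mean(latencies) if latencies else 0.0,
            "entities_found": entities_found,
        }
    return report
```

With Presidio, each callable could be a thin wrapper such as `lambda text: engine.analyze(text=text, language="en")`, with one `AnalyzerEngine` per mode. Entity counts alone do not measure accuracy, so the notebook would still need a labeled dataset to report precision and recall.
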
Describe alternatives you've considered
- Expanding documentation only: too abstract, and still leaves users without clear starting configurations.
- Providing fully domain-specific recipes first: higher effort, and gives users no generic baseline they can adopt anywhere.
Additional context
This is the first incremental step in building a broader “recipes” section for Presidio. The three modes establish a foundation on top of which domain-specific examples can later be added.