Recipes: add three starter modes (fast / balanced / accurate) #1809

@RonShakutai

Is your feature request related to a problem? Please describe.
New Presidio users often don’t know where to start when choosing between spaCy, transformers, or LLM-augmented pipelines. This slows adoption and makes customization harder than necessary.

Describe the solution you'd like
Introduce three high-level, opinionated starter modes that users can immediately run and compare:

  • Fast (spaCy) – Minimal, lightweight, low-latency configuration.
  • Balanced (Transformers) – Uses a transformer NER model for improved accuracy with moderate latency.
  • Accurate (Transformers + LLM) – Hybrid pipeline prioritizing accuracy/recall, optionally calling an LLM for difficult cases.
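To make the three modes concrete, here is a minimal sketch of what the copy/paste configs might look like as Python dicts. It assumes the configuration shape accepted by Presidio's `NlpEngineProvider`. The model names (`en_core_web_sm`, `dslim/bert-base-NER`) are illustrative placeholders, and `StarterMode`, `STARTER_MODES`, `build_analyzer`, and `llm_fallback_threshold` are hypothetical names invented for this example, not existing Presidio settings.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class StarterMode:
    """One starter mode: an NLP engine config plus an optional LLM fallback."""

    nlp_configuration: Dict[str, Any]
    # Hypothetical knob, not a Presidio setting: results scoring below this
    # threshold would be passed to an LLM for a second opinion.
    llm_fallback_threshold: Optional[float] = None


# Model names are placeholders chosen for illustration; swap in your own.
STARTER_MODES: Dict[str, StarterMode] = {
    "fast": StarterMode(
        nlp_configuration={
            "nlp_engine_name": "spacy",
            "models": [{"lang_code": "en", "model_name": "en_core_web_sm"}],
        }
    ),
    "balanced": StarterMode(
        nlp_configuration={
            "nlp_engine_name": "transformers",
            "models": [
                {
                    "lang_code": "en",
                    "model_name": {
                        "spacy": "en_core_web_sm",
                        "transformers": "dslim/bert-base-NER",
                    },
                }
            ],
        }
    ),
}

# "accurate" reuses the transformer config and adds the assumed LLM fallback.
STARTER_MODES["accurate"] = StarterMode(
    nlp_configuration=STARTER_MODES["balanced"].nlp_configuration,
    llm_fallback_threshold=0.6,
)


def build_analyzer(mode: str):
    """Create an AnalyzerEngine for a mode (requires presidio-analyzer)."""
    # Imported lazily so the configs above can be inspected without Presidio.
    from presidio_analyzer import AnalyzerEngine
    from presidio_analyzer.nlp_engine import NlpEngineProvider

    config = STARTER_MODES[mode].nlp_configuration
    nlp_engine = NlpEngineProvider(nlp_configuration=config).create_engine()
    return AnalyzerEngine(nlp_engine=nlp_engine, supported_languages=["en"])
```

Keeping the LLM step as a separate field rather than part of the NLP config reflects that it sits outside the NER engine in this sketch. How the fallback is actually wired in is left open by the proposal.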

Each mode includes:

  • A small config users can copy/paste.
  • A single notebook demonstrating all three modes on the same dataset.
  • Brief notes on expected performance/latency trade-offs.

This gives users a simple, structured starting point before diving into domain-specific customization.
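The shared notebook could centre on a small comparison harness like the sketch below, which runs every mode over the same texts and records median latency and the number of entities returned. The function name `compare_modes` and the report fields are assumptions made for illustration. Each analyzer is modelled as a plain callable so the harness runs without any models installed.

```python
import statistics
import time
from typing import Callable, Dict, List, Sequence


def compare_modes(
    analyzers: Dict[str, Callable[[str], Sequence[object]]],
    texts: Sequence[str],
) -> List[dict]:
    """Run every mode over the same non-empty list of texts.

    Returns one row per mode with its median per-text latency in
    milliseconds and the total number of entities it returned.
    """
    report = []
    for name, analyze in analyzers.items():
        latencies_ms = []
        entity_count = 0
        for text in texts:
            start = time.perf_counter()
            results = analyze(text)
            latencies_ms.append((time.perf_counter() - start) * 1000)
            entity_count += len(results)
        report.append(
            {
                "mode": name,
                "median_latency_ms": statistics.median(latencies_ms),
                "entities_found": entity_count,
            }
        )
    return report
```

With Presidio installed, each callable could wrap an engine, for example `lambda text: analyzer.analyze(text=text, language="en")`. Entity counts only show how the modes differ. Measuring accuracy or recall would need a labeled dataset, which this sketch does not include.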

Describe alternatives you've considered

  • Expanding documentation only: too abstract, and it still leaves users without clear starting configurations.
  • Providing fully domain-specific recipes first: higher effort, and it would not give users a generic baseline they can adopt anywhere.

Additional context
This is the first incremental step in building a broader “recipes” section for Presidio. The three modes establish a foundation on top of which domain-specific examples can later be added.
