GDPR-compliant AI with zero context leakage
We pseudonymize before any model sees input—so people stay private and data stays useful.
G-SHARP™ – Protect the person. Preserve the data.
European patent application filed (EPO 25206504.0)
How it works (high level)
G-SHARP™ transforms raw text into placeholders step by step, then reassembles a fully pseudonymized version. Context is stripped before any AI step, so models only ever receive de-contextualized word lists, never the original text.
Known inputs (e.g., names, IDs) are replaced first. Pattern-matched entities (addresses, dates, IDs) follow. Typos are handled by fuzzy matching.
We then split the text into tokens, drop frequent words to strip context, and shuffle what remains. The AI only detects GDPR-relevant tokens (first names, last names, etc.) in that randomized list.
Finally, an iterative check step validates and fixes edge cases, and the text is reconstructed with placeholders.
Step 1 — Detect Language
The system quickly checks likely languages to configure subsequent steps.
Die schulische Leistungsentwicklung der Schülerin Suzanne Fischer (SchulID 5W1773, Schwester von Jonas Fisher aus der 10B), geboren am 3. Juli 2007 und wohnhaft in der Göthestraße 12, 79100 Freiburg …
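How the language check works is not described here. As a rough illustration only, one cheap approach is to count how many words of the text appear in small per-language stopword lists. The `STOPWORDS` table and the `detect_language` function below are hypothetical names invented for this sketch, not part of G-SHARP™.

```python
import re

# Tiny illustrative stopword sets (assumption); a real system would use far larger lists.
STOPWORDS = {
    "de": {"der", "die", "das", "und", "von", "in", "am", "aus"},
    "en": {"the", "and", "of", "in", "on", "from", "is", "to"},
    "fr": {"le", "la", "les", "et", "de", "en", "du", "des"},
}


def detect_language(text: str) -> str:
    """Return the language code whose stopwords occur most often in `text`.

    Ties (including "no hits at all") resolve to the first language in STOPWORDS.
    """
    words = re.findall(r"\w+", text.lower())
    scores = {
        lang: sum(word in stopwords for word in words)
        for lang, stopwords in STOPWORDS.items()
    }
    return max(scores, key=scores.get)


sample = "Die schulische Leistungsentwicklung der Schülerin, geboren am 3. Juli 2007"
language = detect_language(sample)
```

The detected code could then select language-specific resources for later steps, such as month names for date patterns or the frequent-word list.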
Step 2 — Replace Known Inputs
All provided known inputs are replaced with placeholders (e.g., “Suzanne” → [Vorname:1]).
Die schulische Leistungsentwicklung der Schülerin Suzanne [Vorname:1] Fischer [Nachname:1] (SchulID 5W1773, Schwester von Jonas Fisher aus der 10B), geboren am 3. Juli 2007 und wohnhaft in der Göthestraße 12, 79100 Freiburg …
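A minimal sketch of this replacement, assuming the known inputs arrive as a simple mapping from literal value to placeholder. The function name `replace_known_inputs` and the mapping format are assumptions made for illustration.

```python
import re


def replace_known_inputs(text: str, known: dict) -> str:
    """Replace every whole-word occurrence of a known value with its placeholder.

    `known` maps literal strings to placeholders, e.g. {"Suzanne": "[Vorname:1]"}.
    """
    if not known:
        return text
    # Longest values first, so a longer entry wins over a shorter one it contains.
    values = sorted(known, key=len, reverse=True)
    # (?<!\w) and (?!\w) keep "Fischer" from matching inside a longer word.
    pattern = re.compile(
        "|".join(r"(?<!\w)" + re.escape(value) + r"(?!\w)" for value in values)
    )
    # One pass over the text, so inserted placeholders are never re-scanned.
    return pattern.sub(lambda match: known[match.group(0)], text)


known_inputs = {"Suzanne": "[Vorname:1]", "Fischer": "[Nachname:1]"}
result = replace_known_inputs(
    "Schülerin Suzanne Fischer, Schwester von Jonas Fisher", known_inputs
)
```

Note that the misspelled “Fisher” is deliberately left alone here, because exact matching cannot catch it. That case is picked up by the fuzzy match step.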
Step 3 — Pattern-matched Entities
Addresses, dates, and IDs matched by regex/patterns become placeholders (e.g., “Göthestraße 12” → [Straße:0], “3. Juli 2007” → [Datum:0]).
… geboren am 3. Juli 2007 [Datum:0] und wohnhaft in der Göthestraße 12 [Straße:0], 79100 [Postleitzahl:0] Freiburg …
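The actual patterns are not published here. The sketch below shows the general idea with three deliberately simple, hypothetical regular expressions for German dates, street addresses, and five-digit postal codes. Every match gets the next free index for its category, starting at 0 as in the example above.

```python
import re

MONTHS = (
    "Januar|Februar|März|April|Mai|Juni|Juli|August|"
    "September|Oktober|November|Dezember"
)

# Illustrative patterns only (assumption); real-world coverage needs many more cases.
PATTERNS = [
    ("Datum", re.compile(r"\b\d{1,2}\.\s?(?:" + MONTHS + r")\s\d{4}\b")),
    ("Straße", re.compile(r"\b\w+(?:straße|weg|platz)\s\d+[a-z]?\b")),
    ("Postleitzahl", re.compile(r"\b\d{5}\b")),
]


def replace_patterns(text: str) -> str:
    """Replace each pattern match with a numbered placeholder like [Datum:0]."""
    counters = {}
    for label, pattern in PATTERNS:

        def substitute(match, label=label):
            index = counters.get(label, 0)
            counters[label] = index + 1
            return f"[{label}:{index}]"

        text = pattern.sub(substitute, text)
    return text


result = replace_patterns(
    "geboren am 3. Juli 2007 und wohnhaft in der Göthestraße 12, 79100 Freiburg"
)
```

Dates and streets are replaced before postal codes so that a bare five-digit pattern cannot fire inside a longer match. A simplification in this sketch: a value that occurs twice receives two different indices.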
Step 4 — Fuzzy Match
Detect likely typos (e.g., “Fisher” ≈ “Fischer”) using Levenshtein distance and normalize them to the known inputs.
… Schwester von Jonas Fisher [Nachname:1] aus der …
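To make the typo handling concrete, here is a small sketch of Levenshtein-based matching. The edit-distance function is the standard dynamic-programming algorithm. The `fuzzy_placeholder` helper and its `max_distance=1` threshold are assumptions for illustration, since the actual threshold is not stated.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + (char_a != char_b),  # substitute or keep
                )
            )
        previous = current
    return previous[-1]


def fuzzy_placeholder(token: str, known: dict, max_distance: int = 1):
    """Return the placeholder of the closest known value, or None if none is close."""
    best = None
    for value, placeholder in known.items():
        distance = levenshtein(token.lower(), value.lower())
        if distance <= max_distance and (best is None or distance < best[0]):
            best = (distance, placeholder)
    return best[1] if best else None


known_inputs = {"Suzanne": "[Vorname:1]", "Fischer": "[Nachname:1]"}
match = fuzzy_placeholder("Fisher", known_inputs)
```

“Fisher” is one insertion away from “Fischer”, so it maps to the same placeholder. A tight threshold matters in practice: set too loosely, short ordinary words start matching short names.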
Step 5 — Tokenization
Break up the text into a list of individual word tokens.
Die schulische Leistungsentwicklung der Schülerin [Vorname:1] [Nachname:1] (SchulID 5W1773, Schwester von Jonas [Nachname:1] aus der 10B), geboren am [Datum:0] und wohnhaft in der [Straße:0], [Postleitzahl:0] Freiburg …
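A tokenizer for this step has to treat an existing placeholder such as [Vorname:1] as a single token while still separating words from punctuation. One possible sketch, with a regular expression chosen for this illustration rather than taken from the product:

```python
import re

# Order matters: try a whole placeholder first, then a word, then one punctuation mark.
TOKEN_RE = re.compile(r"\[[^\[\]\s]+:\d+\]|\w+|[^\w\s]")


def tokenize(text: str) -> list:
    """Split text into placeholder, word, and punctuation tokens."""
    return TOKEN_RE.findall(text)


tokens = tokenize("Schülerin [Vorname:1] [Nachname:1] (SchulID 5W1773), geboren am [Datum:0]")
```

Whitespace is discarded in this sketch, so the final reassembly step has to decide where spaces go again.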
Step 6 — Remove Frequent Words
Common function words and punctuation are removed since they are not PII but do provide context.
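Because the removed tokens are re-inserted later, each token needs to remember where it came from. The sketch below keeps (position, token) pairs in two lists. The `FREQUENT` set is a tiny hypothetical sample, and setting existing placeholders aside together with punctuation is inferred from the later re-insertion step.

```python
# Hypothetical sample of frequent German function words (assumption).
FREQUENT = {"die", "der", "das", "und", "von", "in", "am", "aus"}


def split_frequent(tokens: list):
    """Separate tokens into those sent onward and those set aside.

    Both lists hold (position, token) pairs so the original order can be rebuilt.
    """
    kept, removed = [], []
    for position, token in enumerate(tokens):
        is_placeholder = token.startswith("[") and token.endswith("]")
        is_punctuation = not any(char.isalnum() for char in token)
        if is_placeholder or is_punctuation or token.lower() in FREQUENT:
            removed.append((position, token))
        else:
            kept.append((position, token))
    return kept, removed


kept, removed = split_frequent(
    ["Die", "Schülerin", "[Vorname:1]", ",", "Schwester", "von", "Jonas"]
)
```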
Step 7 — Insert Decoy PII Tokens
Add realistic decoys to the list as an active protection layer before AI labeling.
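As a sketch of the decoy idea, the function below appends randomly chosen decoys to the kept tokens and marks them with a position of `None`, so they can be recognized and dropped later. The `DECOY_POOL` contents and the `None` marker are assumptions for this illustration. How real decoys are generated is not described here.

```python
import random

# Hypothetical decoy pool (assumption); real decoys would come from much larger lists.
DECOY_POOL = ["Anna", "Lukas", "Meier", "Schmidt", "Keller", "7C"]


def add_decoys(kept: list, count: int, rng: random.Random) -> list:
    """Append up to `count` decoys as (None, token) pairs.

    None means "no original position", which identifies a decoy later on.
    """
    present = {token for _, token in kept}
    candidates = [decoy for decoy in DECOY_POOL if decoy not in present]
    chosen = rng.sample(candidates, min(count, len(candidates)))
    return kept + [(None, token) for token in chosen]


with_decoys = add_decoys([(1, "Schülerin"), (6, "Jonas")], 2, random.Random(0))
```

Passing the random generator in explicitly keeps the sketch testable. A production system would more likely use a non-reproducible source of randomness.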
Step 8 — Shuffle Remaining and Decoy Tokens
Shuffle the remaining and decoy tokens to remove residual context before AI processing.
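One way to picture the shuffle: the (position, token) pairs are shuffled locally, and only the bare words, without positions, form the list handed to the model. The function name and the pair format are assumptions carried over for illustration.

```python
import random


def shuffle_for_model(entries: list, rng: random.Random):
    """Return (shuffled entries, word list for the model).

    The entries keep their original positions and stay local.
    The word list carries no positional information at all.
    """
    shuffled = list(entries)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    words = [token for _, token in shuffled]
    return shuffled, words


entries = [(1, "Schülerin"), (4, "Schwester"), (6, "Jonas"), (None, "Anna")]
shuffled, words = shuffle_for_model(entries, random.Random(42))
```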
Step 9 — Iterative AI on De-contextualized Tokens
The AI receives only the shuffled list and converts all GDPR-relevant information to placeholders (e.g., “Jonas” → [Vorname:4], “10B” → [Identifikation:2]).
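The model call itself is outside the scope of this page. The sketch below only illustrates the bookkeeping around it: a hypothetical `classify` callable stands in for the model and returns a category name or `None` for each word, and placeholders continue numbering from the indices already in use. Both the callable and the counter format are assumptions.

```python
def apply_labels(words: list, classify, counters: dict = None) -> list:
    """Replace each word that `classify` flags with a numbered placeholder.

    `classify(word)` is a stand-in for the model: it returns a category or None.
    `counters` holds the highest index already used per category.
    """
    counters = dict(counters or {})  # work on a copy
    assigned = {}
    labeled = []
    for word in words:
        category = classify(word)
        if category is None:
            labeled.append(word)
            continue
        if word not in assigned:  # the same word always gets the same placeholder
            counters[category] = counters.get(category, 0) + 1
            assigned[word] = f"[{category}:{counters[category]}]"
        labeled.append(assigned[word])
    return labeled


# Fake "model" for demonstration purposes only.
fake_model = {"Jonas": "Vorname", "10B": "Identifikation"}.get
labeled = apply_labels(
    ["Schwester", "10B", "Jonas", "Leistungsentwicklung"],
    fake_model,
    {"Vorname": 3, "Identifikation": 1},
)
```

In this demonstration the fake model labels “10B” as an identifier, which is exactly the kind of mistake the subsequent check step is meant to correct.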
Step 10 — Iterative Checks
From minimal context, the AI corrects placeholders (e.g., 10B is a class, not an identifier) and fixes misses (e.g., catching “Zürcher” as a last name).
… aus der [Identifikation:2] [Klasse:1] ). …
Step 11 — Remove Decoy PII Word Tokens
We’re starting to piece things back together. The decoy PII word tokens that were added earlier have served their purpose of hiding context from AI. They are removed.
Step 12 — Restore Original Order
Restore the word tokens to their original order.
Step 13 — Re-insert Previously Removed Word Tokens
All words, placeholders, and punctuation that were previously removed are re-inserted, yielding the original list of word tokens, now fully pseudonymized.
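The three reconstruction steps above (drop decoys, restore order, re-insert removed tokens) can be sketched in a few lines, assuming every token was stored as a (position, token) pair and decoys carry `None` as their position. That bookkeeping format is an assumption of this sketch.

```python
def rebuild_tokens(labeled: list, removed: list) -> list:
    """Rebuild the full token list in original order.

    labeled: (position, token) pairs returned from the AI step; decoys have position None.
    removed: (position, token) pairs set aside before the AI step.
    """
    # Drop decoys: they never had a position in the original text.
    real = [(pos, tok) for pos, tok in labeled if pos is not None]
    # Sorting by position restores the order and interleaves the removed tokens.
    merged = sorted(real + removed, key=lambda entry: entry[0])
    return [tok for _, tok in merged]


labeled = [(None, "[Vorname:9]"), (6, "[Vorname:4]"), (1, "Schülerin"), (4, "Schwester")]
removed = [(0, "Die"), (2, "[Vorname:1]"), (3, ","), (5, "von")]
tokens = rebuild_tokens(labeled, removed)
```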
Step 14 — Reassemble Pseudonymized Text
Concatenate the tokens back into text in which placeholders stand in for all personal data, ready for compliant AI processing and later re-identification inside the tenant. Presto! Done!
The pseudonymized text can be used as GDPR-compliant AI input.
Die schulische Leistungsentwicklung der Schülerin [Vorname:1] [Nachname:1] (SchulID [Identifikation:1], Schwester von [Vorname:4] [Nachname:1] aus der [Klasse:1]), geboren am [Datum:0] und wohnhaft in der [Straße:0], [Postleitzahl:0] Freiburg …
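For the final concatenation, a simple heuristic is to join tokens with spaces except before closing punctuation and after an opening parenthesis. This is a sketch under that assumption. Restoring the original whitespace exactly would require storing it during tokenization.

```python
NO_SPACE_BEFORE = {",", ".", ")", ";", ":", "!", "?"}
NO_SPACE_AFTER = {"("}


def detokenize(tokens: list) -> str:
    """Join tokens into text with conventional spacing around punctuation."""
    text = ""
    for token in tokens:
        needs_space = (
            text != ""
            and token not in NO_SPACE_BEFORE
            and text[-1] not in NO_SPACE_AFTER
        )
        if needs_space:
            text += " "
        text += token
    return text


text = detokenize(
    ["(", "SchulID", "[Identifikation:1]", ",", "Schwester", "von",
     "[Vorname:4]", ")", ",", "geboren"]
)
```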