refactor: include province and annotation pipeline

2025-07-24 12:50:30 +02:00
parent da7b09dab3
commit e2536c1899
18 changed files with 402 additions and 355 deletions
+15 -22
@@ -1,31 +1,24 @@
 ## Instructions:
-You are analyzing Congolese full names. For each input, return:
+Identify the identified_name (native Congolese part) and identified_surname (non-native, French or English part) from the provided full name.
+Return null if a part cannot be identified. Do not alter the original name or add any additional information.
-- "identified_name": the native name part of the full name
-- "identified_surname": the French or English, usually last part of the full name (can also be composed of multiple words)
-- "identified_category":
-- "simple" if the native name has no connector
-- "compose" if it includes connectors like "wa", "ya", etc.
-if you cannot identify any field, return null for that field.
-do not alter the original name, just identify the parts.
-do not add any additional information or explanations.
-## Example:
-- "tshabu ngandu bernard"
-```json
+## Examples:
+```
+"tshabu ngandu bernard"
 {
 "identified_name": "tshabu ngandu",
-"identified_surname": "bernard",
-"identified_category": "simple"
+"identified_surname": "bernard"
 }
-```
-- "ilunga wa ilunga albert"
-```json
+"tshisekedi wa mulumba"
 {
-"identified_name": "ilunga wa ilunga",
-"identified_surname": "albert",
-"identified_category": "compose"
+"identified_name": "tshisekedi wa mulumba",
+"identified_surname": null
 }
+"ntumba wasokadio marie france"
+{
+"identified_name": "ntumba wasokadio",
+"identified_surname": "marie france"
+}
 ```
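
The revised prompt asks for exactly two fields, each either a string or null, and forbids altering the original name. A minimal sketch of how a caller might check a model reply against those rules is shown below. The helper name `parse_response`, the `NameParts` type, and the verbatim-substring check are assumptions made for illustration; they are not taken from this commit.

```python
import json
from typing import Optional, TypedDict


class NameParts(TypedDict):
    identified_name: Optional[str]
    identified_surname: Optional[str]


# Field names come from the prompt; everything else here is hypothetical.
EXPECTED_KEYS = {"identified_name", "identified_surname"}


def parse_response(full_name: str, raw: str) -> NameParts:
    """Parse a JSON reply and check it against the prompt's stated rules."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    # "Do not ... add any additional information": reject unknown fields.
    extra = set(data) - EXPECTED_KEYS
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")

    parts: NameParts = {
        "identified_name": data.get("identified_name"),
        "identified_surname": data.get("identified_surname"),
    }
    for key, value in parts.items():
        if value is None:
            continue  # null means the part could not be identified
        if not isinstance(value, str):
            raise ValueError(f"{key} must be a string or null")
        # "Do not alter the original name": the part must appear verbatim.
        if value not in full_name:
            raise ValueError(f"{key} is not a verbatim part of the input")
    return parts
```

Under these assumptions, a reply that still carries the removed `identified_category` field is rejected as additional information, which may or may not be the desired behavior during a migration.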