Commit Graph

  • b7fc90ef71 [notebook] add results main bernard-ng 2025-10-18 22:57:21 +02:00
  • c463e6ed7e Rename model notation reference (#13) Bernard Ngandu 2025-10-18 16:06:59 +02:00
  • ad600ef565 Add technical architecture report (#12) Bernard Ngandu 2025-10-18 15:43:28 +02:00
  • 8160bb0f6f docs: update README bernard-ng 2025-10-07 23:58:47 +02:00
  • f2ac0c9769 fix: add github workflow bernard-ng 2025-10-07 23:21:35 +02:00
  • d3b3840278 fix: nn models pad_sequences bernard-ng 2025-10-06 00:37:29 +02:00
  • cb22c06628 fix: remove svm model bernard-ng 2025-10-06 00:03:54 +02:00
  • fad7ff9277 chore(release): v1.0.0 1.0.0 bernard-ng 2025-10-05 23:45:35 +02:00
  • 8f90fdd625 fix: notebooks bernard-ng 2025-10-05 23:23:58 +02:00
  • 137dea7fe5 fix: models bernard-ng 2025-10-05 21:54:25 +02:00
  • 9dd4f759b3 refactoring: uv bernard-ng 2025-10-05 18:14:15 +02:00
  • f3b06fbd07 feat: regions clusters bernard-ng 2025-10-03 11:58:36 +02:00
  • 912d518106 feat: support gpu bernard-ng 2025-09-29 22:52:08 +02:00
  • a1d500830b feat: support gpu bernard-ng 2025-09-29 21:07:23 +02:00
  • 9e35f95107 feat: statistics tests bernard-ng 2025-09-28 23:50:40 +02:00
  • 9039e9a4cf feat: statistics tests bernard-ng 2025-09-28 17:16:02 +02:00
  • fc469a037e feat: implementation of transition matrices (P_male, P_female, P_both) by province and activation of synthetic name generation by province, as well as analysis of letter, 3-gram, 4-gram and 5-gram frequencies. feature/name-analysis-region amaury 2025-09-27 03:32:25 +02:00
  • 773ebf32c6 Adding surname transition analysis with Markov models, frequency studies, and visualizations, including cleaned surname preprocessing, province sampling, bigram/trigram stats, and male–female transition comparisons amaury 2025-09-26 13:20:37 +02:00
  • ef4ec70fcc feat: generate names based on gender bernard-ng 2025-09-25 23:45:44 +02:00
  • 817081b443 feat: stabilize name analysis bernard-ng 2025-09-25 23:17:49 +02:00
  • 4874b178c9 Name Analysis (#9) Amaury Cansa 2025-09-24 20:23:40 +02:00
  • dda83510ac fix: add missing regions in region_mapper bernard-ng 2025-09-23 00:05:35 +02:00
  • c1b502c878 feat: add osm data bernard-ng 2025-09-21 16:23:44 +02:00
  • 63e23d6600 fix: normalize hyper params bernard-ng 2025-09-21 13:10:07 +02:00
  • 83d21c640b feat: add more baseline experiments bernard-ng 2025-09-21 00:06:01 +02:00
  • e41b15a863 feat: document models bernard-ng 2025-09-20 23:35:54 +02:00
  • dd2a9f2711 refactor: clean up imports and improve gender normalization method bernard-ng 2025-09-20 22:55:24 +02:00
  • 0816207a2c fix: use full_name feature for all models bernard-ng 2025-08-19 19:36:04 +02:00
  • 7101cea5e7 fix: dependencies in requirements.txt bernard-ng 2025-08-19 17:38:56 +02:00
  • d4e8e2a34e remove max_len from config bernard-ng 2025-08-19 08:04:48 +02:00
  • cab5f63809 fix: update default template path in argument parser bernard-ng 2025-08-17 16:31:11 +02:00
  • 33c7aceb0c feat: remove data heavy viz bernard-ng 2025-08-17 16:03:46 +02:00
  • b65aad6ac6 feat: add visualizations for gender, province, and name length distributions in dashboard bernard-ng 2025-08-17 15:52:15 +02:00
  • f70b4be6e0 feat: add NER testing interface and evaluation statistics handling bernard-ng 2025-08-17 15:33:16 +02:00
  • 6faf9f355e fix: NER training loop bernard-ng 2025-08-17 14:15:12 +02:00
  • 3122c92f5e fix: escape csv field to avoid error on empty fields bernard-ng 2025-08-17 13:39:19 +02:00
  • ed60f9deff docs: add instruction for NER processing bernard-ng 2025-08-16 22:37:39 +02:00
  • e08084797f feat: Experiment Builder bernard-ng 2025-08-16 22:14:55 +02:00
  • cf1cbac1a8 hotfixes bernard-ng 2025-08-16 20:34:45 +02:00
  • 84f7d41a84 feat: web application multipage support bernard-ng 2025-08-16 19:05:24 +02:00
  • 7b652d6999 hotfixes bernard-ng 2025-08-15 08:08:11 +02:00
  • 9601c5e44d feat: enhance logging and memory management across modules bernard-ng 2025-08-13 23:09:05 +02:00
  • 47e52d130c hotfixes bernard-ng 2025-08-12 23:17:18 +02:00
  • 3977d5c313 feat: implement NER dataset feature engineering with multiple transformation formats bernard-ng 2025-08-12 00:11:46 +02:00
  • d5a4aaaf4a feat: add NER annotation step and integrate into pipeline bernard-ng 2025-08-11 07:13:09 +02:00
  • 6d39c3afc1 feat: enhance training pipeline with research templates and experiment configuration bernard-ng 2025-08-08 23:48:55 +02:00
  • 96291b4ad0 refactor: update configuration loading and ensure directory existence across modules bernard-ng 2025-08-07 00:36:32 +02:00
  • 104d7e1146 refactor: rename setup_config_and_logging to setup_config and update references bernard-ng 2025-08-06 22:50:04 +02:00
  • 9338d6eab8 feat: implement unified configuration loading and logging setup across entry points bernard-ng 2025-08-06 22:17:02 +02:00
  • d7aa24a935 refactor: reorganize project structure and enhance model verbosity bernard-ng 2025-08-06 21:57:10 +02:00
  • ad8db43748 Add analysis and map of categories of dominant first names, surnames and middle names by province with GeoPandas (#7) Amaury Cansa 2025-08-06 08:37:36 +02:00
  • 80496feb99 Added full name analysis with grouping by first name, last name and postname by region, gender and former provinces. Extraction via identified_name, co-occurrence heatmaps, filtering of simple cases only, and restructuring of regional mappings, co-occurrence, and heatmaps by first name, last name and middle name. (#6) Amaury Cansa 2025-08-05 21:15:06 +02:00
  • f4689faf80 refactoring: add initial pipeline configuration and model classes bernard-ng 2025-08-04 16:12:25 +02:00
  • 19c66fd0ee fix: datatype bernard-ng 2025-07-25 10:42:02 +02:00
  • 14fc302b28 fix: eda with latest dataset bernard-ng 2025-07-24 19:32:44 +02:00
  • cbe3b0ecf2 feat: fix annotated datatype bernard-ng 2025-07-24 17:17:52 +02:00
  • 9f410ca674 refactor: fix logging bernard-ng 2025-07-24 14:27:54 +02:00
  • 326b854615 refactor: fix logging bernard-ng 2025-07-24 14:18:16 +02:00
  • 5e5e07c601 refactor: prompt engineering bernard-ng 2025-07-24 14:14:03 +02:00
  • 72c7007404 refactor: prompt engineering bernard-ng 2025-07-24 13:28:59 +02:00
  • 2b63c37f4e refactor: optimization, no need to annotate entire dataset bernard-ng 2025-07-24 13:16:47 +02:00
  • e2536c1899 refactor: include province and annotation pipeline bernard-ng 2025-07-24 12:50:30 +02:00
  • da7b09dab3 mapping of regions (educational provinces) into the current political provinces, then into 11 large former provinces to facilitate distribution (#5) 1Cansa 2025-07-23 23:41:30 +02:00
  • eacbb94a48 experiment: using LLM for initial annotation bernard-ng 2025-07-18 22:49:45 +02:00
  • 78355eb1d1 feat: add analysis exploration bernard-ng 2025-07-18 09:33:57 +02:00
  • 1aed22016a Add functionality to display top middle names and surnames by region and sex with flexible filtering (#4) 1Cansa 2025-07-03 11:47:23 +02:00
  • efd97911d3 feat: create evaluation dataset bernard-ng 2025-07-03 10:16:52 +02:00
  • 0888d94596 feat: balanced dataset loading bernard-ng 2025-06-30 01:32:10 +02:00
  • eb139ee09a fix: artifacts saving and dataset loading bernard-ng 2025-06-24 21:49:03 +02:00
  • fb95c72ab7 fix: lstm model bernard-ng 2025-06-24 09:40:42 +02:00
  • d8980ec328 Firstnames treatment (#3) 1Cansa 2025-06-23 15:37:48 +02:00
  • 88bb2f207e docs: add gender inference instructions bernard-ng 2025-06-21 10:53:02 +02:00
  • 25f1df46d8 feat: improve inference for logreg model bernard-ng 2025-06-21 10:35:48 +02:00
  • a46a5f7924 feat: improve inference for logreg model bernard-ng 2025-06-21 10:34:26 +02:00
  • 33d096f8ff fix: dataset path bernard-ng 2025-06-20 16:48:03 +02:00
  • b20f96a450 fix: dependencies bernard-ng 2025-06-20 16:45:54 +02:00
  • c829cac51c Add exploratory data analysis (#1) 1Cansa 2025-06-20 16:41:06 +02:00
  • 1d58e3ccc4 feat: add gender base models architectures bernard-ng 2025-06-20 16:38:48 +02:00
  • f454ba7938 Initial commit bernard-ng 2025-06-19 18:45:11 +02:00