Name Analysis (#9)

* feat: implement representative sampling by province (~500k records), extract surnames from the first token of each name, build letter transition matrices (frequency and probability), add a heatmap visualization of the transitions, and integrate a Markov chain–based name generator.

* Implemented letter frequency analysis with histograms, computed bigram and trigram frequencies, and displayed the top results in a table. Rebuilt the transition probability matrix and developed a name generator that produces plausible surnames from the surname data.
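
Since the diff itself is suppressed, the pipeline described in the two items above can be sketched as follows. This is a minimal illustration and not the code from this commit: the function names, the `^`/`$` boundary markers, the `max_len` cap, and the three sample names are all assumptions made for the example, and only the standard library is used.

```python
import random
from collections import Counter, defaultdict

# Hypothetical boundary markers for word start and word end (an assumption,
# not taken from the commit).
START, END = "^", "$"

def extract_surname(full_name):
    """Return the first whitespace-separated token of a name, lowercased."""
    tokens = full_name.split()
    return tokens[0].lower() if tokens else ""

def ngram_counts(words, n):
    """Count every run of n consecutive letters across all words."""
    counts = Counter()
    for word in words:
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts

def transition_counts(words):
    """Frequency matrix: counts[a][b] is how often letter b follows letter a."""
    counts = defaultdict(Counter)
    for word in words:
        padded = START + word + END
        for a, b in zip(padded, padded[1:]):
            counts[a][b] += 1
    return counts

def to_probabilities(counts):
    """Probability matrix: normalize each row of the frequency matrix to sum to 1."""
    probs = {}
    for a, row in counts.items():
        total = sum(row.values())
        probs[a] = {b: c / total for b, c in row.items()}
    return probs

def generate_name(probs, rng, max_len=12):
    """Walk the first-order Markov chain from START until END or max_len letters."""
    letters = []
    current = START
    while len(letters) < max_len:
        row = probs[current]
        current = rng.choices(list(row), weights=list(row.values()))[0]
        if current == END:
            break
        letters.append(current)
    return "".join(letters).capitalize()

# Toy input, invented for illustration only.
names = ["Kabila Joseph", "Kasongo Marie", "Mukendi Paul"]
surnames = [s for s in (extract_surname(n) for n in names) if s]

bigrams = ngram_counts(surnames, 2)
trigrams = ngram_counts(surnames, 3)
probs = to_probabilities(transition_counts(surnames))

print(bigrams.most_common(3))
print(generate_name(probs, random.Random(0)))
```

Padding each surname with start and end markers lets the same matrix decide both the first letter and where a generated name stops. The nested dictionaries can be converted to a dense array if a heatmap is wanted.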
Amaury Cansa
2025-09-24 20:23:40 +02:00
committed by GitHub
parent dda83510ac
commit 4874b178c9
2 changed files with 2229 additions and 219 deletions
+227 -219
File diff suppressed because one or more lines are too long
+2002
File diff suppressed because one or more lines are too long