Name Analysis (#9)
* feat: implement representative sampling by province (~500k records), extract surnames from the first token of each name, build letter transition matrices (frequency and probability), add a heatmap visualization of the transitions, and integrate a Markov-chain-based name generator.
* Implemented letter frequency analysis with histograms, computed bigram and trigram frequencies, and displayed the top results in tabular form. Rebuilt the transition probability matrix and developed a name generator that produces realistic-looking names from the surname data.
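Since the diff itself is suppressed, the following is only a minimal sketch of the pipeline the message describes (first-token surname extraction, letter transition counts, row-normalized probabilities, and a first-order Markov generator). It is not the committed code: the function names, the `^`/`$` boundary markers, lowercasing, and the `max_len` cap are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

# Hypothetical boundary markers (not from the commit): start and end of a name.
START, END = "^", "$"


def extract_surname(full_name):
    """Return the first whitespace-separated token, or "" for a blank name."""
    tokens = full_name.split()
    return tokens[0] if tokens else ""


def build_transition_counts(surnames):
    """Count letter-to-letter transitions, including start and end markers."""
    counts = defaultdict(Counter)
    for name in surnames:
        chars = [START] + list(name.lower()) + [END]
        for current, following in zip(chars, chars[1:]):
            counts[current][following] += 1
    return counts


def to_probabilities(counts):
    """Normalize each row of counts so its probabilities sum to 1."""
    probs = {}
    for current, following in counts.items():
        total = sum(following.values())
        probs[current] = {ch: n / total for ch, n in following.items()}
    return probs


def generate_name(probs, rng, max_len=12):
    """Walk the chain from START until END is drawn or max_len is reached."""
    letters = []
    state = START
    while len(letters) < max_len:
        candidates = list(probs[state])
        weights = [probs[state][ch] for ch in candidates]
        state = rng.choices(candidates, weights=weights, k=1)[0]
        if state == END:
            break
        letters.append(state)
    return "".join(letters).capitalize()


if __name__ == "__main__":
    # Made-up records for illustration only; the real data set is not shown here.
    records = ["Moreno Ana", "Molina Luis", "Medina Rosa", "Romero Juan"]
    surnames = [extract_surname(r) for r in records]
    probs = to_probabilities(build_transition_counts(surnames))
    rng = random.Random(0)
    print([generate_name(probs, rng) for _ in range(5)])
```

The nested dictionary stays sparse, which is convenient for sampling; for a heatmap, the same counts could be copied into a dense letters-by-letters array first. Conditioning on the previous two letters instead of one would use the trigram counts mentioned in the message.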
Vendored
+227
-219
File diff suppressed because one or more lines are too long
Vendored
+2002
File diff suppressed because one or more lines are too long