- I was interested in the average complexity of a Kanji, measured by the number of strokes used to write it.
- I could not find any data on this, so I decided to do it myself!
- The results may help visualize how written languages balance the number of distinct characters they can form against the limits of human memory.
- I also wanted to learn more pandas / matplotlib for data analysis. It was fun!
- Analysis of Kanji stroke patterns by overall word frequency!
- Weights each Kanji by its average use in natural language, then finds the most common stroke counts in everyday language!
- Graphs! (Bar and pie charts right now.)
- Finds the mean and standard deviation of the most common stroke counts!
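
To make the frequency weighting concrete, here is a minimal sketch of a frequency-weighted mean and standard deviation of stroke counts. It is not the project's actual code: the column names (`kanji`, `strokes`, `frequency`), the helper names, and the frequency values are all made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the frequencies are invented for this example,
# and the column names are assumptions, not the project's real schema.
kanji = pd.DataFrame({
    "kanji": ["日", "一", "人", "語", "曜"],
    "strokes": [4, 1, 2, 14, 18],
    "frequency": [0.40, 0.30, 0.20, 0.08, 0.02],
})


def weighted_stroke_stats(df):
    """Return the frequency-weighted mean and standard deviation of stroke counts."""
    strokes = df["strokes"].to_numpy(dtype=float)
    weights = df["frequency"].to_numpy(dtype=float)
    mean = np.average(strokes, weights=weights)
    # Weighted (population) variance around the weighted mean.
    variance = np.average((strokes - mean) ** 2, weights=weights)
    return mean, np.sqrt(variance)


def frequency_by_stroke_count(df):
    """Sum usage frequency per stroke count, ready for a bar or pie chart."""
    return df.groupby("strokes")["frequency"].sum().sort_index()


mean, std = weighted_stroke_stats(kanji)
totals = frequency_by_stroke_count(kanji)
print(f"weighted mean: {mean:.2f}, weighted std: {std:.2f}")
```

The `totals` Series could then be plotted with `totals.plot(kind="bar")` or `totals.plot(kind="pie")`.
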
- numpy: https://numpy.org/devdocs/contents.html
- pandas: https://pandas.pydata.org/pandas-docs/stable/index.html
- re: https://docs.python.org/3/library/re.html
- matplotlib: https://matplotlib.org/
- scipy: https://www.scipy.org/
- math: https://docs.python.org/3/library/math.html
- random: https://docs.python.org/3/library/random.html
- collections: https://docs.python.org/3/library/collections.html
- Play around more with plot-fitting. See if you can find the mean / std of the plot that was fitted in the video!
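
One possible starting point for that to-do item, sketched under the assumption that the fitted curve is a Gaussian (the shape of the curve in the video is not stated here). The data below is synthetic, and the function and variable names are hypothetical; only `scipy.optimize.curve_fit` is a real library call.

```python
import numpy as np
from scipy.optimize import curve_fit


def gaussian(x, amplitude, mean, std):
    """Gaussian bell curve; assumed here as the model to fit."""
    return amplitude * np.exp(-((x - mean) ** 2) / (2 * std ** 2))


# Synthetic stand-in for "total frequency per stroke count" (not real results).
stroke_counts = np.arange(1, 25, dtype=float)
weights = gaussian(stroke_counts, 1.0, 10.0, 3.0)

# Start the optimizer from rough estimates taken from the data itself.
initial_guess = [
    weights.max(),
    np.average(stroke_counts, weights=weights),
    4.0,
]
params, _covariance = curve_fit(gaussian, stroke_counts, weights, p0=initial_guess)

fit_amplitude, fit_mean, fit_std = params
fit_std = abs(fit_std)  # the model depends only on std squared, so the sign is arbitrary
print(f"fitted mean: {fit_mean:.2f}, fitted std: {fit_std:.2f}")
```

With real data, the fitted `fit_mean` and `fit_std` can be compared against the directly computed weighted mean and standard deviation to see how well a Gaussian describes the distribution.
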