Chinese museum analysis

A word-count analysis of the names of around 4,500 museums in China.

natural language processing, text analysis, word counting

Readings and links

Summary

Caixin converted a long, long PDF listing over 4,000 museums in China into a CSV, then determined which words were most popular in museum names. We take a look at both their text approach and their per-capita analysis.

Notebooks, Assignments, and Walkthroughs

Chinese museum dataset cleanup

Cleaning up the data exported by Tabula so we can do some further analysis on it.
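
As a rough idea of what this kind of cleanup can involve, here is a minimal sketch using only Python's standard library. It assumes the Tabula export repeats the header row on each page and contains blank rows, which is common for multi-page PDFs but may not match this dataset. The column names and sample rows are made up for illustration, and `clean_rows` is a hypothetical helper, not something from the notebook.

```python
import csv
import io


def clean_rows(rows, header):
    """Strip whitespace, then drop blank rows and repeated header rows."""
    cleaned = []
    for row in rows:
        cells = [cell.strip() for cell in row]
        if not any(cells):
            continue  # row is entirely empty
        if cells == header:
            continue  # header repeated from a later PDF page
        cleaned.append(cells)
    return cleaned


# Made-up stand-in for a Tabula export (hypothetical columns and values)
raw = (
    "name,province\n"
    "Museum One,Region A\n"
    ",\n"
    "name,province\n"
    " Museum Two ,Region B\n"
)

rows = list(csv.reader(io.StringIO(raw)))
header = [cell.strip() for cell in rows[0]]
museums = clean_rows(rows[1:], header)

for museum in museums:
    print(museum)
```

The same filtering could be done in pandas; the standard library version is used here only to keep the sketch dependency-free.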

Chinese museums per capita analysis

Analysis and graphics determining how many museums there are per person in different areas of China.
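
The per-capita idea boils down to dividing each area's museum count by its population. The sketch below shows that arithmetic, scaled to museums per million residents so the numbers are easier to read. The region names and figures are placeholders, not real data, and `museums_per_million` is a hypothetical helper name.

```python
def museums_per_million(museum_counts, populations):
    """Return museums per one million residents for each region."""
    rates = {}
    for region, count in museum_counts.items():
        population = populations.get(region)
        if not population:
            continue  # skip regions with missing or zero population
        rates[region] = count * 1_000_000 / population
    return rates


# Placeholder values for illustration only (not real counts or populations)
museum_counts = {"Region A": 120, "Region B": 45}
populations = {"Region A": 24_000_000, "Region B": 3_000_000}

rates = museums_per_million(museum_counts, populations)

# List regions from most to fewest museums per million residents
for region, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{region}: {rate:.1f} museums per million residents")
```

Note how the region with fewer museums overall can still come out ahead once population is taken into account, which is the point of a per-capita comparison.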

Counting words in Chinese museum names

Using the jieba text segmentation library, we'll count the number of times different words appear in Chinese museum names.
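
A minimal sketch of the counting step might look like the following. It uses `jieba.lcut`, which splits a string into a list of words, together with `collections.Counter`. The museum names are generic examples rather than rows from the dataset, `count_words` is a hypothetical helper, and the fallback to single characters exists only so the sketch still runs when jieba is not installed.

```python
from collections import Counter

try:
    import jieba
    segment = jieba.lcut  # returns a list of segmented words
except ImportError:
    # Assumption: fall back to one-character tokens if jieba is unavailable
    segment = list


def count_words(names, segmenter):
    """Count how often each segmented token appears across all names."""
    counts = Counter()
    for name in names:
        # Ignore tokens that are only whitespace
        counts.update(token for token in segmenter(name) if token.strip())
    return counts


# Generic example names for illustration (not taken from the dataset)
names = ["北京自然博物馆", "上海历史博物馆", "南京民俗博物馆"]

word_counts = count_words(names, segment)
for word, count in word_counts.most_common(5):
    print(word, count)
```

Passing the segmenter in as an argument keeps the counting logic separate from the choice of tokenizer, so a different segmentation approach can be swapped in without touching `count_words`.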

Discussion topics

Check the individual sections for discussion topics.