Topic Modelling Introduction
The goal of this section is to explore two topic modelling techniques: Latent Dirichlet Allocation (LDA) and Latent Semantic Analysis (LSA).
These two methods share a very important characteristic: they detect latent topics, or meaning. This means that the data we feed the models is all that is used to determine and shape the topics.
These are very powerful tools that can help us identify what kind of data we’re dealing with. A limitation of these techniques is that if our data is skewed in a certain direction, i.e. it only belongs to subcategories of a single topic or covers only a few niche topics, the topics the models find will be skewed in the same way.
In the next couple of pages, we explore what these techniques can produce for us, and whether their results are any good.