Project Link: topic model prototype

This prototype assembles a user interface for a topic analysis pipeline. It makes use of a number of public domain techniques. The user interface focuses on three processes:

1. A discovery process - assisted by an unsupervised modelling technique for document topic discovery.
2. A review and curation process - where the user manually reviews metrics for each of the terms and clusters and assigns class labels to clusters.
3. A supervised modelling process - where the user builds a classification model from the curated data sets defined during the review and curation process.

The methods applied in this process are:

- TF-IDF document-term feature encoding, provided by the "tidytext" library.
- Latent Dirichlet Allocation, provided by the "topicmodels" library.
- A feed-forward neural network, provided by the "keras" package for R using a TensorFlow backend.
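
To make the first of these methods concrete, the following is a minimal sketch of TF-IDF weighting in plain Python. It is not the prototype's code and does not use the "tidytext" API. It assumes the common formulation in which term frequency is a term's count divided by the document length, and inverse document frequency is the natural log of the number of documents divided by the number of documents containing the term. The function name `tf_idf` and the toy corpus are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return {doc_id: {term: weight}} for a dict of doc_id -> list of tokens.

    Assumed formulation (illustrative only):
      tf  = count of term in document / number of tokens in document
      idf = ln(number of documents / number of documents containing term)
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term occurs at least once.
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))

    weights = {}
    for doc_id, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        weights[doc_id] = {
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in counts.items()
        }
    return weights


# Hypothetical toy corpus, already split into tokens.
docs = {
    "d1": "the cat sat on the mat".split(),
    "d2": "the dog chased the cat".split(),
    "d3": "stocks fell as the market closed".split(),
}

weights = tf_idf(docs)
for term, weight in sorted(weights["d1"].items(), key=lambda kv: -kv[1]):
    print(f"{term:5s} {weight:.4f}")
```

In this toy corpus "the" occurs in every document, so its weight is zero, while a term confined to one document, such as "mat", receives the highest weight in that document. This is the property that makes the encoding useful as input to the topic discovery step.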