Example 1: Applying the author-topic (AT) model

This example shows how to run the AT Gibbs sampler on a small dataset to extract a set of topics. The code produces a single sample of the WP (word-topic) and AT (author-topic) count matrices, shows the most likely words and authors per topic, and writes the results to a text file.

load 'bagofwords_nips'; % Load the NIPS word-document dataset
load 'authordoc_nips'; % Load the author-document pairings for NIPS
load 'words_nips'; % Load the vocabulary
load 'authors_nips'; % Load the author names
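
For orientation, the loaded variables can be pictured with a toy corpus. The sketch below assumes the layout suggested by how the variables are used later in this example: WS and DS are equal-length vectors giving the word index and document index of each token, AD is a sparse authors-by-documents indicator matrix, and WO and AN are cell arrays of word and author strings. All toy values are invented for illustration and are not taken from the NIPS files.

```matlab
% Toy stand-ins for the loaded variables (all values are hypothetical)
WO = { 'neuron' , 'network' , 'kernel' };   % vocabulary, indexed by WS
AN = { 'Author_A' , 'Author_B' };           % author names, indexed by rows of AD
WS = [ 1 2 2 3 3 1 ];                       % word index of each token
DS = [ 1 1 1 2 2 2 ];                       % document index of each token
AD = sparse( [ 1 2 2 ] , [ 1 1 2 ] , 1 , 2 , 2 ); % assumed layout: AD(a,d)=1 if author a wrote document d

% Consistency checks that could also be run on the real data
assert( numel( WS ) == numel( DS ) );       % one document index per token
assert( max( WS ) <= numel( WO ) );         % every word index has a vocabulary entry
assert( size( AD , 1 ) == numel( AN ) );    % one row per author
assert( size( AD , 2 ) >= max( DS ) );      % one column per document
```

If any of these checks fails on real data, the assumed layout does not hold and the variable documentation should be consulted before running the sampler.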

Name of the text file that will list the topic-word and topic-author distributions

filename = 'topics_nips_at.txt';

Set the number of topics

T = 50;

Set the hyperparameters

BETA  = 0.01;
ALPHA = 50/T;

The number of iterations

N = 50;

The random seed

SEED = 3;

What output to show (0=no output; 1=iterations; 2=all output)

OUTPUT = 1;

Run the Gibbs sampler. This call might require 30-45 minutes of compute time

fprintf( 'The following computation might take 30-45 minutes...\n' );
tic
[ WP, AT , Z , X ] = GibbsSamplerAT( WS , DS , AD , T , N , ALPHA , BETA , SEED , OUTPUT );
toc
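
The WP matrix returned above holds counts, not probabilities. Below is a minimal sketch of turning such a count matrix into smoothed per-topic word distributions, assuming WP is a words-by-topics count matrix and BETA acts as a symmetric smoothing constant added to every count. The toy counts are invented, and the variable names colsum and PWT are hypothetical.

```matlab
% Toy word-topic counts: 3 words x 2 topics (hypothetical values)
WP   = [ 3 0 ; 1 2 ; 0 4 ];
BETA = 0.01;
W    = size( WP , 1 );   % vocabulary size

% Smoothed estimate of P( word | topic ):
% add BETA to every count, then normalize each topic column
colsum = sum( WP , 1 ) + W * BETA;
PWT    = ( WP + BETA ) ./ repmat( colsum , W , 1 );

disp( sum( PWT , 1 ) );  % each column sums to 1, up to rounding
```

With a sparse count matrix, adding BETA produces a full matrix, so for a large vocabulary it may be preferable to convert one topic column at a time.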

Save this sample and the settings that produced it, in case they are needed later

save 'temp' WP AT Z X ALPHA BETA SEED N;

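% Bundle the two count matrices, their smoothing constants, and their labels
WPM{1} = WP; WPM{2} = AT;         % word-topic and author-topic counts
BETAM(1) = BETA; BETAM(2) = ALPHA; % smoothing constant for each matrix
WOM{1} = WO; WOM{2} = AN;         % word strings and author names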

Write the word-topic and author-topic distributions to a text file

[ SM ] = WriteTopicsMult( WPM , BETAM , WOM , 7 , 0.7 , 4 , filename );
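
To make the topic listing less opaque, here is a rough sketch of how the most likely words for each topic could be pulled from a count matrix by sorting. It is not the WriteTopicsMult implementation; the toy matrix, the vocabulary, and the variable K are hypothetical.

```matlab
% Toy inputs (hypothetical): 4 words x 2 topics
WO = { 'neuron' , 'network' , 'kernel' , 'bayes' };
WP = [ 5 0 ; 2 1 ; 0 7 ; 1 3 ];
K  = 2;   % number of words to list per topic

for t = 1:size( WP , 2 )
    % Sort the counts of topic t so the most frequent words come first
    [ counts , order ] = sort( WP( : , t ) , 'descend' );
    top = WO( order( 1:K ) );
    fprintf( 'Topic %d: %s\n' , t , strjoin( top , ' ' ) );
end
```

Sorting raw counts gives the same ranking as sorting smoothed probabilities within a topic, because adding a constant and dividing by a column total does not change the order.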

Show the most likely words in the first ten topics

SM{1}(1:10)

Show the most likely authors in the first ten topics

SM{2}(1:10)