Example 2 of running the HMM-LDA topic model

This example shows how to collect multiple samples from the HMM-LDA Gibbs sampler, both from within a single chain and across different chains.

Choose the dataset

dataset = 1; % 1 = psych review; 2 = nips papers

if (dataset == 1)
    % Load the psych review word stream
    load 'psychreviewstream';

    % Set the parameters for the model
    T      = 50;     % number of topics
    NS     = 12;     % number of syntactic states
    ALPHA  = 50 / T; % ALPHA hyperparameter
    BETA   = 0.01;   % BETA hyperparameter
    GAMMA  = 0.1;    % GAMMA hyperparameter
end

if (dataset == 2)
    % Load the nips paper word stream
    load 'nips_stream';

    % Set the parameters for the model
    T      = 50;     % number of topics
    NS     = 16;     % number of syntactic states
    ALPHA  = 50 / T; % ALPHA hyperparameter
    BETA   = 0.01;   % BETA hyperparameter
    GAMMA  = 0.1;    % GAMMA hyperparameter
end

What output to show (0=no output; 1=iterations; 2=all output)

OUTPUT = 1;

The number of iterations

BURNIN   = 100; % the number of iterations before taking samples
LAG      = 10; % the lag between samples
NSAMPLES = 2; % the number of samples for each chain
NCHAINS  = 2; % the number of chains to run
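
With these settings, each chain runs BURNIN iterations before its first sample and LAG further iterations before each later sample. A minimal sketch of that arithmetic follows; the names itersPerChain and totalIters are illustrative and not part of the toolbox.

```matlab
BURNIN   = 100; % iterations before the first sample
LAG      = 10;  % iterations between consecutive samples
NSAMPLES = 2;   % samples per chain
NCHAINS  = 2;   % number of chains

% The first sample is taken right after burn-in; each later sample costs LAG more
itersPerChain = BURNIN + ( NSAMPLES - 1 ) * LAG;
totalIters    = NCHAINS * itersPerChain;

fprintf( 'Iterations per chain: %d; total: %d\n' , itersPerChain , totalIters );
```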

The starting seed number

SEED = 1;

for c=1:NCHAINS
    SEED = SEED + 1;
    N = BURNIN;

    fprintf( 'Running Gibbs sampler for burnin\n' );
    [WP,DP,MP,Z,X]=GibbsSamplerHMMLDA( WS,DS,T,NS,N,ALPHA,BETA,GAMMA,SEED,OUTPUT);

    fprintf( 'Continue to run sampler to collect samples\n' );
    for s=1:NSAMPLES
        filename = sprintf( 'ldahmm_chain%d_sample%d' , c , s );
        fprintf( 'Saving sample #%d from chain #%d: filename=%s\n' , s , c , filename );
        % Save the counts, assignments, and settings for this sample
        save( filename , 'WP' , 'DP' , 'MP' , 'Z' , 'X' , 'ALPHA' , 'BETA' , 'GAMMA' , 'SEED' , 'N' , 'T' , 'NS' , 's' , 'c' );

        WPM{ s , c } = WP;
        MPM{ s , c } = MP;

        if (s < NSAMPLES)
            N = LAG;
            SEED = SEED + 1; % important: change the seed between samples
            [WP,DP,MP,Z,X]=GibbsSamplerHMMLDA( WS,DS,T,NS,N,ALPHA,BETA,GAMMA,SEED,OUTPUT,Z,X);
        end
    end
end

Running Gibbs sampler for burnin
	Iteration 0 of 100
	Iteration 10 of 100
	Iteration 20 of 100
	Iteration 30 of 100
	Iteration 40 of 100
	Iteration 50 of 100
	Iteration 60 of 100
	Iteration 70 of 100
	Iteration 80 of 100
	Iteration 90 of 100
Continue to run sampler to collect samples
Saving sample #1 from chain #1: filename=ldahmm_chain1_sample1
	Iteration 0 of 10
Saving sample #2 from chain #1: filename=ldahmm_chain1_sample2
Running Gibbs sampler for burnin
	Iteration 0 of 100
	Iteration 10 of 100
	Iteration 20 of 100
	Iteration 30 of 100
	Iteration 40 of 100
	Iteration 50 of 100
	Iteration 60 of 100
	Iteration 70 of 100
	Iteration 80 of 100
	Iteration 90 of 100
Continue to run sampler to collect samples
Saving sample #1 from chain #2: filename=ldahmm_chain2_sample1
	Iteration 0 of 10
Saving sample #2 from chain #2: filename=ldahmm_chain2_sample2
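
Each saved file can be reloaded later for inspection. The sketch below is a self-contained illustration that uses a toy matrix in place of a real WP and writes to a temporary directory. It relies only on MATLAB's built-in save and load; the variable R is hypothetical. Loading into a struct keeps the saved variables from overwriting those already in the workspace.

```matlab
% Toy stand-in for a saved sample (a real WP comes from GibbsSamplerHMMLDA)
WP = sparse( [1 2 3] , [1 2 2] , [5 3 4] );
c  = 1;
s  = 1;

% Build the same style of filename as the script, in a temporary directory
filename = fullfile( tempdir , sprintf( 'ldahmm_chain%d_sample%d' , c , s ) );
save( filename , 'WP' , 'c' , 's' );

% Load into a struct so workspace variables are not overwritten
R = load( filename );
fprintf( 'Reloaded chain %d sample %d with %d nonzero counts\n' , R.c , R.s , nnz( R.WP ) );
```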

Inspect the first few topics and HMM states of each sample

for c=1:NCHAINS
    for s=1:NSAMPLES

        [S] = WriteTopics( WPM{s,c} , BETA , WO , 7 , 0.8 );

        fprintf( '\n\nExample topic-word distributions of chain %d sample %d\n' , c , s );
        S(1:5)

        [S] = WriteTopics( MPM{s,c} , BETA , WO , 7 , 0.8 );

        fprintf( '\nExample hmm state-word distributions of chain %d sample %d\n' , c , s );
        S(1:5)
    end
end

Example topic-word distributions of chain 1 sample 1
ans = 
    'similarity bias used account represented drug extended'
    'order serial search network comparison applied parallel'
    'stimulus stimuli response or which color accounts'
    'responses change rate s both underlying normal'
    'self individual different individuals those others does'

Example hmm state-word distributions of chain 1 sample 1
ans = 
    'model theory authors article process analysis framework'
    'it presents we although when can proposes'
    'for which by a or through e.g'
    'and are or not the can were'
    'in with as on by from about'


Example topic-word distributions of chain 1 sample 2
ans = 
    'similarity bias presented represented account drug systematic'
    'order serial network search parallel comparison applied'
    'stimulus stimuli response which or color cs'
    'responses change rate both s normal underlying'
    'self individual individuals other does situations occur'

Example hmm state-word distributions of chain 1 sample 2
ans = 
    'model theory authors article process function analysis'
    'it presents although we however when proposes'
    'for which by a or through e.g'
    'and are or not can were'
    'in as with on by from at'


Example topic-word distributions of chain 2 sample 1
ans = 
    'control evidence patterns studies motor production functional'
    [1x76 char]
    'is that has there view people review'
    'cognitive action part e.g implicit systems dynamic'
    'knowledge relations features such specific structures also'

Example hmm state-word distributions of chain 2 sample 1
ans = 
    'authors results data theory processes relation theories'
    'as used suggested assumed shown consistent concluded'
    'to article based argued in proposed shown'
    'of and to is for with'
    'these presents evidence however proposes describes when'


Example topic-word distributions of chain 2 sample 2
ans = 
    'control studies patterns motor evidence production primary'
    [1x80 char]
    'is that has there phenomena view review'
    'cognitive action part e.g dynamic implicit explicit'
    'knowledge relations structures specific also features distance'

Example hmm state-word distributions of chain 2 sample 2
ans = 
    'authors results data theory processes relation theories'
    'as used shown suggested assumed consistent concluded'
    'to article based argued proposed in accounts'
    'of and to is for with'
    'these presents evidence however describes proposes when'
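
In the output above, samples from the same chain keep similar topics at the same index, while topic 1 of chain 1 and topic 1 of chain 2 are unrelated. One way to compare samples across chains is to match topics by similarity. The following is a minimal sketch, not part of the toolbox. It assumes the count matrices are laid out as words by topics and uses small toy matrices in place of WPM{s,c}. Cosine similarity is one choice among several.

```matlab
% Toy count matrices standing in for two samples (assumed layout: words x topics)
BETA = 0.01;
A = [10 0; 8 1; 0 9; 1 7];   % 4 words x 2 topics
B = [0 9; 1 8; 10 0; 6 1];   % similar topics, but with columns swapped

% Smooth with BETA and normalize each column to a word distribution
PA = bsxfun( @rdivide , A + BETA , sum( A + BETA , 1 ) );
PB = bsxfun( @rdivide , B + BETA , sum( B + BETA , 1 ) );

% Scale each column to unit length so the inner product is a cosine similarity
PA = bsxfun( @rdivide , PA , sqrt( sum( PA.^2 , 1 ) ) );
PB = bsxfun( @rdivide , PB , sqrt( sum( PB.^2 , 1 ) ) );
SIM = PA' * PB;              % SIM(i,j): similarity of topic i in A to topic j in B

% For each topic in A, find the index of the most similar topic in B
[ best , match ] = max( SIM , [] , 2 );
disp( match' );
```

For these toy matrices, match is [2; 1], reflecting the swapped columns. With real samples, A and B would be replaced by full( WPM{s,c} ) for the two samples being compared.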