Research
The Structure of Semantic Memory
I am interested in characterizing the structure of semantic networks
and how this structure interacts with the processes that operate on it.
Large-scale semantic networks, such as word association, WordNet, and
Roget's thesaurus, have a small-world structure: short average path
lengths, strong local clustering, and scale-free patterns of
connectivity, with most nodes having relatively few connections and
being joined together through a small number of hubs with many
connections. This pattern of connectivity is difficult to explain on
the basis of Euclidean spaces.
Steyvers, M., & Tenenbaum, J. (2005). The Large Scale Structure
of Semantic Networks: Statistical Analyses and a Model of Semantic
Growth. Cognitive Science, 29(1), 41-78.
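
The two small-world statistics mentioned above, local clustering and average path length, can be computed on any undirected graph. The following is a minimal sketch rather than the analysis code used in the paper; the toy network and the function names are hypothetical and chosen only for illustration.

```python
from collections import deque


def clustering_coefficient(graph):
    """Mean local clustering: for each node, the fraction of pairs of its
    neighbors that are themselves connected (0 for nodes with < 2 neighbors)."""
    total = 0.0
    for nbrs in graph.values():
        nbrs = list(nbrs)
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbrs[j] in graph[nbrs[i]]
        )
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph)


def average_path_length(graph):
    """Mean shortest-path length over all connected ordered pairs of nodes,
    using a breadth-first search from every node."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Hypothetical toy network with one hub ("animal"); not real association data.
toy_network = {
    "animal": {"dog", "cat", "bird", "fish"},
    "dog": {"animal", "cat"},
    "cat": {"animal", "dog"},
    "bird": {"animal"},
    "fish": {"animal"},
}

clustering = clustering_coefficient(toy_network)
path_length = average_path_length(toy_network)
```

In a small-world network the clustering value is much higher than in a random graph of the same size and density, while the average path length stays comparably short.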


I am also interested in developing probabilistic topic models to
explain these structured semantic representations. Probabilistic topic models
were introduced in the domain of machine learning and information
retrieval under more technical names such as probabilistic Latent
Semantic Indexing (pLSI) and Latent Dirichlet Allocation (LDA). These
models are based on the idea that documents are mixtures of topics,
where a topic is a probability distribution over words. A topic model
is a generative model for documents: it specifies a simple
probabilistic procedure by which documents can be generated. To make a
new document, one chooses a distribution over topics. Then, for each
word in that document, one chooses a topic at random according to this
distribution, and draws a word from that topic. An efficient Markov Chain Monte Carlo (MCMC) technique
can be used to infer the set of topics that were responsible for
generating a collection of documents.
Steyvers, M., & Griffiths, T.
(in press). Probabilistic topic models. In T. Landauer, D. McNamara,
S. Dennis, and W. Kintsch (Eds.), Latent Semantic Analysis: A Road
to Meaning. Lawrence Erlbaum.

Griffiths, T., & Steyvers, M.
(2004). Finding Scientific Topics. Proceedings of the National Academy of Sciences, 101 (suppl. 1),
5228-5235.
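
The generative procedure described above can be sketched in a few lines. This is an illustrative sketch only: the two toy topics, the parameter `alpha`, and the function name are hypothetical, and drawing the document's topic distribution from a symmetric Dirichlet prior is the LDA assumption rather than something required by the description above.

```python
import random


def generate_document(topics, alpha, n_words, rng):
    """Generate one document from a topic model.

    topics: list of dicts mapping word -> probability (one dict per topic).
    alpha:  symmetric Dirichlet parameter (an assumption of this sketch).
    """
    n_topics = len(topics)
    # A Dirichlet draw, obtained by normalizing independent gamma draws.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(n_topics)]
    norm = sum(gammas)
    theta = [g / norm for g in gammas]  # the document's topic distribution

    assignments, words = [], []
    for _ in range(n_words):
        # Choose a topic according to the document's topic distribution ...
        z = rng.choices(range(n_topics), weights=theta)[0]
        # ... then draw a word from that topic's distribution over words.
        vocab = list(topics[z])
        w = rng.choices(vocab, weights=[topics[z][v] for v in vocab])[0]
        assignments.append(z)
        words.append(w)
    return theta, assignments, words


# Hypothetical toy topics; note that "play" has a sense in each of them.
toy_topics = [
    {"play": 0.3, "game": 0.4, "team": 0.3},
    {"play": 0.3, "stage": 0.4, "actor": 0.3},
]

theta, assignments, words = generate_document(
    toy_topics, alpha=1.0, n_words=20, rng=random.Random(0)
)
```

Inference runs in the opposite direction: given only the words, a sampler such as the Gibbs sampler estimates the topic assignments and the topics themselves.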

Our work with these
probabilistic topic models has shown that the large-scale
structure of the model's representation has statistical properties
that correspond well with those of semantic networks produced by
humans:
Griffiths, T.L., & Steyvers,
M. (2002). Prediction and semantic association. In: Advances in Neural Information Processing Systems, 15.

Griffiths, T.L., & Steyvers, M. (2002). A
probabilistic approach to semantic representation. In: Proceedings of
the Twenty-Fourth Annual Conference of Cognitive Science Society. George
Mason University, Fairfax, VA.

We are currently extending this
model to make precise predictions for episodic memory tasks such as
recognition and recall where semantic properties play an important
role (e.g. "false memory").
A demo of the Gibbs sampler applied to word sense disambiguation is
available here:
demo
of word sense disambiguation (click on "next iteration" to step
through the iterations of the Gibbs sampler). Colors and numbers
indicate the assignment of words to topics. Note the ambiguous word
"PLAY". Over the course of learning, this word is assigned to
different topics that highlight its different senses.
Another research direction is to formulate
increasingly structured representations that will be useful in
both cognitive science and machine learning/information
retrieval. For example, standard topic models do not make any
assumptions about the order of words as they appear in documents. This
is known as the bag-of-words assumption, and it is common to many
statistical models of language.
Of course, word-order information
might contain important cues to the content of a document, and the
standard model does not use this information. Griffiths, Steyvers, Blei,
and Tenenbaum (2005) present an extension of the topic model that
is sensitive to word order and automatically learns which words
characterize the content of a document and which words are mere
function words that are needed to form a sentence. This research is
useful both in information retrieval, to automatically
identify relevant content words, and in cognitive science, to
characterize the online processing of syntactic and semantic
information.
Griffiths, T.L., Steyvers, M., Blei, D.M., & Tenenbaum, J.B. (2005).
Integrating Topics and Syntax. In: Advances in Neural
Information Processing Systems, 17.

Learning about Documents
Recently, together with
Padhraic Smyth (ICS, UC
Irvine), we
developed the author-topic model, an extension of the topic model that
integrates authorship information with content (e.g., Steyvers, Smyth,
Rosen-Zvi, and Griffiths, 2004; Rosen-Zvi, Griffiths, Steyvers, and Smyth,
2004). Instead of associating each document with a distribution over topics,
the author-topic model associates each author with a distribution over
topics and assumes that each multi-authored document expresses a mixture of
its authors’ topic mixtures. Using a large corpus of 500,000 Enron emails
recently released by the Justice Department, we applied the model to learn
what topics Enron employees wrote about. We also applied the model to a
large collection of CiteSeer abstracts to learn about the main researchers
and topic trends in different areas of computer science research.
Steyvers, M., Smyth, P., Rosen-Zvi, M., & Griffiths, T.
(2004). Probabilistic Author-Topic Models for Information Discovery. The Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data
Mining. Seattle, Washington.
Rosen-Zvi, M., Griffiths, T., Steyvers, M., & Smyth, P. (2004). The Author-Topic Model
for Authors and Documents. In 20th Conference on Uncertainty in
Artificial Intelligence. Banff, Canada.
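
The difference from the standard topic model can be made concrete with a sketch of how a multi-authored document would be generated. The author names, topics, and function name below are hypothetical; choosing an author uniformly at random for each word is an assumption of this sketch.

```python
import random


def generate_author_topic_document(doc_authors, author_topics, topics, n_words, rng):
    """Generate (author, topic, word) triples for one multi-authored document.

    author_topics: dict mapping author -> list of topic probabilities.
    topics:        list of dicts mapping word -> probability.
    """
    tokens = []
    for _ in range(n_words):
        # Each word is attributed to one of the document's authors
        # (uniformly at random in this sketch).
        author = rng.choice(doc_authors)
        # The topic comes from that author's distribution over topics,
        # not from a document-specific distribution.
        theta = author_topics[author]
        z = rng.choices(range(len(theta)), weights=theta)[0]
        vocab = list(topics[z])
        w = rng.choices(vocab, weights=[topics[z][v] for v in vocab])[0]
        tokens.append((author, z, w))
    return tokens


# Hypothetical authors and topics, for illustration only.
toy_topics = [
    {"memory": 0.5, "recall": 0.5},
    {"network": 0.5, "graph": 0.5},
]
toy_author_topics = {
    "author_a": [0.9, 0.1],
    "author_b": [0.2, 0.8],
}

tokens = generate_author_topic_document(
    ["author_a", "author_b"], toy_author_topics, toy_topics, 30, random.Random(1)
)
```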

Based on the derived representations,
statistical inference can be used to answer the following queries: 1) what
topics does a given author write about? 2) given a document, which author is
most likely to have written about the topics expressed in the document? 3)
how broad is the research of an author, as expressed by the author's topic
distribution? 4) how unusual is a paper for a given author? and 5) which
authors are similar to a given author? These queries are relevant not only
when exploring a scientific domain or developing an author profile, but also
in practical situations such as finding targets for funding or assigning
reviewers to a paper or grant proposal.
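
Two of these queries, breadth (3) and author similarity (5), can be sketched directly on the learned author-topic distributions. The measures used here, entropy for breadth and Jensen-Shannon divergence for similarity, are assumptions of this sketch and not necessarily the ones used in the papers; the author names and distributions are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats); higher values mean broader topic coverage."""
    return -sum(x * math.log(x) for x in p if x > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two topic distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def most_similar_author(name, author_topics):
    """Return the other author whose topic distribution is closest to `name`."""
    return min(
        (a for a in author_topics if a != name),
        key=lambda a: js_divergence(author_topics[name], author_topics[a]),
    )


# Hypothetical author-topic distributions over three topics.
toy_author_topics = {
    "author_a": [0.8, 0.1, 0.1],
    "author_b": [0.7, 0.2, 0.1],
    "author_c": [0.1, 0.1, 0.8],
    "generalist": [1 / 3, 1 / 3, 1 / 3],
}

breadth = {a: entropy(p) for a, p in toy_author_topics.items()}
nearest = most_similar_author("author_a", toy_author_topics)
```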
For an online demo of the author-topic
model, go to this website.
Inference in Dynamic Environments
Understanding how people make
decisions, from very simple perceptual to complex cognitive decisions,
is an important area of research in psychology. In this collaborative research
with Scott Brown, we
examine decision-making behavior in dynamically changing decision
contexts. Real-world decision contexts vary continually, and good
decision makers must continually adjust their behavior to track
environmental changes.

Consider the case of a military
observer making decisions about the identity (friend vs. enemy) of noisy
stimuli from reconnaissance pictures. The difficulty of these decisions
will change throughout the task, as more or less clear pictures are
used, or more or less uniform terrain is observed. An ideal observer
must dynamically adjust their decision making process to reflect changes
in the environment. For example, if it becomes easier to identify
friendly stimuli in new terrain, observers should relax their criterion
for identifying enemy stimuli.
Previous research has most often
assumed static models for decision making that ignore sequential
dependencies between environments and the effect of history on current
decision making. Some more recent research has focused on dynamic models
of decision making, in which certain parameters of the decision process
are allowed to vary from decision to decision. However, this research
typically makes another static assumption: namely that the environment
is stationary.
In our research, we develop new
experimental paradigms in which dynamic decision making environments
force participants to change their decision making processes in order to
remain (approximately) ideal. This paradigm allows us to observe
decision makers tracking changes in the environment. Currently, one
demonstration program is near completion in which an aircraft is flying
through a canyon environment (see screenshots below). During the flight,
the aircraft is attacked by incoming missiles.
There are two types of incoming missiles and the participant in the
experiment has to make a quick decision about the correct type of
missile in order to choose the appropriate counter-measures. Building on
this demo, the goal is to measure decision speed in natural environments
and also to measure how well participants adapt to changes in the
decision making environment (e.g., by making the two types of missiles
more or less similar during the course of the experiment).

We are also developing two models
for the decision process in dynamic environments. One model is an ideal
observer system in which statistical evidence for a changed environment
is weighed in optimal fashion against evidence for a stable environment.
The ideal observer analysis results in estimates for the (optimal)
number of trials it takes to detect and adjust to new decision
environments. Our other model is a dynamic SDT model that estimates how
long it actually takes for individual decision makers to adapt to novel
decision environments. By comparing predictions from the ideal observer
model to the parameter estimates from the decision model (from
individual decision makers), we can quantify the degree of mismatch
between ideal and actual observer.
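
As background, the two quantities that a signal detection analysis tracks, sensitivity and decision criterion, can be estimated from hit and false alarm rates. The sketch below is the standard static, equal-variance SDT calculation and not the dynamic model described above; the function name is hypothetical.

```python
from statistics import NormalDist


def sdt_estimates(hit_rate, false_alarm_rate):
    """Equal-variance SDT estimates from a block of trials.

    Returns (d_prime, criterion). A negative criterion indicates a liberal
    observer (more willing to respond "signal"), a positive one a
    conservative observer. Rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa
    criterion = -0.5 * (z_hit + z_fa)
    return d_prime, criterion


# Illustrative rates only, not data from the experiments.
d_prime, criterion = sdt_estimates(0.9, 0.4)
```

Computing these estimates over successive blocks of trials gives a coarse picture of how quickly an observer's criterion moves after the environment changes.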
Brown, S.D., & Steyvers, M. (2005). The
Dynamics of Experimentally Induced Criterion Shifts. Journal of
Experimental Psychology: Learning, Memory & Cognition, 31(4),
587-599. 
Steyvers, M., &
Brown, S. (in press). Prediction and Change Detection. In: Advances in Neural Information Processing Systems, 19.

In research on episodic,
lexical and semantic memory, I am trying to understand both the processes
that underlie tasks such as recognition, recall, and lexical decision, as
well as the representations that support these processes. The basis for this
research is the theory of REM (Retrieving Effectively from Memory) developed
by Richard M. Shiffrin at Indiana University and myself (Shiffrin &
Steyvers, 1997). The model was able to handle some basic recognition memory
phenomena that have been difficult to handle with extant models. For
example, in order to model mirror effects – the finding that many
experimental manipulations (e.g., list length, strength and word frequency)
simultaneously raise hit rates and lower false alarm rates – many models
adjust response thresholds without explaining how and why these thresholds
vary as they do. In contrast, in the REM theory, memory decisions are based
on a Bayesian inference process that contrasts the evidence for one decision
(e.g. “old”) with the evidence for another decision (e.g. “new”), which
naturally leads to mirror effects.
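
The Bayesian decision rule can be sketched as follows, following the likelihood-ratio form given in Shiffrin & Steyvers (1997). Feature values are positive integers drawn from a geometric distribution with parameter g, a stored value of 0 means that nothing was stored for that feature, and c is the probability that a stored feature was copied correctly. The function names and the example vectors are hypothetical.

```python
def trace_likelihood_ratio(probe, trace, c, g):
    """Likelihood ratio that `trace` stores the probe item versus another item."""
    ratio = 1.0
    for p, t in zip(probe, trace):
        if t == 0:
            continue  # nothing stored for this feature: no evidence either way
        if p == t:
            # A match on value t is stronger evidence when t is a rare value.
            base = g * (1 - g) ** (t - 1)
            ratio *= (c + (1 - c) * base) / base
        else:
            # A mismatch can only arise from an incorrectly copied feature.
            ratio *= 1 - c
    return ratio


def recognition_odds(probe, traces, c, g):
    """Odds in favor of "old": the mean likelihood ratio over all traces."""
    return sum(trace_likelihood_ratio(probe, t, c, g) for t in traces) / len(traces)


def respond_old(probe, traces, c, g):
    """Respond "old" when the odds exceed 1."""
    return recognition_odds(probe, traces, c, g) > 1.0


# Illustrative parameter values and vectors, not fitted to any data.
example_ratio = trace_likelihood_ratio([1, 2, 3], [1, 0, 5], c=0.7, g=0.4)
```

Because the same odds computation drives both "old" and "new" responses, manipulations that change the evidence shift hit rates and false alarm rates in opposite directions without any separate adjustment of a response threshold.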
Shiffrin, R.M. & Steyvers, M. (1997). A model for recognition memory:
REM: Retrieving Effectively from Memory. Psychonomic Bulletin &
Review, 4 (2), 145-166.

Shiffrin, R. M., & Steyvers, M. (1998). The effectiveness of retrieval
from memory. In M. Oaksford & N. Chater (Eds.). Rational models of
cognition. (pp. 73-95), Oxford, England: Oxford University Press.

Steyvers, M. (2000).
Modeling semantic and orthographic similarity effects on memory for
individual words. Dissertation, Psychology Department, Indiana
University. Formatted for 55 pages

Wagenmakers, Steyvers,
Raaijmakers, Shiffrin, van Rijn, & Zeelenberg (2004) have developed a model
for lexical decision based on the same principles as the REM model. The
evidence for one decision (“WORD”) is contrasted with the evidence for the
other decision (“NONWORD”) on the basis of a Bayesian inference process. The
long-term goal in the REM framework is to develop a unified account of
episodic, lexical and semantic memory.
Wagenmakers, E.J.M.,
Steyvers, M., Raaijmakers, J.G.W., Shiffrin, R.M., van Rijn, H., &
Zeelenberg, R. (2004). A Model for Evidence Accumulation in the Lexical Decision Task.
Cognitive Psychology, 48, 332-367.

Steyvers, M., Wagenmakers, E.J.M., Shiffrin, R.M., Zeelenberg, R., &
Raaijmakers, J.G.W. (2001). A Bayesian model for the time-course of
lexical processing. In: Proceedings of the Fourth International
Conference on Cognitive Modeling. George Mason University, Fairfax, VA.

Finally, I
am interested in explaining word frequency effects in recognition memory in
terms of aspects of words other than word frequency per se. For example, what
is the effect of the number of contexts a word has appeared in, irrespective
of the total number of times the word has appeared? Are words that appear in
few contexts more memorable? Also, what is the effect of rare or common
letter features within words? Are words with rare features more memorable?
Steyvers, M., & Malmberg,
K. (2003). The effect of
normative context variability on recognition memory. Journal of
Experimental Psychology: Learning, Memory, & Cognition, 29(5),
760-766.
Malmberg, K. J., Steyvers, M., Stephens, J. D., & Shiffrin, R.M. (2002).
Feature-frequency effects in recognition memory. Memory & Cognition,
30(4), 607-613.

One theme in my research is
finding appropriate mental representations for stimuli often used in
cognitive tasks (e.g., words, faces, visual scenes). Traditional multidimensional scaling
techniques place stimuli as points in a multidimensional space with
similarity inversely related to the distances between points. Typically, pairwise similarity
judgments are used to infer these multidimensional representations, but I
have shown how to extend this framework to learn semantic spaces for
words based on word association (Steyvers, Shiffrin, & Nelson, 2004) and
perceptual representations for faces based on physical features as well
as similarity ratings (Steyvers & Busey, 2000).
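
The basic idea of multidimensional scaling can be sketched with a small gradient-descent procedure that places items in a space so that inter-point distances approximate given dissimilarities. This is a generic metric MDS sketch and not the method from the papers cited here; the function names, step size, and toy dissimilarities are assumptions for illustration.

```python
import math
import random


def stress(points, dissim):
    """Sum of squared differences between distances and dissimilarities."""
    n = len(points)
    return sum(
        (math.dist(points[i], points[j]) - dissim[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def mds(dissim, dims=2, steps=500, lr=0.05, seed=0):
    """Place items in `dims` dimensions by gradient descent on the stress."""
    rng = random.Random(seed)
    n = len(dissim)
    pts = [[rng.uniform(-1, 1) for _ in range(dims)] for _ in range(n)]
    for _ in range(steps):
        grads = [[0.0] * dims for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = math.dist(pts[i], pts[j]) or 1e-12  # avoid division by zero
                coef = 2 * (d - dissim[i][j]) / d
                for k in range(dims):
                    grads[i][k] += coef * (pts[i][k] - pts[j][k])
        for i in range(n):
            for k in range(dims):
                pts[i][k] -= lr * grads[i][k]
    return pts


# Toy dissimilarities: the corners of a unit square (hypothetical data).
r2 = math.sqrt(2)
toy_dissim = [
    [0, 1, r2, 1],
    [1, 0, 1, r2],
    [r2, 1, 0, 1],
    [1, r2, 1, 0],
]

initial = mds(toy_dissim, steps=0)  # the random starting configuration
solution = mds(toy_dissim, steps=500)
```

Similarity judgments would first be converted to dissimilarities (for example, by subtracting each rating from the maximum rating) before being passed to such a procedure.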
Steyvers, M. (2002). Multidimensional Scaling. In:
Encyclopedia of
Cognitive Science. Nature Publishing Group, London, UK.

Steyvers, M., & Busey, T. (2000). Predicting Similarity Ratings to Faces
using Physical Descriptions. In M. Wenger, & J. Townsend (Eds.), Computational, geometric, and process perspectives on facial cognition:
Contexts and challenges. Lawrence Erlbaum Associates.

Steyvers, M.,
Shiffrin, R.M., & Nelson, D.L. (2004). Word Association
Spaces for Predicting Semantic Similarity Effects in Episodic Memory. In
A. Healy (Ed.), Experimental Cognitive Psychology and its Applications.

In research
on causal reasoning, I study people's ability to infer causal structure
from both observation and intervention, and to choose informative
interventions on the basis of purely observational data. I develop
computational models of how people infer causal structure from data and
how they plan intervention experiments, based on the representational
framework of causal Bayesian networks and the inferential principles of
optimal Bayesian decision-making and maximizing expected information
gain.
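
The principle of choosing interventions by expected information gain can be sketched as follows: for each candidate intervention, compute how much the uncertainty over candidate causal structures is expected to decrease once the outcome is observed. This is a generic sketch, not the model from the paper; the function names and numbers are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def expected_information_gain(prior, likelihoods):
    """Expected reduction in uncertainty about the causal structure.

    prior:       P(hypothesis) for each candidate causal structure.
    likelihoods: likelihoods[h][o] = P(outcome o | hypothesis h) under the
                 intervention being evaluated.
    """
    gain = entropy(prior)
    for o in range(len(likelihoods[0])):
        joint = [prior[h] * likelihoods[h][o] for h in range(len(prior))]
        p_outcome = sum(joint)
        if p_outcome == 0:
            continue
        posterior = [x / p_outcome for x in joint]
        gain -= p_outcome * entropy(posterior)
    return gain


# Two hypothetical causal structures with equal prior probability.
prior = [0.5, 0.5]
# An intervention whose outcome distribution differs between the structures ...
diagnostic = [[0.9, 0.1], [0.2, 0.8]]
# ... and one whose outcome is equally likely under both structures.
uninformative = [[0.3, 0.7], [0.3, 0.7]]

gain_diagnostic = expected_information_gain(prior, diagnostic)
gain_uninformative = expected_information_gain(prior, uninformative)
```

An observer who maximizes expected information gain would pick the intervention with the larger value, here the diagnostic one.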
Steyvers, M., Tenenbaum, J., Wagenmakers, E.J., & Blum, B. (2003).
Inferring Causal Networks from Observations and Interventions. Cognitive Science, 27, 453-489.

Some of my
work has involved testing subjects on the web (see
website here) and comparing their performance to subjects tested
in the lab. Part of the success in luring web subjects to my site may
have something to do with the "Hall of Fame", where each subject's
performance score is posted directly and compared to other subjects. So
far, I have published two papers where I show detailed comparisons
between web and lab subjects.
Steyvers, M., & Malmberg,
K. (2003). The effect of
normative context variability on recognition memory. Journal of
Experimental Psychology: Learning, Memory, & Cognition, 29(5),
760-766.

Steyvers, M., Tenenbaum, J., Wagenmakers, E.J., & Blum, B. (2003).
Inferring Causal Networks from Observations and Interventions. Cognitive Science, 27, 453-489.
