Presenter: Adriana Caraeni
Group Members: Romaisa Fatima
Faculty Sponsor: James Allan
School: UMass Amherst
Research Area: Computer Science
ABSTRACT
This project investigates the internal representations of information retrieval (IR) features within the RankLLaMA 7B language model through layer-wise probing analysis. We extract neuron activations from all 32 layers of RankLLaMA as it processes query-document pairs from the MS MARCO dataset and train Ridge regression models to predict around two dozen distinct IR features from these activations. Our feature set spans traditional retrieval metrics (BM25, TF-IDF cosine similarity, KL and JS divergence), term frequency features (min TF, normalized min TF, stream length), position-based features (proximity score, position bias, order preservation, term clustering), and advanced features (co-occurrence score, phrase matching, rare term score, query type score, semantic coverage, title boost, document length normalization, and TF saturation). By computing R^2 scores across layers, we identify where specific IR features are most strongly represented in the model's internal activations. Our results provide insights into how neural ranking models encode and process retrieval-relevant information across their layers, contributing to the mechanistic interpretability of language models for information retrieval tasks.
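The layer-wise probing procedure described above can be sketched as follows. This is a minimal illustrative example, not the project's actual pipeline: the activations and the target IR feature below are synthetic stand-ins (small dimensions for brevity, with the feature linearly recoverable from one layer by construction), and the Ridge/R^2 setup uses standard scikit-learn components.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Hypothetical stand-in data: activations for 1000 query-document pairs
# across 4 layers (RankLLaMA 7B has 32), hidden size 64 for brevity.
n_pairs, n_layers, hidden = 1000, 4, 64
activations = rng.normal(size=(n_layers, n_pairs, hidden))

# Synthetic target: one IR feature (e.g., a BM25-like score), constructed
# to be linearly decodable from layer 2's activations plus small noise.
w = rng.normal(size=hidden)
feature = activations[2] @ w + rng.normal(scale=0.1, size=n_pairs)

# Layer-wise probing: fit one Ridge regressor per layer on held-out splits
# and record the test-set R^2 for each layer.
scores = []
for layer in range(n_layers):
    X_tr, X_te, y_tr, y_te = train_test_split(
        activations[layer], feature, test_size=0.2, random_state=0
    )
    probe = Ridge(alpha=1.0).fit(X_tr, y_tr)
    scores.append(r2_score(y_te, probe.predict(X_te)))

# The layer with the highest R^2 is where the feature is most strongly
# (linearly) represented.
best = int(np.argmax(scores))
print(best)
```

In the full study this loop runs over all 32 layers and each of the roughly two dozen IR features, yielding a per-feature R^2 profile across depth.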