Schwertlilien
As a recorder: notes and ideas.

Conversational search paper 1

  • presents a framework called LLMCS that performs few-shot conversational query rewriting for conversational search.
  • 3 methods to generate multiple query rewrites & hypothetical responses
    • aggregates them into an integrated representation (to robustly represent the user's real contextual search intent)
  • Experiments: on CAsT-19 and CAsT-20

**Keywords:** conversational search, passage retrieval, LLMs

Intro

A common and intuitive thread of existing methods for CS: ==CQR== (conversational query rewriting): employ a rewriting model to rewrite the current query into a de-contextualized one, then freely adopt any ad-hoc search model.

==BUT these rewriting models CANNOT handle real conversational search scenarios==

What is few-shot conversational query rewriting?

Aim: reformulate a concise conversational query into a fully specified, context-independent query that can be effectively handled by existing information retrieval systems. The paper does this with a few-shot generative approach (only a small number of demonstration examples).

3 prompting methods in the LLMCS framework:

  • rewriting prompt
  • rewriting-then-response prompt
  • rewriting-and-response prompt

Main idea: use an LLM to generate query rewrites & longer hypothetical system responses.

==BUT HOW TO AGGREGATE THESE RESULTS?==

Advantages:

  • additionally generating hypothetical responses (improves performance)
  • filtering out incorrect search intents and enhancing the reasonable ones (via aggregation)
Methodology

  1. recap of CQR – it just restates the task (nothing new)

First, the task: conversational passage retrieval.
$$
C^t=(q^1,r^1,\ldots,q^{t-1},r^{t-1})
$$
$q$: query, $r$: system response, $C^t$: context at turn $t$
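To make the notation concrete, a minimal sketch (my own, not from the paper) of holding $C^t$ as a list of (query, response) pairs and serializing it for a prompt; the `Question:`/`Response:` labels are assumed marker words:

```python
# Sketch: the context C^t as a list of (query, response) pairs plus the
# current query q^t. The "Question:"/"Response:" labels are assumptions.
def flatten_context(history: list[tuple[str, str]], current_query: str) -> str:
    """Serialize C^t and q^t into the text form used inside a prompt."""
    lines: list[str] = []
    for q, r in history:
        lines.append(f"Question: {q}")
        lines.append(f"Response: {r}")
    lines.append(f"Question: {current_query}")
    return "\n".join(lines)
```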

  2. three prompting methods
  • Rewriting prompt (REW):

Under the multi-turn information-seeking dialog context:

add a few complete multi-turn search-oriented conversations with their corresponding manual rewrites as demonstration examples.

For the $t$-th turn: provide $C^t$ followed by only the marker "Rewrite:" for the model to complete.

Even though this is simple prompting, it is highly efficient.
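A sketch of assembling this few-shot prompt, assuming each conversation has already been serialized to text (e.g., by a serializer like the one sketched earlier); the instruction wording is my guess, only the layout (instruction, demonstrations with manual rewrites, current context ending in a bare "Rewrite:") follows the description above:

```python
# Sketch of the few-shot rewriting (REW) prompt. Instruction wording is
# an assumption; only the overall layout follows the paper's description.
INSTRUCTION = (
    "Given an information-seeking dialog, rewrite the last question "
    "into a fully specified, de-contextualized question."
)

def build_rew_prompt(demos: list[tuple[str, str]], current_context: str) -> str:
    """demos: (serialized conversation, manual rewrite) pairs."""
    parts = [INSTRUCTION]
    for conv, manual_rewrite in demos:
        parts.append(f"{conv}\nRewrite: {manual_rewrite}")
    parts.append(f"{current_context}\nRewrite:")  # the LLM completes this
    return "\n\n".join(parts)
```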

  • Rewriting-then-response prompt

use the just-generated rewrite -> generate a hypothetical response (containing relevant information that answers the question)

The instruction is changed to: generate a correct answer given the rewrite.
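A matching sketch of the second-stage prompt, again with assumed wording: feed the just-generated rewrite back in and ask for a hypothetical passage that answers it.

```python
# Sketch of the rewriting-then-response (RTR) second stage: ask the model
# to answer the rewrite it just produced. Wording is an assumption.
def build_rtr_response_prompt(context: str, rewrite: str) -> str:
    return (
        "Write a short passage of information that correctly answers "
        "the rewritten question.\n\n"
        f"{context}\n"
        f"Rewrite: {rewrite}\n"
        "Response:"
    )
```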

  • Rewriting-and-response prompt

Instead of the two-stage manner, generate both at once with an integrated instruction.

==why?== "If a model does not generate the marker word (like Response:) correctly, we consider it a generation failure and drop it."

Which means: if the generated text doesn't include the marker word, we do not treat it as a response, even if the text is quite close to one.

OK, it just keeps the output parsing simple (I think so).
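This drop rule is simple to implement; a sketch with `Rewrite:`/`Response:` as the marker words:

```python
# Parse a rewriting-and-response (RAR) generation. Per the rule quoted
# above, a generation missing a marker word is a failure (return None).
def parse_rar_output(text: str) -> tuple[str, str] | None:
    if "Rewrite:" not in text or "Response:" not in text:
        return None  # generation failure: drop it
    rewrite_part, _, response = text.partition("Response:")
    _, _, rewrite = rewrite_part.partition("Rewrite:")
    if not rewrite.strip():
        return None  # markers out of order: also a failure
    return rewrite.strip(), response.strip()
```

When sampling multiple generations, failures simply shrink the candidate pool before aggregation.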

  3. introduce chain-of-thought

make the LLM decompose a reasoning task into multiple intermediate steps

With chain-of-thought added, the demonstration completion changes from

Rewrite: {rewrite}

into

Rewrite: {chain-of-thought}. So the question should be rewritten as: {rewrite}
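In code, the demonstration completion with and without CoT (a trivial sketch; the `cot` text would be a short human-written reasoning trace in the demonstrations):

```python
# Demonstration completion line, with and without chain-of-thought.
def format_completion(rewrite: str, cot: str | None = None) -> str:
    if cot is None:
        return f"Rewrite: {rewrite}"
    return f"Rewrite: {cot}. So the question should be rewritten as: {rewrite}"
```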

  4. aggregation and retrieval

aggregate (multiple rewrites and hypothetical responses) into an integrated representation.

  • $N$ query rewrites $Q=(\hat q_1,\ldots,\hat q_N)$
  • $M$ hypothetical responses $R_i=(\hat r_{i1},\ldots,\hat r_{iM})$ for each rewrite $\hat q_i$

Encode each of them with an encoder $f(\cdot)$ into a high-dimensional search intent vector, then aggregate these intent vectors into a final intent vector $v$ used for retrieval.
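A sketch of this encoding step, using an off-the-shelf sentence encoder to stand in for $f(\cdot)$; the `sentence-transformers` library and model name here are my assumptions, not the paper's actual retriever:

```python
# Stand-in for the intent encoder f(.): any dense text encoder works here.
# Library and model choice are assumptions, not the paper's encoder.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def f(texts: list[str]) -> np.ndarray:
    """Map texts to L2-normalized intent vectors, shape (len(texts), d)."""
    return encoder.encode(texts, normalize_embeddings=True)
```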

3 ways (a numpy sketch of all three follows this list):

  • MaxProb: directly use the generations with the highest probability. (==high efficiency==)
    $$
    \begin{aligned}
    \text{For Rewrite: } v&=f(\hat q_1)\\
    \text{For RTR and RAR: } v&=\frac{f(\hat q_1)+f(\hat r_{11})}{2}
    \end{aligned}
    $$

  • Self-Consistency: proposed for reasoning tasks, where the final answer comes from a fixed answer set. But in conversational search there is no fixed standard answer.

    • still, some of the generated rewrites and responses should be correct.

    • choose the intent vector closest to the cluster center of all intent vectors as the final vector.

    • $$
      \begin{aligned}
      \hat q^*&=\frac 1 N \sum^N_{i=1}f(\hat q_i)\\
      v&=\arg\max_{f(\hat q_i)}f(\hat q_i)^\top\cdot \hat q^*
      \end{aligned}
      $$

    • $\hat q^*$ is the cluster center vector.

    • For RTR: first select the rewrite intent vector $f(\hat q_k)$, then select the response vector $f(\hat r_{kz})$ from the responses generated based on $\hat q_k$. $k$ and $z$ are the finally selected indexes of the rewrite and the response, respectively.

    • $$
      \begin{aligned}
      k&=\arg\max_i f(\hat q_i)^\top\cdot \hat q^*\\
      \hat r^*_k&=\frac 1 M \sum^M_{j=1}f(\hat r_{kj})\\
      z&=\arg\max_j f(\hat r_{kj})^\top\cdot \hat r^*_k\\
      v&=\frac{f(\hat q_k)+f(\hat r_{kz})}{2}
      \end{aligned}
      $$

    • For RAR: no need to select a response! (each rewrite $\hat q_k$ has exactly one associated response $\hat r_{k1}$)

    • $$
      v=\frac{f(\hat q_k)+f(\hat r_{k1})}{2}
      $$

  • Mean: take the average of all intent vectors.

    • $$
      \begin{aligned}
      ①\quad v&=\frac 1 N \sum^N_{i=1}f(\hat q_i)\\
      ②\quad v&=\frac{\sum^N_{i=1}\left[f(\hat q_i)+\sum^M_{j=1}f(\hat r_{ij})\right]}{N(M+1)}
      \end{aligned}
      $$

    • ① for Rewrite, ② for RTR & RAR.
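Putting the three strategies into numpy, assuming the vectors are already encoded and the rewrites are sorted by generation probability (so index 0 plays the role of $\hat q_1$); all names here are illustrative:

```python
# Sketch of the three aggregation strategies. Q: (N, d) rewrite vectors
# f(q_i), sorted by generation probability (index 0 = most probable);
# R: (N, M, d) response vectors f(r_ij). Names are illustrative.
import numpy as np

def maxprob(Q: np.ndarray, R: np.ndarray | None = None) -> np.ndarray:
    if R is None:                      # Rewrite prompt
        return Q[0]
    return (Q[0] + R[0, 0]) / 2        # RTR / RAR

def self_consistency(Q: np.ndarray, R: np.ndarray | None = None,
                     rar: bool = False) -> np.ndarray:
    q_star = Q.mean(axis=0)            # cluster center of rewrite vectors
    k = int(np.argmax(Q @ q_star))     # rewrite closest to the center
    if R is None:                      # Rewrite prompt
        return Q[k]
    if rar:                            # RAR: one response per rewrite
        return (Q[k] + R[k, 0]) / 2
    r_star = R[k].mean(axis=0)         # RTR: center of rewrite k's responses
    z = int(np.argmax(R[k] @ r_star))  # response closest to that center
    return (Q[k] + R[k, z]) / 2

def mean(Q: np.ndarray, R: np.ndarray | None = None) -> np.ndarray:
    if R is None:                      # ① Rewrite prompt
        return Q.mean(axis=0)
    n, m, _ = R.shape                  # ② RTR / RAR
    return (Q.sum(axis=0) + R.sum(axis=(0, 1))) / (n * (m + 1))
```

The final vector $v$ from any of these is then used in place of a single query embedding for dense retrieval.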
