quit your lying

The framework pulls in external sources to enhance accuracy. Does it live up to the hype?

Chris Stokel-Walker

Credit: Aurich Lawson | Getty Images

We’ve been living through the generative AI boom for nearly a year and a half now, following the late 2022 release of OpenAI’s ChatGPT. But despite transformative effects on companies’ share prices, generative AI tools powered by large language models (LLMs) still have major drawbacks that have kept them from being as useful as many would like them to be. Retrieval augmented generation, or RAG, aims to fix some of those drawbacks.

Perhaps the most prominent drawback of LLMs is their tendency toward confabulation (also called “hallucination”): a statistical gap-filling phenomenon that occurs when AI language models are asked to reproduce knowledge that wasn’t present in their training data. They generate plausible-sounding text that tends toward accuracy when the training data is solid but otherwise may be completely made up.

Relying on confabulating AI models gets people and companies in trouble, as we’ve covered in the past. In 2023, we saw two instances of lawyers citing legal cases, confabulated by AI, that didn’t exist. We’ve covered claims against OpenAI in which ChatGPT confabulated and accused innocent people of doing terrible things. In February, we wrote about Air Canada’s customer service chatbot inventing a refund policy, and in March, a New York City chatbot was caught confabulating city regulations.

So if generative AI aims to be the technology that propels humanity into the future, someone needs to iron out the confabulation kinks along the way. That’s where RAG comes in. Its proponents hope the technique will help turn generative AI technology into reliable assistants that can supercharge productivity without requiring a human to double-check or second-guess the answers.

“RAG is a way of improving LLM performance, in essence by blending the LLM process with a web search or other document look-up process” to help LLMs stick to the facts, according to Noah Giansiracusa, associate professor of mathematics at Bentley University.

Let's take a closer look at how it works and what its limitations are.

A framework for enhancing AI accuracy

Although RAG is now seen as a technique to help fix issues with generative AI, it actually predates ChatGPT. The term was coined in a 2020 academic paper by researchers at Facebook AI Research (FAIR, now Meta AI Research), University College London, and New York University.

As we've mentioned, LLMs struggle with facts. Google’s entry into the generative AI race, Bard, made an embarrassing error about the James Webb Space Telescope in its first public demonstration back in February 2023, and the mistake wiped around $100 billion off the value of parent company Alphabet. LLMs produce the most statistically likely response based on their training data and don’t understand anything they output, meaning they can present false information that seems accurate if you don't have expert knowledge of the subject.

LLMs also lack up-to-date knowledge and the ability to identify gaps in their knowledge. “When a human tries to answer a question, they can rely on their memory and come up with a response on the fly, or they could do something like Google it or peruse Wikipedia and then try to piece an answer together from what they find there—still filtering that info through their internal knowledge of the matter,” said Giansiracusa.

But LLMs aren’t humans, of course. Their training data can age quickly, particularly in more time-sensitive queries. In addition, the LLM often can’t distinguish specific sources of its knowledge, as all its training data is blended together into a kind of soup.

In theory, RAG should make keeping AI models up to date far cheaper and easier. “The beauty of RAG is that when new information becomes available, rather than having to retrain the model, all that’s needed is to augment the model’s external knowledge base with the updated information,” said Melanie Peterson, senior director of TrainAI at RWS Group, a tech firm. “This reduces LLM development time and cost while enhancing the model’s scalability.”

How does RAG work?

By default, an LLM will pull statistically plausible-sounding output from its training data, with some randomness inserted along the way to have the outputs appear more human-like. RAG introduces a new information-retrieval component into the process to search through external data. The data could be from any number of sources and in multiple formats.
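To make that default “statistically plausible output, with some randomness” behavior concrete before getting to the retrieval piece, here is a minimal sketch of temperature sampling, a common mechanism for injecting that randomness. The five-token vocabulary and logit scores are invented for illustration; real models sample from vocabularies of tens of thousands of tokens.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_next_token(logits: np.ndarray, temperature: float = 0.8) -> int:
    """Sample the next token from the model's scores: a softmax turns
    scores into probabilities, and temperature controls how far the
    sampler strays from the single most likely token."""
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# Toy scores over a five-token vocabulary; higher = more statistically likely.
print(sample_next_token(np.array([2.0, 1.0, 0.5, 0.1, -1.0])))
```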

As Peter van der Putten, director of the AI Lab at software company Pegasystems and assistant professor at Leiden University, puts it, “When a user has a question, the RAG first performs a search in all sources for text fragments relevant to the query. Then a prompt is sent to the generative AI model or service to request to answer the user question based on the search results.”

To find relevant information in the external data that could help answer the user’s query, the system converts the query into a vector representation (a dense numerical encoding of the text’s meaning), then compares that vector against the vector database of external data it holds. Asking an LLM about Apple's business performance, for instance, could trigger a search for all mentions of Apple in the external data, alongside any mentions of businesses more widely; the results are ranked by how useful the information is and woven into the response presented to the user.
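Here is a minimal, self-contained sketch of that retrieval step, reusing the Apple example. Everything in it is an assumption for illustration: `embed` is a toy word-hashing stand-in for a learned embedding model, and the document fragments are invented. A production system would use a real embedding model and a dedicated vector store.

```python
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy embedding: hash each word into a fixed-size unit vector.
    A real RAG system would call a learned embedding model instead."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# A miniature "vector database": text fragments stored with their vectors.
documents = [
    "Apple reported strong quarterly results driven by services.",
    "The James Webb Space Telescope launched in December 2021.",
    "Smartphone makers saw mixed demand across global markets.",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank stored fragments by cosine similarity to the query
    (vectors are unit-normalized, so a dot product is the cosine)."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: float(q @ item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

print(retrieve("How is Apple's business performing?"))
```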

What makes RAG so powerful is that it can then augment the user prompt with the new information it finds from its external data. It will try to harness that information to produce a better prompt that is more likely to elicit a higher-quality response. And it can be set up to constantly update that external data, all while not altering the underlying model that sits behind the process.

Even better, each answer the LLM produces can be fed into the external data used during RAG, in theory helping to improve accuracy. An LLM using RAG can also potentially recall how it answered previous similar questions.
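Continuing the toy sketch above, the augmentation step splices the retrieved fragments into the prompt. Here `call_llm` is a hypothetical placeholder for whatever model API is actually in use, and the final commented line shows how an answer could, in principle, be fed back into the store.

```python
def build_augmented_prompt(question: str, k: int = 2) -> str:
    """Prepend retrieved fragments so the model answers from the
    supplied context rather than from its training data alone."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, k))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_augmented_prompt("How is Apple's business performing?")
print(prompt)
# answer = call_llm(prompt)              # placeholder for the model call
# index.append((answer, embed(answer)))  # optional feedback loop
```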

And crucially, AI models using RAG can often cite the source of their claims because their information is held within that vector database. If an LLM produces an incorrect answer and it’s identified, the source of that incorrect information can be pinpointed within the vector database and be removed or corrected.
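A sketch of how that citation-and-correction workflow might look: each stored fragment carries a source identifier, so answers can be traced back to their origin and an entire discredited source can be purged. The fragment texts and source IDs here are invented for illustration.

```python
# Fragments stored alongside the source they came from.
store = [
    {"text": "Apple reported strong quarterly results.",
     "source": "earnings-recap.html"},
    {"text": "Refunds may be claimed up to 90 days after travel.",
     "source": "policy-draft-v2.html"},
]

def cite(fragments: list[dict]) -> str:
    """Format fragments with their sources so an answer can cite them."""
    return "\n".join(f"{f['text']} [source: {f['source']}]" for f in fragments)

def purge(source_id: str) -> None:
    """Drop every fragment from a source found to be incorrect."""
    store[:] = [f for f in store if f["source"] != source_id]

print(cite(store))
purge("policy-draft-v2.html")  # e.g., after the policy text proved wrong
```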

RAG’s potential applications

Beyond the general benefits RAG is thought to provide to generative AI outputs, it can also be used to augment the “knowledge” LLMs draw upon in specialist subjects such as medicine or history. “When you combine RAG with domain-specific fine-tuning, the result is a more robust, reliable, and refined LLM fit for business purpose,” said Peterson.

RAG is already making a difference in real-world applications, according to some AI experts. “In my business role, we and our clients are exploring RAGs for lots of purposes because of how it steers AI in the right direction,” said van der Putten. “These controls will enable wider use of generative AI in business and elsewhere.”

But van der Putten—who has one foot in business and one foot in academia—believes RAG has benefits beyond the business world. “In my academic research, we are looking into interesting societal applications as well,” he said. “For instance, we are developing a RAG-controlled voting assistant for electoral systems based on proportional representation like in the Netherlands.”

The system would work by letting voters explore points of view of political parties from across the aisle on topics the voter provides. “The goal is to decrease polarization and make election choices more based on stated policy and actual proposals, motions, and party voting behavior in parliament,” he said.

Currently, OpenAI's ChatGPT does a form of RAG when it performs a web search related to a user question, providing more up-to-date information and a link to a source that a user can verify. Google's Gemini AI models do the same. And GPTs from OpenAI can be configured to use information from external data sources, which is also a form of RAG.

Does it actually solve the hallucination problem?

To hear some talk—I was at the recent International Journalism Festival in Perugia, Italy, where plenty of panelists mentioned RAG knowingly as a solution to the confabulation problem for generative AI—RAG seems like it will solve all of AI’s issues. But will it really?

When it comes to tackling generative AI’s confabulation problem, “RAG is one part of the solution,” said David Foster, founding partner of Applied Data Science Partners and the author of Generative Deep Learning: Teaching Machines to Paint, Write, Compose, and Play.

But Foster is clear that it’s not a catch-all solution to the issue of an LLM making things up. “It is not a direct solution because the LLM can still hallucinate around the source material in its response,” he said.

To explain why RAG isn’t a perfect solution, Foster drew an analogy. “Suppose a student takes an English literature exam but doesn’t have access to the source text in the exam itself,” he said. “They may be able to write a decent essay, but there is a high likelihood that they will misremember quotes or incorrectly recall the order of events.”

RAG is like providing easy access to the source material to jog the student's memory. (Anthropomorphizing AI tools is problematic, of course, but it can be difficult to avoid when speaking in analogies.)

“If you give the student the source text, they would be able to 'look up' the relevant information and therefore reduce errors in recall,” said Foster. “This is RAG. However, the student may still retrieve information from the wrong place in the book—so still draws incorrect hallucinatory conclusions—or they may hallucinate additional information that wasn’t present.”

Because of this, Foster calls RAG a “mitigation” rather than a “cure for hallucination.”

A step forward—but not a silver bullet

The million-dollar question is whether it’s worth expending time, effort, and money on integrating RAG into generative AI deployments. Bentley University’s Giansiracusa isn’t sure it is. “The LLM is still guessing answers, but with RAG, it's just that the guesses are often improved because it is told where to look for answers,” he said. The problem remains the same as with all LLMs: “There's still no deep understanding of words and the world,” he said.

Giansiracusa also pointed out that the rise of generative AI-aided search results—and the recent "enshittification" of the web through AI-generated content—means that what might at one point have been a halfway useful solution to a fundamental flaw in generative AI tools could become less useful if AI language models draw from AI-written junk found online.

We've seen that issue recently with Google's AI Overview, which relies on gamed page rankings to determine the "accurate" sources that Google's AI model then pulls answers from.

“We know web search is riddled with misinformation and we know LLMs are riddled with hallucinations, so you can do the math here on what happens when you combine the two,” Giansiracusa said.

Staff Picks
A recent research paper tested the hypothesis that RAG would reduce hallucinations and improve recall when applied to legal texts and legal tasks (summarizing case law, drafting documents, etc.). The conclusion was negative: specialized models still hallucinated between 17 and 33 percent of the time (a slight improvement over general-purpose models, but not much), while only slightly improving recall.

The paper: “Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools” by Varun Magesh, Faiz Surani, Matthew Dahl, Mirac Suzgun, Christopher D. Manning, and Daniel E. Ho.