MMLM: Gemini, a translation and commentary on "Introducing Gemini: our largest and most capable AI model"

2023-12-26 Author: 考证青年

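Overview: On December 6, 2023, Google released Gemini, a large-scale multimodal model. The release marks a new stage in the development of language models: Gemini's multimodal and general-purpose capabilities are reported to be clearly ahead of most mainstream large models. It is Google's largest and most capable AI model to date. Gemini was built from the ground up to be multimodal, so it can generalize across and seamlessly understand, operate on, and combine different types of information, including text, images, audio, video, and code. This gives it sophisticated multimodal reasoning and advanced coding capabilities. Gemini can power products, support more advanced customer service interactions, and assist content creation and marketing campaigns, and it performs well on natural language, code generation, and competitive programming tasks.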

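Background: As AI technology advances, language models keep improving, but existing models have shown shortcomings in multimodal processing and in consistency.

Pain point addressed: the knowledge and capabilities a future AI assistant should have, namely being multimodal, general-purpose, and reliable.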

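Solution:

>> Gemini is trained with a focus on multimodality from the start, so it can naturally understand and reason about many kinds of input.

>> Gemini exceeds the current state of the art across language, image, and knowledge evaluations, which indicates strong multimodal capability.

>> Gemini also performs well on natural language, code generation, and competitive programming tasks.

>> Gemini's three versions are optimized for different scenarios and can run efficiently on servers and on devices.

>> The Gemini series was developed with an emphasis on responsibility and safety, using multiple mechanisms to improve model safety.

>> Gemini will be used in multiple Google products and will also be opened to developers through an API.

In short, Gemini substantially raises the level of Google's models in multimodal capability, generality, and runtime efficiency, addresses the shortcomings of earlier models in these areas, and may help advance AI assistants.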

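Translation and commentary on "Introducing Gemini: our largest and most capable AI model"

Note from Sundar

Introducing Gemini

State-of-the-art performance

See more details in our Gemini technical report.

Gemini surpasses state-of-the-art performance on a range of benchmarks including text and coding.

Gemini surpasses state-of-the-art performance on a range of multimodal benchmarks.

Next-generation capabilities

Learn more about Gemini's capabilities and see how it works.

Sophisticated reasoning

Gemini unlocks new scientific insights

Understanding text, images, audio and more

Gemini explains reasoning in math and physics

Advanced coding

Gemini excels at coding and competitive programming

See more details in our AlphaCode 2 technical report.

Scalable and efficient

More reliable, scalable and efficient

A row of Cloud TPU v5p AI accelerator supercomputers in a Google data center.

Responsibility and safety

Built with responsibility and safety at the core

Availability

Making Gemini available to the world

Gemini Pro in Google products

Try it online

Building with Gemini

Gemini Ultra coming soon

The Gemini era: enabling a future of innovation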

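Translation and commentary on "Introducing Gemini: our largest and most capable AI model"

Source

Source: Introducing Gemini: Google's most capable AI model yet

Date

December 6, 2023

Authors

Sundar Pichai, CEO of Google and Alphabet

Demis Hassabis, CEO and Co-Founder, Google DeepMind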

Note from Sundar

A note from Google and Alphabet CEO Sundar Pichai:

Every technology shift is an opportunity to advance scientific discovery, accelerate human progress, and improve lives. I believe the transition we are seeing right now with AI will be the most profound in our lifetimes, far bigger than the shift to mobile or to the web before it. AI has the potential to create opportunities, from the everyday to the extraordinary, for people everywhere. It will bring new waves of innovation and economic progress and drive knowledge, learning, creativity, and productivity on a scale we haven't seen before.

That's what excites me: the chance to make AI helpful for everyone, everywhere in the world.

Nearly eight years into our journey as an AI-first company, the pace of progress is only accelerating: millions of people are now using generative AI across our products to do things they couldn't even a year ago, from finding answers to more complex questions to using new tools to collaborate and create. At the same time, developers are using our models and infrastructure to build new generative AI applications, and startups and enterprises around the world are growing with our AI tools.

This is incredible momentum, and yet, we're only beginning to scratch the surface of what's possible.

We're approaching this work boldly and responsibly. That means being ambitious in our research and pursuing the capabilities that will bring enormous benefits to people and society, while building in safeguards and working collaboratively with governments and experts to address risks as AI becomes more capable. And we continue to invest in the very best tools, foundation models, and infrastructure and bring them to our products and to others, guided by our AI Principles.

Now, we're taking the next step on our journey with Gemini, our most capable and general model yet, with state-of-the-art performance across many leading benchmarks. Our first version, Gemini 1.0, is optimized for different sizes: Ultra, Pro and Nano. These are the first models of the Gemini era and the first realization of the vision we had when we formed Google DeepMind earlier this year. This new era of models represents one of the biggest science and engineering efforts we've undertaken as a company. I'm genuinely excited for what's ahead, and for the opportunities Gemini will unlock for people everywhere.

- Sundar

Introducing Gemini

By Demis Hassabis, CEO and Co-Founder of Google DeepMind, on behalf of the Gemini team

AI has been the focus of my life's work, as for many of my research colleagues. Ever since programming AI for computer games as a teenager, and throughout my years as a neuroscience researcher trying to understand the workings of the brain, I've always believed that if we could build smarter machines, we could harness them to benefit humanity in incredible ways.

This promise of a world responsibly empowered by AI continues to drive our work at Google DeepMind. For a long time, we've wanted to build a new generation of AI models, inspired by the way people understand and interact with the world. AI that feels less like a smart piece of software and more like something useful and intuitive: an expert helper or assistant.

Today, we're a step closer to this vision as we introduce Gemini, the most capable and general model we've ever built.

Gemini is the result of large-scale collaborative efforts by teams across Google, including our colleagues at Google Research. It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across, and combine different types of information including text, code, audio, image and video.

Gemini: our largest and most capable AI model

Gemini is also our most flexible model yet, able to efficiently run on everything from data centers to mobile devices. Its state-of-the-art capabilities will significantly enhance the way developers and enterprise customers build and scale with AI.

We've optimized Gemini 1.0, our first version, for three different sizes:

>> Gemini Ultra: our largest and most capable model for highly complex tasks.

>> Gemini Pro: our best model for scaling across a wide range of tasks.

>> Gemini Nano: our most efficient model for on-device tasks.

State-of-the-art performance

We've been rigorously testing our Gemini models and evaluating their performance on a wide variety of tasks. From natural image, audio and video understanding to mathematical reasoning, Gemini Ultra's performance exceeds current state-of-the-art results on 30 of the 32 widely-used academic benchmarks used in large language model (LLM) research and development.

With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities.

Our new benchmark approach to MMLU enables Gemini to use its reasoning capabilities to think more carefully before answering difficult questions, leading to significant improvements over just using its first impression.
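The post does not spell out how "thinking more carefully" is implemented. One plausible reading, offered here only as an illustrative sketch, is a consensus rule: sample several reasoned answers, accept the majority answer when agreement is high, and otherwise fall back to the model's single first-impression answer. The function name `route_by_consensus`, the 0.6 threshold, and the single-letter answer format are all assumptions made for this example and are not taken from the post.

```python
from collections import Counter
from typing import List


def route_by_consensus(
    sampled_answers: List[str],
    first_impression: str,
    threshold: float = 0.6,  # hypothetical value, chosen only for illustration
) -> str:
    """Pick a final multiple-choice answer.

    sampled_answers: final answers extracted from several independently
        sampled reasoning chains (e.g. ["B", "B", "A", "B"]).
    first_impression: the answer produced without extra reasoning.
    threshold: minimum share of samples that must agree before the
        majority answer is trusted.
    """
    if not sampled_answers:
        # Nothing to vote on, so keep the first impression.
        return first_impression

    # Find the most frequent sampled answer and how often it occurs.
    majority_answer, votes = Counter(sampled_answers).most_common(1)[0]
    consensus = votes / len(sampled_answers)

    # Trust the vote only when agreement is strong enough.
    return majority_answer if consensus >= threshold else first_impression
```

For example, `route_by_consensus(["B", "B", "B", "A"], "C")` keeps the majority answer because three of four samples agree, while four samples that all disagree fall back to the first-impression answer.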

Gemini Ultra also achieves a state-of-the-art score of 59.4% on the new MMMU benchmark, which consists of multimodal tasks spanning different domains requiring deliberate reasoning.

With the image benchmarks we tested, Gemini Ultra outperformed previous state-of-the-art models, without assistance from optical character recognition (OCR) systems that extract text from images for further processing. These benchmarks highlight Gemini's native multimodality and indicate early signs of Gemini's more complex reasoning abilities.

See more details in our Gemini technical report.

Gemini surpasses state-of-the-art performance on a range of benchmarks including text and coding.

Gemini surpasses state-of-the-art performance on a range of multimodal benchmarks.

Next-generation capabilities

Until now, the standard approach to creating multimodal models involved training separate components for different modalities and then stitching them together to roughly mimic some of this functionality. These models can sometimes be good at performing certain tasks, like describing images, but struggle with more conceptual and complex reasoning.

We designed Gemini to be natively multimodal, pre-trained from the start on different modalities. Then we fine-tuned it with additional multimodal data to further refine its effectiveness. This helps Gemini seamlessly understand and reason about all kinds of inputs from the ground up, far better than existing multimodal models, and its capabilities are state of the art in nearly every domain.

Learn more about Gemini's capabilities and see how it works.

Sophisticated reasoning

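Gemini 1.0's sophisticated multimodal reasoning capabilities can help make sense of complex written and visual information. This makes it uniquely skilled at uncovering knowledge that can be difficult to discern amid vast amounts of data.

Its remarkable ability to extract insights from hundreds of thousands of documents through reading, filtering and understanding information will help deliver new breakthroughs at digital speeds in many fields from science to finance.

Gemini unlocks new scientific insights

Understanding text, images, audio and more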

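Gemini 1.0 was trained to recognize and understand text, images, audio and more at the same time, so it better understands nuanced information and can answer questions relating to complicated topics. This makes it especially good at explaining reasoning in complex subjects like math and physics.

Gemini explains reasoning in math and physics

Advanced coding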

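Our first version of Gemini can understand, explain and generate high-quality code in the world's most popular programming languages, like Python, Java, C++, and Go. Its ability to work across languages and reason about complex information makes it one of the leading foundation models for coding in the world.

Gemini Ultra excels in several coding benchmarks, including HumanEval, an important industry standard for evaluating performance on coding tasks, and Natural2Code, our internal held-out dataset, which uses author-generated sources instead of web-based information.

Gemini can also be used as the engine for more advanced coding systems. Two years ago we presented AlphaCode, the first AI code generation system to reach a competitive level of performance in programming competitions.

Using a specialized version of Gemini, we created a more advanced code generation system, AlphaCode 2, which excels at solving competitive programming problems that go beyond coding to involve complex math and theoretical computer science.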

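When evaluated on the same platform as the original AlphaCode, AlphaCode 2 shows massive improvements, solving nearly twice as many problems, and we estimate that it performs better than 85% of competition participants, up from nearly 50% for AlphaCode. When programmers collaborate with AlphaCode 2 by defining certain properties for the code samples to follow, it performs even better.

We're excited for programmers to increasingly use highly capable AI models as collaborative tools that can help them reason about the problems, propose code designs and assist with implementation, so they can release apps and design better services, faster.

Gemini excels at coding and competitive programming

See more details in our AlphaCode 2 technical report.

Scalable and efficient

More reliable, scalable and efficient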

We trained Gemini 1.0 at scale on our AI-optimized infrastructure using Google's in-house designed Tensor Processing Units (TPUs) v4 and v5e. And we designed it to be our most reliable and scalable model to train, and our most efficient to serve.

On TPUs, Gemini runs significantly faster than earlier, smaller and less-capable models. These custom-designed AI accelerators have been at the heart of Google's AI-powered products that serve billions of users like Search, YouTube, Gmail, Google Maps, Google Play and Android. They've also enabled companies around the world to train large-scale AI models cost-efficiently.

Today, we're announcing the most powerful, efficient and scalable TPU system to date, Cloud TPU v5p, designed for training cutting-edge AI models. This next-generation TPU will accelerate Gemini's development and help developers and enterprise customers train large-scale generative AI models faster, allowing new products and capabilities to reach customers sooner.

A row of Cloud TPU v5p AI accelerator supercomputers in a Google data center.

Responsibility and safety

Built with responsibility and safety at the core

At Google, we're committed to advancing bold and responsible AI in everything we do. Building upon Google's AI Principles and the robust safety policies across our products, we're adding new protections to account for Gemini's multimodal capabilities. At each stage of development, we're considering potential risks and working to test and mitigate them.

Gemini has the most comprehensive safety evaluations of any Google AI model to date, including for bias and toxicity. We've conducted novel research into potential risk areas like cyber-offense, persuasion and autonomy, and have applied Google Research's best-in-class adversarial testing techniques to help identify critical safety issues in advance of Gemini's deployment.

To identify blind spots in our internal evaluation approach, we're working with a diverse group of external experts and partners to stress-test our models across a range of issues.

To diagnose content safety issues during Gemini's training phases and ensure its output follows our policies, we're using benchmarks such as Real Toxicity Prompts, a set of 100,000 prompts with varying degrees of toxicity pulled from the web, developed by experts at the Allen Institute for AI. Further details on this work are coming soon.

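To limit harm, we built dedicated safety classifiers to identify, label and sort out content involving violence or negative stereotypes, for example. Combined with robust filters, this layered approach is designed to make Gemini safer and more inclusive for everyone. Additionally, we're continuing to address known challenges for models such as factuality, grounding, attribution and corroboration.

Responsibility and safety will always be central to the development and deployment of our models. This is a long-term commitment that requires building collaboratively, so we're partnering with the industry and broader ecosystem on defining best practices and setting safety and security benchmarks through organizations like MLCommons, the Frontier Model Forum and its AI Safety Fund, and our Secure AI Framework (SAIF), which was designed to help mitigate security risks specific to AI systems across the public and private sectors. We'll continue partnering with researchers, governments and civil society groups around the world as we develop Gemini.

Availability

Making Gemini available to the world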

Gemini 1.0 is now rolling out across a range of products and platforms:

Gemini Pro in Google products

We're bringing Gemini to billions of people through Google products.

Starting today, Bard will use a fine-tuned version of Gemini Pro for more advanced reasoning, planning, understanding and more. This is the biggest upgrade to Bard since it launched. It will be available in English in more than 170 countries and territories, and we plan to expand to different modalities and support new languages and locations in the near future.

We're also bringing Gemini to Pixel. Pixel 8 Pro is the first smartphone engineered to run Gemini Nano, which is powering new features like Summarize in the Recorder app and rolling out in Smart Reply in Gboard, starting with WhatsApp, with more messaging apps coming next year.

In the coming months, Gemini will be available in more of our products and services like Search, Ads, Chrome and Duet AI.

We're already starting to experiment with Gemini in Search, where it's making our Search Generative Experience (SGE) faster for users, with a 40% reduction in latency in English in the U.S., alongside improvements in quality.

Try it online

Product test address:

Building with Gemini

Starting on December 13, developers and enterprise customers can access Gemini Pro via the Gemini API in Google AI Studio or Google Cloud Vertex AI.

Google AI Studio is a free, web-based developer tool to prototype and launch apps quickly with an API key. When it's time for a fully-managed AI platform, Vertex AI allows customization of Gemini with full data control and benefits from additional Google Cloud features for enterprise security, safety, privacy and data governance and compliance.

Android developers will also be able to build with Gemini Nano, our most efficient model for on-device tasks, via AICore, a new system capability available in Android 14, starting on Pixel 8 Pro devices. Sign up for an early preview of AICore.

Gemini Ultra coming soon

For Gemini Ultra, we're currently completing extensive trust and safety checks, including red-teaming by trusted external parties, and further refining the model using fine-tuning and reinforcement learning from human feedback (RLHF) before making it broadly available.

As part of this process, we'll make Gemini Ultra available to select customers, developers, partners and safety and responsibility experts for early experimentation and feedback before rolling it out to developers and enterprise customers early next year.

Early next year, we'll also launch Bard Advanced, a new, cutting-edge AI experience that gives you access to our best models and capabilities, starting with Gemini Ultra.

The Gemini era: enabling a future of innovation

This is a significant milestone in the development of AI, and the start of a new era for us at Google as we continue to rapidly innovate and responsibly advance the capabilities of our models.

We've made great progress on Gemini so far and we're working hard to further extend its capabilities for future versions, including advances in planning and memory, and increasing the context window for processing even more information to give better responses.

We're excited by the amazing possibilities of a world responsibly empowered by AI: a future of innovation that will enhance creativity, extend knowledge, advance science and transform the way billions of people live and work around the world.
