Gemini 3.0 is a fantastic model, but the sheer volume of updates is honestly overwhelming, and not every new feature deserves your attention. So, after a month of going through official guides and testing Gemini 3 with real work, I've narrowed down the five changes that actually matter for professionals. Let's get started.

Kicking things off with the first major update: improved multimodal understanding. In plain English, Gemini 3 has become much better at understanding images, video, and audio together. Previously, Gemini might have broken down a video into a collection of screenshots and an audio track. Now, Gemini 3 can process everything at once by linking audio cues to visual data. In practice, this means we can upload a short-form video, for example, and ask Gemini 3 to first watch the video to understand what's going on, then output specific and detailed recommendations for improvement. And it does exactly that, which is already pretty insane, right?

But let's see how this translates to actual work. Here, I've uploaded a screen recording onto Gemini and said, "I just recorded a walkthrough on how to toggle smart features in Gmail. Watch the recording and turn it into a clean step-by-step checklist that I can hand to a new hire so they can do it next week without asking me questions." In under 60 seconds, Gemini turns a messy one-time recording into a permanent training asset, which is a complete game changer for anyone working in operations.

Taking this a step further, and bear with me, this might sound a bit dystopian. Imagine you were a UI/UX researcher. You can now upload hours of user interviews and ask, "List every moment the user frowned or paused for more than 3 seconds and tell me exactly what was on screen in that moment." That level of analysis used to take a human team weeks. Now you can get it in days, if not hours.

On a lighter note, this improved multimodality is also why Nano Banana Pro produces such clean images. Now I can take a dense industry report, turn it into a clean infographic with legible text, something previous models struggled with, and tweak the design until it looks just right. It's this fluid movement, seamlessly translating video into text and text into image, that showcases what true multimodality looks like in practice.

Moving on to the second major update: better use of large documents.
So, previous versions of Gemini already had a massive context window of over a million tokens, meaning we could upload a lot of files, but simply holding that much information is very different from actually understanding it. Think of it like someone flipping through a 200-page book instead of thoroughly studying it. With this update, Gemini 3 is now 60% better at finding and using specific information buried deep inside your documents. And to show you the difference, here's a real-world example.

Let's say you're a strategy analyst responsible for covering Meta. You can now upload all the earnings call recordings and financial PDFs from the past year and ask Gemini, "Based on all these sources, what are the three biggest discrepancies between management's stated strategy in the video calls and what the financial data in the PDFs actually shows?" Just think about how complex that request is. Gemini would first need to figure out what the executives actually meant from the earnings calls, find the right financial numbers buried in I don't know how many pages, and then connect the two. Instead of giving a generic summary or hallucinating a connection, Gemini 3 now correctly identifies that Zuckerberg claims strong momentum for Reality Labs, but in reality the financial statements show that the segment lost more than $4.4 billion and represents less than 1% of their total revenue.

So, as a rule of thumb, we can now stop treating the context window as just a storage bin for our files and use it instead as an active working memory when, for example, we need to spot conflicts across different file types.

This connects to something interesting. According to LinkedIn, people management is now the number one skill employers are looking for in the age of AI, and roles requiring these skills typically pay $32,000 more per year. So, if you want to build that skill, I'd recommend the new Google People Management Essentials course on Coursera. It comes from the Google School for Leaders, which means you're getting nearly 20 years of internal Google research, the same training they give their own managers, packaged into a practical course that anyone can take. In addition to core skills like coaching and decision-making, they also cover how to use AI as a management tool, which ties directly into what we've been talking about. Right now, you can get 40% off 3 months of Coursera Plus. So, click the link in the description to get started. Huge thanks to Coursera for sponsoring this portion of the video.

Onto update number three: enhanced Workspace search.
To be clear, the ability for Gemini to search across your Google apps has been around for a while, but let's be honest, in the past it was hit or miss. Sometimes it worked, sometimes it hallucinated emails that never existed. With Gemini 3, that inconsistency is basically gone, and now the Workspace integration is reliable enough that I actually trust it with day-to-day work.
Diving into a real example: a freelancer I worked with a year ago recently emailed me asking for a testimonial. Previously, I would have had to spend 20 minutes or so searching Gmail for old threads and checking my Google Drive for shared docs. Right now, I can just enable the Workspace extension and ask Gemini, "Find everything related to this freelancer and his work across my Gmail and Drive and draft two testimonials, one short and one detailed." And a minute later, I have drafts that cite specific deliverables and outcomes pulled directly from my actual correspondence.
Put simply, this change means we're able to turn our scattered digital history (emails, Drive files, and docs) into a single searchable knowledge base we can actually query.

Here's another use case for those of you struggling with email management. Let's say it's Monday morning and your Gmail is overflowing with unread messages, right? Instead of scrolling through everything, enable the Gmail extension and ask Gemini, "Find emails from the last week that mention deadlines. Group them by category or project and tell me what needs my response today." Gemini scans your Gmail, pulls relevant threads, organizes them into logical groupings, and flags what requires action now.

And here's one more for those of us, especially me, who hate writing performance reviews. With the Workspace extension enabled, ask Gemini, "Search my emails, docs, and calendar from the past 6 months, identify the major projects I contributed to, pull out any quantifiable results like targets achieved or deadlines met, and draft a performance review I can edit." Instead of spending an afternoon reconstructing your own accomplishments, you get a first draft with specifics already filled in. Pro tip: if your company requires you to follow a specific structure or format, just upload your previous write-ups and ask Gemini to reference those files.

So, as a rule of thumb, if you would normally spend more than 10 minutes hunting through old emails and docs to reconstruct context in Google Workspace, ask Gemini first.

By the way, if you're tired of getting inconsistent or just straight-up bad results from AI, I put together something called Essential Power Prompts. It's a Notion library of 15 battle-tested prompts I actually use for real work, each with a video walkthrough showing exactly how to apply it. These are all plug-and-play, so you can start using them immediately. Link down below.

Onto the fourth major update: generative interfaces. To be clear, I've always maintained that benchmark scores are an extremely limited way to evaluate model performance because they can be so easily gamed. But in this case, I do need to recognize that Gemini 3 scored a whopping 72.7% on the ScreenSpot-Pro benchmark, which measures screen understanding. And if you compare that to just 11.4% for the previous model, you can see the massive leap in its ability to understand user interface layouts.

In simple terms, Gemini can now generate interactive tools and visual layouts on the fly, so the output format matches our actual task. For example, I was recently evaluating three newsletter platforms: Substack, Ghost, and beehiiv, none of which are sponsors, by the way. I uploaded their pricing and feature pages onto Gemini and asked, "Create a comprehensive comparison table that compares these three platforms based on the attached documents." Now, just for contrast, if I don't enable dynamic view, I get exactly what I'd expect: a comprehensive yet static table comparison. Useful, sure, but nothing special.

Now, watch what happens when I use the same prompt, but this time with dynamic view enabled. We're going to fast forward a bit here. And after a few minutes, I get a fully functional and actually useful interactive tool. Under the revenue calculator tab, I can move these sliders to estimate annual gross revenue based on subscriber count and monthly subscription price. I can see in real time how much I get to keep after each platform takes its cut. And that's not even mentioning these other tabs that compare features in detail.

I can even follow up with, "Make this tool more useful and be more objective in your comparison," and Gemini is able to update the tool based on that simple and vague feedback. Okay, I was going to move on, but this is crazy. There's an objective analysis here. Awesome. It created a break-even calculator that looks to be correct, and there's a recommendation quiz for beginners. Damn.

As you can see, with generative interfaces, the output arrives in a format we can use immediately, meaning we don't need to manually reformat the AI output into something usable.
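If you want to sanity-check a revenue or break-even calculator like that one, the underlying arithmetic is simple enough to sketch by hand. To be clear, this is not the code Gemini generated. It's a minimal Python sketch of the kind of math such a tool would likely run, and the `Platform` class, the function names, and every fee number here are made-up assumptions for illustration, not the real pricing of any platform mentioned.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Platform:
    """Hypothetical pricing model; none of these fields reflect real pricing."""
    name: str
    revenue_share: float     # fraction of gross revenue the platform keeps
    flat_monthly_fee: float  # fixed monthly cost in dollars


def annual_gross(subscribers: int, monthly_price: float) -> float:
    """Annual gross revenue from paid subscribers."""
    return subscribers * monthly_price * 12


def annual_net(platform: Platform, subscribers: int, monthly_price: float) -> float:
    """What you keep per year after the platform's cut and flat fees."""
    gross = annual_gross(subscribers, monthly_price)
    return gross * (1 - platform.revenue_share) - platform.flat_monthly_fee * 12


def break_even_subscribers(share_based: Platform, flat_fee: Platform,
                           monthly_price: float) -> Optional[int]:
    """Smallest subscriber count at which `flat_fee` nets at least as much
    as `share_based`. Returns None if `flat_fee` never catches up."""
    share_gap = share_based.revenue_share - flat_fee.revenue_share
    fee_gap = flat_fee.flat_monthly_fee - share_based.flat_monthly_fee
    if share_gap <= 0 or monthly_price <= 0:
        return None
    # flat_fee wins once subscribers * price * share_gap >= fee_gap
    return max(0, math.ceil(fee_gap / (monthly_price * share_gap)))


if __name__ == "__main__":
    # Assumed example platforms: one takes 10% of revenue, one charges $50/month.
    platform_a = Platform("Platform A", revenue_share=0.10, flat_monthly_fee=0.0)
    platform_b = Platform("Platform B", revenue_share=0.0, flat_monthly_fee=50.0)
    for p in (platform_a, platform_b):
        print(p.name, annual_net(p, subscribers=1000, monthly_price=10.0))
    print("Break-even:", break_even_subscribers(platform_a, platform_b, 10.0))
```

Moving the sliders in the generated tool is roughly the same as changing `subscribers` and `monthly_price` here, so plugging in a couple of values is a quick way to check that the tool's numbers look right before you rely on them.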
Here's an even more powerful use case. Instead of creating slides to present this data in a quarterly review, for example, we can share this spreadsheet with Gemini, enable dynamic view, and say, "Create a dashboard where I can filter by region and click any bar to see the underlying accounts." After a minute, we have a revenue insights dashboard where I can click into specific regions to uncover insights. For example, APAC has a much higher churn rate than the Americas, which requires a follow-up. Or I can just go into all regions and click into specific bars for more information. Pro tip: explicitly ask for the controls you want, like "Give me a dashboard with a slider for budget and a toggle for region," so the AI can create tools tailored to our use cases.

Update number five: better intent understanding. In a nutshell, Gemini 3 is significantly better at understanding vague instructions, which shifts the focus from prompt engineering (obsessing over exact wording) to context engineering (curating the right background information). Here's a simple example. Previously, after a team meeting, you'd write something like this:
"Act as a professional but friendly colleague. Draft an email summarizing the key points from today's meeting. Keep it under 200 words. Use bullet points." You had to spell out tone, format, and length explicitly to get a decent result. Right now, we can paste our rough notes and just say, "Write a concise email with next steps." And Gemini infers the appropriate tone, structure, and length on its own, giving us the same quality output for a fraction of the instruction effort.

Here's an oversimplified way to think about this. Gemini is now much better at guessing your tone, your format, and your length. Although, I heard effort matters more than size. But, um, Gemini can't guess your facts. So giving it better context, like relevant emails, docs, and data, now yields significantly higher returns than writing a better prompt.

Here's another example. Let's say you need to write a LinkedIn post for your VP. Previously, you had to describe the writing style you wanted with a bunch of adjectives like "punchy" and "thought leadership," which is hard to nail and usually got you generic results. Now you can upload three previous posts your VP actually wrote and say, "Here are three examples of my writing style. Based on these, rewrite this dry Q4 report into a LinkedIn post." Instead of describing the quote-unquote vibe, we've now provided the ground truth of the vibe, the previous posts, so that Gemini can mimic the sentence structure, vocabulary, and rhythm automatically. The output sounds like your VP because you showed it what your VP sounds like. So, as a rule of thumb, focus on gathering the right context to share, not perfecting how you phrase the prompt.

Here's a bonus update for those of you still watching: reduced sycophancy. In simple terms, Google explicitly states that Gemini 3 was trained to be less agreeable, meaning Gemini is now much more willing to tell us when we're wrong. And in my testing, that actually holds up. For example, I've stitched together a presentation from three different teams, and I'm worried it sounds disjointed. So I share that deck with Gemini and ask, "Identify storytelling weaknesses and logical contradictions between the different sections of this report." Instead of telling me everything looks great, Gemini highlights a disconnect between the initial revenue target and the final attainment numbers and even predicts the pushback I'd likely receive from leadership.

Regular viewers will recognize this is related to the red team technique I covered in a previous video, where you ask the AI to adopt a critical persona to get sharper feedback. Check that out if you haven't already. See you in the next video. In the meantime, have a great one.
點擊句子跳轉到對應位置
Gemini 3.0 is a fantastic model, but the sheer volume of updates is honestly overwhelming, and not every new feature deserves your attention. So, after a month of going through official guides
Gemini 3.0是一個很棒的模型,但更新的數量實在太多了,說實話讓人不知所措,而且不是每個新功能都值得你關注。所以,經過一個月的官方指南閱讀
and testing Gemini 3 with real work, I've narrowed down the five changes that actually matter for professionals. Let's get started. Kicking things off with the first major update, improved multimodal
和用真實工作測試Gemini 3後,我整理出了對專業人士真正重要的五個變化。讓我們開始吧。首先是第一個主要更新:改進的多模態
understanding. In plain English, Gemini 3 has become much better at understanding images, video, and audio together. Previously, Gemini might have broken down a video into a collection of screenshots and an audio track. Now,
理解。用白話說,Gemini 3在同時理解圖片、視頻和音頻方面變得更好了。之前,Gemini可能會把視頻分解成一系列截圖和音軌。現在,
Gemini 3 can process everything at once by linking audio cues to visual data. In practice, this means we can upload a short form video, for example, and ask a
Gemini 3可以通過將音頻線索與視覺數據連結來一次處理所有內容。實際上,這意味著我們可以上傳一個短視頻,然後讓
Gemini 3 to first watch the video to understand what's going on, then output specific and detailed recommendations for improvement. And it does exactly that, which is already pretty insane,
Gemini 3先觀看視頻來理解發生了什麼,然後輸出具體詳細的改進建議。它確實做到了,這已經非常瘋狂了,
right? But let's see how this translates to actual work. Here, I've uploaded a screen recording onto Gemini and said, "I just recorded a walkthrough on how to toggle smart features in Gmail. Watch
對吧?但讓我們看看這如何轉化為實際工作。這裡,我上傳了一段螢幕錄影到Gemini,說:「我剛錄了一個關於如何在Gmail中切換智能功能的教程。觀看
the recording and turn it into a clean step-by-step checklist that I can hand to a new hire so they can do it next week without asking me questions." In
這段錄影,然後把它變成一個清晰的步驟清單,我可以交給新員工,這樣他們下週就可以自己做而不用問我問題。」在
under 60 seconds, Gemini turns a messy one-time recording into a permanent training asset, which is a complete game changer for anyone working in operations. Taking this a step further,
不到60秒內,Gemini把一個混亂的一次性錄影變成了永久的培訓資源,這對任何做運營工作的人來說都是徹底的遊戲規則改變者。更進一步,
and bear with me, this might sound a bit dystopian. Imagine you were a UIUX researcher. You can now upload hours of user interviews and ask, "List every moment the user frowned or paused for
請耐心聽我說,這可能聽起來有點反烏託邦。想像你是一個UI/UX研究員。你現在可以上傳幾個小時的用戶訪談,然後問:「列出用戶皺眉或停頓超過
more than 3 seconds and tell me exactly what was on screen in that moment." That level of analysis used to take a human team weeks of analysis. Now you can get
3秒的每個時刻,並告訴我那個時刻螢幕上顯示的是什麼。」這種程度的分析過去需要一個人類團隊數週的分析。現在你可以
it in days, if not hours. On a lighter note, this improved multimodality is also why Nano Banana Pro produces such clean images. Now I can take a dense
在幾天內完成,甚至幾小時。說點輕鬆的,這種改進的多模態能力也是為什麼Nano Banana Pro能產生如此乾淨的圖片。現在我可以把一份密集的
industry report, turn it into a clean infographic with legible text, something previous models struggled with, and tweak the design until it looks just right. It's this fluid movement, seamlessly translating video into text
行業報告變成一個有清晰可讀文字的資訊圖表——這是之前的模型很難做到的——然後調整設計直到看起來剛剛好。這種流暢的轉換,無縫地把視頻變成文字
and text into image that showcases what true multimodality looks like in practice. Moving on to the second major update, better use of large documents.
再把文字變成圖片,展示了真正的多模態在實踐中是什麼樣子。接下來是第二個主要更新:更好地使用大型文件。
So, previous versions of Gemini already had a massive context window of over a million tokens, meaning we could upload a lot of files, but simply holding that much information is very different from
要說清楚,之前版本的Gemini已經有超過一百萬個token的巨大上下文窗口,意味著我們可以上傳很多文件,但僅僅是容納那麼多資訊和
actually understanding it. Think of it like someone flipping through a 200page book instead of thoroughly studying it.
真正理解它是非常不同的。就像有人翻閱一本200頁的書而不是徹底研究它。
With this update, Gemini 3 is now 60% better at finding and using specific information buried deep inside your documents. And to show you the difference, here's a real world example.
有了這次更新,Gemini 3現在在找出和使用埋藏在你文件深處的特定資訊方面提升了60%。為了展示差異,這裡是一個真實的例子。
Let's say you're a strategy analyst responsible for covering meta. You can now upload all the earnings call recordings and financial PDFs from the past year and ask Gemini based on all
假設你是一個負責研究Meta的策略分析師。你現在可以上傳過去一年所有的財報電話錄音和財務PDF,然後問Gemini:基於所有
these sources, what are the three biggest discrepancies between management status strategy in the video calls and what the financial data in the PDFs actually shows. Just think about how
這些資料,管理層在視頻電話中陳述的策略和PDF中財務數據實際顯示的內容之間,最大的三個差異是什麼。想想這個要求有多
complex that request is. Gemini would first need to figure out what the executives actually meant from the earnings calls. find the right financial numbers burden I don't know how many
複雜。Gemini首先需要從財報電話中弄清楚高管們實際上是什麼意思,找到正確的財務數字埋在不知道多少
pages and then connect the two instead of a generic summary or hallucinating a connection. Gemini 3 now correctly identifies that Zuckerberg claims strong momentum for reality labs but in reality
頁裡,然後把兩者連接起來。而不是給出通用的摘要或編造一個連接,Gemini 3現在正確識別出Zuckerberg聲稱Reality Labs有強勁勢頭,但實際上
from the financial statements it shows that that segment lost more than 4.4 billion and represents less than 1% of their total revenue. So, as a rule of thumb, we can now stop treating the
從財務報表來看,那個部門虧損超過44億美元,只佔他們總收入的不到1%。所以,作為經驗法則,我們現在可以停止把
context window as just a storage bin for our files and use it instead as an active working memory when, for example, we need to spot conflicts across different file types. This connects to
上下文窗口僅僅當作存放文件的儲存箱,而是把它當作一個活躍的工作記憶來使用,例如當我們需要在不同文件類型之間找出衝突時。這連接到
something interesting. According to LinkedIn, people management is now the number one skill employers are looking for in the age of AI. And roles requiring these skills typically pay
一件有趣的事。根據LinkedIn,人員管理現在是AI時代僱主最看重的頭號技能。需要這些技能的角色通常
$32,000 more per year. So, if you want to build that skill, I'd recommend the new Google People Management Essentials course on Corsera. It comes from the Google School for Leaders, which means
每年薪水高出32,000美元。所以,如果你想培養那個技能,我推薦Coursera上新的Google人員管理基礎課程。它來自Google領導力學院,這意味著
you're getting nearly 20 years of internal Google research, the same training they give their own managers, packaged into a practical course that anyone can take. In addition to core skills like coaching and
你將獲得近20年的Google內部研究——他們給自己的經理提供的同樣培訓——打包成任何人都可以參加的實用課程。除了輔導和
decision-making, they also cover how to use AI as a management tool, which ties directly into what we've been talking about. Right now, you can get 40% off 3
決策等核心技能,他們還涵蓋了如何將AI作為管理工具使用,這直接與我們討論的內容相關。現在,你可以享受Coursera Plus三個月40%的折扣。
months of Corsera Plus. So, click the link in the description to get started.
點擊描述中的連結開始吧。
Huge thanks to Corsera for sponsoring this portion of the video. Onto update number three, enhanced workspace search.
非常感謝Coursera贊助這部分視頻。接下來是第三個更新:增強的工作區搜索。
To be clear, the ability for Gemini to search across your Google apps has been around for a while, but let's be honest, in the past it was a hit or miss.
要說清楚,Gemini跨Google應用搜索的功能已經存在一段時間了,但說實話,過去它有時有用,有時會編造從未存在的郵件。
Sometimes it worked, sometimes it hallucinated emails that never existed.
有時候它有效,有時候它會編造從未存在的郵件。
With Gemini 3, that inconsistency is basically gone, and now the workspace integration is reliable enough that I actually trust it with day-to-day work.
有了Gemini 3,那種不一致基本上消失了,現在工作區整合足夠可靠,我實際上可以信任它處理日常工作。
Diving to a real example. A freelancer I worked with a year ago recently emailed me asking for a testimonial. Previously, I would have to spend like 20 minutes searching Gmail for old threads and
來看一個真實的例子。一年前和我合作的一個自由工作者最近給我發郵件,請我給他寫一個推薦。之前,我需要花大約20分鐘搜索Gmail的舊郵件和
checking my Google Drive for like shared docs. Right now, I can just enable the workspace extension and ask Gemini find everything related to this freelancer and his work across my Gmail and drive
檢查我的Google Drive尋找共享文件。現在,我只需啟用工作區擴展,然後問Gemini:找到我Gmail和Drive中與這個自由工作者及其工作相關的所有內容,
and draft two testimonials, one short and one detailed. And a minute later, I have drafts that site specific deliverables and outcomes pulled directly from my actual correspondence.
然後起草兩份推薦——一份簡短的,一份詳細的。一分鐘後,我就有了引用具體交付物和成果的草稿,直接從我的實際通信中提取。
Put simply, this change means we're able to turn our scattered digital history, emails, drive files, and docs into a single searchable knowledge base we can actually query. Here's another use case
簡單說,這個變化意味著我們能夠把散落的數位歷史——郵件、Drive文件和文檔——變成一個我們可以實際查詢的單一可搜索知識庫。這是另一個使用案例,
for those of you struggling with email management. Let's say it's Monday morning and your Gmail is overflowing with unread messages, right? Instead of scrolling through everything, enable the Gmail extension and ask Gemini, "Find
給那些在郵件管理上掙紮的人。假設是週一早上,你的Gmail塞滿了未讀郵件,對吧?不要翻閱所有郵件,而是啟用Gmail擴展,問Gemini:「找到
emails from the last week that mention deadlines. Group them by category or project and tell me what needs my response today." Gemini scans your Gmail, pulls irrelevant threads, organizes them into logical groupings,
過去一週提到截止日期的郵件。按類別或專案分組,告訴我今天需要我回覆什麼。」Gemini掃描你的Gmail,提取相關主題,把它們組織成邏輯分組,
and flags what requires action now. And here's one more for those of us, especially me, who hate writing performance reviews. With the workspace extension enabled, ask Gemini to search
並標記出現在需要處理的事項。這裡還有一個給我們這些——特別是我——討厭寫績效評估的人。啟用工作區擴展後,讓Gemini搜索
my emails, docs, and calendar from the past 6 months, identify the major projects I contributed to, plout any quantifiable results like target achieved or deadlines met, and draft a
我過去6個月的郵件、文檔和日曆,識別我參與的主要專案,找出任何可量化的結果比如達成的目標或完成的截止日期,然後起草一份
performance review I can edit. Instead of spending an afternoon reconstructing your own accomplishments, you get a first draft with specifics already filled in. Pro tip, if your company requires you to follow a
我可以編輯的績效評估。不用花一個下午重構你自己的成就,你就能得到一份已經填好具體內容的初稿。專業提示:如果你的公司要求你遵循
specific structure or format, just upload your previous writeups and ask Gemini to reference those files. So, as a rule of thumb, if you would normally spend more than 10 minutes hunting
特定的結構或格式,只需上傳你之前的寫作,讓Gemini參考那些文件。所以,作為經驗法則,如果你通常需要花超過10分鐘翻找
through old emails and docs to reconstruct context in Google Workspace, ask Gemini first. By the way, if you're tired of getting inconsistent or just straight up bad results from AI, I put
舊郵件和文檔來在Google Workspace中重構上下文,先問Gemini。順便說一下,如果你厭倦了從AI那裡得到不一致或完全糟糕的結果,我整理了
together something called Essential Power Prompts. It's a notion library of 15 battle tested prompts I actually use for real work. Each with a video walkthrough showing exactly how to apply
一個叫做Essential Power Prompts的東西。這是一個Notion庫,包含15個我實際用於真實工作的經過實戰檢驗的提示詞。每個都有視頻教程,展示如何準確地
it. These are all plug-andplay so you can start using them immediately. Link down below. Onto the fourth major update, generative surfaces. To be clear, I've always maintained that benchmark scores are an extremely
應用它。這些都是即插即用的,你可以立即開始使用。連結在下方。接下來是第四個主要更新:生成式界面。要說清楚,我一直認為基準分數是一種非常
limited way to evaluate model performance because they can be so easily gamed. But in this case, I do need to recognize that Gemini 3 scored a whopping 72.7%
有限的評估模型性能的方式,因為它們很容易被操縱。但在這個案例中,我確實需要承認Gemini 3在Screen Spot Pro基準測試中
on the Screen Spot Pro benchmark, which measures screen understanding. And if you compare that to just 11.4% for the previous model, you can see the massive leap in its ability to understand user
獲得了驚人的72.7%的分數,這個基準測試衡量的是螢幕理解能力。如果你把這與之前模型的11.4%相比,你可以看到它在理解用戶
interface layouts. In simple terms, Gemini can now generate interactive tools and visual layouts on the fly. So the output format matches our actual task. For example, I was recently evaluating three newsletter platforms,
界面布局能力上的巨大飛躍。簡單說,Gemini現在可以即時生成互動工具和視覺布局。所以輸出格式與我們的實際任務相匹配。例如,我最近在評估三個電子報平臺——
Substack, Ghost, and Beehive. None of which are sponsors, by the way. I uploaded their pricing and feature pages onto Gemini and asked, "Create a comprehensive comparison table that compares these three platforms based on
Substack、Ghost和Beehive。順便說一下,這些都不是贊助商。我把他們的定價和功能頁面上傳到Gemini,問:「根據這些附件文檔,創建一個全面的比較表來比較這三個平臺。」
the attached documents. Now, just for contrast, if I don't enable dynamic view, I get exactly what I'd expect. A comprehensive yet static table comparison. Useful, sure, but nothing
現在,作為對比,如果我不啟用動態視圖,我會得到預期的結果:一個全面但靜態的表格比較。有用,當然,但沒什麼
special. Now, watch what happens when I use the same prompt, but this time with dynamic view enabled. We're going to fast forward a bit here. And after a few
特別的。現在,看看當我用同樣的提示,但這次啟用動態視圖時會發生什麼。我們要快進一下。幾
minutes, I get a fully functional and actually useful interactive tool. Under the revenue calculator tab, I can move these sliders to estimate annual gross revenue based on subscriber count and
分鐘後,我得到一個功能齊全且實際有用的互動工具。在收入計算器標籤下,我可以移動這些滑塊來根據訂閱者數量和
monthly subscription price. I can see in real time how much I get to keep after each platform takes their cut. And that's not even mentioning these other tabs that compare features in detail. I
月訂閱價格估算年度總收入。我可以實時看到每個平臺抽成後我能留下多少。這還沒提到這些其他詳細比較功能的標籤。我
can even follow up with make this tool more useful and be more objective in your comparison. And Gemini is able to update the tool based on that simple and
甚至可以跟進說:讓這個工具更有用,在比較中更客觀。Gemini能夠根據那個簡單
vague feedback. Okay, I I was going to move on, but this is crazy. There's an objective analysis here. Awesome. It created a break even calculator that looks to be correct, and they have a
模糊的反饋更新工具。好,我本來要繼續的,但這太瘋狂了。這裡有一個客觀分析。太棒了。它創建了一個看起來是正確的盈虧平衡計算器,還有一個
recommendation quiz for beginners. Damn. As you can see, with generative interfaces, the output arrives in a format we can use immediately, meaning we don't need to manually reformat the AI output into something usable.
給初學者的推薦測驗。天啊。如你所見,有了生成式界面,輸出以我們可以立即使用的格式呈現,意味著我們不需要手動把AI輸出重新格式化成可用的東西。
Here's an even more powerful use case. Instead of creating slides to present this data in a quarterly review, for example, we can share this spreadsheet with Gemini, enable dynamic view, and
這裡有一個更強大的使用案例。例如,與其創建幻燈片在季度審查中展示這些數據,我們可以把這個電子表格分享給Gemini,啟用動態視圖,然後
say, "Create a dashboard where I can filter by region and click any bar to see the underlying accounts." After a minute, we have a revenue insights dashboard where I can click into
說:創建一個儀表板,我可以按地區篩選,點擊任何條形圖來查看底層帳戶。一分鐘後,我們有了一個收入洞察儀表板,我可以點擊進入
specific regions to uncover insights. APAC has a much higher churn rate than the Americas, which requires a follow-up. Or I can just go into all regions and click into specific bars for
特定地區來發現洞察。呃,APAC的流失率比美洲高得多,這需要跟進。或者我可以進入所有地區,點擊特定的條形圖獲取
more information. Pro tip: explicitly ask for the controls you want, like "Give me a dashboard with a slider for budget and a toggle for region," so the AI can
更多資訊。專業提示:明確要求你想要的控件,比如「給我一個帶預算滑塊和地區切換的儀表板」,這樣AI可以
create tools tailored to our use cases. Update number five, better intent understanding. In a nutshell, Gemini 3 is significantly better at understanding vague instructions, which shifts the focus from prompt engineering, obsessing
創建針對我們使用案例量身定制的工具。第五個更新:更好的意圖理解。簡而言之,Gemini 3在理解模糊指令方面明顯更好,這把焦點從提示工程——糾結於
over exact wording, to context engineering, curating the right background information. Here's a simple example. Previously, after a team meeting, you'd write something like this:
精確措辭——轉移到上下文工程——策劃正確的背景資訊。這是一個簡單的例子。之前,在團隊會議後,你會寫這樣的東西:
"Act as a professional but friendly colleague. Draft an email summarizing the key points from today's meeting.
「扮演一個專業但友好的同事。起草一封總結今天會議要點的郵件。
Keep it under 200 words. Use bullet points." You had to spell out tone, format, and length explicitly to get a decent result, right? Now, we can paste
保持在200字以內。使用要點符號。」你必須明確說明語氣、格式和長度才能得到不錯的結果。現在,我們可以貼上
our rough notes and just say, "Write a concise email with next steps." And Gemini infers the appropriate tone, structure, and length on its own, giving us the same quality output for a
我們的粗略筆記,只說:「寫一封簡潔的郵件,附帶下一步行動。」Gemini自己推斷出適當的語氣、結構和長度,用一小部分的指令努力給我們同樣質量的輸出。
fraction of the instruction effort. Here's an oversimplified way to think about this. Gemini is now much better at guessing your tone, your format, and your length. Although, I heard effort
這是一個過度簡化的理解方式。Gemini現在在猜測你的語氣、格式和長度方面好得多。雖然,我聽說努力比大小更重要。
matters more than size. But Gemini can't guess your facts. So giving it better context, like relevant emails, docs, and data, now yields significantly higher returns than writing a better
但是,Gemini無法猜測你的事實。所以給它更好的上下文——比如相關的郵件、文檔和數據——現在比寫一個更好的
prompt. Here's another example. Let's say you need to write a LinkedIn post for your VP. Previously, you had to describe the writing style you wanted with a bunch of adjectives like punchy
提示帶來的回報明顯更高。這是另一個例子。假設你需要為你的VP寫一篇LinkedIn帖子。之前,你必須用一堆形容詞描述你想要的寫作風格,比如簡潔有力
and thought leadership, which is hard to nail and usually got you generic results. Anyways, now you can upload three previous posts your VP actually wrote and say, "Here are three examples
和思想領導力,這很難把握,通常會給你通用的結果。總之,現在你可以上傳你的VP之前寫的三篇帖子,說:「這是我的寫作風格的三個例子。
of my writing style. Based on these, rewrite this dry Q4 report into a LinkedIn post." Instead of describing the "vibe," we've now provided the ground truth of the vibe, the
基於這些,把這份枯燥的Q4報告改寫成一篇LinkedIn帖子。」不是描述所謂的「氛圍」,我們現在提供了氛圍的真實依據——
previous posts, so that Gemini can mimic the sentence structure, vocabulary, and rhythm automatically. The output sounds like your VP because you showed it what your VP sounds like. So, as a rule of
之前的帖子,這樣Gemini可以自動模仿句子結構、詞彙和節奏。輸出聽起來像你的VP,因為你展示了你的VP聽起來是什麼樣的。所以,作為經驗法則,
thumb, focus on gathering the right context to share, not perfecting how you phrase the prompt. Here's a bonus update for those of you still watching: reduced sycophancy. In simple terms, Google
專注於收集正確的上下文來分享,而不是完善你如何表達提示。這是給那些還在看的人的額外更新:減少討好行為。簡單說,Google
explicitly states that Gemini 3 was trained to be less agreeable, meaning Gemini is now much more willing to tell us when we're wrong. And in my testing, that actually holds up. For example,
明確表示Gemini 3被訓練成不那麼順從,意味著Gemini現在更願意告訴我們我們錯了。在我的測試中,這確實成立。例如,
I've stitched together a presentation from three different teams, and I'm worried it sounds disjointed. And so, I share that deck with Gemini and ask, "Identify storytelling weaknesses and logical contradictions between the
我把來自三個不同團隊的演示文稿拼接在一起,我擔心它聽起來不連貫。所以,我把那個幻燈片分享給Gemini,問:「識別這份報告不同部分之間的敘事弱點和邏輯矛盾。」
different sections of this report." Instead of telling me everything looks great, Gemini highlights a disconnect between the initial revenue target and the final attainment numbers and even predicts the pushback I'd likely
不是告訴我一切看起來都很好,Gemini指出了最初收入目標和最終達成數字之間的脫節,甚至預測了我可能
receive from leadership. Regular viewers will recognize this is related to the red-team technique I covered in a previous video, where you ask the AI to adopt a critical persona to get sharper
會從領導層那裡收到的反駁。常看的觀眾會認識到這與我在之前視頻中介紹的紅隊技術相關,你讓AI扮演一個批判性角色來獲得更尖銳的
feedback. Check that out if you haven't already. See you in the next video. In the meantime, have a great one.
反饋。如果你還沒看過,去看看。下個視頻見。在此期間,祝你一切順利。
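As a footnote to the context-engineering rule of thumb from update five: if you find yourself pasting the same kind of context over and over, the "show examples, then ask" pattern can be sketched as a small helper. This is a minimal sketch, not an official Gemini API call. The function name `build_style_prompt` and the section labels are hypothetical, and the resulting string is simply what you would paste or send as the prompt.

```python
from typing import Sequence


def build_style_prompt(examples: Sequence[str], source_text: str, task: str) -> str:
    """Assemble a prompt that leads with real writing samples (the context)
    and ends with a short, plainly worded task."""
    cleaned = [ex.strip() for ex in examples if ex.strip()]
    if not cleaned:
        raise ValueError("Provide at least one writing sample as context.")

    # Number each sample so the model can tell them apart.
    sections = [f"Example {i}:\n{ex}" for i, ex in enumerate(cleaned, start=1)]
    sections.append(f"Source material:\n{source_text.strip()}")
    sections.append(f"Task: {task.strip()}")
    return "\n\n".join(sections)


if __name__ == "__main__":
    prompt = build_style_prompt(
        examples=["First previous post...", "Second previous post...", "Third previous post..."],
        source_text="Dry Q4 report text goes here.",
        task="Based on these examples of my writing style, rewrite the source material into a LinkedIn post.",
    )
    print(prompt)
```

Notice that almost all of the prompt is context and only the last line is instruction, which is the whole point of the rule of thumb.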