  • Kyle Chua

Google Unveils Gemini, Its Most Capable, Flexible AI Model Yet

Updated: Dec 16, 2023

Google is making a major move in the artificial intelligence (AI) space.

Gemini
Credit: Google

The search engine giant today introduced Gemini, its most capable, flexible and general AI model yet. Gemini is a multimodal model, meaning it can process not only text but also images, videos and audio. It's reportedly also capable of completing complex tasks like solving math and physics problems, as well as generating code in different programming languages.


The first version of the model, Gemini 1.0, is said to be optimised for three different sizes: Ultra, Pro and Nano. Gemini Ultra is, according to Google, the most capable and largest model among the three, built for complex tasks. Gemini Pro, meanwhile, is the best model for scaling across a wide range of tasks. Finally, Gemini Nano is the most efficient model for on-device tasks.


Google says it ran Gemini Ultra through a number of industry-standard benchmarks and found that its performance exceeds current state-of-the-art results on 30 of the 32 widely used benchmarks. This includes MMLU (massive multitask language understanding), where Gemini Ultra scored 90.04%.


Gemini Ultra is available to select customers, developers, partners and safety and responsibility experts for early experimentation and feedback. It'll then be available to developers and enterprise customers early next year.


Gemini Pro, on the other hand, is already integrated with Google Bard, making the conversational AI far more capable at things like understanding and summarising, reasoning, brainstorming, writing and planning, among other tasks.

Google Bard
Credit: Google

For now, Gemini Pro is available through its integration with Google Bard. Users can try it out starting today using text-based prompts, with other modalities expected to be added in the future. It's available in English in over 180 countries to start, and Google plans to add more languages and regions in the near future.


Gemini Nano is also powering various features in Google's flagship Pixel 8 Pro smartphone, including Summarise in the Recorder app and Smart Reply in Gboard.


Google plans to roll out Gemini to other products and services in the coming months, which include Search, Ads, Chrome and Duet AI.


Gemini stands out from other AI models available today because it's natively multimodal. In comparison, GPT-4, OpenAI's latest and most advanced model so far, is primarily a text-based model. It only becomes multimodal with plugins and integrations, relying on DALL-E 3 and Whisper, for example, to generate images and process audio, respectively.

 
  • Google has unveiled Gemini, its most capable, flexible and general AI model yet.

  • Gemini is a multimodal model, meaning it can process not only text but also images, videos and audio.

  • For now, the new AI model is available via its integrations with Google Bard and the Google Pixel 8 Pro smartphone.

  • Gemini stands out versus other AI models out there today since it's natively multimodal, whereas OpenAI's GPT-4 model is primarily text-based.

