Google is Now More OPEN than OpenAI

In seven days, God created the world Google dropped two major announcements: Gemini 1.5’s record-breaking 1M token cap, and Gemma’s family of four lightweight open models.

With this back-to-back news, Google is gaining momentum in its race to ace the Open Source marathon.

According to Google’s official announcement, Gemma is a family of lightweight open models built from the same research and technology used to create the Gemini models. Alongside this news, Google will release tools to support developer innovation, foster collaboration, and guide the responsible use of Gemma models.

Amidst this surprising news, many wonder about the timing of the announcement. Why now?

Tech writer Emilia David; known for her work on Business Insider, Venture Capital Journal, and CNBC, even speculated that Google released Gemma to keep the discussion about Gemini 1.5. Because the search giant isn’t ready to open the new model to the public yet.

If this is the case, Google; alongside Meta, now emerges as a prominent supporter and champion of open-source technology.

Even Yan LeCun, Meta’s Chief AI Scientist, chimes in this open-source conversation.

These innovations highlight the importance of open-source technologies. Software such as Gemma, Llama, and Mistral are essential for collaboration, innovation, and accessibility.

While open source is not a charitable endeavor, it yields returns through engineering economics and community engagement. These initiatives drive progress in tech, education, and government.

Considering the search giant’s open-source project, what implications does this hold for users, competitors, and the value we place on free software?

Let’s get to know Gemma more!

The model consists of four different variants, Gemma 2B, Gemma 2B-it, Gemma 7B, and Gemma 7B-it. All of which were designed to support AI use in third-party applications.

Within the family, Gemma 2B and Gemma 7B are the two main variants, with 2 billion and 7 billion parameters, respectively. This means that despite its lightweight nature, Gemma can handle up to 7 billion data points. Gemini 1.5, being a larger model demands increased computational power, making Gemma a better fit for edge devices.

This is the complete family tree:

  • Pre-trained variants:
    • Gemma 2B
    • Gemma 7B
  • Instruction-tuned variants:
    • Gemma 2B-it
    • Gemma 7B-it

All variants cater to a wide range of applications like summarization or retrieval-augmented generation (RAG). With these four options, you can customize the models to align perfectly with your needs.

Gemma's quick-start guides for developers.

Moreover, the models also support a wide variety of tools and systems, run across popular device types, and are optimized for Google Cloud and other cutting-edge hardware platforms. Essentially, you can apply this new model on platforms you already use and trust.

Gemma's partner quick-start guides.

Not only that, Gemma is also designed for responsible AI development and includes a toolkit for creating safer AI applications.

Responsible generative AI toolkit.

Given Google’s track record of utilizing user data for product development, it’s inevitable to question the potential risks Gemma might pose to users integrating it into their native systems. Although there hasn’t been much discussion on this, we will make sure to share these conversations as they come up.

On a positive note, Gemma is also commercially available, this means that any companies, big and small, can profit from the services built on top of this model. However, it is important to take note that there are terms in place that restrict users from promoting harm by using these models.

You can read more about Gemma’s terms of use here.

According to tech writer Steven Vaughan-Nichols; a key contributor to sites like Foundry, The Register, and The New Stack, open source is considered the cradle of artificial intelligence as it provides tools and libraries for storing and processing vast data crucial for AI and machine learning. Open source and AI emerged together, giving rise to popular AI models like ChatGPT and Llama 2.

While not directly credited with AI success, open source significantly influences its development, challenging big players like Google and OpenAI in the AI race.

However, open source is not without its challenges, particularly when it comes to defining a common understanding of what it is. According to this article published by ZDNET; a news website that provides information on the latest trends in the tech industry, stakeholders are working on defining a common understanding of open-source AI, which is expected to be finalized soon.

Following the launch of Gemma, Redditors are spearheading discussions on Google possibly outperforming OpenAI at its own game. Let’s not forget, that OpenAI used to be open source until 2019.

As of today, OpenAI is no longer open-source because it has shifted away from its initial goals of being a non-profit organization. Initially, OpenAI was dedicated to promoting digital intelligence for the benefit of humanity while maintaining open-source principles. Instead, OpenAI has transformed into a closed-source, for-profit company, focusing on maximizing profits rather than sharing its research freely.

This change in direction contradicts the company’s origins alongside Elon Musk’s plans, who initially envisioned OpenAI as an open-source entity aimed to counterbalance Google’s dominance in AI research. Isn’t that ironic?

The shift has led to criticisms and skepticism regarding the appropriateness of the company’s name, considering its current priorities differ significantly from its original intentions.

And now, the tables have turned. With Gemma, Google takes center stage in the field of open-source AI software.

This achievement shines even brighter when we contrast Gemma’s features with those of its current competitors. When compared to Meta’s Llama 2, Gemma 7B surpassed all four capabilities, such as general, reasoning, math, and code.

Comparison table between Gemma and Llama-2 features.

In addition to this internal review, Google also shared an interactive benchmark checker. This tool allows you to compare open-source models like Mistral and Llama 2 to Gemma based on parameters such as “HellaSwag,” “MMLU,” “PIQA” and other standard benchmark matrices. The visual representation captivated me because it effectively showcases the impressive nature of Gemma.

Interactive benchmark checker comparing open-source models like mistral and llama-2.

Meanwhile, on the Hugging Face leaderboard, where open source models are compared, Gemma 7B currently holds the second position, trailing behind Llama 2 by just four points.

Hugging face leaderboard per model.

Since its creation, this leaderboard has served as a space where a community of developers can discover detailed results and queries for open-source models. In true open-source nature, the leaderboard is a wrapper running the open benchmarking library Eleuther AI LM Evaluation, a library for reproducible and transparent evaluation of LLMs.

Being deeply engaged in discussions about open-source AI, I delve into benchmarks and comparisons beyond the model’s internal review, and Hugging Face serves as a valuable resource for such deep dives.

An open platform for open sources.

While doing my research, I also looked up the common criticisms of open-source technology. Among them are security, lack of audit process, operational inefficiencies, and poor developer practices. But, the statistics; both from Google and Hugging Face, show that this will not be the case with Google’s open-source models.

Does this mean that Google is going to be the new face of “OPEN” AI? We’re going to keep watching the developments expecting a response from OpenAI.

When is your next move, Sam Altman?

What’s in it for u+

Now that you are invested in this new open source, you might be wondering how you can implement it in your platforms.

As a gift, we have curated these tutorials to guide you in your exploration.

According to the announcement, you can integrate the model into these five tools.

Kaggle – a data science competition platform and online community of data scientists and machine learning practitioners under Google LLC. Discover quickstarts on Kaggle.

Vertex AI – a platform that provides purpose-built MLOps tools for data scientists and ML engineers to automate, standardize, and manage ML projects. Train and deploy on Google Cloud.

Google Colab – a hosted Jupyter Notebook service that requires no setup to use and provides free access to computing resources, including GPUs and TPUs. Try low-rank adaptation with JAX via Keras 3.

Hugging Face – a company that provides a variety of tools and resources for building and deploying NLP models. View on Hugging Face.

Nvidia NeMo – a toolkit for building state-of-the-art NLP models. It provides a wide range of pre-trained models, as well as tools for training and evaluating your models. View on GitHub.

While we wait for Gemini 1.5, you can start exploring Gemma HERE.

With these rapid advancements sprouting left and right, who knows what will be the big news next week? Perhaps hints on GPT-5? Or maybe another open-source initiative vying for a spot among the AI giants.

The important thing is we keep being in the know and understand how these open-source models can influence our lives.

For the future belongs to those who embrace it today.

Do you want to stay current with the latest AI news?
At a.i. + u, we deliver fresh, engaging, and digestible AI updates.

Stay tuned for more exciting developments!
Let’s see what stories we can bring to life next.