
Ai2 OLMo Initiative: A Groundbreaking Approach to Open Source LLM Research

In 2014, Paul Allen, co-founder of Microsoft, established the Allen Institute for Artificial Intelligence (Ai2) as a nonprofit organization dedicated to advancing artificial intelligence for the greater good. Ai2 takes a significant leap in this mission with the introduction of OLMo (Open Language Model). Unlike some open-source models that provide only code and weights, Ai2 goes a step further by making OLMo truly open, releasing not just the model weights but also the training code, training data, and associated toolkits, all licensed under Apache 2.0.

The release of the state-of-the-art OLMo model and its accompanying framework underscores Ai2's commitment to fostering innovation and collaboration in language models while raising awareness of the ethical and societal implications they present.

Hanna Hajishirzi, Project Lead for OLMo, Senior NLP Research Director at Ai2, and professor at the Allen School of the University of Washington, explains:

Many language models lack transparency in their publication. Without access to training data, researchers are unable to scientifically comprehend a model's functionality. It's akin to developing drugs without clinical trials or exploring the solar system without a telescope. Our new framework empowers researchers to delve into the science of LLMs, a crucial step toward building the next generation of safe and trustworthy AI.

OLMo is the result of collaboration with the Kempner Institute for the Study of Natural and Artificial Intelligence at Harvard University, along with partners such as AMD, CSC, the Paul G. Allen School of Computer Science & Engineering at the University of Washington, and Databricks.

Developed on the LUMI supercomputer at CSC, powered by AMD EPYC™ processors and AMD Instinct™ accelerators, the OLMo 7B and 1B models were trained using Databricks' MosaicML platform.

The comprehensive AI development tools included in the framework are:

- Complete pre-training data: OLMo is built on Ai2's Dolma dataset, an open corpus of three trillion tokens for language model pre-training, including the code that generates the training data.
- Model weights and training code: The OLMo framework provides complete model weights for four 7B-scale model variants, each trained on at least 2 trillion tokens, along with inference code, training metrics, and training logs.
- Evaluation: Ai2 has released the evaluation suite used during development, featuring over 500 checkpoints per model, one every 1000 steps in the training process, and evaluation code governed by the Catwalk project.
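The figures above imply a rough training scale that can be checked with simple arithmetic. A minimal sketch, using only the numbers quoted in this article (over 500 checkpoints, one every 1000 steps, and at least 2 trillion training tokens) and assuming the lower bounds hold exactly:

```python
# Back-of-the-envelope check of the OLMo checkpoint cadence.
# Assumptions (lower bounds from the article, not an official spec):
# >=500 checkpoints per model, one checkpoint every 1000 training steps,
# and >=2 trillion training tokens per 7B-scale model.

checkpoints = 500                 # "over 500 checkpoints per model"
steps_per_checkpoint = 1000       # "every 1000 steps"
total_steps = checkpoints * steps_per_checkpoint

tokens_total = 2_000_000_000_000  # "at least 2 trillion tokens"
tokens_per_step = tokens_total // total_steps

print(f"implied training steps: {total_steps:,}")       # 500,000
print(f"implied tokens per step: {tokens_per_step:,}")  # 4,000,000
```

Under these assumptions, each model would span at least 500,000 training steps, with roughly 4 million tokens consumed per step.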

Eric Horvitz, Chief Scientist at Microsoft and founding member of Ai2's Scientific Advisory Council, expresses excitement about making OLMo accessible to AI researchers, following Ai2's tradition of providing valuable open models, tools, and data that drive advancements across the global AI community.

OLMo offers AI researchers and developers:

- Enhanced precision: With a comprehensive understanding of the underlying training data, researchers can work more efficiently, testing the model scientifically instead of relying on qualitative assumptions.
- Reduced carbon footprint: By granting full access to the training and evaluation ecosystem, Ai2 significantly reduces redundancies in the development process, which is crucial for lowering AI's carbon emissions.
- Sustainable outcomes: Keeping models and datasets open enables researchers to learn from and build upon prior models and work.

Ai2 plans to expand the OLMo family by introducing various model sizes, modalities, datasets, and capabilities in the near future.

Noah Smith, OLMo Project Leader, Senior NLP Research Director at Ai2, and professor at the Allen School of the University of Washington, concludes:

With OLMo, 'open' truly means 'open,' and the entire AI research community gains access to every facet of model creation, including training code, evaluation methods, data, and more. While AI was once an open field driven by an active research community, the evolution of models into commercial products prompted work on AI to shift behind closed doors. OLMo aims to reverse this trend, empowering the research community to collaboratively understand and engage with language models scientifically, leading to more responsible AI technology that benefits everyone.