Sakana AI employed a technique called “model merging”, which combines existing AI models into a new one, and paired it with an approach inspired by evolution, producing hundreds of model generations.
The most successful models from each generation were then identified, becoming the “parents” of the next generation.
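The process described above can be sketched in a few lines. This is a minimal illustration, not Sakana AI's actual method: here each “model” is just a list of weights, merging is simple weight interpolation, and `fitness` is a toy stand-in for whatever benchmark score would rank the merged models.

```python
import random

def merge(parent_a, parent_b, alpha=0.5):
    """Merge two models by interpolating their weights (illustrative)."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(parent_a, parent_b)]

def fitness(model):
    """Toy fitness: prefer weights close to a hypothetical target of 1.0."""
    return -sum((w - 1.0) ** 2 for w in model)

def evolve(population, generations=100, survivors=4):
    for _ in range(generations):
        # The most successful models become the "parents" of the next generation.
        parents = sorted(population, key=fitness, reverse=True)[:survivors]
        children = [
            merge(random.choice(parents), random.choice(parents),
                  alpha=random.random())
            for _ in range(len(population) - survivors)
        ]
        population = parents + children
    return max(population, key=fitness)

random.seed(0)
pop = [[random.uniform(-2, 2) for _ in range(8)] for _ in range(12)]
best = evolve(pop)
```

Because the fittest models survive each generation unchanged, the best score can only improve or stay flat as the generations accumulate.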
The company is releasing three Japanese language models, two of which are being open-sourced, Sakana AI founder David Ha told Reuters in online remarks from Tokyo.
The company’s founders are former Google researchers Ha and Llion Jones.
Jones is an author of Google’s 2017 research paper “Attention Is All You Need”, which introduced the “transformer” deep learning architecture that underpins the viral chatbot ChatGPT and set off the race to develop products powered by generative AI.
Ha was previously the head of research at Stability AI and a Google Brain researcher. All the authors of the ground-breaking Google paper have since left the organisation.
Venture investors have poured millions of dollars into the ventures those authors went on to found, such as the AI chatbot startup Character.AI, run by Noam Shazeer, and the large language model startup Cohere, founded by Aidan Gomez.
Sakana AI seeks to put the Japanese capital on the map as an AI hub, just as OpenAI did for San Francisco and DeepMind did for London. In January, Sakana AI said it had raised $30 million in seed financing led by Lux Capital.