Since its launch in early 2022, Inflection AI has raised a total of $1.525 billion in funding.
On June 29, Palo Alto-based Inflection AI announced that it had raised $1.3 billion in a funding round led by Microsoft, Reid Hoffman, Bill Gates, Eric Schmidt and Nvidia. The new capital will partly fund a 22,000-unit cluster of Nvidia H100 Tensor Core GPUs, which the company claims would be the world's largest, to be used for training large-scale artificial intelligence models. The developers commented:
“Upon comparing our cluster to the recent TOP500 list of supercomputers, we conjecture that ours would be a close second, and potentially even the top entry. This, despite being optimized specifically for AI, as opposed to scientific applications.”
Inflection AI is also developing its own personal assistant AI, named “Pi.” According to the company, Pi is “a mentor, guide, confidant, creative collaborator, and sounding board” that can be accessed directly through social media or WhatsApp.
Despite the growing investment in large AI models, specialists caution that their practical efficiency could be considerably hampered by current technological constraints. Foresight, a Singapore-based venture fund, highlights the difficulty of moving the sheer volume of data such models involve, citing the example of an AI model with 175 billion parameters that occupies roughly 700GB:
“If we consider a scenario with 100 computing nodes, with each node needing to update all parameters at each step, each step would demand the transmission of around 70TB of data (700GB*100). With an optimistic assumption that each step takes 1s, then the required bandwidth would be 70TB per second, a demand far surpassing most networks’ capabilities.”
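The quoted figures can be reproduced with a quick back-of-the-envelope calculation. The Python sketch below assumes 4-byte (FP32) parameters, which is how 175 billion parameters come to roughly 700GB, and adds an illustrative comparison against an assumed 400 Gb/s per-node link; both the precision and the link speed are our assumptions for illustration, not figures from Foresight.

```python
# Back-of-the-envelope reproduction of the bandwidth estimate in the quote.
# Assumptions (ours, not Foresight's): 4-byte (FP32) parameters and a
# 400 Gb/s per-node interconnect used only for comparison.

PARAMS = 175e9            # 175 billion parameters
BYTES_PER_PARAM = 4       # assumed FP32 precision -> ~700 GB of weights
NODES = 100               # computing nodes in the quoted scenario
STEP_TIME_S = 1.0         # optimistic 1-second training step

model_bytes = PARAMS * BYTES_PER_PARAM          # ~700 GB
per_step_bytes = model_bytes * NODES            # ~70 TB moved per step
required_bandwidth = per_step_bytes / STEP_TIME_S

print(f"Model size:          {model_bytes / 1e9:,.0f} GB")
print(f"Data moved per step: {per_step_bytes / 1e12:,.0f} TB")
print(f"Required bandwidth:  {required_bandwidth / 1e12:,.0f} TB/s for 1 s steps")

# How long the same transfer would take on an assumed 400 Gb/s (50 GB/s)
# link, ignoring latency and congestion entirely.
link_bytes_per_s = 400e9 / 8
print(f"Transfer time at 400 Gb/s: {per_step_bytes / link_bytes_per_s:,.0f} s per step")
```

Even under these optimistic assumptions, a single step’s traffic would take on the order of 1,400 seconds to move, orders of magnitude longer than the 1s target.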
Furthering their point, Foresight also cautioned that “owing to communication latency and network congestion, data transmission time could far exceed the 1s estimate,” meaning computing nodes could spend much of their time waiting for data rather than performing actual computation. Given these constraints, Foresight argues, the answer lies in smaller AI models, which are “simpler to deploy and manage.”
“In many application settings, users or companies do not require the extensive reasoning capacity of large language models, but instead focus primarily on a very specific prediction target.”