Microsoft Takes on AI Rivals With 3 New Foundational Models 

The launch shows Microsoft’s growing effort to build its own AI technology, even as it continues to work closely with OpenAI. 


San Francisco: Microsoft has announced the launch of 3 new foundational AI models in a bid to compete with major AI players such as OpenAI and Google. The models can generate text, voice, and images, covering key areas of AI development. 

The announcement was made by Microsoft AI, the company’s research division.

The 3 models are named MAI‑Transcribe‑1, MAI‑Voice‑1, and MAI‑Image‑2. Together, they offer tools for speech‑to‑text conversion, audio generation, and image creation.  

MAI‑Transcribe‑1 focuses on converting spoken words into text. Microsoft says it supports 25 languages and works 2.5 times faster than its earlier Azure Fast transcription service. 

MAI‑Voice‑1 is designed to generate realistic audio. The model can create 60 seconds of voice output in just one second and allows users to create a custom‑sounding voice. 

MAI‑Image‑2 is an updated image generation model. It was first introduced on MAI Playground in March and is now being made more widely available.  

All 3 models are being released on Microsoft Foundry, the company’s AI development platform. The transcription and voice models are also available through MAI Playground for testing and experimentation. 

The models were built by Microsoft’s MAI Superintelligence team, which was formed in November 2025. The team is led by Mustafa Suleyman, the CEO of Microsoft AI.  

Suleyman said Microsoft is focused on building “human‑centered” AI that fits naturally into how people communicate and work. He also hinted that more models are expected in the coming months. 

Microsoft believes pricing will be a key advantage. The company says its new AI models are cheaper to use than similar offerings from major competitors. Pricing starts at USD 0.36 per hour for MAI‑Transcribe‑1. MAI‑Voice‑1 costs USD 22 per one million characters, while MAI‑Image‑2 starts at USD 5 per million input tokens and USD 33 per million output tokens.
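To put the quoted rates in perspective, here is a rough cost estimate in Python. It is only a sketch: it uses the starting prices reported above, assumes "per hour" means per hour of audio processed, and the function names and sample workload are hypothetical. Actual billing on Microsoft Foundry may work differently.

```python
# Starting prices as quoted in the article (USD). Assumption: the
# transcription rate is charged per hour of audio processed.
TRANSCRIBE_USD_PER_AUDIO_HOUR = 0.36
VOICE_USD_PER_MILLION_CHARS = 22.0
IMAGE_USD_PER_MILLION_INPUT_TOKENS = 5.0
IMAGE_USD_PER_MILLION_OUTPUT_TOKENS = 33.0


def transcribe_cost(audio_hours: float) -> float:
    """Estimated MAI-Transcribe-1 cost for a given number of audio hours."""
    return audio_hours * TRANSCRIBE_USD_PER_AUDIO_HOUR


def voice_cost(characters: int) -> float:
    """Estimated MAI-Voice-1 cost for a given number of characters."""
    return characters / 1_000_000 * VOICE_USD_PER_MILLION_CHARS


def image_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated MAI-Image-2 cost for given input and output token counts."""
    return (
        input_tokens / 1_000_000 * IMAGE_USD_PER_MILLION_INPUT_TOKENS
        + output_tokens / 1_000_000 * IMAGE_USD_PER_MILLION_OUTPUT_TOKENS
    )


if __name__ == "__main__":
    # Hypothetical monthly workload, chosen only for illustration.
    print(f"Transcription, 100 audio hours: USD {transcribe_cost(100):.2f}")
    print(f"Voice, 2 million characters:    USD {voice_cost(2_000_000):.2f}")
    print(f"Image, 1M in / 1M out tokens:   USD {image_cost(1_000_000, 1_000_000):.2f}")
```

At these starting rates, the sample workload comes to USD 36.00 for transcription, USD 44.00 for voice, and USD 38.00 for image generation.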

Despite building its own AI models, Microsoft has reiterated its commitment to OpenAI. The company has invested over USD 13 billion in the partnership and continues to integrate OpenAI tools into its products.  

With this launch, Microsoft is clearly signaling its intent to become more self‑reliant in AI. The move strengthens its foothold in the increasingly competitive AI market and gives customers more options within Microsoft’s ecosystem.

Published on April 3, 2026

Yashika Aneja

Journalist

Yashika Aneja is a journalist at Tea4Tech with over five years of experience in reporting and editorial writing. Her work spans technology, environment, education, politics, social media, travel, and lifestyle, with a focus on fact-based reporting and explanatory storytelling. At Tea4Tech, Yashika contributes original reporting and analysis that ad...
