Models and Benchmarks in Finance

Introducing BloombergGPT, Bloomberg’s 50-billion parameter large language model, purpose-built from scratch for finance

NEW YORK – Bloomberg today released a research paper detailing the development of BloombergGPT™, a new large-scale generative artificial intelligence (AI) model. This large language model (LLM) has ...

Myrtle.ai Halves Latency in Financial Machine Learning Inference Benchmark Record with VOLLO

VOLLO® product has recently been audited by STAC®, a leading benchmark authority for the finance industry.[1] The results, ...

OfficeChai

Moonshot AI Releases Kimi K2.6, Beats Top US Models On Some Benchmarks

Even as frontier models from the US keep getting better, Chinese open-source is more than keeping up. Moonshot AI, the Beijing-based startup behind ...

TechCrunch

The rise of AI ‘reasoning’ models is making benchmarking more expensive

AI labs like OpenAI claim that their so-called “reasoning” AI models, which can “think” through problems step by step, are more capable than their non-reasoning counterparts in specific domains, such ...

Hosted on MSN

Popular AI model performance benchmark may be flawed, Meta researchers warn

'We've identified multiple loopholes with SWE-bench Verified,' the manager at Meta Platforms' AI research lab Fair says. A popular benchmark for measuring the performance of artificial intelligence ...

Geeky Gadgets

AI Benchmarks Investigated: Do Companies Tune Private Builds for Leaderboards, Then Ship Weaker Versions?

Are AI benchmarks really the gold standard we’ve been led to believe? Matt Wolfe walks through how these widely accepted metrics, designed to measure the performance of artificial intelligence systems ...

Searchenginejournal.com

OpenAI Secretly Funded Benchmarking Dataset Linked To o3 Model

OpenAI secretly funded and had access to a benchmarking dataset, raising questions about high scores achieved by its new o3 AI model. Revelations that OpenAI secretly funded and had access to the ...
