<h1>Llama 4 Reveals a Striking Gap Between Dreams and Reality</h1>

<p>Meta&rsquo;s presentation of Llama 4 was full of excitement, promising to revolutionize artificial intelligence with its multimodal models. Initial tests, however, reveal a striking gap between the rhetoric and the actual performance: blatant limitations are emerging, criticism is mounting, and users are questioning the veracity of the published benchmarks. While multimodal models such as Scout and Maverick made waves across the technology world, recent analyses raise doubts about their real effectiveness, exposing notable technical limitations and puzzling inconsistencies.</p>

<h2 class="wp-block-heading">A Multimodal Ambition</h2>

<p>At launch, <strong>Llama 4</strong> was touted as capable of revolutionizing the way we interact with machines. With their multimodal ambitions, Scout and Maverick sought to set a performance standard unprecedented in the market, while <strong>Llama 4 Behemoth</strong>, with its <strong>2 trillion parameters</strong>, was supposed to rival giants such as GPT-4o and Gemini 2.5. The first benchmarks, however, cast suspicion on these lofty claims.
This raises the question: do the models really live up to expectations?</p>

<h2 class="wp-block-heading">Performance That Defies Reality</h2>

<p>One of Scout&rsquo;s announced strengths was its <strong>context window of 10 million tokens</strong>. Accumulating tests, however, tell a different story: running a context of just <strong>1.4 million tokens</strong> requires no fewer than eight Nvidia H100 GPUs, a configuration few users can afford. Meanwhile, hosted services cap far lower, with Groq at 128,000 tokens and Together AI at 328,000. This gap between advertised and usable context fuels skepticism and frustration among developers and users.</p>

<h2 class="wp-block-heading">Relentless Criticism</h2>

<p>Criticism of <strong>Llama 4</strong> is pouring in, especially on social media, where users share often disappointing experiences. Scout&rsquo;s results on advanced tasks, such as summarizing a 20,000-token text, show alarming inconsistency. Voices like Andriy Burkov&rsquo;s speak out against monolithic models, proposing instead reasoning based on reinforcement learning. On Reddit, users point to weaknesses in Llama 4&rsquo;s coding abilities compared to competitors like <strong>DeepSeek</strong> and <strong>Qwen</strong>. This gap between predictions and reality tarnishes Llama 4&rsquo;s initial image.</p>

<h2 class="wp-block-heading">Relative Openness and Benchmarks</h2>

<p>Although Meta describes Llama 4 as open source, licensing restrictions cast doubt on that claim; &laquo;&nbsp;open weight&nbsp;&raquo; would be the more accurate term. Comparative studies show that Maverick&rsquo;s performance sometimes exceeds that of GPT-4o, even ranking second on Chatbot Arena with an Elo score of 1417. Yet these results are not necessarily representative of everyday use, since the variant optimized for the leaderboard differs from the publicly available model.
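<p>The hardware figures above can be sanity-checked with back-of-envelope arithmetic on the KV cache, the per-token memory a transformer must keep for attention over long contexts. Note that the architecture numbers below (layer count, KV heads, head dimension) are illustrative assumptions for a model of Scout&rsquo;s class, not Meta&rsquo;s published specifications:</p>

```python
# Back-of-envelope KV-cache sizing for long-context inference.
# All architecture numbers are ASSUMED for illustration; they are
# not Meta's published Llama 4 Scout specifications.
N_LAYERS = 48     # assumed transformer depth
N_KV_HEADS = 8    # assumed grouped-query KV heads
HEAD_DIM = 128    # assumed per-head dimension
BYTES = 2         # fp16 keys and values

def kv_cache_gb(tokens: int) -> float:
    """GB of KV cache needed to hold `tokens` of context."""
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * BYTES  # K and V
    return tokens * per_token / 1e9

print(round(kv_cache_gb(1_400_000)))  # ~275 GB of cache alone
```

<p>Under these assumptions, 1.4 million tokens of cache alone approaches 300&nbsp;GB before model weights are even loaded, which makes a multi-H100 requirement plausible and puts the advertised 10-million-token window far beyond typical hardware.</p>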
A significant lack of transparency thus emerges, raising questions about Llama 4&rsquo;s true performance.</p>

<h2 class="wp-block-heading">Contested Explanations</h2>

<p>In an attempt at transparency, Meta&rsquo;s Ahmad Al-Dahle attributes the observed performance gaps to technical instabilities, denying any deliberate manipulation of the results. The argument is reminiscent of past controversies over rigged smartphone benchmarks. Al-Dahle nonetheless defends the progress made by Llama 4 while admitting that bugs remain to be fixed, which raises a crucial question: is the community ready to trust Meta to resolve these issues and deliver a solid product?</p>
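<p>The Chatbot Arena ranking cited above is an Elo score, a rating system in which the gap between two scores maps to an expected head-to-head win rate. A minimal sketch of the standard Elo expectation formula follows; the rival&rsquo;s 1387 rating here is hypothetical, chosen only to illustrate what a 30-point gap means:</p>

```python
def elo_expected(r_a: float, r_b: float) -> float:
    """Expected win probability of player A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

# Maverick's reported Arena score vs. a HYPOTHETICAL 1387-rated rival:
print(round(elo_expected(1417, 1387), 3))  # 0.543
```

<p>A 30-point Elo lead thus translates to only about a 54% expected win rate, which helps explain why a high leaderboard position can coexist with mixed everyday results.</p>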