• theunknownmuncher@lemmy.world
          20 hours ago

          “I don’t think any of that is true. Show me data.” Gets shown data. “I won’t accept that data!” Lol. Lmao even.

          Yeah, I’m not going to play this game of trying to anticipate which numbers you’re willing to accept and which you aren’t. You have the same access to a search engine that I do. All of the results I have seen align with the numbers that Qwen released and are well within the margin of error.

          This model’s release caused such a stir because it reproducibly meets or beats Claude Opus 4.5 while being locally runnable. If you won’t believe it, okay, I don’t care. 🤷

          • AtHeartEngineer@lemmy.world
            19 hours ago

            I did look it up before I commented at all, but what I was looking at didn’t give a good picture. They are pretty close.

            My bad. I’m still not going to take any frontier lab’s word for it, I hope you get that. And for real, I was not/am not trying to be a dick. The benchmarks I saw said Opus 4.5 was winning out on reasoning, and I saw some others that were a lot more mixed.

            Are you running it? What quant/hardware? How fast is it in practice?

            • theunknownmuncher@lemmy.world
              19 hours ago

              I run 27b at q8 with unquantized KV cache and 256k context on two Instinct MI60 GPUs. Definitely the best model that I have been able to run locally at a reasonable speed. 35b generates tokens as fast as you’d expect from any cloud provider. 27b is slower than 35b, of course, but token generation is still faster than my reading speed and suitable for use with coding agents.
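
              For anyone sizing hardware for a setup like this, here is a rough back-of-the-envelope sketch of the memory math. It assumes q8 means a GGUF-style Q8_0 quant at about 8.5 bits per weight and a 16-bit KV cache. The layer count, KV head count, and head dimension below are hypothetical placeholders, not this model's real config, so swap in the values from the actual model card before trusting the result.

```python
# Rough VRAM estimate: quantized weights plus an unquantized (16-bit) KV cache.
# Ignores activations, compute buffers, and per-GPU overhead.

GIB = 2**30


def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / GIB


def kv_cache_gib(layers: int, kv_heads: int, head_dim: int,
                 context_tokens: int, bytes_per_value: int = 2) -> float:
    """Approximate KV cache size in GiB for a standard attention stack.

    K and V each hold layers * kv_heads * head_dim values per token,
    hence the leading factor of 2.
    """
    total_bytes = (2 * layers * kv_heads * head_dim
                   * context_tokens * bytes_per_value)
    return total_bytes / GIB


if __name__ == "__main__":
    # 27B parameters at an assumed 8.5 bits per weight.
    w = weights_gib(27, 8.5)
    # HYPOTHETICAL architecture values, for illustration only.
    kv = kv_cache_gib(layers=16, kv_heads=4, head_dim=256,
                      context_tokens=256 * 1024)
    print(f"weights: {w:.1f} GiB, KV cache: {kv:.1f} GiB, total: {w + kv:.1f} GiB")
```

              The weights figure depends only on the parameter count and the quant, so it is the more reliable half. The KV cache figure swings a lot with the architecture, which is why the same context length can be cheap on one model and expensive on another.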

              • AtHeartEngineer@lemmy.world
                17 hours ago

                How have I not heard of this line of GPUs?? wth. That’s a lot of wattage. I’ve got a 7900xtx in my desktop, and a modded 2080ti with 22gb vram and a 3060ti in my server (my old desktop hardware). I tried 27b dense at q4 or q5 a few weeks ago on my 2080ti and it was painfully slow, and I was getting pretty mixed quality results.

                32gb of vram is hard to come by.

                • theunknownmuncher@lemmy.world
                  17 hours ago

                  The wattage is actually relatively low compared to a lot of current gen GPUs (mainly NVIDIA ones). They are software capped to 225W, but the GPUs can handle 300W. Compare that to a 5090, which is like 600W.

        • theunknownmuncher@lemmy.world
          16 hours ago

          It’s not like the Qwen team hasn’t already built a lot of trust with the community. They’ve never been misleading with previous releases. The “marketing material” (🙄) is for a free product, so they have no incentive to lie, and lying would be extra stupid because anyone can run the benchmarks and verify their numbers independently anyway. What would be the point?