Beginner questions thread (pinned) · noneabove1182@sh.itjust.works (mod) · English · 1 year ago · 28 comments · 35 up, 1 down
Returning to where it started with Llama 3 8B. DeepHermes is great for 8GB VRAM cards · SmokeyDope@lemmy.world · English · 22 hours ago · 3 comments · 17 up, 0 down
Mistral Small 3.1 released (link: mistral.ai) · NinjaMoves@feddit.nu · English · 6 hours ago · 2 comments · 18 up, 1 down
Recommendations for a lightweight Python LLM framework for a webapp? · hendrik@palaver.p3x.de · English · 1 day ago (edited) · 5 comments · 7 up, 1 down
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching (link: arxiv.org) · morrowind@lemm.ee · English · 1 day ago · 2 comments · 11 up, 0 down
Loaded benchmark for 1-3-4-7b models? · thickertoofan@lemm.ee · English · 2 days ago · 3 comments · 6 up, 0 down
DeepHermes Preview features swappable standard output to R1 distill CoT reasoning. It's kind of blowing my mind. (image: lemmy.world) · SmokeyDope@lemmy.world · English · 2 days ago (edited) · 5 comments · 8 up, 0 down
Can Your LLM pass the Sun Theft Vibe Check (STVC) benchmark? · SmokeyDope@lemmy.world · English · 2 days ago (edited) · 2 comments · 11 up, 0 down
New release: Gemma 3 family of models (link: huggingface.co) · Lantier@jlai.lu · English · 5 days ago (edited) · 3 comments · 20 up, 0 down
Gemma 3 1B and 3B results on a "needle in a haystack"-like test run locally · thickertoofan@lemm.ee · English · 5 days ago · 1 comment · 12 up, 0 down
Reka Flash, open source 21B model comparable to QWQ 32B (image: i.postimg.cc) · morrowind@lemm.ee · English · 6 days ago · 2 comments · 19 up, 0 down
Is there a German 7B Vision Model? · Björn Tantau@swg-empire.de · English · 5 days ago (edited) · 3 comments · 6 up, 0 down
Sorting-Free GPU Kernels for LLM Sampling (link: flashinfer.ai) · morrowind@lemm.ee · English · 6 days ago · 0 comments · 5 up, 0 down
Qwen/QwQ-32B · Hugging Face (link: huggingface.co) · Lantier@jlai.lu · English · 12 days ago · 5 comments · 17 up, 0 down
Mac Studio 2025 · Oskar@piefed.social · English · 12 days ago · 6 comments · 11 up, 1 down
NVIDIA's GeForce RTX 4090 With 96GB VRAM Reportedly Exists; The GPU May Enter Mass Production Soon, Targeting AI Workloads (link: wccftech.com) · ikt@aussie.zone · English · 13 days ago · 4 comments · 20 up, 1 down
Chain of Draft: Thinking Faster by Writing Less (link: arxiv.org) · morrowind@lemm.ee · English · 14 days ago · 0 comments · 9 up, 1 down
Atom of Thoughts (AOT): lifts gpt-4o-mini to 80.6% F1 on HotpotQA, surpassing o3-mini and DeepSeek-R1 (link: bsky.app) · morrowind@lemm.ee · English · 15 days ago · 0 comments · 14 up, 2 down
Crossing the uncanny valley of conversational voice (link: www.sesame.com) · ikt@aussie.zone · English · 15 days ago · 6 comments · 27 up, 2 down
Apparently Microsoft's new phi-4-mini is business-tuned? (link: lemmy.blahaj.zone) · Smorty [she/her]@lemmy.blahaj.zone · English · 18 days ago · 7 comments · 35 up, 2 down