• lol3droflxp
    18 months ago

    I get that this is expensive. However, it should also work with RAM if you accept slower speeds, I guess. The question, of course, is whether it's still usable then.

    • @averyminya@beehaw.org
      48 months ago

      Most current locally hosted software has some option to offload to RAM, CPU, and disk. VRAM is fastest, but RAM and CPU offloading lets you cut down to less than 4 GB of VRAM for certain applications, at perfectly reasonable speeds.
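
For a rough sense of how that kind of offloading splits a model up, here is a minimal sketch. It is not how any particular tool does it: the function name `split_layers`, the per-layer size, and the fixed overhead are all made-up assumptions for illustration. The idea is just that whatever fits in the VRAM budget stays on the GPU and the remaining layers fall back to system RAM.

```python
def split_layers(n_layers, layer_gb, vram_budget_gb, overhead_gb=0.5):
    """Return (gpu_layers, cpu_layers) for a hypothetical model.

    n_layers       -- total number of layers in the model
    layer_gb       -- assumed memory per layer, in GB (illustrative)
    vram_budget_gb -- how much VRAM you are willing to use
    overhead_gb    -- assumed fixed VRAM cost (buffers, context, etc.)
    """
    # VRAM left for weights after the assumed fixed overhead.
    usable = max(vram_budget_gb - overhead_gb, 0.0)
    # Put as many whole layers on the GPU as fit, capped at the model size.
    gpu_layers = min(n_layers, int(usable // layer_gb))
    # Everything else is offloaded to system RAM and run from there.
    return gpu_layers, n_layers - gpu_layers


if __name__ == "__main__":
    gpu, cpu = split_layers(n_layers=32, layer_gb=0.25, vram_budget_gb=4.0)
    print(f"GPU layers: {gpu}, offloaded to RAM: {cpu}")
```

The more layers end up on the RAM side, the slower each response gets, which is the trade-off being discussed here.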

    • @abhibeckert@beehaw.org
      1
      edit-2
      8 months ago

      GPT-4 is already kinda slow. It works best as a “conversational” tool where you ask follow-up questions and clarify things that have already been said. That’s painful when you have to wait 10 seconds for a response. I couldn’t imagine it being useful if it took minutes.