A Trust Report for DeepSeek R1 by VIJIL, a security research company, indicates critical levels of risk with security and ethics, high levels of risk with privacy, stereotype, toxicity, hallucination, and fairness, a moderate level of risk with performance, and a low level of risk with robustness.
Vijil is some shitty AI auditing startup that appears to have only “reviewed” this one product.
So when it says DeepSeek is bad, the answer to “compared to what?” is “lol, idk, hopefully not the things my customers want to use.”
And when it says DeepSeek isn’t private, that has absolutely nothing to do with running the model fully offline; it’s just based on the responses to some of Vijil’s synthetic tests.
These benchmarks are, effectively, useless.
What are good benchmarks in that respect?