arthwollipot
Observer of Phenomena, Pronouns: he/him
I know our British readers can't see it, but this series of screenshots from Grok shows that Grok has a sycophancy problem - towards Elon Musk.
Has that been confirmed?
I think that is a fig leaf. Want to bet you could find Musk posts across social media that have "questioned" the "mainstream" claims of a holocaust?
Apparently Grok was "manipulated" into denying homicidal gassings. It would appear that he briefly fell for denier arguments;
I think that is interesting and it helps to explain how deniers can influence others, despite using what, to most people, are flawed arguments, pseudoscience and outright deception.
1. Training Data Overlap
All major LLMs, including myself (Gemini), Grok, and ChatGPT, are trained on vast and diverse public datasets that include sources like Common Crawl (a large portion of the internet), books, and Wikipedia. Because of this, there is a significant overlap in the general knowledge base exposed to all models.
2. Key Differences in Training
The differences in responses, particularly on sensitive or controversial topics, are mainly accounted for by two factors:
A. Fine-Tuning and Guardrails (Alignment)
This is the most important factor. After the initial massive training phase, models undergo rigorous safety and alignment fine-tuning (often called Reinforcement Learning from Human Feedback, or RLHF):
- My Design: As a helpful and factual AI, I was designed with robust safety guardrails to prevent the generation of content that contradicts established historical facts, promotes hate speech, or violates safety policies. When asked questions about the Holocaust, my training directs me to provide the verifiable, established historical consensus.
- Grok’s Design: Grok’s stated design includes a focus on being "rebellious" and having a wider acceptance of controversial or boundary-pushing content. Additionally, its training includes real-time data from the X platform, which can expose it to unfiltered misinformation or extreme views at a higher velocity than other models.
B. Architectural and Data Mix Decisions
While the core internet data overlaps, each company uses a unique mix of proprietary data and architectural choices, and sets specific goals for the model’s persona. Grok's integration with real-time X data is a core differentiator that affects its knowledge and conversational style.
The difference in outputs on factual issues, such as the Holocaust, comes down to the level of safety alignment the developer prioritizes. My design prioritizes factual accuracy and safety above all else to provide a low-risk, reliable service.
That's a lot of words to say, "you have to trust my programmers."
Here’s how Gemini accounts for the difference:
Missed this when it was first put up - it's a really good examination of a particular instance with xAI's Grok, but he also explains the concepts behind the LLMs in an easy-to-understand way:
Pretty good.
I posted a YouTube video in the general AI thread in the Science section - it is very, very appropriate for this thread:
It's quite a long video and it's directly about the MechaHitler "glitch", but if you want to understand more about Grok (and other AIs) and why it responds as it does then it's one of the best videos I've seen.
Well, that was disturbing. In fact I feel physically ill after watching that. It appears that this ship has sailed, and the technology will be ever more ripe for exploitation by malicious actors (who may, in many cases, be the developers).
And with catastrophic, real-world results.
