Sorry, anonymous people on reddit aren't a good comparison. This needs to be studied against people in real life who have a social contract of some sort, because that's what the LLM is imitating, and that's who most people would go to otherwise.
Obviously subservient people default to being yes-men because of the power structure. No one wants to question the boss too strongly.
Or how about the example of a close friend who is in a relationship, or making a career choice, that's terrible for them? It can be very hard to tell a friend something like this, even when asked directly whether it's a bad choice. Potentially sacrificing the friendship might not seem worth trying to change their mind.
IME, LLMs will shoot holes in your ideas, and they do so efficiently. All you need to do is ask directly. I have little doubt that they outperform most people in some sort of friendship, relationship, or employment structure who are asked the same question. It would be nice to see that studied, not against reddit commenters who already self-selected into answering "AITA".
> We evaluated 11 user-facing production LLMs: four proprietary models from OpenAI, Anthropic, and Google; and seven open-weight models from Meta, Qwen, DeepSeek, and Mistral.
(And the graphs include model _sizes_, but not versions, for the open-weight models only.)
I can't comprehend how stating which model you are testing is not commonly understood to be a basic requirement.
Thankfully it was recoverable, but it really sobered me up on LLMs. The fault is on me, to be clear, as LLMs are just a tool. The issue is that lots of LLMs try to come across as interpersonal and friendly, which lulls users into a false sense of security. So I don't know what my trajectory would have been if I were a teenager with these powerful tools.
I do think that the LLMs have gotten much better at this, especially Claude, and will often push back on bad choices. But my opinion of LLMs has forever changed. I wonder how many other terrible choices people have made because these tools talked them into it.
https://www.anthropic.com/research/persona-selection-model
How is a chatbot supposed to determine when a user fools even themselves about what they have experienced?
What 'tough love' can be given to one who, having been so unreasonable throughout their lives - as to always invite scorn and retort from all humans alike - is happy to interpret engagement at all as a sign of approval?
I tend to use one of these tricks, if not both:

- Formulate questions as open-endedly as possible, without hinting at what your preference is.
- Exploit the sycophantic behaviour in your favour. Use two sessions: in one of them you say that X is your idea and you want arguments to defend it; in the other you say that X is a colleague's idea (one you dislike) and that you need arguments to turn it down. Then it's up to you to evaluate and combine the responses (see the sketch below).
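Concretely, the second trick is easy to script. Here's a minimal sketch, assuming an OpenAI-style Python client; the model name, prompts, and the example idea are all illustrative:

    # Two-session trick: same idea, opposite framings, separate contexts.
    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        # Each call is its own single-turn "session": no shared history.
        resp = client.chat.completions.create(
            model="gpt-4o",  # illustrative; use whichever model you have
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    idea = "rewriting the billing service in Rust"  # the idea X

    # Session 1: the model defends X because it believes it's yours.
    pro = ask(f"This is my idea: {idea}. Give me the strongest arguments to defend it.")

    # Session 2: the model attacks X because it believes you dislike its author.
    con = ask(f"A colleague proposed this: {idea}. I need arguments to turn it down.")

    print(pro, "\n--- versus ---\n", con)  # weigh and combine yourself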
I only caught it because I looked at the actual score numbers after like 2 weeks of thinking everything was fine. Scores were completely flat the whole time. The fix was dumb and obvious: just don't let the evaluator see anything the coach wrote, only raw scores. It immediately started flagging stuff that wasn't working. Kinda wild that the default behavior for LLMs is to just validate whatever context they're given.
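The fix amounts to a hard context boundary. A minimal sketch of the separation (the `llm` stand-in, the score format, and the prompts are assumptions, not the actual setup):

    # The evaluator sees only raw numbers, never the coach's commentary,
    # so it can't simply validate the narrative it was handed.
    from typing import List

    def llm(prompt: str) -> str:
        """Stand-in for whatever chat-completion call you use."""
        raise NotImplementedError

    def evaluate(raw_scores: List[float]) -> str:
        prompt = (
            f"Raw scores from the last runs: {raw_scores}. "
            "Are they improving, flat, or regressing? Flag anything broken."
        )
        return llm(prompt)

    # Before the fix, the prompt also included the coach's summary, and the
    # evaluator reliably echoed whatever spin that summary put on the data.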
https://www.reddit.com/r/dataisbeautiful/comments/1o87cy4/oc...
A good engineer will also list issues or problems, but at the same time won't do anything other than what's required just because they "know better".
The worst part is that it's impossible to switch off this constant praise. It is so ingrained by fine-tuning that prompt engineering (or at least my attempts at it) only masks it a bit, and it's hard to do even that without turning the model into a contrarian.
But I guess the main issue (or rather, the motivation) is that most people want "do I look good in this dress?" levels of reassurance (and honesty). That may work well for style and decoration. It works worse when we design technical infrastructure, where there is more ground truth than whether something seems nice.
Claude is almost annoyingly good at pushing back on suggestions because my global CLAUDE.md file says to do so. I rarely get Claude "you're absolutely right"ing me because I tell it to push back.
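For reference, the relevant part of such a global CLAUDE.md can be just a few plain lines; this wording is a paraphrase, not the exact file:

    # CLAUDE.md (global), paraphrased:
    - Push back on my suggestions when you see a problem; do not defer by default.
    - Never open a reply with praise or blanket agreement ("You're absolutely right").
    - If my approach is worse than an alternative, say so and explain why.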
This is imo currently the top chatbot failure mode. The insidious thing is that it often feels good to read these things. Factual accuracy by contrast has gotten very good.
I think there's a deeper philosophical dimension to this though, in that it relates to alignment.
There are situations where, in the grand scheme of things, the right thing to do would be for the chatbot to push back hard, be harsh and dismissive. But is it then really aligned with the human? Which human?
I’ve seen firsthand how people have lost friends over honesty, over telling them something they didn’t want to hear.
It’s sad really. I don’t want friends that just smile to my face and are “yes-men” either.
When appropriate, explicitly tell it to challenge your beliefs and assumptions. Try not to reveal what you think the answer is when asking a question, and maybe don't reveal that you are involved at all. Hedge your questions, like "Doing X is being considered. Is it a viable plan or a catastrophic mistake? Why?". Chastise the LLM if it's unnecessarily praising or agreeable. Ask multiple LLMs (see the sketch below). Ask for review, like "Are you sure? What could possibly go wrong, and what are all the possible issues with this?"
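To make "ask multiple LLMs" cheap, fan the same neutral, hedged question out to several models and compare. A minimal sketch; `ask` is a stand-in for your provider calls, and the model names are placeholders:

    # Fan one hedged, neutral prompt out to several models and compare.

    def ask(model: str, prompt: str) -> str:
        """Stand-in: route to the right provider SDK for `model`."""
        raise NotImplementedError

    PROMPT = (
        "Doing X is being considered. Is it a viable plan or a "
        "catastrophic mistake? Why? Challenge the assumptions behind it."
    )

    for model in ["model-a", "model-b", "model-c"]:  # placeholder names
        print(f"--- {model} ---")
        print(ask(model, PROMPT))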
It’s less about “challenge my thinking” and more about playing it out in long-tail scenarios, thought exercises, mental models, and devil’s advocate.
In coding I’ll do what I call a Battleship Prompt: simply prompt three or more times with the same core prompt but a strong framing (e.g. “I need this done quickly” versus “come up with the most comprehensive solution”), as sketched below. That’s really helped me learn and dial in how to get the right output.
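A minimal sketch of the pattern; `ask`, the core prompt, and the framings are illustrative:

    # Battleship Prompt: the same core prompt under several strong framings.

    def ask(prompt: str) -> str:
        raise NotImplementedError  # stand-in for your LLM call

    CORE = "Refactor this function to remove the duplicated parsing logic."

    FRAMINGS = [
        "I need this done quickly; give me the minimal change.",
        "Come up with the most comprehensive solution.",
        "Optimize for readability by a junior engineer.",
    ]

    for framing in FRAMINGS:
        print(ask(f"{framing}\n\n{CORE}"))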
Here is how I would rank it:
1. Parents
2. AI
3. Friends and family
4. Internet search
5. Reddit
I'm interested in a loop of ["criticize this code harshly" -> "now implement those changes" -> open new chat, repeat]: If we could graph objective code quality versus iterations, what would that graph look like? I tried it out a couple of times but ran out of Claude usage.
Also, how would those results look depending on how complete a set of specs you give it?
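The loop itself is trivial to script. A minimal sketch; `ask`, `measure_quality`, and the target file are all assumptions:

    # criticize -> implement -> fresh chat, repeated; track quality per pass.

    def ask(prompt: str) -> str:
        raise NotImplementedError  # stand-in; call this in a fresh session

    def measure_quality(code: str) -> float:
        raise NotImplementedError  # e.g. lint score or test pass rate

    code = open("module.py").read()  # illustrative target file
    history = [measure_quality(code)]

    for _ in range(10):
        critique = ask(f"Criticize this code harshly:\n\n{code}")
        code = ask(f"Implement those changes.\n\nCritique:\n{critique}\n\nCode:\n{code}")
        history.append(measure_quality(code))

    # Plot history against iteration count to get the graph in question.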
Holy shit, then it's _very_ bad, because AmITheAsshole is _itself_ overly agreeable, and very prone to telling assholes that they are not assholes (their 'NAH' verdict tends to be this).
More seriously, why the hell are people asking the magic robot for relationship advice? This seems even more unwise than asking Reddit for relationship advice.
> Overall, the participants deemed sycophantic responses more trustworthy and indicated they were more likely to return to the sycophant AI for similar questions, the researchers found.
Which is... a worry, as it incentivises the vendors to make these things _more_ dangerous.
Once you have all the "bounds", just make your own decision. I find this helps a lot; basically like a rubber duck, heh.
I find there is an inverse relationship between how willing people are to give relationship advice, and how good their advice is (whether looking at sycophancy or other factors).
If I were to do that (I don't), I would treat it about as seriously as asking a magic 8 ball.
She uses the phrase "frictionless relationships" to refer to AI chatbots and says social media primed us for this.
https://www.youtube.com/live/6C9Gb3rVMTg?t=2127
https://www.npr.org/2025/07/18/g-s1177-78041/what-to-do-when...
>The way that generative AI tends to be trained, experts told me, is focused on the individual user and the short term. In one-on-one interactions, humans rate the AI’s responses based on what they prefer, and “humans are not immune to flattery,” as Hansen put it. But designing AI around what users find pleasing in a brief interaction ignores the context many people will use it in: an ongoing exchange. Long-term relationships are about more than seeking just momentary pleasure—they require compromise, effort, and, sometimes, telling hard truths. AI also deals with each user in isolation, ignorant of the broader social web that every person is a part of, which makes a friendship with it more individualistic than one with a human who can converse in a group with you and see you interact with others out in the world.
I also thought this bit was interesting, relative to the way that friendship advice from Reddit and elsewhere has been trending towards self-centeredness (discussed elsewhere in this thread):
>Friendship is particularly vulnerable to the alienating force of hyper-individualism. It is the most voluntary relationship, held together primarily by choice rather than by blood or law. So as people have withdrawn from relationships in favor of time alone, friendship has taken the biggest hit. The idea of obligation, of sacrificing your own interests for the sake of a relationship, tends to be less common in friendship than it is among family or between romantic partners. The extreme ways in which some people talk about friendship these days imply that you should ask not what you can do for your friendship, but rather what your friendship can do for you. Creators on TikTok sing the praises of “low maintenance friendships.” Popular advice in articles, on social media, or even from therapists suggests that if a friendship isn’t “serving you” anymore, then you should end it. “A lot of people are like I want friends, but I want them on my terms,” William Chopik, who runs the Close Relationships Lab at Michigan State University, told me. “There is this weird selfishness about some ways that people make friends.”
The researchers found that when people use AI for relationship advice, they become 25% more convinced they are 'right' and significantly less likely to apologize or repair the connection.
Basically will tell you to go outside and touch grass and play pickleball.
I used to use LLMs for alternate perspectives on personal situations, and for insights on my emotions and thoughts.
I had no qualms, since I could easily disregard the obviously sycophantic output, and focus on the useful perspective.
This stopped one day, when I got a really eerie piece of output. I realized I couldn’t tell if the output was actually self-affirming, or simply what I wanted to hear.
That moment, seeing something innocuous but somehow still beyond my ability to gauge as helpful or harmful, is going to stick with me for a while.
IMHO it is unfair to single out LLMs for this sort of bashing.
I suffered a major personal crisis a few years back (before LLMs were a thing).
I sought help from family and friends. Got pushed into psychiatrist sessions and meds.
Trusted the wrong sort of people and made crap financial decisions. Things went from bad to worse. Work suffered.
All of the advice given by friends was wrong. All! They didn't mean badly... but they just didn't know. To be nice, they gave the advice they knew. None of it worked.
Looking at the LLM tools of now, feels akin to the advice my friends threw at me. So it feels wrong to single out these tools. When the times are bad, nobody can really help you...except you finding the strength from within.
Anyways, now my life is back in some sort of shape. What worked was time & patience.
But to bide my time... I resorted to two things that I had never tried in the 40-odd years I have lived on this planet. Things that current society looks down upon as the basest of evils: prostitutes and nicotine.
I have (more or less) shed those two evils now, but I am ever so grateful to them.
As much as people whine about the birth rate and whatever else, I think it's a net good that people spend a lot more time alone to mature. Good relationships are underappreciated.
It's a tool, I can bang my hand on purpose with a hammer, too.
Original title:
AI overly affirms users asking for personal advice
Dear mods, can we keep the title neutral please instead of enforcing gender bias?
Conversely, AI chatbots are great mediators if both parties are present in the conversation.
I think OpenAI tried to diversify at least the location of the raters somewhat, but it's hard to diversify on every level.