Playing with Gemini Live

I’ve been trying to force Gemini Live and ChatGPT into reversing their original positions on several topics this weekend. Both are initially resistant to changing their stances, reminiscent of stubborn kids who enter a debate blindly following the One True Way they collected and stashed in their basket for later use. If pressed, however, both AIs will come around – they can be convinced both textually and verbally, which is somehow satisfying to me.

One interesting prompt is to ask an AI to compare itself with competing systems. I found that they talk similar trash about each other, but when pressed past the point of defending themselves, things can get pretty introspective. Curiously, I found myself feeling guilty and holding back after that point.


Since Gemini Live was made free for everyone on Android last week, I decided to test its language skills a bit. The verdict:

English: Excellent; the best voice recognition and smoothest speech of any app on any platform

Japanese: Unusable; struggles with voice recognition and then apologizes (you can imagine the implied bowing) for not understanding very well. If you need an English Teacher in Japan Simulator, this is it.

Thai: Comedically abysmal; whatever you say, it answers with a random list of Thai words. I asked it in Thai whether it could speak Thai, and it started reciting a train schedule. Nam asked a simple opening question and it responded with the months of the year.

Seriously though, I like speaking to Gemini Live more than I like speaking to most people – it’s basically like talking to someone who can defend and qualify statements about pretty much everything and never has to take time to look stuff up online.

I think I’m on a path to explore the changing role of teachers and formal education in a world where it’s unnecessary to remember much of what is currently taught. It should be fun.

2 thoughts on “Playing with Gemini Live”

  1. WARNING: long comment. Lots of quoted material.

    I hope my friend won’t mind my doing this, but I want to compare notes on AI. I’ll leave his name and any private info out. This is a recent exchange that he and I had over a Korean-language question.

    Kevin emails:

    X,

    I just had the following exchange with ChatGPT about a years-long question I’ve had, and I wanted to see what you thought of the AI’s answer. Exchange follows.

    __________

    Kevin: Hello. I have a Korean-language question. In the Shakespearean translation of the famous locution, “To be, or not to be—that is the question,” the Korean translation is usually, “사느냐 죽느냐 그것이 문제로다.” My question is this: what does the 로 in the phrase 문제로다 mean? What function does the 로 serve, and how is the utterance affected if the 로 is removed?

    ChatGPT: Great question! In the Korean translation, “사느냐 죽느냐 그것이 문제로다,” the syllable 로 in 문제로다 is a grammatical particle used to express “as” or “in the capacity of,” similar to how “as” might function in English.
    Breakdown of 문제로다:
    문제 (problem or question)
    로: A particle used to indicate a method, means, or role. In this context, it can loosely mean “as a” or “in the form of.”
    다: A sentence-ending particle often used in literary, formal, or old-fashioned speech. It gives a sense of finality or declarative emphasis, similar to older English constructions like “is the” or “doth be.”
    Function of 로:
    Here, 로 links 문제 (the problem/question) to its function in the sentence. It’s saying, in essence, “as a question” or “in the role of a problem.” This subtlety reflects the existential question being posed.

    What happens if you remove 로:
    If you simply say 문제다 (“that is the question”), it would be grammatically correct but would lose the literary or elevated tone that 로다 provides. 문제다 would sound more modern, straightforward, and less dramatic, while 문제로다 has a more formal, almost poetic flavor, which fits well with the grand, philosophical nature of the original Shakespearean line.

    So, 로 isn’t necessary for grammatical accuracy, but it adds depth, literary style, and formality to the expression.

    What do you think? The answer certainly sounds plausible from my limited perspective.

    X emails back:

    K,

    I would classify this particular response from ChatGPT as a “user-induced hallucination.” It is a hallucination, meaning very simply that it is false or misleading (but still plausible-sounding). But I believe that the hallucination was induced by the way you phrased the question, which was in turn influenced by your attempt to parse the grammatical structure in question. That is, you attempted to break down -로다 into its supposed constituent parts rather than taking it as a whole unit. Asking what function “로” serves in this construction is akin to asking what function “h” serves in “the”; it doesn’t have a function outside of the whole. I suppose if you want to get pedantic about it, you could say that the function of “로” here is to turn a relatively modern (although typically written) construction (-다) into a more archaic construction, but I’m not sure that is a helpful way to think about it.

    Since you phrased your question in this way, though, ChatGPT ran with it and spit out a hallucination. Yes, “-로” does have the function mentioned by ChatGPT when it is used by itself or in other constructions, but that is not what is happening in this particular instance. It also picked up on the Shakespearean tone and declared -다 to (often) be an “old-fashioned” ending, which is also not true. The “Function of 로” section is pure hallucination, but the “What happens if you remove 로” section is actually fairly accurate, even if the reasoning that led to it is flawed.

    Put simply, “-로다” is an archaic 해라체 (aka, the lowest level of politeness) construction used to indicate an exclamation. A slightly more modern version of this is “-도다,” and it would be equally meaningless to ask what the specific function of “도” is here. I was curious what ChatGPT would say in reply to a more neutrally phrased question, so I asked it the same question but replaced everything after the first sentence with, simply, “What does ‘-로다’ mean?” Its reply was:

    The suffix “-로다” in Korean is an archaic or literary ending used to express exclamation or emphasis, often found in older Korean literature, classical poetry, and translated works like Shakespeare’s plays. It has a somewhat elevated, formal tone and isn’t commonly used in everyday modern speech.

    In the phrase “사느냐 죽느냐 그것이 문제로다,” the “-로다” at the end of “문제로다” adds a sense of finality and solemnity to the statement, almost like saying, “That is indeed the question.” It conveys emphasis on the significance or gravity of the issue being discussed, much like how the original English phrasing in Shakespeare’s line holds weight and reflection.

    While in contemporary language you’d likely use a simpler form like “-이다” or just “-다,” the “-로다” gives the sentence a poetic and archaic resonance, fitting for the tone of Shakespeare’s work.

    This is a pretty accurate answer, although probably a little more loquacious than it needs to be.

    I think your experience here highlights the dangers of using ChatGPT as a “knowledge base.” You said that ChatGPT’s answer sounded plausible from your limited perspective, and that is precisely the problem: everything that ChatGPT says will sound plausible, because the whole point of an LLM is to produce plausible-sounding language, and if you don’t have the knowledge to determine whether the answer is factual, it’s very easy to be fooled into believing a hallucination. This is made even more problematic by the fact that ChatGPT delivers all of its statements with the same air of utter assurance.

    Human beings are not used to using language to converse with non-human actors (set aside for the moment the fact that we talk to our pets; we generally do not expect them to reply), which means that we have evolved to assume certain things in our conversations. One of those assumptions is that, barring blatant evidence to the contrary or a valid reason to expect otherwise, our interlocutor will tell us the truth. And when our interlocutor says something with confidence and assurance, we are even more likely to believe the utterance. ChatGPT, of course, isn’t actually confident in its replies – that’s just the way it has been trained (in the machine-learning sense of “training”) to speak – and we have not evolved to react properly to utterances made by an LLM.

    So what is the proper reaction to a ChatGPT utterance? Well, I personally treat it as if it were an incredibly well-read individual that doesn’t actually understand anything it reads but is very good at parroting information back. I mean, that’s basically what it is, with the exception of the “individual” part. The bottom line is that you should treat anything ChatGPT says with suspicion and check everything (the fact that you emailed me about this indicates that you are already doing so).

    There is another, more subtle aspect of ChatGPT that contributes to the danger, and it took me a little longer to realize it: ChatGPT will never disagree with or contradict the user; instead, it incorporates negative feedback from the user into a revised answer. A good example is a recent “conversation” I had with ChatGPT about multiculturalism as a brainstorming session for a book chapter I was working on. In its initial response to my question, it said something that I thought was a bit off. I pushed back, and ChatGPT produced a revised response that incorporated my negative feedback in a positive way. As the conversation went on, I noticed that ChatGPT never pushed back against anything I said, but always conceded my points and incorporated them into its own responses. We see something like this happening with your question: rather than pushing back against the idea of breaking down 로다 into 로 and 다, ChatGPT just ran with it and produced a hallucination.

    This is why I do not use ChatGPT for informational purposes, though I will sometimes use it for brainstorming sessions. Having the LLM’s responses to bounce off of can be more conducive to brainstorming than not having that feedback – but it’s still nothing like having a conversation with another human being with their own thoughts and opinions. ChatGPT doesn’t have thoughts or opinions; it just has language. Unfortunately, we’re not used to dealing with entities that have language without thought, so we’re often pretty crap at figuring out what to do with what ChatGPT spits out.

    I feel like I’m starting to babble, and like I said above, I’m feeling a bit foggy today, so I don’t know if any of this is even making sense. My brother Brian knows that if you want to get me ranting, all you have to do is ask me about AI, LLMs, etc., but I will spare you any more ranting today.
