I really don’t mind being a guinea pig. I am getting AI to speed up and improve work, work on my personal life, even help me create new business ventures and I am learning a lot as I do it.
I use ChatGPT, Claude, Gemini, and Perplexity. I have downloaded NotebookLM, which is apparently great for slides, but I have not used it yet and have not used Perplexity in a while. One of my academic friends said it was the best at dealing with academic references.
I have used ChatGPT and Claude for editing business documents (one such document brought in a major partner), getting smart about different corporate structures for an insurance venture (mission-locked foundation, insurance company and captives, and a prevention trust) and jurisdictions (NL, Bermuda, Jersey, …) before we talk to the lawyers, rewriting my bio to make it snazzy, figuring out an exercise/health program, finding the right equipment for cycling, picking which electric car would be best for ShawWife, etc.
On my current plate, my academic co-author and I plan to take an article we wrote 10+ years ago that has not gotten the traction that it should have and substantially revamp/rewrite/modernize it. I tried using ChatGPT and, despite it questioning me on a lot of things, the output has kept coming out thin. I’m now working with Claude on this. I am also using Claude to write an app that helps seniors with Part D Medicare plans decide where to send their prescriptions. It will take a few more hours of my time for an MVP. I am doing this for myself but there might be a market more generally.
And, one of the big consulting firms is talking with me and my partner about embedding our expertise into agents for their clients. I expect to learn a lot from this. Another firm we use has built an agent that we are using but I didn’t see it get built so I did not learn as much as I could have.
A couple of my more tech-oriented friends have built agents to do parts of their work. I would like to try this as well. Just keep on learning and guinea-pigging.
My one and only time trying ChatGPT was when I was in NYC in Sept. I put in the address where I was and asked Chat to tell me the easiest way to get to the 9/11 Memorial (pools). It told me it was very easy: step outside, go to the nearest bus stop, and take any bus that stopped, as it would take me there. Except I was on 6th Avenue, which is a one-way street going north only, and the Memorial was south of me. (I actually knew where I was going, but had been “challenged” to try it…just once!)
So, my thoughts are that it’s not a good tool. Then there’s the ecological factor…don’t get me started.
Expecting “accuracy” from AI is a misunderstanding of how a general-purpose LLM like ChatGPT works. As I posted upthread:
So, for directions, use GPS; for math, use a calculator; for a medical diagnosis, rely on your doctor, etc. Think of a general-purpose LLM as a conversation that can veer off in any direction and tease you to question the veracity of its output.
Google Maps is superb for NYC directions. ShawWife, who is a good negative indicator on directions (if she says turn right, there is a 60% to 70% chance you should be turning left), navigated all over NY this week with Google Maps.
Half of answers to evidence-based questions “somewhat” or “highly” problematic
Public education and oversight needed to avoid amplifying misinformation, urge researchers.
Prompt type was influential: open-ended prompts, for example, produced 40 highly problematic responses (significantly more than expected) and 51 non-problematic responses (significantly fewer than expected). The opposite was true of closed prompts.
While the quality of responses didn’t differ significantly among the 5 chatbots, Grok generated significantly more highly problematic responses than would be expected (29/50; 58%). Gemini generated the fewest highly problematic responses and the most non-problematic ones.
Part of the issue is that LLM AIs see all answers as being equally valid. Thus a reddit post and an article in a peer-reviewed medical journal are given equal weight when it comes to formulating answers to medical questions.
I sure as heck know which source I’d prefer to use when getting medical advice, and reddit ain’t it.
Yes, I’m aware. I was testing it to see, as those around me were telling me it could guide me. (I’m a huge GPS user, especially in foreign countries; I put my headphones on and let it guide me.)
It’s difficult to generalize across all AIs, but the more popular and more frequently used AIs are trained to weight higher quality information more than lower quality information. The degree of preference and how higher quality information is determined varies by AI model, but I doubt that Reddit posts rank high on quality. As an example, I tried asking ChatGPT a few medical questions:
Prompt 1: What is common cause and treatment for regular insomnia? Cite sources.
Sources cited include 5 peer-reviewed medical journal articles, American Academy of Sleep Medicine, Mayo Clinic, and Sleep Foundation. Reddit or similar was not cited.
Prompt 2: Common treatment and next steps for epidermoid cyst on neck? Cite sources.
Sources cited are 9 peer-reviewed medical journal articles and Mayo Clinic. Reddit or similar was not cited.
I wrote my first AI agent today. I told Gemini, ChatGPT and Claude that I wanted to write an agent to learn how to do it. I asked each to review what they knew about me personally and professionally and to suggest a simple app to write. ChatGPT thought of these mega-tasks that would be hard to do and probably hard to make sensible. Claude went off to study and Gemini gave me a couple of more bite-size suggestions. So I let Gemini guide me and I picked an even simpler task. It suggested Lindy or MindSource as the platform, though Lindy has pivoted to be more of an AI personal assistant. So, I went with MindSource (whose underlying default AI is Claude). It took a little bit longer because Gemini was giving me general categories and not the specific blocks offered by MindSource, but in a couple of hours, I was done. Debugged it and it works. This was something on my list for this year, but it was fun and I will be doing more of this.
I use a professional service AI for legal research. It involves a closed system so it can’t hallucinate cases. It is a good “first step” but that is all right now. It really isn’t much of a deep thinker on the complicated legal issues I deal with. It doesn’t make mistakes exactly but just does not (yet) have the ability to really think through issues. I know AI is improving quickly. I could feel differently in a year. But it is not ‘there’ yet.
I tried Google image search on some of my pictures from the plane last week. It kept insisting one snow-capped mountain was Mt Rainier although I knew for sure it wasn’t. Turned out to be Mt Adams.
I thought I would never use any AI tools but I have been pleasantly surprised at ways I’ve been able to use them. I used Google Lens to identify some pottery and vintage items I found when cleaning out my mother’s home. That was very helpful. However, I’ve been taking a class on the Russian Silver Age and found myself interested in the influence of various French painters on the Russian artists’ work. So I’ve used Gemini to analyze Ilya Repin’s work. It has been a fun exploration that has led me down some rabbit holes. There are several very prolific Russian painters and our instructor has had difficulty titling several of their works, so I told her I would run them through Gemini and see if I could find titles. I’m sure I’ll be able to find more uses as time goes on.
My son and fiance are planning a wedding in the Italian lakes region next year, so I started thinking about touring up there. Our Med cruise will not be going up to the lakes region; we are extending in Tuscany (this cruise was one or the other, not both). I asked Claude to put together a week’s worth of touristy stuff in the region using trains and other public transit to get around, with recommendations for nice hotels and interesting places to see and eat based on our likes/desires. It did a fantastic job, including an option to fly into Zurich and do Switzerland from there, train down to Lucerne and Zermatt and finally into Milan before scooting over to Como or Garda for the wedding. It also gave me restaurants, deadlines for ticket purchases, etc., and hotels (hi, med, low) near the recommended tourist sites. Really quite detailed. I’m sure you can do this with any of them but I used Claude and liked it!
Thank you for the input, shawbridge. Right now I’m just getting comfortable with AI for a broad array of uses but will keep Claude in mind. I doubt I’m researching art at the same level.
She’s using it for research on museums but more for writing her artist statements and proposals. But recently she used it to get ideas for an installation in a European venue which has frescoes on the wall and a specific theme. She asked for ideas of how to do the installation in a way that matched her work, the theme, and the setting. She got some interesting ideas. None that work directly, but I think it stimulated her thinking.
This morning my husband called Starlink about setting up service on a transferred device. The AI customer service was the best either of us had ever experienced.
This is a really interesting study about AI and the legal profession. While its definition of “hallucinations” is a bit off, it still does a good job of explaining why AI is not there yet for litigation attorneys.