Thursday, February 26, 2026

Testing OttnoAI: Early AI Health Tools, Hallucinations, and the Future I Wish Garmin Would Build

I’ve been spending the last few days testing OttnoAI, a new health‑analytics tool built by a solo founder who promises something rare in the AI space: no training on user data, no retention, and no personalization.

As someone who lives inside Garmin Connect, sleep metrics, and recovery data — and who cares deeply about privacy — that promise alone made me curious enough to dive in.

And honestly, it’s exciting to see a tool this early in its lifecycle already doing so much right. But like any early AI product, especially one trying to interpret messy real‑world data, it also shows the classic signs of an LLM that needs stronger grounding and clearer guardrails.

This post isn’t a teardown. It’s a snapshot of what’s possible, what’s rough around the edges, and what I hope the future of AI‑powered health analytics will look like.

I’m not coming to this as someone who’s new to AI. I completed Northeastern’s AI Applications graduate certificate in 2025, where we spent a lot of time on the limits and ethical use of LLMs as tools to support human tasks. I’m now in an MSIS program that continues to push on those same questions — how to use AI responsibly, how to keep humans in the loop, and how to design systems that don’t overreach.

So when I test an early AI product, I’m looking at it through both lenses: the everyday user who wants insight into their health data, and the practitioner who understands how easily models can hallucinate, drift outside their domain, or misinterpret context without strong guardrails.

What OttnoAI Already Does Well

OttnoAI reads wearable data with surprising nuance. It can interpret Garmin data across steps, heart rate, sleep, stress, and activity types. It spotted patterns in my cycling intensity, sleep variability, and body battery trends that were genuinely useful.

It also tries to connect dots across domains. It doesn’t just say “your steps were low.” Once I told it, “I’ve been having pain in my heel,” it could then say:

  • your heel pain is affecting your step count

  • which affects your calorie burn

  • which affects your weight‑loss goals

  • which affects your sleep and recovery

That kind of multi‑factor reasoning is exactly what people want from AI health tools. But it’s important to note: I had to tell it about the heel pain. Garmin doesn’t know when I’m in pain, and OttnoAI can’t infer that without me explicitly stating it.

It’s also conversational and adaptive. When I corrected it (“I swim and cycle — step counts don’t tell the whole story”), it adjusted quickly and re‑anchored its analysis.

And the privacy stance is refreshing: no training on user data, no retention, no personalization beyond the session. In a world where most AI tools quietly hoover up everything, this is a breath of fresh air.

Where OttnoAI Shows Its Early‑Stage Edges

Some of the issues I ran into are classic LLM behavior.

Hallucinated timelines and invented domains. At one point, OttnoAI created a “coursework intensity timeline” for me. There is no coursework in my Garmin data. It invented an entire domain because I casually mentioned I’m in an MSIS program and have midterm assignments coming due. That’s domain drift.

Over‑interpreting casual statements. When I said, “We have midterms due March 1,” it decided that meant March 1 was an exam day, and therefore a rest day, and blocked it out on my activity timeline. This is the model treating context as structured data.

Confidently incorrect statements about missing data. It told me it didn’t have my 30‑day history… until I uploaded the CSV… at which point it said, “Oh yes, I do.” This is the LLM equivalent of patting its pockets and saying, “I swear I had my keys.”

Reaching into medical interpretation. It occasionally drifted into diagnosing heel pain, predicting recovery timelines, prescribing caloric deficits, or interpreting autoimmune interactions. This is where early AI tools need the strongest guardrails. Users trust confident language even when the model is guessing.

UI sluggishness and freezing. Typing lag and occasional lockups suggest the frontend is blocking on large model responses. Not unusual for early products, but noticeable.

Why This Matters for Garmin Users

I’ve been wondering for years when Garmin Connect would integrate a safe, privacy‑respecting AI layer to help users make sense of their data.

Garmin collects every step, every heartbeat, every sleep stage, every stress spike, every workout, and every recovery metric. But the moment you try to export your full history, you realize something important: your entire Garmin dataset is big enough to choke most LLMs.
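One hedged way around that context‑window problem is to aggregate the raw export into daily summaries before handing anything to a model. A minimal Python sketch, assuming a hypothetical sample format with `date`, `heart_rate`, and `steps` fields (not Garmin's actual export schema):

```python
from collections import defaultdict

def daily_summary(rows):
    """Collapse per-minute wearable samples into one line per day.

    `rows` is an iterable of dicts with hypothetical keys:
    'date' (YYYY-MM-DD), 'heart_rate' (bpm), 'steps' (per sample).
    """
    days = defaultdict(lambda: {"hr_sum": 0, "hr_n": 0, "steps": 0})
    for r in rows:
        d = days[r["date"]]
        d["hr_sum"] += int(r["heart_rate"])
        d["hr_n"] += 1
        d["steps"] += int(r["steps"])
    # One compact line per day: months of history now fit in a prompt.
    return [
        f"{day}: avg_hr={v['hr_sum'] // v['hr_n']}, steps={v['steps']}"
        for day, v in sorted(days.items())
    ]

samples = [
    {"date": "2026-02-01", "heart_rate": "62", "steps": "40"},
    {"date": "2026-02-01", "heart_rate": "78", "steps": "120"},
    {"date": "2026-02-02", "heart_rate": "70", "steps": "0"},
]
print(daily_summary(samples))
# -> ['2026-02-01: avg_hr=70, steps=160', '2026-02-02: avg_hr=70, steps=0']
```

Whether OttnoAI does something like this internally I can't say; but some form of pre‑aggregation is the obvious move when the raw data won't fit.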

OttnoAI is the first tool I’ve used that even attempts to interpret multi‑month or multi‑year Garmin data in a conversational way.

And that’s why the hallucinations matter — not because they’re embarrassing, but because they highlight the complexity of the problem Garmin itself has not yet solved: grounding AI in real sensor data, avoiding overreach, respecting privacy, staying within the domain, and giving users insight without inventing stories.

OttnoAI is trying to do something Garmin hasn’t done yet. And that alone makes it worth paying attention to.

What Garmin Still Can’t See

One important nuance: Garmin has no idea when my heel hurts. Pain isn’t a sensor. It’s not in the data. It’s something I would have to manually log.

Garmin can see that my steps dropped, that my cycling and swimming increased, that my HRV dipped, that my stress spiked, that my sleep fragmented, and that my body battery tanked — but it can’t connect those dots to the reason unless I tell it.

OttnoAI tried to infer the cause from the pattern, which is impressive for an early tool, but also where the hallucinations and overreach showed up. It guessed at mechanisms it couldn’t possibly know.

This is exactly why Garmin needs an AI layer — not to diagnose or prescribe, but to help users interpret patterns and log the missing context that makes the data meaningful.

The Small, Actionable Insights That Actually Helped

Even with its rough edges, OttnoAI surfaced a few simple, grounded suggestions that Garmin could easily make if it had an AI layer:

  • A five‑minute box breathing session after driving. Driving reliably spikes my stress. Garmin sees that pattern but never comments on it, because it doesn't know I'm driving (though it could probably infer that from speed data).

  • Backing off nightly melatonin. Not medical advice, just pattern recognition: melatonin wasn't improving my deep sleep, and the tool linked to studies on what melatonin is actually supposed to do.

  • Using my heated mattress pad to warm the bed, then turning it off when I get in. Garmin tracks sleep temperature deviations but doesn't interpret them. OttnoAI linked to studies on temperature and sleep and looked at the fluctuations in my body temperature overnight.

  • Trying brown noise instead of white noise. Garmin knows when my sleep is disrupted, but it doesn't identify patterns, ask for additional details, or suggest alternatives.

These are small nudges, not medical directives, and exactly the kind of thing Garmin could safely offer if it built a grounded, domain‑specific AI layer.
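The driving point above hints at something Garmin could compute today. A hedged sketch, assuming hypothetical per‑minute speed samples: sustained speeds well above running pace are a reasonable proxy for being in a vehicle. The thresholds here are illustrative guesses, not anything Garmin exposes.

```python
def driving_segments(speeds_kmh, threshold=30, min_len=5):
    """Flag runs of consecutive samples at or above `threshold` km/h
    lasting at least `min_len` samples -- a crude proxy for driving,
    since nobody sustains 30+ km/h on foot. Returns (start, end)
    index pairs, end exclusive.
    """
    segments, start = [], None
    for i, s in enumerate(speeds_kmh):
        if s >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                segments.append((start, i))
            start = None
    if start is not None and len(speeds_kmh) - start >= min_len:
        segments.append((start, len(speeds_kmh)))
    return segments

# Ten per-minute samples: a short drive sandwiched by walking.
speeds = [4, 5, 45, 50, 48, 52, 47, 44, 5, 3]
print(driving_segments(speeds))  # -> [(2, 8)]
```

Pair a detected segment with a stress spike in the same window and you have the "breathing session after driving" nudge, grounded entirely in data the watch already has.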

How OttnoAI Describes Itself

OttnoAI’s About page makes a few things clear: it’s built by a solo founder intentionally avoiding the “big tech AI” model; it promises no training on user data, no retention, and no personalization beyond the session; and it positions itself as a privacy‑first, human‑centered tool meant to help people understand their own data, not diagnose anything.

That framing matters. It sets expectations: OttnoAI is intentionally lightweight, intentionally private, and intentionally not a medical device. But that same privacy‑first approach also means the model needs stronger guardrails to stay grounded in the data it does have.

The hallucinations and over‑reach aren’t flaws. They’re the natural tension between privacy (don’t store anything), utility (give meaningful insights), and LLM limitations (don’t hallucinate). OttnoAI is trying to thread a needle that even the big players haven’t solved yet.

Why I’m Still Rooting for This Tool

This is exactly what early‑stage AI should look like: ambitious, imperfect, transparent about its goals, willing to ship early and learn, and built by someone who cares about user privacy.

The hallucinations aren’t a failure — they’re a roadmap. The over‑reach isn’t a flaw — it’s a signal of where guardrails need to be added. And the value is already visible: OttnoAI helped me see patterns in my sleep, stress, and activity that I hadn’t connected on my own.

That’s the promise of AI health tools — not replacing clinicians, but helping humans better understand their health data and make meaningful changes in their lives.

Final Thought

Testing early AI products is like watching a musician rehearse: you see the raw talent, the rough edges, and the potential all at once. OttnoAI is already useful. With a bit more transparency and tighter grounding to the actual data, it could become something genuinely powerful — maybe even the AI layer Garmin users have been waiting for.
