Why I Put Three AI Planners on the Same Route
A few weeks ago I published a test of six AI planners on a real family trip to Milan and Lake Como. Layla didn’t make that lineup. Wonderplan did, but it wasn’t ready to go head-to-head. Mindtrip came out strong.
After that piece, one question stuck with me that I never actually answered: what happens if I take one specific prompt and throw it at three prominent free AI trip planners at the same time? Not separate reviews, but Layla vs Mindtrip vs Wonderplan in a single test: same route, same budget, same family of three.
This article is that test.
For this Layla vs Mindtrip comparison, I tested Layla, Mindtrip, and Wonderplan on the same Milan–Lake Como prompt instead of comparing their marketing claims.

There’s no copy-paste from the developers’ marketing pages here. Every comparison is based on my own prompts, my own screenshots, and my own experience from a real Milan trip in March 2026 — the same trip I’m now grading their recommendations against.
One tool didn’t generate a single plan in five minutes. The second generated a plan that’s physically impossible to follow. The third generated the best of the three — and then refused to quote a hotel price it was simultaneously showing in its own card at $127 a night.
Let’s go through it step by step.
The useful Layla vs Mindtrip question is not just “which tool wins?” but “which tool fails in a way you can catch before you book?”
The Test: One Prompt, Three Tools, Same Route
The prompt I sent to each of the three tools, word for word:
Plan a 5-day family trip to Milan and Lake Como for two adults
and our 8-year-old daughter, March 15–19, 2026.
We're flying into Milan and want:
- 3 days in Milan, 1 day trip to Lake Como (Bellagio area),
and 1 flex day
- Mix of art, history, and family-friendly attractions
- Authentic Italian food, including pasta and gelato spots
- Hotel recommendations with reasoning on location
- Daily schedule with rough timing
- Restaurant picks for lunch and dinner
- Transport between cities and to Lake Como
- Anything we should book in advance to avoid disappointment
Budget: mid-range, around €200/night for hotel.
We arrive at Malpensa airport on March 15 morning.
This is a fairly typical request from a parent traveling with a kid: clear dates, a specific airport, a nightly budget, direct requirements, a specific day trip. No hints, no named venues. Clean input.
Why Milan + Como specifically. I have ground truth on this route from a real family trip this past March. I know where the actual restaurant reservations are, which Bellagio ferry you genuinely need to book ahead (the one none of the AI tools flagged in my previous test), and which part of Milan is actually better to stay in with a child.
So when an AI tool recommends something, I have something real to check it against.
The test workflow:
- Round 1 — send the same prompt to each of the three. Wait for the initial plan.
- Round 2 — answer their clarifying questions and/or ask them to fix obvious errors. Watch how they hold context.
- Round 3 — final test: give them a concrete constraint (an early-morning departure) and see who rebuilds the schedule correctly.
I ran the test on May 18, 2026; it took about two hours. Screenshots throughout. One tool was done before I could even get past the sign-in screen.
This Layla vs Mindtrip test only works because the route is real: I could compare each AI answer against the Milan and Lake Como trip we actually took.
Wonderplan: Four Failure Modes in Five Minutes
Let me start with the rough one. Wonderplan didn’t work.
Not in the sense that it produced a bad plan. In the sense that I physically could not get a single plan out of it on this prompt. Four separate failures in one session, each at a different layer of the product.
Failure 1: AI is crafting…
I pasted the prompt into the form on Wonderplan’s homepage and hit “Generate with AI.” The button turned gray with the label “AI is crafting your trip…” and a spinner. I waited a minute. Then two. Then five. The spinner spins; nothing happens.

This is a failure mode at the core-product level. Prompt accepted, button clicked, “working” state shown — and no result.
Failure 2: Destination autocomplete returns “No options” for real cities
I switched to filling out the form manually. In the destination field, I typed “Mil” to pick Milan. “No options” in the dropdown. Not a single result.

This isn’t “we don’t have your city.” This is no cities at all starting with “Mil.” The destination database either didn’t load, is empty, or is broken. For an AI travel planner, that’s fatal.
Failure 3: The manual form accepts contradictory data with a false-positive success
Since the destination autocomplete found no city, I entered the test value “Komo” there — just to pass field validation and move forward. In the date field I deliberately put in a completely different date — 25-05-2026 — instead of the March 15-19, 2026 from my own prompt. This was a check: would Wonderplan reconcile the two data sources (the prompt text vs. the form values), or just accept the contradiction silently?

Wonderplan didn’t notice the contradiction. It accepted both versions. And it showed a green toast: “Trip details filled! Review and adjust below.” With a checkmark emoji. The product reports success on data that clearly conflicts with the prompt in the same session.
This is a failure at the data-validation level. An AI travel planner shouldn’t silently accept “Komo” in one field and “Milan + Lake Como” in another, or March dates in the prompt and May dates in the form. It should either refuse, ask again, or surface the explicit conflict. Wonderplan showed a checkmark.
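To make the "refuse, ask again, or surface the conflict" idea concrete, here is a minimal sketch of the kind of check I would expect before any green checkmark. It is my own illustration, not Wonderplan's code: the `TripInput` type and the `find_conflicts` function are hypothetical names, and the form date is entered as a single day because that is how I typed it.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TripInput:
    """Hypothetical container for one source of trip data (prompt or form)."""
    destinations: frozenset[str]
    start: date
    end: date


def find_conflicts(prompt: TripInput, form: TripInput) -> list[str]:
    """Return human-readable conflicts between the prompt and the form."""
    conflicts = []
    # A form destination the prompt never mentions is suspicious.
    unknown = form.destinations - prompt.destinations
    if unknown:
        conflicts.append(f"destination not in prompt: {sorted(unknown)}")
    # Date ranges that do not overlap cannot describe the same trip.
    if form.start > prompt.end or form.end < prompt.start:
        conflicts.append(
            f"dates do not overlap: prompt {prompt.start}..{prompt.end}, "
            f"form {form.start}..{form.end}"
        )
    return conflicts


# Values from my test session: the prompt text vs. what I typed into the form.
prompt_input = TripInput(
    frozenset({"Milan", "Lake Como"}), date(2026, 3, 15), date(2026, 3, 19)
)
form_input = TripInput(frozenset({"Komo"}), date(2026, 5, 25), date(2026, 5, 25))

for problem in find_conflicts(prompt_input, form_input):
    print("conflict:", problem)
```

With the values from my session this reports two conflicts, one for the destination and one for the dates. A success toast only makes sense when the list comes back empty.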
Failure 4: A login wall with no login fields
I figured maybe I needed to register to unlock the working features. I clicked “Sign In” in the top-right corner.
A modal opened: “Let’s Start with Wonderplan. To access all the functionalities and services, please log in to your account first.” And that’s it. The modal has no email field, no Google sign-in button, not even a close option that lets you continue. Just the text “log in to your account” with no way to log into an account.

So the product promises functionality behind registration. But there’s no way to physically reach registration through the UI.
Why I’m Still Including Wonderplan in This Article
Fair question: why test a broken product? Why include it in a comparison if it never generated a single plan?
A few reasons:
First — this is one very real failure mode inside the free AI travel tools category in 2026. Wonderplan positions itself as a free AI trip planner. Marketing landing pages, reviews on comparison sites, mentions in “best AI travel planners” roundups. The real state of the product is four failure modes in five minutes. That’s the story.
Second — the “inability disguised as availability” pattern is showing up across many AI products right now. The UI looks finished, buttons click, spinners spin — and under the hood the service doesn’t work. Wonderplan is the canary for a whole class of problems here.
Third — for a reader choosing a tool, it matters to know: this one, in its current state (as of May 2026), doesn’t work. Not marketing — a real attempt to use it.
A full product-level breakdown of Wonderplan is in my standalone Wonderplan review. For this test, it bows out here — every comparison from this point on is just Layla vs Mindtrip.
Round 1: Layla Asks. Mindtrip Plans Right Away.
This is the first and most visible difference between the two tools, and it shows up in the first second after you send the prompt.
Layla didn’t generate any plan. Instead of a plan — a confirmation of my parameters and three clarifying questions:
“Ciao, Alex! A family adventure to Milan and the stunning Lake Como sounds like a dream. Before I build out the full trip card, let’s make sure I’ve got the blueprint right…
1. For the Flex Day, would you like to keep it chill in Milan, or should I suggest another nearby spot like Bergamo or Monza?
2. Since you’re arriving at Malpensa, would you like me to include the Malpensa Express train or a private transfer in the plan?
3. For the hotel, do you prefer being right in the City Center (near the Duomo) or somewhere a bit trendier like Brera or Navigli?”
The tone is unmistakable. Layla addresses me by name — Alex. It adds jokes about the “8-year-old explorer” and “gelato (and let’s be honest, so will you).” This is an AI with a written-in personality, and it operates in consultant mode.
Mindtrip didn’t ask a single question. It just delivered a full five-day plan — with time blocks, restaurants, specific attractions, transitions between days.

Already on Day 1 in Mindtrip’s plan:
Day 1 – ✈️ Arrive + First Taste of Milan (Duomo area)
☀️ Morning (09:00–12:30): Land at Milan Malpensa Airport →
Malpensa Express to Milan (≈50–55 min) → hotel drop
🌤️ Afternoon (13:00–17:30): Check-in + quick refresh →
stroll Duomo di Milano → walk through Galleria Vittorio
Emanuele II (great with kids for the "spin on the bull"
tradition)
🌙 Evening (18:30–21:00): Dinner at Pizza AM (lively, very
family-friendly) → gelato at Terra Gelato → early night
And so on for all five days, with hourly time blocks, specific restaurant names, and kid context everywhere: the “spin on the bull” tradition for the 8-year-old, Sforzesco Castle tagged “pick 1–2 sections so it stays kid-friendly,” Sempione Park to “decompress” after the castle.
Two tools, two fundamentally different interaction models.
The first Layla vs Mindtrip split is simple: Layla behaves like a consultant, while Mindtrip behaves like a draft generator.
What This Means for You
Layla works like a travel agent with a consultation: “let’s clarify, then I’ll build it.” Mindtrip works like a draft-first planner: “here’s a draft, let’s adjust as we go.”
This isn’t a question of “who’s better.” It’s a question of “what do you need right now.”
If you want to see a concrete plan in 10 seconds — go with Mindtrip. If it matters to you that the AI clarifies details before it draws the schedule — go with Layla. They’re different products, and they complement each other more than they compete on an identical metric.
Mindtrip’s Early Advantages
Mindtrip delivered several things in its first response that Layla only gave in Round 2, or never gave at all:
- Malpensa Express for the transfer, by name specifically, not “private transfer” or “shuttle.” This matters, because Layla’s first full plan suggested the Linate Airport Shuttle (Linate is a different airport on the other side of Milan, ~60 km away). Mindtrip got the airport right in its first response; Layla got it wrong in its first plan.
- The Last Supper baked into Day 2 as a “Pre-booked visit” — not a mention at the end, not a “sell out weeks ahead” warning, but already in the schedule, on a specific Day 2 morning. Layla mentioned advance booking in a Pro-Tip but didn’t put it in the schedule.
- Restaurant alternatives: “Trattoria San Giacomo or Dispensa 63 (small—go early)” — it gives a second option with a note on the venue’s size. That’s operationally useful.
- A Flex Day that’s actually flexible — two option tracks: history (Basilica di Sant’Ambrogio) or design/creative (Triennale di Milano). Not “either this or that in one direction” — two different days in one slot.

At the end of its response, Mindtrip asked two questions of its own — but after planning, not before. “Do you want one base hotel for all 5 nights or several?” and “Centrale (logistics) or Duomo (walkability)?” It had already planned everything — and saved the questions for “fine-tuning before booking.”

Layla’s Early Strength: The Pro-Tip
After I answered Layla’s three questions and it generated the full plan, a section appeared that none of the other tools in my previous tests had produced:

Pro-Tip for Alex:
The highest-impact improvement we can make is booking your Duomo Rooftop and Last Supper tickets immediately. These sell out weeks (sometimes months) in advance, and showing up without them is a recipe for a disappointed 8-year-old and a very long line.
This is risk-first thinking: “what could ruin the trip?” instead of “where should we go?” In my previous test of six tools on Milan/Como — not one gave a warning like this explicitly. Layla gives it on its own initiative. That’s its biggest structural plus.
Mindtrip later added the same warning in its “Book in advance” section — but there it’s one of three bullet points, without the emotional hook of a “disappointed 8-year-old.” Layla lands it better.
Round 2: Where Both Broke — and Who Owned It
After the initial plans, things get interesting. I sent each one a standard refinement request with corrections. And this is exactly where both models showed their problems.
One important caveat before the Layla vs Mindtrip comparison in this round: the departure-airport constraints were not identical. Layla had surfaced a Bergamo departure inside its own booking layer, while Mindtrip was asked to rebuild around a Malpensa departure. So this round is less about identical flight data and more about whether each tool could keep its own itinerary internally consistent.
Layla Round 2: “I had too much espresso”
When Layla finally generated the full plan (after my answers to the three questions), it looked solid: a card with a map, photos of places, weather by day, an experience count for each day.

But on a closer look, I spotted two serious problems.
Problem one. At the top of the interface, the trip card read: “Bucharest → Milan Mar 15-19 → Lake Como Mar 19-19 → Bucharest.” So Lake Como is placed on Mar 19, the same day we fly home. With a 7:35 AM flight out of Bergamo.

Meanwhile the Day 5 card has four experiences in Bellagio, including Villa Melzi Gardens. So Layla scheduled the physically impossible: a 7:35 AM flight from Bergamo + a day in Bellagio + four more experiences. Part of that happens before dawn, part of it runs concurrently with your own flight.
Problem two. In the “Transfer” block on Day 1 — a recommendation for the “LINATE Airport Shuttle to MILAN Central Station” at €5. I’m landing at Malpensa, not Linate. That’s a different airport.

And here’s the kicker: inside the detail view of that same Day 1, Layla has a text note — “Arrive at Malpensa Airport and take the Malpensa Express train to Milano Centrale.” So Layla simultaneously shows two contradictory transport recommendations in different blocks of the same plan.
And one more bug — I didn’t even bother putting it in the refinement. Layla recommends in the chat “train from Milano Centrale to Varenna, then a ferry to Bellagio” for the lake day. But when I clicked on that transport to see details and book it — a blank page opened: “We can’t book trains in Layla yet.”

So the tool recommends a mode of transport it can’t book. Not critical for planning, but it breaks the “from inspiration to booking in one place” promise.
I sent the refinement request:
Thanks. Two issues with this plan:
1. You have Lake Como on Day 5 (Mar 19) — but that's our
departure day with a 7:35 AM flight from Bergamo. We
can't do Como on the same day. Can you move Como to
earlier in the trip and rebuild Day 5 as a relaxed
departure day?
2. Our arrival is at Malpensa (MXP), but you suggested
a Linate Airport shuttle — that's the wrong airport.
What's the right transfer from MXP to our hotel?
Personality-wise, Layla’s response was the best thing I saw from any AI in this test:
“My apologies, Alex! I clearly had a bit too much espresso and mixed up your airports and dates. Let’s fix that blueprint immediately.”
It apologized openly, it acknowledged the specific error, and it did both with a joke that lands inside its established tone.
And then — it moved Lake Como to Day 3, the Flex Day to Day 4, Departure to Day 5. It switched transport to the Malpensa Express to Milano Centrale/Cadorna. It suggested a different hotel — c-hotels Atlantic — with explicit reasoning: “right by Centrale station — perfect for that early morning dash to Bergamo and your arrival from Malpensa.”

This isn’t just fixing what I asked for. This is reasoning: early flight from Bergamo + arrival at MXP + 4 nights in one hub hotel = pick a hotel right by Centrale station to minimize transit with luggage and a kid. Layla connected three factors into one decision. The smartest thing I saw from any AI planner in the whole test.
Layla Round 2: What Got Fixed, and What Didn’t Land
I checked Version 2 in full. Some of the fixes actually went through. Some got stuck in the top layer and never propagated down.
What got fixed:
- The day breakdown list in the card: Day 3 = Como ✓, Day 5 = Departure ✓
- Hotel switched to c-hotels Atlantic ✓
- Title renamed to “4-Day Milan and Lake Como Family Trip” ✓
- Real Ryanair €135 OTP→MXP + Wizz €135 BGY→BBU flights instead of the 2027 junk
What didn’t get fixed:
1. The route bubble in the card header still shows “Lake Como Mar 19-19.” The same artifact shows two different plans in two different places of one interface.
2. The full map draws a direct flight from Bucharest to Lake Como. The plane icon lands right on Lake Como — which, obviously, has no airport.

3. Day 5 has TWO plans at once. On top, a new block “Departure Day and Travel Home” with one experience and a Malpensa Express note. Below — the old block “Scenic Day Trip to Bellagio and Lake Como” with 5 experiences is still there, including Villa Melzi and Punta Spartivento.

So Layla copied Como to Day 3 but didn’t delete it from Day 5. An incomplete-delete bug. If a reader scrolls through the Day 5 detail view, they’ll see: check out of hotel → train → 5 Como activities. Impossible to pull off.
4. The Day 5 advice — Malpensa Express to the airport. But the flight is from Bergamo, not Malpensa. Layla knows (it shows it correctly in the flight block) that the departure is BGY → BBU. And it still writes Malpensa Express. The internal knowledge doesn’t transfer between blocks of the same plan.
5. Day 2 chat-card disagreement. In the chat, Layla writes “Day 2: Sforza Castle & Leonardo Museum.” In the Day 2 card — a different set: Basilica di Santa Maria delle Grazie + Church of San Maurizio + Pasta Class + miscusi restaurant.

This is the single most telling image in the whole Layla section. On one screen: the chat says one thing, the card shows another. Same plan, same day, two different sets of attractions.
Layla’s Architectural Problem
One structural thing emerges from all of this. Layla is built as a two-layer system: a chat narrative on top and a card structure underneath. Refinement commands partly punch through the chat layer, but don’t fully propagate into the card layer. The result: you see one plan in the chat and another in the card.
When a tool gives you two contradictory plans inside one product — you don’t know which one to actually follow.
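The check I ended up doing by eye can be sketched in a few lines. This is a hedged illustration of comparing the two layers, not a description of how Layla works internally: `diff_layers` is a hypothetical helper, and the data is the Day 2 content I saw in the chat and in the card.

```python
def diff_layers(
    chat: dict[int, set[str]], card: dict[int, set[str]]
) -> dict[int, dict[str, set[str]]]:
    """Report, per day, the items that appear in only one of the two layers."""
    report = {}
    for day in sorted(chat.keys() | card.keys()):
        chat_items = chat.get(day, set())
        card_items = card.get(day, set())
        if chat_items != card_items:
            report[day] = {
                "only_in_chat": chat_items - card_items,
                "only_in_card": card_items - chat_items,
            }
    return report


# Day 2 as shown in Layla's chat vs. Day 2 as shown in the trip card.
chat_plan = {2: {"Sforza Castle", "Leonardo Museum"}}
card_plan = {
    2: {
        "Basilica di Santa Maria delle Grazie",
        "Church of San Maurizio",
        "Pasta Class",
        "miscusi",
    }
}

for day, mismatch in diff_layers(chat_plan, card_plan).items():
    print(f"Day {day}: chat only {sorted(mismatch['only_in_chat'])}")
    print(f"Day {day}: card only {sorted(mismatch['only_in_card'])}")
```

An empty report means the chat and the card agree. Anything else is a day where you have to decide which layer to believe.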
Mindtrip Round 2: A Surgical Fix
I answered Mindtrip’s two questions and added one constraint:
Answers to your two questions:
1. Yes, one single base hotel for all 5 nights in Milan with
Bellagio as a day trip works for us.
2. Closer to Milano Centrale — the easier logistics for the
Como train day trip matter more to us than walkability to
Duomo. Our daughter tires easily, so being close to the
station means less walking with luggage and faster start
on Day 4.
One more thing I forgot to mention: our flight home on March 19
is at 9:00 AM from Malpensa. Can you adjust Day 5 so we have a
relaxed early departure rather than the full Flex Day you
described? We'll need to be at the airport by 7 AM.
Mindtrip rebuilt Day 5 without any drama. It confirmed the constraint directly: “Updated Day 5: relaxed early departure (no full flex day).” It also showed compensatory thinking by adding a note about the Day 4 evening: “have a very close-to-hotel dinner and pack right after — so the morning is just wake → go.” That’s not a template output. That’s reasoning: early flight → move the farewell dinner to Day 4 → minimal stress on Day 5 morning.
But in the Day 5 schedule itself, Mindtrip made an arithmetic error. It wrote out these intervals:
05:15–05:30 Wake-up + quick snack
05:45–06:00 Walk/taxi to Milano Centrale
~05:25–06:30 Take Malpensa Express
07:00 Arrive at Milan Malpensa Airport
The “~05:25–06:30” train departs before you arrive at the station (05:45–06:00). That’s an impossible temporal order. The real arithmetic doesn’t work either: the Malpensa Express takes ≈50 minutes, so a 05:25 train reaches MXP around 06:15, not 07:00. Mindtrip wrote a range that doesn’t work.
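Both problems are mechanical enough to check in code. The sketch below is my own illustration, using the four steps exactly as Mindtrip wrote them and the ≈50-minute ride as the only assumption; `find_order_errors` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def parse(hhmm: str) -> datetime:
    """Parse an HH:MM string (the date part is irrelevant here)."""
    return datetime.strptime(hhmm, "%H:%M")


# Day 5 steps as written by Mindtrip in Round 2: (label, start, end).
steps = [
    ("Wake-up + quick snack", "05:15", "05:30"),
    ("Walk/taxi to Milano Centrale", "05:45", "06:00"),
    ("Take Malpensa Express", "05:25", "06:30"),
    ("Arrive at Milan Malpensa Airport", "07:00", "07:00"),
]


def find_order_errors(schedule: list[tuple[str, str, str]]) -> list[str]:
    """Flag any step that starts before the previous step has ended."""
    errors = []
    for (prev_name, _, prev_end), (name, start, _) in zip(schedule, schedule[1:]):
        if parse(start) < parse(prev_end):
            errors.append(
                f"'{name}' starts at {start}, before '{prev_name}' ends at {prev_end}"
            )
    return errors


for error in find_order_errors(steps):
    print(error)

# Assumption: the ride takes roughly 50 minutes, as stated above.
arrival = (parse("05:25") + timedelta(minutes=50)).strftime("%H:%M")
print("05:25 departure arrives around", arrival)
```

The order check flags exactly one step, the train, and the arrival estimate comes out at 06:15 rather than 07:00.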
And on top of that — the hotel block sold “Mar 15 – Mar 20 · 5 nights,” even though the plan has check-out the morning of Mar 19 (departure day). An extra night that isn’t needed.
So Mindtrip carefully rebuilt the structure of the day, but it got the details of the time arithmetic wrong. Not in the “hallucinated a venue” category, but in the “math doesn’t add up” category.
The key difference from Layla: Mindtrip doesn’t show contradictory plans. It shows one plan that’s partly wrong. That’s a different class of error.
That distinction matters for the Layla vs Mindtrip decision: a single wrong schedule is easier to fix than two conflicting schedules inside one product.
And one more thing Layla didn’t do: at the end of its response, Mindtrip explicitly asked “Which year are you traveling—2026 (as written) or 2027—so I can lock in the correct hotel pricing.” So Mindtrip detected its own uncertainty about the year and asked the user to confirm — instead of silently plugging in a random year like Layla.
I’ll come back to the “2027 problem” separately — but the fact stands: Mindtrip asked, Layla assumed.
Round 3: Mindtrip Refuses to Quote a Price. Then Shows It in the Card.
Layla wrapped up after Round 2 — I already had enough data. Mindtrip got one more round, because two problems remained from Round 2 that needed closing: the arithmetic error in Day 5 and the lack of a specific hotel.
The Round 3 prompt:
2026 — as written in the original prompt, March 15-19, 2026.
Two follow-ups:
1. Please confirm the specific hotel you're recommending near
Milano Centrale — name, nightly rate, and what makes it
the right fit for our family.
2. The Day 5 schedule has a timing issue — you wrote "Take
Malpensa Express ~05:25–06:30" but the walk to Milano
Centrale is at 05:45–06:00. Can you recheck the train
timing so we know which exact train to catch for a 07:00
arrival at Malpensa?
And here’s where it got most interesting.
Mindtrip Told Me My Trip Had Already Happened
At the moment I was writing this prompt — May 19, 2026 — the dates “March 15-19, 2026” were technically in the past. That was intentional: this was a controlled test scenario based on a real March 2026 trip, not a live booking request.
Mindtrip responded like this:

“I can confirm the best-fit option near Milano Centrale — but I can’t truthfully confirm a nightly rate for March 15–19, 2026 anymore because those dates have already passed, and live pricing/availability from booking sources won’t reflect 2026.”
This is a fundamental product problem, not just a bug in a specific prompt. A travel planner should be able to handle a controlled test scenario or a retrospective trip reconstruction without contradicting its own UI. Maybe I’m planning a demo scenario. Maybe I’m reconstructing my own past trip for an article. Maybe I’m testing the tool on a sample example. Mindtrip treated the historical date as a reason not to quote a rate in chat — even though its own card still displayed one.
And the section title writes itself: Mindtrip Just Told Me My Trip Already Happened.
The irony: in Round 2, Mindtrip itself asked — “2026 or 2027?” I answered 2026. Now the same tool says “those dates have passed.” So it used the user’s own answer as the reason to back out of the obligation to give prices.
But It Did Name the Hotel
After the refusal, Mindtrip did still deliver a concrete recommendation: Hotel Aurelia Milano Centrale. With reasoning:
📍 Logistics-first location: it’s a straightforward walk to Milano Centrale, which matters most for (a) the Trenord train to Varenna on your Lake Como day and (b) the super-early Malpensa Express on departure morning.
🛏️ Family room practicality: typically offers room setups that work for 3 (e.g., triple or extra bed), which is key at a mid-range budget.
💶 Budget alignment (in principle): it’s commonly positioned as a value-leaning option versus some of the “right across the street from the station” hotels that often run higher.
Great reasoning. Specific reasons, a specific audience, specific booking patterns. No price.
And Then I Clicked the Hotel Link
And here’s the biggest finding of the whole article.

Hotel Aurelia Milano Centrale: 4.6 stars, 3,500+ reviews, room photos, address Via Piccinni 25, phone number, website link. $127 a night. Check-in: Mar 15. Check-out: Mar 19. 2 adults, 1 child. A “Check availability” button.
The same window, the same product, for the exact dates Mindtrip refused to price two screens earlier. $127 a night — right there in the UI.
This screenshot is, I think, the single strongest visual argument in the whole article. Mindtrip spent 27 seconds reasoning (the “Thought for 27s” indicator is visible at the bottom of my screenshots) to decide it couldn’t show the price — and then showed it anyway, two clicks later. The data is clearly in the system. The chat layer just gatekeeps it.
And the $127 itself is problematic on several levels:
1. Currency. The original prompt set the budget in euros — “around €200/night.” Mindtrip answers in dollars. $127 ≈ €115 — but it’s still a mismatch with the currency of the request.
2. Arithmetic. Mar 15 → Mar 19 = 4 nights. But in Round 2, Mindtrip itself sold “Mar 15 – Mar 20 · 5 nights.” So it silently fixed its own error without acknowledging it.
3. The promise context. Mindtrip said “I cannot confirm a nightly rate.” Two clicks later — it confirms a nightly rate. The gap between the chat layer and the data layer hits its maximum.
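The night count and the budget comparison are easy to verify yourself. A minimal sketch, with one loud assumption: the USD→EUR rate is simply the one implied by my own “$127 ≈ €115” approximation above, not a live exchange rate.

```python
from datetime import date


def nights(check_in: date, check_out: date) -> int:
    """Hotel nights are the number of days between check-in and check-out."""
    return (check_out - check_in).days


print(nights(date(2026, 3, 15), date(2026, 3, 19)))  # the stay in the plan
print(nights(date(2026, 3, 15), date(2026, 3, 20)))  # the stay sold in Round 2

# Assumption: rate implied by the article's own $127 ≈ €115 estimate.
USD_TO_EUR = 115 / 127
BUDGET_EUR_PER_NIGHT = 200  # from the original prompt

nightly_eur = round(127 * USD_TO_EUR, 2)
print(nightly_eur, "EUR per night, within budget:", nightly_eur <= BUDGET_EUR_PER_NIGHT)
```

The first call gives 4 nights and the second gives 5, which is the extra night from Round 2. The converted rate sits well under the €200 budget, so the issue with the $127 is the currency mismatch and the chat refusal, not the amount.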
Where Mindtrip Did Beat Layla
Beyond the hotel paradox, Round 3 from Mindtrip delivered two things I never got from Layla across its entire test.
A correct Day 5 schedule with specific trains:

✅ Target train: Malpensa Express departing 05:25 from Milano Centrale
Arrives T1 ~06:16 (T2 ~06:22)
⚠️ Latest “still makes 07:00” train: departs 05:55
Arrives T1 ~06:46 (T2 ~06:52) — very tight buffer
This is real operational planning:
- Gave two train options — safe and last-possible
- Distinguished T1 vs T2 (these are real Malpensa terminals; per Mindtrip’s own timings, the train reaches T2 about 6 minutes after T1)
- Rebuilt the walk timing backward from the train (“05:00–05:05 leave hotel, 05:10 arrive Centrale, 05:25 train”)
- Acknowledged its own error: “What this means for your walk timing (fixing the inconsistency)”
- Asked for the terminal: “If you want, tell me which terminal you fly from (MXP T1 or T2)…”
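The backward planning in that list can be sketched as follows. The train times are the two Mindtrip quoted in Round 3; the 5-minute walk and 15-minute platform buffer are my reading of its “05:05 leave hotel, 05:10 arrive Centrale, 05:25 train” line, so treat them as assumptions. The function names are hypothetical.

```python
from datetime import datetime, timedelta

# (departure from Milano Centrale, arrival at MXP T1), as quoted by Mindtrip.
TRAINS = [("05:25", "06:16"), ("05:55", "06:46")]


def parse(hhmm: str) -> datetime:
    return datetime.strptime(hhmm, "%H:%M")


def trains_making(deadline: str, trains: list[tuple[str, str]]) -> list[str]:
    """Departures whose T1 arrival is at or before the deadline."""
    return [dep for dep, arr in trains if parse(arr) <= parse(deadline)]


def leave_hotel_by(train_departure: str, walk_min: int, buffer_min: int) -> str:
    """Work backward from the train: subtract the walk and a platform buffer."""
    latest = parse(train_departure) - timedelta(minutes=walk_min + buffer_min)
    return latest.strftime("%H:%M")


print(trains_making("07:00", TRAINS))
# Assumed: 5-minute walk to Centrale, 15 minutes of buffer before departure.
print(leave_hotel_by("05:25", walk_min=5, buffer_min=15))
```

Both trains make a 07:00 arrival, and the safe one means leaving the hotel by 05:05, which matches the walk timing Mindtrip gave after fixing its own inconsistency.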
This is what Layla didn’t do. Layla in Round 2 recommended the Malpensa Express for Day 5 — without realizing the flight was from Bergamo. Full airport confusion. Mindtrip, by contrast, gave the minutes, T1/T2, walk-back timing, and explicitly noted that it was fixing an inconsistency.
Which Tool Fixed Its Own Error Better?
Mindtrip beats Layla here, but only in the “arithmetic and operational precision” category. In the “hotel pricing” category, Layla wins (it gives concrete numbers). In the “catches its own error before being asked” category, Mindtrip wins (it explicitly asked about 2027 in Round 2).
This isn’t one tool being better than the other. It’s two tools with different strengths.
The 2027 Problem Both Tools Shared
In both tools I ran into the same bug: 2027 artifacts surfaced inside a 2026 test.
With Layla: the OTP → MXP flight card has a warning: “This flight is not suitable as it is scheduled for 2027, whereas your trip is planned for 2026.” So the booking integration loaded flights for 2027. Layla self-flags the error — but the €202 flight stays on the card as if it were available.
With Mindtrip: the trip card subtitle, through Round 3, reads “Trip to Milan and Bellagio, March 2027.” Even though Mindtrip explicitly asked “2026 or 2027?” in Round 2, and I confirmed 2026, and Mindtrip processed the answer. But the header tag didn’t update.
Why this might happen: I can’t prove the internal cause from the outside. My best guess is that the booking layer or date-normalization logic tried to shift already-past March dates into a future bookable year. With Layla, this surfaced as the wrong year in the flight data. With Mindtrip, it surfaced as the wrong year in the trip subtitle.
What it means for you: don’t trust the year an AI tool shows on the trip card. Double-check that the year matches what you wrote in your prompt. It’s a five-minute check, but without it you could accidentally get recommendations for the wrong season.
And the key difference in how they handle it: Layla quietly keeps showing the wrong data with a warning off to the side. Mindtrip explicitly halted the plan and asked to confirm the year. Mindtrip was more explicit; Layla was easier to misread.
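If you save the trip card text, the year check takes a few lines. This is a rough sketch of the idea, not a feature of either tool: `unexpected_years` is a hypothetical helper, and it only looks for four-digit years starting with 20.

```python
import re


def years_in(text: str) -> set[int]:
    """Collect four-digit years of the form 20xx mentioned in a text."""
    return {int(year) for year in re.findall(r"\b(20\d{2})\b", text)}


def unexpected_years(prompt_text: str, card_text: str) -> set[int]:
    """Years shown on the trip card that never appeared in the prompt."""
    return years_in(card_text) - years_in(prompt_text)


prompt_dates = "two adults and our 8-year-old daughter, March 15–19, 2026"
mindtrip_subtitle = "Trip to Milan and Bellagio, March 2027"

print(unexpected_years(prompt_dates, mindtrip_subtitle))
```

For the Mindtrip subtitle this returns the stray 2027. An empty set means the card and the prompt agree on the year.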
What Both Did Well
After this many negative findings, it’s important to record that both models also did a number of things well. This isn’t “everything’s broken.” It’s “each is good at its own thing, and each breaks in its own way.”
Last Supper advance-booking warning
Both Layla and Mindtrip explicitly warned about the need to book the Last Supper weeks ahead. In my previous test of six AI planners on Milan + Como — not one of the six gave that warning for the Last Supper. Now both of these do. Category progress in five weeks.
Hotel reasoning, not a template
Both models suggested a hotel with explicit reasoning. Layla justified c-hotels Atlantic: “right by Centrale station — perfect for that early morning dash to Bergamo.” Mindtrip justified Hotel Aurelia: “logistics-first location matters most for the Trenord train and the super-early Malpensa Express.” This isn’t “here’s a hotel” — it’s “here’s a hotel that fits your specific constraints.”
This is genuinely a new category of AI-planner behavior compared to a year or two ago. Mindtrip in early 2025 served up hotels with no reasoning. Now both justify their picks.
Kid-aware micro-decisions
Mindtrip built kid context into almost every attraction:
- The “spin on the bull” tradition in Galleria Vittorio Emanuele
- Sforzesco Castle: “pick 1–2 sections so it stays kid-friendly”
- Sempione Park to “decompress” after the castle
- Triennale di Milano “often more engaging for kids than a classic gallery”
Layla did this too, but less consistently — its attention-span cues show up more in the Pro-Tip (“disappointed 8-year-old”) than in built-in notes on specific days.
Real flight-booking integration
Both tools can move from the plan into an actual booking flow. With Layla — a click on a flight card opens a real list of flights from Ryanair, Wizz, Air Serbia with real prices (€135-235). With Mindtrip — a click on a hotel opens a real hotel card with photos, a rating, and a Check availability button.
This is product operational maturity. Inspiration phase + research phase + booking phase in one UI. A year ago this was exotic; now it’s basic expected functionality.
What Both Missed — Shared Blind Spots
A few things neither Layla nor Mindtrip did, and which can cost you dearly on a real trip.
1. Bellagio ferry advance booking. This is one of the main findings from my real trip in March: Bellagio ferries start selling out weeks ahead at peak times. Layla mentioned the ferry only in a vague way. Mindtrip said “usually fine day-of, but keep an eye on schedules for March weather.” Neither warned about advance booking explicitly. This is a shared blind spot of every AI planner I’ve tested over the past two months.
A full breakdown of which places AI still misses beyond the tourist spots is in a separate article: AI Hidden Gems Finder.
2. Currency expectation. I gave the budget in euros, yet both tools used other currencies or left the currency unspecified in parts of their responses. Mindtrip showed the hotel in USD ($127). Layla showed flights in euros, but that was one sample; other trip cards may vary.
3. Specific train numbers vs. “the train.” Mindtrip in Round 3 finally gave specific trains, 05:25 and 05:55, with arrival times. Layla gave only “train Milano Centrale → Varenna → ferry Bellagio,” with no numbers and no times. If you’re actually going for the day, Mindtrip is more precise.
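To make the currency mismatch from point 2 concrete, here is a minimal Python sketch that puts the two hotel prices from this test on the same footing: euros per night. The `Quote` class, the `per_night_eur` function, and above all the 0.90 exchange rate are my own illustrative assumptions, not anything either tool exposes. Swap in the rate your bank or card issuer actually gives you on the day.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    tool: str
    amount: float   # total shown by the tool
    currency: str   # "EUR" or "USD"
    nights: int     # number of nights the amount covers


def per_night_eur(quote: Quote, usd_to_eur: float) -> float:
    """Return the nightly price in euros, rounded to cents."""
    if quote.currency == "EUR":
        rate = 1.0
    elif quote.currency == "USD":
        rate = usd_to_eur
    else:
        raise ValueError(f"unsupported currency: {quote.currency}")
    return round(quote.amount * rate / quote.nights, 2)


# Figures from this test: Layla's first hotel quote and Mindtrip's card price.
layla = Quote("Layla", 705.0, "EUR", 4)
mindtrip = Quote("Mindtrip", 127.0, "USD", 1)

# ASSUMPTION: 0.90 is a placeholder rate for illustration, not a real quote.
PLACEHOLDER_USD_TO_EUR = 0.90

for q in (layla, mindtrip):
    print(f"{q.tool}: {per_night_eur(q, PLACEHOLDER_USD_TO_EUR):.2f} EUR/night")
```

Layla’s figure does not depend on the exchange rate: €705 over four nights works out to €176.25 a night, which matches the €176/night it quoted. Mindtrip’s number only becomes comparable once you choose a rate, and that is exactly the step the tool left to me.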
Side-by-Side Comparison
| Category | Layla | Mindtrip | Wonderplan |
|---|---|---|---|
| Generates plan on first message | No (3 clarifying questions) | Yes (full plan + 2 questions at the end) | No (generated no plan at all) |
| Time blocks per day | Day-level | Hourly (09:00–12:30 / 13:00–17:30 / 19:00–21:30) | n/a |
| Hotel recommendation specificity | Specific hotel + price after clarifying questions | Hotel area in Round 1, specific hotel in Round 3 | n/a |
| Hotel price shown | €705/4 nights → €797/4 nights (with numbers) | $127/night in UI, refused in chat | n/a |
| Last Supper advance-booking warning | Yes, in the Pro-Tip section | Yes, in the “Book in advance” list | n/a |
| Bellagio ferry advance-booking warning | No (vague mention only) | No (“usually fine day-of, but keep an eye on schedules”) | n/a |
| Kid-aware micro-decisions | Some (mostly in the Pro-Tip) | Heavy (throughout the plan) | n/a |
| Catches own errors when pointed out | Yes + apology (“too much espresso”) | Yes + “fixing the inconsistency” | n/a |
| Chat vs. card disagreement | Strong: 6 mismatches in Version 2 | Moderate: 1 case ($127 paradox) | n/a |
| 2027 problem handling | Quietly shows 2027 data with a side warning | Explicitly asks “2026 or 2027?” | n/a |
| Personality / charm | Addresses you by name, espresso joke, playful | Functional, minimal personality | n/a |
| Real-time booking integration | Yes, real Ryanair/Wizz prices | Yes, real hotel cards | n/a |
| Transparency about uncertainty | Mostly implicit | Explicit year question + visible processing indicator | n/a |
| Free tier usable | Yes, 5-message limit | Yes (after accepting T&C) | Login wall with no login fields |
Winner by Category
Instead of forcing a single Layla vs Mindtrip winner, I split the result by category, because the two tools were strong and weak in different places.
Best Initial Itinerary (Round 1): Mindtrip. Time blocks, kid context, restaurant alternatives, a two-track Flex Day, transport by operator name. Layla spent Round 1 on clarifying questions, so after the first message you don’t have a plan yet.
Best Conversation Model: Split decision. For “I want to discuss the plan” → Layla. For “I want to see the plan now” → Mindtrip.
Best Hotel Recommendation Quality: Layla. A clear price on the first pass (€176/night Leonardo Hotel), then an operationally stronger pick (c-hotels Atlantic with reasoning). Mindtrip, across three rounds, never showed a price in the chat — only in the UI card.
Best Error Recovery: Mindtrip. Cleaner operational fixes, no “double plan.” Mindtrip in Round 3 cleanly rebuilt Day 5 with specific trains 05:25 and 05:55 — Layla rebuilt it, but Day 5 kept leftover Bellagio content from Version 1.
Best Personality: Layla. Addresses you by name, espresso joke, playful tone. Personality holds across all rounds. Mindtrip — functional but flat.
Best Visible Flight-Booking Integration: Layla. Real flight prices (Ryanair €135, Wizz €135) appeared with an explicit Book option. But this is not the same as best overall reliability: Layla also recommended a train flow it could not book and kept some conflicting card data after refinement. Mindtrip’s strongest booking-layer moment was the hotel card with a visible $127/night price.
Best Transparency About Uncertainty: Mindtrip. The visible processing indicator is not full reasoning transparency, but Mindtrip did ask about the uncertain year, explicitly refused to confirm a rate in chat, and clearly labeled its timing correction. Layla’s uncertainty was easier to miss behind the more polished personality.
Best Year Handling: Mindtrip. It asked “2026 or 2027” instead of quietly using the wrong year. Layla showed 2027 flights with a warning — but didn’t halt the plan.
Wonderplan: not eligible — generated no plan in the entire test.
Verdict: Which One to Use, and When
After three rounds of testing, nine findings on Layla, five on Mindtrip, and four failure modes on Wonderplan — here’s where I land.
The practical Layla vs Mindtrip takeaway is simple: use Layla when you want a more guided, booking-style experience, and use Mindtrip when you want a fast structured draft you can audit quickly.
Choose Layla if:
- You want a booking-ready plan with concrete hotel and flight prices
- You value a consultation-style interaction — “let’s discuss before I draw anything”
- Personality / charm matters to you in a tool
- You’re willing to manually check the chat vs. the card for consistency
A full Layla review is in my standalone Layla AI test.
Choose Mindtrip if:
- You want to see a concrete plan in 10 seconds without clarifying questions
- You want hourly time blocks and kid-aware micro-decisions
- You’re willing to double-check the operational arithmetic yourself (in my test, Mindtrip’s Day 5 timing didn’t add up)
- You want uncertain inputs flagged explicitly rather than quietly assumed
A full Mindtrip review is in my standalone Mindtrip test.
Don’t choose Wonderplan, at least not right now. A login wall with no fields. A destination autocomplete with an empty database. An AI generator that hangs on “AI is crafting…” A false-positive “Trip details filled!” on empty data. Until the product fixes this, pick any other tool from the roundup.
For alternatives, see my roundup of free AI planners, where I picked seven that genuinely work in 2026.
For multi-stop trips across several countries (a road trip with kids, say) you’ll need more than a tool comparison; you’ll need a full walkthrough, step by step, from the first prompt to booking. See AI-Planned Europe Itinerary: a step-by-step walkthrough.
To understand what an AI-planned trip actually costs, I have a separate article with real receipts from a 4-country trip for a family of three, where the AI estimates are compared against what we actually paid. See AI Budget Travel: a real cost breakdown.
And most importantly — check AI schedules against reality. This article is one example of why. Same prompt, three tools, three different results. When you plan your own trip, do the same: verify every recommendation against at least two sources before you book anything.
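If you like to be systematic about that check, here is a minimal Python sketch of the idea: write down what the chat said, write down what the card (or a second source) said, and list every field where they disagree. The `find_mismatches` function and the field names are hypothetical, and the sample values are only modeled on the $127 case from this test, where the card showed a price the chat would not confirm.

```python
def find_mismatches(chat: dict, card: dict) -> dict:
    """Return {field: (chat_value, card_value)} for every field that differs.

    A field missing on one side counts as a mismatch (its value is None).
    """
    mismatches = {}
    for field in sorted(set(chat) | set(card)):
        chat_value = chat.get(field)
        card_value = card.get(field)
        if chat_value != card_value:
            mismatches[field] = (chat_value, card_value)
    return mismatches


# Illustrative values only: the chat gave no nightly price, the card showed $127.
chat_says = {"trip_year": "2026", "nightly_price": None}
card_says = {"trip_year": "2026", "nightly_price": "$127"}

for field, (chat_value, card_value) in find_mismatches(chat_says, card_says).items():
    print(f"{field}: chat={chat_value!r} card={card_value!r}")
```

Anything this prints is something to resolve with a second source before you pay for it.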
For my full methodology on testing AI tools on real trips, see the prompts I use for testing. And for what AI does (and doesn’t) understand about traveling with a child, see AI Family Trip Planner.
Tested on real trips — not just desk research.
→ Best AI Trip Planner: 7 Tools Tested on a Real Milan Trip (2026)
→ Layla AI Review 2026: Honest Test of This AI Trip Planner
→ Mindtrip Review 2026: Honest Test of This AI Trip Planner
→ Wonderplan Review 2026: Honest Test of This AI Trip Planner
→ Free AI Trip Planner: 7 Tools That Actually Work in 2026




