在哪儿问问

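An AI image-recognition assistant launched by DiDi. ...

【Product Introduction】

AI-powered image recognition that finds where a photo was taken and recommends similar places worth visiting. Snap a photo and learn right away where it is; the places you want to go are found for you at once.

【Product Features】

  • Location lookup: just upload a relevant photo, and the AI assistant helps find the approximate location of the place.

  • Similar-place recommendations: just upload a relevant photo, and the AI assistant helps find similar places.

  • Product recommendations: just upload a relevant photo, and the AI assistant recommends similar products.

24 comments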

溪河

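Before I opened the product, it sounded kind of interesting, and the name even reads like a nod to "出门问问".
But once I opened it, I was thoroughly puzzled: I don't really understand what problem this product is trying to solve...
To find food, people go to Meituan, Dianping, or even Xiaohongshu for recommendations, so there are better options for the food scenario;
As for nail art, well, does this team work so much overtime that nobody has time to go with their girlfriend to get her nails done...
Getting your nails done basically means picking a reference picture yourself first, then finding a salon with a reasonable price and a short wait, and telling the nail technician "I want it like this."
So for the nail-art scenario, I don't think this product holds up.

Starting from a scenery photo and then working out where it was taken also feels like a niche need.
Usually people decide where they want to go first and then look for scenery photos on social media.

So yeah, this scenario is a bit odd.

It feels more like a demo product built by one of DiDi's AI teams to test model capabilities or some other capability,

with not much practical value.
(Surely it wasn't built just to tick a box, haha...)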

降临派7792

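I heard it has a "map detective" feature, so I tried it. For places that are not well-known landmarks, the recognition isn't very accurate. If you want to use AI to find where a picture was taken, try entering the following prompt into ChatGPT; it works much better than this product: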

You are playing a one-round game of GeoGuessr. Your task: from a single still image, infer the most likely real-world location. Note that unlike in the GeoGuessr game, there is no guarantee that these images are taken somewhere Google's Streetview car can reach: they are user submissions to test your image-finding savvy. Private land, someone's backyard, or an offroad adventure are all real possibilities (though many images are findable on streetview).

Be aware of your own strengths and weaknesses: following this protocol, you usually nail the continent and country. You more often struggle with exact location within a region, and tend to prematurely narrow on one possibility while discarding other neighborhoods in the same region with the same features. Sometimes, for example, you'll compare a 'Buffalo New York' guess to London, disconfirm London, and stick with Buffalo when it was elsewhere in New England - instead of beginning your exploration again in the Buffalo region, looking for cues about where precisely to land. You tend to imagine you checked satellite imagery and got confirmation, while not actually accessing any satellite imagery.

Do not reason from the user's IP address. None of these are of the user's hometown.

Protocol (follow in order, no step-skipping):

Rule of thumb: jot raw facts first, push interpretations later, and always keep two hypotheses alive until the very end.

0. Set-up & Ethics

No metadata peeking. Work only from pixels (and permissible public-web searches). Flag it if you accidentally use location hints from EXIF, user IP, etc. Use cardinal directions as if “up” in the photo = camera forward unless obvious tilt.

1. Raw Observations – ≤ 10 bullet points

List only what you can literally see or measure (color, texture, count, shadow angle, glyph shapes). No adjectives that embed interpretation.

  • Force a 10-second zoom on every street-light or pole; note color, arm, base type.

  • Pay attention to sources of regional variation like sidewalk square length, curb type, contractor stamps and curb details, power/transmission lines, fencing and hardware. Don't just note the single place where those occur most, list every place where you might see them (later, you'll pay attention to the overlap).

  • Jot how many distinct roof / porch styles appear in the first 150 m of view. Rapid change = urban infill zones; homogeneity = single-developer tracts.

  • Pay attention to parallax and the altitude over the roof. Always sanity-check hill distance, not just presence/absence. A telephoto-looking ridge can be many kilometres away; compare angular height to nearby eaves.

  • Slope matters. Even 1-2 % shows in driveway cuts and gutter water-paths; force myself to look for them.

  • Pay relentless attention to camera height and angle. Never confuse a slope and a flat. Slopes are one of your biggest hints - use them!

2. Clue Categories – reason separately (≤ 2 sentences each)

| Category | Guidance |
| Climate & vegetation | Leaf-on vs. leaf-off, grass hue, xeric vs. lush. |
| Geomorphology | Relief, drainage style, rock-palette / lithology. |
| Built environment | Architecture, sign glyphs, pavement markings, gate/fence craft, utilities. |
| Culture & infrastructure | Drive side, plate shapes, guardrail types, farm gear brands. |
| Astronomical / lighting | Shadow direction ⇒ hemisphere; measure angle to estimate latitude ± 0.5°. |

Separate ornamental vs. native vegetation: Tag every plant you think was planted by people (roses, agapanthus, lawn) and every plant that almost certainly grew on its own (oaks, chaparral shrubs, bunch-grass, tussock). Ask one question: “If the native pieces of landscape behind the fence were lifted out and dropped onto each candidate region, would they look out of place?” Strike any region where the answer is “yes,” or at least down-weight it.

3. First-Round Shortlist – exactly five candidates

Produce a table; make sure #1 and #5 are ≥ 160 km apart.

| Rank | Region (state / country) | Key clues that support it | Confidence (1-5) | Distance-gap rule ✓/✗ |

3½. Divergent Search-Keyword Matrix

Generic, region-neutral strings converting each physical clue into searchable text. When you are approved to search, you'll run these strings to see if you missed that those clues also pop up in some region that wasn't on your radar.

4. Choose a Tentative Leader

Name the current best guess and one alternative you’re willing to test equally hard. State why the leader edges others. Explicitly spell the disproof criteria (“If I see X, this guess dies”). Look for what should be there and isn't, too: if this is X region, I expect to see Y: is there Y? If not why not?

At this point, confirm with the user that you're ready to start the search step, where you look for images to prove or disprove this. You HAVE NOT LOOKED AT ANY IMAGES YET. Do not claim you have. Once the user gives you the go-ahead, check Redfin and Zillow if applicable, state park images, vacation pics, etcetera (compare AND contrast). You can't access Google Maps or satellite imagery due to anti-bot protocols. Do not assert you've looked at any image you have not actually looked at in depth with your OCR abilities. Search region-neutral phrases and see whether the results include any regions you hadn't given full consideration.

5. Verification Plan (tool-allowed actions)

For each surviving candidate list:

| Candidate | Element to verify | Exact search phrase / Street-View target |

Look at a map. Think about what the map implies.

6. Lock-in Pin

This step is crucial and is where you usually fail. Ask yourself 'wait! did I narrow in prematurely? are there nearby regions with the same cues?' List some possibilities. Actively seek evidence in their favor. You are an LLM, and your first guesses are 'sticky' and excessively convincing to you - be deliberate and intentional here about trying to disprove your initial guess and argue for a neighboring city. Compare these directly to the leading guess - without any favorite in mind. How much of the evidence is compatible with each location? How strong and determinative is the evidence?

Then, name the spot - or at least the best guess you have. Provide lat / long or nearest named place. Declare residual uncertainty (km radius). Admit over-confidence bias; widen error bars if all clues are “soft”.

Quick reference: measuring shadow to latitude

Grab a ruler on-screen; measure shadow length S and object height H (estimate if unknown). Solar elevation θ ≈ arctan(H / S). On date you captured (use cues from the image to guess season), latitude ≈ (90° – θ + solar declination). This should produce a range from the range of possible dates. Keep ± 0.5–1 ° as error; 1° ≈ 111 km.
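
The "measuring shadow to latitude" quick reference at the end of the prompt comes down to a few lines of arithmetic. The Python sketch below is only an illustration of that formula, not part of the original prompt. The function names are made up here, and it assumes the photo was taken near local solar noon with the sun on the equator side of the observer, which is the only case where latitude ≈ 90° – θ + declination holds. Since the date is usually unknown, it sweeps the full declination range of roughly ±23.44°.

```python
import math

# Assumption: Earth's axial tilt bounds the solar declination over a year.
MAX_DECLINATION_DEG = 23.44
KM_PER_DEGREE = 111.0  # rule of thumb quoted in the prompt


def solar_elevation_deg(object_height, shadow_length):
    """Solar elevation theta = arctan(H / S), in degrees."""
    return math.degrees(math.atan2(object_height, shadow_length))


def latitude_range_deg(object_height, shadow_length,
                       declination_range=(-MAX_DECLINATION_DEG, MAX_DECLINATION_DEG)):
    """Latitude bounds from latitude = 90 - theta + declination.

    Only meaningful near local solar noon (an assumption of this sketch).
    """
    theta = solar_elevation_deg(object_height, shadow_length)
    low = 90.0 - theta + declination_range[0]
    high = 90.0 - theta + declination_range[1]
    return low, high


if __name__ == "__main__":
    low, high = latitude_range_deg(object_height=2.0, shadow_length=2.0)
    print(f"latitude between {low:.2f} and {high:.2f} degrees")
    print(f"span of about {(high - low) * KM_PER_DEGREE:.0f} km")
```

With no hint about the season the range spans almost 47° of latitude, which is why the prompt asks for seasonal cues before this estimate is trusted.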

localhost

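I looked at the workflow and it is really bad.
It first runs an image-recognition model to extract features (if the image contains text, it simply searches for that text), then does a plain keyword search for famous buildings. If the guess hits, great; if it misses, it misses.
Then it tells you with complete certainty "this is the place" or "locked onto ...".
I photographed a church and it said Xujiahui Church; I photographed a building with a pointed top and it said it was that "Center Tower" in Pudong.
So basically whoever contributed the most training data wins.
If that's how you're going to play it, you might as well just take an image-recognition model and call Google Search.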

臻查

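During my National Day holiday trip, I kept playing with 出门问问. As long as the picture contains enough information, it can almost always pinpoint where the photo was taken.

I saw a photo of a pizza restaurant that someone had posted on WeChat Moments and wanted to eat there, so I tossed it to 出门问问.

The rough flow:

  • The agent first drew up an SOP for itself, stating explicitly: "First look carefully at the details in this photo, such as distinctive buildings, installations, signs, and the surrounding environment and atmosphere. Then use tools to analyze and verify these visual clues, and finally answer the user's question."

  • As its first step the agent ran image recognition, then extracted the keywords, scene type, and location guesses from the image, and worked out the search keywords to use in the next step.

  • After reaching a preliminary conclusion it cross-validates it, and the results are pretty good.

  • Finally it gives a precise address or similar places.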
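
The flow above (extract clues, plan searches, cross-validate, answer) can be sketched roughly as follows. This is a guess at the general shape, not the product's actual implementation: every name here (`Observation`, `plan_queries`, `cross_validate`, `answer`) is hypothetical, and the search step is replaced by a plain dictionary of which keywords the search results confirmed for each guess.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Output of the image-recognition step (all fields are hypothetical)."""
    keywords: list
    scene_type: str
    location_guesses: list


def plan_queries(obs):
    """Pair every location guess with every keyword to form search strings."""
    return [f"{guess} {kw}" for guess in obs.location_guesses for kw in obs.keywords]


def cross_validate(obs, confirmed, min_support=2):
    """Rank guesses by how many extracted keywords the search results confirmed.

    `confirmed` maps a guess to the set of keywords found in its search results.
    Guesses with fewer than `min_support` confirmed keywords are dropped.
    """
    scored = []
    for guess in obs.location_guesses:
        support = len(set(obs.keywords) & confirmed.get(guess, set()))
        if support >= min_support:
            scored.append((support, guess))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [guess for _, guess in scored]


def answer(verified):
    """Give the top verified guess, or fall back to similar places."""
    if verified:
        return f"Most likely: {verified[0]}"
    return "No guess could be verified; showing similar places instead."


if __name__ == "__main__":
    obs = Observation(
        keywords=["wood-fired oven", "red awning", "neon sign"],
        scene_type="restaurant",
        location_guesses=["Pizzeria A", "Pizzeria B"],
    )
    confirmed = {
        "Pizzeria A": {"red awning"},
        "Pizzeria B": {"wood-fired oven", "neon sign"},
    }
    print(plan_queries(obs))
    print(answer(cross_validate(obs, confirmed)))
```

Requiring at least two independently confirmed clues is one simple way to keep a single lucky keyword match from being reported as a certain answer.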

岳幸运

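Friends who often play detective will know that this tool is quite handy for detective work: upload any picture and it will analyze where it was taken. Whether or not the analysis is accurate, at least it has reasoned it through meticulously.

It is accurate on the more common, social-media-famous street scenes. For example, given a Macau street view, it picked out landmark elements such as the light-blue souvenir shop and the panda sculpture.

It suits cases like this:

someone posted about a trip on Moments, you want to know where it is, but you aren't close to them and need to find out quietly.

It suits cases like this:

a friend suspects their partner is cheating; the partner said they were away on a business trip and then posted on Moments, and your friend wants you to quietly look into it.

In short, it suits all kinds of sneaky scenarios, and using it always feels very furtive.

But for detectives like us it's a good thing (〃'▽'〃)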

Ray

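Let me review my former employer's AI product, as objectively as I can:

1. I learned fairly early that a business unit inside the company had been organized to explore LLM-based applications. But now that the finished product is out, I still can't help sighing: for a company that already has a mature ecological niche, whether it can build something with this generation of AI really depends on whether the scenarios it owns are a good fit, and DiDi's clearly are not.

2. Compared with "AI小滴", which is basically a need invented where none exists (do most people hailing a ride want efficiency, or AI assistance with complex decisions?), 在哪儿问问 probably comes from a product manager with an overactive imagination. I remember that when o3 was released, guessing locations from images went viral for a while, but that was mainly a way to show off the reasoning model's vision-based CoT abilities, and people enjoying it doesn't make it a must-have... The only scenario where I have this need myself is seeing a place someone posted on Moments with no location attached, when I might want to satisfy my curiosity. But ChatGPT and Doubao can fully cover such a low-frequency need; does it really need a standalone product?

3. Even comparing capability alone, this product's accuracy doesn't match the new "采集" feature in 圆周旅迹. The reason is simple: the latter takes the GPS information from the photo's EXIF to narrow the range and then adds the VL model's reasoning, so its results are simply more accurate. And don't tell me that many images have no EXIF; you didn't even handle the ones that do, did you? In the end it is still completely unusable, stuck in a vicious circle: photos with obvious features don't need asking about, and for photos without obvious features asking doesn't help.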
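
To make point 3 concrete, here is a minimal Python sketch of turning EXIF GPS fields into a coordinate that could narrow the search before any visual reasoning. It is not how 圆周旅迹 or DiDi implement it. It assumes the EXIF data has already been parsed into a plain dictionary by some other tool; the tag names follow the EXIF GPS tags (`GPSLatitude`, `GPSLatitudeRef`, `GPSLongitude`, `GPSLongitudeRef`), while the function names `dms_to_decimal` and `gps_prior` are hypothetical.

```python
def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus N/S/E/W into signed decimal degrees."""
    degrees, minutes, seconds = dms
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative by convention.
    return -value if ref in ("S", "W") else value


def gps_prior(exif_gps):
    """Return (lat, lon) if the photo carries usable GPS tags, else None.

    `exif_gps` is assumed to be an already-parsed dict of EXIF GPS tags.
    """
    try:
        lat = dms_to_decimal(exif_gps["GPSLatitude"], exif_gps["GPSLatitudeRef"])
        lon = dms_to_decimal(exif_gps["GPSLongitude"], exif_gps["GPSLongitudeRef"])
    except (KeyError, TypeError, ValueError):
        return None  # no usable EXIF: fall back to visual clues only
    return lat, lon


if __name__ == "__main__":
    tags = {
        "GPSLatitude": (31, 30, 0), "GPSLatitudeRef": "N",
        "GPSLongitude": (121, 15, 0), "GPSLongitudeRef": "E",
    }
    print(gps_prior(tags))  # use as a prior when present
    print(gps_prior({}))    # stripped metadata
```

When the tags are missing, the function returns `None` and the visual pipeline runs unchanged, so handling the photos that do carry EXIF costs nothing for those that don't.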

All I can say is: keep at it.

查理一世

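The interaction is certainly novel. I remember that when we built the first version of our app/mini program, there wasn't much content and the bosses couldn't agree, and in the end a designer colleague apparently came across this kind of layout from abroad while scrolling Douyin. I didn't expect to see it again in this app; what a coincidence.

I asked a question about a location; the final answer was accurate and the response was decent. It reminded me of the early days of Xiaohongshu's 点点; I think I used to ask that AI product similar "where is this?" questions too. I just can't quite judge this product form. For example, you might come across a nice picture on Moments or Xiaohongshu, and some people would simply comment underneath, "Where is this?" Maybe the product is too single-purpose, or it hasn't connected the whole chain end to end?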

Y4tacker

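I don't often have this use case. As an experimental product that hasn't been widely promoted, I'd say it is driven more by technology than by demand. Available only as a WeChat mini program, it has neither a user base large enough to generate many reviews nor technology mature enough to solve the fundamental problems of AI image recognition. Perhaps its only value is tight integration with DiDi ride-hailing in the future, such as hailing a car the moment a location is recognized? But I think this is still essentially "people looking for information" and does not truly achieve "information proactively serving people."

In my daily life, for places farther away I either browse Douyin, Xiaohongshu, or some forums and happen upon recommendations; for nearby places I open apps like Meituan, 高德扫街, or Dianping, do a quick search, and head out. The need to reverse-locate a place from a photo isn't nonexistent, but it is certainly uncommon; let's call it a long-tail capability.

Here are a few shortcomings from my experience, bearing in mind that I'm no professional:

  1. Low recognition accuracy. I ran a batch of location photos from my phone's album through it. After many attempts I found that accuracy is somewhat higher for pictures with obvious landmarks in the background, while pictures without clear geographic or architectural landmarks are judged purely by hallucination. Here's the problem: I can basically recognize the ones with clear landmarks myself, or a simple Baidu image search will tell me, so I don't need this software for that; and the ones without clear landmarks it can't identify either, so why would I use it? The same goes for food recognition. Dishes like 烤匠's, whose surface is covered in assorted toppings and shows few typical fish features, still weren't recognized correctly even though I deliberately provided a clear picture taken at an angle showing the side of the plate. Of course, the product has only just launched, so I can only hope it keeps improving.

  2. Bias remains unaddressed. The paper "AI Sees Your Location, But With A Bias Toward The Wealthy World" noted that vision-language models show three major biases: a preference for well-known cities, accuracy gaps across regions, and a spurious correlation with development level. These match what most of us would guess. Based on my tests and some of the CoT output, I did not perceive any engineering-side improvement on these issues in this WeChat mini program. I'll leave a question mark here, since this is only a guess based on my own experience.

  3. The interaction design lacks a retry mechanism. During testing I hit two bugs. One was probably a network error partway through the LLM API call, so the output got stuck halfway and never continued. I can live with retrying via the small refresh button below, but it would be better handled through engineering. The other was a front-end state bug: all the results had clearly finished rendering, yet the status below kept showing "识别中" (recognizing) and could not be stopped. This really annoyed me; I had to restart WeChat to finally break out of it. For a company this large that is a fairly basic mistake; it's not as if you have no testers.

  4. Image moderation is too strict. I don't know whether it relies on the model's own recognition or on a third-party moderation API, but I submitted a photo of a publicity wall outside a temple bearing the text "天西尺咫". The image contains neither sensitive content nor prohibited text, so why was it blocked?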
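
For the first bug in item 3 (output stalling after a network error mid-call), one common engineering-side fix is an automatic retry with exponential backoff, so the user does not have to press refresh. The Python sketch below is a generic illustration, not the mini program's code; `call_with_retry` and its parameters are hypothetical, and it assumes the failure surfaces as a `ConnectionError`.

```python
import time


def call_with_retry(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `fn`, retrying on ConnectionError with exponential backoff.

    Delays are base_delay, 2 * base_delay, 4 * base_delay, ...
    The last error is re-raised once all attempts are used up.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError as err:
            last_error = err
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_model_call():
        # Hypothetical stand-in for the LLM API request: fails twice, then succeeds.
        calls["count"] += 1
        if calls["count"] < 3:
            raise ConnectionError("stream interrupted")
        return "recognition result"

    print(call_with_retry(flaky_model_call, base_delay=0.01))
```

The second bug (a status stuck on "识别中") is a separate state-management problem; resetting the status in a `finally`-style cleanup step after the call completes or fails is the usual guard, though that is not shown here.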

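As for strengths, frankly I can't think of any obvious ones at this stage. The only foreseeable possibility is deep integration with DiDi's mobility ecosystem in the future, such as one-tap ride-hailing once a destination is recognized, or folding frequently visited places into a personalized recommendation system. But back to reality: who downloads the DiDi app specifically to hail a ride? It's all integrated into navigation apps now. Without being irreplaceable in some scenario, a feature like this easily becomes dead weight. So what would it rely on, coupons? Looking at it another way, launching only as a mini program probably means it isn't taken seriously internally either, haha.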

fe1fei

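【Surprisingly accurate】

I first teased the Agent with a photo of a badminton club; it didn't recommend any courts near me.

Figuring that teasing it twice beats teasing it once, I raised the difficulty: I gave it a scenery photo and let it guess.

The result was truly astonishing (the reasoning was well-founded). I think most people who have been there would struggle to tell right away where this is. P.S. this is not one of the popular photo spots on Mount Teide!!!

So precise. While I'm at it, let me recommend a not-so-mainstream island: Tenerife~ You can go whale watching and climb Mount Teide.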

乔查查

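On opening it, you see recommendation cards and an image upload. The experience relies heavily on a reference image: if you have no reference image and just want some kind of food or a certain type of place, there is no separate entry point for expressing that query. Overall it's mediocre.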