🚀 𝐈𝐧𝐭𝐫𝐨𝐝𝐮𝐜𝐢𝐧𝐠 𝐌𝐢𝐥𝐯𝐮𝐬 𝟐.𝟔: 𝐁𝐮𝐢𝐥𝐭 𝐟𝐨𝐫 𝐒𝐜𝐚𝐥𝐞, 𝐃𝐞𝐬𝐢𝐠𝐧𝐞𝐝 𝐭𝐨 𝐑𝐞𝐝𝐮𝐜𝐞 𝐂𝐨𝐬𝐭𝐬!

You can now slash infrastructure costs while supercharging performance: 72% memory reduction, 4x faster queries, and up to 3-4x higher full-text search QPS than Elasticsearch, all while scaling seamlessly to hundreds of billions of vectors.

Swipe through to discover:
🔥 𝐑𝐚𝐁𝐢𝐭𝐐 𝟏-𝐛𝐢𝐭 𝐐𝐮𝐚𝐧𝐭𝐢𝐳𝐚𝐭𝐢𝐨𝐧 — 72% memory reduction + 4x faster QPS without recall loss
⚡ 𝐅𝐚𝐬𝐭𝐞𝐫 𝐅𝐮𝐥𝐥-𝐭𝐞𝐱𝐭 𝐒𝐞𝐚𝐫𝐜𝐡 — Up to 3-4x higher QPS than Elasticsearch
🛠️ JSON Path Index — 99% latency reduction (140ms → 1.5ms) for complex filtering
📊 𝐃𝐚𝐭𝐚-𝐈𝐧, 𝐃𝐚𝐭𝐚-𝐎𝐮𝐭 𝐏𝐢𝐩𝐞𝐥𝐢𝐧𝐞 — Raw text/image/audio to search results in one step
🎯 𝐇𝐮𝐧𝐝𝐫𝐞𝐝𝐬 𝐨𝐟 𝐁𝐢𝐥𝐥𝐢𝐨𝐧𝐬 𝐒𝐜𝐚𝐥𝐞 — Woodpecker WAL + streaming architecture
...and much more!

Be a part of the flock, join Milvus today! More than 10,000 organizations worldwide already run on Milvus.

Explore Milvus 2.6 👉 https://lnkd.in/gsEvM6Z2
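As a taste of the release, here is a minimal, hypothetical PyMilvus sketch of enabling RaBitQ 1-bit quantization on a new collection. The index type name (IVF_RABITQ) and its parameters are assumptions drawn from the announcement, not verified API, so check the 2.6 documentation before relying on them.

```python
# Hypothetical sketch: RaBitQ 1-bit quantization in Milvus 2.6.
# Index type name and params are assumed from the announcement; verify in docs.
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")

index_params = client.prepare_index_params()
index_params.add_index(
    field_name="vector",
    index_type="IVF_RABITQ",   # assumed name of the 1-bit RaBitQ index
    metric_type="COSINE",
    params={"nlist": 1024},
)

client.create_collection(
    collection_name="docs",
    dimension=768,             # must match your embedding model's output size
    index_params=index_params,
)
```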
Milvus, created by Zilliz
Software Development
Redwood Shores, CA · 6,768 followers
The Vector Database That Delivers Scale, Performance & Cost-Efficiency for Production AI
About us
Milvus is a highly flexible, reliable, and blazing-fast cloud-native, open-source vector database. It powers embedding similarity search and AI applications and strives to make vector databases accessible to every organization. Milvus can store, index, and manage a billion-plus embedding vectors generated by deep neural networks and other machine learning (ML) models. This level of scale is vital for handling the volumes of unstructured data organizations generate, helping them analyze and act on that data to provide better service, reduce fraud, avoid downtime, and make decisions faster. Milvus is a graduated-stage project of the LF AI & Data Foundation.
- Website
- https://milvus.io
- Industry
- Software Development
- Company size
- 51-200 employees
- Headquarters
- Redwood Shores, CA
- Type
- Nonprofit
- Founded
- 2019
- Specialties
- Open Source and RAG
Locations
- Primary: Redwood Shores, CA 94065, US
Updates
-
🚨 𝟵𝟬% 𝗼𝗳 𝘃𝗲𝗰𝘁𝗼𝗿 𝗱𝗮𝘁𝗮𝗯𝗮𝘀𝗲 𝗯𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸𝘀 𝗹𝗶𝗲 𝘁𝗼 𝘆𝗼𝘂 𝘄𝗶𝘁𝗵 𝗱𝗲𝗰𝗮𝗱𝗲-𝗼𝗹𝗱 𝗱𝗮𝘁𝗮𝘀𝗲𝘁𝘀

Choosing effective benchmarking methods directly determines whether your AI infrastructure investment becomes a success story or an expensive lesson.

🌟 𝗪𝗵𝗮𝘁 𝗠𝗮𝗸𝗲𝘀 𝗩𝗲𝗰𝘁𝗼𝗿𝗗𝗕𝗕𝗲𝗻𝗰𝗵 𝗗𝗶𝗳𝗳𝗲𝗿𝗲𝗻𝘁?
Unlike traditional benchmarks that test with static data and pre-built indexes, 𝗩𝗲𝗰𝘁𝗼𝗿𝗗𝗕𝗕𝗲𝗻𝗰𝗵 addresses fundamental flaws by providing:
→ 𝗣𝗿𝗼𝗱𝘂𝗰𝘁𝗶𝗼𝗻-𝗿𝗲𝗮𝗹𝗶𝘀𝘁𝗶𝗰 𝘁𝗲𝘀𝘁𝗶𝗻𝗴: Simulates concurrent operations and streaming ingestion
→ 𝗠𝗼𝗱𝗲𝗿𝗻 𝗱𝗮𝘁𝗮𝘀𝗲𝘁𝘀: Uses current embedding models instead of legacy SIFT data
→ 𝗣𝗿𝗼𝗱𝘂𝗰𝘁𝗶𝗼𝗻-𝗳𝗼𝗰𝘂𝘀𝗲𝗱 𝗺𝗲𝘁𝗿𝗶𝗰𝘀: Measures P95/P99 latency and sustainable throughput
→ 𝗖𝘂𝘀𝘁𝗼𝗺 𝗱𝗮𝘁𝗮𝘀𝗲𝘁 𝘀𝘂𝗽𝗽𝗼𝗿𝘁: Tests with your own industry-specific data

📋 𝗛𝗼𝘄 𝘁𝗼 𝗥𝘂𝗻 𝗮 𝗥𝗲𝗹𝗶𝗮𝗯𝗹𝗲 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗼𝗻 𝘄𝗶𝘁𝗵 𝗥𝗲𝗮𝗹 𝗗𝗮𝘁𝗮
• 𝗨𝘀𝗲 𝗩𝗲𝗰𝘁𝗼𝗿𝗗𝗕𝗕𝗲𝗻𝗰𝗵 – An open-source benchmark tool supporting 10+ databases with production-like traffic simulation
• 𝗖𝗼𝘃𝗲𝗿 𝗱𝗶𝘃𝗲𝗿𝘀𝗲 𝘀𝗰𝗲𝗻𝗮𝗿𝗶𝗼𝘀 – Test streaming ingestion, metadata filtering, and concurrent operations that reveal actual bottlenecks
• 𝗙𝗼𝗰𝘂𝘀 𝗼𝗻 𝗽𝗿𝗼𝗱𝘂𝗰𝘁𝗶𝗼𝗻 𝗺𝗲𝘁𝗿𝗶𝗰𝘀 – Measure P95/P99 latency, sustainable QPS, and recall accuracy with your actual data distributions (see the sketch after this post)
• 𝗨𝘀𝗲 𝗰𝘂𝘀𝘁𝗼𝗺𝗶𝘇𝗮𝗯𝗹𝗲 𝗱𝗮𝘁𝗮𝘀𝗲𝘁𝘀 – Test with your own vector dataset instead of generic datasets to reflect real performance in your ___domain

💡 The difference is dramatic: trustworthy performance evaluation based on your actual dataset, search patterns, and traffic characteristics.

Learn more: https://lnkd.in/ec9_DwPn

———
👉 Follow Milvus, created by Zilliz, for everything related to unstructured data!
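The production-metrics point deserves a concrete illustration. Below is a minimal Python sketch, assuming a running local Milvus instance and an existing 768-dimensional collection named docs (both assumptions): it measures P95/P99 search latency under concurrent load, which is exactly the tail behavior that single-threaded averages hide.

```python
# Minimal sketch: measure P95/P99 search latency under concurrent load.
# Assumes a local Milvus with a 768-d collection named "docs" already loaded.
import time
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")
queries = np.random.rand(1000, 768).tolist()  # stand-in for real query vectors

def timed_search(vec):
    start = time.perf_counter()
    client.search(collection_name="docs", data=[vec], limit=10)
    return time.perf_counter() - start

# Concurrency surfaces tail latency that a sequential loop would never show.
with ThreadPoolExecutor(max_workers=16) as pool:
    latencies = list(pool.map(timed_search, queries))

print(f"P95: {np.percentile(latencies, 95) * 1000:.1f} ms")
print(f"P99: {np.percentile(latencies, 99) * 1000:.1f} ms")
```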
-
Building Smart Workflows with 𝐑𝐞𝐟𝐥𝐲 𝐱 𝐌𝐢𝐥𝐯𝐮𝐬

𝐑𝐞𝐟𝐥𝐲: an AI-native content platform with drag-and-drop workflow building
𝐌𝐢𝐥𝐯𝐮𝐬: a vector database that handles unstructured data (documents, images, videos) with semantic search

Detailed tutorial: https://lnkd.in/eFAnybDh

————
👉 Follow Milvus, created by Zilliz, for everything related to unstructured data!
-
Claude Code is exceptional at understanding and refactoring complex codebases. But… it can only discover files through your project structure. For large projects with millions of lines of code, this means Claude Code never gets the full picture of your entire codebase.

Introducing 𝐜𝐥𝐚𝐮𝐝𝐞-𝐜𝐨𝐧𝐭𝐞𝐱𝐭, an MCP plugin that gives Claude Code semantic search access to your entire codebase, so it can understand the complete project context and find relevant code across all files.

What it enables:
🔍 𝐖𝐡𝐨𝐥𝐞 𝐂𝐨𝐝𝐞𝐛𝐚𝐬𝐞 𝐒𝐞𝐚𝐫𝐜𝐡 – Ask "find functions that handle user authentication" and Claude searches across your entire project, not just current files.
🧠 𝐂𝐨𝐦𝐩𝐥𝐞𝐭𝐞 𝐏𝐫𝐨𝐣𝐞𝐜𝐭 𝐔𝐧𝐝𝐞𝐫𝐬𝐭𝐚𝐧𝐝𝐢𝐧𝐠 – Claude can see how different parts of your codebase relate, even across millions of lines of code.
⚡ 𝐄𝐟𝐟𝐢𝐜𝐢𝐞𝐧𝐭 𝐂𝐨𝐝𝐞𝐛𝐚𝐬𝐞 𝐈𝐧𝐝𝐞𝐱𝐢𝐧𝐠 – Indexes your entire project efficiently using Merkle trees and intelligent AST-based code chunking.

Giving AI access to complete codebases is the missing piece for truly collaborative AI development.

𝐜𝐥𝐚𝐮𝐝𝐞-𝐜𝐨𝐧𝐭𝐞𝐱𝐭 is now open source: https://lnkd.in/gBHR9En5

————
👉 Follow Milvus, created by Zilliz, for everything related to unstructured data!
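To make the AST-chunking idea concrete, here is a toy Python sketch. It is not claude-context's actual implementation, just the core idea: chunk boundaries follow code structure (one chunk per top-level function or class) instead of arbitrary line counts, shown for Python files via the standard library's ast module.

```python
# Toy illustration of AST-based chunking (not claude-context's actual code):
# split a Python source file into one chunk per top-level function/class,
# so each embedded chunk is a semantically complete unit.
import ast

def chunk_by_ast(source: str) -> list[str]:
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # end_lineno is available on Python 3.8+
            chunks.append("\n".join(lines[node.lineno - 1 : node.end_lineno]))
    return chunks

if __name__ == "__main__":
    src = open("example.py").read()  # any Python file to chunk
    for i, chunk in enumerate(chunk_by_ast(src)):
        print(f"--- chunk {i} ---\n{chunk}\n")
```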
-
𝗔𝗴𝗲𝗻𝘁𝘀 𝘃𝘀 𝗪𝗼𝗿𝗸𝗳𝗹𝗼𝘄𝘀: 𝗖𝗵𝗼𝗼𝘀𝗶𝗻𝗴 𝘁𝗵𝗲 𝗿𝗶𝗴𝗵𝘁 𝘁𝗼𝗼𝗹

Here's what we've learned after building both: 𝗶𝘁'𝘀 𝗻𝗼𝘁 𝗮𝗯𝗼𝘂𝘁 𝗮𝗴𝗲𝗻𝘁𝘀 𝗯𝗲𝗶𝗻𝗴 𝗯𝗮𝗱, 𝗶𝘁'𝘀 𝗮𝗯𝗼𝘂𝘁 𝘂𝘀𝗶𝗻𝗴 𝘁𝗵𝗲 𝗿𝗶𝗴𝗵𝘁 𝗮𝗽𝗽𝗿𝗼𝗮𝗰𝗵 𝗳𝗼𝗿 𝘁𝗵𝗲 𝗿𝗶𝗴𝗵𝘁 𝗽𝗿𝗼𝗯𝗹𝗲𝗺.

𝗪𝗵𝗲𝗻 𝘁𝗼 𝗰𝗵𝗼𝗼𝘀𝗲 𝗪𝗢𝗥𝗞𝗙𝗟𝗢𝗪𝗦:
✅ 𝗣𝗿𝗲𝗱𝗶𝗰𝘁𝗮𝗯𝗹𝗲 𝗽𝗿𝗼𝗰𝗲𝘀𝘀𝗲𝘀 – You know the steps upfront
✅ 𝗣𝗿𝗼𝗱𝘂𝗰𝘁𝗶𝗼𝗻 𝘀𝘆𝘀𝘁𝗲𝗺𝘀 – Reliability and debugging matter most
✅ 𝗛𝗶𝗴𝗵-𝘀𝘁𝗮𝗸𝗲𝘀 𝗱𝗲𝗰𝗶𝘀𝗶𝗼𝗻𝘀 – Financial, medical, or compliance tasks
✅ 𝗧𝗲𝗮𝗺 𝗰𝗼𝗼𝗿𝗱𝗶𝗻𝗮𝘁𝗶𝗼𝗻 – Multiple people need to understand the logic

𝗪𝗼𝗿𝗸𝗳𝗹𝗼𝘄 𝗽𝗮𝘁𝘁𝗲𝗿𝗻𝘀 𝘁𝗵𝗮𝘁 𝘄𝗼𝗿𝗸 (a minimal prompt-chaining sketch follows this post):
- 𝗣𝗿𝗼𝗺𝗽𝘁 𝗰𝗵𝗮𝗶𝗻𝗶𝗻𝗴 – Sequential steps with clear handoffs
- 𝗣𝗮𝗿𝗮𝗹𝗹𝗲𝗹𝗶𝘇𝗮𝘁𝗶𝗼𝗻 – Run independent tasks concurrently
- 𝗦𝗺𝗮𝗿𝘁 𝗿𝗼𝘂𝘁𝗶𝗻𝗴 – Direct inputs to specialized handlers
- 𝗢𝗿𝗰𝗵𝗲𝘀𝘁𝗿𝗮𝘁𝗼𝗿-𝘄𝗼𝗿𝗸𝗲𝗿 – Break down tasks, delegate execution
- 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗼𝗿-𝗼𝗽𝘁𝗶𝗺𝗶𝘇𝗲𝗿 – Iterative improvement with feedback

𝗪𝗵𝗲𝗻 𝘁𝗼 𝗰𝗵𝗼𝗼𝘀𝗲 𝗔𝗚𝗘𝗡𝗧𝗦:
✅ 𝗖𝗿𝗲𝗮𝘁𝗶𝘃𝗲 𝗲𝘅𝗽𝗹𝗼𝗿𝗮𝘁𝗶𝗼𝗻 – Brainstorming, ideation, content creation
✅ 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵 𝘁𝗮𝘀𝗸𝘀 – Unknown paths, dynamic investigation
✅ 𝗛𝘂𝗺𝗮𝗻-𝗶𝗻-𝘁𝗵𝗲-𝗹𝗼𝗼𝗽 – Where oversight can catch and correct mistakes
✅ 𝗔𝗱𝗮𝗽𝘁𝗶𝘃𝗲 𝘀𝗰𝗲𝗻𝗮𝗿𝗶𝗼𝘀 – When the workflow truly can't be predefined

The key insight: 𝗔𝗴𝗲𝗻𝘁𝘀 excel when you need flexibility and can tolerate unpredictability. 𝗪𝗼𝗿𝗸𝗳𝗹𝗼𝘄𝘀 win when you need reliability and clear debugging paths.

𝗖𝗼𝗺𝗺𝗼𝗻 𝗺𝗶𝘀𝘁𝗮𝗸𝗲𝘀 𝘄𝗲 𝘀𝗲𝗲:
- Using agents for predictable business processes
- Building workflows for truly dynamic creative tasks
- Skipping the "Do I really need this complexity?" question

What's your experience? Where have you found the sweet spot between agents and workflows?

———
👉 Follow Milvus, created by Zilliz, for everything related to unstructured data!
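As promised above, a minimal sketch of the prompt-chaining pattern. The llm() helper is a hypothetical placeholder for whatever model client you use; the point is the explicit, checkable handoff between steps, which is what makes workflows easier to debug than free-running agents.

```python
# Minimal sketch of the "prompt chaining" workflow pattern.
# llm() is a hypothetical stub; swap in any chat-completion client.
def llm(prompt: str) -> str:
    # Replace with a real model call (OpenAI, Anthropic, a local server, ...).
    return "- point one\n- point two\n- point three"

def summarize_then_translate(document: str) -> str:
    # Step 1 produces a small, inspectable intermediate artifact.
    summary = llm(f"Summarize in 3 bullet points:\n{document}")
    # Gate: validate the handoff before the next step runs. Failing loudly
    # here is the debugging advantage workflows have over agents.
    if summary.count("\n") < 2:
        raise ValueError("summary failed validation; stopping the chain")
    # Step 2 consumes step 1's validated output, never the raw input.
    return llm(f"Translate these bullet points to French:\n{summary}")

print(summarize_then_translate("…your document text…"))
```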
-
Build your first RAG system by following this clear 9-step pipeline.

1. 𝐈𝐧𝐠𝐞𝐬𝐭 & 𝐏𝐫𝐞𝐩𝐫𝐨𝐜𝐞𝐬𝐬 𝐃𝐚𝐭𝐚
Quality data ingestion sets the foundation for your entire pipeline.
𝐊𝐞𝐲 𝐓𝐚𝐬𝐤𝐬:
🔶 Extract text from PDFs, HTML, databases
🔶 Clean: Remove headers, footers, and irrelevant metadata
🔶 Normalize: Fix encoding, handle special characters
🔶 Deduplicate content to prevent redundant retrieval

2. 𝐒𝐩𝐥𝐢𝐭 𝐈𝐧𝐭𝐨 𝐂𝐡𝐮𝐧𝐤𝐬
Create semantically coherent pieces for optimal retrieval.
𝐒𝐭𝐫𝐚𝐭𝐞𝐠𝐢𝐞𝐬:
🔶 Fixed-size: 500–1000 tokens with 10–20% overlap
🔶 Semantic: Split on paragraphs, sentences, sections
🔶 Structure-aware: Respect headers, tables, lists

3. 𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐞 𝐄𝐦𝐛𝐞𝐝𝐝𝐢𝐧𝐠𝐬
Convert text into vector representations for semantic search.
𝐌𝐨𝐝𝐞𝐥 𝐎𝐩𝐭𝐢𝐨𝐧𝐬:
🔶 General: OpenAI text-embedding-3-large, Cohere embed-v3
🔶 Open source: sentence-transformers, BGE
🔶 Domain-specific: Fine-tuned for legal, medical, and technical content

4. 𝐒𝐭𝐨𝐫𝐞 𝐢𝐧 𝐕𝐞𝐜𝐭𝐨𝐫 𝐃𝐁 & 𝐈𝐧𝐝𝐞𝐱
Enable fast similarity search with proper indexing.
𝐃𝐚𝐭𝐚𝐛𝐚𝐬𝐞 𝐎𝐩𝐭𝐢𝐨𝐧𝐬:
🔶 Cloud: Zilliz
🔶 Self-hosted: Milvus

5. 𝐑𝐞𝐭𝐫𝐢𝐞𝐯𝐞 𝐈𝐧𝐟𝐨𝐫𝐦𝐚𝐭𝐢𝐨𝐧
Transform queries into relevant context using advanced techniques.
𝐑𝐞𝐭𝐫𝐢𝐞𝐯𝐚𝐥 𝐌𝐞𝐭𝐡𝐨𝐝𝐬:
🔶 Semantic: Vector similarity (cosine distance)
🔶 Keyword: BM25 for exact matches
🔶 Hybrid: Combine both
🔶 Re-ranking: Use cross-encoders for result optimization
(A condensed code sketch of steps 2-5 follows this post.)

6. 𝐎𝐫𝐜𝐡𝐞𝐬𝐭𝐫𝐚𝐭𝐞 𝐭𝐡𝐞 𝐏𝐢𝐩𝐞𝐥𝐢𝐧𝐞
Build your workflow and manage the flow.
𝐅𝐫𝐚𝐦𝐞𝐰𝐨𝐫𝐤 𝐎𝐩𝐭𝐢𝐨𝐧𝐬:
🔶 LangChain/LlamaIndex: High-level RAG abstractions
🔶 n8n: Dedicated workflow automation platform

7. 𝐒𝐞𝐥𝐞𝐜𝐭 𝐋𝐋𝐌𝐬 𝐟𝐨𝐫 𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐨𝐧
Choose models that synthesize retrieved information effectively.
𝐌𝐨𝐝𝐞𝐥 𝐂𝐚𝐭𝐞𝐠𝐨𝐫𝐢𝐞𝐬:
🔶 Cloud APIs: GPT-5, Claude, Gemini Pro (highest quality)
🔶 Open source: Llama 3, Mistral, Qwen (cost-effective)
🔶 Local: Ollama, vLLM (privacy-sensitive)

8. 𝐀𝐝𝐝 𝐎𝐛𝐬𝐞𝐫𝐯𝐚𝐛𝐢𝐥𝐢𝐭𝐲
Monitor performance and costs for production readiness.
𝐄𝐬𝐬𝐞𝐧𝐭𝐢𝐚𝐥 𝐌𝐞𝐭𝐫𝐢𝐜𝐬:
🔶 Retrieval quality: Precision@K, recall, relevance scores
🔶 Generation quality: Factuality, completeness, citation accuracy
🔶 System performance: Latency, throughput, error rates
🔶 Tooling: LangSmith, Langfuse, or custom dashboards

9. 𝐄𝐯𝐚𝐥𝐮𝐚𝐭𝐞 & 𝐈𝐦𝐩𝐫𝐨𝐯𝐞
Continuously optimize your system based on real-world performance.
𝐄𝐯𝐚𝐥𝐮𝐚𝐭𝐢𝐨𝐧 𝐀𝐩𝐩𝐫𝐨𝐚𝐜𝐡:
🔶 Automated: RAGAS, TruLens for comprehensive metrics
🔶 Human: Expert review and user feedback
🔶 Key dimensions: Faithfulness, relevance, precision, recall
🔶 Iterate on chunk sizes, embedding models, and prompts based on results

Start Simple -> Measure Everything -> Iterate Based on Data -> Scale Gradually

—
👉 Follow Milvus, created by Zilliz, for everything related to unstructured data!
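As referenced in step 5, here is a condensed, illustrative sketch of steps 2-5. The embedding model, chunk sizes, and collection settings are example choices, not recommendations; it assumes a local Milvus on the default port and a file handbook.txt to index.

```python
# Condensed sketch of RAG steps 2-5: chunk, embed, store in Milvus, retrieve.
# Model and chunk sizes are illustrative; tune them for your data (step 9).
from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-small-en-v1.5")  # 384-d open-source embedder
client = MilvusClient(uri="http://localhost:19530")
client.create_collection(collection_name="rag_chunks", dimension=384)

# Step 2: fixed-size chunking with overlap (~500 chars, 20% overlap here).
def chunk(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    return [text[i : i + size] for i in range(0, len(text), size - overlap)]

chunks = chunk(open("handbook.txt").read())

# Steps 3-4: embed each chunk and insert alongside its source text.
vectors = model.encode(chunks).tolist()
client.insert(
    collection_name="rag_chunks",
    data=[{"id": i, "vector": v, "text": t}
          for i, (v, t) in enumerate(zip(vectors, chunks))],
)

# Step 5: semantic retrieval for a user query.
hits = client.search(
    collection_name="rag_chunks",
    data=model.encode(["What is the vacation policy?"]).tolist(),
    limit=3,
    output_fields=["text"],
)
print(hits)
```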
-
🎤 𝐌𝐢𝐥𝐯𝐮𝐬 𝐚𝐭 𝐀𝐈 𝐓𝐢𝐧𝐤𝐞𝐫𝐞𝐫𝐬 𝐒𝐢𝐧𝐠𝐚𝐩𝐨𝐫𝐞 𝐌𝐞𝐞𝐭𝐮𝐩 𝐭𝐨𝐦𝐨𝐫𝐫𝐨𝐰!

Ivan Tang, Solutions Architect at Zilliz, will take the stage to showcase how builders bring ideas to life with scalable agentic workflows.

Speech topic: 𝐙𝐞𝐫𝐨-𝐂𝐨𝐝𝐞, 𝐈𝐧𝐟𝐢𝐧𝐢𝐭𝐞 𝐒𝐜𝐚𝐥𝐞: 𝐅𝐎𝐒𝐒-𝐩𝐨𝐰𝐞𝐫𝐞𝐝 𝐚𝐠𝐞𝐧𝐭𝐢𝐜 𝐩𝐢𝐩𝐞𝐥𝐢𝐧𝐞𝐬 𝐰𝐢𝐭𝐡 𝐊𝐚𝐟𝐤𝐚 𝐚𝐧𝐝 𝐌𝐢𝐥𝐯𝐮𝐬

Huge thanks to AI Tinkerers, a vibrant cross-sectoral community, for bringing together hands-on builders in foundation models and generative AI. 🙌

Event details:
📅 Date: August 12th, 2025
⏰ Time: 6:00 PM – 9:00 PM
📍 Venue: AWS Singapore Office, 5 minutes from Downtown MRT station
🔗 Register here: https://lnkd.in/eGqeA5ND
-
Milvus, created by Zilliz reposted this
I’m a big fan of Claude Code. But… I wish this exceptional AI coder could also remember every nitty-gritty detail of my entire codebase, millions of lines, without bankrupting me on tokens.

So we built a plugin: Claude Context. It's a semantic search MCP that helps Claude Code remember your codebase, no matter how large it is.

What it can do:
🔍 Semantic Code Search – Ask “which functions handle user login?” and it finds ValidateLoginCredential() and friends (no more brittle keyword matching).
⚡ Incremental Indexing – Only re-indexes what changed, using Merkle trees.
🧠 AST-based Chunking – Chunks by code structure, not just lines.
🗄️ Scalable – Built on the Zilliz Cloud vector DB; handles large codebases.

Big shoutout to Boris Cherny and Catherine Wu for inventing the best AI coding agent to date. Thanks Anthropic for Claude Opus, which made this all possible.

I believe Claude Code + search is bringing AI coding to the next level. We open-sourced Claude Context. GitHub link in the comments.
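A toy sketch of the incremental-indexing idea (Claude Context's real Merkle-tree implementation is more elaborate): hash file contents, diff against the previous snapshot, and re-embed only what changed. File paths and the state-file name below are illustrative.

```python
# Toy sketch of Merkle-style incremental indexing: hash every file, compare
# against the last snapshot, and re-index only what actually changed.
import hashlib
import json
import pathlib

def snapshot(root: str) -> dict[str, str]:
    # Content hash per file; any edit changes the digest.
    return {
        str(p): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in pathlib.Path(root).rglob("*.py")
    }

def changed_files(root: str, state_file: str = ".index_state.json") -> list[str]:
    current = snapshot(root)
    try:
        previous = json.loads(pathlib.Path(state_file).read_text())
    except FileNotFoundError:
        previous = {}
    pathlib.Path(state_file).write_text(json.dumps(current))
    # Only new files or files whose digest differs need re-embedding.
    return [path for path, digest in current.items() if previous.get(path) != digest]

print(changed_files("src"))  # illustrative project directory
```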
-
𝐖𝐡𝐲 𝐝𝐨𝐞𝐬 𝐦𝐨𝐬𝐭 𝐀𝐈 𝐬𝐞𝐚𝐫𝐜𝐡 𝐨𝐧𝐥𝐲 𝐰𝐨𝐫𝐤 𝐰𝐞𝐥𝐥 𝐢𝐧 𝐄𝐧𝐠𝐥𝐢𝐬𝐡?

Multilingual content is everywhere: support tickets, product reviews, help docs. Even in English-first apps, users mix languages. So full-text search must handle more than just English to ensure a smooth user experience.

Unfortunately, most systems don’t. They break when different languages go through the same analysis pipeline. The main challenges:
🔷 𝐒𝐩𝐞𝐜𝐢𝐚𝐥𝐢𝐳𝐞𝐝 𝐭𝐨𝐤𝐞𝐧𝐢𝐳𝐞𝐫𝐬 𝐝𝐨𝐧’𝐭 𝐠𝐞𝐧𝐞𝐫𝐚𝐥𝐢𝐳𝐞: Languages like Chinese, Japanese, and Korean need language-specific tokenizers to segment text.
🔷 𝐅𝐢𝐥𝐭𝐞𝐫𝐢𝐧𝐠 𝐫𝐮𝐥𝐞𝐬 𝐜𝐥𝐚𝐬𝐡: Languages like English and French may share tokenization basics, but their stemming and lemmatization rules differ.

That’s why many search engines either support only English well, or expect you to build custom analyzers for every language.

𝐌𝐢𝐥𝐯𝐮𝐬 𝟐.𝟔 𝐟𝐢𝐱𝐞𝐬 𝐭𝐡𝐚𝐭 with a multilingual full-text search engine that actually works, across dozens of languages and use cases.

𝐓𝐡𝐫𝐞𝐞 𝐧𝐞𝐰 𝐰𝐚𝐲𝐬 𝐭𝐨 𝐡𝐚𝐧𝐝𝐥𝐞 𝐦𝐮𝐥𝐭𝐢𝐥𝐢𝐧𝐠𝐮𝐚𝐥 𝐭𝐞𝐱𝐭 𝐢𝐧 𝐌𝐢𝐥𝐯𝐮𝐬 𝟐.𝟔:
1. 𝐌𝐮𝐥𝐭𝐢-𝐋𝐚𝐧𝐠𝐮𝐚𝐠𝐞 𝐀𝐧𝐚𝐥𝐲𝐳𝐞𝐫 – 𝐅𝐮𝐥𝐥 𝐜𝐨𝐧𝐭𝐫𝐨𝐥: Assign analyzers by language and let Milvus apply the right one during indexing and search.
2. 𝐋𝐚𝐧𝐠𝐮𝐚𝐠𝐞 𝐈𝐝𝐞𝐧𝐭𝐢𝐟𝐢𝐞𝐫 𝐓𝐨𝐤𝐞𝐧𝐢𝐳𝐞𝐫 – 𝐅𝐮𝐥𝐥 𝐚𝐮𝐭𝐨𝐦𝐚𝐭𝐢𝐨𝐧: Milvus auto-detects the language and routes text through the correct analyzer, no tags needed.
3. 𝐈𝐂𝐔 𝐓𝐨𝐤𝐞𝐧𝐢𝐳𝐞𝐫 – 𝐀 𝐮𝐧𝐢𝐯𝐞𝐫𝐬𝐚𝐥 𝐟𝐚𝐥𝐥𝐛𝐚𝐜𝐤: A Unicode-based tokenizer that works reliably across many languages with zero setup.

With these tools, you can bring high-quality full-text search to global apps without rewriting your entire search stack for every language (a sketch of the ICU fallback follows this post).

🖊️ Want to see how it works in practice? The blog includes a full hands-on demo with PyMilvus code, so you can try multilingual BM25 search yourself, step by step.

𝐋𝐞𝐚𝐫𝐧 𝐦𝐨𝐫𝐞: https://lnkd.in/g82h9jJb
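For a taste of option 3, here is a minimal PyMilvus sketch of a BM25 full-text field backed by the ICU tokenizer. Parameter names follow Milvus 2.6 conventions as we understand them; treat the analyzer details as assumptions and see the linked blog for the authoritative, step-by-step version.

```python
# Minimal sketch of option 3: ICU tokenizer as a universal fallback for BM25
# full-text search. Analyzer parameter names are assumed; verify in the docs.
from pymilvus import DataType, Function, FunctionType, MilvusClient

client = MilvusClient(uri="http://localhost:19530")

schema = client.create_schema()
schema.add_field("id", DataType.INT64, is_primary=True, auto_id=True)
schema.add_field(
    "text",
    DataType.VARCHAR,
    max_length=8192,
    enable_analyzer=True,
    analyzer_params={"tokenizer": "icu"},  # Unicode-aware, language-agnostic
)
schema.add_field("sparse", DataType.SPARSE_FLOAT_VECTOR)

# BM25 function: Milvus derives sparse term vectors from the analyzed text.
schema.add_function(Function(
    name="bm25",
    function_type=FunctionType.BM25,
    input_field_names=["text"],
    output_field_names=["sparse"],
))

index_params = client.prepare_index_params()
index_params.add_index(field_name="sparse",
                       index_type="SPARSE_INVERTED_INDEX",
                       metric_type="BM25")
client.create_collection("multilingual_docs", schema=schema,
                         index_params=index_params)

# Full-text queries then target the "sparse" field with plain-text input, e.g.:
# client.search("multilingual_docs", data=["vector database"],
#               anns_field="sparse", limit=5)
```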
-
𝐆𝐏𝐓-𝟓 𝐣𝐮𝐬𝐭 𝐝𝐫𝐨𝐩𝐩𝐞𝐝. 𝐈𝐬 𝐭𝐡𝐢𝐬 𝐭𝐡𝐞 𝐞𝐧𝐝 𝐨𝐟 𝐑𝐀𝐆?

𝐆𝐏𝐓-𝟓-𝐦𝐚𝐢𝐧 reduces factual errors by 𝟒𝟒% vs 𝐆𝐏𝐓-𝟒𝐨. 𝐆𝐏𝐓-𝟓-𝐭𝐡𝐢𝐧𝐤𝐢𝐧𝐠 improves by 𝟕𝟖% vs 𝐎𝐩𝐞𝐧𝐀𝐈 𝐨𝟑. It also handles 𝐥𝐨𝐧𝐠𝐞𝐫 𝐜𝐨𝐧𝐭𝐞𝐱𝐭 𝐰𝐢𝐭𝐡 𝐛𝐞𝐭𝐭𝐞𝐫 𝐫𝐞𝐚𝐬𝐨𝐧𝐢𝐧𝐠.

If LLMs can 𝐫𝐞𝐦𝐞𝐦𝐛𝐞𝐫 𝐦𝐨𝐫𝐞 and 𝐡𝐚𝐥𝐥𝐮𝐜𝐢𝐧𝐚𝐭𝐞 𝐥𝐞𝐬𝐬, do we still need retrieval at all?

Here’s the reality: even with 𝐆𝐏𝐓-𝟓, real-world applications still struggle with:
⚡ 𝐒𝐩𝐞𝐞𝐝 – Long-context requests take 𝟏𝟎–𝟑𝟎𝐬
💸 𝐂𝐨𝐬𝐭 – Up to $𝟏𝟎 per 𝟏𝐌-token query
📦 𝐒𝐜𝐚𝐥𝐞 – 𝟏𝐌 tokens ≠ a full knowledge base
🧩 𝐃𝐚𝐭𝐚 𝐯𝐚𝐫𝐢𝐞𝐭𝐲 – Real-world inputs include structured and unstructured data: tables, graphs, logs, and more

This is where we believe 𝐑𝐀𝐆 (𝐚𝐧𝐝 𝐯𝐞𝐜𝐭𝐨𝐫 𝐃𝐁𝐬) still shine. 𝐑𝐀𝐆 𝐢𝐬𝐧'𝐭 𝐝𝐞𝐚𝐝 — 𝐢𝐭’𝐬 𝐞𝐯𝐨𝐥𝐯𝐢𝐧𝐠 with hybrid search, better embeddings, and multi-hop reasoning.