Chroma Vector Store


Chroma

Chroma is an AI-native, open-source vector database focused on developer productivity and developer experience. It is licensed under Apache 2.0.

Tip: If you are new to Chroma, read the Chroma JS tutorial first.

Installation and setup

  1. Run Chroma with Docker on your computer
git clone git@github.com:chroma-core/chroma.git
cd chroma
docker-compose up -d --build
  2. Install the Chroma JS SDK
# npm
npm install -S chromadb
# Yarn
yarn add chromadb
# pnpm
pnpm add chromadb

Chroma is fully typed, fully tested, and fully documented.

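As with any other database, you can perform the following operations:

  • .add: add data
  • .get: retrieve data
  • .update: update existing data
  • .upsert: update data, or add it if it does not exist yet
  • .delete: delete data
  • .peek: preview data
  • .query: run a similarity search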
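
To make the difference between .add, .update, and .upsert concrete, here is a minimal in-memory sketch. ToyCollection is a hypothetical stand-in written for this illustration; it is not the chromadb client, it ignores embeddings, and its handling of edge cases (for example, silently skipping an update to a missing id) is an assumption of the toy rather than documented Chroma behavior.

```javascript
// Toy in-memory stand-in for a collection; NOT the chromadb client.
class ToyCollection {
  constructor() {
    this.records = new Map(); // id -> document text
  }

  // .add: insert new records (this toy assumes the ids are not yet present)
  add({ ids, documents }) {
    ids.forEach((id, i) => this.records.set(id, documents[i]));
  }

  // .update: change a record only if it already exists
  update({ ids, documents }) {
    ids.forEach((id, i) => {
      if (this.records.has(id)) this.records.set(id, documents[i]);
    });
  }

  // .upsert: update the record if it exists, otherwise add it
  upsert({ ids, documents }) {
    ids.forEach((id, i) => this.records.set(id, documents[i]));
  }

  // .get: fetch records by id, skipping unknown ids
  get({ ids }) {
    return ids
      .filter((id) => this.records.has(id))
      .map((id) => this.records.get(id));
  }

  // .delete: remove records by id
  delete({ ids }) {
    ids.forEach((id) => this.records.delete(id));
  }

  // .peek: preview the first `limit` records
  peek(limit = 10) {
    return [...this.records.values()].slice(0, limit);
  }
}

const toy = new ToyCollection();
toy.add({ ids: ["a"], documents: ["first"] });
toy.update({ ids: ["b"], documents: ["ignored"] }); // "b" does not exist, so nothing changes
toy.upsert({ ids: ["b"], documents: ["second"] }); // "b" is created
toy.upsert({ ids: ["a"], documents: ["first, revised"] }); // "a" is overwritten
toy.delete({ ids: ["b"] });
console.log(toy.peek()); // [ 'first, revised' ]
```

With the real client these calls are asynchronous and also carry embeddings and metadata; the sketch only shows how the operations differ in what they do to existing records.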

Index and query documents

import { Chroma } from "langchain/vectorstores/chroma";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { TextLoader } from "langchain/document_loaders/fs/text";

// Create documents with a loader
const loader = new TextLoader("src/document_loaders/example_data/example.txt");
const docs = await loader.load();

// Create the vector store and index the documents
const vectorStore = await Chroma.fromDocuments(docs, new OpenAIEmbeddings(), {
  collectionName: "a-test-collection",
  url: "http://localhost:8000", // Optional, this is the default value
  collectionMetadata: {
    "hnsw:space": "cosine",
  }, // Optional, sets the distance function of the embedding space: https://docs.trychroma.com/usage-guide#changing-the-distance-function
});

// Find the most similar document
const response = await vectorStore.similaritySearch("hello", 1);

console.log(response);
/*
[
  Document {
    pageContent: 'Foo\nBar\nBaz\n\n',
    metadata: { source: 'src/document_loaders/example_data/example.txt' }
  }
]
*/
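
The "hnsw:space": "cosine" setting above selects cosine distance for comparing embeddings, and similaritySearch(query, k) returns the k closest documents. The idea can be sketched with a brute-force search over tiny hand-made vectors. The helper names cosineDistance and topK and the 2-dimensional vectors are assumptions made for this illustration; real embeddings come from the embedding model, and, as the hnsw: prefix suggests, Chroma uses an index rather than a linear scan.

```javascript
// Cosine distance = 1 - cosine similarity; 0 means the vectors point the same way.
function cosineDistance(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-k: score every item, sort ascending by distance, keep k.
function topK(queryVector, items, k) {
  return items
    .map((item) => ({ ...item, distance: cosineDistance(queryVector, item.vector) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k);
}

// Hand-made 2-d vectors standing in for real embeddings.
const items = [
  { text: "Foo", vector: [1, 0] },
  { text: "Bar", vector: [0, 1] },
  { text: "Baz", vector: [3, 3] },
];

const nearest = topK([2, 0], items, 2);
console.log(nearest.map((item) => item.text)); // [ 'Foo', 'Baz' ]
```

Note that [2, 0] and [1, 0] have distance 0 even though their lengths differ: cosine distance compares direction only, which is why it is a common choice for text embeddings.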

Index and query texts

import { Chroma } from "langchain/vectorstores/chroma";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

// text sample from Godel, Escher, Bach
const vectorStore = await Chroma.fromTexts(
  [
    `Tortoise: Labyrinth? Labyrinth? Could it Are we in the notorious Little
        Harmonic Labyrinth of the dreaded Majotaur?`,
    "Achilles: Yiikes! What is that?",
    `Tortoise: They say-although I person never believed it myself-that an I
        Majotaur has created a tiny labyrinth sits in a pit in the middle of
        it, waiting innocent victims to get lost in its fears complexity.
        Then, when they wander and dazed into the center, he laughs and
        laughs at them-so hard, that he laughs them to death!`,
    "Achilles: Oh, no!",
    "Tortoise: But it's only a myth. Courage, Achilles.",
  ],
  [{ id: 2 }, { id: 1 }, { id: 3 }], // metadata for the first three texts only; the last two get none
  new OpenAIEmbeddings(),
  {
    collectionName: "godel-escher-bach",
  }
);

const response = await vectorStore.similaritySearch("scared", 2);

console.log(response);
/*
[
  Document { pageContent: 'Achilles: Oh, no!', metadata: {} },
  Document {
    pageContent: 'Achilles: Yiikes! What is that?',
    metadata: { id: 1 }
  }
]
*/

// You can also filter by metadata
const filteredResponse = await vectorStore.similaritySearch("scared", 2, {
  id: 1,
});

console.log(filteredResponse);
/*
[
  Document {
    pageContent: 'Achilles: Yiikes! What is that?',
    metadata: { id: 1 }
  }
]
*/
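
Conceptually, the metadata filter narrows the candidate set to documents whose metadata matches before the results are ranked. A minimal sketch, assuming plain exact-match semantics on every key of the filter: matchesFilter is a hypothetical helper, the sample texts are abbreviated, and Chroma's own filter syntax supports more than this.

```javascript
// Returns true when every key in `filter` has the same value in `metadata`.
function matchesFilter(metadata, filter) {
  return Object.entries(filter).every(([key, value]) => metadata[key] === value);
}

// Abbreviated stand-ins for the stored documents.
const stored = [
  { pageContent: "Tortoise: Labyrinth?", metadata: { id: 2 } },
  { pageContent: "Achilles: Yiikes! What is that?", metadata: { id: 1 } },
  { pageContent: "Achilles: Oh, no!", metadata: {} },
];

// Keep only documents whose metadata satisfies { id: 1 }.
const candidates = stored.filter((doc) => matchesFilter(doc.metadata, { id: 1 }));
console.log(candidates.map((doc) => doc.pageContent));
// [ 'Achilles: Yiikes! What is that?' ]
```

This is why the filtered query returns a single document even though two were requested: only one stored document carries id: 1.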

Query documents from an existing collection

import { Chroma } from "langchain/vectorstores/chroma";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

const vectorStore = await Chroma.fromExistingCollection(
  new OpenAIEmbeddings(),
  { collectionName: "godel-escher-bach" }
);

const response = await vectorStore.similaritySearch("scared", 2);
console.log(response);
/*
[
  Document { pageContent: 'Achilles: Oh, no!', metadata: {} },
  Document {
    pageContent: 'Achilles: Yiikes! What is that?',
    metadata: { id: 1 }
  }
]
*/

Delete documents

import { Chroma } from "langchain/vectorstores/chroma";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

const embeddings = new OpenAIEmbeddings();
const vectorStore = new Chroma(embeddings, {
  collectionName: "test-deletion",
});

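const documents = [
  {
    pageContent: `Tortoise: Labyrinth? Labyrinth? Are we in the notorious Little Harmonic Labyrinth?
    The dreaded Majotaur has created a tiny labyrinth and sits in a pit in the middle of it, waiting for innocent victims to get lost in its fearsome complexity.
    When they wander, dazed, into the center, he laughs and laughs at them, so hard that he laughs them to death!`,
    metadata: {
      speaker: "Tortoise",
    },
  },
  {
    pageContent: "Achilles: Yiikes! What is that?",
    metadata: {
      speaker: "Achilles",
    },
  },
  {
    pageContent: `Tortoise: They say, although I personally never believed it myself, that a Majotaur has created a tiny labyrinth and sits in a pit in the middle of it,
    waiting for innocent victims to get lost in its fearsome complexity.
    Then, when they wander, dazed, into the center, he laughs and laughs at them, so hard that he laughs them to death!`,
    metadata: {
      speaker: "Tortoise",
    },
  },
  {
    pageContent: "Achilles: Oh, no!",
    metadata: {
      speaker: "Achilles",
    },
  },
  {
    pageContent: "Tortoise: But it's only a myth. Courage, Achilles.",
    metadata: {
      speaker: "Tortoise",
    },
  },
];

// addDocuments also accepts an optional { ids: [] } argument, which can be used to upsert existing documents
const ids = await vectorStore.addDocuments(documents);

const response = await vectorStore.similaritySearch("scared", 2);
console.log(response);
/*
[
  Document {
    pageContent: 'Achilles: Oh, no!',
    metadata: { speaker: 'Achilles' }
  },
  Document {
    pageContent: 'Achilles: Yiikes! What is that?',
    metadata: { speaker: 'Achilles' }
  }
]
*/

// Alternatively, you can pass a "filter" parameter instead of ids
await vectorStore.delete({ ids });

const response2 = await vectorStore.similaritySearch("scared", 2);
console.log(response2);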

/*
  []
*/
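
The flow in the example above, where the ids returned by addDocuments are later handed to delete, with deletion by filter as the alternative, can be sketched with a plain array. ToyStore is a hypothetical class written for this illustration, not LangChain's Chroma class; its sequential id scheme and exact-match filter are assumptions of the sketch.

```javascript
// Toy store: NOT LangChain's Chroma class, only an illustration of the flow.
class ToyStore {
  constructor() {
    this.docs = [];
    this.nextId = 1;
  }

  // Stores each document under a generated id and returns the ids.
  addDocuments(documents) {
    return documents.map((doc) => {
      const id = String(this.nextId++);
      this.docs.push({ id, ...doc });
      return id;
    });
  }

  // Deletes by explicit ids, or by an exact-match metadata filter.
  delete({ ids, filter }) {
    this.docs = this.docs.filter((doc) => {
      if (ids) return !ids.includes(doc.id);
      if (filter) {
        return !Object.entries(filter).every(([k, v]) => doc.metadata[k] === v);
      }
      return true; // neither ids nor filter given: keep everything
    });
  }
}

const store = new ToyStore();
const newIds = store.addDocuments([
  { pageContent: "Achilles: Oh, no!", metadata: { speaker: "Achilles" } },
  { pageContent: "Tortoise: But it's only a myth.", metadata: { speaker: "Tortoise" } },
]);

store.delete({ filter: { speaker: "Tortoise" } });
console.log(store.docs.length); // 1

store.delete({ ids: newIds });
console.log(store.docs.length); // 0
```

Keeping the ids returned at insert time is what makes the later targeted delete possible; the filter form is convenient when the ids were not kept.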

