feat: add Vectara destination connector #2357

Merged
merged 39 commits into from
Feb 1, 2024

Conversation

potter-potter
Contributor

Thanks to Ofer at Vectara, we now have a Vectara destination connector.

  • There are no dependencies, since it is all REST calls to the API

@potter-potter potter-potter linked an issue Jan 22, 2024 that may be closed by this pull request
ryannikolaidis
ryannikolaidis previously approved these changes Jan 30, 2024
Contributor

@ryannikolaidis ryannikolaidis left a comment

checking, was there any resolution on validation of the stored and embedded results?

@ryannikolaidis ryannikolaidis self-requested a review January 30, 2024 18:01
@ryannikolaidis ryannikolaidis dismissed their stale review January 30, 2024 18:02

actually should wait for a response on "can we test the embedding results"

@potter-potter
Contributor Author

potter-potter commented Jan 30, 2024

checking, was there any resolution on validation of the stored and embedded results?

@ryannikolaidis I can make a POST request with a text query (but not an embedding) and have it return the top result. We do get some metadata we can check. But once the partitioned file hits Vectara, it is a black box, so there is no guarantee what they are doing as far as embeddings and search.

Example of response:
{
  "responseSet": [
    {
      "response": [
        {
          "text": "I confess all these festivities and fireworks are becoming wearisome. \"If they had known that you wished it, the entertainment would have been put off,\" said the prince, who, like a wound-up clock, by force of habit said things he did not even wish to be believed. \"Don't tease! Well, and what has been decided about Novosiltsev's dispatch? <b>You know everything.</b> \"What can one say about it?\" replied the prince in a cold, listless tone. \"What has been decided?",
          "score": 0.75966287,
          "metadata": [
            { "name": "url", "value": "example-docs/book-war-and-peace-1p.txt" },
            { "name": "filename", "value": "book-war-and-peace-1p.txt" },
            { "name": "filetype", "value": "text/plain" },
            { "name": "last_modified", "value": "2024-01-18T09:39:46" },
            { "name": "lang", "value": "eng" },
            { "name": "offset", "value": "76" },
            { "name": "len", "value": "20" }
          ],
          "documentIndex": 0,
          "corpusKey": {
            "customerId": 0,
            "corpusId": 61,
            "semantics": "DEFAULT",
            "dim": [],
            "metadataFilter": "",
            "lexicalInterpolationConfig": null
          },
          "resultOffset": 344,
          "resultLength": 20
        }
      ],
      "status": [],
      "document": [
        { "id": "726c32ce-acf0-471e-88b2-a4a8b040d324", "metadata": [] }
      ],
      "summary": [],
      "futureId": 1
    }
  ],
  "status": [],
  "metrics": null
}

I looked for other endpoints that might give us embedding information, but I didn't find anything useful.
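
As a rough illustration of the metadata check described above, here is a minimal Python sketch that takes a query response shaped like the example and pulls out the metadata of the top-scoring hit. The function name `top_hit_metadata` and the trimmed `sample` payload are hypothetical and only mirror the fields visible in the example response; this is not the connector's actual test code, and it makes no network calls.

```python
from typing import Any


def top_hit_metadata(query_response: dict[str, Any]) -> dict[str, str]:
    """Return the metadata of the highest-scoring hit as a plain dict.

    Assumes the response layout shown in the example:
    responseSet -> response -> {text, score, metadata: [{name, value}, ...]}.
    """
    hits = [
        hit
        for response_set in query_response.get("responseSet", [])
        for hit in response_set.get("response", [])
    ]
    if not hits:
        raise ValueError("query returned no results")
    best = max(hits, key=lambda hit: hit.get("score", 0.0))
    # Flatten the list of {name, value} pairs into a lookup table.
    return {item["name"]: item["value"] for item in best.get("metadata", [])}


# Trimmed copy of the example response (hypothetical test fixture).
sample = {
    "responseSet": [
        {
            "response": [
                {
                    "text": "You know everything.",
                    "score": 0.75966287,
                    "metadata": [
                        {"name": "filename", "value": "book-war-and-peace-1p.txt"},
                        {"name": "filetype", "value": "text/plain"},
                        {"name": "lang", "value": "eng"},
                    ],
                }
            ]
        }
    ]
}

meta = top_hit_metadata(sample)
assert meta["filename"] == "book-war-and-peace-1p.txt"
assert meta["filetype"] == "text/plain"
```

A check like this only confirms that the uploaded document and its metadata are retrievable; it says nothing about the embeddings themselves, which stay opaque as noted above.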

@ryannikolaidis
Contributor

Ahh, I see. I think we could, but we would need to build and hit the appropriate app in Vectara that also leverages the data we've written.

Contributor

@ryannikolaidis ryannikolaidis left a comment

Okay, I think checking the count is good enough for now. It would be nice as a future enhancement to actually test end to end with a test application we build in Vectara, but this should at least give us confidence that we're writing to it.
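
The conversation does not show how the count is obtained, so the following is only a sketch of what a count check could look like. It assumes a query response with the `responseSet` / `response` layout from the example earlier in the thread; the helper names `count_hits` and `check_written` are hypothetical.

```python
from typing import Any


def count_hits(query_response: dict[str, Any]) -> int:
    """Count the hits across every response set in a query response."""
    return sum(
        len(response_set.get("response", []))
        for response_set in query_response.get("responseSet", [])
    )


def check_written(query_response: dict[str, Any], expected_min: int = 1) -> None:
    """Raise if fewer hits came back than we expect after writing."""
    found = count_hits(query_response)
    if found < expected_min:
        raise AssertionError(
            f"expected at least {expected_min} hit(s), found {found}"
        )


# Hypothetical fixtures: one response with a single hit, one with none.
one_hit = {"responseSet": [{"response": [{"text": "...", "score": 0.76}]}]}
no_hits = {"responseSet": [{"response": []}]}

check_written(one_hit)  # passes silently
```

This confirms only that something was written and is retrievable, which matches the "good enough for now" scope agreed on here; it is not a substitute for the end-to-end test suggested as a future enhancement.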

@potter-potter potter-potter added this pull request to the merge queue Feb 1, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 1, 2024
@potter-potter potter-potter added this pull request to the merge queue Feb 1, 2024
Merged via the queue into main with commit c100ce2 Feb 1, 2024
43 checks passed
@potter-potter potter-potter deleted the potter/vectara branch February 1, 2024 15:27
Development

Successfully merging this pull request may close these issues.

feat: fine tune Vectara destination connector for merging
4 participants