Fix SQL multimodal example
blythed committed Nov 29, 2023
1 parent d951858 commit 7bcd2fd
Showing 8 changed files with 448 additions and 4,352 deletions.
4,230 changes: 16 additions & 4,214 deletions examples/multimodal_image_search_clip.ipynb

Large diffs are not rendered by default.

24 changes: 22 additions & 2 deletions examples/sql-example.ipynb
@@ -62,6 +62,16 @@
"The initial step in any `superduperdb` workflow is to connect to your datastore. To connect to a different datastore, simply add a different `URI`, for example, `postgres://...`."
]
},
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "587bffd3-7202-4e12-a6de-435d4b16892d",
+"metadata": {},
+"outputs": [],
+"source": [
+"!rm .superduperdb/test.ddb"
+]
+},
{
"cell_type": "code",
"execution_count": null,
@@ -78,6 +88,16 @@
"db = superduper('duckdb://.superduperdb/test.ddb')"
]
},
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "0a5252a3-91a7-4789-85b7-0b8279c228a3",
+"metadata": {},
+"outputs": [],
+"source": [
+"db.show('vector_index')"
+]
+},
{
"cell_type": "markdown",
"id": "b8794451",
@@ -108,7 +128,7 @@
"!mkdir -p data/coco\n",
"\n",
"# Move the 'images_small' directory to 'data/coco/images'\n",
-"!mv images_small data/coco/images"
+"!mv images_tiny data/coco/images"
]
},
{
@@ -224,7 +244,7 @@
"source": [
"## Build SuperDuperDB `Model` Instances\n",
"\n",
-"This use-case utilizes the `superduperdb.ext.torch` extension. Both models used output `torch` tensors, which are encoded with `tensor`:"
+"This use-case utilizes the `superduperdb.ext.torch` extension. Both models use `torch` tensors in their output, which are encoded with `tensor`:"
]
},
{
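The new `!rm .superduperdb/test.ddb` cell resets the DuckDB file before the notebook connects. As a minimal sketch of the same reset in plain Python, assuming only the path shown in the diff: the helper name `reset_db_file` is hypothetical and not part of `superduperdb`. Unlike a bare `rm`, it stays silent when the file is already gone.

```python
from pathlib import Path


def reset_db_file(path: str) -> bool:
    """Delete the file at `path` if it exists.

    Returns True if a file was present before the call, False otherwise.
    """
    p = Path(path)
    existed = p.exists()
    # missing_ok=True (Python 3.8+) suppresses the error for an absent file
    p.unlink(missing_ok=True)
    return existed


if __name__ == "__main__":
    # Path taken from the diff; adjust to your own setup.
    removed = reset_db_file(".superduperdb/test.ddb")
    print("removed existing file" if removed else "nothing to remove")
```

Running this before `superduper('duckdb://.superduperdb/test.ddb')` would give the notebook a fresh database file on each run.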
57 changes: 36 additions & 21 deletions examples/transfer_learning.ipynb
@@ -4,7 +4,10 @@
"cell_type": "markdown",
"id": "fe6fd0ab0e1ad844",
"metadata": {
-"collapsed": false
+"collapsed": false,
+"jupyter": {
+"outputs_hidden": false
+}
},
"source": [
"# Transfer Learning with Sentence Transformers and Scikit-Learn"
@@ -14,7 +17,10 @@
"cell_type": "markdown",
"id": "8dcde44d942793ff",
"metadata": {
-"collapsed": false
+"collapsed": false,
+"jupyter": {
+"outputs_hidden": false
+}
},
"source": [
"## Introduction\n",
@@ -26,7 +32,10 @@
"cell_type": "markdown",
"id": "1809feca8a8dca5a",
"metadata": {
-"collapsed": false
+"collapsed": false,
+"jupyter": {
+"outputs_hidden": false
+}
},
"source": [
"## Prerequisites\n",
@@ -41,7 +50,7 @@
"metadata": {},
"outputs": [],
"source": [
-"!pip install superduperdb\n",
+"# !pip install superduperdb\n",
"!pip install ipython numpy datasets sentence-transformers"
]
},
@@ -57,7 +66,10 @@
"cell_type": "markdown",
"id": "5379007991707d17",
"metadata": {
-"collapsed": false
+"collapsed": false,
+"jupyter": {
+"outputs_hidden": false
+}
},
"source": [
"First, we need to establish a connection to a MongoDB datastore via SuperDuperDB. You can configure the `MongoDB_URI` based on your specific setup. \n",
@@ -84,7 +96,7 @@
"\n",
"# SuperDuperDB, now handles your MongoDB database\n",
"# It just super dupers your database\n",
-"db = superduper(mongodb_uri)\n",
+"db = superduper(mongodb_uri, artifact_store='filesystem://./data/')\n",
"\n",
"# Reference a collection called transfer\n",
"collection = Collection('transfer')"
@@ -169,10 +181,21 @@
" X='text',\n",
" db=db,\n",
" select=collection.find(),\n",
-" listen=True\n",
+" listen=True,\n",
+" show_progress_bar=True,\n",
")"
]
},
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "aed95196-4454-4d88-beb8-dc6dddfcbe33",
+"metadata": {},
+"outputs": [],
+"source": [
+"db.execute(collection.find_one())"
+]
+},
{
"cell_type": "markdown",
"id": "68fefc17",
@@ -201,10 +224,10 @@
"\n",
"# Train the model on 'text' data with corresponding labels\n",
"model.fit(\n",
-" X='text',\n",
+" X='_outputs.text.all-MiniLM-L6-v2.0',\n",
" y='label',\n",
" db=db,\n",
-" select=collection.find().featurize({'text': 'all-MiniLM-L6-v2'}),\n",
+" select=collection.find(),\n",
")\n"
]
},
@@ -227,9 +250,9 @@
"source": [
"# Make predictions on 'text' data with the trained SuperDuperDB model\n",
"model.predict(\n",
-" X='text',\n",
+" X='_outputs.text.all-MiniLM-L6-v2.0',\n",
" db=db,\n",
-" select=collection.find().featurize({'text': 'all-MiniLM-L6-v2'}),\n",
+" select=collection.find(),\n",
" listen=True,\n",
")"
]
@@ -258,16 +281,8 @@
"print(r['text'][:100])\n",
"\n",
"# Print the prediction made by the SVC model stored in '_outputs'\n",
-"print(r['_outputs']['text']['svc'])"
+"print(r['_outputs']['text']['svc']['0'])"
]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"id": "2174a0a3-c32e-4481-a301-370b569ba30c",
-"metadata": {},
-"outputs": [],
-"source": []
}
],
"metadata": {
@@ -286,7 +301,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.11.5"
+"version": "3.11.6"
}
},
"nbformat": 4,
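The `transfer_learning.ipynb` changes replace the `featurize(...)` call with a direct reference to the stored outputs: `X='_outputs.text.all-MiniLM-L6-v2.0'`, and the final cell now reads `r['_outputs']['text']['svc']['0']`. The sketch below illustrates how such a dot-separated key maps onto a nested document. It uses a plain dictionary with made-up values, and the helper `get_by_path` is hypothetical; it is not the `superduperdb` implementation. The trailing `0` is presumably a version index, but the diff does not say so explicitly.

```python
from functools import reduce
from typing import Any, Mapping


def get_by_path(doc: Mapping[str, Any], path: str) -> Any:
    """Follow a dot-separated key path through nested mappings."""
    return reduce(lambda node, key: node[key], path.split("."), doc)


# Hypothetical document shaped like the keys referenced in the diff:
# _outputs -> field name -> model identifier -> trailing '0' key.
r = {
    "text": "an example review",
    "label": 1,
    "_outputs": {
        "text": {
            "all-MiniLM-L6-v2": {"0": [0.1, 0.2, 0.3]},  # made-up embedding
            "svc": {"0": 1},  # made-up prediction
        }
    },
}

# The feature path now passed as X in model.fit / model.predict
features = get_by_path(r, "_outputs.text.all-MiniLM-L6-v2.0")

# The lookup used in the notebook's final cell after this commit
prediction = r["_outputs"]["text"]["svc"]["0"]
```

Under this reading, the SVC trains on the embedding stored by the first model, which would explain why the `select` query no longer needs `featurize`.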

0 comments on commit 7bcd2fd
