Quickstart¶
intake-mongo
provides quick and easy access to tabular data stored in
MongoDB
This plugin reads MongoDB collections without random access: there is only ever a single partition.
Installation¶
To use this plugin for intake, install with the following command:
conda install -c intake intake-mongo
Usage¶
Ad-hoc¶
After installation, the function intake.open_mongo
will become available. It can be used to fetch a collection from the MongoDB
server, and download the results as a list of dictionaries.
The parameters are of interest when defining a data source:
- uri: a string like
'mongodb://localhost:27017'
to reach the server on. Additional connection information can be supplied as part of the URI or with the separateconnection_kwargs
parameter (a key-value dictionary). - db: the name of the mongo database to access
- collection: a string like
'test_collection'
identifying a dataset on the server, within the given database find_kwargs
: a broad range of possible parameters to pass to the find method, including filtering, sorting, choosing of fields
Creating Catalog Entries¶
To include in a catalog, the plugin must be listed in the plugins of the catalog:
plugins:
source:
- module: intake_mongo
and entries must specify driver: mongo
.
Using a Catalog¶
A full entry might look like:
sources:
sample1:
driver: mongo
args:
uri: "mongodb://localhost:27017"
db: "test-database"
collection: mycollection
connect_kwargs: {"ssl": true}
find_kwargs: {'projection': ['field1', 'field2]}
_id: false
In this case, we specify a connection to the local machine, connect with SSL activated,
and select only “field1” and “field2” of mycollection
to retrieve, with the default
ID column not included in the output.
Note that the set of options available will depend on the version of pymongo installed.