API Documentation¶
We use the Broker to pose queries for saved data sets (“runs”). A search query
returns a lazy-loaded iterable of Header objects. Each Header encapsulates the
metadata for one run. It provides convenient methods for exploring that
metadata and loading the full data.
The Broker object¶
This supports the original Broker API but implemented on intake.Catalog. |
Making a Broker¶
You can instantiate a Broker by passing it a dictionary of configuration or by
providing the name of a configuration file on disk.
Create a new Broker instance using a configuration file with this name. |
Click the links in the table above for details and examples.
Searching¶
Call self as a function. |
|
For some Broker instance named db, db() invokes Broker.__call__() and db[]
invokes Broker.__getitem__().
Again, click the links in the table above for details and examples.
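The dispatch described above is ordinary Python behavior and can be sketched with a toy class. ToyBroker and its in-memory list of runs are hypothetical stand-ins for illustration only; they say nothing about how databroker actually stores or searches runs.

```python
class ToyBroker:
    """Hypothetical stand-in for a Broker, backed by a plain list of dicts."""

    def __init__(self, runs):
        self._runs = list(runs)  # oldest first

    def __call__(self, **query):
        # db(key=value, ...) lands here: lazily yield runs matching every key.
        return (
            run for run in self._runs
            if all(run.get(k) == v for k, v in query.items())
        )

    def __getitem__(self, key):
        # db[-1] or db[-2:] lands here: plain positional lookup.
        return self._runs[key]


db = ToyBroker([
    {"scan_id": 1, "plan_name": "count"},
    {"scan_id": 2, "plan_name": "scan"},
    {"scan_id": 3, "plan_name": "count"},
])

latest = db[-1]                       # calls ToyBroker.__getitem__
counts = list(db(plan_name="count"))  # calls ToyBroker.__call__
```

The call form returns a generator here to mirror the "lazy-loaded iterable" wording; nothing is matched until the result is iterated.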
Loading Data¶
These methods are an older way to access data, like this:
header = db[-1]
db.get_table(header)
The newer Header methods, documented later on this page, are more convenient.
header = db[-1]
header.table()
(Notice that we only had to type db once.) However, the older methods are
still useful for loading data from multiple headers at once, which the new
methods cannot do:
headers = db[-10:] # the ten most recent runs
db.get_table(headers)
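If one table per run is wanted instead, the newer Header method can be applied in a loop. This is a minimal sketch, assuming each header exposes its start document as header.start with a 'uid' key. The names tables_by_run and FakeHeader are hypothetical, and FakeHeader exists only so the sketch runs without a configured Broker.

```python
def tables_by_run(headers):
    """Map each run's uid to the table returned by that Header's own method."""
    return {header.start["uid"]: header.table() for header in headers}


class FakeHeader:
    """Hypothetical stand-in for a Header; not part of databroker."""

    def __init__(self, uid, rows):
        self.start = {"uid": uid}  # assumed shape of the start document
        self._rows = rows

    def table(self):
        return self._rows


headers = [FakeHeader("a1", [1, 2]), FakeHeader("b2", [3])]
tables = tables_by_run(headers)
```

With a real Broker, the list of headers would come from something like db[-10:] as shown above.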
Get all documents from one or more runs. |
|
Get Event documents from one or more runs. |
|
Load the data from one or more runs as a table ( |
|
This method is deprecated. |
|
Get all Documents from given run(s). |
|
Pass all the documents from one or more runs into a callback. |
The broker also has a number of methods to introspect headers:
Return the set of all field names (a.k.a “data keys”) in a header. |
|
Saving Data¶
Configuring Filters and Aliases¶
Add query to the list of ‘filter’ queries. |
|
Clear all ‘filter’ queries. |
|
Create an alias for a query. |
|
Create an alias for a “dynamic” query, a function that returns a query. |
The current lists of filters and aliases are accessible via the attributes
Broker.filters and Broker.aliases, respectively. Again, click the links in the
table for examples.
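The difference between a plain alias and a "dynamic" one can be sketched with a toy registry. ToyAliases and its resolve method are hypothetical and written only for illustration; the sketch shows the general idea of storing a query versus storing a function that builds a query, not databroker's implementation.

```python
import time


class ToyAliases:
    """Hypothetical registry illustrating static versus dynamic query aliases."""

    def __init__(self):
        self.aliases = {}

    def alias(self, key, **query):
        # A static alias stores the query itself, fixed at definition time.
        self.aliases[key] = query

    def dynamic_alias(self, key, func):
        # A dynamic alias stores a function; the query is rebuilt on each lookup.
        self.aliases[key] = func

    def resolve(self, key):
        entry = self.aliases[key]
        return entry() if callable(entry) else entry


registry = ToyAliases()
registry.alias("count_runs", plan_name="count")
registry.dynamic_alias(
    "last_day", lambda: {"since": time.time() - 24 * 60 * 60}
)

static_query = registry.resolve("count_runs")
first = registry.resolve("last_day")
second = registry.resolve("last_day")
```

A dynamic alias suits queries such as "runs from the last 24 hours", where the query parameters depend on when the alias is used rather than when it was defined.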
Export Data to Another Broker¶
Serialize a list of runs. |
|
Get the size of files associated with a list of headers. |
Advanced: Controlling the Return Type¶
The attribute Broker.prepare_hook is a function with the signature
f(name, doc) that is applied to every document on its way out.
By default Broker.prepare_hook is set to wrap_in_deprecated_doct(). The
resultant objects issue warnings if users attempt to access items with dot
access like event.data instead of dict-style lookup like event['data']. To
restore the previous behavior (i.e. suppress the warnings on dot access), set
Broker.prepare_hook to wrap_in_doct().
To obtain plain dictionaries, set Broker.prepare_hook to
lambda name, doc: copy.deepcopy(doc). (Copying is recommended because the
underlying objects are cached and mutable.)
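The reason for the deep copy can be shown without a Broker. In this minimal sketch, plain_dict_hook is a hypothetical name for a function with the f(name, doc) signature, and cached_event is a made-up document standing in for an object held in the Broker's cache.

```python
import copy


def plain_dict_hook(name, doc):
    """Signature f(name, doc): return an independent plain-dict copy of doc."""
    return copy.deepcopy(doc)


# Hypothetical cached Event document (illustrative contents only).
cached_event = {"seq_num": 1, "data": {"det": 5.0}}

event = plain_dict_hook("event", cached_event)
event["data"]["det"] = -1.0  # the caller mutates its own copy only

# By contrast, a shallow copy still shares the nested 'data' dict with the cache.
shallow = dict(cached_event)
shares_nested = shallow["data"] is cached_event["data"]
```

Because documents are nested, a shallow copy would let a caller's edits to event['data'] leak back into the cached object; the deep copy prevents that.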
In a future release of databroker, the default return type may be changed to plain dictionaries for simplicity and improved performance.
Put document contents into a DeprecatedDoct object. |
|
Put document contents into a doct.Document object. |
The Header object¶
This supports the original Header API but implemented on intake’s Entry. |
Metadata¶
The Header bundles together the metadata of a run, accessible via the
attributes corresponding to the underlying documents:
Measurements are organized into “streams” of asynchronously collected data. The
names of all the streams are listed in the attribute Header.stream_names.
Note
It helps to understand how data and metadata are organized in our document model. This is covered well in this section of the bluesky documentation.
The information in these documents is a lot to navigate. Convenience methods make it easier to extract some key information:
Return the names of the fields (‘data keys’) in this run. |
|
Return the names of the devices in this run. |
|
Extract device configuration data from Event Descriptors. |
|
Data¶
The Header provides various methods for loading the ‘Event’ documents, which
may be large. They all access the same data, presented in various ways for
convenience.
Load the data from one event stream as a table ( |
|
Extract data for one field. |
|
Load all documents from the run. |
|
Load all Event documents from one event stream. |
|
All of the above accept an argument called stream_name, which distinguishes
concurrently collected streams of data. (Typical names include ‘primary’ and
‘baseline’.) All of a header’s stream names are available as a list via the
attribute Header.stream_names.
To request data from all event streams at once, use the special constant
databroker.ALL.
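When each stream is wanted separately, the stream_name argument can be combined with Header.stream_names. This is a minimal sketch: tables_by_stream and FakeHeader are hypothetical names, and FakeHeader only imitates the stream_names attribute and the table method so that the sketch runs without a configured Broker.

```python
def tables_by_stream(header):
    """Load one table per event stream, keyed by stream name."""
    return {
        name: header.table(stream_name=name)
        for name in header.stream_names
    }


class FakeHeader:
    """Hypothetical stand-in for a Header; not part of databroker."""

    def __init__(self, streams):
        self._streams = streams
        self.stream_names = list(streams)

    def table(self, stream_name="primary"):
        return self._streams[stream_name]


header = FakeHeader({"primary": [10, 11, 12], "baseline": [0, 1]})
per_stream = tables_by_stream(header)
```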
Configuration Utilities¶
List the names of the available configuration files. |
|
Search for a databroker configuration file with a given name. |
|
Return the v0 config dict this was created from, or None if N/A. |
|
Access MongoDB storage statistics for this database. |
Back- and Forward-Compat Accessors¶
A self-reference. |
|
Accessor to the version 2 API. |
Internals¶
Registry of externally-stored data |
|
Deprecated¶
Get all Documents from given run(s). |
|