Export tabular data as a CSV file¶

Problem¶

Export data to a CSV file.

Approach¶

Load the data from the databroker into a table (a DataFrame) and use the pandas package to effeciently export to CSV.

Then, to export a batch files from multiple runs, we generate filenames that incorporated metadata from the scan in automated fashion.

Example Solution¶

Suppose we ran a scan like this to generate some data. Notice that we stash the unique ID(s) in a variable, uids. This is the most convenient and reliable way to retrieve the data afterward.

uids = RE(scan[det], motor, -10, 10, 15)

Exporting one file¶

# This may already be imported in your configuration.
from databroker import db

headers = db[uids]
db.get_table(headers).to_csv('filename.csv')

Exporting a subset of the columns¶

Suppose we are only interested in the ‘det’ column and the ‘time’ column.

columns = ['time', 'det']
db.get_table(headers)[columns].to_csv('filename.csv')

Exporting multiple files with automated naming¶

We’ll generate three runs, destined for three separate CSV files. Notice that we include an extra piece of custom metadata. We’ll call it ‘sample’, but it could be any metadata at all.

from bluesky.plans import pchain  # a utility for chaining plans together

# This plan combines three 'scan' plans.
# Each scan includes some optional extra metadata that we'll use later.
master_plan = pchain(scan([det], motor, -10, 10, 15, md={'sample': 'A'}),
                     scan([det], motor, -5, 5, 15, md={'sample': 'B'}),
                     scan([det], motor, -3, 3, 10, md={'sample': 'C'}))
uids = RE(master_plan)

In the example below, we incorporate the value of ‘sample’ in the filename of CSV files.

headers = db[uids]
for h in headers:
    s = h['start']  # the portion of the header with most useful metadata
    table = db.get_table(h)
    # In the filename below, {sample} and {uid} are filled in with
    # values from s.
    filename = '{sample}_{uid}.csv'.format(**s)
    print('saving table as', filename)
    table.to_csv(filename)

The filenames generated by this code will begin with ‘A’, ‘B’, ‘C’, corresponding to the sample. The unique ID is also included. Using the unique IDs in the filenames ensures that we can always go back to find the original data. It’s not pretty, but it’s reliable.