Foraging Behavior Pipeline for Bonsai

(Han Hou @ Aug 2023)

This is still a temporary workaround until the AIND behavior pipeline is implemented.

Pipeline structure

(Pipeline structure diagram)

1. (On Han's PC) Upload raw behavior data to the cloud (github)

  • From all behavior rigs, fetch raw behavior files (.json) generated by the foraging-bonsai GUI
  • Turn .json files into .nwb files, which contain both data and metadata
  • Upload all .nwb files to a single S3 bucket s3://aind-behavior-data/foraging_nwb_bonsai/

2. (In Code Ocean, this repo) Trigger computation (CO capsule: foraging_behavior_bonsai_pipeline_trigger, github)

3. (In Code Ocean) Visualization by Streamlit app (CO capsule: foraging-behavior-browser, github)

The Streamlit app fetches data from the above S3 bucket and generates the visualizations. You can run the app either on Code Ocean (recommended) or on the Streamlit public cloud.

Automatic training

See this repo

How to add more rigs

  • On the rig PC, share the data folder on the Windows network.
  • Make sure the data folder is accessible by typing its network address (e.g., \\W10DT714033\behavior_data\447-1-D) in Windows Explorer on another PC.
  • Let me know the network address, the username, and the passcode, and I will create a new entry here.

How to add more analyses

The pipeline is still a prototype at this point. As you can see in the Streamlit app, so far I have only implemented two basic analyses:

  • compute essential session-wise stats
  • generate a simple plot of choice-reward history

To add more analyses to the pipeline, just plug in your own function here. Your function should take an nwb as input and generate plots or any other results, with filenames starting with the session_id.
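For example, a minimal sketch of such a function might look like the following (this is not the pipeline's actual interface; output_dir, session_id, and the "reward" trial column are illustrative assumptions):

```python
# Sketch only, not the pipeline's actual interface: an analysis function that
# takes a loaded NWB file and writes results whose filenames start with session_id.
# `output_dir`, `session_id`, and the "reward" column are illustrative assumptions.
import os

import matplotlib.pyplot as plt


def my_choice_reward_analysis(nwb, session_id, output_dir="/results"):
    # Trial-level data live in the NWB trials table
    trials = nwb.trials.to_dataframe()

    # A simple session-wise stat
    n_trials = len(trials)

    # Any file you save should have a name starting with session_id
    fig, ax = plt.subplots()
    trials["reward"].cumsum().plot(ax=ax)  # assumed column name
    ax.set_xlabel("Trial")
    ax.set_ylabel("Cumulative reward")
    fig.savefig(os.path.join(output_dir, f"{session_id}_choice_reward_history.png"))
    plt.close(fig)

    with open(os.path.join(output_dir, f"{session_id}_stats.csv"), "w") as f:
        f.write(f"n_trials,{n_trials}\n")
```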

If you would like to access the .nwb files directly or do analysis outside Code Ocean (though not recommended), check out this bucket: s3://aind-behavior-data/foraging_nwb_bonsai/. For details, see the "Accessing foraging .nwbs for off-pipeline analysis" section below.

Pipeline-ready checklist

Checklist before the pipeline is ready to run:

  1. CO pipeline Han_pipeline_foraging_behavior_bonsai:

    • No yellow warning sign (otherwise, do a Reproducible Run of that capsule first)

    • Check the argument of foraging_behavior_bonsai_pipeline_assign_job that controls the number of capsule instances

    • Check the argument of foraging_behavior_bonsai_nwb that controls the number of multiprocessing cores of each instance.

      • This number should match the number of cores set in "Adjust Resources" for that capsule in the pipeline
    • Make sure the pipeline is set to use "Spot instances" (otherwise it takes too long to start) and "without cache" (otherwise the input S3 bucket will not be updated)

  2. Make sure these capsules are not running (their status shows four gray dots, and any VSCode sessions are on hold or terminated)

    • foraging_behavior_bonsai_pipeline_assign_job
    • foraging_behavior_bonsai_nwb
    • foraging_behavior_bonsai_pipeline_collect_and_upload_results
  3. Make sure one and only one instance of foraging_behavior_bonsai_pipeline_trigger is running.

  4. Make sure one and only one instance of foraging-behavior-bonsai-automatic-training is running.

Notes on manually re-processing all nwbs and overwriting the S3 database (and thus the Streamlit app)

Important

I should do this after work hours, as it will be disruptive to the AutoTrain system. (see this issue)

  1. Stop the triggering capsule and the AutoTraining capsule.
  2. (optional) Re-generate all nwbs
    • Back up the nwb folder on my PC and on S3
    • On S3, move the old /foraging_nwb_bonsai to a backup folder and create a new /foraging_nwb_bonsai
    • Re-generate nwbs from jsons on my PC
  3. Back up and clear the /foraging_nwb_bonsai_processed bucket
    • On S3, copy the folder to a backup folder
    • Clear the old folder
      • If you don't clear it, at least delete df_sessions.pkl, error_files.json, and pipeline.log (they are appended to, not overwritten)
      • Troubleshooting: when attaching an S3 folder to a capsule, the folder must not be empty (otherwise you get a "permission denied" error)
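For step 3, here is a minimal boto3 sketch (assuming AWS credentials with write access to the bucket; the backup prefix name is just an example):

```python
# Sketch only: back up s3://aind-behavior-data/foraging_nwb_bonsai_processed/
# and delete the files that would otherwise be appended to.
# Assumes AWS credentials with write access; the backup prefix is an example name.
import boto3

s3 = boto3.client("s3")
bucket = "aind-behavior-data"
src_prefix = "foraging_nwb_bonsai_processed/"
backup_prefix = "foraging_nwb_bonsai_processed_backup_20240301/"  # example name

# Copy every object under the processed prefix to the backup prefix
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=src_prefix):
    for obj in page.get("Contents", []):
        dest_key = backup_prefix + obj["Key"][len(src_prefix):]
        s3.copy_object(
            Bucket=bucket,
            CopySource={"Bucket": bucket, "Key": obj["Key"]},
            Key=dest_key,
        )

# If you don't clear the whole folder, at least delete the appended files
for name in ["df_sessions.pkl", "error_files.json", "pipeline.log"]:
    s3.delete_object(Bucket=bucket, Key=src_prefix + name)
```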

Case A: still use the pipeline (recommended)

  1. Make sure to assign 10 or more workers and set the CPU number = 16 (for the spot machine) and the argument = 16. In this case, you'll have 10 * 16 = 160 or more total cores!

  2. Trigger the pipeline as usual. Only the diff between nwb and nwb_processed will be processed (this works well if you have already cleaned up the processed folder).

Case B: manually run each capsule (obsolete)

  1. Manually trigger the batch computation in capsule foraging_behavior_bonsai_nwb:
    • Make sure the CPU number of the environment is 16 or more :)
    • Run processing_nwb.py manually in parallel (with LOCAL_MANUAL_OVERRIDE = True)
  2. Manually trigger the collect_and_upload capsule:
    • Manually register a data asset:
      • Use any name, but mount must be data/foraging_behavior_bonsai_pipeline_results
      • Note (2024-03-03): the data asset apparently cannot be registered from VSCode; I could only create the data asset outside VSCode.
    • In the capsule collect_and_upload_restuls, manually attach the data asset just created, and press Reproducible Run.
      • I have adapted collect_and_upload_restuls so that it can also accept data that are not organized in /1, /2, ... subfolders the way pipeline results are.
  3. To restore the pipeline, follow the "Pipeline-ready checklist" above.

Accessing foraging .nwbs for off-pipeline analysis

| .nwb datasets | Dataset 1 | Dataset 2 (old) |
| --- | --- | --- |
| Where are the data collected? | AIND | Janelia and AIND |
| Behavior hardware | Bonsai-Harp | Bpod |
| Size | 1423 sessions / 92 mice | 4327 sessions / 157 mice |
| Modality | Behavior only | 3803 sessions / 157 mice: pure behavior; 35 sessions / 8 mice: ephys + DLC outputs |
| Still growing? | Yes; updated daily (by the current repo) | No longer updated |
| NWB format | New bonsai nwb format | Compatible with the new bonsai nwb format |
| Raw NWBs | S3 bucket: s3://aind-behavior-data/foraging_nwb_bonsai/; CO data asset: foraging_nwb_bonsai (id=f908dd4d-d7ed-4d52-97cf-ccd0e167c659) | S3 bucket: s3://aind-behavior-data/foraging_nwb_bpod/; CO data asset: foraging_nwb_bpod (id=4ba57838-f4b7-4215-a9d9-11bccaaf2e1c) |
| Processed results | S3 bucket: s3://aind-behavior-data/foraging_nwb_bonsai_processed/; CO data asset: foraging_nwb_bonsai_processed (id=4ad1364f-6f67-494c-a943-1e957ab574bb) | S3 bucket: s3://aind-behavior-data/foraging_nwb_bpod_processed/; CO data asset: foraging_nwb_bpod_processed (id=7f869b24-9132-43d3-8313-4b481effeead) |
| Code Ocean example capsule | foraging_behavior_bonsai_nwb | foraging_behavior_bonsai_nwb |
| Streamlit visualization | The Streamlit behavior browser | Click "include old Bpod sessions" in the app |
| How to access the master table shown in the app? | The df_sessions.pkl file in the "Processed results" path above | Same as left, but use the Bpod "Processed results" path |
| Notes | Some sessions have fiber photometry or ephys data collected at the same time, but they have not been integrated into the .nwbs yet. | Some sessions have fiber photometry data collected at the same time, but they have not been integrated into the .nwbs yet. |
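A minimal sketch of off-pipeline access (assuming AWS credentials with read access to s3://aind-behavior-data/; the .nwb filename below is a placeholder):

```python
# Sketch only: read the master table and one raw NWB directly from S3.
# Assumes AWS credentials with read access to s3://aind-behavior-data/;
# the .nwb key below is a placeholder.
import boto3
import pandas as pd
from pynwb import NWBHDF5IO

s3 = boto3.client("s3")
bucket = "aind-behavior-data"

# The master table shown in the Streamlit app
s3.download_file(bucket, "foraging_nwb_bonsai_processed/df_sessions.pkl", "df_sessions.pkl")
df_sessions = pd.read_pickle("df_sessions.pkl")

# One raw NWB file (list the bucket for real filenames; this key is a placeholder)
nwb_key = "foraging_nwb_bonsai/<subject>_<session>.nwb"
s3.download_file(bucket, nwb_key, "session.nwb")
with NWBHDF5IO("session.nwb", mode="r") as io:
    nwb = io.read()
    print(nwb.session_start_time)
```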

What's next

We will likely be refactoring the pipeline after we figure out the AIND behavior metadata schema, but the core ideas and data analysis code developed here will remain. Stay tuned.
