This is the pipeline for assessing patient function based on data from the Flo System. The core idea is a three-step pipeline:
- Extract pose
- Calculate features of motion
- Classify/Regress measure of function
Along the way there is data uploading, cleaning, manipulation, and visualization.
Data is ingested as compressed bag files. That data is put into HDF5 files. All of the generated non-video data (poses, etc.) goes into a separate HDF5 file to make it easier to manage.
Everything can be run locally, but is really meant to run on Oracle Cloud Infrastructure (OCI).
- Try to run everything in docker
- All dockerfiles go in the dockerfiles directory. Reuse them where possible.
- Assume that any scripts will be run from wherever. Wrap everything up so that it can tolerate that.
- Don't require users to call docker directly. Wrap it up.
- Any code specific to OCI should be in the oci_utilities directory. Everything else should be able to run on any general system with the right compute capabilities.
- Lint your python code:
git ls-files | grep 'py$' | xargs pylint --rcfile=pylintrc --unsafe-load-any-extension=y --disable=fixme
You will likely get some import errors for ROS packages that run in docker. That is OK. Everything else should be totally clean.
- Lint your shell code:
git ls-files | grep 'sh$' | xargs shellcheck -x --source-path=SCRIPTDIR
- Sometimes you will need to break the rules of pylint, generally because you have too many local variables or something similar. Recognize that these rules exist for a good reason. But if you feel that there is no cleaner way to refactor your code that will make it more readable, then you can put in pylint disable statements for single errors only (no ignoring whole files).
- ViTables is really great for exploring HDF5 files
- Each step of the pipeline more or less gets its own directory.
- common: directory for python code used across multiple other steps
- dockerfiles: for dockerfiles to run the various tools
- oci_utilities: scripts to help run the pipeline on OCI
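The guideline above about scripts tolerating being run from any directory can be sketched in Python as follows. This is a minimal illustration, not code from the repository: the helper name `repo_path` is hypothetical, and the idea is simply to resolve paths relative to the script's own location instead of the current working directory.

```python
from pathlib import Path


def repo_path(script_file: str, *parts: str) -> Path:
    """Resolve a path relative to the directory that holds script_file.

    Anchoring on the script's own location (not the current working
    directory) lets the script be launched from anywhere.
    """
    script_dir = Path(script_file).resolve().parent
    return script_dir.joinpath(*parts)


# Hypothetical usage inside a script:
#     rc_file = repo_path(__file__, "pylintrc")
```

Inside a script, passing `__file__` as the first argument gives paths that do not change with the caller's working directory.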
Start by installing the OCI CLI.
There are a few different ways to authenticate yourself to work with resources on OCI. The only secure one is to use tokens. To do this, run the authentication script and export the following variables into your shell (easiest to do this from your bashrc/zshrc):
export OCI_CLI_PROFILE='token-oci-profile'
export OCI_CLI_CONFIG_FILE="$HOME/.oci/config"
export OCI_CLI_AUTH='security_token'
Any commands from the oci cli should now work.
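If OCI commands fail after this setup, a quick check that the three variables are actually exported can help. The sketch below is an assumption-laden convenience, not part of the pipeline: the function name `oci_env_problems` is hypothetical, and the expected values are copied from the export lines above.

```python
import os

# Expected values, taken from the export lines above.
EXPECTED_OCI_ENV = {
    "OCI_CLI_PROFILE": "token-oci-profile",
    "OCI_CLI_CONFIG_FILE": os.path.join(os.path.expanduser("~"), ".oci", "config"),
    "OCI_CLI_AUTH": "security_token",
}


def oci_env_problems(env=None):
    """Return a list of readable problems; an empty list means the env looks right."""
    env = os.environ if env is None else env
    problems = []
    for name, expected in EXPECTED_OCI_ENV.items():
        actual = env.get(name)
        if actual is None:
            problems.append(f"{name} is not set")
        elif actual != expected:
            problems.append(f"{name} is {actual!r}, expected {expected!r}")
    return problems
```

Calling `oci_env_problems()` with no argument inspects the current process environment; passing a dict makes it easy to check without touching the real environment.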
- create a folder tree:
  - the subject number (three digits NNN, ex: 009 or 024)
    - gopro
    - 3rd-person
    - ros
      - robot
      - podium
- Reindex any incomplete bag files:
./upload_data/repair_bags.sh -t <directory with bag files>
You may want to use the -v option to see progress.
- Pack up the parameter files:
./upload_data/tar-meta.sh -t <directory with parameter files>
- put the compressed bag files and tar parameter file into either the robot or podium directories based on where the files came from.
- Compress the bag files:
find <dir with bag files> -name '*.bag' -exec lbzip2 -v {} \;
- put all of the gopro videos into the gopro folder
- put all of the 3rd person videos into the 3rd-person folder
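The folder tree from the first step can also be created programmatically. The following is a minimal sketch with hypothetical helper names (`subject_id`, `make_subject_tree`); it assumes that the robot and podium folders sit inside the ros folder, which you should adjust if your layout differs.

```python
from pathlib import Path

# Leaf folders from the steps above.
# Assumption: robot and podium are nested under ros.
SUBJECT_SUBDIRS = ("gopro", "3rd-person", "ros/robot", "ros/podium")


def subject_id(number: int) -> str:
    """Zero-pad a subject number to three digits, e.g. 9 -> '009'."""
    if not 0 <= number <= 999:
        raise ValueError("subject number must fit in three digits")
    return f"{number:03d}"


def make_subject_tree(root, number: int) -> Path:
    """Create <root>/<NNN>/... and return the subject directory."""
    subject_dir = Path(root) / subject_id(number)
    for sub in SUBJECT_SUBDIRS:
        # exist_ok makes the call safe to repeat on a partly built tree.
        (subject_dir / sub).mkdir(parents=True, exist_ok=True)
    return subject_dir
```

For example, `make_subject_tree("/data/studies", 9)` would create the tree under a hypothetical `/data/studies/009`.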
There are two ways to upload to Penn+Box: the web console or FTP. The web console is easier to use, but it is fairly slow and cannot recover if the upload stops. FTP requires you to know what you are doing a bit better, but it can be faster and can handle upload interruptions.
- Drag the subject's folder into the Penn+Box folder
OR
- Set up a Box password (this is separate from your Penn password and only works on Box)
- install lftp
- connect:
lftp -u <pennid>@upenn.edu ftp.box.com
then enter your Box (not Penn) password
- Now you can navigate around. For example, use cd to go to where you want to put files on Box and lcd to go to where the files are on your "local" machine
- mirror the "local" machine to Box using
mirror --reverse --parallel=<num parallel uploads>
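The interactive FTP steps above could also be run as a single non-interactive lftp call. The sketch below only builds the argument list and does not run anything. The function name and its parameters are hypothetical, and it assumes that lftp's `-e` option (run the given commands after connecting) and the optional `<source> <target>` arguments to `mirror` behave as described in the lftp manual.

```python
def lftp_mirror_upload_argv(penn_id: str, local_dir: str, remote_dir: str,
                            parallel: int = 4):
    """Build the argv for a reverse mirror (local -> Box) with lftp.

    No password is passed to -u, so lftp should prompt for the Box password.
    """
    if parallel < 1:
        raise ValueError("parallel must be at least 1")
    for path in (local_dir, remote_dir):
        # Keep the sketch simple: refuse paths that would need quoting.
        if not path or any(ch.isspace() or ch in "'\";" for ch in path):
            raise ValueError(f"unsupported path for this sketch: {path!r}")
    commands = (
        f"mirror --reverse --parallel={parallel} {local_dir} {remote_dir}; quit"
    )
    return ["lftp", "-u", f"{penn_id}@upenn.edu", "-e", commands, "ftp.box.com"]
```

The resulting list can be handed to `subprocess.run(argv, check=True)` from a terminal session so that the password prompt still works.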
Uploading to OCI requires authentication, which is granted through a login token.
You could do this manually using the lab's OCI tools, or you can use the upload script directly:
./upload_data/upload_to_oci.sh -t <src dir to upload> -p <prefix>
The prefix should be the subject number, fully padded, ex: 009 or 020.
We need to be able to generate video files for reviewing studies, sharing work, and coding video.
To do this, refer to prep_code_vids/README.md
To make processing easier, we move everything into HDF5 files. This allows easier indexing and out-of-order processing.
For instructions on running, see: convert_to_hdf5/README.md
Some of the trials don't have all of the data they need for the depth/RGB extrinsics. Luckily, these don't change much over time, so we can grab the data from other trials. To do that:
- Create a new instance:
- Literally any computer (you can even do this locally, but these instructions assume you are in the cloud)
- Push the code files to OCI by running
./oci_utilities/push_code.sh
- Remote into that instance. Ex:
oci-cli-helpers/utilities/oci-ssh.sh $(oci-cli-helpers/utilities/ocid.sh instance rrl-flo-blocksetup_0)
- Set up permissions:
OCI_CLI_AUTH=instance_principal && export OCI_CLI_AUTH
- Install the oci cli:
sudo dnf -y install oraclelinux-developer-release-el8 && sudo dnf -y install python36-oci-cli
- Pull down code onto the remote instance:
oci os object bulk-download -bn 'rrl-flo-run' --download-dir "$HOME/LilFloAssessmentPipeline" --overwrite
- Install python/pip:
sudo dnf -y install python3
- Install h5py:
pip3 install --user h5py
- Run:
./oci_utilities/get_transforms/run.sh
A central component of the pipeline is extracting pose from video. There are a few different tools that can be used to do that, all of them imperfect.
OpenPose provides 2D pose of subjects. You can view more information at openpose/README.md.
View inspect/README.md
During development, testing, etc., it might be good to be able to download data from Penn+Box. The best way to do this is with lftp:
sudo apt install lftp
lftp <penn username>@upenn.edu@ftp.box.com
- use lcd, !ls, cd, and ls to navigate the remote and local directories
- Get files:
  - Use mirror to get an entire directory of files:
    mirror --parallel=<num_parallel_files> <remote source> <local destination>
  - Use pget to get a file via multiple parallel streams:
    pget <filename>
  - Use mget to get files. Something like
    mget flo_recording_2020-12-16-15-3* -P 10
    might be useful; the 10 means download up to 10 files simultaneously.
  - Note: you can put an & at the end of any command to be able to start the next command. You can recover a command with wait <command number (shown when you put it in background)>, and you can send it back to the background with ctrl-z. You can view all jobs with jobs.