Transcription data analysis code from the arXiv:1701.06079 manuscript.

To launch the code, open main.py. The code is commented line by line to explain what each step does. No installation is required.

Functions defined in other files normally have a description at the top of their file. Most constants reused throughout the analysis can be found and modified in constants.py.

When running the main.py file from the terminal, comment out the first try... except... block, which provides code reloading for interactive environments like Jupyter.

Input data file

To use the code, supply the input data as a table with pre-defined columns in a .csv file. The file must be comma-separated and UTF-8 encoded. An example input file, example_data.csv, can be found in the example folder. The following columns are required to execute the code.

Column	Description
time	Time in seconds
frame	An integer corresponding to the frame number of the recorded video. Used to detect the time step
intensity	Fluorescence intensity of the transcription region in the nucleus, in a.u.
ap	Location of the region on the anterior-posterior axis of the embryo, where ap=0 corresponds to the embryo's head
dataset	Dataset name. Can be any string not containing slashes or backslashes
dataset_id	A unique sequential identifier with a one-to-one correspondence to the data set name. The counting starts from 0
construct	The name of the gene construct in the data set. Each data set must contain data from a single construct. In the data sets of the paper cited above, the construct takes one of 3 values: `bac`, `no_sh` or `no_pr`
construct_id	A unique sequential integer identifier for the constructs, starting at 0
gene	Gene name. Each data set must contain data from a single gene. In the data sets of the paper cited above, the gene takes one of 3 values: `hb` (hunchback), `sn` (snail) or `kn` (knirps)
gene_id	Same as `construct_id`, but for genes
trace_id	A sequential integer identifier for fluorescent traces (nuclei), starting at 0. Must be unique within each data set. Each data set may contain multiple traces
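Before running the analysis, it can help to check that a file's header contains every required column. The following is a minimal sketch using only the Python standard library; the helper name `missing_columns` and the sample rows are hypothetical and are not part of the repository or of example_data.csv.

```python
import csv
import io

# Column names taken from the table above.
REQUIRED_COLUMNS = (
    "time", "frame", "intensity", "ap", "dataset", "dataset_id",
    "construct", "construct_id", "gene", "gene_id", "trace_id",
)


def missing_columns(csv_text):
    """Return the required columns absent from the CSV header, in table order."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty input yields an empty header
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Hypothetical two-row input, made up for illustration.
sample = (
    "time,frame,intensity,ap,dataset,dataset_id,construct,construct_id,"
    "gene,gene_id,trace_id\n"
    "0.0,0,12.5,0.31,demo_set,0,bac,0,hb,0,0\n"
    "37.0,1,15.1,0.31,demo_set,0,bac,0,hb,0,0\n"
)
print(missing_columns(sample))  # an empty list means the header is complete
```

To check a file on disk, read it with `open(path, encoding="utf-8", newline="")` and pass the text to the helper.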

Requirements

Python 3

Algorithm details

Nuclear cycles detection

The nuclear cycles are detected in each data set by first selecting a background threshold: each contiguous stretch of the trace above the threshold is treated as a separate nuclear cycle. The default intensity threshold is defined by default_intensity_threshold in constants.py. If a data set needs its own threshold, add it to the intensity_thresholds dictionary in constants.py.
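The thresholding idea can be illustrated with a short sketch. It is not the code from calculate.py: the helper names, the threshold values, and the assumption that intensity_thresholds is keyed by `dataset_id` are all hypothetical.

```python
# Hypothetical values; the real ones are defined in constants.py.
default_intensity_threshold = 50
intensity_thresholds = {3: 80}  # assumed mapping: dataset_id -> threshold


def threshold_for(dataset_id):
    """Use the per-data-set threshold if one is listed, else the default."""
    return intensity_thresholds.get(dataset_id, default_intensity_threshold)


def split_by_threshold(intensities, threshold):
    """Return (start, end) index pairs of contiguous runs above the threshold.

    The end index is exclusive, so intensities[start:end] is one segment.
    """
    segments = []
    start = None
    for i, value in enumerate(intensities):
        if value > threshold and start is None:
            start = i  # a new segment begins
        elif value <= threshold and start is not None:
            segments.append((start, i))  # the segment ended at the previous point
            start = None
    if start is not None:
        segments.append((start, len(intensities)))  # trace ended above threshold
    return segments


trace = [5, 10, 120, 200, 150, 20, 8, 90, 300, 310, 40]  # made-up intensities
print(split_by_threshold(trace, threshold_for(0)))  # [(2, 5), (7, 10)]
```

Each returned pair marks one candidate nuclear cycle. A threshold that is too low or too high merges or splits segments, which is why per-data-set overrides are useful.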

The nuclear cycles (ncs) are then numbered. By default, it is assumed that the last observed nc is nc14. In some data sets, not all of the ncs have been recorded, or artefact ncs may be created by thresholding. To correct for this, one may manually specify the number of the last nc in the data set by adding it to the last_ncs dictionary in the identify_ncs function in calculate.py.
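As an illustration of the numbering convention, the sketch below counts backwards from the last nc. The function names, the example override, and the assumption that last_ncs is keyed by `dataset_id` are hypothetical; the actual logic lives in identify_ncs.

```python
# Hypothetical override: data set 2 is assumed to end at nc13.
last_ncs = {2: 13}


def number_ncs(segment_count, last_nc=14):
    """Number segments in time order so that the final segment gets last_nc."""
    first_nc = last_nc - segment_count + 1
    return list(range(first_nc, last_nc + 1))


def ncs_for_dataset(dataset_id, segment_count):
    """Apply the manual override for a data set if present, else assume nc14."""
    return number_ncs(segment_count, last_ncs.get(dataset_id, 14))


print(ncs_for_dataset(0, 3))  # [12, 13, 14]
print(ncs_for_dataset(2, 3))  # [11, 12, 13]
```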

Initial slope fitting

The initial slopes are fitted to all (non-averaged) points of a nuclear cycle that fall within the first several minutes after the beginning of the nuclear cycle. The exact fit duration is defined by the slope_length_mins constant. The procedure is discussed in the accompanying article. The nc must contain at least 3 frames to be fitted. If the fitted slope is negative, a nan value is stored instead.
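The rules above can be sketched as follows. This is an illustration under stated assumptions, not the repository's implementation: it uses an ordinary least-squares line with a free intercept, applies the 3-frame check to the fit window, reports the slope in a.u. per minute, and uses a made-up value for slope_length_mins. The fitting procedure actually used is the one described in the accompanying article.

```python
import math

slope_length_mins = 5  # hypothetical value; the real one is set in the code


def fit_initial_slope(times_s, intensities, nc_start_s, fit_mins=slope_length_mins):
    """Least-squares slope (a.u. per minute) over the first fit_mins minutes of an nc.

    Returns nan if fewer than 3 distinct time points fall in the fit window
    or if the fitted slope is negative.
    """
    # Keep only points inside the fit window; convert time to minutes since nc start.
    window = [
        ((t - nc_start_s) / 60.0, y)
        for t, y in zip(times_s, intensities)
        if 0 <= t - nc_start_s <= fit_mins * 60
    ]
    if len({t for t, _ in window}) < 3:
        return math.nan
    n = len(window)
    mean_t = sum(t for t, _ in window) / n
    mean_y = sum(y for _, y in window) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in window)
    cov = sum((t - mean_t) * (y - mean_y) for t, y in window)
    slope = cov / var_t
    return slope if slope >= 0 else math.nan


# Made-up points: intensity rises by 10 a.u. every 60 s.
rising = fit_initial_slope([0, 60, 120, 180], [0, 10, 20, 30], nc_start_s=0)
falling = fit_initial_slope([0, 60, 120, 180], [30, 20, 10, 0], nc_start_s=0)
print(rising, falling)
```

Points from several traces can be passed together, since the window filter and the fit only depend on the (time, intensity) pairs.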

Alexander-Serov/TASEP-analysis-drosophila
