FIX use orf_num rather than index to flip reverse strand profiles
This is related to Issue #54.

Previously, the code assumed the rows of the orfs data frame were in the same
order as the rows of the profile matrix. At one point, this was true; however,
it seems the access patterns and/or internal pandas data structures regarding
indexes have changed (see the pandas "What's new" notes for 0.19.2), and the
assumption no longer holds. As a result, ORFs on the reverse strand were not
correctly flipped.

The code now explicitly uses orf_num to match each ORF in the data frame with
its profile (which is the designed and expected behavior).
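
To illustrate the difference between the two lookups, here is a minimal sketch with a made-up ORF table (the values are hypothetical and not taken from the rpbp data). It assumes, as the message above describes, that row k of the profile matrix belongs to the ORF whose orf_num is k. When the data frame rows are not ordered by orf_num, the row positions returned by np.where no longer agree with the orf_num values:

```python
import numpy as np
import pandas as pd

# Hypothetical ORF table whose row order does not follow orf_num.
orfs = pd.DataFrame({
    "orf_num": [2, 0, 3, 1],
    "strand": ["-", "+", "-", "+"],
    "orf_len": [3, 4, 2, 5],
})

m_reverse = orfs["strand"] == "-"

# Old lookup: positions of the reverse-strand rows within the data frame.
positional = np.where(m_reverse)[0].tolist()

# New lookup: the orf_num values stored in those same rows.
by_orf_num = orfs.loc[m_reverse, "orf_num"].tolist()

print(positional)  # [0, 2]
print(by_orf_num)  # [2, 3]
```

In this toy table the positional lookup would flip profile rows 0 and 2, although row 0 belongs to a forward-strand ORF and row 3 is never touched.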
bmmalone committed Apr 2, 2017
1 parent 63ca81c commit 0a9253d
Showing 1 changed file with 9 additions and 6 deletions.
15 changes: 9 additions & 6 deletions rpbp/orf_profile_construction/extract_orf_profiles.py
@@ -230,15 +230,18 @@ def main():
     logger.info(msg)
 
     m_reverse = orfs['strand'] == '-'
-    reverse_indices = np.where(m_reverse)[0]
+    reverse_orfs = orfs[m_reverse]
 
-    for i in tqdm.tqdm(reverse_indices):
-        if sum_profiles[i].sum() == 0:
+    for idx, reverse_orf in tqdm.tqdm(reverse_orfs.iterrows()):
+        orf_num = reverse_orf['orf_num']
+
+        if sum_profiles[orf_num].sum() == 0:
             continue
-        orf_len = orfs.iloc[i]['orf_len']
-        dense = utils.to_dense(sum_profiles, i, length=orf_len)
+
+        orf_len = reverse_orf['orf_len']
+        dense = utils.to_dense(sum_profiles, orf_num, length=orf_len)
         dense = dense[::-1]
-        sum_profiles_lil[i, :orf_len] = dense
+        sum_profiles_lil[orf_num, :orf_len] = dense
 
     msg = "Writing the sparse matrix to disk"
     logger.info(msg)
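
The loop in this change can be tried out in isolation with a small stand-alone sketch. It is not the project's code: flip_reverse_profiles is a hypothetical helper, the data are invented, and plain scipy.sparse slicing stands in for the repository's utils.to_dense. It assumes row k of the matrix holds the profile of the ORF with orf_num k, and that only the first orf_len entries of a row are meaningful.

```python
import numpy as np
import pandas as pd
import scipy.sparse

# Hypothetical ORF table; rows are deliberately not ordered by orf_num.
orfs = pd.DataFrame({
    "orf_num": [2, 0, 1],
    "strand": ["-", "+", "-"],
    "orf_len": [3, 4, 2],
})

# Row k holds the profile of the ORF with orf_num == k (assumption).
profiles = scipy.sparse.lil_matrix((3, 4), dtype=np.int64)
profiles[0, :4] = np.array([1, 2, 3, 4])  # forward strand, left untouched
profiles[1, :2] = np.array([5, 6])        # reverse strand
profiles[2, :3] = np.array([7, 8, 9])     # reverse strand


def flip_reverse_profiles(orfs, profiles):
    """Return a copy of profiles with reverse-strand rows flipped in place."""
    flipped = profiles.copy()
    reverse_orfs = orfs[orfs["strand"] == "-"]

    for orf_num, orf_len in zip(reverse_orfs["orf_num"], reverse_orfs["orf_len"]):
        orf_num, orf_len = int(orf_num), int(orf_len)

        # Dense view of the meaningful part of this ORF's row.
        dense = profiles[orf_num, :orf_len].toarray().ravel()
        if dense.sum() == 0:
            continue  # nothing to flip for an empty profile

        # Write the reversed profile back to the row keyed by orf_num.
        flipped[orf_num, :orf_len] = dense[::-1]

    return flipped


flipped = flip_reverse_profiles(orfs, profiles).toarray()
print(flipped)
```

With this input, the forward-strand row stays [1, 2, 3, 4], while the two reverse-strand rows become [6, 5, 0, 0] and [9, 8, 7, 0], regardless of the order of the rows in the data frame.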