Small speedup to snpbin #81

Merged
merged 2 commits into from
Aug 11, 2015

Conversation

zkamvar
Collaborator

@zkamvar zkamvar commented Jul 31, 2015

I realized that the subsetter was converting all of the raw bits to integers before subsetting. If the subsetting happens before the integer conversion, we see a speedup on data sets with large numbers of SNPs.
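
To illustrate the idea, here is a minimal base R sketch, not the adegenet code itself. It assumes SNP calls packed eight per byte in a raw vector; the names `packed`, `convert_then_subset`, and `subset_then_convert` are hypothetical and exist only for this example. Both orderings return the same values, but the second converts only the bits that are kept.

```r
# Hypothetical illustration only; this is not adegenet's internal code.
set.seed(1)
n_loci <- 100003                      # deliberately not a multiple of 8
bits   <- sample(0:1, n_loci, replace = TRUE)

# Pack the 0/1 calls into raw bytes, padding the tail with zeros.
pad    <- (8 - n_loci %% 8) %% 8
packed <- packBits(as.raw(c(bits, integer(pad))), type = "raw")

the_loci <- sample(n_loci, 1000)

# Old order: expand every bit to an integer, then subset.
convert_then_subset <- function(packed, i) {
  as.integer(rawToBits(packed))[i]
}

# New order: subset the raw bits first, then convert only what is kept.
subset_then_convert <- function(packed, i) {
  as.integer(rawToBits(packed)[i])
}

old_res <- convert_then_subset(packed, the_loci)
new_res <- subset_then_convert(packed, the_loci)

# Both orderings must agree with each other and with the unpacked data.
stopifnot(identical(old_res, new_res))
stopifnot(identical(new_res, bits[the_loci]))
```

The saving comes from the integer conversion touching `length(the_loci)` elements instead of every locus, which is why the difference should only show up when the number of SNPs is large.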

The benchmarking protocol is the same as in #48.

Old Method:

Unit: milliseconds
           expr        min         lq       mean     median         uq       max neval cld
  x[, the_loci]   5.185082   5.678802   7.731961   6.007779   7.175650 149.25734   200 a
  y[, the_loci]   5.902340   6.467108   7.996073   7.000335   8.356582  39.65144   200 a
  z[, the_loci]  22.080076  26.274703  30.389466  28.280454  31.426308 189.91701   200  b
 zz[, the_loci] 142.129836 165.874361 235.373325 191.655808 309.998984 480.83209   200   c

New Method:

Unit: milliseconds
           expr       min         lq       mean     median         uq       max neval cld
  x[, the_loci]  5.181068   5.522196   6.350872   6.133190   6.624304  12.54894   200 a
  y[, the_loci]  5.676456   6.080247   7.126351   6.584142   7.596402  35.41565   200 a
  z[, the_loci] 10.286474  13.144538  16.275613  15.327015  17.514597 165.40598   200  b
 zz[, the_loci] 93.555897 103.086456 131.967272 108.340221 121.095159 323.77547   200   c
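
For a quick read of the two tables, the median timings can be compared directly. This is a small sketch that only copies the `median` column from the output above; the variable names are made up for the example.

```r
# Median timings in milliseconds, copied from the two tables above.
old_median <- c(x = 6.007779, y = 7.000335, z = 28.280454, zz = 191.655808)
new_median <- c(x = 6.133190, y = 6.584142, z = 15.327015, zz = 108.340221)

# Ratio > 1 means the new method is faster on that object.
speedup <- old_median / new_median
print(round(speedup, 2))
```

On these numbers `x` and `y` are essentially unchanged, while `z` and `zz` come out roughly 1.8 times faster by median.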

zkamvar added 2 commits July 31, 2015 11:15
This gives a slight increase in performance (the commit message repeats the Old Method and New Method benchmarks above).

@thibautjombart
Owner

Looks great. Will merge after the current workshop is over, as we will rely on adegenet devel.

thibautjombart added a commit that referenced this pull request Aug 11, 2015
@thibautjombart thibautjombart merged commit 46ccf59 into master Aug 11, 2015
@thibautjombart
Owner

Awesome, thanks!

@zkamvar zkamvar deleted the snpbin-faster branch September 1, 2015 22:12