AssertionError in prune2df #132
Hi @Matthias3033, can you list the databases you are using here? From the error, it sounds like no genes were found in the database that overlap with your data.
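This overlap can be checked directly before running prune2df. A minimal sketch of the idea, where the two gene sets are dummy stand-ins for your expression-matrix columns and the ranking database's gene names (a classic cause of zero overlap is a species/case mismatch in gene symbols):

```python
# Dummy stand-ins: matrix_genes ~ your expression-matrix columns,
# db_genes ~ the gene names stored in the ranking database.
matrix_genes = {"TP53", "MYC", "GAPDH", "ACTB"}   # human-style upper case
db_genes = {"Tp53", "Myc", "Gapdh"}               # mouse-style casing
overlap = matrix_genes & db_genes

# With a species/case mismatch the intersection is empty, and cisTarget
# has nothing to score, which can surface as the AssertionError below.
print(len(overlap))  # 0
```

With real data, compare your matrix's gene names against the database's gene list the same way; a very small intersection points at the wrong database or annotation file.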
Hi @cflerin, these are the databases that I use:
The databases look fine (although there's no need to use the 7-species databases when you are also using the 10-species ones, it won't cause issues). Are you also using the correct motif annotations file (for human)? How many genes are in your expression matrix? And how many modules do you have?
I am using the correct motif file. The number of genes is 17098. How do I get the number of modules? (With len(modules) I get 4996.)
Just noticed:
which seems self-explanatory. You could try taking out the three 7-species databases and see if it works with the remaining databases.
Same error. I've also tried it with only one 7-species database, but still get the same error.
How much memory do you have available on your machine? You could try reducing the number of processes that pySCENIC is using...
How can I reduce the number of processes?
Via the CLI you have a parameter for this. For the prune2df function (cisTarget step) the parameter name is num_workers. For GRNBoost, I kindly refer you to the arboreto package documentation: https://github.com/tmoerman/arboreto . Briefly, you need to use a construct like this:
I have 120 GB of RAM available, so memory should normally not be a problem.
Hi,
I get the following error message when I use the function prune2df:
AssertionError Traceback (most recent call last)
in
3 # Calculate a list of enriched motifs and the corresponding target genes for all modules.
4 with ProgressBar():
----> 5 df = prune2df(dbs, modules, MOTIF_ANNOTATIONS_FNAME_HS)
6
7 # Create regulons from this table of enriched motifs.
~/miniconda3/lib/python3.7/site-packages/pyscenic/prune.py in prune2df(rnkdbs, modules, motif_annotations_fname, rank_threshold, auc_threshold, nes_threshold, motif_similarity_fdr, orthologuous_identity_threshold, weighted_recovery, client_or_address, num_workers, module_chunksize, filter_for_annotation)
349 return _distributed_calc(rnkdbs, modules, motif_annotations_fname, transformation_func, aggregation_func,
350 motif_similarity_fdr, orthologuous_identity_threshold, client_or_address,
--> 351 num_workers, module_chunksize)
352
353
~/miniconda3/lib/python3.7/site-packages/pyscenic/prune.py in _distributed_calc(rnkdbs, modules, motif_annotations_fname, transform_func, aggregate_func, motif_similarity_fdr, orthologuous_identity_threshold, client_or_address, num_workers, module_chunksize)
298 if client_or_address == "dask_multiprocessing":
299 # ... via multiprocessing.
--> 300 return create_graph().compute(scheduler='processes', num_workers=num_workers if num_workers else cpu_count())
301 else:
302 # ... via dask.distributed framework.
~/miniconda3/lib/python3.7/site-packages/dask/base.py in compute(self, **kwargs)
154 dask.base.compute
155 """
--> 156 (result,) = compute(self, traverse=False, **kwargs)
157 return result
158
~/miniconda3/lib/python3.7/site-packages/dask/base.py in compute(*args, **kwargs)
395 keys = [x.dask_keys() for x in collections]
396 postcomputes = [x.dask_postcompute() for x in collections]
--> 397 results = schedule(dsk, keys, **kwargs)
398 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
399
~/miniconda3/lib/python3.7/site-packages/dask/multiprocessing.py in get(dsk, keys, num_workers, func_loads, func_dumps, optimize_graph, **kwargs)
190 get_id=_process_get_id, dumps=dumps, loads=loads,
191 pack_exception=pack_exception,
--> 192 raise_exception=reraise, **kwargs)
193 finally:
194 if cleanup:
~/miniconda3/lib/python3.7/site-packages/dask/local.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, **kwargs)
499 _execute_task(task, data) # Re-execute locally
500 else:
--> 501 raise_exception(exc, tb)
502 res, worker_id = loads(res_info)
503 state['cache'][key] = res
~/miniconda3/lib/python3.7/site-packages/dask/compatibility.py in reraise(exc, tb)
110 if exc.traceback is not tb:
111 raise exc.with_traceback(tb)
--> 112 raise exc
113
114 else:
~/miniconda3/lib/python3.7/site-packages/dask/local.py in execute_task()
270 try:
271 task, data = loads(task_info)
--> 272 result = _execute_task(task, data)
273 id = get_id()
274 result = dumps((result, id))
~/miniconda3/lib/python3.7/site-packages/dask/local.py in _execute_task()
250 elif istask(arg):
251 func, args = arg[0], arg[1:]
--> 252 args2 = [_execute_task(a, cache) for a in args]
253 return func(*args2)
254 elif not ishashable(arg):
~/miniconda3/lib/python3.7/site-packages/dask/local.py in ()
250 elif istask(arg):
251 func, args = arg[0], arg[1:]
--> 252 args2 = [_execute_task(a, cache) for a in args]
253 return func(*args2)
254 elif not ishashable(arg):
~/miniconda3/lib/python3.7/site-packages/dask/local.py in _execute_task()
251 func, args = arg[0], arg[1:]
252 args2 = [_execute_task(a, cache) for a in args]
--> 253 return func(*args2)
254 elif not ishashable(arg):
255 return arg
~/miniconda3/lib/python3.7/site-packages/pyscenic/transform.py in modules2df()
229 #TODO: Remove this restriction.
230 return pd.concat([module2df(db, module, motif_annotations, weighted_recovery, False, module2features_func)
--> 231 for module in modules])
232
233
~/miniconda3/lib/python3.7/site-packages/pyscenic/transform.py in ()
229 #TODO: Remove this restriction.
230 return pd.concat([module2df(db, module, motif_annotations, weighted_recovery, False, module2features_func)
--> 231 for module in modules])
232
233
~/miniconda3/lib/python3.7/site-packages/pyscenic/transform.py in module2df()
183 try:
184 df_annotated_features, rccs, rankings, genes, avg2stdrcc = module2features_func(db, module, motif_annotations,
--> 185 weighted_recovery=weighted_recovery)
186 except MemoryError:
187 LOGGER.error("Unable to process \"{}\" on database \"{}\" because ran out of memory. Stacktrace:".format(module.name, db.name))
~/miniconda3/lib/python3.7/site-packages/pyscenic/transform.py in module2features_auc1st_impl()
127 # Calculate recovery curves, AUC and NES values.
128 # For fast unweighted implementation so weights to None.
--> 129 aucs = calc_aucs(df, db.total_genes, weights, auc_threshold)
130 ness = (aucs - aucs.mean()) / aucs.std()
131
~/miniconda3/lib/python3.7/site-packages/pyscenic/recovery.py in aucs()
282 # for calculationg the maximum AUC.
283 maxauc = float((rank_cutoff+1) * y_max)
--> 284 assert maxauc > 0
285 return auc2d(rankings, weights, rank_cutoff, maxauc)
AssertionError:
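The failing check at the bottom of the trace is assert maxauc > 0, where maxauc = (rank_cutoff + 1) * y_max and y_max reflects how many module genes (or how much summed gene weight) were actually recovered from the ranking database. With zero overlap between module genes and the database, y_max is 0 and the assertion trips. A sketch with illustrative numbers:

```python
# Illustrative values only: in pyscenic, rank_cutoff is derived from
# auc_threshold and the database size, and y_max from the module genes
# found in the ranking database.
rank_cutoff = 1500
y_max = 0  # no module genes were found in the ranking database

# maxauc is the area of the rectangle bounding the recovery curve.
maxauc = float((rank_cutoff + 1) * y_max)
print(maxauc)      # 0.0
print(maxauc > 0)  # False -> `assert maxauc > 0` raises AssertionError
```

So the AssertionError is a symptom, not the root cause: the real problem is that the modules and the ranking database share no genes.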
As the ranking database I use Homo sapiens. I do not get this error message when using Mus musculus with another data set. The error mentioned in issue #85 is not present here. Does anyone have an idea how to fix this error?