Only limited number of characters in CURRENT-POP module allowed? #72

Open
THccaa opened this issue Sep 16, 2020 · 1 comment
THccaa commented Sep 16, 2020

When I run GPhoCS I get the following output:

**************************************************************

G-Phocs version 1.3.2,  Oct. 2017

**************************************************************
Setting Thread Count to: 4
Reading control settings from file gphocs.ctr...
Error: uneven terms in sample line for pop 1.
Error: expecting to see 110 samples in pop 1, but was able to read only 111.
Error: argument 'Kv29' is not accepted in POP module of CURRENT-POPS.
Error: argument 'PfTa22' is not accepted in POP module of CURRENT-POPS.
Error: uneven terms in sample line for pop 2.
Error: expecting to see 110 samples in pop 2, but was able to read only 111.
Error: argument 'eSk15' is not accepted in POP module of CURRENT-POPS.
Found 7 errors when parsing CURRENT-POPS in control file gphocs.ctr.

My CURRENT-POP module looks like this (first two species only):

CURRENT-POPS-START

		POP-START
				name		Pf
				samples	PfAd16 d PfAd17 d PfAd18 d PfAd19 d PfAd20 d PfAd21 d PfAd22 d PfAd23 d PfAd24 d PfAd25 d PfAd26 d PfAd27 d PfAd28 d PfAd29 d PfAd30 d PfBi11 d PfBi12 d PfBi13 d PfBi14 d PfBi15 d PfBi16 d PfBi17 d PfBi18 d PfBi19 d PfBi20 d PfBi21 d PfBi22 d PfBi23 d PfIh15 d PfIh16 d PfIh17 d PfIh18 d PfIh19 d PfIh20 d PfIh21 d PfIh22 d PfIh23 d PfIh24 d PfIh25 d PfIh26 d PfIh27 d PfIh28 d PfKv16 d PfKv17 d PfKv18 d PfKv19 d PfKv20 d PfKv21 d PfKv22 d PfKv23 d PfKv24 d PfKv25 d PfKv26 d PfKv27 d PfKv28 d PfKv29 d PfKv30 d PfMo15 d PfMo16 d PfMo17 d PfMo18 d PfMo19 d PfMo20 d PfMo21 d PfMo22 d PfMo23 d PfMo24 d PfMo25 d PfMo26 d PfMo27 d PfMo28 d PfMo29 d PfNa16 d PfNa17 d PfNa18 d PfNa19 d PfNa20 d PfNa21 d PfNa22 d PfNa23 d PfNa24 d PfNa25 d PfNa26 d PfNa27 d PfNa28 d PfNa29 d PfNa30 d PfSk16 d PfSk17 d PfSk18 d PfSk19 d PfSk20 d PfSk21 d PfSk22 d PfSk23 d PfSk24 d PfSk25 d PfSk26 d PfSk27 d PfSk28 d PfSk29 d PfSk30 d PfTa13 d PfTa14 d PfTa15 d PfTa16 d PfTa17 d PfTa18 d PfTa19 d PfTa20 d PfTa21 d PfTa22 d PfTa23 d PfTa24 d PfTa25 d PfTa26 d PfTa27 d PfZm14 d PfZm15 d PfZm16 d PfZm17 d PfZm18 d PfZm19 d PfZm20 d PfZm21 d PfZm22 d PfZm23 d PfZm24 d PfZm25 d PfZm26 d PfZm28 d 
		POP-END

		POP-START
				name		Pse
				samples		PeAd01 d PeAd02 d PeAd03 d PeAd05 d PeAd06 d PeAd08 d PeAd09 d PeAd10 d PeAd11 d PeAd12 d PeAd13 d PeAd14 d PeAd15 d PeIh01 d PeIh02 d PeIh03 d PeIh04 d PeIh05 d PeIh06 d PeIh07 d PeIh08 d PeIh09 d PeIh10 d PeIh11 d PeIh12 d PeIh13 d PeIh14 d PeKv01 d PeKv03 d PeKv04 d PeKv05 d PeKv06 d PeKv07 d PeKv08 d PeKv09 d PeKv10 d PeKv11 d PeKv12 d PeKv13 d PeKv14 d PeKv15 d PeSk01 d PeSk02 d PeSk03 d PeSk04 d PeSk05 d PeSk06 d PeSk07 d PeSk08 d PeSk09 d PeSk10 d PeSk11 d PeSk12 d PeSk13 d PeSk14 d PeSk15 d 
		POP-END

I was surprised by the '110 samples' because pop 1 (Pf) has more than 110 samples and pop 2 (Pe) has fewer. Also, pop 1 has no sample called 'Kv29'; it is 'PfKv29'. So I counted the number of characters until 'Kv29' and 'PfTa22' appear in pop 1 and until 'eSk15' appears in pop 2, and it is always 111 characters.

Now I wonder whether there is actually a character limit, or whether I have to alter my CURRENT-POPS module somehow?
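
For anyone hitting the same errors, a quick way to check where a flagged name sits in a samples line is to report both its position among the whitespace-separated terms and its character offset. The following is only a diagnostic sketch in Python; `locate_term` is a hypothetical helper, not part of G-PhoCS, and it assumes the line begins with the `samples` keyword as in the control file above.

```python
import re


def locate_term(samples_line, name):
    """Return (term_index, char_offset) of `name`, both 1-based.

    term_index counts whitespace-separated terms after the leading
    'samples' keyword; char_offset counts characters from the start
    of the line as passed in (including any leading tabs).
    """
    # Record every non-whitespace token together with its start offset.
    tokens = [(m.group(), m.start()) for m in re.finditer(r"\S+", samples_line)]
    # Skip the 'samples' keyword so that the first sample name is term 1.
    if tokens and tokens[0][0] == "samples":
        tokens = tokens[1:]
    for index, (text, start) in enumerate(tokens, start=1):
        if text == name:
            return index, start + 1
    raise ValueError(f"{name!r} not found in samples line")


if __name__ == "__main__":
    # Shortened, illustrative line; paste the full line from the control file instead.
    line = "samples\tPfAd16 d PfAd17 d PfAd18 d"
    print(locate_term(line, "PfAd18"))
```

Running this on the full line for each name reported in the error messages shows whether the cut-off lines up with a term count, a character count, or both.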

Collaborator

igronau commented Sep 21, 2020

Yes, there is a character limit for the "samples" line. I don't recall what it is, but regardless of this limit, you should prune your sample set. G-PhoCS is specifically designed to make use of thousands of loci from a few samples per population. Having more than 15 samples per population is probably excessive and will only bog down the computation without adding much statistical signal. I suggest running parallel analyses on subsets of your samples for validation, using ~10 samples per population in each analysis.
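
One possible way to prepare such subsets is sketched below in Python. This is an illustration under assumptions, not a G-PhoCS tool: `draw_subsets` and `format_samples_line` are hypothetical helper names, subsets are drawn independently at random (so they may share samples), and the `d` marker after each name is simply copied from the control file in the question.

```python
import random


def draw_subsets(sample_names, subset_size=10, n_subsets=3, seed=0):
    """Draw `n_subsets` random subsets of one population's samples.

    Each subset is drawn without replacement, but separate subsets are
    independent draws and can overlap with each other.
    """
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    size = min(subset_size, len(sample_names))
    return [sorted(rng.sample(sample_names, size)) for _ in range(n_subsets)]


def format_samples_line(names, ploidy="d"):
    """Format names as a 'samples' line: keyword, tab, then 'name d' pairs."""
    return "samples\t" + " ".join(f"{name} {ploidy}" for name in names)


if __name__ == "__main__":
    # Illustrative names following the PeAdNN pattern from the question.
    pe_samples = [f"PeAd{i:02d}" for i in range(1, 16)]
    for subset in draw_subsets(pe_samples, subset_size=10, n_subsets=3, seed=1):
        print(format_samples_line(subset))
```

Each printed line can replace the `samples` line of the corresponding POP block in a separate copy of the control file, giving one control file per parallel analysis.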
