MN catch and effort merging not matching even though match exists #1

Open
hkundel opened this issue May 17, 2023 · 6 comments

Comments

@hkundel
Collaborator

hkundel commented May 17, 2023

This likely has to do with the fact that survey IDs can be 15+ digits long. Excel may be rounding or cutting off digits in one of the files, which would make the IDs unable to match. Need to change the code that reads in these files so it explicitly treats the Survey ID column as a character (and not a numeric that can be rounded). If that doesn't work, we may have to re-save the csv files, or tell Excel that the IDs are characters.
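A minimal sketch of reading the ID column as character, in case it helps. The column name `survey_id` comes from the code later in this thread; the temporary file and the 20-digit ID are made up purely to show the rounding, and base R `read.csv` is used here only to keep the example dependency-free.

```r
# Hypothetical example file: two made-up rows, one with a 20-digit ID
path <- tempfile(fileext = ".csv")
writeLines(c("survey_id,species",
             "880626539624000,LMB",
             "12345678901234567890,SMB"), path)

# Default read: survey_id is guessed as numeric, so long IDs can lose digits
as_numeric <- read.csv(path)

# Explicit read: survey_id is kept as text, exactly as written in the file
as_character <- read.csv(path, colClasses = c(survey_id = "character"))

class(as_numeric$survey_id)               # numeric
class(as_character$survey_id)             # character
sprintf("%.0f", as_numeric$survey_id[2])  # no longer the ID from the file
as_character$survey_id[2]                 # the ID exactly as in the file
```

With readr, the equivalent would be `read_csv(path, col_types = cols(survey_id = col_character()))`. Note that this only protects IDs that are still intact in the csv; if Excel already altered them when the file was saved, the file itself would need to be re-exported.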

@hkundel
Collaborator Author

hkundel commented May 17, 2023

This appears to be a problem only with the electrofishing (EF) data. The GN and TN data are fine as is.

@hkundel
Collaborator Author

hkundel commented May 17, 2023

11,614 fish from the EF catch (out of 308,942 fish total) are not matching to a survey, but I have confirmed that we do in fact have the survey effort for the surveys where these fish were caught. Likely a DOW or Survey ID mismatch.

@hkundel
Collaborator Author

hkundel commented May 17, 2023

Okay, so there is effort data for survey 880626539624000 on Bassett Lake in St. Louis County in 2006, but it won't match up to catch.

Also, we should use the date in the catch file, NOT the effort file. I created a "Year" column for the merge itself, though.
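A rough sketch of what that could look like, assuming both files carry a `date` column that is already a `Date`. The column names `date` and `effort_hours` and all values are hypothetical toy data, not taken from the real files.

```r
library(dplyr)

# Toy catch and effort tables; note the survey dates disagree for survey "1"
catch <- tibble(
  survey_id = c("1", "1", "2"),
  date      = as.Date(c("2006-06-12", "2006-06-12", "2007-09-03")),
  species   = c("LMB", "SMB", "LMB")
)
effort <- tibble(
  survey_id    = c("1", "2"),
  date         = as.Date(c("2006-06-10", "2007-09-03")),
  effort_hours = c(1.5, 2.0)
)

# Derive a Year column in both files for the merge itself
catch_y <- catch %>%
  mutate(year = as.integer(format(date, "%Y")))

# Drop the effort date so only the catch date survives the merge
effort_y <- effort %>%
  mutate(year = as.integer(format(date, "%Y"))) %>%
  select(-date)

merged <- left_join(catch_y, effort_y, by = c("survey_id", "year"))
```

Dropping the effort `date` before joining avoids ending up with `date.x` and `date.y` columns and makes it explicit which file the date comes from.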

@hkundel
Collaborator Author

hkundel commented May 17, 2023

After running this code

```r
library(dplyr)

# Count fish per survey_id in the catch file
catch_survey_id <- mn_ef_lmb_smb_catch_26Aug2022 %>%
  select(survey_id) %>%
  group_by(survey_id) %>%
  summarise(Total_catch = n()) %>%
  mutate(file = "CATCH")

# Survey IDs present in the effort file
effort_survey_id <- mn_ef_lmb_smb_effort_26Aug2022 %>%
  select(survey_id) %>%
  mutate(file = "EFFORT")

# Keep survey_ids that appear in catch but have no match in effort
Survey_ID_Check <- full_join(catch_survey_id, effort_survey_id, by = "survey_id") %>%
  filter(is.na(file.y))
```

I see that there are survey_ids within the catch data that do NOT have a matching survey_id in the effort data, so either:

  1. The survey ID numbers are messed up and not matching OR
  2. We are missing effort for those surveys
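One way to tell these two cases apart might be to look up the unmatched catch surveys in the effort file by lake and year instead of by survey ID. This is only a sketch on toy data: the column names `dow` and `year` and every value below are hypothetical.

```r
library(dplyr)

# Toy data: survey "100000000000001" has effort under a slightly different ID,
# survey "222" has no effort at all
catch <- tibble(
  survey_id = c("100000000000001", "100000000000001", "111", "222"),
  dow       = c("69000100", "69000100", "01000200", "02000300"),
  year      = c(2006, 2006, 2010, 2012)
)
effort <- tibble(
  survey_id = c("100000000000002", "111"),
  dow       = c("69000100", "01000200"),
  year      = c(2006, 2010)
)

# Catch surveys with no effort row under the same survey_id
unmatched <- catch %>%
  distinct(survey_id, dow, year) %>%
  anti_join(effort, by = "survey_id")

# Look for effort on the same lake and year under a different survey_id
diagnosis <- unmatched %>%
  left_join(effort, by = c("dow", "year"),
            suffix = c("_catch", "_effort")) %>%
  mutate(likely_cause = if_else(is.na(survey_id_effort),
                                "missing effort",
                                "survey ID mismatch"))
```

A lake can have more than one survey in a year, so a hit here only suggests an ID mismatch and would still need a manual check.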

@hkundel
Collaborator Author

hkundel commented May 17, 2023

Okay, so there is effort data for survey 880626539624000 on Bassett Lake in St. Louis County in 2006, but it won't match up to catch.

Also, we should use the date in the catch file, NOT the effort file. I created a "Year" column for the merge itself, though.

Turns out this survey lists "WAE" as the target species, so I had removed it. I guess I will have to keep all EF effort data, even if the target species is not LMB or SMB.
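To illustrate why that filter causes unmatched fish, here is a small sketch on toy data. The column name `target_species` and all values are hypothetical; only the species codes come from this thread.

```r
library(dplyr)

# Toy effort table: survey "2" targeted WAE but still caught bass
effort <- tibble(
  survey_id      = c("1", "2"),
  target_species = c("LMB", "WAE")
)
catch <- tibble(
  survey_id = c("1", "2", "2"),
  species   = c("LMB", "SMB", "LMB")
)

# Filtering effort by target species drops survey "2" ...
effort_filtered <- effort %>%
  filter(target_species %in% c("LMB", "SMB"))

# ... so its fish no longer find a matching effort row
n_unmatched_filtered <- catch %>%
  anti_join(effort_filtered, by = "survey_id") %>%
  nrow()

# Keeping all EF effort rows lets every fish match
n_unmatched_all <- catch %>%
  anti_join(effort, by = "survey_id") %>%
  nrow()
```

In this toy case the filtered effort leaves two fish unmatched, while the unfiltered effort leaves none.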

@mrverhoeven
Owner

Corey Geving has told me that Survey_ID is a poor key to use. More information on this is provided in the MN Readme: https://docs.google.com/document/d/1sXag9RDvbKN0pUB6uRlADz36-kFNqRPwHVGlkJgyWTM/edit?usp=share_link
