Find 10 or 100 errors in DBpedia #11

Open
mgns opened this issue Jan 23, 2018 · 4 comments
Labels
warmup-task Warmup task to practice before applying for GSoC.

Comments

@mgns
Member

mgns commented Jan 23, 2018

Effort

1 day

Skills

curiosity, attention to detail, spreadsheets

Description

There are several classes of errors in DBpedia: data may be incorrect or missing. Errors can have different causes, for instance 1) wrong information or wrong formatting in Wikipedia or another original source; 2) mistakes made by the DBpedia Extraction Framework (DEF) during automatic extraction; 3) errors in the mappings or in the ontology. In this task you will browse through DBpedia entities, read the corresponding Wikipedia pages, (optionally) run some SPARQL queries via the Web UI, and analyze the results. Your objective is to judge whether the information is correct and to identify the likely source of each error. You will log your findings in a spreadsheet, which one of the core developers of DBpedia will review with you to help determine the source of each error.
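
As a rough illustration of the workflow above, the following Python sketch builds a query URL that lists an entity's properties and writes findings to a CSV file that can be opened as a spreadsheet. It is only a sketch under assumptions: the endpoint `https://dbpedia.org/sparql`, its `format` parameter, the column names, and the helper names `build_query_url`, `finding_row` and `write_findings` are illustrative choices, not part of the task description.

```python
import csv
import io
from urllib.parse import urlencode

# Assumption: the public DBpedia SPARQL endpoint; adjust if you use another one.
ENDPOINT = "https://dbpedia.org/sparql"

# The three error sources named in the task description.
ERROR_SOURCES = ("source", "extraction", "mapping_or_ontology")

# Hypothetical spreadsheet columns; the reviewers may ask for different ones.
HEADER = ["entity", "property", "observed_value",
          "expected_value", "suspected_source", "note"]


def build_query_url(entity: str, limit: int = 100) -> str:
    """Return a URL that lists property/value pairs of one DBpedia resource."""
    query = (
        "SELECT ?property ?value WHERE { "
        f"<http://dbpedia.org/resource/{entity}> ?property ?value "
        f"}} LIMIT {limit}"
    )
    # Assumption: the endpoint accepts 'query' and 'format' as GET parameters.
    params = {"query": query, "format": "application/sparql-results+json"}
    return ENDPOINT + "?" + urlencode(params)


def finding_row(entity, prop, observed, expected, suspected_source, note=""):
    """Build one spreadsheet row, rejecting unknown error-source labels."""
    if suspected_source not in ERROR_SOURCES:
        raise ValueError(f"unknown error source: {suspected_source!r}")
    return [entity, prop, observed, expected, suspected_source, note]


def write_findings(rows, out):
    """Write the header and all finding rows as CSV to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(HEADER)
    writer.writerows(rows)


if __name__ == "__main__":
    # Placeholder values only; replace them with what you actually observe.
    print(build_query_url("Example_Entity", limit=10))
    buffer = io.StringIO()
    write_findings(
        [finding_row("Example_Entity", "dbo:exampleProperty",
                     "observed", "expected", "extraction")],
        buffer,
    )
    print(buffer.getvalue())
```

The printed URL can be pasted into a browser, and the same query text can be entered in the SPARQL Web UI instead. Restricting `suspected_source` to a fixed set keeps the spreadsheet easy to group by error source later.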

Impact

Data quality is one of the most important challenges in open data sets like DBpedia. By finding and categorizing errors, you will learn more about how DBpedia works and help us draft a plan of action that will efficiently improve our data quality by tackling the largest sources of errors first.

mgns added the gsoc-2018 (Google Summer of Code 2018) and warmup-task labels on Jan 23, 2018
mommi84 removed the gsoc-2018 label on Dec 2, 2018
dbpedia deleted a comment from jimkont on Dec 17, 2018
dbpedia deleted a comment from danbutron on Dec 17, 2018
@PseudoNerd

I'm interested in issue #2 and would like to work on it. Should I wait for it to be assigned to me, or can I start working on it right away?

@mommi84
Member

mommi84 commented Mar 4, 2019

Hi @PseudoNerd, and thanks for your interest. The issues described on this page are just meant to be warm-up tasks, so none of them is going to be selected as a potential GSoC project. You're free to start working on it; however, I'd recommend focusing on a project proposal for project #2.

@PseudoNerd

PseudoNerd commented Mar 4, 2019

Thank you.

Also, where should I send the first draft of the project proposal to the mentors (the project label says mentor-needed) for evaluation, so that I can get pointers for writing the final one?

@mommi84
Member

mommi84 commented Mar 4, 2019

We are still deciding who will be mentoring project #2, so by the rules @beyzayaman and I are co-mentors ad interim until we have a name.
