Java Apache Lucene Web Search Engine - Precision Recall Query Analysis

kkousounnis/Lucene

Web Site Search Engine - Lucene Libraries

J2EE - Eclipse - Apache Tomcat Server - Lucene - JSP

Quick Application Presentation

p1

The first page shows a search bar. Sample queries to try can be found in Ergasia_p14086_Lucene2018_2019\cacm folder\ query.txt.

We press Search. Once all the test documents from the \cacm folder\ CACM file have been passed into Lucene, we get the response to our query.

p2

It responds like a search engine: it lists the matching texts first, ordered by how similar they are to our query. We can then read the whole text by pressing ViewFullText.

image

Here is the full text. At the bottom of the page we can return to the first page.

Installation Guide - Passing All Texts to Lucene

This step must be completed before running the application.

image

Here we take the file \cacm folder\ CACM and pass all of its texts to Lucene.

image

We pass all texts and Isbn numbers from the given \cacm folder\ CACM file to Lucene, ignoring the common words that we don't need.
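The idea of dropping common words before indexing can be sketched in plain Java. This is only an illustration and does not use Lucene's analyzer classes: the class name `StopWordFilterSketch`, the method `tokenize`, and the short sample word list are all hypothetical, and the project itself reads its common words from a file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java illustration of stop-word removal; not Lucene's analyzer API.
class StopWordFilterSketch {

    // Hypothetical sample list; the project loads its common words from a file.
    static final Set<String> COMMON_WORDS = Set.of("the", "a", "of", "and", "to", "in");

    // Lowercases the text, splits on non-alphanumeric characters,
    // and keeps only the tokens that are not common words.
    static List<String> tokenize(String text, Set<String> commonWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!raw.isEmpty() && !commonWords.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [design, compiler]
        System.out.println(tokenize("The design of a compiler", COMMON_WORDS));
    }
}
```

Only the surviving tokens would then be handed to the index, which keeps very frequent words from influencing the similarity ranking.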

image

Here the last line shows that our texts were passed successfully.

Indexing succeeds, skipping the common words listed in the common words file and using the Porter algorithm for stemming.

Analysis - Precision Recall - Query

Step 1: We run a query in the web search engine we created; in this example we use the 7th query.

Step 2: We take all the results and write them to an Excel file, as shown below.

image

First column: We construct the first column from the qrel.txt file that we were given in the CACM folder. This file was created by people who read the texts and declared whether each one is relevant to the query or not; in other words, it was made with human judgment. We assign R (Relevant) if the text is listed in qrel.txt for the query and N (NonRelevant) if it is not.
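The labeling rule for the first column can be sketched as follows. This is a minimal illustration rather than the project's code: the class `RelevanceLabelSketch`, the method `label`, and the document numbers in `main` are made up, and the relevant set is assumed to have already been read from the judgments file for a single query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative sketch: marks each returned text as R or N using a set of judged-relevant ids.
class RelevanceLabelSketch {

    // Returns "R" for every returned id that is in the relevant set and "N" otherwise,
    // preserving the order in which the search engine ranked the results.
    static List<String> label(List<Integer> rankedIds, Set<Integer> relevantIds) {
        List<String> labels = new ArrayList<>();
        for (Integer id : rankedIds) {
            labels.add(relevantIds.contains(id) ? "R" : "N");
        }
        return labels;
    }

    public static void main(String[] args) {
        // Hypothetical ids, not taken from the CACM judgments.
        List<Integer> ranked = List.of(10, 20, 30, 40);
        Set<Integer> relevant = Set.of(10, 30, 50);
        // Prints [R, N, R, N]
        System.out.println(label(ranked, relevant));
    }
}
```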

image

image

Second column: The Isbn number of the text that has been returned.

Third column: Recall list. Each entry is a fraction whose numerator is the number of relevant texts returned up to this point and whose denominator is the total number of relevant texts for the query.

Fourth column: Precision list. Each entry is a fraction whose numerator is the number of relevant texts returned up to this point and whose denominator is the number of all texts returned up to this point.

Graphical presentation: To complete the analysis, we create a precision-recall diagram.
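The recall and precision columns can be computed row by row from the R/N labels. The sketch below uses the standard definitions (recall divides by the total number of relevant texts for the query, precision by the number of texts returned so far); the class `PrecisionRecallSketch`, the method `compute`, and the sample input are hypothetical and not taken from the project or its spreadsheet.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: precision and recall after each position of a ranked result list.
class PrecisionRecallSketch {

    // relevantFlags.get(i) is true when the result at rank i + 1 is labeled R.
    // totalRelevant is the number of relevant texts judged for the query.
    // Each returned array holds {recall, precision} for one rank.
    static List<double[]> compute(List<Boolean> relevantFlags, int totalRelevant) {
        if (totalRelevant <= 0) {
            throw new IllegalArgumentException("totalRelevant must be positive");
        }
        List<double[]> rows = new ArrayList<>();
        int relevantSoFar = 0;
        for (int i = 0; i < relevantFlags.size(); i++) {
            if (relevantFlags.get(i)) {
                relevantSoFar++;
            }
            double recall = (double) relevantSoFar / totalRelevant;
            double precision = (double) relevantSoFar / (i + 1);
            rows.add(new double[] {recall, precision});
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical ranking R, N, R, N with 4 relevant texts in total.
        List<double[]> rows = compute(List.of(true, false, true, false), 4);
        for (int i = 0; i < rows.size(); i++) {
            System.out.printf("rank %d: recall=%.2f precision=%.2f%n",
                    i + 1, rows.get(i)[0], rows.get(i)[1]);
        }
    }
}
```

Plotting the recall values on one axis against the precision values on the other gives the precision-recall diagram.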