From 62ab02632600a98faf4632be33814c0e3ff12859 Mon Sep 17 00:00:00 2001
From: aquatiko
Date: Tue, 9 Oct 2018 20:33:28 +0530
Subject: [PATCH 1/2] NLP intro article added

---
 .../natural-language-processing/index.md | 50 +++++++++++++++++++
 1 file changed, 50 insertions(+)
 create mode 100644 src/pages/machine-learning/natural-language-processing/index.md

diff --git a/src/pages/machine-learning/natural-language-processing/index.md b/src/pages/machine-learning/natural-language-processing/index.md
new file mode 100644
index 00000000000..ca2d998f4d2
--- /dev/null
+++ b/src/pages/machine-learning/natural-language-processing/index.md
@@ -0,0 +1,50 @@
+---
+title: Natural Language Processing
+---
+## Natural Language Processing(NLP)
+
+As the Wikipedia says, "Natural language processing (NLP) is a subfield of computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data."
+In simpler terms, it is a process in which natural language generated by humans are made sense of by computers.
+
+### Challenges in NLP
+
+#### 1.Easy or mostly solved
+ -Spam detection
+ -Part of Speech Tagging
+ -Named Entity Recognition
+#### 2.Intermediate or making good progress
+ -Sentiment analysis
+ -Coreference resolution
+ -Word sense disambiguation
+ -Parsing
+ -Machine Translation
+ -Information Translation
+#### 3.Hard or still need lot of work
+ -Text Summarization
+ -Machine dialog system
+
+### Common Techniques
+ -Structure extraction
+ -Identify and mark sentence, phrase, and paragraph boundaries
+ -Language identification
+ -Tokenization
+ -Acronym normalization and tagging
+ -Lemmatization / Stemming
+ -Entity extraction
+ -Phrase extraction
+
+### Popularly Used Libraries
+ -NLTK, the most widely-mentioned NLP library for Python.
+ -SpaCy, an industrial-strength NLP library built for performance.
+ -Gensim, a library for document similarity analysis.
+ -TextBlob, a user-friendly and intuitive NLTK interface.
+ -CoreNLP from stanford group
+ -PolyGlot, a natural language pipeline that supports massive multilingual applications.
+
+
+#### More Information:
+
+For further reading :
+
+- Click here for an article about NLP intro.
+- Click here for the Wikipedia reference.

From 8ee331cc4387ee4c2cc4d44ed76a4800c757f451 Mon Sep 17 00:00:00 2001
From: Rohit Kumar
Date: Tue, 9 Oct 2018 21:09:36 +0530
Subject: [PATCH 2/2] Update index.md

---
 .../natural-language-processing/index.md | 51 +++++++++----------
 1 file changed, 25 insertions(+), 26 deletions(-)

diff --git a/src/pages/machine-learning/natural-language-processing/index.md b/src/pages/machine-learning/natural-language-processing/index.md
index ca2d998f4d2..04617de9785 100644
--- a/src/pages/machine-learning/natural-language-processing/index.md
+++ b/src/pages/machine-learning/natural-language-processing/index.md
@@ -7,39 +7,38 @@ As the Wikipedia says, "Natural language processing (NLP) is a subfield of compu
 In simpler terms, it is a process in which natural language generated by humans are made sense of by computers.
 
 ### Challenges in NLP
-
 #### 1.Easy or mostly solved
- -Spam detection
- -Part of Speech Tagging
- -Named Entity Recognition
+ *Spam detection
+ *Part of Speech Tagging
+ *Named Entity Recognition
 #### 2.Intermediate or making good progress
- -Sentiment analysis
- -Coreference resolution
- -Word sense disambiguation
- -Parsing
- -Machine Translation
- -Information Translation
+ *Sentiment analysis
+ *Coreference resolution
+ *Word sense disambiguation
+ *Parsing
+ *Machine Translation
+ *Information Translation
 #### 3.Hard or still need lot of work
- -Text Summarization
- -Machine dialog system
+ *Text Summarization
+ *Machine dialog system
 
 ### Common Techniques
- -Structure extraction
- -Identify and mark sentence, phrase, and paragraph boundaries
- -Language identification
- -Tokenization
- -Acronym normalization and tagging
- -Lemmatization / Stemming
- -Entity extraction
- -Phrase extraction
+ *Structure extraction
+ *Identify and mark sentence, phrase, and paragraph boundaries
+ *Language identification
+ *Tokenization
+ *Acronym normalization and tagging
+ *Lemmatization / Stemming
+ *Entity extraction
+ *Phrase extraction
 
 ### Popularly Used Libraries
- -NLTK, the most widely-mentioned NLP library for Python.
- -SpaCy, an industrial-strength NLP library built for performance.
- -Gensim, a library for document similarity analysis.
- -TextBlob, a user-friendly and intuitive NLTK interface.
- -CoreNLP from stanford group
- -PolyGlot, a natural language pipeline that supports massive multilingual applications.
+ *NLTK, the most widely-mentioned NLP library for Python.
+ *SpaCy, an industrial-strength NLP library built for performance.
+ *Gensim, a library for document similarity analysis.
+ *TextBlob, a user-friendly and intuitive NLTK interface.
+ *CoreNLP from stanford group
+ *PolyGlot, a natural language pipeline that supports massive multilingual applications.
 
 
 #### More Information: