Web service that filters stop words out of a sentence and extracts the top 10 words from web page content

AnvithaDineshRao/WebService-FilterOutStopWordsService

FilterOutStopWordsService

A web service that filters out stop words and reports the most frequent words on a web page.

Service 1. Top10Words

Description:

Analyzes the web page at a given URL and returns its ten most frequently occurring words, in descending order of frequency.

Operation:

string[] Top10Words(string url)

Input:

A web page URL, as a string.

Output:

An array of strings containing the ten most frequently occurring words, in descending order of frequency. Items that are not semantic words, such as element tag names and attribute names enclosed in angle brackets < … >, are removed if the content is an XML page or HTML source page.
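
The behavior described above can be sketched roughly as follows. This is an illustrative Python sketch, not the repository's implementation (the service itself is exposed through the `string[] Top10Words(string url)` operation). The names `top10_words`, `top_words_from_markup`, and `_TextExtractor` are hypothetical, and skipping the bodies of `script` and `style` elements is an assumption that goes beyond the stated requirement.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.request import urlopen


class _TextExtractor(HTMLParser):
    """Collects text nodes only; tag names and attributes never reach handle_data."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style> (assumption)

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def top_words_from_markup(markup, n=10):
    """Return up to n words from the markup's text, most frequent first."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    text = " ".join(parser.chunks).lower()
    # A "word" here is a run of ASCII letters, optionally with one apostrophe part.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text)
    return [word for word, _ in Counter(words).most_common(n)]


def top10_words(url):
    """Fetch the page at url and return its ten most frequent words."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        markup = response.read().decode(charset, errors="replace")
    return top_words_from_markup(markup)
```

Splitting the counting step (`top_words_from_markup`) from the fetching step (`top10_words`) keeps the word-frequency logic usable on a string of markup without network access.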

Service 2. WordFilter

Description:

Analyzes a string of words and filters out function words (stop words) such as “a”, “an”, “in”, “on”, “the”, “is”, “are”, “am”, along with any other words that are not meaningful enough to count among the top words in a search.

Operation:

string WordFilter(string str)

Input:

A string.

Output:

A string with the stop words removed.
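
A minimal Python sketch of this filtering step is shown below, for illustration only; it is not the repository's implementation of `string WordFilter(string str)`. The name `word_filter` is hypothetical, and `STOP_WORDS` contains only the eight examples listed in the description, so a real stop-word list would likely be longer.

```python
import re

# Only the examples named in the description; a fuller list is assumed in practice.
STOP_WORDS = frozenset({"a", "an", "in", "on", "the", "is", "are", "am"})


def word_filter(text, stop_words=STOP_WORDS):
    """Return text with stop words removed, comparing case-insensitively."""
    kept = [
        word
        for word in text.split()
        # Strip surrounding punctuation before the lookup, but keep the
        # original token (with its punctuation and case) in the output.
        if re.sub(r"[^\w']", "", word).lower() not in stop_words
    ]
    return " ".join(kept)
```

For example, `word_filter("The cat is on the mat")` returns `"cat mat"`.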

A TryIt page (TryItPage.aspx), similar to a service directory, is implemented for testing these services.
