Abstract base class for document parsers.
| Language | Declaration |
|---|---|
| C# | public abstract class Parser |
| Visual Basic | Public MustInherit Class Parser |
| Visual C++ | public ref class Parser abstract |
| Member | Description |
|---|---|
| Parser() | |
| Configuration | Gets the instance of the Configuration class that holds the settings to be used. |
| Encoding | The character encoding used in the document Stream, if applicable. |
| Equals(Object) | Determines whether the specified Object is equal to the current Object. (Inherited from Object.) |
| Finalize() | Allows an Object to attempt to free resources and perform other cleanup operations before the Object is reclaimed by garbage collection. (Inherited from Object.) |
| GetFilenameFooter(Uri) | Creates a footer containing filename information from the Uri. |
| GetHashCode() | Serves as a hash function for a particular type. (Inherited from Object.) |
| GetNextWord(String) | Returns the next 'word' in rawBody. The method is iterative: subsequent calls return consecutive words. |
| GetType() | Gets the Type of the current instance. (Inherited from Object.) |
| GetWordsInUri(Uri) | Returns the words found in the Uri, as strings in an ArrayList. |
| IsCurrentWordInTitle() | Returns whether the word last returned by GetNextWord is part of the title. |
| IsInIgnoredRegion(ArrayList) | Determines whether the current word (at wordStart) is in an ignored region. |
| IsStreamNeeded() | Obsolete. Indicates whether the parser needs a stream to be passed to it in order to perform a ReadText or ReadLinks operation. |
| MemberwiseClone() | Creates a shallow copy of the current Object. (Inherited from Object.) |
| ParseWords(String, ArrayList, WordCollection, StringBuilder, ArrayList) | Parses rawBody into discrete Word objects and places them in readDocumentWords. |
| PreprocessBreakChunk(String) | Applies any required processing to a chunk of text that typically forms either a word or whitespace block. |
| ProcessWordsToFinalIndexedList(WordCollection, Boolean) | Processes the list of all words found in the document and returns the list of words that should be indexed. |
| Read(Stream, Uri, Encoding) | Reads a document and returns an object holding its text and any links. |
| ReadLinks(Stream, Encoding) | Obsolete. Reads links to other pages. |
| ReadText(Stream, Uri, Encoding) | Obsolete. Reads text and returns a list of words and the title. |
| ResetWordPointers() | Resets the current word being processed. |
| ToString() | Returns a String that represents the current Object. (Inherited from Object.) |
| TruncateWordWithRepeatedChar(String) | Removes repeated non-letters from word. |
| WordEnd | The current word's end. |
| WordStart | The current word's start. |
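
The iterative behaviour described for GetNextWord, WordStart, WordEnd and ResetWordPointers can be illustrated with a small stand-alone sketch. This is not the library's implementation: the class name WordCursorSketch, the rule that a word is a run of letters or digits, the exclusive end offset, and the null return when no words remain are all assumptions made for illustration only.

```csharp
using System;

// Hypothetical stand-in used only to illustrate the cursor pattern;
// it does not derive from, or reproduce, the documented Parser class.
public class WordCursorSketch
{
    // Offsets of the most recently returned word.
    // Assumption: WordEnd is exclusive (one past the last character).
    public int WordStart { get; private set; }
    public int WordEnd { get; private set; }

    // Moves the cursor back to the beginning of the text.
    public void ResetWordPointers()
    {
        WordStart = 0;
        WordEnd = 0;
    }

    // Returns the next run of letters/digits after the previous word,
    // or null when no further word exists (an assumed convention).
    public string GetNextWord(string rawBody)
    {
        int i = WordEnd;

        // Skip separators between the previous word and the next one.
        while (i < rawBody.Length && !char.IsLetterOrDigit(rawBody[i])) i++;
        if (i >= rawBody.Length) return null;

        int start = i;

        // Consume the word itself.
        while (i < rawBody.Length && char.IsLetterOrDigit(rawBody[i])) i++;

        WordStart = start;
        WordEnd = i;
        return rawBody.Substring(start, i - start);
    }
}

public static class Program
{
    public static void Main()
    {
        var cursor = new WordCursorSketch();
        string rawBody = "Hello, parser world!";

        string word;
        while ((word = cursor.GetNextWord(rawBody)) != null)
        {
            // Each call advances the cursor, so consecutive words are returned.
            Console.WriteLine("{0} [{1}, {2})", word, cursor.WordStart, cursor.WordEnd);
        }

        // Start over from the first word, for example before a second pass.
        cursor.ResetWordPointers();
    }
}
```

Keeping the position inside the object is what makes repeated calls with the same rawBody return consecutive words, and it is also why a reset method is needed before the same instance is reused on new text.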