HTML Query Filters

Scoundrel can extract query information from webpages viewed in the Built-in-Browser. This is done by a so-called HTMLQueryFilter. You might have noticed that the filter subdirectory of your Scoundrel directory contains a file called Amazon.dll. This file contains the implementation of the Amazon query filter. On startup, Scoundrel loads this dll into memory and can then filter out query information from Amazon.com.

The filter interface has been frozen, so it is safe to implement your own filters. A sample filter is included in the source. If you want to develop a filter, download the source to see quickly how it is done!

You do not have to share the source code of a filter you implement with the world (this is intentional, so that websites cannot too easily change their pages to fool your filter), but if you want me to put it up here on the Scoundrel homepage, you need to send me the source so that I can verify that the filter is not harmful to my users' systems. Please send any filter implementations to the email address found at the bottom of this page.

Also, if you translate the filter interface to another language, please send me the interface definition so that I can post it for others to benefit from. Thanks!

How to create a filter of your own

Forum - please post any questions you have regarding the implementation of custom filters in the filter forum.

In order to implement a filter, you need to create a dll which contains the following four functions (shown in Pascal/Delphi syntax here) and put it in the \filter subdirectory:
TFilterQueryFoundCallback is defined as:

These are described in more detail below.

Name
Returns a string containing the name of the filter. This is the name shown in the Go... menu.

Url
Returns a string containing the base url of the filter. This is the url the Built-in-Browser navigates to when the user selects the filter's name in the Go... menu.

Wants
Returns an integer indicating how strongly the filter wants the given url/document pair: 0 means the filter does not want it, 10 means "this is my webpage". The value is not a boolean because I want to be able to have a default filter that checks all webpages, so that in the future you can put standardized query information on a webpage without it being under a specific domain.

The document parameter is the return value of the IWebBrowser2::get_Document method, which is an IDispatch interface supporting the IHTMLDocument, IHTMLDocument2 and IHTMLDocument3 interfaces as specified by Microsoft. The most interesting of these is the IHTMLDocument2 interface, from which you can retrieve a pointer to the body of the document.

Process
Tells the filter to process the given document and extract any query information found. Whenever a query has been found, the Process procedure should call the TFilterQueryFoundCallback procedure given as an argument, and be sure to pass the pointer you were given as the first argument of that call! The document parameter is the same as for the Wants function.

If you can't get the tracklength, pass 0 (zero) as the tracklength parameter.
If you can't get the tracknumber, pass -1 (minus one) as the tracknumber parameter.
If you can't get the year, pass -1 (minus one) as the year parameter.
If you can't get the genre, pass nil or an empty string as the genre parameter.
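
The Wants score and the "unknown" values for Process are easy to get wrong, so here is a small model of that contract. It is only a sketch in Python, not the Scoundrel interface: a real filter is a dll, and its declarations are not reproduced here. The names `QueryInfo`, `wants`, `default_wants`, `pick_filter` and `process`, the `title` field, the host `example.com`, and the idea that the host picks the filter with the highest score are all assumptions made for illustration. The sketch also scores on the url alone and takes already-parsed records instead of an HTML document.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# "Unknown" values as listed in the Process description above.
UNKNOWN_TRACK_LENGTH = 0
UNKNOWN_TRACK_NUMBER = -1
UNKNOWN_YEAR = -1
UNKNOWN_GENRE = ""


@dataclass
class QueryInfo:
    title: str  # hypothetical field, not named in the text above
    track_length: int = UNKNOWN_TRACK_LENGTH
    track_number: int = UNKNOWN_TRACK_NUMBER
    year: int = UNKNOWN_YEAR
    genre: str = UNKNOWN_GENRE


def wants(url, own_host="example.com"):
    """Site-specific filter: 10 for its own host (or a subdomain), else 0."""
    host = (urlparse(url).hostname or "").lower()
    if host == own_host or host.endswith("." + own_host):
        return 10
    return 0


def default_wants(url):
    """Catch-all filter: a low, non-zero score for any http(s) page."""
    return 1 if urlparse(url).scheme in ("http", "https") else 0


def pick_filter(url, filters):
    """Assumed host behaviour: highest score wins; None if every score is 0."""
    best_name, best_score = None, 0
    for name, wants_fn in filters.items():
        score = wants_fn(url)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


def process(records, context, on_query_found):
    """Report each record, handing `context` back as the first argument."""
    for record in records:
        on_query_found(context, QueryInfo(
            title=record["title"],
            track_length=record.get("length", UNKNOWN_TRACK_LENGTH),
            track_number=record.get("number", UNKNOWN_TRACK_NUMBER),
            year=record.get("year", UNKNOWN_YEAR),
            # nil and the empty string both mean "no genre"
            genre=record.get("genre") or UNKNOWN_GENRE,
        ))


if __name__ == "__main__":
    filters = {"Sample": wants, "Default": default_wants}
    print(pick_filter("http://www.example.com/cd/123", filters))
    process([{"title": "Some track"}], "opaque-pointer",
            lambda ctx, query: print(ctx, query))
```

A non-boolean score lets a site-specific filter outrank a catch-all filter on its own pages, while the catch-all still handles every other page.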