traction) is a Java class library and interactive development environment for web crawlers.

Web SPHINX consists of two parts: the Crawler Workbench and the Web SPHINX class library.

The Crawler Workbench is a graphical user interface that lets you configure and control a customizable web crawler.

The (unmodified) source code for this library is included in the Web SPHINX source code.

Redistribution is allowed under the terms of the Apache Public License.

Web SPHINX is intended more for personal use, to crawl perhaps a hundred or a thousand web pages.

If you want to use Web SPHINX for large crawls, you should definitely read the next question about memory usage.

THIS SOFTWARE IS PROVIDED BY CARNEGIE MELLON UNIVERSITY ``AS IS'' AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.

IN NO EVENT SHALL CARNEGIE MELLON UNIVERSITY NOR ITS EMPLOYEES BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

If you would like to index multiple domains, add them while creating the search engine or on the Manage Domains page of the Dashboard. You can force a re-crawl by clicking the Recrawl button on the Domain page of your Dashboard.

(This may be disabled if your domain has more than 50,000 pages or has been re-crawled recently.)

Category Classifier was not part of this reimplementation because it depended on other software that belongs to SRC.