PoultryNet Streamlines Access to Poultry-Related Materials on the Web
PoultryNet, a website built with a powerful search engine at its core, is designed to improve the way poultry professionals reach poultry-related information on the World Wide Web. PoultryNet’s search engine scours over 1,200 hand-picked poultry-related websites each week, placing over 50,000 documents at the poultry professional’s fingertips.
PoultryNet’s search tool is similar to Yahoo’s and Infoseek’s. Like these more popular search engines, it searches through Internet-based documents for terms that the user enters. PoultryNet is vastly different from these search tools, however, in that its scope is strictly limited to poultry-related websites.
PoultryNet’s research focus has centered on understanding current Internet search technology to fill the gaps left by today’s search tools. One of the largest frustrations with the Internet is information overload; typing the term “poultry” into a major search engine will return thousands of links to websites, not all of which appear to be related to poultry. Because the major search engines are search tools for all information on the Internet, the user is often plagued by too many unrelated results. For example, if you would like to understand the use of ozone in growout houses, many major search engines would return a list of results peppered with links that are not relevant but that nonetheless contain the search terms provided. When this search is performed on one major search engine, the first result returned is about alligator farming. Interesting, but irrelevant. Nevertheless, popular search tools are important for finding information on the Internet (see sidebar for guidelines for mining information with these tools).
PoultryNet’s Automation Process
When someone performs a search on PoultryNet, he/she enters a search term or phrase and receives a list of results. What the user does not see are two highly automated processes underneath: data gathering and data filtering.
PoultryNet first gathers information from other websites by automatically visiting over 1,200 websites once a week and pulling down the most pertinent poultry-related information from each website. The text from these documents is then stored in a database on PoultryNet, which currently houses over 50,000 documents. This process is almost entirely automated; the one area that requires human intervention is determining which sites to visit. PoultryNet team members constantly search for new poultry-related websites, and there is also a link for PoultryNet users to submit a site that the team may have overlooked.
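The gathering step can be illustrated with a minimal Python sketch. It is not PoultryNet's actual code: the names `extract_text` and `gather`, the dictionary standing in for the database, and the `fetch` function passed in by the caller are all assumptions made for illustration, and the sketch stores all visible text rather than selecting the most pertinent parts.

```python
from html.parser import HTMLParser
from typing import Callable


class TextExtractor(HTMLParser):
    """Collect visible text from a page, skipping script and style contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str) -> str:
    """Return the page's visible text as one space-separated string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def gather(
    site_list: list[str],
    fetch: Callable[[str], str],
    database: dict[str, str],
) -> int:
    """Visit each listed address once and store its text; return the count stored.

    `site_list` is the hand-picked list maintained by people (hypothetical name).
    `fetch` is any function that returns a page's HTML for an address.
    """
    stored = 0
    for address in site_list:
        try:
            html = fetch(address)
        except OSError:
            continue  # site unreachable this week; keep any earlier copy
        database[address] = extract_text(html)
        stored += 1
    return stored
```

Running `gather` once a week over the hand-picked list would mirror the schedule described above. The only manual input is `site_list` itself; the `fetch` argument could wrap any HTTP client.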
Once the user asks PoultryNet to search for a set of terms, PoultryNet matches those terms against the text of all 50,000 documents in its database. This process, data filtering, is automation at its finest: matching search terms against a pool of 50,000 documents takes seconds for PoultryNet but would take days for the average person. The filtering process doesn’t end there, though. Once PoultryNet has found every match, it sorts the matches and returns the most relevant ones first. The most relevant documents are those with the most occurrences of the user’s search term or terms in the title, address, and body of the document. Without these two automatic processes, we wouldn’t have the sophisticated Internet search engines we enjoy today.
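The relevance rule described above, counting occurrences of the search terms in the title, address, and body, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than PoultryNet's implementation: the `Document` fields, the function names, and the equal weighting of the three fields are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    address: str  # the document's Web address
    body: str


def count_occurrences(term: str, text: str) -> int:
    """Count case-insensitive, non-overlapping occurrences of term in text."""
    return len(re.findall(re.escape(term.lower()), text.lower()))


def score(doc: Document, terms: list[str]) -> int:
    """Sum the occurrences of every term across title, address, and body."""
    return sum(
        count_occurrences(term, field)
        for term in terms
        for field in (doc.title, doc.address, doc.body)
    )


def search(docs: list[Document], terms: list[str]) -> list[Document]:
    """Keep documents with at least one match, most occurrences first."""
    scored = [(score(doc, terms), doc) for doc in docs]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in matches]
```

A search for "ozone" over a small collection would then place a page titled "Ozone in growout houses" ahead of an alligator-farming page that mentions ozone only once, and would drop pages that never mention it.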
PoultryNet’s advanced search tool also assists its users by helping them communicate with its search engine. Communicating with a search engine is similar to walking up to a reference librarian, saying “double yolk,” and expecting him/her to produce a number of documents pertaining to double yolks. Most reference librarians will ask for clarification, which is what PoultryNet’s advanced search page does. The goal of the advanced search interface is solely to tease clarification from the user by offering a number of options: the user may type a free text query, generate other search terms, select Boolean operators (like AND and OR) to connect search terms, include part of the document’s address (.edu if the user wants something only from an educational institution), or supply a range of dates in which the document was last modified. In this way, the advanced search page guides the user toward a search query that yields the best search results.
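To make these options concrete, here is a small Python sketch of how such a query might be represented and applied. The `Page` and `AdvancedQuery` classes and their field names are hypothetical, chosen only to mirror the options listed above (search terms, a Boolean operator, part of the address, and a last-modified date range); the sketch leaves out the option to generate other search terms.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Page:
    address: str
    text: str
    last_modified: date


@dataclass
class AdvancedQuery:
    terms: list[str]
    operator: str = "AND"                  # "AND" or "OR" between terms
    address_part: Optional[str] = None     # e.g. ".edu"
    modified_from: Optional[date] = None   # start of date range, inclusive
    modified_to: Optional[date] = None     # end of date range, inclusive

    def matches(self, page: Page) -> bool:
        """Return True if the page satisfies every option the user supplied."""
        text = page.text.lower()
        hits = [term.lower() in text for term in self.terms]
        combine = all if self.operator.upper() == "AND" else any
        if not combine(hits):
            return False
        if self.address_part and self.address_part.lower() not in page.address.lower():
            return False
        if self.modified_from and page.last_modified < self.modified_from:
            return False
        if self.modified_to and page.last_modified > self.modified_to:
            return False
        return True


def run_query(query: AdvancedQuery, pages: list[Page]) -> list[Page]:
    """Return the pages that satisfy the query, in their original order."""
    return [page for page in pages if query.matches(page)]
```

A query for "double" AND "yolk", limited to addresses containing .edu and to pages modified during 1998, plays the role of the librarian's follow-up questions: each option the user fills in removes another class of irrelevant results.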
PoultryNet’s research focus continues to be narrowing the gap between the information the industry professional needs and the information available on the Internet. Just about any piece of information a user could ever need related to his/her business is out there; the difficult part is finding and acquiring it.
Tom McKlin, research scientist in GTRI’s Intelligent Machines Branch, contributed this article.