PoultryNet Streamlines Access to Poultry-Related Materials on the Web
PoultryNet, a website built with
a powerful search engine at its core, is designed to improve the way poultry professionals
reach poultry-related information on the World Wide Web. PoultryNet’s search engine
scours over 1,200 hand-picked poultry-related websites each week, placing over 50,000
documents at the poultry professional’s fingertips.
PoultryNet’s search tool is similar to Yahoo’s and Infoseek’s. Like these more popular
search engines, it searches through Internet-based documents for terms that the user
enters. PoultryNet is vastly different from these search tools, however, in that
its scope is strictly limited to poultry-related websites.
PoultryNet’s research focus has centered on understanding current Internet search
technology to fill the gaps left by today’s search tools. One of the largest frustrations
with the Internet is information overload; typing the term “poultry” into a major
search engine returns thousands of links to websites, not all of which appear
to be related to poultry. Because the major search engines are search tools for all
information on the Internet, the user is often plagued by too many unrelated results.
For example, if you want to understand the use of ozone in growout houses,
many major search engines return a list of results peppered with links that
are not relevant but that nonetheless contain the search terms provided. When
this search is performed on one major search engine, the first result
returned is about alligator farming. Interesting, but irrelevant. Nevertheless, popular
search tools are important for finding information on the Internet (see sidebar for
guidelines for mining information with these tools).
PoultryNet’s Automation Process
When someone performs a search on PoultryNet, he/she
enters a search term or phrase and receives a list of results. What the user does
not see are two highly automated processes underneath: data gathering and data filtering.
PoultryNet first gathers information from other websites by automatically visiting
over 1,200 websites once a week and pulling down the most pertinent poultry-related
information from each website. The text from these documents is then stored in a
database on PoultryNet, which currently houses over 50,000 documents. For the most
part, this gathering stage is automated; the one step that requires human intervention
is determining which sites to visit. PoultryNet team members constantly search for
new poultry-related websites, and there is also a link for PoultryNet users to submit
a site that the team may have overlooked.
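
The gathering pass described above can be sketched roughly as follows. This is a minimal illustration, not PoultryNet's actual code: the `Page` record, the `gather` function, the in-memory `database` dictionary, and the `fake_fetch` stand-in for the network are all hypothetical names, and the sketch does not attempt the step of choosing the "most pertinent" information from each site.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Page:
    """Text pulled from one document (hypothetical record layout)."""
    address: str
    title: str
    body: str


def gather(sites: Iterable[str],
           fetch: Callable[[str], List[Page]],
           store: Dict[str, Page]) -> int:
    """Visit each site once and store its pages, keyed by address.

    Returns the number of pages stored or refreshed during this pass.
    """
    stored = 0
    for site in sites:
        try:
            pages = fetch(site)
        except OSError:
            # An unreachable site should not abort the whole weekly pass.
            continue
        for page in pages:
            # A fresh copy replaces whatever was stored on an earlier pass.
            store[page.address] = page
            stored += 1
    return stored


# Hypothetical stand-in for the Web so the sketch runs without a network.
FAKE_WEB = {
    "http://example.edu/poultry": [
        Page("http://example.edu/poultry/ozone.html",
             "Ozone in growout houses",
             "Notes on ozone use in growout houses."),
    ],
    "http://example.com/hatchery": [
        Page("http://example.com/hatchery/eggs.html",
             "Double yolk eggs",
             "Why some hens lay double yolk eggs."),
    ],
}


def fake_fetch(site: str) -> List[Page]:
    """Return the pages of a known site, or fail like a dead link would."""
    if site not in FAKE_WEB:
        raise OSError("unreachable: " + site)
    return FAKE_WEB[site]


# The hand-maintained site list, including one site that is down this week.
site_list = list(FAKE_WEB) + ["http://example.org/missing"]
database: Dict[str, Page] = {}
stored = gather(site_list, fake_fetch, database)
```

Keying the store by address means that running the same pass again the following week refreshes existing entries rather than duplicating them, which is one simple way to keep the document count stable between visits.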
Once the user asks PoultryNet to search for a set of terms, PoultryNet matches those
terms against the text of all 50,000 documents in its database. This process, data
filtering, is automation at its finest: matching search terms against a pool of 50,000
documents takes seconds for PoultryNet but would take days for the average person.
The filtering process doesn’t end there, though. Once PoultryNet has found all the
matching documents, it sorts them and returns the most relevant
first. The most relevant documents are those with the most occurrences of
the user’s search term or terms in the title, address, and body of the document.
Without these two automatic processes, we wouldn’t have the sophisticated Internet
search engines we enjoy today.
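
The filtering and sorting just described can be illustrated with a short sketch. It assumes each stored document is a dictionary with `title`, `address`, and `body` fields, counts case-insensitive substring occurrences, and weights the three fields equally; those details, along with the function names and sample documents, are assumptions for illustration rather than a description of PoultryNet's internals.

```python
from typing import Dict, List

FIELDS = ("title", "address", "body")


def count_occurrences(text: str, term: str) -> int:
    """Count case-insensitive, non-overlapping occurrences of term in text."""
    return text.lower().count(term.lower())


def score(doc: Dict[str, str], terms: List[str]) -> int:
    """Total occurrences of every term across title, address, and body."""
    return sum(count_occurrences(doc[field], term)
               for term in terms
               for field in FIELDS)


def search(docs: List[Dict[str, str]], terms: List[str]) -> List[str]:
    """Return addresses of matching documents, most occurrences first."""
    scored = [(score(doc, terms), doc) for doc in docs]
    # Filtering: keep only documents that mention at least one term.
    matches = [(s, doc) for s, doc in scored if s > 0]
    # Sorting: highest occurrence count first (ties keep their input order).
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [doc["address"] for _, doc in matches]


# Hypothetical sample documents.
docs = [
    {"title": "Alligator farming",
     "address": "http://example.com/gators.html",
     "body": "Some farms raise alligators near a growout house."},
    {"title": "Ozone in growout houses",
     "address": "http://example.edu/ozone.html",
     "body": "Ozone can reduce ammonia. Ozone generators are installed "
             "in growout houses."},
    {"title": "Egg grading",
     "address": "http://example.com/grading.html",
     "body": "Grading eggs by size."},
]

results = search(docs, ["ozone", "growout"])
```

With these sample documents, the ozone page mentions the search terms far more often than the alligator page, so it is returned first, and the egg-grading page is filtered out entirely because it contains neither term.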
PoultryNet’s advanced search tool also assists its users by helping them communicate
with its search engine. Communicating with a search engine is similar to walking
up to a reference librarian, saying “double yolk,” and expecting him/her to produce
a number of documents pertaining to double yolks. Most reference librarians will
ask for clarification, which is what PoultryNet’s advanced search page does. The
goal of the advanced search interface is solely to draw clarification from the user
by offering a number of options: the user may type a free-text query, generate
other search terms, select Boolean operators (such as AND and OR) to connect search terms,
include part of the document’s address (for example, .edu to limit results to
educational institutions), or supply a range of dates within which the document was
last modified. The advanced search page helps the user communicate with the search
engine by guiding that person toward creating a search query that yields the best
search results.
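
To make the advanced search options concrete, here is a minimal sketch of how a query combining Boolean terms, an address fragment, and a last-modified date range might be checked against one document. The `Document` and `Query` classes, their field names, and the sample data are hypothetical, and the option to generate other search terms is not modeled.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Document:
    address: str
    body: str
    last_modified: date


@dataclass
class Query:
    terms: List[str]
    operator: str = "AND"                    # "AND" or "OR"
    address_contains: Optional[str] = None   # e.g. ".edu"
    modified_after: Optional[date] = None
    modified_before: Optional[date] = None


def matches(doc: Document, query: Query) -> bool:
    """Return True if the document satisfies every part of the query."""
    text = doc.body.lower()
    hits = [term.lower() in text for term in query.terms]
    if query.operator == "AND":
        terms_ok = all(hits)      # every term must appear
    elif query.operator == "OR":
        terms_ok = any(hits)      # at least one term must appear
    else:
        raise ValueError("operator must be 'AND' or 'OR'")
    if not terms_ok:
        return False
    if query.address_contains and query.address_contains not in doc.address:
        return False
    if query.modified_after and doc.last_modified < query.modified_after:
        return False
    if query.modified_before and doc.last_modified > query.modified_before:
        return False
    return True


# Hypothetical documents and dates, for illustration only.
library = [
    Document("http://example.edu/eggs.html",
             "Causes of double yolk eggs in young hens", date(1998, 5, 1)),
    Document("http://example.com/eggs.html",
             "Double yolk eggs for sale", date(1998, 6, 1)),
    Document("http://example.edu/old.html",
             "Double yolk frequency survey", date(1995, 1, 1)),
]

narrow = Query(["double", "yolk"], "AND",
               address_contains=".edu", modified_after=date(1997, 1, 1))
broad = Query(["ozone", "yolk"], "OR")

narrow_hits = [d.address for d in library if matches(d, narrow)]
broad_hits = [d.address for d in library if matches(d, broad)]
```

Each optional field narrows the result set only when the user fills it in, which mirrors the way a librarian's follow-up questions narrow a vague request: the bare query "yolk" matches all three sample documents, while the fully specified query matches only one.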
PoultryNet’s research focus continues to be narrowing the gap between the
information the industry professional needs and the information available on the
Internet. Just about any piece of information a user could ever need related to his/her
business is out there; the difficult part is finding and acquiring it.
Tom McKlin, research scientist in GTRI’s Intelligent Machines Branch, contributed
this article.