This Web Archiving Service video tutorial
will show you how to use the various
report options available to you.
The reports you can run are listed
under the "Reports" tab
when viewing the overview of an individual capture.
Each report has a brief description
explaining its function.
The most important report is
the "Crawl Log."
This report gives you detailed information
about your capture and can help you determine
any errors that were encountered
during the crawl.
You can search for a particular item
to see whether or not it was captured.
Use Ctrl-f on a PC or Command-f on a Mac
to search for the filename.
Then find the corresponding Heritrix
or HTTP status code
that is in the second column.
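
The same lookup could also be scripted. The Python below is a rough sketch, not part of the service itself: it scans crawl log text for a filename and reports the code from the second column. It assumes whitespace-separated fields with the URL in the fourth column, as in a standard Heritrix crawl log. The function name and the sample log lines are invented for illustration.

```python
def find_status_codes(log_text, filename):
    """Return (status_code, url) pairs for log lines whose URL contains filename."""
    matches = []
    for line in log_text.splitlines():
        fields = line.split()
        # Assumed layout: timestamp, status code, size, URL, ...
        if len(fields) < 4:
            continue
        try:
            status = int(fields[1])  # second column holds the status code
        except ValueError:
            continue  # skip lines that do not look like log entries
        url = fields[3]
        if filename in url:
            matches.append((status, url))
    return matches


# Invented sample lines, for illustration only.
SAMPLE_LOG = """\
2011-03-01T18:00:01.000Z 1 52 dns:www.example.org P http://www.example.org/ text/dns
2011-03-01T18:00:02.000Z 200 5120 http://www.example.org/report.pdf L http://www.example.org/ application/pdf
2011-03-01T18:00:03.000Z 404 310 http://www.example.org/missing.html L http://www.example.org/ text/html
"""

print(find_status_codes(SAMPLE_LOG, "report.pdf"))
```

An empty result means the filename does not appear in the log text that was searched.
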
The main status codes that you will encounter
are "1," which means that a DNS lookup was performed successfully;
"200," which indicates that the item was successfully captured;
"403," which tells you that the item requires
authorization to be viewed, and therefore
was not captured;
"404," which means that the item could not be found,
and therefore was not captured;
and "-9998," which means that there was a robots.txt exclusion
for this item, and it was not captured.
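
For quick reference, the codes just listed can be kept in a small lookup table. The Python below is a minimal sketch that records only the five codes described in this tutorial; the names `STATUS_MEANINGS`, `describe_status`, and `was_captured` are hypothetical, and any other code is reported as not covered here.

```python
# Meanings exactly as described in this tutorial; other codes exist
# but are deliberately left out.
STATUS_MEANINGS = {
    1: "successful DNS lookup performed",
    200: "item was successfully captured",
    403: "item requires authorization to be viewed; not captured",
    404: "item could not be found; not captured",
    -9998: "robots.txt exclusion for this item; not captured",
}


def describe_status(code):
    """Return the tutorial's description for a status code."""
    return STATUS_MEANINGS.get(code, "code not covered in this tutorial")


def was_captured(code):
    """Of the codes described here, only 200 means the item was captured."""
    return code == 200


for code in (200, 404, -9998):
    print(code, "->", describe_status(code))
```
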
These tools are great for browsing
and troubleshooting on your own,
but know that we're happy to work with you
to research any errors or problems
that you come across.
In addition to the "Reports" tab,
we also have a useful page of quality assurance tools.
Using the dropdown menu beneath the "Captures" tab
to choose "QA Tools" will take you to the list
of quality assurance tools.
These tools will help pinpoint areas
that are causing problems within your captures.
Each tool has a brief explanation.
For example, checking the list of redirected seed URLs
will clue you in as to which sites
may need updated URLs in order to continue
capturing correctly in the future.
In order to update a URL,
simply click on the "Edit Site" link,
and then change the seed URL information
on the "Edit Site" screen.
This has been a tutorial on analyzing reports.
Check out our additional tutorials
on analyzing capture results,
and comparing captures,
to better understand your capture results.
As always, if you have questions,
feel free to contact us at washelp@ucop.edu.