Cache.pm
Netscape::Cache - object class for accessing Netscape cache files
The object-oriented interface:
use Netscape::Cache;
$cache = new Netscape::Cache;
while (defined($url = $cache->next_url)) {
    print $url, "\n";
}
while (defined($o = $cache->next_object)) {
    print
        $o->{'URL'}, "\n",
        $o->{'CACHEFILE'}, "\n",
        $o->{'LAST_MODIFIED'}, "\n",
        $o->{'MIME_TYPE'}, "\n";
}
The TIEHASH interface:
use Netscape::Cache;
tie %cache, 'Netscape::Cache';
foreach (sort keys %cache) {
    print $cache{$_}->{URL}, "\n";
}
The Netscape::Cache module implements an object class for
accessing the filenames and URLs of the cache files used by the
Netscape web browser. You can access the cached URLs offline via Netscape
if you set Options->Network Preferences->Verify Document
to Never.
Note: You can also use the undocumented pseudo-URLs about:cache,
about:memory-cache and about:global-history to access your cache,
memory cache and history.
There is also an interface for using tied hashes.
$cache = new Netscape::Cache(-cachedir => "$ENV{HOME}/.netscape/cache");
This creates a new instance of the Netscape::Cache object class. The
-cachedir argument is optional. By default, the cache directory setting
is retrieved from ~/.netscape/preferences.
If the Netscape cache index file does not exist, a warning message
will be generated, and the constructor will return undef.
The Netscape::Cache class implements the following methods:
next_url, next_object, get_object, delete_object and rewind.
Each method is described separately below.
$url = $cache->next_url;
This method returns the next URL from the cache index. Unlike
Netscape::History, this method returns a string and not a
URI::URL-like object.
This method is faster than next_object, since it only evaluates the
URL of the cached file.
$cache->next_object;
This method returns the next URL from the cache index as a
Netscape::Cache::Object object. See below for accessing the components
(cache filename, content length, MIME type and more) of this object.
$cache->get_object($url);
This method returns the Netscape::Cache::Object object for a given URL.
If the URL is not in the cache index, the returned value is undefined.
The delete_object method deletes a URL from the cache index and removes
the related file from the cache.
WARNING: Do not use delete_object while in a next_object loop!
It is better to collect all objects to be deleted in a list and do the
deletion after the loop; otherwise you can get strange behaviour (e.g.
malloc panics).
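The collect-then-delete approach recommended in the warning can be sketched as follows. This is only a minimal sketch and not part of the module: the helper name delete_matching and its predicate argument are hypothetical, and it assumes that delete_object accepts the URL string of the entry to remove.

```perl
use strict;
use warnings;

# Hypothetical helper: delete every cache entry for which $wanted->($o)
# returns true. $cache is expected to behave like a Netscape::Cache object.
sub delete_matching {
    my ($cache, $wanted) = @_;

    # First pass: only collect the URLs, do not modify the cache yet.
    my @doomed;
    $cache->rewind;
    while (defined(my $o = $cache->next_object)) {
        push @doomed, $o->{'URL'} if $wanted->($o);
    }

    # Second pass: the iteration is finished, so deleting is safe now.
    # Assumption: delete_object takes the URL of the entry to delete.
    $cache->delete_object($_) for @doomed;

    # Report how many entries were selected for deletion.
    return scalar @doomed;
}
```

A call such as delete_matching($cache, sub { ($_[0]->{'MIME_TYPE'} || '') eq 'image/gif' }) would then select all cached GIF images during the loop and delete them only after the loop has ended.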
$cache->rewind();
This method is used to move the internal pointer of the cache index to
the first URL in the cache index. You don't need to bother with this
if you have just created the object, but it doesn't harm anything if
you do.
next_object and get_object return an object of the class
Netscape::Cache::Object. This object is simply a hash whose members
have to be accessed directly (no methods).
An example:
$o = $cache->next_object;
print $o->{'URL'}, "\n";
- URL: The URL of the cached object.
- CACHEFILE: The filename of the cached URL in the cache directory. To
  construct the full path, use the following expression, where $cache is a
  Netscape::Cache object and $o is a Netscape::Cache::Object object:
  $cache->{'CACHEDIR'} . "/" . $o->{'CACHEFILE'}
- CACHEFILE_SIZE: The size of the cache file.
- CONTENT_LENGTH: The length of the cache file as specified in the HTTP
  response header. In general, CACHEFILE_SIZE and CONTENT_LENGTH are equal.
  If you interrupt the transfer of a file, only the first part of the file
  is written to the cache, resulting in a CACHEFILE_SIZE smaller than
  CONTENT_LENGTH.
- LAST_MODIFIED: The date of last modification of the URL as Unix time
  (seconds since epoch). Use
  scalar localtime $o->{'LAST_MODIFIED'}
  to get a human-readable date.
- LAST_VISITED: The date of the last visit, also as Unix time.
- EXPIRE_DATE: If defined, the date of expiry for the URL.
- MIME_TYPE: The MIME type of the URL (e.g. text/html or image/jpeg).
- ENCODING: The encoding of the URL (e.g. x-gzip for gzipped data).
- CHARSET: The charset of the URL (e.g. iso-8859-1).
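Because the object is a plain hash, derived values can be computed with ordinary hash access. The following sketch shows three small helpers along those lines. The sub names (cache_path, is_incomplete, modified_string) are hypothetical and not part of Netscape::Cache; they only rely on the hash keys listed above.

```perl
use strict;
use warnings;

# Full path of the cached file: cache directory plus CACHEFILE.
sub cache_path {
    my ($cache, $o) = @_;
    return $cache->{'CACHEDIR'} . "/" . $o->{'CACHEFILE'};
}

# True (1) if fewer bytes are in the cache file than the HTTP response
# header announced, e.g. after an interrupted transfer. Returns 0 if
# either field is missing.
sub is_incomplete {
    my ($o) = @_;
    return 0 unless defined $o->{'CONTENT_LENGTH'}
                and defined $o->{'CACHEFILE_SIZE'};
    return $o->{'CACHEFILE_SIZE'} < $o->{'CONTENT_LENGTH'} ? 1 : 0;
}

# Human-readable modification date, or 'unknown' if the field is unset.
sub modified_string {
    my ($o) = @_;
    return defined $o->{'LAST_MODIFIED'}
        ? scalar(localtime($o->{'LAST_MODIFIED'}))
        : 'unknown';
}
```

Inside a next_object loop, these could be used as print cache_path($cache, $o), "\n" if is_incomplete($o); to list the files of partially downloaded documents.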
This program loops through all cache objects and prints an HTML-ified list.
The list is sorted by URL, but you can sort it by last visit date, MIME type
or size, too.
use Netscape::Cache;
$cache = new Netscape::Cache;
while ($o = $cache->next_object) {
    push(@url, $o);
}
# sort by name
@url = sort {$a->{'URL'} cmp $b->{'URL'}} @url;
# sort by visit time
#@url = sort {$b->{'LAST_VISITED'} <=> $a->{'LAST_VISITED'}} @url;
# sort by mime type
#@url = sort {$a->{'MIME_TYPE'} cmp $b->{'MIME_TYPE'}} @url;
# sort by size
#@url = sort {$b->{'CACHEFILE_SIZE'} <=> $a->{'CACHEFILE_SIZE'}} @url;
print "<ul>\n";
foreach (@url) {
    print
        "<li><a href=\"", $cache->{'CACHEDIR'}, "/", $_->{'CACHEFILE'}, "\">",
        $_->{'URL'}, "</a> ",
        scalar localtime $_->{'LAST_VISITED'}, "<br>",
        "type: ", $_->{'MIME_TYPE'},
        ", size: ", $_->{'CACHEFILE_SIZE'}, "\n";
}
print "</ul>\n";
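The same list of objects can also be aggregated instead of printed entry by entry. As a rough sketch, the hypothetical helper below (size_by_type is not part of the module) totals CACHEFILE_SIZE per MIME type for a list of Netscape::Cache::Object hashes such as the @url array built in the program.

```perl
use strict;
use warnings;

# Hypothetical helper: sum the cache file sizes per MIME type.
# Takes a list of hash references and returns a hash reference
# mapping MIME type => total size.
sub size_by_type {
    my %total;
    for my $o (@_) {
        # Entries without a MIME type are counted under 'unknown'.
        my $type = defined $o->{'MIME_TYPE'} ? $o->{'MIME_TYPE'} : 'unknown';
        $total{$type} += $o->{'CACHEFILE_SIZE'} || 0;
    }
    return \%total;
}
```

With $total = size_by_type(@url), a loop over sort keys %$total can then print one line per MIME type.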
The Netscape::Cache module examines the following environment variables:
- HOME: Home directory of the user, used to find Netscape's preferences
  ($HOME/.netscape). If HOME is not set, the home directory is retrieved
  from the passwd file.
There are still some unknown fields (_XXX_FLAG_{1,2,3}).
You can't use delete_object while looping with next_object. See the
question "What happens if I add or remove keys from a hash while iterating
over it?" in perlfaq4.
keys() or each() on the tied hash are slower than the object-oriented
equivalents next_object or next_url.
Netscape::History
Slaven Rezic <eserte@cs.tu-berlin.de>
Copyright (c) 1997 Slaven Rezic. All rights reserved.
This module is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.