WebService::Validator::HTML::W3C(3)  User Contributed Perl Documentation  WebService::Validator::HTML::W3C(3)
NAME
WebService::Validator::HTML::W3C - Access the W3C's online HTML validator
SYNOPSIS
use WebService::Validator::HTML::W3C;
my $v = WebService::Validator::HTML::W3C->new(
    detailed => 1
);
if ( $v->validate("http://www.example.com/") ) {
    if ( $v->is_valid ) {
        printf ("%s is valid\n", $v->uri);
    } else {
        printf ("%s is not valid\n", $v->uri);
        foreach my $error ( @{$v->errors} ) {
            printf("%s at line %d\n", $error->msg,
                                      $error->line);
        }
    }
} else {
    printf ("Failed to validate the website: %s\n", $v->validator_error);
}
DESCRIPTION
WebService::Validator::HTML::W3C provides access to the W3C's online
Markup validator. As well as reporting on whether a page is valid it
also provides access to a detailed list of the errors and where in the
validated document they occur.
METHODS
new
my $v = WebService::Validator::HTML::W3C->new();
Returns a new instance of the WebService::Validator::HTML::W3C object.
There are various options that can be set when creating the Validator
object like so:
my $v = WebService::Validator::HTML::W3C->new( http_timeout => 20 );
validator_uri
The URI of the validator to use. By default this accesses the W3C's
validator at http://validator.w3.org/check. If you have a local
installation of the validator (recommended if you wish to do a lot
of testing) or wish to use a validator at another location then
you can use this option. Please note that you need to use the full
path to the validator CGI.
ua
The user agent to use. Should be an LWP::UserAgent object or
something that provides the same interface. If this argument is
provided, the "http_timeout" and "proxy" arguments are ignored.
http_timeout
How long (in seconds) to wait for the HTTP connection to time out
when contacting the validator. By default this is 30 seconds.
detailed
This fetches the XML response from the validator in order to
provide information for the errors method. You should set this to
true if you intend to use the errors method.
proxy
An HTTP proxy to use when communicating with the validation
service.
output
Controls which output format is used. Can be either xml or soap12.
The default is soap12 as the XML format is deprecated and is likely
to be removed in the future.
The default will always work so unless you're using a locally
installed Validator you can safely ignore this.
validate
$v->validate( 'http://www.example.com/' );
Validate a URI. Returns 0 if the validation fails (e.g. if the validator
cannot be reached), otherwise 1.
validate_file
$v->validate_file( './file.html' );
Validate a file by uploading it to the W3C Validator. Note that this has
only been tested on Linux, so it may not work on non-Unix machines.
validate_markup
$v->validate_markup( $markup );
Validate a scalar containing HTML.
Alternate interface
You can also pass a hash in to specify what you wish to validate. This
is provided to ensure compatibility with the CSS validator module.
$v->validate( uri => 'http://example.com/' );
$v->validate( string => $markup );
$v->validate( file => './file.html' );
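A small dispatcher built on the hash form might look like the sketch below. The validate_any helper and its scheme-detection pattern are assumptions for illustration and are not part of the module; only the uri, string and file keys come from the interface shown above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): pick the key for the
# hash form of validate() from the shape of $input.
sub validate_any {
    my ( $v, $input ) = @_;

    # Something like "http://..." carries a scheme, so treat it as a URI.
    if ( $input =~ m{\A[a-z][a-z0-9+.-]*://}i ) {
        return $v->validate( uri => $input );
    }

    # Input without angle brackets or newlines that names an existing
    # file is uploaded as a file.
    if ( $input !~ /[<\n]/ && -f $input ) {
        return $v->validate( file => $input );
    }

    # Anything else is treated as markup held in a scalar.
    return $v->validate( string => $input );
}

# Usage:
#   my $v = WebService::Validator::HTML::W3C->new;
#   validate_any( $v, 'http://www.example.com/' );
#   validate_any( $v, './file.html' );
```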
is_valid
$v->is_valid;
Returns true (1) if the URI validated, otherwise 0.
uri
$v->uri();
Returns the URI of the last page on which validation succeeded.
num_errors
$num_errors = $v->num_errors();
Returns the number of errors that the validator encountered.
errorcount
Synonym for num_errors. Provided to match the CSS validator module's
interface.
errors
$errors = $v->errors();
foreach my $err ( @$errors ) {
    printf("line: %s, col: %s\n\terror: %s\n",
           $err->line, $err->col, $err->msg);
}
Returns an array ref of WebService::Validator::HTML::W3C::Error
objects. These have line, col and msg methods that return a line
number, a column in that line and the error that occurred at that
point.
Note that you need XML::XPath for this to work and you must have
initialised WebService::Validator::HTML::W3C with the detailed option.
If you have not set the detailed option, a warning will be issued, the
detailed option will be set, and a second request will be made to the
validator in order to fetch the required information.
If there was a problem processing the detailed information then this
method will return 0.
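Because errors can return 0 rather than an array ref, it may be worth checking the return value before dereferencing it. The following is a minimal sketch; the error_lines helper is hypothetical and not part of the module, and it relies only on the errors, line, col and msg methods described above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): turn the result of
# errors() into printable lines. $v is any object that offers the
# errors method described above.
sub error_lines {
    my ($v) = @_;

    my $errors = $v->errors;

    # errors() returns 0 when the detailed response could not be
    # processed, so check for an array ref before dereferencing.
    return () unless ref $errors eq 'ARRAY';

    return map {
        sprintf( 'line %s, col %s: %s', $_->line, $_->col, $_->msg )
    } @$errors;
}

# Usage, assuming $v was created with detailed => 1 and validate()
# has already succeeded:
#   print "$_\n" for error_lines($v);
```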
warnings
ONLY available with the SOAP output from the development Validator at
the moment.
$warnings = $v->warnings();
Works exactly like errors, except that it returns an array ref of
WebService::Validator::HTML::W3C::Warning objects.
validator_error
$error = $v->validator_error();
Returns a string indicating why validation may not have occurred. This
is not the reason that a webpage was invalid. It is the reason that no
meaningful information about the attempted validation could be
obtained. This is most likely to be an HTTP error.
Possible values are:
You need to supply a URI to validate
You didn't pass a URI to the validate method.
You need to supply a URI with a scheme
The URI you passed to validate didn't have a scheme on the front.
The W3C validator can't handle URIs like www.example.com but
instead needs URIs of the form http://www.example.com/.
Not a W3C Validator or Bad URI
The URI did not return the headers that
WebService::Validator::HTML::W3C relies on so it is likely that
there is not a W3C Validator at that URI. The other possibility is
that it didn't like the URI you provided. Sadly the Validator
doesn't give very useful feedback on this at the moment.
Could not contact validator
WebService::Validator::HTML::W3C could not establish a connection
to the URI.
Did not get a sensible result from the validator
Should never happen and most likely indicates a problem somewhere
but on the off chance that WebService::Validator::HTML::W3C is
unable to make sense of the response from the validator you'll get
this error.
Result format does not appear to be SOAP|XML
If you've asked for detailed results and the response from the
validator isn't in the expected format then you'll get this error.
Most likely to happen if you ask for SOAP output from a validator
that doesn't support that format.
You need to provide a uri, string or file to validate
You've passed in a hash (or in fact more than one argument) to
validate but the hash does not contain one of the three expected
keys.
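The distinction between a failed validation request and an invalid page can be sketched as follows. The page_status helper and its return strings are assumptions for illustration, not part of the module; it uses only the validate, is_valid and validator_error methods documented here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): separate "could not
# validate" from "the page is invalid".
sub page_status {
    my ( $v, $uri ) = @_;

    # A false return from validate() means no meaningful result was
    # obtained; validator_error() then explains why.
    unless ( $v->validate($uri) ) {
        return 'failed: ' . $v->validator_error;
    }

    # The validator was reached and gave an answer about the page.
    return $v->is_valid ? 'valid' : 'invalid';
}

# Usage:
#   my $v = WebService::Validator::HTML::W3C->new;
#   print page_status( $v, 'http://www.example.com/' ), "\n";
```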
validator_uri
$uri = $v->validator_uri();
$v->validator_uri('http://validator.w3.org/check');
Returns or sets the URI of the validator to use. Please note that you
need to use the full path to the validator CGI.
http_timeout
$timeout = $v->http_timeout();
$v->http_timeout(10);
Returns or sets the timeout for the HTTP request.
OTHER MODULES
Please note that there is also an official W3C module that is part of
the W3C::LogValidator distribution. However that module is not very
useful outside the constraints of that package.
WebService::Validator::HTML::W3C is meant as a more general way to
access the W3C Validator.
HTML::Validator uses nsgmls to validate against the W3C's DTDs. You have
to fetch the relevant DTDs and so on.
There is also the HTML::Parser based HTML::Lint which mostly checks for
known tags rather than XML/HTML validity.
WebService::Validator::CSS::W3C provides the same functionality as this
module for the W3C's CSS validator.
IMPORTANT
This module is not in any way associated with the W3C so please do not
report any problems with this module to them. Also please remember that
the online Validator is a shared resource so do not abuse it. This
means sleeping between requests. If you want to do a lot of testing
against it then please consider downloading and installing the
Validator software which is available from the W3C. Debian testing
users will also find that it is available via apt-get.
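One way to pause between requests is sketched below. The validate_politely helper is hypothetical and not part of the module, and the one-second default delay is an assumption rather than a limit published by the W3C; choose a delay that suits your own usage.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): validate several URIs
# with a pause between requests. The default delay of one second is an
# assumption, not a documented limit.
sub validate_politely {
    my ( $v, $uris, $delay ) = @_;
    $delay = 1 unless defined $delay;

    my %result;
    my $first = 1;
    for my $uri (@$uris) {

        # Pause between requests, but not before the first one.
        sleep $delay unless $first;
        $first = 0;

        $result{$uri} = $v->validate($uri)
            ? ( $v->is_valid ? 'valid' : 'invalid' )
            : 'failed: ' . $v->validator_error;
    }
    return \%result;
}

# Usage:
#   my $v      = WebService::Validator::HTML::W3C->new;
#   my $status = validate_politely( $v, \@uris, 5 );
#   print "$_: $status->{$_}\n" for sort keys %$status;
```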
BUGS
While the interface to the Validator is fairly stable it may be
updated. I will endeavour to track any changes with this module so
please check on CPAN for new versions if you find things break. Also
note that this module is only guaranteed to work with the currently
stable version of the validator. It will most likely work with any Beta
versions but don't rely on it.
If in doubt, please try to run the test suite before reporting bugs.
Note that in order to run tests against the validator service you will
need to have a connection to the internet and also set an environment
variable called TEST_AUTHOR.
That said I'm very happy to hear about bugs. All the more so if they
come with patches ;).
Please use http://rt.cpan.org/ for filing bug reports, and indeed
feature requests.
THANKS
To the various people on the code review ladder mailing list who
provided useful suggestions.
Carl Vincent provided a patch to allow for proxy support.
Chris Dolan provided a patch to allow for custom user agents.
Matt Ryder provided a patch for support of the explanations in the SOAP
output.
SUPPORT
Contact the author by email or via http://rt.cpan.org/.
AUTHOR
Struan Donald <struan@cpan.org>
<http://www.exo.org.uk/code/>
COPYRIGHT
Copyright (C) 2003-2008 Struan Donald. All rights reserved.
LICENSE
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
SEE ALSO
perl(1).
perl v5.14.1  2011-07-1  WebService::Validator::HTML::W3C(3)