If you are working with trusted web pages or HTML - your web
pages and your HTML - then there is little to worry about in terms
of security.
However, if you allow untrusted clients to provide you with web
URLs or raw HTML, then you need to consider whether a malicious user
could leverage specially crafted pages to create effects you had
not considered. The core ways in which this may happen are related
to redirection, embedded iframes and base tags. JavaScript can also
be exploited in ingenious ways to reveal information about your
server. Tracker or spy pixels may be used to identify content and
pass information in ways you do not want.
For example, you might not consider it important that you can
view a web site which tells you your IP address and location.
However if you allow a user to create a PDF of this web site then
it will reveal the IP address and location of the server that has
generated the document. While perhaps this might not be a problem,
it is possible that someone might find a way to access other
apparently innocuous information in a way that could ultimately be
exploited.
Suppose your server has access to an internal intranet. An
attacker knows about URLs on this intranet but cannot access them.
However if you allow them to pass a URL for conversion into a PDF,
then they can specify one of these internal URLs and get back a PDF
of the web pages.
So the core principle here is - do not pass untrusted URLs or
HTML to ABCpdf.
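Where accepting URLs from outside cannot be avoided entirely, one
common precaution is to vet each URL before it gets anywhere near the
conversion step. The following is a minimal sketch in Python, not part
of ABCpdf. The names is_url_safe_to_render and resolve_host are
hypothetical. It assumes you only ever want to render http and https
pages that live on publicly routable addresses, which rules out the
intranet scenario described above.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def resolve_host(host):
    """Return the set of IP addresses that a host name resolves to."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def is_url_safe_to_render(url, resolver=resolve_host):
    """Hypothetical pre-check: True only for http(s) URLs on public addresses."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return False  # malformed URL
    if parts.scheme.lower() not in ALLOWED_SCHEMES or not host:
        return False  # rejects file:, ftp:, data: and similar
    try:
        addresses = resolver(host)
        # is_global is False for loopback, private and link-local ranges
        return bool(addresses) and all(
            ipaddress.ip_address(a).is_global for a in addresses
        )
    except (OSError, ValueError):
        return False  # unresolvable host or unparseable address
```

A check like this is only a first filter. The host name is resolved
again when the page is actually fetched, and the page may redirect
after it loads, so the check does not replace the engine-level
restrictions described in the rest of this section.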
In previous versions of ABCpdf, JavaScript was disabled by
default and had to be manually enabled. However, JavaScript has
become critical to the display of most sites and, from Version 11
onwards, it is enabled by default. See the HtmlOptions.UseScript
property for details.
Each web engine operates inside a sandbox which limits the
access that web pages have. However, the extent to which the web
pages are limited is largely determined by where they are and what
they wish to access. In general, web pages are tied into the local
site from which they come. So any attempt to move outside this
location will normally be denied. You can read more about this on
the web by searching for details of the same-origin policy and how
it is used to prevent cross-site scripting.
Redirection is much more relevant when JavaScript is enabled as
this is the primary mechanism by which it is accomplished. There
are other mechanisms such as meta-refresh tags, but they are
uncommon and fragile. If you pass a URL to ABCpdf, it cannot tell
the difference between intended redirection and malicious
redirection. So it is important to control which types of
redirection are allowed.
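One way to reason about redirection is in terms of origins: the
scheme, host and port of each URL in the chain. The sketch below is
illustrative Python rather than ABCpdf functionality, and the function
names are hypothetical. It assumes you have already obtained the list
of URLs visited, for example by pre-fetching with an HTTP client that
has automatic redirects switched off. Redirects triggered by
JavaScript will not appear in a chain gathered that way.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a web origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port when none is given
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return scheme, host, port


def first_cross_origin_hop(chain):
    """Return the first URL in a redirect chain that leaves the
    starting origin, or None if every hop stays on it."""
    if not chain:
        return None
    start = origin(chain[0])
    for url in chain[1:]:
        if origin(url) != start:
            return url
    return None
```

A chain that ends at a 'file:///' URL or on a different host is
reported at the hop where it leaves the starting site, which is the
point at which you would normally refuse the conversion.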
Base tags are similar but do not use JavaScript - they allow a
web page in one location to reference resources in another. Most
typically this is used to allow a page on one site to reference
images or style sheets on a different web site. Obviously to
strictly enforce a same-origin policy here would rather defeat the
concept of the base tag. However there are security measures in
place to limit the extent to which they can be manipulated.
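If raw HTML does arrive from an outside source, it can be screened
for the constructs discussed here before any conversion is attempted.
The following Python sketch uses only the standard library. The class
and function names are hypothetical, and the list of tags is an
assumption about what you might want to flag, namely base tags,
iframes, scripts and meta-refresh tags.

```python
from html.parser import HTMLParser

RISKY_TAGS = {"base", "iframe", "script"}


class RiskyMarkupFinder(HTMLParser):
    """Collect tags that can redirect a page or pull in outside content."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us
        attributes = dict(attrs)
        if tag in RISKY_TAGS:
            self.findings.append(tag)
        elif tag == "meta":
            http_equiv = (attributes.get("http-equiv") or "").lower()
            if http_equiv == "refresh":
                self.findings.append("meta-refresh")


def find_risky_markup(html):
    """Return the risky constructs found in html, in document order."""
    finder = RiskyMarkupFinder()
    finder.feed(html)
    finder.close()
    return finder.findings
```

This is a screen rather than a sanitizer. It does not look at inline
event handlers such as onload attributes, so an empty result should
not be read as proof that the HTML is harmless.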
The main thing to consider is that the origin of files on the
local machine is the local machine. So, while the origin of web
pages strictly controls the access they can have, the origin of
files inherently limits the level of control that is possible.
Allowing untrusted users to provide web page URLs provides limited
exposure. Allowing them to provide files to be passed directly
to AddImageUrl via a 'file:///' protocol URL provides much more
exposure.
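If file names really must come from outside, a common precaution is
to confine them to a single directory you control before building any
'file:///' URL. The Python sketch below shows the idea. The function
name is_inside is hypothetical and the directory used in the usage
note is just an example path.

```python
from pathlib import Path


def is_inside(base_dir, candidate):
    """True if candidate, once fully resolved, lies within base_dir."""
    base = Path(base_dir).resolve()
    # Joining then resolving collapses '..' segments and symlinks;
    # an absolute candidate replaces base entirely and so fails the check
    target = (base / candidate).resolve()
    return target == base or base in target.parents
```

For instance, is_inside("/srv/html", "../secret.txt") is false because
the resolved path falls outside the base directory, so the request
would be refused before any file is read.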
As an alternative, if you have files containing HTML, read the
content and pass it in via the AddImageHtml method. Both the
ABCChrome and (to a lesser extent) the ABCGecko engines have barriers
in place to limit the access of web pages which are provided this
way. The origin of the HTML is not the local machine and a higher
level of defense is automatically in place. However it may also
mean that images on your local machine, referenced in your HTML,
will not load. See the table below for details.
The ABCChrome and ABCWebKit FireShield settings allow you to
specify dynamic permissions associated with file access. If you
need to work with files on the local machine, this is an excellent
way of ensuring that the HTML render process has access only to
those files that it requires. However this is a feature specific to
the ABCChrome and ABCWebKit engines. It does not exist for the
ABCGecko or MSHTML engines.
The file permissions associated with a web server process
provide yet another layer of defense. But while a sequence of
defensive layers may provide protection, it is quite difficult to
know the extent to which they will work in the face of a determined
adversary. It is best not to provide untrusted users with
mechanisms to allow them to insert unvetted HTML or JavaScript in
the first place.
There are some variations between engines and what they allow.
See below for details. Note that while engines should be completely
consistent here, it is our experience that this is not always the
case so you may not want to rely on these permissions as
absolutes.
| | ABCChrome | ABCGecko | MSHTML |
|---|---|---|---|
| Redirection from one page to another within the current web site | Allowed | Allowed | Allowed |
| Redirection from a page on one web site to a page on a different one | Denied | Denied | Denied |
| Redirection from a page on a web site to a local file | Denied | Denied | Denied |
| Redirection from one local file to another local file | Allowed | Allowed | Allowed |
| Redirection from a local file to a page on a web site | Denied | Allowed | Denied |
| Base tag in a local file referencing resources in a local directory | Allowed | Allowed | Allowed |
| Base tag in a local file referencing resources on a remote web site | Allowed | Allowed | Allowed |
| Base tag in a page on a remote web site referencing resources in a local directory | Denied | Denied | Allowed |
| Image in a local directory, referenced in HTML, provided via AddImageHtml method. | Denied | Denied | Allowed |
| Image in a local directory, referenced in HTML, provided via AddImageUrl and a 'file:///' scheme URL. | Allowed* | Denied | Allowed |

* If you enable the FireShield engine, the default rules will
deny access. Access can be re-enabled by adding FireShield rules to
explicitly allow it.