How to prevent XSS when transferring web page content within XML

Question

I have an ASP.NET web application that uses HTTPS and an XML-based request/response format. It seems a user's session cookie can be retrieved via XSS if a request or response is intercepted and malicious JavaScript is added as a payload.

Because the request/response is XML-based, the payload has to be HTML encoded. Here is an example:

&quot;&lt;iframe name=&quot;if(0){\u0061lert(1)}else{\u0061lert(document.cookie)}&quot; onload=&quot;eval(name)&quot;/&gt;

Is there any way to stop a malicious script from executing?

How can I identify a request/response that contains a malicious script? I noticed that all malicious scripts must contain ", &, ( and ), but user data may also contain these characters.
How do I stop such a script from running in users' browsers? I have already tried decoding every request/response to corrupt the script, which works, but my requests/responses contain user data with characters like &. Decoding those corrupts the whole XML document, which crashes the application.

It really depends on how your page is rendering its content. If it's a web application, the end product must be HTML. How is that XML transformed into HTML? This is the key to answering the question. - kravietz, Commented Jun 2, 2014 at 10:08
Kravietz, the XML is transformed to HTML on the client side. You don't expect me to put a security fix on the client side, do you? - KatariaA, Commented Jun 2, 2014 at 11:28
Why not? If it's a JavaScript engine that renders the XML into HTML, then you can do the escaping on the client side as well. For example, look at Strict Contextual Escaping in AngularJS. - kravietz, Commented Jun 2, 2014 at 19:13
As for the corrupt XML output, are you sending the possibly malformed data inside a CDATA block? - kravietz, Commented Jun 2, 2014 at 19:14
Kravietz, I was rather looking for a solution on the server side, but I will keep this in mind if I don't get a better solution. As for your other comment: the payload can be inserted after any XML node. It is an iframe with a name attribute and an onload event set to evaluate that name. - KatariaA, Commented Jun 2, 2014 at 19:36

atk · Accepted Answer · 2014-06-01 22:25:23Z

You are on the right track with XML encoding. You can't predict all possible malicious content, so encoding dynamic data to guarantee it is benign is the right approach. The issue you've run into is that things get really wonky if you try to handle output encoding for multiple output locations all at once. The right approach (and you're close) is to encode for the current output location only, and let other output locations encode for themselves.
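As a rough illustration of "encode for the current output location only", here is a minimal sketch of the server half, which writes XML and nothing else. It is written in Python purely as a language-neutral stand-in for the ASP.NET code; the element names `response` and `comment` and the sample value are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical user-supplied value containing XML metacharacters.
user_comment = 'Tom & Jerry <iframe onload="eval(name)"/>'

# Option 1: let an XML library build the message; it escapes text nodes itself.
root = ET.Element("response")
ET.SubElement(root, "comment").text = user_comment
xml_message = ET.tostring(root, encoding="unicode")

# Option 2: escape by hand when concatenating strings (text-node context only).
manual_message = (
    "<response><comment>" + escape(user_comment) + "</comment></response>"
)

# Both messages are well-formed, and parsing returns the original value unchanged.
parsed_from_library = ET.fromstring(xml_message).find("comment").text
parsed_from_manual = ET.fromstring(manual_message).find("comment").text
```

The point of the sketch is that user data containing & no longer breaks the XML, and the server makes no attempt to guess how the value will later be used in HTML.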

I think the thing that is confusing you is that you are thinking of the XML as though it is the raw output to the HTML. You need to consider the XML as part of your network communications protocol stack, and the HTML as separate output.

If I read this correctly, you have some server code that writes an XML message, and some client code that runs in the web browser, receives the XML message and writes to HTML, JavaScript, URL or CSS context. (If I read incorrectly, then you probably have a back end writing XML to a front end that is writing HTML, JavaScript, URLs and CSS. In this case, just consider the front end to be the client of the back end in the rest of this post.)

Therefore, the server is responsible for safely writing XML and the client is responsible for correctly writing HTML, JavaScript, URL and CSS data. This way, the code writing the output is responsible for the safe presentation of that output.
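The client half of that split might look roughly like the sketch below. In the setup described in the question this code would be JavaScript running in the browser; Python is used here only to keep the example self-contained, and the element name `name`, the URL path `/profile` and the sample message are assumptions made for illustration.

```python
import html
import xml.etree.ElementTree as ET
from urllib.parse import quote

# XML message as received from the server (already XML-encoded there).
xml_message = (
    "<response><name>&lt;iframe name=&quot;alert(document.cookie)&quot; "
    "onload=&quot;eval(name)&quot;/&gt;</name></response>"
)

# Step 1: parsing undoes the XML encoding and yields the raw value.
raw_name = ET.fromstring(xml_message).find("name").text

# Step 2: encode again, separately, for each context the value is written into.
html_body = "<p>Hello, " + html.escape(raw_name) + "</p>"           # HTML text
html_attr = '<input value="' + html.escape(raw_name, quote=True) + '">'  # attribute
url_param = "/profile?name=" + quote(raw_name, safe="")             # URL parameter
```

After parsing, `raw_name` is the attacker's literal markup again, which is why the XML encoding done on the server cannot be relied on for the HTML output. Each write re-encodes the value for its own context.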

atk, since the payload in the request is already encoded by the attacker, encoding it to Unicode/UTF-8 on the server side does not help. - KatariaA, Commented Jun 2, 2014 at 11:17
@AshishK, again, the server is only responsible for encoding the output it is writing. It is not writing HTML output. It is writing XML output. As such, the server is only responsible for XML encoding. - atk, Commented Jun 2, 2014 at 11:48
atk, are you suggesting I encode the response? I can't find any difference between HTML encoding and XML encoding. - KatariaA, Commented Jun 2, 2014 at 12:18
I am suggesting you encode independently in each location that writes output. You are correct that HTML encoding is the same as XML encoding - HTML is, in fact, an XML-based specification. When you only encode XML, you only decode for XML. If you want to encode for HTML, encode when you write the HTML and don't rely on prior encoding to do it for you. As you found, it doesn't. - atk, Commented Jun 2, 2014 at 22:20
Oh, also, use numeric entities and encode everything that is not alphanumeric. This will prevent many attack vectors that you haven't addressed yet. - atk, Commented Jun 2, 2014 at 22:24
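
The numeric-entity suggestion in the last comment could be sketched as follows. This is one possible reading of "encode everything that is not alphanumeric", shown in Python for illustration; the function name `encode_non_alnum` is hypothetical, and restricting "alphanumeric" to ASCII letters and digits is an assumption.

```python
def encode_non_alnum(value: str) -> str:
    """Replace every character that is not an ASCII letter or digit
    with a decimal numeric character reference such as &#38;."""
    return "".join(
        ch if (ch.isascii() and ch.isalnum()) else f"&#{ord(ch)};"
        for ch in value
    )


payload = '<iframe name="alert(document.cookie)" onload="eval(name)"/>'
encoded = encode_non_alnum(payload)

# Small check: the ampersand becomes its numeric reference.
print(encode_non_alnum("a&b"))  # a&#38;b
```

Because the encoded string contains no <, >, quotes or parentheses, it does not depend on a list of "dangerous" characters being complete, which is the situation the question ran into.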
