3
\$\begingroup\$

In my case I am saving a URL into my database via Eloquent Model:

namespace App\Models; use Illuminate\Database\Eloquent\Factories\HasFactory; use Illuminate\Database\Eloquent\Model; use Illuminate\Support\Str; class PageLink extends Model { use HasFactory; protected $table='links'; public function setUrlAttribute(string $url):void { $saneUrl = trim($url); $saneUrl = filter_var($url, FILTER_SANITIZE_URL); $saneUrl = preg_replace("/javascript:.*/","",$saneUrl); $saneUrl = htmlspecialchars($url); $saneUrl = Str::of($url)->stripTags()->toString(); /** * Regex was found into https://stackoverflow.com/a/36564776 * The idea is to extract the url from attack vectors such as: * * "http://example.com\"/> <IMG SRC= onmouseover=\"alert('xxs')\"/> * * This allows me to make the entry safe */ preg_match_all('/https?\:\/\/[^\" ]+/i', $saneUrl, $match); if(is_array($match[0]) && isset($match[0][0])){ $saneUrl=$match[0][0]; }elseif (is_string($match[0])){ $saneUrl=$match[0]; }else { $saneUrl=""; } $this->attributes['url']=$saneUrl; } } 

And in my blade view it is rendered as:

<a href="{{$link->url}}">{{$link->url}}</a> 

But also will be provided via AJAX API as well (that is under development). Therefore I want to sanitize my input upon a valid URL. I do not validate it here because my controller does this for me:

 use Illuminate\Contracts\Validation\Validator; use Illuminate\Http\JsonResponse; use Illuminate\Http\Request; Route::post("/url",function(Request $request){ $validator = Validator::create($request->all(),[ 'url'=>"required|url" ],[ 'url'=>"Δεν δώσατε έναν έγγυρο σύνδεσμο" ]); if($validator->fails()){ return new JsonResponse(['msg'=>$validator->error()],400); } $link = new Link(); $link->url = $request->get('url'); try{ $link->save(); }catch(\Exception $e){ report($e); return new JsonResponse(["msg"=>"Link fails to be saved"],500); } return new JsonResponse($link,201); }); 

And here is a unit test for the model:

namespace Tests\Feature\Unit\Model; use App\Models\Link; use Illuminate\Foundation\Testing\RefreshDatabase; use Illuminate\Foundation\Testing\WithFaker; use Tests\TestCase; class UserPageLinkTest extends TestCase { use RefreshDatabase; public static function urlProvider(): array { return [ ['http://example.com<script>alert("XSS");</script>'], ["http://example.com\" onfocus=\"alert(\"Hello\")\""], ["https://example.com\' onfocus='alert('Hello')'"], ["<script>alert(\"XSS\")</script>"], ["http://example.com\"/> <IMG SRC= onmouseover=\"alert('xxs')\"/>"], ["javascript:alert(String.fromCharCode(88,83,83))"] ]; } /** * @dataProvider urlProvider */ public function testUrlSanitizedFromXss($input): void { $model = new Link(); $model->url = $input; $saneUrl = $model->url; $this->assertNotEquals($saneUrl,$input); } } 

The idea behind the code is to make a minimal/lightweight CMS that manages some fixed pages.

In my case could I omit any other filter and just use a regular expression pattern and still be safe from XSS attacks in the input?

\$\endgroup\$
2
  • \$\begingroup\$I avoid Using OWASP ESApi because it is no longer maintained: github.com/OWASP/PHP-ESAPI\$\endgroup\$CommentedFeb 26, 2024 at 12:12
  • 1
    \$\begingroup\$Not worth a full answer, but note that your javascript regex would fail for inputs like JaVaScRiPt: or javajavascriptscript: or java<new line>script: This is why @YourCommonSense's answer is so important. Otherwise you'll be playing an endless cat-and-mouse game.\$\endgroup\$
    – Bee H.
    CommentedFeb 28, 2024 at 20:10

4 Answers 4

7
\$\begingroup\$

The biggest problem here is that you are going overboard with precautions. Security is important, but it must be logical.

  • What $saneUrl = preg_replace("/javascript:.*/","",$saneUrl); is for? Given below you will only allow https:// schema and therefore javascript:// won't make it in anyway?
  • What $saneUrl = htmlspecialchars($url); is for? Given you would apply this sanitization with {{$link->url}} anyway
  • What $saneUrl = Str::of($url)->stripTags()->toString(); is for? What HTML tags you expect to remain after applying htmlspecialchars? ;-)

Another problem is that you are trying to use deliberately wrong URLs. Even trying to "extract the URL from attack vectors" WHY? I am genuinely puzzled.

A generally accepted approach is to validate the data, and abort processing if validation fails. When you get an URL like this http://example.com\"/> <IMG SRC= onmouseover=\"alert('xxs')\"/>, it shouldn't be processed at all. Return a 422 error or whatever and call it a day.

In your place I would do it like this:

public function setUrlAttribute(string $url):void { $url = trim($url); $invalidUrl = !filter_var($url, FILTER_VALIDATE_URL); $invalidScheme = !preg_match('~^http~', $url); if ($invalidUrl || $invalidScheme) { throw WhateverLaravelThrowsInSuchCase("Invalid url"); } $this->attributes['url']=$url; } 

That's all.

\$\endgroup\$
3
  • \$\begingroup\$In other words if any furtger sanitization needed I should do it upon output. (Eg. If I need to display it as Json). Also you say that my controller will catch the http://example.com\"/> <IMG SRC= onmouseover=\"alert('xxs')\"/> atack vector right?\$\endgroup\$CommentedFeb 27, 2024 at 10:32
  • 2
    \$\begingroup\$Yes, exactly. Any further sanitization should be done upon output. in case of JSON output it must be json_encode(). In case of HTML output it must be htmlspecualchars() (which is implemented by {{}} in Blade). Yes, this controller will catch http://example.com\"/> <IMG SRC= onmouseover=\"alert('xxs')\"/> because it's invalid url, and filter_var($url, FILTER_VALIDATE_URL) will return false.\$\endgroup\$CommentedFeb 27, 2024 at 10:35
  • \$\begingroup\$Ok I note it dowm.\$\endgroup\$CommentedFeb 27, 2024 at 10:38
9
\$\begingroup\$

Beware of re-assigning a variable. In the setter method setUrlAttribute() the variable $saneUrl is over-written numerous times:

public function setUrlAttribute(string $url):void { $saneUrl = trim($url); $saneUrl = filter_var($url, FILTER_SANITIZE_URL); $saneUrl = preg_replace("/javascript:.*/","",$saneUrl); $saneUrl = htmlspecialchars($url); $saneUrl = Str::of($url)->stripTags()->toString(); 

Each line in that block except the third line uses $url on the right hand side of the assignment. Perhaps the intention for each line after the first is to reuse $saneUrl and if that is the case then each line before the last is useless because the last one uses the parameter variable instead of the the modified value from the previous lines.

\$\endgroup\$
    6
    \$\begingroup\$

    As some general feedback, I would say that your unit tests are lacking. It looks like you are just checking that your filter function modifies the input in some way? One of the main jobs of tests are to encode in an objective and testable way what the function is supposed to do. Each test input should have a matching expected value to compare to.

    In general, I'm pretty wary about "sanitizing" efforts. If your application does not like the input, it should reject it. Accepting user input but silently modifying it to something different just doesn't make sense. I think that goes doubly if you suspect that the intention is malicious. What is the scenario where you are willing to accept input from a user that you think is trying to subvert your application or attack your users?

    Maybe you have reasons for doing it this way, but if it was me I would use FILTER_VALIDATE_URL and have the function fail if it does not validate. You do have to make sure that you properly encode your urls when you write them out to avoid XSS, but that is always true.

    \$\endgroup\$
    1
    • \$\begingroup\$So upon model I should assume that my controller does not do any sort of input validation and try also to validate my input. But should thid be done after sanitizing it?\$\endgroup\$CommentedFeb 27, 2024 at 8:54
    -1
    \$\begingroup\$

    For validation, I think you can create a custom validator in Laravel.

    Depending on what you want to validate: host, path, query,... you can create and customize the validation you want, and you that with the validator and rules like other rules. Pretty neat!

    In that way, it would do the validation you want along other validation like required or url.

    \$\endgroup\$

      Start asking to get answers

      Find the answer to your question by asking.

      Ask question

      Explore related questions

      See similar questions with these tags.