New in Symfony 6.3: NoSuspiciousCharacters Constraint


Contributed by Mathieu in #49300.

Take a look at the following two domain names: "symfony.com" and "ѕymfony.com".
They look similar, but they are not the same. In the second domain, the first
letter is not s (the lowercase letter s in the Latin script) but ѕ
(a letter called dze in the Cyrillic script).
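To see the difference that the eye misses, you can print the code point of the first character of each string. This is only a minimal sketch: it assumes the mbstring extension is enabled, and `firstCodePoint()` is a hypothetical helper written for this illustration.

```php
<?php
// Both strings render almost identically, but their first characters differ.
$latin = 'symfony.com';
$spoofed = "\u{0455}ymfony.com"; // U+0455 CYRILLIC SMALL LETTER DZE

// Hypothetical helper: formats the first code point of a string as U+XXXX.
function firstCodePoint(string $value): string
{
    return sprintf('U+%04X', mb_ord(mb_substr($value, 0, 1, 'UTF-8'), 'UTF-8'));
}

var_dump($latin === $spoofed);       // bool(false)
echo firstCodePoint($latin), "\n";   // U+0073
echo firstCodePoint($spoofed), "\n"; // U+0455
```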
Using different but similar-looking characters is the basis of IDN homograph attacks,
a type of spoofing attack. That's why it is recommended to check user-submitted,
public-facing identifiers for suspicious characters in order to prevent such attacks.
However, given that Unicode defines more than 150,000 valid characters, this is
a daunting task. For example, did you know that there are invisible characters
such as zero-width spaces? And what about mixing 8 (digit eight in Latin script)
and ৪ (digit four in Bengali script)? Don't forget about combining characters
either, such as the "combining dot" that becomes invisible when placed after the
character i.
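The following sketch shows why these characters are hard to spot by eye: the strings look the same on screen, yet they differ in their bytes and code points. It uses only plain PHP string functions plus `mb_strlen()` (so it assumes the mbstring extension), and the variable names are illustrative.

```php
<?php
$clean = 'user';
$withZeroWidthSpace = "us\u{200B}er"; // contains U+200B ZERO WIDTH SPACE
$dottedI = "i\u{0307}";               // i followed by U+0307 COMBINING DOT ABOVE

// The two usernames are displayed identically but are different strings.
var_dump($clean === $withZeroWidthSpace); // bool(false)

// The zero-width space adds three bytes in UTF-8.
echo strlen($clean), ' ', strlen($withZeroWidthSpace), "\n"; // 4 7

// What looks like a single "i" is actually two code points.
echo mb_strlen($dottedI, 'UTF-8'), "\n"; // 2
```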
In Symfony 6.3, we're introducing a new NoSuspiciousCharacters constraint
so you can validate that strings don't contain any of these problematic characters.
It's based on the Spoofchecker class provided by the PHP intl extension and
it works as follows:

// src/Entity/User.php
namespace App\Entity;

use Symfony\Component\Validator\Constraints as Assert;
use Symfony\Component\Validator\Constraints\NoSuspiciousCharacters;

class User
{
    #[Assert\NoSuspiciousCharacters(
        // detects invisible characters (e.g. zero-width spaces) and digits
        // mixed from different numbering systems (e.g. 8 and ৪)
        checks: NoSuspiciousCharacters::CHECK_INVISIBLE | NoSuspiciousCharacters::CHECK_MIXED_NUMBERS,
        restrictionLevel: NoSuspiciousCharacters::RESTRICTION_LEVEL_HIGH,
    )]
    private string $username;

    #[Assert\NoSuspiciousCharacters]
    private string $blogUrl;

    // ...
}
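The constraint can also be applied to a plain value outside of an entity. The sketch below assumes a standalone script with symfony/validator installed through Composer and the intl extension enabled. `hasSuspiciousCharacters()` is a hypothetical helper, not part of Symfony, and it uses the constraint's default options.

```php
<?php
// Assumes `composer require symfony/validator` and the PHP intl extension.
require __DIR__.'/vendor/autoload.php';

use Symfony\Component\Validator\Constraints\NoSuspiciousCharacters;
use Symfony\Component\Validator\Validation;

// Hypothetical helper: true when the constraint reports at least one violation.
function hasSuspiciousCharacters(string $value): bool
{
    $validator = Validation::createValidator();
    $violations = $validator->validate($value, new NoSuspiciousCharacters());

    return count($violations) > 0;
}

// The second domain starts with U+0455 (Cyrillic dze) instead of a Latin s.
foreach (['symfony.com', "\u{0455}ymfony.com"] as $domain) {
    printf("%s => %s\n", $domain, hasSuspiciousCharacters($domain) ? 'suspicious' : 'ok');
}
```

The exact result for a given string can depend on the ICU version bundled with the intl extension and on the configured options, so treat the output as something to verify in your own environment.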

Read the NoSuspiciousCharacters constraint docs to learn more about its usage
and options.
