Mastering PHP 8.5’s New URI Extension: Safer URL Parsing and Manipulation
This article explains the hidden pitfalls of URL parsing, introduces PHP 8.5’s built‑in URI extension that complies with RFC 3986 and WHATWG standards, shows practical code examples, and recounts the open‑source development process behind the feature.
Overview
PHP 8.5 adds a built‑in URI extension that implements both RFC 3986 and WHATWG URL parsing rules and provides an immutable, fluent API for safely manipulating URL components.
Hidden pitfalls of URL parsing
Two major specifications govern URL parsing: the classic RFC 3986 (https://datatracker.ietf.org/doc/html/rfc3986) and the modern WHATWG URL spec (https://url.spec.whatwg.org/). They are not compatible, so mixing parsers can introduce subtle bugs.
For example, the input example.com/example/:8080/foo is interpreted differently by each:
RFC 3986 treats it as a valid relative URL.
WHATWG treats it as invalid without a base URL.
PHP's parse_url() follows neither specification: it reports host example.com, port 8080, and path /example/:8080/foo, so the port information appears twice in the result.
<?php
var_dump(parse_url('example.com/example/:8080/foo'));
/*
array(3) {
  ["host"]=> string(11) "example.com"
  ["port"]=> int(8080)
  ["path"]=> string(18) "/example/:8080/foo"
}
*/
?>
New API: Safe, powerful, easy to use
The extension provides one class per standard: Uri\Rfc3986\Uri parses according to RFC 3986, and Uri\WhatWg\Url parses according to the WHATWG URL spec. Objects are immutable, and the with* methods return modified clones instead of changing the original.
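The disagreement described earlier can be observed by handing the same input to both parsers. The following is a minimal sketch, assuming PHP 8.5 and the static parse() factories on Uri\Rfc3986\Uri and Uri\WhatWg\Url, which return null for invalid input instead of throwing; the variable names are illustrative only.

```php
<?php
use Uri\Rfc3986\Uri;
use Uri\WhatWg\Url;

$input = 'example.com/example/:8080/foo';

// RFC 3986: accepted as a relative reference. There is no authority
// component, so no host or port is reported.
$rfc = Uri::parse($input);
var_dump($rfc !== null);     // bool(true)
var_dump($rfc?->getHost());  // NULL
var_dump($rfc?->getPort());  // NULL

// WHATWG: a URL without a scheme needs a base URL, so parsing fails here.
$whatwg = Url::parse($input);
var_dump($whatwg);           // NULL
```

Neither result matches what parse_url() reports for the same string, which is why picking one standard and using it consistently matters.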
Typical workflow:
Instantiate the URI object.
Determine the default port for the scheme.
Remove the default port if present.
Output the normalized string or the raw original string.
<?php
use Uri\Rfc3986\Uri;
$url = new Uri('HTTPS://thephp.foundation:443/sp%6Fnsor/');
$defaultPortForScheme = match ($url->getScheme()) {
    'http' => 80,
    'https' => 443,
    'ssh' => 22,
    default => null,
};
if ($url->getPort() === $defaultPortForScheme) {
    $url = $url->withPort(null);
}
// Normalized output (lower‑case scheme, normalized percent‑encoding, no default port)
echo $url->toString(), PHP_EOL; // https://thephp.foundation/sponsor/
// Raw output preserves original case and encoding
echo $url->toRawString(), PHP_EOL; // HTTPS://thephp.foundation/sp%6Fnsor/
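?>
When toString() is called, the API normalizes the scheme case and the percent‑encoding. It does not drop default ports on its own; in this example the port is removed explicitly by the withPort(null) call. The toRawString() method returns the components as they were originally written, which is useful when the exact input form must be preserved.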
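Immutability is easiest to see by keeping a reference to the original object. The sketch below assumes PHP 8.5 and uses withScheme() alongside the withPort() method shown in the example; the port 8443 and the variable names are made up for illustration.

```php
<?php
use Uri\Rfc3986\Uri;

$original = new Uri('https://thephp.foundation:8443/sponsor/');

// Each with* call returns a new object, so calls can be chained fluently.
$changed = $original
    ->withScheme('http')
    ->withPort(8080);

// The original object is left untouched.
echo $original->toString(), PHP_EOL; // https://thephp.foundation:8443/sponsor/
echo $changed->toString(), PHP_EOL;  // http://thephp.foundation:8080/sponsor/
```

Because no call mutates its receiver, a URI object can be passed between functions without defensive copying.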
Implementation details
The feature originated from an RFC discussion (https://wiki.php.net/rfc/url_parsing_api) in June 2024 and was approved with a 30:1 vote in May 2025. The implementation reuses existing libraries:
uriparser (https://uriparser.github.io/) for RFC 3986 parsing.
Lexbor (https://lexbor.com/) for WHATWG parsing, the same library used in PHP 8.4’s DOM API.
These libraries were extended to support immutable PHP objects and the required “with‑er” methods.
Availability
PHP 8.5 RC 1 bundles the full URI extension, providing out‑of‑the‑box support for both URL standards. Developers can install PHP 8.5, import the Uri class, and use the demonstrated code to handle URLs safely and consistently.
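Code that must also run on older PHP versions can detect the new class before using it. This is only a sketch: the example URL is a placeholder, and the parse_url() fallback is a hypothetical stopgap that keeps the quirks described earlier.

```php
<?php
$input = 'https://example.com/docs';

if (class_exists(\Uri\Rfc3986\Uri::class)) {
    // PHP 8.5+: use the standards-compliant parser.
    $uri = new \Uri\Rfc3986\Uri($input);
    $host = $uri->getHost();
} else {
    // Older runtimes: fall back to parse_url(), which follows no standard.
    $host = parse_url($input, PHP_URL_HOST);
}

echo $host, PHP_EOL; // example.com
```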