The Rule of URLs
According to RFC 3986, URLs can only contain a limited set of characters from the US-ASCII character set. These include:
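- Alphanumeric: A-Z, a-z, 0-9
- Unreserved special characters: - _ . ~
RFC 3986 calls these the unreserved characters. Reserved characters such as /, ?, #, and & may also appear unencoded, but only when they act as delimiters. Any other character, and any reserved character used as ordinary data, must be percent-encoded.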
Why spaces are %20
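The space character is a delimiter in many systems. For example, the HTTP request line uses spaces to separate the method, the path, and the protocol version. If you have a URL like http://example.com/my file.txt, a server or parser might treat the URL as ending at my.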
To fix this, we replace the space with its ASCII value in hexadecimal, prefixed with a %. The hex value for space is 20.
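The replacement rule above can be sketched in a few lines of Python. This is a minimal illustration, not a production encoder: the names `UNRESERVED` and `percent_encode` are hypothetical, and the sketch assumes the input is encoded as UTF-8 before each byte is converted.

```python
# Characters that may appear in a URL without encoding (the unreserved set).
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "-_.~"
)


def percent_encode(text: str) -> str:
    """Percent-encode every byte that is not an unreserved character."""
    out = []
    for byte in text.encode("utf-8"):  # assumption: UTF-8 input encoding
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # "%" followed by the byte value as two uppercase hex digits
            out.append(f"%{byte:02X}")
    return "".join(out)


print(percent_encode("my file.txt"))  # my%20file.txt
```

The space is byte 32 in decimal, which is 20 in hexadecimal, so it comes out as %20. In real code, a standard library function such as `urllib.parse.quote` is usually a better choice than a hand-written encoder.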
Other Common Encodings
- ! becomes %21
- # becomes %23
- & becomes %26
- + becomes %2B
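These encodings can be checked with Python's standard library. The sketch below uses `urllib.parse.quote` with `safe=""` so that no extra characters are left unencoded; the variable name `encodings` is only for illustration.

```python
from urllib.parse import quote

# Map each character to its percent-encoded form.
# safe="" tells quote() not to exempt any additional characters such as "/".
encodings = {ch: quote(ch, safe="") for ch in "!#&+"}

for ch, encoded in encodings.items():
    print(ch, "becomes", encoded)
```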
Why You Should Care
Improper URL encoding is a common source of bugs. If you're building a search feature and a user types C++, your code needs to encode the + characters as %2B. Otherwise the server will decode each + as a space and receive C followed by two spaces, which is often trimmed to just C, because + conventionally represents a space in query strings.
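The C++ bug is easy to reproduce with Python's `urllib.parse` module. This is a small sketch; the query parameter name `q` is an assumption chosen for the example.

```python
from urllib.parse import parse_qs, urlencode

# Unencoded: each "+" is decoded as a space on the receiving side.
raw_query = "q=C++"
broken = parse_qs(raw_query)
print(broken)  # {'q': ['C  ']}

# Encoded: urlencode() turns each "+" into %2B, so the value survives.
safe_query = urlencode({"q": "C++"})
print(safe_query)  # q=C%2B%2B

fixed = parse_qs(safe_query)
print(fixed)  # {'q': ['C++']}
```

Building the query string with a library function, rather than concatenating strings by hand, avoids this class of bug for every special character and not just +.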
Need to encode or decode a string quickly? Try our URL Encoder.