1/****************************************************************************
2**
3** Copyright (C) 2016 The Qt Company Ltd.
4** Copyright (C) 2016 Intel Corporation.
5** Contact: https://www.qt.io/licensing/
6**
7** This file is part of the QtCore module of the Qt Toolkit.
8**
9** $QT_BEGIN_LICENSE:LGPL$
10** Commercial License Usage
11** Licensees holding valid commercial Qt licenses may use this file in
12** accordance with the commercial license agreement provided with the
13** Software or, alternatively, in accordance with the terms contained in
14** a written agreement between you and The Qt Company. For licensing terms
15** and conditions see https://www.qt.io/terms-conditions. For further
16** information use the contact form at https://www.qt.io/contact-us.
17**
18** GNU Lesser General Public License Usage
19** Alternatively, this file may be used under the terms of the GNU Lesser
20** General Public License version 3 as published by the Free Software
21** Foundation and appearing in the file LICENSE.LGPL3 included in the
22** packaging of this file. Please review the following information to
23** ensure the GNU Lesser General Public License version 3 requirements
24** will be met: https://www.gnu.org/licenses/lgpl-3.0.html.
25**
26** GNU General Public License Usage
27** Alternatively, this file may be used under the terms of the GNU
28** General Public License version 2.0 or (at your option) the GNU General
29** Public license version 3 or any later version approved by the KDE Free
30** Qt Foundation. The licenses are as published by the Free Software
31** Foundation and appearing in the file LICENSE.GPL2 and LICENSE.GPL3
32** included in the packaging of this file. Please review the following
33** information to ensure the GNU General Public License requirements will
34** be met: https://www.gnu.org/licenses/gpl-2.0.html and
35** https://www.gnu.org/licenses/gpl-3.0.html.
36**
37** $QT_END_LICENSE$
38**
39****************************************************************************/
40
41/*!
42 \class QUrl
43 \inmodule QtCore
44
45 \brief The QUrl class provides a convenient interface for working
46 with URLs.
47
48 \reentrant
49 \ingroup io
50 \ingroup network
51 \ingroup shared
52
53
54 It can parse and construct URLs in both encoded and unencoded
55 form. QUrl also has support for internationalized domain names
56 (IDNs).
57
58 The most common way to use QUrl is to initialize it via the
59 constructor by passing a QString. Otherwise, setUrl() can also
60 be used.
61
62 URLs can be represented in two forms: encoded or unencoded. The
63 unencoded representation is suitable for showing to users, but
64 the encoded representation is typically what you would send to
    a web server. For example, the unencoded URL
    "http://bühler.example.com/List of applicants.xml"
67 would be sent to the server as
68 "http://xn--bhler-kva.example.com/List%20of%20applicants.xml".
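
    As a rough illustration of the path half of that conversion, the sketch
    below percent-encodes every byte that is not in a small allow-list. The
    helper name percentEncodePath() and the allow-list are assumptions made
    for this example. It is not how QUrl is implemented, it does not convert
    the host name to its IDN (Punycode) form, and it expects unencoded input,
    so a literal "%" becomes "%25".

```cpp
#include <string>

// Hypothetical helper (not part of Qt): percent-encode every byte of a path
// that is not an RFC 3986 unreserved character, a sub-delim, or one of
// ":", "@" and "/". Non-ASCII text is assumed to be UTF-8 already.
std::string percentEncodePath(const std::string &path)
{
    static const char hexDigits[] = "0123456789ABCDEF";
    static const std::string literals =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
        "-._~!$&'()*+,;=:@/";

    std::string out;
    for (unsigned char c : path) {
        if (literals.find(static_cast<char>(c)) != std::string::npos) {
            out += static_cast<char>(c);      // allowed as a literal
        } else {
            out += '%';                       // everything else becomes %XX
            out += hexDigits[c >> 4];
            out += hexDigits[c & 0x0F];
        }
    }
    return out;
}
```

    With this helper, "/List of applicants.xml" becomes
    "/List%20of%20applicants.xml".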
69
70 A URL can also be constructed piece by piece by calling
71 setScheme(), setUserName(), setPassword(), setHost(), setPort(),
72 setPath(), setQuery() and setFragment(). Some convenience
73 functions are also available: setAuthority() sets the user name,
74 password, host and port. setUserInfo() sets the user name and
75 password at once.
76
77 Call isValid() to check if the URL is valid. This can be done at any point
    during the construction of a URL. If isValid() returns \c false, you should
79 clear() the URL before proceeding, or start over by parsing a new URL with
80 setUrl().
81
82 Constructing a query is particularly convenient through the use of the \l
83 QUrlQuery class and its methods QUrlQuery::setQueryItems(),
84 QUrlQuery::addQueryItem() and QUrlQuery::removeQueryItem(). Use
85 QUrlQuery::setQueryDelimiters() to customize the delimiters used for
86 generating the query string.
87
88 For the convenience of generating encoded URL strings or query
89 strings, there are two static functions called
90 fromPercentEncoding() and toPercentEncoding() which deal with
91 percent encoding and decoding of QString objects.
92
93 fromLocalFile() constructs a QUrl by parsing a local
94 file path. toLocalFile() converts a URL to a local file path.
95
96 The human readable representation of the URL is fetched with
97 toString(). This representation is appropriate for displaying a
98 URL to a user in unencoded form. The encoded form however, as
99 returned by toEncoded(), is for internal use, passing to web
100 servers, mail clients and so on. Both forms are technically correct
101 and represent the same URL unambiguously -- in fact, passing either
102 form to QUrl's constructor or to setUrl() will yield the same QUrl
103 object.
104
105 QUrl conforms to the URI specification from
106 \l{RFC 3986} (Uniform Resource Identifier: Generic Syntax), and includes
107 scheme extensions from \l{RFC 1738} (Uniform Resource Locators). Case
108 folding rules in QUrl conform to \l{RFC 3491} (Nameprep: A Stringprep
109 Profile for Internationalized Domain Names (IDN)). It is also compatible with the
110 \l{http://freedesktop.org/wiki/Specifications/file-uri-spec/}{file URI specification}
111 from freedesktop.org, provided that the locale encodes file names using
112 UTF-8 (required by IDN).
113
114 \section2 Relative URLs vs Relative Paths
115
116 Calling isRelative() will return whether or not the URL is relative.
117 A relative URL has no \l {scheme}. For example:
118
119 \snippet code/src_corelib_io_qurl.cpp 8
120
121 Notice that a URL can be absolute while containing a relative path, and
122 vice versa:
123
124 \snippet code/src_corelib_io_qurl.cpp 9
125
126 A relative URL can be resolved by passing it as an argument to resolved(),
127 which returns an absolute URL. isParentOf() is used for determining whether
128 one URL is a parent of another.
129
130 \section2 Error checking
131
    QUrl is capable of detecting many errors in URLs while parsing them or when
133 components of the URL are set with individual setter methods (like
134 setScheme(), setHost() or setPath()). If the parsing or setter function is
135 successful, any previously recorded error conditions will be discarded.
136
137 By default, QUrl setter methods operate in QUrl::TolerantMode, which means
138 they accept some common mistakes and mis-representation of data. An
139 alternate method of parsing is QUrl::StrictMode, which applies further
    checks. See QUrl::ParsingMode for a description of the differences between
    the parsing modes.
142
143 QUrl only checks for conformance with the URL specification. It does not
144 try to verify that high-level protocol URLs are in the format they are
145 expected to be by handlers elsewhere. For example, the following URIs are
146 all considered valid by QUrl, even if they do not make sense when used:
147
148 \list
149 \li "http:/filename.html"
150 \li "mailto://example.com"
151 \endlist
152
153 When the parser encounters an error, it signals the event by making
154 isValid() return false and toString() / toEncoded() return an empty string.
155 If it is necessary to show the user the reason why the URL failed to parse,
156 the error condition can be obtained from QUrl by calling errorString().
157 Note that this message is highly technical and may not make sense to
158 end-users.
159
160 QUrl is capable of recording only one error condition. If more than one
161 error is found, it is undefined which error is reported.
162
163 \section2 Character Conversions
164
165 Follow these rules to avoid erroneous character conversion when
166 dealing with URLs and strings:
167
168 \list
169 \li When creating a QString to contain a URL from a QByteArray or a
170 char*, always use QString::fromUtf8().
171 \endlist
172*/
173
174/*!
175 \enum QUrl::ParsingMode
176
177 The parsing mode controls the way QUrl parses strings.
178
179 \value TolerantMode QUrl will try to correct some common errors in URLs.
180 This mode is useful for parsing URLs coming from sources
181 not known to be strictly standards-conforming.
182
183 \value StrictMode Only valid URLs are accepted. This mode is useful for
184 general URL validation.
185
186 \value DecodedMode QUrl will interpret the URL component in the fully-decoded form,
187 where percent characters stand for themselves, not as the beginning
188 of a percent-encoded sequence. This mode is only valid for the
189 setters setting components of a URL; it is not permitted in
190 the QUrl constructor, in fromEncoded() or in setUrl().
191 For more information on this mode, see the documentation for
192 \l {QUrl::ComponentFormattingOption}{QUrl::FullyDecoded}.
193
194 In TolerantMode, the parser has the following behaviour:
195
196 \list
197
198 \li Spaces and "%20": unencoded space characters will be accepted and will
199 be treated as equivalent to "%20".
200
201 \li Single "%" characters: Any occurrences of a percent character "%" not
202 followed by exactly two hexadecimal characters (e.g., "13% coverage.html")
203 will be replaced by "%25". Note that one lone "%" character will trigger
204 the correction mode for all percent characters.
205
206 \li Reserved and unreserved characters: An encoded URL should only
207 contain a few characters as literals; all other characters should
208 be percent-encoded. In TolerantMode, these characters will be
209 accepted if they are found in the URL:
210 space / double-quote / "<" / ">" / "\" /
211 "^" / "`" / "{" / "|" / "}"
212 Those same characters can be decoded again by passing QUrl::DecodeReserved
213 to toString() or toEncoded(). In the getters of individual components,
214 those characters are often returned in decoded form.
215
216 \endlist
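
    The lone "%" rule in the list above can be sketched as follows.
    fixLonePercents() is a hypothetical helper written for this example, not
    a Qt function. It only mirrors the behavior described here: as soon as
    one "%" is not followed by two hexadecimal digits, every "%" in the input
    is rewritten as "%25".

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of Qt): if any '%' is not followed by two
// hexadecimal digits, rewrite every '%' in the input as "%25".
std::string fixLonePercents(const std::string &in)
{
    bool foundLonePercent = false;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (in[i] != '%')
            continue;
        const bool validEscape = i + 2 < in.size()
            && std::isxdigit(static_cast<unsigned char>(in[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(in[i + 2]));
        if (!validEscape) {
            foundLonePercent = true;
            break;
        }
    }
    if (!foundLonePercent)
        return in;                            // input was well-formed

    std::string out;
    for (char c : in) {
        if (c == '%')
            out += "%25";                     // correction applies to all '%'
        else
            out += c;
    }
    return out;
}
```

    For the example in the list, "13% coverage.html" would come out as
    "13%25 coverage.html", while "a%20b" would be left untouched.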
217
218 When in StrictMode, if a parsing error is found, isValid() will return \c
219 false and errorString() will return a message describing the error.
220 If more than one error is detected, it is undefined which error gets
221 reported.
222
223 Note that TolerantMode is not usually enough for parsing user input, which
224 often contains more errors and expectations than the parser can deal with.
225 When dealing with data coming directly from the user -- as opposed to data
226 coming from data-transfer sources, such as other programs -- it is
227 recommended to use fromUserInput().
228
229 \sa fromUserInput(), setUrl(), toString(), toEncoded(), QUrl::FormattingOptions
230*/
231
232/*!
233 \enum QUrl::UrlFormattingOption
234
235 The formatting options define how the URL is formatted when written out
236 as text.
237
238 \value None The format of the URL is unchanged.
239 \value RemoveScheme The scheme is removed from the URL.
240 \value RemovePassword Any password in the URL is removed.
241 \value RemoveUserInfo Any user information in the URL is removed.
242 \value RemovePort Any specified port is removed from the URL.
    \value RemoveAuthority  The authority (user info, host and port) is removed from the URL.
244 \value RemovePath The URL's path is removed, leaving only the scheme,
245 host address, and port (if present).
246 \value RemoveQuery The query part of the URL (following a '?' character)
247 is removed.
    \value RemoveFragment  The fragment part of the URL (following a '#' character)
            is removed.
249 \value RemoveFilename The filename (i.e. everything after the last '/' in the path) is removed.
250 The trailing '/' is kept, unless StripTrailingSlash is set.
251 Only valid if RemovePath is not set.
252 \value PreferLocalFile If the URL is a local file according to isLocalFile()
253 and contains no query or fragment, a local file path is returned.
254 \value StripTrailingSlash The trailing slash is removed from the path, if one is present.
255 \value NormalizePathSegments Modifies the path to remove redundant directory separators,
256 and to resolve "."s and ".."s (as far as possible). For non-local paths, adjacent
257 slashes are preserved.
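
    A minimal sketch of "." and ".." resolution, following the
    remove_dot_segments algorithm from RFC 3986, section 5.2.4, rather than
    QUrl's own code. removeDotSegments() is a hypothetical helper introduced
    for this example; it does not remove redundant directory separators and
    may differ from QUrl in corner cases.

```cpp
#include <string>

// Hypothetical helper (not part of Qt): remove "." and ".." segments from a
// path using the algorithm in RFC 3986, section 5.2.4.
std::string removeDotSegments(std::string in)
{
    // Drop the last segment of the output, including its leading "/".
    auto dropLastSegment = [](std::string &out) {
        const std::string::size_type slash = out.rfind('/');
        out.erase(slash == std::string::npos ? 0 : slash);
    };
    auto startsWith = [&in](const char *prefix) {
        return in.rfind(prefix, 0) == 0;
    };

    std::string out;
    while (!in.empty()) {
        if (startsWith("../")) {
            in.erase(0, 3);
        } else if (startsWith("./")) {
            in.erase(0, 2);
        } else if (startsWith("/./")) {
            in.erase(0, 2);                   // "/./x" becomes "/x"
        } else if (in == "/.") {
            in = "/";
        } else if (startsWith("/../")) {
            in.erase(0, 3);                   // "/../x" becomes "/x"
            dropLastSegment(out);
        } else if (in == "/..") {
            in = "/";
            dropLastSegment(out);
        } else if (in == "." || in == "..") {
            in.clear();
        } else {
            // Move the first segment (with its leading "/", if any) to the output.
            const std::string::size_type next = in.find('/', 1);
            out += in.substr(0, next);
            in.erase(0, next);
        }
    }
    return out;
}
```

    With this helper, "/a/b/c/./../../g" resolves to "/a/g".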
258
259 Note that the case folding rules in \l{RFC 3491}{Nameprep}, which QUrl
260 conforms to, require host names to always be converted to lower case,
    regardless of the QUrl::FormattingOptions used.
262
263 The options from QUrl::ComponentFormattingOptions are also possible.
264
265 \sa QUrl::ComponentFormattingOptions
266*/
267
268/*!
269 \enum QUrl::ComponentFormattingOption
270 \since 5.0
271
    The component formatting options define how the components of a URL will
273 be formatted when written out as text. They can be combined with the
274 options from QUrl::FormattingOptions when used in toString() and
275 toEncoded().
276
277 \value PrettyDecoded The component is returned in a "pretty form", with
278 most percent-encoded characters decoded. The exact
279 behavior of PrettyDecoded varies from component to
280 component and may also change from Qt release to Qt
281 release. This is the default.
282
283 \value EncodeSpaces Leave space characters in their encoded form ("%20").
284
285 \value EncodeUnicode Leave non-US-ASCII characters encoded in their UTF-8
286 percent-encoded form (e.g., "%C3%A9" for the U+00E9
287 codepoint, LATIN SMALL LETTER E WITH ACUTE).
288
289 \value EncodeDelimiters Leave certain delimiters in their encoded form, as
290 would appear in the URL when the full URL is
                            represented as text. The delimiters affected
                            by this option change from component to component.
293 This flag has no effect in toString() or toEncoded().
294
295 \value EncodeReserved Leave US-ASCII characters not permitted in the URL by
296 the specification in their encoded form. This is the
297 default on toString() and toEncoded().
298
299 \value DecodeReserved Decode the US-ASCII characters that the URL specification
300 does not allow to appear in the URL. This is the
301 default on the getters of individual components.
302
303 \value FullyEncoded Leave all characters in their properly-encoded form,
304 as this component would appear as part of a URL. When
305 used with toString(), this produces a fully-compliant
306 URL in QString form, exactly equal to the result of
                        toEncoded().
308
309 \value FullyDecoded Attempt to decode as much as possible. For individual
310 components of the URL, this decodes every percent
311 encoding sequence, including control characters (U+0000
312 to U+001F) and UTF-8 sequences found in percent-encoded form.
313 Use of this mode may cause data loss, see below for more information.
314
315 The values of EncodeReserved and DecodeReserved should not be used together
316 in one call. The behavior is undefined if that happens. They are provided
    as separate values because the behavior of the "pretty mode" with regard
    to reserved characters is different on certain components and especially on
    the full URL.
320
321 \section2 Full decoding
322
323 The FullyDecoded mode is similar to the behavior of the functions returning
324 QString in Qt 4.x, in that every character represents itself and never has
325 any special meaning. This is true even for the percent character ('%'),
326 which should be interpreted to mean a literal percent, not the beginning of
327 a percent-encoded sequence. The same actual character, in all other
328 decoding modes, is represented by the sequence "%25".
329
330 Whenever re-applying data obtained with QUrl::FullyDecoded into a QUrl,
331 care must be taken to use the QUrl::DecodedMode parameter to the setters
332 (like setPath() and setUserName()). Failure to do so may cause
333 re-interpretation of the percent character ('%') as the beginning of a
334 percent-encoded sequence.
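
    The sketch below shows why this matters, using plain std::string helpers
    instead of QUrl. percentDecode() and escapePercents() are hypothetical
    names introduced for this example. The second one mirrors the idea behind
    QUrl::DecodedMode, under the assumption that a literal "%" only needs to
    be protected as "%25" before the text is treated as encoded again.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of Qt): decode every well-formed "%XX".
std::string percentDecode(const std::string &in)
{
    auto hexValue = [](char c) {
        return std::isdigit(static_cast<unsigned char>(c))
            ? c - '0'
            : std::tolower(static_cast<unsigned char>(c)) - 'a' + 10;
    };

    std::string out;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()
            && std::isxdigit(static_cast<unsigned char>(in[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            out += static_cast<char>(hexValue(in[i + 1]) * 16 + hexValue(in[i + 2]));
            i += 2;                           // skip the two hex digits
        } else {
            out += in[i];
        }
    }
    return out;
}

// Hypothetical helper: protect literal '%' characters in fully decoded text
// so that a later decode returns the text unchanged.
std::string escapePercents(const std::string &decoded)
{
    std::string out;
    for (char c : decoded) {
        if (c == '%')
            out += "%25";
        else
            out += c;
    }
    return out;
}
```

    Here the encoded text "file%2541" decodes to "file%41". Decoding that
    result a second time would turn it into "fileA", whereas passing it
    through escapePercents() first brings back "file%41".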
335
336 This mode is quite useful when portions of a URL are used in a non-URL
337 context. For example, to extract the username, password or file paths in an
338 FTP client application, the FullyDecoded mode should be used.
339
340 This mode should be used with care, since there are two conditions that
341 cannot be reliably represented in the returned QString. They are:
342
343 \list
344 \li \b{Non-UTF-8 sequences:} URLs may contain sequences of
345 percent-encoded characters that do not form valid UTF-8 sequences. Since
346 URLs need to be decoded using UTF-8, any decoder failure will result in
347 the QString containing one or more replacement characters where the
348 sequence existed.
349
350 \li \b{Encoded delimiters:} URLs are also allowed to make a distinction
351 between a delimiter found in its literal form and its equivalent in
352 percent-encoded form. This is most commonly found in the query, but is
353 permitted in most parts of the URL.
354 \endlist
355
356 The following example illustrates the problem:
357
358 \snippet code/src_corelib_io_qurl.cpp 10
359
360 If the two URLs were used via HTTP GET, the interpretation by the web
361 server would probably be different. In the first case, it would interpret
362 as one parameter, with a key of "q" and value "a+=b&c". In the second
363 case, it would probably interpret as two parameters, one with a key of "q"
364 and value "a =b", and the second with a key "c" and no value.
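
    The two interpretations can be reproduced with a small stand-alone
    sketch of what a server might do. splitQuery() and decodeFormComponent()
    are hypothetical helpers, and the query strings "q=a%2B%3Db%26c" and
    "q=a+=b&c" are assumed for illustration. The sketch splits on "&" and on
    the first "=" before decoding, and treats "+" as a space, which is a
    common form-encoding convention and not something QUrl does.

```cpp
#include <cctype>
#include <string>
#include <utility>
#include <vector>

using QueryItems = std::vector<std::pair<std::string, std::string>>;

// Hypothetical helper: decode one key or value. A literal '+' becomes a
// space (form-encoding convention, assumed here); "%XX" becomes one byte.
std::string decodeFormComponent(const std::string &in)
{
    auto hexValue = [](char c) {
        return std::isdigit(static_cast<unsigned char>(c))
            ? c - '0'
            : std::tolower(static_cast<unsigned char>(c)) - 'a' + 10;
    };

    std::string out;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        const char c = in[i];
        if (c == '+') {
            out += ' ';
        } else if (c == '%' && i + 2 < in.size()
                   && std::isxdigit(static_cast<unsigned char>(in[i + 1]))
                   && std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            out += static_cast<char>(hexValue(in[i + 1]) * 16 + hexValue(in[i + 2]));
            i += 2;
        } else {
            out += c;
        }
    }
    return out;
}

// Hypothetical helper: split on the literal delimiters first, decode after.
QueryItems splitQuery(const std::string &query)
{
    QueryItems items;
    std::string::size_type start = 0;
    while (start <= query.size()) {
        std::string::size_type end = query.find('&', start);
        if (end == std::string::npos)
            end = query.size();
        const std::string item = query.substr(start, end - start);
        const std::string::size_type eq = item.find('=');
        if (eq == std::string::npos)
            items.emplace_back(decodeFormComponent(item), std::string());
        else
            items.emplace_back(decodeFormComponent(item.substr(0, eq)),
                               decodeFormComponent(item.substr(eq + 1)));
        start = end + 1;
    }
    return items;
}
```

    Under these assumptions, the encoded query yields the single pair
    ("q", "a+=b&c"), while the decoded one yields ("q", "a =b") and ("c", "").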
365
366 \sa QUrl::FormattingOptions
367*/
368
369/*!
370 \enum QUrl::UserInputResolutionOption
371 \since 5.4
372
373 The user input resolution options define how fromUserInput() should
374 interpret strings that could either be a relative path or the short
    form of an HTTP URL. For instance \c{file.pl} can be either a local file
376 or the URL \c{http://file.pl}.
377
378 \value DefaultResolution The default resolution mechanism is to check
379 whether a local file exists, in the working
380 directory given to fromUserInput, and only
381 return a local path in that case. Otherwise a URL
382 is assumed.
383 \value AssumeLocalFile This option makes fromUserInput() always return
384 a local path unless the input contains a scheme, such as
385 \c{http://file.pl}. This is useful for applications
386 such as text editors, which are able to create
387 the file if it doesn't exist.
388
389 \sa fromUserInput()
390*/
391
392/*!
393 \fn QUrl::QUrl(QUrl &&other)
394
395 Move-constructs a QUrl instance, making it point at the same
396 object that \a other was pointing to.
397
398 \since 5.2
399*/
400
401/*!
402 \fn QUrl &QUrl::operator=(QUrl &&other)
403
404 Move-assigns \a other to this QUrl instance.
405
406 \since 5.2
407*/
408
409#include "qurl.h"
410#include "qurl_p.h"
411#include "qplatformdefs.h"
412#include "qstring.h"
413#include "qstringlist.h"
414#include "qdebug.h"
415#include "qhash.h"
416#include "qdir.h" // for QDir::fromNativeSeparators
417#include "qdatastream.h"
418#if QT_CONFIG(topleveldomain)
419#include "qtldurl_p.h"
420#endif
421#include "private/qipaddress_p.h"
422#include "qurlquery.h"
423#include "private/qdir_p.h"
424#include <private/qmemory_p.h>
425
426QT_BEGIN_NAMESPACE
427
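inline static bool isHex(char c)
{
    // Test digits before case-folding: OR-ing 0x20 into the control
    // characters 0x10..0x19 would otherwise map them onto '0'..'9'.
    if (c >= '0' && c <= '9')
        return true;
    c |= 0x20; // fold 'A'..'F' onto 'a'..'f'
    return c >= 'a' && c <= 'f';
}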
433
434static inline QString ftpScheme()
435{
436 return QStringLiteral("ftp");
437}
438
439static inline QString fileScheme()
440{
441 return QStringLiteral("file");
442}
443
444static inline QString webDavScheme()
445{
446 return QStringLiteral("webdavs");
447}
448
449static inline QString webDavSslTag()
450{
451 return QStringLiteral("@SSL");
452}
453
454class QUrlPrivate
455{
456public:
457 enum Section : uchar {
458 Scheme = 0x01,
459 UserName = 0x02,
460 Password = 0x04,
461 UserInfo = UserName | Password,
462 Host = 0x08,
463 Port = 0x10,
464 Authority = UserInfo | Host | Port,
465 Path = 0x20,
466 Hierarchy = Authority | Path,
467 Query = 0x40,
468 Fragment = 0x80,
469 FullUrl = 0xff
470 };
471
472 enum Flags : uchar {
473 IsLocalFile = 0x01
474 };
475
476 enum ErrorCode {
477 // the high byte of the error code matches the Section
478 // the first item in each value must be the generic "Invalid xxx Error"
479 InvalidSchemeError = Scheme << 8,
480
481 InvalidUserNameError = UserName << 8,
482
483 InvalidPasswordError = Password << 8,
484
485 InvalidRegNameError = Host << 8,
486 InvalidIPv4AddressError,
487 InvalidIPv6AddressError,
488 InvalidCharacterInIPv6Error,
489 InvalidIPvFutureError,
490 HostMissingEndBracket,
491
492 InvalidPortError = Port << 8,
493 PortEmptyError,
494
495 InvalidPathError = Path << 8,
496
497 InvalidQueryError = Query << 8,
498
499 InvalidFragmentError = Fragment << 8,
500
501 // the following three cases are only possible in combination with
502 // presence/absence of the path, authority and scheme. See validityError().
503 AuthorityPresentAndPathIsRelative = Authority << 8 | Path << 8 | 0x10000,
504 AuthorityAbsentAndPathIsDoubleSlash,
505 RelativeUrlPathContainsColonBeforeSlash = Scheme << 8 | Authority << 8 | Path << 8 | 0x10000,
506
507 NoError = 0
508 };
509
510 struct Error {
511 QString source;
512 ErrorCode code;
513 int position;
514 };
515
516 QUrlPrivate();
517 QUrlPrivate(const QUrlPrivate &copy);
518 ~QUrlPrivate();
519
520 void parse(const QString &url, QUrl::ParsingMode parsingMode);
521 bool isEmpty() const
522 { return sectionIsPresent == 0 && port == -1 && path.isEmpty(); }
523
524 std::unique_ptr<Error> cloneError() const;
525 void clearError();
526 void setError(ErrorCode errorCode, const QString &source, int supplement = -1);
527 ErrorCode validityError(QString *source = nullptr, int *position = nullptr) const;
528 bool validateComponent(Section section, const QString &input, int begin, int end);
529 bool validateComponent(Section section, const QString &input)
530 { return validateComponent(section, input, 0, uint(input.length())); }
531
532 // no QString scheme() const;
533 void appendAuthority(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const;
534 void appendUserInfo(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const;
535 void appendUserName(QString &appendTo, QUrl::FormattingOptions options) const;
536 void appendPassword(QString &appendTo, QUrl::FormattingOptions options) const;
537 void appendHost(QString &appendTo, QUrl::FormattingOptions options) const;
538 void appendPath(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const;
539 void appendQuery(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const;
540 void appendFragment(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const;
541
542 // the "end" parameters are like STL iterators: they point to one past the last valid element
543 bool setScheme(const QString &value, int len, bool doSetError);
544 void setAuthority(const QString &auth, int from, int end, QUrl::ParsingMode mode);
545 void setUserInfo(const QString &userInfo, int from, int end);
546 void setUserName(const QString &value, int from, int end);
547 void setPassword(const QString &value, int from, int end);
548 bool setHost(const QString &value, int from, int end, QUrl::ParsingMode mode);
549 void setPath(const QString &value, int from, int end);
550 void setQuery(const QString &value, int from, int end);
551 void setFragment(const QString &value, int from, int end);
552
553 inline bool hasScheme() const { return sectionIsPresent & Scheme; }
554 inline bool hasAuthority() const { return sectionIsPresent & Authority; }
555 inline bool hasUserInfo() const { return sectionIsPresent & UserInfo; }
556 inline bool hasUserName() const { return sectionIsPresent & UserName; }
557 inline bool hasPassword() const { return sectionIsPresent & Password; }
558 inline bool hasHost() const { return sectionIsPresent & Host; }
559 inline bool hasPort() const { return port != -1; }
560 inline bool hasPath() const { return !path.isEmpty(); }
561 inline bool hasQuery() const { return sectionIsPresent & Query; }
562 inline bool hasFragment() const { return sectionIsPresent & Fragment; }
563
564 inline bool isLocalFile() const { return flags & IsLocalFile; }
565 QString toLocalFile(QUrl::FormattingOptions options) const;
566
567 QString mergePaths(const QString &relativePath) const;
568
569 QAtomicInt ref;
570 int port;
571
572 QString scheme;
573 QString userName;
574 QString password;
575 QString host;
576 QString path;
577 QString query;
578 QString fragment;
579
580 std::unique_ptr<Error> error;
581
582 // not used for:
583 // - Port (port == -1 means absence)
584 // - Path (there's no path delimiter, so we optimize its use out of existence)
585 // Schemes are never supposed to be empty, but we keep the flag anyway
586 uchar sectionIsPresent;
587 uchar flags;
588
589 // 32-bit: 2 bytes tail padding available
590 // 64-bit: 6 bytes tail padding available
591};
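
// Illustration only: the comment in ErrorCode says that the high byte of an
// error code matches the Section it belongs to. The sketch below shows that
// relationship with trimmed-down copies of the two enums. sectionBits() is a
// hypothetical helper that does not exist in QUrlPrivate, and the sketch
// ignores the combined codes that also set bit 0x10000.

```cpp
#include <cstdint>

// Trimmed-down, stand-alone copies of the enums, for illustration only.
enum Section : std::uint8_t {
    Scheme = 0x01,
    UserName = 0x02,
    Password = 0x04,
    Host = 0x08,
    Port = 0x10,
    Path = 0x20,
    Query = 0x40,
    Fragment = 0x80
};

enum ErrorCode {
    NoError = 0,
    InvalidRegNameError = Host << 8,   // first (generic) error of the Host section
    InvalidIPv4AddressError,           // 0x0801: still carries the Host bits
    InvalidIPv6AddressError,           // 0x0802
    InvalidPortError = Port << 8,
    PortEmptyError                     // 0x1001
};

// Hypothetical helper: recover the Section bits stored in bits 8..15.
constexpr unsigned sectionBits(ErrorCode code)
{
    return (static_cast<unsigned>(code) >> 8) & 0xFFu;
}
```

// Because the specific errors are numbered consecutively after the generic
// one, shifting right by eight bits recovers the section for all of them.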
592
593inline QUrlPrivate::QUrlPrivate()
594 : ref(1), port(-1),
595 sectionIsPresent(0),
596 flags(0)
597{
598}
599
600inline QUrlPrivate::QUrlPrivate(const QUrlPrivate &copy)
601 : ref(1), port(copy.port),
602 scheme(copy.scheme),
603 userName(copy.userName),
604 password(copy.password),
605 host(copy.host),
606 path(copy.path),
607 query(copy.query),
608 fragment(copy.fragment),
609 error(copy.cloneError()),
610 sectionIsPresent(copy.sectionIsPresent),
611 flags(copy.flags)
612{
613}
614
615inline QUrlPrivate::~QUrlPrivate()
616 = default;
617
618std::unique_ptr<QUrlPrivate::Error> QUrlPrivate::cloneError() const
619{
620 return error ? qt_make_unique<Error>(*error) : nullptr;
621}
622
623inline void QUrlPrivate::clearError()
624{
625 error.reset();
626}
627
628inline void QUrlPrivate::setError(ErrorCode errorCode, const QString &source, int supplement)
629{
630 if (error) {
631 // don't overwrite an error set in a previous section during parsing
632 return;
633 }
634 error = qt_make_unique<Error>();
635 error->code = errorCode;
636 error->source = source;
637 error->position = supplement;
638}
639
640// From RFC 3986, Appendix A Collected ABNF for URI
641// URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
642//[...]
643// scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
644//
645// authority = [ userinfo "@" ] host [ ":" port ]
646// userinfo = *( unreserved / pct-encoded / sub-delims / ":" )
647// host = IP-literal / IPv4address / reg-name
648// port = *DIGIT
649//[...]
650// reg-name = *( unreserved / pct-encoded / sub-delims )
651//[..]
652// pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
653//
654// query = *( pchar / "/" / "?" )
655//
656// fragment = *( pchar / "/" / "?" )
657//
658// pct-encoded = "%" HEXDIG HEXDIG
659//
660// unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
661// reserved = gen-delims / sub-delims
662// gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
663// sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
664// / "*" / "+" / "," / ";" / "="
665// the path component has a complex ABNF that basically boils down to
666// slash-separated segments of "pchar"
667
668// The above is the strict definition of the URL components and we mostly
// adhere to it, with a few exceptions. QUrl obeys the following behavior:
670// - percent-encoding sequences always use uppercase HEXDIG;
671// - unreserved characters are *always* decoded, no exceptions;
672// - the space character and bytes with the high bit set are controlled by
673// the EncodeSpaces and EncodeUnicode bits;
674// - control characters, the percent sign itself, and bytes with the high
675// bit set that don't form valid UTF-8 sequences are always encoded,
676// except in FullyDecoded mode;
677// - sub-delims are always left alone, except in FullyDecoded mode;
//  - gen-delims change behavior depending on which section of the URL (or
//    the entire URL) we're looking at; see below;
//  - characters not mentioned above, like "<" and ">", are usually
//    decoded in individual sections of the URL, but encoded when the full
//    URL is put together (this can change depending on the subjective
//    definition of "pretty").
684//
685// The behavior for the delimiters bears some explanation. The spec says in
686// section 2.2:
687// URIs that differ in the replacement of a reserved character with its
688// corresponding percent-encoded octet are not equivalent.
689// (note: QUrl API mistakenly uses the "reserved" term, so we will refer to
690// them here as "delimiters").
691//
692// For that reason, we cannot encode delimiters found in decoded form and we
693// cannot decode the ones found in encoded form if that would change the
694// interpretation. Conversely, we *can* perform the transformation if it would
695// not change the interpretation. From the last component of a URL to the first,
696// here are the gen-delims we can unambiguously transform when the field is
697// taken in isolation:
698// - fragment: none, since it's the last
699// - query: "#" is unambiguous
700// - path: "#" and "?" are unambiguous
701// - host: completely special but never ambiguous, see setHost() below.
702// - password: the "#", "?", "/", "[", "]" and "@" characters are unambiguous
703// - username: the "#", "?", "/", "[", "]", "@", and ":" characters are unambiguous
704// - scheme: doesn't accept any delimiter, see setScheme() below.
705//
706// Internally, QUrl stores each component in the format that corresponds to the
707// default mode (PrettyDecoded). It deviates from the "strict" FullyEncoded
708// mode in the following way:
709// - spaces are decoded
710// - valid UTF-8 sequences are decoded
711// - gen-delims that can be unambiguously transformed are decoded
712// - characters controlled by DecodeReserved are often decoded, though this behavior
713// can change depending on the subjective definition of "pretty"
714//
715// Note that the list of gen-delims that we can transform is different for the
716// user info (user name + password) and the authority (user info + host +
717// port).
718
719
720// list the recoding table modifications to be used with the recodeFromUser and
721// appendToUser functions, according to the rules above. Spaces and UTF-8
722// sequences are handled outside the tables.
723
724// the encodedXXX tables are run with the delimiters set to "leave" by default;
725// the decodedXXX tables are run with the delimiters set to "decode" by default
726// (except for the query, which doesn't use these functions)
727
728#define decode(x) ushort(x)
729#define leave(x) ushort(0x100 | (x))
730#define encode(x) ushort(0x200 | (x))
731
732static const ushort userNameInIsolation[] = {
733 decode(':'), // 0
734 decode('@'), // 1
735 decode(']'), // 2
736 decode('['), // 3
737 decode('/'), // 4
738 decode('?'), // 5
739 decode('#'), // 6
740
741 decode('"'), // 7
742 decode('<'),
743 decode('>'),
744 decode('^'),
745 decode('\\'),
746 decode('|'),
747 decode('{'),
748 decode('}'),
749 0
750};
751static const ushort * const passwordInIsolation = userNameInIsolation + 1;
752static const ushort * const pathInIsolation = userNameInIsolation + 5;
753static const ushort * const queryInIsolation = userNameInIsolation + 6;
754static const ushort * const fragmentInIsolation = userNameInIsolation + 7;
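
// Illustration only: a stand-alone sketch of how such a table can be read.
// It assumes that the low byte of each entry is the character and the high
// byte is the action, as the decode()/leave()/encode() macros suggest. The
// real consumer, qt_urlRecode(), is defined elsewhere and is not shown here.
// decodeEntry(), leaveEntry(), encodeEntry(), Action and actionFor() are
// hypothetical names introduced for this example.

```cpp
#include <cstdint>

// Stand-ins for the decode()/leave()/encode() macros.
constexpr std::uint16_t decodeEntry(char c)
{
    return static_cast<std::uint16_t>(static_cast<unsigned char>(c));
}
constexpr std::uint16_t leaveEntry(char c)
{
    return static_cast<std::uint16_t>(0x100 | static_cast<unsigned char>(c));
}
constexpr std::uint16_t encodeEntry(char c)
{
    return static_cast<std::uint16_t>(0x200 | static_cast<unsigned char>(c));
}

enum class Action { Decode = 0, Leave = 1, Encode = 2, NotListed = 3 };

// Scan a zero-terminated table for the character and return its action.
Action actionFor(const std::uint16_t *table, char c)
{
    for (; *table != 0; ++table) {
        if ((*table & 0xFF) == static_cast<unsigned char>(c))
            return static_cast<Action>(*table >> 8);
    }
    return Action::NotListed;  // the caller's default applies
}

// Shortened table in the style of userNameInUserInfo.
static const std::uint16_t userNameTable[] = {
    encodeEntry(':'),
    decodeEntry('@'),
    decodeEntry('/'),
    0
};

// Skipping the first entry gives a table that no longer mentions ':'.
static const std::uint16_t *const passwordTable = userNameTable + 1;
```

// The pointer offset is what lets several components share one array: each
// derived table simply starts after the delimiters it must not touch.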
755
756static const ushort userNameInUserInfo[] = {
757 encode(':'), // 0
758 decode('@'), // 1
759 decode(']'), // 2
760 decode('['), // 3
761 decode('/'), // 4
762 decode('?'), // 5
763 decode('#'), // 6
764
765 decode('"'), // 7
766 decode('<'),
767 decode('>'),
768 decode('^'),
769 decode('\\'),
770 decode('|'),
771 decode('{'),
772 decode('}'),
773 0
774};
775static const ushort * const passwordInUserInfo = userNameInUserInfo + 1;
776
777static const ushort userNameInAuthority[] = {
778 encode(':'), // 0
779 encode('@'), // 1
780 encode(']'), // 2
781 encode('['), // 3
782 decode('/'), // 4
783 decode('?'), // 5
784 decode('#'), // 6
785
786 decode('"'), // 7
787 decode('<'),
788 decode('>'),
789 decode('^'),
790 decode('\\'),
791 decode('|'),
792 decode('{'),
793 decode('}'),
794 0
795};
796static const ushort * const passwordInAuthority = userNameInAuthority + 1;
797
798static const ushort userNameInUrl[] = {
799 encode(':'), // 0
800 encode('@'), // 1
801 encode(']'), // 2
802 encode('['), // 3
803 encode('/'), // 4
804 encode('?'), // 5
805 encode('#'), // 6
806
807 // no need to list encode(x) for the other characters
808 0
809};
810static const ushort * const passwordInUrl = userNameInUrl + 1;
811static const ushort * const pathInUrl = userNameInUrl + 5;
812static const ushort * const queryInUrl = userNameInUrl + 6;
813static const ushort * const fragmentInUrl = userNameInUrl + 6;
814
815static inline void parseDecodedComponent(QString &data)
816{
817 data.replace(QLatin1Char('%'), QLatin1String("%25"));
818}
819
820static inline QString
821recodeFromUser(const QString &input, const ushort *actions, int from, int to)
822{
823 QString output;
824 const QChar *begin = input.constData() + from;
825 const QChar *end = input.constData() + to;
826 if (qt_urlRecode(output, begin, end, nullptr, actions))
827 return output;
828
829 return input.mid(from, to - from);
830}
831
832// appendXXXX functions: copy from the internal form to the external, user form.
833// the internal value is stored in its PrettyDecoded form, so that case is easy.
static inline void appendToUser(QString &appendTo, const QStringRef &value, QUrl::FormattingOptions options,
                                const ushort *actions)
{
    if (options == QUrl::PrettyDecoded) {
        appendTo += value;
        return;
    }

    if (!qt_urlRecode(appendTo, value.data(), value.end(), options, actions))
        appendTo += value;
}

static inline void appendToUser(QString &appendTo, const QString &value, QUrl::FormattingOptions options,
                                const ushort *actions)
{
    appendToUser(appendTo, QStringRef(&value), options, actions);
}


inline void QUrlPrivate::appendAuthority(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    if ((options & QUrl::RemoveUserInfo) != QUrl::RemoveUserInfo) {
        appendUserInfo(appendTo, options, appendingTo);

        // add '@' only if we added anything
        if (hasUserName() || (hasPassword() && (options & QUrl::RemovePassword) == 0))
            appendTo += QLatin1Char('@');
    }
    appendHost(appendTo, options);
    if (!(options & QUrl::RemovePort) && port != -1)
        appendTo += QLatin1Char(':') + QString::number(port);
}

inline void QUrlPrivate::appendUserInfo(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    if (Q_LIKELY(!hasUserInfo()))
        return;

    const ushort *userNameActions;
    const ushort *passwordActions;
    if (options & QUrl::EncodeDelimiters) {
        userNameActions = userNameInUrl;
        passwordActions = passwordInUrl;
    } else {
        switch (appendingTo) {
        case UserInfo:
            userNameActions = userNameInUserInfo;
            passwordActions = passwordInUserInfo;
            break;

        case Authority:
            userNameActions = userNameInAuthority;
            passwordActions = passwordInAuthority;
            break;

        case FullUrl:
            userNameActions = userNameInUrl;
            passwordActions = passwordInUrl;
            break;

        default:
            // can't happen
            Q_UNREACHABLE();
            break;
        }
    }

    if (!qt_urlRecode(appendTo, userName.constData(), userName.constEnd(), options, userNameActions))
        appendTo += userName;
    if (options & QUrl::RemovePassword || !hasPassword()) {
        return;
    } else {
        appendTo += QLatin1Char(':');
        if (!qt_urlRecode(appendTo, password.constData(), password.constEnd(), options, passwordActions))
            appendTo += password;
    }
}

inline void QUrlPrivate::appendUserName(QString &appendTo, QUrl::FormattingOptions options) const
{
    // only called from QUrl::userName()
    appendToUser(appendTo, userName, options,
                 options & QUrl::EncodeDelimiters ? userNameInUrl : userNameInIsolation);
}

inline void QUrlPrivate::appendPassword(QString &appendTo, QUrl::FormattingOptions options) const
{
    // only called from QUrl::password()
    appendToUser(appendTo, password, options,
                 options & QUrl::EncodeDelimiters ? passwordInUrl : passwordInIsolation);
}

inline void QUrlPrivate::appendPath(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    QString thePath = path;
    if (options & QUrl::NormalizePathSegments) {
        thePath = qt_normalizePathSegments(path, isLocalFile() ? QDirPrivate::DefaultNormalization : QDirPrivate::RemotePath);
    }

    QStringRef thePathRef(&thePath);
    if (options & QUrl::RemoveFilename) {
        const int slash = path.lastIndexOf(QLatin1Char('/'));
        if (slash == -1)
            return;
        thePathRef = path.leftRef(slash + 1);
    }
    // check if we need to remove trailing slashes
    if (options & QUrl::StripTrailingSlash) {
        while (thePathRef.length() > 1 && thePathRef.endsWith(QLatin1Char('/')))
            thePathRef.chop(1);
    }

    appendToUser(appendTo, thePathRef, options,
                 appendingTo == FullUrl || options & QUrl::EncodeDelimiters ? pathInUrl : pathInIsolation);
}

inline void QUrlPrivate::appendFragment(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    appendToUser(appendTo, fragment, options,
                 options & QUrl::EncodeDelimiters ? fragmentInUrl :
                 appendingTo == FullUrl ? nullptr : fragmentInIsolation);
}

inline void QUrlPrivate::appendQuery(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const
{
    appendToUser(appendTo, query, options,
                 appendingTo == FullUrl || options & QUrl::EncodeDelimiters ? queryInUrl : queryInIsolation);
}

// setXXX functions

inline bool QUrlPrivate::setScheme(const QString &value, int len, bool doSetError)
{
    // schemes are strictly RFC-compliant:
    //    scheme        = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
    // we also lowercase the scheme

    // schemes in URLs are not allowed to be empty, but they can be in
    // "Relative URIs" which QUrl also supports. QUrl::setScheme does
    // not call us with len == 0, so this can only be from parse()
    scheme.clear();
    if (len == 0)
        return false;

    sectionIsPresent |= Scheme;

    // validate it:
    int needsLowercasing = -1;
    const ushort *p = reinterpret_cast<const ushort *>(value.constData());
    for (int i = 0; i < len; ++i) {
        if (p[i] >= 'a' && p[i] <= 'z')
            continue;
        if (p[i] >= 'A' && p[i] <= 'Z') {
            needsLowercasing = i;
            continue;
        }
        if (i) {
            if (p[i] >= '0' && p[i] <= '9')
                continue;
            if (p[i] == '+' || p[i] == '-' || p[i] == '.')
                continue;
        }

        // found something else
        // don't call setError needlessly:
        // if we've been called from parse(), it will try to recover
        if (doSetError)
            setError(InvalidSchemeError, value, i);
        return false;
    }

    scheme = value.left(len);

    if (needsLowercasing != -1) {
        // schemes are ASCII only, so we don't need the full Unicode toLower
        QChar *schemeData = scheme.data(); // force detaching here
        for (int i = needsLowercasing; i >= 0; --i) {
            ushort c = schemeData[i].unicode();
            if (c >= 'A' && c <= 'Z')
                schemeData[i] = QChar(c + 0x20);
        }
    }

    // did we set to the file protocol?
    if (scheme == fileScheme()
#ifdef Q_OS_WIN
        || scheme == webDavScheme()
#endif
       ) {
        flags |= IsLocalFile;
    } else {
        flags &= ~IsLocalFile;
    }
    return true;
}
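/*
    Illustrative sketch, not part of Qt: the validation and lowercasing rule
    applied by setScheme() above, restated on std::string so that it can be
    compiled on its own. The name normalizeScheme and its signature are
    assumptions made for this example; error reporting and the IsLocalFile
    flag are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: checks `scheme` against
//    scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
// and, on success, stores the lowercased scheme in `out`.
bool normalizeScheme(const std::string &scheme, std::string &out)
{
    out.clear();
    if (scheme.empty())
        return false;

    for (std::size_t i = 0; i < scheme.size(); ++i) {
        const char c = scheme[i];
        if (c >= 'a' && c <= 'z') {
            out += c;
        } else if (c >= 'A' && c <= 'Z') {
            out += static_cast<char>(c + 0x20); // ASCII-only lowercasing
        } else if (i > 0 && ((c >= '0' && c <= '9') || c == '+' || c == '-' || c == '.')) {
            out += c; // digits and "+-." are allowed after the first character
        } else {
            out.clear();
            return false; // invalid character at position i
        }
    }
    return true;
}
```

    For example, "HTTP" is accepted and becomes "http", while "1http" is
    rejected because a scheme must start with a letter.
*/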

inline void QUrlPrivate::setAuthority(const QString &auth, int from, int end, QUrl::ParsingMode mode)
{
    sectionIsPresent &= ~Authority;
    sectionIsPresent |= Host;
    port = -1;

    // we never actually _loop_
    while (from != end) {
        int userInfoIndex = auth.indexOf(QLatin1Char('@'), from);
        if (uint(userInfoIndex) < uint(end)) {
            setUserInfo(auth, from, userInfoIndex);
            if (mode == QUrl::StrictMode && !validateComponent(UserInfo, auth, from, userInfoIndex))
                break;
            from = userInfoIndex + 1;
        }

        int colonIndex = auth.lastIndexOf(QLatin1Char(':'), end - 1);
        if (colonIndex < from)
            colonIndex = -1;

        if (uint(colonIndex) < uint(end)) {
            if (auth.at(from).unicode() == '[') {
                // check if colonIndex isn't inside the "[...]" part
                int closingBracket = auth.indexOf(QLatin1Char(']'), from);
                if (uint(closingBracket) > uint(colonIndex))
                    colonIndex = -1;
            }
        }

        if (uint(colonIndex) < uint(end) - 1) {
            // found a colon with digits after it
            unsigned long x = 0;
            for (int i = colonIndex + 1; i < end; ++i) {
                ushort c = auth.at(i).unicode();
                if (c >= '0' && c <= '9') {
                    x *= 10;
                    x += c - '0';
                } else {
                    x = ulong(-1); // x != ushort(x)
                    break;
                }
            }
            if (x == ushort(x)) {
                port = ushort(x);
            } else {
                setError(InvalidPortError, auth, colonIndex + 1);
                if (mode == QUrl::StrictMode)
                    break;
            }
        }

        setHost(auth, from, qMin<uint>(end, colonIndex), mode);
        if (mode == QUrl::StrictMode && !validateComponent(Host, auth, from, qMin<uint>(end, colonIndex))) {
            // clear host too
            sectionIsPresent &= ~Authority;
            break;
        }

        // success
        return;
    }
    // clear all sections but host
    sectionIsPresent &= ~Authority | Host;
    userName.clear();
    password.clear();
    host.clear();
    port = -1;
}
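/*
    Illustrative sketch, not part of Qt: the port check performed inside
    setAuthority() above, restated on std::string. The name parsePort is an
    assumption made for this example. Unlike the loop above, this variant
    stops as soon as the value exceeds 16 bits, so very long digit strings
    cannot wrap around; that early exit is a choice of the sketch, not a
    description of the Qt code.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: parses the text that follows the last ':' of an
// authority. Returns the port (0..65535), or -1 if the text contains a
// non-digit or the value does not fit in 16 bits. The caller is assumed to
// pass at least one character.
int parsePort(const std::string &digits)
{
    unsigned long x = 0;
    for (char c : digits) {
        if (c < '0' || c > '9')
            return -1;
        x = x * 10 + static_cast<unsigned long>(c - '0');
        if (x > 0xFFFF)
            return -1; // out of range for a 16-bit port
    }
    return static_cast<int>(x);
}
```
*/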

inline void QUrlPrivate::setUserInfo(const QString &userInfo, int from, int end)
{
    int delimIndex = userInfo.indexOf(QLatin1Char(':'), from);
    setUserName(userInfo, from, qMin<uint>(delimIndex, end));

    if (uint(delimIndex) >= uint(end)) {
        password.clear();
        sectionIsPresent &= ~Password;
    } else {
        setPassword(userInfo, delimIndex + 1, end);
    }
}

inline void QUrlPrivate::setUserName(const QString &value, int from, int end)
{
    sectionIsPresent |= UserName;
    userName = recodeFromUser(value, userNameInIsolation, from, end);
}

inline void QUrlPrivate::setPassword(const QString &value, int from, int end)
{
    sectionIsPresent |= Password;
    password = recodeFromUser(value, passwordInIsolation, from, end);
}

inline void QUrlPrivate::setPath(const QString &value, int from, int end)
{
    // sectionIsPresent |= Path; // not used, save some cycles
    path = recodeFromUser(value, pathInIsolation, from, end);
}

inline void QUrlPrivate::setFragment(const QString &value, int from, int end)
{
    sectionIsPresent |= Fragment;
    fragment = recodeFromUser(value, fragmentInIsolation, from, end);
}

inline void QUrlPrivate::setQuery(const QString &value, int from, int iend)
{
    sectionIsPresent |= Query;
    query = recodeFromUser(value, queryInIsolation, from, iend);
}

// Host handling
// The RFC says the host is:
//    host          = IP-literal / IPv4address / reg-name
//    IP-literal    = "[" ( IPv6address / IPvFuture ) "]"
//    IPvFuture     = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
//    [a strict definition of IPv6Address and IPv4Address]
//    reg-name      = *( unreserved / pct-encoded / sub-delims )
//
// We deviate from the standard in all but IPvFuture. For IPvFuture we accept
// and store only exactly what the RFC says we should. No percent-encoding is
// permitted in this field, so Unicode characters and space aren't either.
//
// For IPv4 addresses, we accept broken addresses like inet_aton does (that is,
// fewer than three dots). However, we correct the address to the proper form
// and store the corrected address. After correction, it complies with the RFC
// and is composed exclusively of unreserved characters.
//
// For IPv6 addresses, we accept addresses including trailing (embedded) IPv4
// addresses, the so-called v4-compat and v4-mapped addresses. We also store
// those addresses like that in the hostname field, which violates the spec.
// IPv6 hosts are stored with the square brackets in the QString and require
// no transformation of any kind.
//
// As for registered names, it's the other way around: we accept only valid
// hostnames as specified by STD 3 and IDNA. That means everything we accept is
// valid in the RFC definition above, but there are many valid reg-names
// according to the RFC that we do not accept in the name of security. Since we
// do accept IDNA, reg-names are subject to ACE encoding and decoding, which is
// specified by the DecodeUnicode flag. The hostname is stored in its Unicode form.

inline void QUrlPrivate::appendHost(QString &appendTo, QUrl::FormattingOptions options) const
{
    if (host.isEmpty())
        return;
    if (host.at(0).unicode() == '[') {
        // IPv6 addresses might contain a zone-id which needs to be recoded
        if (options != 0)
            if (qt_urlRecode(appendTo, host.constBegin(), host.constEnd(), options, nullptr))
                return;
        appendTo += host;
    } else {
        // this is either an IPv4Address or a reg-name
        // if it is a reg-name, it is already stored in Unicode form
        if (options & QUrl::EncodeUnicode && !(options & 0x4000000))
            appendTo += qt_ACE_do(host, ToAceOnly, AllowLeadingDot);
        else
            appendTo += host;
    }
}

// the whole IPvFuture is passed and parsed here, including brackets;
// returns null if the parsing was successful, or the QChar of the first failure
static const QChar *parseIpFuture(QString &host, const QChar *begin, const QChar *end, QUrl::ParsingMode mode)
{
    //    IPvFuture     = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
    static const char acceptable[] =
            "!$&'()*+,;=" // sub-delims
            ":"           // ":"
            "-._~";       // unreserved

    // the brackets and the "v" have been checked
    const QChar *const origBegin = begin;
    if (begin[3].unicode() != '.')
        return &begin[3];
    if ((begin[2].unicode() >= 'A' && begin[2].unicode() <= 'F') ||
            (begin[2].unicode() >= 'a' && begin[2].unicode() <= 'f') ||
            (begin[2].unicode() >= '0' && begin[2].unicode() <= '9')) {
        // this is so unlikely that we'll just go down the slow path
        // decode the whole string, skipping the "[vH." and "]" which we already know to be there
        host += QString::fromRawData(begin, 4);

        // uppercase the version, if necessary
        if (begin[2].unicode() >= 'a')
            host[host.length() - 2] = begin[2].unicode() - 0x20;

        begin += 4;
        --end;

        QString decoded;
        if (mode == QUrl::TolerantMode && qt_urlRecode(decoded, begin, end, QUrl::FullyDecoded, nullptr)) {
            begin = decoded.constBegin();
            end = decoded.constEnd();
        }

        for ( ; begin != end; ++begin) {
            if (begin->unicode() >= 'A' && begin->unicode() <= 'Z')
                host += *begin;
            else if (begin->unicode() >= 'a' && begin->unicode() <= 'z')
                host += *begin;
            else if (begin->unicode() >= '0' && begin->unicode() <= '9')
                host += *begin;
            else if (begin->unicode() < 0x80 && strchr(acceptable, begin->unicode()) != nullptr)
                host += *begin;
            else
                return decoded.isEmpty() ? begin : &origBegin[2];
        }
        host += QLatin1Char(']');
        return nullptr;
    }
    return &origBegin[2];
}

// ONLY the IPv6 address is parsed here, WITHOUT the brackets
static const QChar *parseIp6(QString &host, const QChar *begin, const QChar *end, QUrl::ParsingMode mode)
{
    // ### Update to use QStringView once QStringView::indexOf and QStringView::lastIndexOf exist
    QString decoded;
    if (mode == QUrl::TolerantMode) {
        // this struct is kept in automatic storage because it's only 4 bytes
        const ushort decodeColon[] = { decode(':'), 0 };
        if (qt_urlRecode(decoded, begin, end, QUrl::ComponentFormattingOption::PrettyDecoded, decodeColon) == 0)
            decoded = QString(begin, end-begin);
    } else {
        decoded = QString(begin, end-begin);
    }

    const QLatin1String zoneIdIdentifier("%25");
    QIPAddressUtils::IPv6Address address;
    QString zoneId;

    const QChar *endBeforeZoneId = decoded.constEnd();

    int zoneIdPosition = decoded.indexOf(zoneIdIdentifier);
    if ((zoneIdPosition != -1) && (decoded.lastIndexOf(zoneIdIdentifier) == zoneIdPosition)) {
        zoneId = decoded.mid(zoneIdPosition + zoneIdIdentifier.size());
        endBeforeZoneId = decoded.constBegin() + zoneIdPosition;

        if (zoneId.isEmpty())
            return end;
    }

    const QChar *ret = QIPAddressUtils::parseIp6(address, decoded.constBegin(), endBeforeZoneId);
    if (ret)
        return begin + (ret - decoded.constBegin());

    host.reserve(host.size() + (decoded.constEnd() - decoded.constBegin()));
    host += QLatin1Char('[');
    QIPAddressUtils::toString(host, address);

    if (!zoneId.isEmpty()) {
        host += zoneIdIdentifier;
        host += zoneId;
    }
    host += QLatin1Char(']');
    return nullptr;
}

inline bool QUrlPrivate::setHost(const QString &value, int from, int iend, QUrl::ParsingMode mode)
{
    const QChar *begin = value.constData() + from;
    const QChar *end = value.constData() + iend;

    const int len = end - begin;
    host.clear();
    sectionIsPresent |= Host;
    if (len == 0)
        return true;

    if (begin[0].unicode() == '[') {
        // IPv6Address or IPvFuture
        // smallest IPv6 address is      "[::]"   (len = 4)
        // smallest IPvFuture address is "[v7.X]" (len = 6)
        if (end[-1].unicode() != ']') {
            setError(HostMissingEndBracket, value);
            return false;
        }

        if (len > 5 && begin[1].unicode() == 'v') {
            const QChar *c = parseIpFuture(host, begin, end, mode);
            if (c)
                setError(InvalidIPvFutureError, value, c - value.constData());
            return !c;
        } else if (begin[1].unicode() == 'v') {
            setError(InvalidIPvFutureError, value, from);
        }

        const QChar *c = parseIp6(host, begin + 1, end - 1, mode);
        if (!c)
            return true;

        if (c == end - 1)
            setError(InvalidIPv6AddressError, value, from);
        else
            setError(InvalidCharacterInIPv6Error, value, c - value.constData());
        return false;
    }

    // check if it's an IPv4 address
    QIPAddressUtils::IPv4Address ip4;
    if (QIPAddressUtils::parseIp4(ip4, begin, end)) {
        // yes, it was
        QIPAddressUtils::toString(host, ip4);
        return true;
    }

    // This is probably a reg-name.
    // But it can also be an encoded string that, when decoded, becomes one
    // of the types above.
    //
    // Two types of encoding are possible:
    //  percent encoding (e.g., "%31%30%2E%30%2E%30%2E%31" -> "10.0.0.1")
    //  Unicode encoding (some non-ASCII characters case-fold to digits
    //                    when nameprepping is done)
    //
    // The qt_ACE_do function below applies nameprepping and the STD3 check.
    // That means a Unicode string may become an IPv4 address, but it cannot
    // produce a '[' or a '%'.

    // check for percent-encoding first
    QString s;
    if (mode == QUrl::TolerantMode && qt_urlRecode(s, begin, end, { }, nullptr)) {
        // something was decoded
        // anything encoded left?
        int pos = s.indexOf(QChar(0x25)); // '%'
        if (pos != -1) {
            setError(InvalidRegNameError, s, pos);
            return false;
        }

        // recurse
        return setHost(s, 0, s.length(), QUrl::StrictMode);
    }

    s = qt_ACE_do(QString::fromRawData(begin, len), NormalizeAce, ForbidLeadingDot);
    if (s.isEmpty()) {
        setError(InvalidRegNameError, value);
        return false;
    }

    // check IPv4 again
    if (QIPAddressUtils::parseIp4(ip4, s.constBegin(), s.constEnd())) {
        QIPAddressUtils::toString(host, ip4);
    } else {
        host = s;
    }
    return true;
}

inline void QUrlPrivate::parse(const QString &url, QUrl::ParsingMode parsingMode)
{
    //   URI-reference = URI / relative-ref
    //   URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
    //   relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
    //   hier-part     = "//" authority path-abempty
    //                 / other path types
    //   relative-part = "//" authority path-abempty
    //                 / other path types here

    sectionIsPresent = 0;
    flags = 0;
    clearError();

    // find the important delimiters
    int colon = -1;
    int question = -1;
    int hash = -1;
    const int len = url.length();
    const QChar *const begin = url.constData();
    const ushort *const data = reinterpret_cast<const ushort *>(begin);

    for (int i = 0; i < len; ++i) {
        uint uc = data[i];
        if (uc == '#' && hash == -1) {
            hash = i;

            // nothing more to be found
            break;
        }

        if (question == -1) {
            if (uc == ':' && colon == -1)
                colon = i;
            else if (uc == '?')
                question = i;
        }
    }

    // check if we have a scheme
    int hierStart;
    if (colon != -1 && setScheme(url, colon, /* don't set error */ false)) {
        hierStart = colon + 1;
    } else {
        // recover from a failed scheme: it might not have been a scheme at all
        scheme.clear();
        sectionIsPresent = 0;
        hierStart = 0;
    }

    int pathStart;
    int hierEnd = qMin<uint>(qMin<uint>(question, hash), len);
    if (hierEnd - hierStart >= 2 && data[hierStart] == '/' && data[hierStart + 1] == '/') {
        // we have an authority, it ends at the first slash after these
        int authorityEnd = hierEnd;
        for (int i = hierStart + 2; i < authorityEnd ; ++i) {
            if (data[i] == '/') {
                authorityEnd = i;
                break;
            }
        }

        setAuthority(url, hierStart + 2, authorityEnd, parsingMode);

        // even if we failed to set the authority properly, let's try to recover
        pathStart = authorityEnd;
        setPath(url, pathStart, hierEnd);
    } else {
        userName.clear();
        password.clear();
        host.clear();
        port = -1;
        pathStart = hierStart;

        if (hierStart < hierEnd)
            setPath(url, hierStart, hierEnd);
        else
            path.clear();
    }

    if (uint(question) < uint(hash))
        setQuery(url, question + 1, qMin<uint>(hash, len));

    if (hash != -1)
        setFragment(url, hash + 1, len);

    if (error || parsingMode == QUrl::TolerantMode)
        return;

    // The parsing so far was partially tolerant of errors, except for the
    // scheme parser (which is always strict) and the authority (which was
    // executed in strict mode).
    // If we haven't found any errors so far, continue the strict-mode parsing
    // from the path component onwards.

    if (!validateComponent(Path, url, pathStart, hierEnd))
        return;
    if (uint(question) < uint(hash) && !validateComponent(Query, url, question + 1, qMin<uint>(hash, len)))
        return;
    if (hash != -1)
        validateComponent(Fragment, url, hash + 1, len);
}
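/*
    Illustrative sketch, not part of Qt: the single-pass delimiter scan at the
    top of parse() above, restated on std::string. The Delimiters struct and
    the name findDelimiters are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type: index of each delimiter, or -1 if absent.
struct Delimiters {
    int colon = -1;     // first ':' seen before any '?' or '#'
    int question = -1;  // first '?' seen before any '#'
    int hash = -1;      // first '#'
};

Delimiters findDelimiters(const std::string &url)
{
    Delimiters d;
    for (std::size_t i = 0; i < url.size(); ++i) {
        const char c = url[i];
        if (c == '#') {
            // the fragment runs to the end of the string: stop scanning
            d.hash = static_cast<int>(i);
            break;
        }
        if (d.question == -1) {
            // ':' and '?' are only delimiters before the query starts
            if (c == ':' && d.colon == -1)
                d.colon = static_cast<int>(i);
            else if (c == '?')
                d.question = static_cast<int>(i);
        }
    }
    return d;
}
```

    A colon found this way is only a candidate: as in parse() above, the text
    before it still has to pass scheme validation before it is treated as a
    scheme.
*/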

QString QUrlPrivate::toLocalFile(QUrl::FormattingOptions options) const
{
    QString tmp;
    QString ourPath;
    appendPath(ourPath, options, QUrlPrivate::Path);

    // magic for shared drive on windows
    if (!host.isEmpty()) {
        tmp = QLatin1String("//") + host;
#ifdef Q_OS_WIN // QTBUG-42346, WebDAV is visible as local file on Windows only.
        if (scheme == webDavScheme())
            tmp += webDavSslTag();
#endif
        if (!ourPath.isEmpty() && !ourPath.startsWith(QLatin1Char('/')))
            tmp += QLatin1Char('/');
        tmp += ourPath;
    } else {
        tmp = ourPath;
#ifdef Q_OS_WIN
        // magic for drives on windows
        if (ourPath.length() > 2 && ourPath.at(0) == QLatin1Char('/') && ourPath.at(2) == QLatin1Char(':'))
            tmp.remove(0, 1);
#endif
    }
    return tmp;
}

/*
    From http://www.ietf.org/rfc/rfc3986.txt, 5.2.3: Merge paths

    Returns a merge of the current path with the relative path passed
    as argument.

    Note: \a relativePath is relative (does not start with '/').
*/
inline QString QUrlPrivate::mergePaths(const QString &relativePath) const
{
    // If the base URI has a defined authority component and an empty
    // path, then return a string consisting of "/" concatenated with
    // the reference's path; otherwise,
    if (!host.isEmpty() && path.isEmpty())
        return QLatin1Char('/') + relativePath;

    // Return a string consisting of the reference's path component
    // appended to all but the last segment of the base URI's path
    // (i.e., excluding any characters after the right-most "/" in the
    // base URI path, or excluding the entire base URI path if it does
    // not contain any "/" characters).
    QString newPath;
    if (!path.contains(QLatin1Char('/')))
        newPath = relativePath;
    else
        newPath = path.leftRef(path.lastIndexOf(QLatin1Char('/')) + 1) + relativePath;

    return newPath;
}
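/*
    Illustrative sketch, not part of Qt: the RFC 3986 section 5.2.3 merge rule
    implemented by mergePaths() above, restated on std::string. The free
    function and its baseHasAuthority parameter are assumptions made for this
    example (the member function above tests for a non-empty host instead).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: merges a relative reference path onto a base path.
std::string mergePaths(bool baseHasAuthority, const std::string &basePath,
                       const std::string &relativePath)
{
    // authority present and empty base path: the result is "/" + reference
    if (baseHasAuthority && basePath.empty())
        return "/" + relativePath;

    // otherwise keep everything up to and including the right-most '/'
    const std::string::size_type slash = basePath.rfind('/');
    if (slash == std::string::npos)
        return relativePath; // base path has no '/': drop it entirely
    return basePath.substr(0, slash + 1) + relativePath;
}
```

    With the base path "/b/c/d;p" and the reference "g", the merged path is
    "/b/c/g", matching the example base URI used in RFC 3986 section 5.4.
*/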

/*
    From http://www.ietf.org/rfc/rfc3986.txt, 5.2.4: Remove dot segments

    Removes unnecessary ../ and ./ from the path. Used for normalizing
    the URL.
*/
static void removeDotsFromPath(QString *path)
{
    // The input buffer is initialized with the now-appended path
    // components and the output buffer is initialized to the empty
    // string.
    // Both buffers share the storage of *path: "out" never overtakes "in",
    // so the path is rewritten in place and truncated at the end.
    QChar *out = path->data();
    const QChar *in = out;
    const QChar *end = out + path->size();

    // If the input buffer consists only of
    // "." or "..", then remove that from the input
    // buffer;
    if (path->size() == 1 && in[0].unicode() == '.')
        ++in;
    else if (path->size() == 2 && in[0].unicode() == '.' && in[1].unicode() == '.')
        in += 2;
    // While the input buffer is not empty, loop:
    while (in < end) {

        // otherwise, if the input buffer begins with a prefix of "../" or "./",
        // then remove that prefix from the input buffer;
        if (path->size() >= 2 && in[0].unicode() == '.' && in[1].unicode() == '/')
            in += 2;
        else if (path->size() >= 3 && in[0].unicode() == '.'
                 && in[1].unicode() == '.' && in[2].unicode() == '/')
            in += 3;

        // otherwise, if the input buffer begins with a prefix of
        // "/./" or "/.", where "." is a complete path segment,
        // then replace that prefix with "/" in the input buffer;
        if (in <= end - 3 && in[0].unicode() == '/' && in[1].unicode() == '.'
                && in[2].unicode() == '/') {
            in += 2;
            continue;
        } else if (in == end - 2 && in[0].unicode() == '/' && in[1].unicode() == '.') {
            *out++ = QLatin1Char('/');
            in += 2;
            break;
        }

        // otherwise, if the input buffer begins with a prefix
        // of "/../" or "/..", where ".." is a complete path
        // segment, then replace that prefix with "/" in the
        // input buffer and remove the last segment and its
        // preceding "/" (if any) from the output buffer;
        if (in <= end - 4 && in[0].unicode() == '/' && in[1].unicode() == '.'
                && in[2].unicode() == '.' && in[3].unicode() == '/') {
            while (out > path->constData() && (--out)->unicode() != '/')
                ;
            if (out == path->constData() && out->unicode() != '/')
                ++in;
            in += 3;
            continue;
        } else if (in == end - 3 && in[0].unicode() == '/' && in[1].unicode() == '.'
                   && in[2].unicode() == '.') {
            while (out > path->constData() && (--out)->unicode() != '/')
                ;
            if (out->unicode() == '/')
                ++out;
            in += 3;
            break;
        }

        // otherwise move the first path segment in
        // the input buffer to the end of the output
        // buffer, including the initial "/" character
        // (if any) and any subsequent characters up
        // to, but not including, the next "/"
        // character or the end of the input buffer.
        *out++ = *in++;
        while (in < end && in->unicode() != '/')
            *out++ = *in++;
    }
    path->truncate(out - path->constData());
}
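/*
    Illustrative sketch, not part of Qt: the RFC 3986 section 5.2.4 algorithm
    written with separate input and output strings, as the RFC describes it.
    It is a plain restatement of the RFC steps under the assumed name
    removeDotSegments, not a port of the in-place function above, and the two
    may differ on inputs that the RFC does not cover.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: drops the last segment (and its '/') from `out`.
static void popLastSegment(std::string &out)
{
    const std::string::size_type slash = out.rfind('/');
    out.erase(slash == std::string::npos ? 0 : slash);
}

std::string removeDotSegments(std::string in)
{
    std::string out;
    auto startsWith = [&in](const char *prefix) {
        return in.compare(0, std::char_traits<char>::length(prefix), prefix) == 0;
    };

    while (!in.empty()) {
        if (startsWith("../")) {
            in.erase(0, 3);                 // step A
        } else if (startsWith("./")) {
            in.erase(0, 2);                 // step A
        } else if (startsWith("/./")) {
            in.erase(0, 2);                 // step B: "/./" becomes "/"
        } else if (in == "/.") {
            in = "/";                       // step B
        } else if (startsWith("/../")) {
            in.erase(0, 3);                 // step C: "/../" becomes "/"
            popLastSegment(out);
        } else if (in == "/..") {
            in = "/";                       // step C
            popLastSegment(out);
        } else if (in == "." || in == "..") {
            in.clear();                     // step D
        } else {
            // step E: move the first segment, with its leading '/', to out
            const std::string::size_type next = in.find('/', 1);
            out += in.substr(0, next);
            in.erase(0, next);
        }
    }
    return out;
}
```

    The two examples given in the RFC, "/a/b/c/./../../g" and
    "mid/content=5/../6", reduce to "/a/g" and "mid/6" respectively.
*/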

inline QUrlPrivate::ErrorCode QUrlPrivate::validityError(QString *source, int *position) const
{
    Q_ASSERT(!source == !position);
    if (error) {
        if (source) {
            *source = error->source;
            *position = error->position;
        }
        return error->code;
    }

    // There are three more cases of invalid URLs that QUrl recognizes and they
    // are only possible with constructed URLs (setXXX methods), not with
    // parsing. Therefore, they are tested here.
    //
    // Two cases are a non-empty path that doesn't start with a slash and:
    //  - with an authority
    //  - without an authority, without scheme but the path with a colon before
    //    the first slash
    // The third case is an empty authority and a non-empty path that starts
    // with "//".
    // Those cases are considered invalid because toString() would produce a URL
    // that wouldn't be parsed back to the same QUrl.

    if (path.isEmpty())
        return NoError;
    if (path.at(0) == QLatin1Char('/')) {
        if (hasAuthority() || path.length() == 1 || path.at(1) != QLatin1Char('/'))
            return NoError;
        if (source) {
            *source = path;
            *position = 0;
        }
        return AuthorityAbsentAndPathIsDoubleSlash;
    }

    if (sectionIsPresent & QUrlPrivate::Host) {
        if (source) {
            *source = path;
            *position = 0;
        }
        return AuthorityPresentAndPathIsRelative;
    }
    if (sectionIsPresent & QUrlPrivate::Scheme)
        return NoError;

    // check for a path of "text:text/"
    for (int i = 0; i < path.length(); ++i) {
        ushort c = path.at(i).unicode();
        if (c == '/') {
            // found the slash before the colon
            return NoError;
        }
        if (c == ':') {
            // found the colon before the slash, it's invalid
            if (source) {
                *source = path;
                *position = i;
            }
            return RelativeUrlPathContainsColonBeforeSlash;
        }
    }
    return NoError;
}
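/*
    Illustrative sketch, not part of Qt: the last check in validityError()
    above, restated on std::string. A relative path whose first segment
    contains ':' would be read back as "scheme:rest", so it cannot round-trip.
    The name colonBeforeFirstSlash is an assumption made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the index of a ':' that appears before the
// first '/', or -1 if the path is safe to serialize without a scheme.
int colonBeforeFirstSlash(const std::string &path)
{
    for (std::size_t i = 0; i < path.size(); ++i) {
        if (path[i] == '/')
            return -1;                  // slash first: the colon is harmless
        if (path[i] == ':')
            return static_cast<int>(i); // colon first: looks like a scheme
    }
    return -1;
}
```
*/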

bool QUrlPrivate::validateComponent(QUrlPrivate::Section section, const QString &input,
                                    int begin, int end)
{
    // What we need to look out for, that the regular parser tolerates:
    //  - percent signs not followed by two hex digits
    //  - forbidden characters, which should always appear encoded
    //    '"' / '<' / '>' / '\' / '^' / '`' / '{' / '|' / '}' / DEL (0x7F)
    //    control characters and space
    //  - delimiters not allowed in certain positions
    //    . scheme: parser is already strict
    //    . user info: gen-delims except ":" disallowed ("/" / "?" / "#" / "[" / "]" / "@")
    //    . host: parser is stricter than the standard
    //    . port: parser is stricter than the standard
    //    . path: all delimiters allowed
    //    . fragment: all delimiters allowed
    //    . query: all delimiters allowed
    static const char forbidden[] = "\"<>\\^`{|}\x7F";
    static const char forbiddenUserInfo[] = ":/?#[]@";

    Q_ASSERT(section != Authority && section != Hierarchy && section != FullUrl);

    const ushort *const data = reinterpret_cast<const ushort *>(input.constData());
    for (uint i = uint(begin); i < uint(end); ++i) {
        uint uc = data[i];
        if (uc >= 0x80)
            continue;

        bool error = false;
        if ((uc == '%' && (uint(end) < i + 2 || !isHex(data[i + 1]) || !isHex(data[i + 2])))
                || uc <= 0x20 || strchr(forbidden, uc)) {
            // found an error
            error = true;
        } else if (section & UserInfo) {
            if (section == UserInfo && strchr(forbiddenUserInfo + 1, uc))
                error = true;
            else if (section != UserInfo && strchr(forbiddenUserInfo, uc))
                error = true;
        }

        if (!error)
            continue;

        ErrorCode errorCode = ErrorCode(int(section) << 8);
        if (section == UserInfo) {
            // is it the user name or the password?
            errorCode = InvalidUserNameError;
            for (uint j = uint(begin); j < i; ++j)
                if (data[j] == ':') {
                    errorCode = InvalidPasswordError;
                    break;
                }
        }

        setError(errorCode, input, i);
        return false;
    }

    // no errors
    return true;
}
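/*
    Illustrative sketch, not part of Qt: the character checks that
    validateComponent() above applies to every section (the user-info
    delimiter rules are left out), restated on std::string. The names
    isHexDigit and firstInvalidChar are assumptions made for this example,
    and the percent check here requires both hex digits to lie inside the
    string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

static bool isHexDigit(char c)
{
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}

// Hypothetical helper: returns the index of the first character that strict
// parsing would reject, or -1 if the component is acceptable.
int firstInvalidChar(const std::string &s)
{
    static const char forbidden[] = "\"<>\\^`{|}\x7F";
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char uc = static_cast<unsigned char>(s[i]);
        if (uc >= 0x80)
            continue; // non-ASCII bytes are not checked here
        if (uc <= 0x20 || std::strchr(forbidden, uc))
            return static_cast<int>(i); // control, space or forbidden character
        if (uc == '%' && (i + 2 >= s.size() || !isHexDigit(s[i + 1]) || !isHexDigit(s[i + 2])))
            return static_cast<int>(i); // '%' not followed by two hex digits
    }
    return -1;
}
```
*/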
1747
1748#if 0
1749inline void QUrlPrivate::validate() const
1750{
1751 QUrlPrivate *that = (QUrlPrivate *)this;
1752 that->encodedOriginal = that->toEncoded(); // may detach
1753 parse(ParseOnly);
1754
1755 QURL_SETFLAG(that->stateFlags, Validated);
1756
1757 if (!isValid)
1758 return;
1759
1760 QString auth = authority(); // causes the non-encoded forms to be valid
1761
1762 // authority() calls canonicalHost() which sets this
1763 if (!isHostValid)
1764 return;
1765
1766 if (scheme == QLatin1String("mailto")) {
1767 if (!host.isEmpty() || port != -1 || !userName.isEmpty() || !password.isEmpty()) {
1768 that->isValid = false;
            that->errorInfo.setParams(0, QT_TRANSLATE_NOOP(QUrl, "expected empty host, username, "
1770 "port and password"),
1771 0, 0);
1772 }
1773 } else if (scheme == ftpScheme() || scheme == httpScheme()) {
1774 if (host.isEmpty() && !(path.isEmpty() && encodedPath.isEmpty())) {
1775 that->isValid = false;
1776 that->errorInfo.setParams(0, QT_TRANSLATE_NOOP(QUrl, "the host is empty, but not the path"),
1777 0, 0);
1778 }
1779 }
1780}
1781#endif
1782
1783/*!
1784 \macro QT_NO_URL_CAST_FROM_STRING
1785 \relates QUrl
1786
1787 Disables automatic conversions from QString (or char *) to QUrl.
1788
1789 Compiling your code with this define is useful when you have a lot of
1790 code that uses QString for file names and you wish to convert it to
1791 use QUrl for network transparency. In any code that uses QUrl, it can
1792 help avoid missing QUrl::resolved() calls, and other misuses of
1793 QString to QUrl conversions.
1794
1795 \oldcode
1796 url = filename; // probably not what you want
1797 \newcode
1798 url = QUrl::fromLocalFile(filename);
1799 url = baseurl.resolved(QUrl(filename));
1800 \endcode
1801
1802 \sa QT_NO_CAST_FROM_ASCII
1803*/
1804
1805
1806/*!
1807 Constructs a URL by parsing \a url. QUrl will automatically percent encode
1808 all characters that are not allowed in a URL and decode the percent-encoded
1809 sequences that represent an unreserved character (letters, digits, hyphens,
    underscores, dots and tildes). All other characters are left in their
1811 original forms.
1812
1813 Parses the \a url using the parser mode \a parsingMode. In TolerantMode
1814 (the default), QUrl will correct certain mistakes, notably the presence of
1815 a percent character ('%') not followed by two hexadecimal digits, and it
1816 will accept any character in any position. In StrictMode, encoding mistakes
1817 will not be tolerated and QUrl will also check that certain forbidden
1818 characters are not present in unencoded form. If an error is detected in
1819 StrictMode, isValid() will return false. The parsing mode DecodedMode is not
1820 permitted in this context.
1821
1822 Example:
1823
1824 \snippet code/src_corelib_io_qurl.cpp 0
1825
1826 To construct a URL from an encoded string, you can also use fromEncoded():
1827
1828 \snippet code/src_corelib_io_qurl.cpp 1
1829
1830 Both functions are equivalent and, in Qt 5, both functions accept encoded
1831 data. Usually, the choice of the QUrl constructor or setUrl() versus
1832 fromEncoded() will depend on the source data: the constructor and setUrl()
1833 take a QString, whereas fromEncoded takes a QByteArray.
1834
1835 \sa setUrl(), fromEncoded(), TolerantMode
1836*/
1837QUrl::QUrl(const QString &url, ParsingMode parsingMode) : d(nullptr)
1838{
1839 setUrl(url, parsingMode);
1840}
1841
1842/*!
1843 Constructs an empty QUrl object.
1844*/
1845QUrl::QUrl() : d(nullptr)
1846{
1847}
1848
1849/*!
1850 Constructs a copy of \a other.
1851*/
1852QUrl::QUrl(const QUrl &other) : d(other.d)
1853{
1854 if (d)
1855 d->ref.ref();
1856}
1857
1858/*!
1859 Destructor; called immediately before the object is deleted.
1860*/
1861QUrl::~QUrl()
1862{
1863 if (d && !d->ref.deref())
1864 delete d;
1865}
1866
1867/*!
1868 Returns \c true if the URL is non-empty and valid; otherwise returns \c false.
1869
1870 The URL is run through a conformance test. Every part of the URL
1871 must conform to the standard encoding rules of the URI standard
1872 for the URL to be reported as valid.
1873
1874 \snippet code/src_corelib_io_qurl.cpp 2
1875*/
1876bool QUrl::isValid() const
1877{
1878 if (isEmpty()) {
1879 // also catches d == nullptr
1880 return false;
1881 }
1882 return d->validityError() == QUrlPrivate::NoError;
1883}
1884
1885/*!
1886 Returns \c true if the URL has no data; otherwise returns \c false.
1887
1888 \sa clear()
1889*/
1890bool QUrl::isEmpty() const
1891{
1892 if (!d) return true;
1893 return d->isEmpty();
1894}
1895
1896/*!
1897 Resets the content of the QUrl. After calling this function, the
1898 QUrl is equal to one that has been constructed with the default
1899 empty constructor.
1900
1901 \sa isEmpty()
1902*/
1903void QUrl::clear()
1904{
1905 if (d && !d->ref.deref())
1906 delete d;
1907 d = nullptr;
1908}
1909
1910/*!
1911 Parses \a url and sets this object to that value. QUrl will automatically
1912 percent encode all characters that are not allowed in a URL and decode the
1913 percent-encoded sequences that represent an unreserved character (letters,
    digits, hyphens, underscores, dots and tildes). All other characters are
1915 left in their original forms.
1916
1917 Parses the \a url using the parser mode \a parsingMode. In TolerantMode
1918 (the default), QUrl will correct certain mistakes, notably the presence of
1919 a percent character ('%') not followed by two hexadecimal digits, and it
1920 will accept any character in any position. In StrictMode, encoding mistakes
1921 will not be tolerated and QUrl will also check that certain forbidden
1922 characters are not present in unencoded form. If an error is detected in
1923 StrictMode, isValid() will return false. The parsing mode DecodedMode is
1924 not permitted in this context and will produce a run-time warning.
1925
1926 \sa url(), toString()
1927*/
1928void QUrl::setUrl(const QString &url, ParsingMode parsingMode)
1929{
1930 if (parsingMode == DecodedMode) {
1931 qWarning("QUrl: QUrl::DecodedMode is not permitted when parsing a full URL");
1932 } else {
1933 detach();
1934 d->parse(url, parsingMode);
1935 }
1936}
1937
1938/*!
1939 \fn void QUrl::setEncodedUrl(const QByteArray &encodedUrl, ParsingMode parsingMode)
1940 \deprecated
1941 Constructs a URL by parsing the contents of \a encodedUrl.
1942
1943 \a encodedUrl is assumed to be a URL string in percent encoded
1944 form, containing only ASCII characters.
1945
1946 The parsing mode \a parsingMode is used for parsing \a encodedUrl.
1947
1948 \obsolete Use setUrl(QString::fromUtf8(encodedUrl), parsingMode)
1949
1950 \sa setUrl()
1951*/
1952
1953/*!
1954 Sets the scheme of the URL to \a scheme. As a scheme can only
1955 contain ASCII characters, no conversion or decoding is done on the
1956 input. It must also start with an ASCII letter.
1957
1958 The scheme describes the type (or protocol) of the URL. It's
    represented by one or more ASCII characters at the start of the URL.
1960
1961 A scheme is strictly \l {http://www.ietf.org/rfc/rfc3986.txt} {RFC 3986}-compliant:
1962 \tt {scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )}
1963
1964 The following example shows a URL where the scheme is "ftp":
1965
1966 \image qurl-authority2.png
1967
1968 To set the scheme, the following call is used:
1969 \snippet code/src_corelib_io_qurl.cpp 11
1970
1971 The scheme can also be empty, in which case the URL is interpreted
1972 as relative.
1973
1974 \sa scheme(), isRelative()
1975*/
1976void QUrl::setScheme(const QString &scheme)
1977{
1978 detach();
1979 d->clearError();
1980 if (scheme.isEmpty()) {
        // an empty scheme unsets the scheme section; the URL becomes relative
1982 d->sectionIsPresent &= ~QUrlPrivate::Scheme;
1983 d->flags &= ~QUrlPrivate::IsLocalFile;
1984 d->scheme.clear();
1985 } else {
1986 d->setScheme(scheme, scheme.length(), /* do set error */ true);
1987 }
1988}
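
/*
The RFC 3986 scheme grammar quoted in the documentation above can be checked
with a few lines of standalone C++. This is a minimal sketch, not what
QUrlPrivate::setScheme() does internally; isValidScheme() is a hypothetical
name. Note one deliberate difference: the sketch rejects an empty string,
whereas QUrl::setScheme() treats an empty scheme as "no scheme" and does not
report an error.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper implementing
//   scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
bool isValidScheme(const std::string &s)
{
    auto isAlpha = [](char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    };
    auto isDigit = [](char c) { return c >= '0' && c <= '9'; };

    // The first character must be an ASCII letter.
    if (s.empty() || !isAlpha(s[0]))
        return false;

    // The remaining characters are letters, digits, '+', '-' or '.'.
    for (std::size_t i = 1; i < s.size(); ++i) {
        const char c = s[i];
        if (!isAlpha(c) && !isDigit(c) && c != '+' && c != '-' && c != '.')
            return false;
    }
    return true;
}
```
*/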
1989
1990/*!
1991 Returns the scheme of the URL. If an empty string is returned,
1992 this means the scheme is undefined and the URL is then relative.
1993
    The scheme can only contain US-ASCII letters, digits, and the characters
    '+', '-' and '.', which means it cannot contain any character that would
    otherwise require encoding.
1996 Additionally, schemes are always returned in lowercase form.
1997
1998 \sa setScheme(), isRelative()
1999*/
2000QString QUrl::scheme() const
2001{
2002 if (!d) return QString();
2003
2004 return d->scheme;
2005}
2006
2007/*!
2008 Sets the authority of the URL to \a authority.
2009
2010 The authority of a URL is the combination of user info, a host
2011 name and a port. All of these elements are optional; an empty
2012 authority is therefore valid.
2013
2014 The user info and host are separated by a '@', and the host and
2015 port are separated by a ':'. If the user info is empty, the '@'
2016 must be omitted; although a stray ':' is permitted if the port is
2017 empty.
2018
2019 The following example shows a valid authority string:
2020
2021 \image qurl-authority.png
2022
2023 The \a authority data is interpreted according to \a mode: in StrictMode,
2024 any '%' characters must be followed by exactly two hexadecimal characters
2025 and some characters (including space) are not allowed in undecoded form. In
2026 TolerantMode (the default), all characters are accepted in undecoded form
2027 and the tolerant parser will correct stray '%' not followed by two hex
2028 characters.
2029
2030 This function does not allow \a mode to be QUrl::DecodedMode. To set fully
2031 decoded data, call setUserName(), setPassword(), setHost() and setPort()
2032 individually.
2033
2034 \sa setUserInfo(), setHost(), setPort()
2035*/
2036void QUrl::setAuthority(const QString &authority, ParsingMode mode)
2037{
2038 detach();
2039 d->clearError();
2040
2041 if (mode == DecodedMode) {
2042 qWarning("QUrl::setAuthority(): QUrl::DecodedMode is not permitted in this function");
2043 return;
2044 }
2045
2046 d->setAuthority(authority, 0, authority.length(), mode);
2047 if (authority.isNull()) {
2048 // QUrlPrivate::setAuthority cleared almost everything
2049 // but it leaves the Host bit set
2050 d->sectionIsPresent &= ~QUrlPrivate::Authority;
2051 }
2052}
2053
2054/*!
2055 Returns the authority of the URL if it is defined; otherwise
2056 an empty string is returned.
2057
    This function returns an unambiguous value, which may contain
    characters that are still percent-encoded, plus some control sequences
    not representable in decoded form in QString.
2061
    The \a options argument controls how to format the authority component. The
2063 value of QUrl::FullyDecoded is not permitted in this function. If you need
2064 to obtain fully decoded data, call userName(), password(), host() and
2065 port() individually.
2066
2067 \sa setAuthority(), userInfo(), userName(), password(), host(), port()
2068*/
2069QString QUrl::authority(ComponentFormattingOptions options) const
2070{
2071 QString result;
2072 if (!d)
2073 return result;
2074
2075 if (options == QUrl::FullyDecoded) {
2076 qWarning("QUrl::authority(): QUrl::FullyDecoded is not permitted in this function");
2077 return result;
2078 }
2079
2080 d->appendAuthority(result, options, QUrlPrivate::Authority);
2081 return result;
2082}
2083
2084/*!
2085 Sets the user info of the URL to \a userInfo. The user info is an
2086 optional part of the authority of the URL, as described in
2087 setAuthority().
2088
2089 The user info consists of a user name and optionally a password,
2090 separated by a ':'. If the password is empty, the colon must be
2091 omitted. The following example shows a valid user info string:
2092
2093 \image qurl-authority3.png
2094
2095 The \a userInfo data is interpreted according to \a mode: in StrictMode,
2096 any '%' characters must be followed by exactly two hexadecimal characters
2097 and some characters (including space) are not allowed in undecoded form. In
2098 TolerantMode (the default), all characters are accepted in undecoded form
2099 and the tolerant parser will correct stray '%' not followed by two hex
2100 characters.
2101
2102 This function does not allow \a mode to be QUrl::DecodedMode. To set fully
2103 decoded data, call setUserName() and setPassword() individually.
2104
2105 \sa userInfo(), setUserName(), setPassword(), setAuthority()
2106*/
2107void QUrl::setUserInfo(const QString &userInfo, ParsingMode mode)
2108{
2109 detach();
2110 d->clearError();
2111 QString trimmed = userInfo.trimmed();
2112 if (mode == DecodedMode) {
2113 qWarning("QUrl::setUserInfo(): QUrl::DecodedMode is not permitted in this function");
2114 return;
2115 }
2116
2117 d->setUserInfo(trimmed, 0, trimmed.length());
2118 if (userInfo.isNull()) {
2119 // QUrlPrivate::setUserInfo cleared almost everything
2120 // but it leaves the UserName bit set
2121 d->sectionIsPresent &= ~QUrlPrivate::UserInfo;
2122 } else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::UserInfo, userInfo)) {
2123 d->sectionIsPresent &= ~QUrlPrivate::UserInfo;
2124 d->userName.clear();
2125 d->password.clear();
2126 }
2127}
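
/*
The user info layout described above (a user name, optionally followed by ':'
and a password) can be illustrated with a small standalone split function.
This is a sketch only: UserInfoParts and splitUserInfo() are hypothetical
names, and splitting at the first ':' is an assumption that matches the way
validateComponent() above attributes an error to the password once a ':' has
been seen.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type for the illustration.
struct UserInfoParts {
    std::string userName;
    std::string password;
    bool hasPassword;
};

// Splits "name:password" at the first ':'; later colons stay in the password.
UserInfoParts splitUserInfo(const std::string &userInfo)
{
    const std::string::size_type colon = userInfo.find(':');
    if (colon == std::string::npos)
        return UserInfoParts{userInfo, std::string(), false};
    return UserInfoParts{userInfo.substr(0, colon),
                         userInfo.substr(colon + 1), true};
}
```
*/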
2128
2129/*!
2130 Returns the user info of the URL, or an empty string if the user
2131 info is undefined.
2132
    This function returns an unambiguous value, which may contain
    characters that are still percent-encoded, plus some control sequences
    not representable in decoded form in QString.
2136
2137 The \a options argument controls how to format the user info component. The
2138 value of QUrl::FullyDecoded is not permitted in this function. If you need
2139 to obtain fully decoded data, call userName() and password() individually.
2140
2141 \sa setUserInfo(), userName(), password(), authority()
2142*/
2143QString QUrl::userInfo(ComponentFormattingOptions options) const
2144{
2145 QString result;
2146 if (!d)
2147 return result;
2148
2149 if (options == QUrl::FullyDecoded) {
2150 qWarning("QUrl::userInfo(): QUrl::FullyDecoded is not permitted in this function");
2151 return result;
2152 }
2153
2154 d->appendUserInfo(result, options, QUrlPrivate::UserInfo);
2155 return result;
2156}
2157
2158/*!
2159 Sets the URL's user name to \a userName. The \a userName is part
2160 of the user info element in the authority of the URL, as described
2161 in setUserInfo().
2162
2163 The \a userName data is interpreted according to \a mode: in StrictMode,
2164 any '%' characters must be followed by exactly two hexadecimal characters
2165 and some characters (including space) are not allowed in undecoded form. In
2166 TolerantMode (the default), all characters are accepted in undecoded form
2167 and the tolerant parser will correct stray '%' not followed by two hex
2168 characters. In DecodedMode, '%' stand for themselves and encoded characters
2169 are not possible.
2170
2171 QUrl::DecodedMode should be used when setting the user name from a data
2172 source which is not a URL, such as a password dialog shown to the user or
2173 with a user name obtained by calling userName() with the QUrl::FullyDecoded
2174 formatting option.
2175
2176 \sa userName(), setUserInfo()
2177*/
2178void QUrl::setUserName(const QString &userName, ParsingMode mode)
2179{
2180 detach();
2181 d->clearError();
2182
2183 QString data = userName;
2184 if (mode == DecodedMode) {
2185 parseDecodedComponent(data);
2186 mode = TolerantMode;
2187 }
2188
2189 d->setUserName(data, 0, data.length());
2190 if (userName.isNull())
2191 d->sectionIsPresent &= ~QUrlPrivate::UserName;
2192 else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::UserName, userName))
2193 d->userName.clear();
2194}
2195
2196/*!
2197 Returns the user name of the URL if it is defined; otherwise
2198 an empty string is returned.
2199
2200 The \a options argument controls how to format the user name component. All
2201 values produce an unambiguous result. With QUrl::FullyDecoded, all
2202 percent-encoded sequences are decoded; otherwise, the returned value may
2203 contain some percent-encoded sequences for some control sequences not
2204 representable in decoded form in QString.
2205
2206 Note that QUrl::FullyDecoded may cause data loss if those non-representable
2207 sequences are present. It is recommended to use that value when the result
2208 will be used in a non-URL context, such as setting in QAuthenticator or
2209 negotiating a login.
2210
2211 \sa setUserName(), userInfo()
2212*/
2213QString QUrl::userName(ComponentFormattingOptions options) const
2214{
2215 QString result;
2216 if (d)
2217 d->appendUserName(result, options);
2218 return result;
2219}
2220
2221/*!
2222 \fn void QUrl::setEncodedUserName(const QByteArray &userName)
2223 \deprecated
2224 \since 4.4
2225
2226 Sets the URL's user name to the percent-encoded \a userName. The \a
2227 userName is part of the user info element in the authority of the
2228 URL, as described in setUserInfo().
2229
2230 \obsolete Use setUserName(QString::fromUtf8(userName))
2231
2232 \sa setUserName(), encodedUserName(), setUserInfo()
2233*/
2234
2235/*!
2236 \fn QByteArray QUrl::encodedUserName() const
2237 \deprecated
2238 \since 4.4
2239
2240 Returns the user name of the URL if it is defined; otherwise
2241 an empty string is returned. The returned value will have its
2242 non-ASCII and other control characters percent-encoded, as in
2243 toEncoded().
2244
2245 \obsolete Use userName(QUrl::FullyEncoded).toLatin1()
2246
2247 \sa setEncodedUserName()
2248*/
2249
2250/*!
2251 Sets the URL's password to \a password. The \a password is part of
2252 the user info element in the authority of the URL, as described in
2253 setUserInfo().
2254
2255 The \a password data is interpreted according to \a mode: in StrictMode,
2256 any '%' characters must be followed by exactly two hexadecimal characters
2257 and some characters (including space) are not allowed in undecoded form. In
2258 TolerantMode, all characters are accepted in undecoded form and the
2259 tolerant parser will correct stray '%' not followed by two hex characters.
2260 In DecodedMode, '%' stand for themselves and encoded characters are not
2261 possible.
2262
2263 QUrl::DecodedMode should be used when setting the password from a data
2264 source which is not a URL, such as a password dialog shown to the user or
2265 with a password obtained by calling password() with the QUrl::FullyDecoded
2266 formatting option.
2267
2268 \sa password(), setUserInfo()
2269*/
2270void QUrl::setPassword(const QString &password, ParsingMode mode)
2271{
2272 detach();
2273 d->clearError();
2274
2275 QString data = password;
2276 if (mode == DecodedMode) {
2277 parseDecodedComponent(data);
2278 mode = TolerantMode;
2279 }
2280
2281 d->setPassword(data, 0, data.length());
2282 if (password.isNull())
2283 d->sectionIsPresent &= ~QUrlPrivate::Password;
2284 else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Password, password))
2285 d->password.clear();
2286}
2287
2288/*!
2289 Returns the password of the URL if it is defined; otherwise
2290 an empty string is returned.
2291
    The \a options argument controls how to format the password component. All
2293 values produce an unambiguous result. With QUrl::FullyDecoded, all
2294 percent-encoded sequences are decoded; otherwise, the returned value may
2295 contain some percent-encoded sequences for some control sequences not
2296 representable in decoded form in QString.
2297
2298 Note that QUrl::FullyDecoded may cause data loss if those non-representable
2299 sequences are present. It is recommended to use that value when the result
2300 will be used in a non-URL context, such as setting in QAuthenticator or
2301 negotiating a login.
2302
2303 \sa setPassword()
2304*/
2305QString QUrl::password(ComponentFormattingOptions options) const
2306{
2307 QString result;
2308 if (d)
2309 d->appendPassword(result, options);
2310 return result;
2311}
2312
2313/*!
2314 \fn void QUrl::setEncodedPassword(const QByteArray &password)
2315 \deprecated
2316 \since 4.4
2317
2318 Sets the URL's password to the percent-encoded \a password. The \a
2319 password is part of the user info element in the authority of the
2320 URL, as described in setUserInfo().
2321
2322 \obsolete Use setPassword(QString::fromUtf8(password));
2323
2324 \sa setPassword(), encodedPassword(), setUserInfo()
2325*/
2326
2327/*!
2328 \fn QByteArray QUrl::encodedPassword() const
2329 \deprecated
2330 \since 4.4
2331
2332 Returns the password of the URL if it is defined; otherwise an
2333 empty string is returned. The returned value will have its
2334 non-ASCII and other control characters percent-encoded, as in
2335 toEncoded().
2336
2337 \obsolete Use password(QUrl::FullyEncoded).toLatin1()
2338
2339 \sa setEncodedPassword(), toEncoded()
2340*/
2341
2342/*!
2343 Sets the host of the URL to \a host. The host is part of the
2344 authority.
2345
2346 The \a host data is interpreted according to \a mode: in StrictMode,
2347 any '%' characters must be followed by exactly two hexadecimal characters
2348 and some characters (including space) are not allowed in undecoded form. In
2349 TolerantMode, all characters are accepted in undecoded form and the
2350 tolerant parser will correct stray '%' not followed by two hex characters.
2351 In DecodedMode, '%' stand for themselves and encoded characters are not
2352 possible.
2353
2354 Note that, in all cases, the result of the parsing must be a valid hostname
2355 according to STD 3 rules, as modified by the Internationalized Resource
2356 Identifiers specification (RFC 3987). Invalid hostnames are not permitted
2357 and will cause isValid() to become false.
2358
2359 \sa host(), setAuthority()
2360*/
2361void QUrl::setHost(const QString &host, ParsingMode mode)
2362{
2363 detach();
2364 d->clearError();
2365
2366 QString data = host;
2367 if (mode == DecodedMode) {
2368 parseDecodedComponent(data);
2369 mode = TolerantMode;
2370 }
2371
2372 if (d->setHost(data, 0, data.length(), mode)) {
2373 if (host.isNull())
2374 d->sectionIsPresent &= ~QUrlPrivate::Host;
2375 } else if (!data.startsWith(QLatin1Char('['))) {
2376 // setHost failed, it might be IPv6 or IPvFuture in need of bracketing
2377 Q_ASSERT(d->error);
2378
2379 data.prepend(QLatin1Char('['));
2380 data.append(QLatin1Char(']'));
2381 if (!d->setHost(data, 0, data.length(), mode)) {
2382 // failed again
2383 if (data.contains(QLatin1Char(':'))) {
2384 // source data contains ':', so it's an IPv6 error
2385 d->error->code = QUrlPrivate::InvalidIPv6AddressError;
2386 }
2387 } else {
2388 // succeeded
2389 d->clearError();
2390 }
2391 }
2392}
2393
2394/*!
2395 Returns the host of the URL if it is defined; otherwise
2396 an empty string is returned.
2397
2398 The \a options argument controls how the hostname will be formatted. The
2399 QUrl::EncodeUnicode option will cause this function to return the hostname
2400 in the ASCII-Compatible Encoding (ACE) form, which is suitable for use in
2401 channels that are not 8-bit clean or that require the legacy hostname (such
2402 as DNS requests or in HTTP request headers). If that flag is not present,
2403 this function returns the International Domain Name (IDN) in Unicode form,
2404 according to the list of permissible top-level domains (see
2405 idnWhitelist()).
2406
2407 All other flags are ignored. Host names cannot contain control or percent
2408 characters, so the returned value can be considered fully decoded.
2409
2410 \sa setHost(), idnWhitelist(), setIdnWhitelist(), authority()
2411*/
2412QString QUrl::host(ComponentFormattingOptions options) const
2413{
2414 QString result;
2415 if (d) {
2416 d->appendHost(result, options);
2417 if (result.startsWith(QLatin1Char('[')))
2418 result = result.mid(1, result.length() - 2);
2419 }
2420 return result;
2421}
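
/*
QUrl::host() above removes the square brackets that surround an IPv6 or
IPvFuture literal before returning it. The same step can be sketched in
standalone C++; stripBrackets() is a hypothetical helper and, like the code
above, it assumes that a value starting with '[' also ends with ']'.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: drops the first and last character when the host
// starts with '[', otherwise returns the host unchanged.
std::string stripBrackets(const std::string &host)
{
    if (host.size() >= 2 && host.front() == '[')
        return host.substr(1, host.size() - 2);
    return host;
}
```
*/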
2422
2423/*!
2424 \fn void QUrl::setEncodedHost(const QByteArray &host)
2425 \deprecated
2426 \since 4.4
2427
    Sets the URL's host to the ACE- or percent-encoded \a host. The \a
    host is part of the authority of the URL, as described in
    setAuthority().
2431
2432 \obsolete Use setHost(QString::fromUtf8(host)).
2433
2434 \sa setHost(), encodedHost(), setAuthority(), fromAce()
2435*/
2436
2437/*!
2438 \fn QByteArray QUrl::encodedHost() const
2439 \deprecated
2440 \since 4.4
2441
2442 Returns the host part of the URL if it is defined; otherwise
2443 an empty string is returned.
2444
2445 Note: encodedHost() does not return percent-encoded hostnames. Instead,
2446 the ACE-encoded (bare ASCII in Punycode encoding) form will be
2447 returned for any non-ASCII hostname.
2448
2449 This function is equivalent to calling QUrl::toAce() on the return
2450 value of host().
2451
2452 \obsolete Use host(QUrl::FullyEncoded).toLatin1() or toAce(host()).
2453
2454 \sa setEncodedHost()
2455*/
2456
2457/*!
2458 Sets the port of the URL to \a port. The port is part of the
2459 authority of the URL, as described in setAuthority().
2460
2461 \a port must be between 0 and 65535 inclusive. Setting the
2462 port to -1 indicates that the port is unspecified.
2463*/
2464void QUrl::setPort(int port)
2465{
2466 detach();
2467 d->clearError();
2468
2469 if (port < -1 || port > 65535) {
2470 d->setError(QUrlPrivate::InvalidPortError, QString::number(port), 0);
2471 port = -1;
2472 }
2473
2474 d->port = port;
2475 if (port != -1)
2476 d->sectionIsPresent |= QUrlPrivate::Host;
2477}
2478
2479/*!
2480 \since 4.1
2481
2482 Returns the port of the URL, or \a defaultPort if the port is
2483 unspecified.
2484
2485 Example:
2486
2487 \snippet code/src_corelib_io_qurl.cpp 3
2488*/
2489int QUrl::port(int defaultPort) const
2490{
2491 if (!d) return defaultPort;
2492 return d->port == -1 ? defaultPort : d->port;
2493}
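
/*
The port handling in setPort() and port() above comes down to two small
rules, sketched here without Qt. normalizePort() and effectivePort() are
hypothetical names for this illustration; -1 stands for "unspecified", as in
the code above.

```cpp
#include <cassert>

// Hypothetical result type: the stored port plus an error flag.
struct PortResult {
    int port;
    bool error;
};

// Values outside -1..65535 are rejected and stored as -1 (unspecified).
PortResult normalizePort(int port)
{
    if (port < -1 || port > 65535)
        return PortResult{-1, true};
    return PortResult{port, false};
}

// An unspecified port falls back to the caller's default.
int effectivePort(int storedPort, int defaultPort)
{
    return storedPort == -1 ? defaultPort : storedPort;
}
```

For example, a caller could pass 21 as the default when handling an FTP URL
that does not name a port.
*/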
2494
2495/*!
2496 Sets the path of the URL to \a path. The path is the part of the
2497 URL that comes after the authority but before the query string.
2498
2499 \image qurl-ftppath.png
2500
2501 For non-hierarchical schemes, the path will be everything
2502 following the scheme declaration, as in the following example:
2503
2504 \image qurl-mailtopath.png
2505
2506 The \a path data is interpreted according to \a mode: in StrictMode,
2507 any '%' characters must be followed by exactly two hexadecimal characters
2508 and some characters (including space) are not allowed in undecoded form. In
2509 TolerantMode, all characters are accepted in undecoded form and the
2510 tolerant parser will correct stray '%' not followed by two hex characters.
2511 In DecodedMode, '%' stand for themselves and encoded characters are not
2512 possible.
2513
2514 QUrl::DecodedMode should be used when setting the path from a data source
2515 which is not a URL, such as a dialog shown to the user or with a path
2516 obtained by calling path() with the QUrl::FullyDecoded formatting option.
2517
2518 \sa path()
2519*/
2520void QUrl::setPath(const QString &path, ParsingMode mode)
2521{
2522 detach();
2523 d->clearError();
2524
2525 QString data = path;
2526 if (mode == DecodedMode) {
2527 parseDecodedComponent(data);
2528 mode = TolerantMode;
2529 }
2530
2531 d->setPath(data, 0, data.length());
2532
2533 // optimized out, since there is no path delimiter
2534// if (path.isNull())
2535// d->sectionIsPresent &= ~QUrlPrivate::Path;
2536// else
2537 if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Path, path))
2538 d->path.clear();
2539}
2540
2541/*!
2542 Returns the path of the URL.
2543
2544 \snippet code/src_corelib_io_qurl.cpp 12
2545
2546 The \a options argument controls how to format the path component. All
2547 values produce an unambiguous result. With QUrl::FullyDecoded, all
2548 percent-encoded sequences are decoded; otherwise, the returned value may
2549 contain some percent-encoded sequences for some control sequences not
2550 representable in decoded form in QString.
2551
2552 Note that QUrl::FullyDecoded may cause data loss if those non-representable
2553 sequences are present. It is recommended to use that value when the result
2554 will be used in a non-URL context, such as sending to an FTP server.
2555
2556 An example of data loss is when you have non-Unicode percent-encoded sequences
2557 and use FullyDecoded (the default):
2558
2559 \snippet code/src_corelib_io_qurl.cpp 13
2560
2561 In this example, there will be some level of data loss because the \c %FF cannot
2562 be converted.
2563
2564 Data loss can also occur when the path contains sub-delimiters (such as \c +):
2565
2566 \snippet code/src_corelib_io_qurl.cpp 14
2567
2568 Other decoding examples:
2569
2570 \snippet code/src_corelib_io_qurl.cpp 15
2571
2572 \sa setPath()
2573*/
2574QString QUrl::path(ComponentFormattingOptions options) const
2575{
2576 QString result;
2577 if (d)
2578 d->appendPath(result, options, QUrlPrivate::Path);
2579 return result;
2580}
2581
2582/*!
2583 \fn void QUrl::setEncodedPath(const QByteArray &path)
2584 \deprecated
2585 \since 4.4
2586
2587 Sets the URL's path to the percent-encoded \a path. The path is
2588 the part of the URL that comes after the authority but before the
2589 query string.
2590
2591 \image qurl-ftppath.png
2592
2593 For non-hierarchical schemes, the path will be everything
2594 following the scheme declaration, as in the following example:
2595
2596 \image qurl-mailtopath.png
2597
2598 \obsolete Use setPath(QString::fromUtf8(path)).
2599
2600 \sa setPath(), encodedPath(), setUserInfo()
2601*/
2602
2603/*!
2604 \fn QByteArray QUrl::encodedPath() const
2605 \deprecated
2606 \since 4.4
2607
2608 Returns the path of the URL if it is defined; otherwise an
2609 empty string is returned. The returned value will have its
2610 non-ASCII and other control characters percent-encoded, as in
2611 toEncoded().
2612
2613 \obsolete Use path(QUrl::FullyEncoded).toLatin1().
2614
2615 \sa setEncodedPath(), toEncoded()
2616*/
2617
2618/*!
2619 \since 5.2
2620
2621 Returns the name of the file, excluding the directory path.
2622
2623 Note that, if this QUrl object is given a path ending in a slash, the name of the file is considered empty.
2624
2625 If the path doesn't contain any slash, it is fully returned as the fileName.
2626
2627 Example:
2628
2629 \snippet code/src_corelib_io_qurl.cpp 7
2630
2631 The \a options argument controls how to format the file name component. All
2632 values produce an unambiguous result. With QUrl::FullyDecoded, all
2633 percent-encoded sequences are decoded; otherwise, the returned value may
2634 contain some percent-encoded sequences for some control sequences not
2635 representable in decoded form in QString.
2636
2637 \sa path()
2638*/
2639QString QUrl::fileName(ComponentFormattingOptions options) const
2640{
2641 const QString ourPath = path(options);
2642 const int slash = ourPath.lastIndexOf(QLatin1Char('/'));
2643 if (slash == -1)
2644 return ourPath;
2645 return ourPath.mid(slash + 1);
2646}
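
/*
The rule used by fileName() above (everything after the last '/', or the
whole path when there is no '/') can be shown with std::string. This is a
standalone sketch; fileNameOf() is a hypothetical name and the function
performs no percent-decoding.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the last-slash rule of fileName().
std::string fileNameOf(const std::string &path)
{
    const std::string::size_type slash = path.rfind('/');
    if (slash == std::string::npos)
        return path;               // no slash: the whole path is the name
    return path.substr(slash + 1); // may be empty if the path ends in '/'
}
```
*/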
2647
2648/*!
2649 \since 4.2
2650
2651 Returns \c true if this URL contains a Query (i.e., if ? was seen on it).
2652
2653 \sa setQuery(), query(), hasFragment()
2654*/
2655bool QUrl::hasQuery() const
2656{
2657 if (!d) return false;
2658 return d->hasQuery();
2659}
2660
2661/*!
2662 Sets the query string of the URL to \a query.
2663
2664 This function is useful if you need to pass a query string that
2665 does not fit into the key-value pattern, or that uses a different
2666 scheme for encoding special characters than what is suggested by
2667 QUrl.
2668
2669 Passing a value of QString() to \a query (a null QString) unsets
2670 the query completely. However, passing a value of QString("")
2671 will set the query to an empty value, as if the original URL
2672 had a lone "?".
2673
2674 The \a query data is interpreted according to \a mode: in StrictMode,
2675 any '%' characters must be followed by exactly two hexadecimal characters
2676 and some characters (including space) are not allowed in undecoded form. In
2677 TolerantMode, all characters are accepted in undecoded form and the
2678 tolerant parser will correct stray '%' not followed by two hex characters.
2679 In DecodedMode, '%' stand for themselves and encoded characters are not
2680 possible.
2681
2682 Query strings often contain percent-encoded sequences, so use of
2683 DecodedMode is discouraged. One special sequence to be aware of is that of
2684 the plus character ('+'). QUrl does not convert spaces to plus characters,
2685 even though HTML forms posted by web browsers do. In order to represent an
2686 actual plus character in a query, the sequence "%2B" is usually used. This
2687 function will leave "%2B" sequences untouched in TolerantMode or
2688 StrictMode.
2689
2690 \sa query(), hasQuery()
2691*/
2692void QUrl::setQuery(const QString &query, ParsingMode mode)
2693{
2694 detach();
2695 d->clearError();
2696
2697 QString data = query;
2698 if (mode == DecodedMode) {
2699 parseDecodedComponent(data);
2700 mode = TolerantMode;
2701 }
2702
2703 d->setQuery(data, 0, data.length());
2704 if (query.isNull())
2705 d->sectionIsPresent &= ~QUrlPrivate::Query;
2706 else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Query, query))
2707 d->query.clear();
2708}
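
/*
    Illustrative sketch, not part of QUrl: one way to picture the null versus
    empty distinction documented for setQuery() above. The query text and a
    separate presence flag are tracked independently, loosely mirroring the
    sectionIsPresent bit used in this file. MiniUrl and its members are
    hypothetical names, and std::string stands in for QString.

```cpp
#include <string>

// Hypothetical stand-in for the query-related state of a URL.
struct MiniUrl {
    std::string base;       // everything before the query
    std::string query;      // query text, without the leading '?'
    bool hasQuery = false;  // presence flag, kept separate from the text

    // Models passing a null string: the query is removed entirely.
    void unsetQuery() { query.clear(); hasQuery = false; }

    // Models passing any non-null string, including an empty one.
    void setQuery(const std::string &text) { query = text; hasQuery = true; }

    // The '?' is emitted whenever a query is present, even if it is empty.
    std::string toString() const
    {
        return hasQuery ? base + '?' + query : base;
    }
};
```

    With base set to "http://example.com/p", calling setQuery("") makes
    toString() end in a lone "?", and unsetQuery() removes it again.
*/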
2709
2710/*!
2711 \fn void QUrl::setEncodedQuery(const QByteArray &query)
2712 \deprecated
2713
2714 Sets the query string of the URL to \a query. The string is
2715 inserted as-is, and no further encoding is performed when calling
2716 toEncoded().
2717
2718 This function is useful if you need to pass a query string that
2719 does not fit into the key-value pattern, or that uses a different
2720 scheme for encoding special characters than what is suggested by
2721 QUrl.
2722
2723 Passing a value of QByteArray() to \a query (a null QByteArray) unsets
2724 the query completely. However, passing a value of QByteArray("")
2725 will set the query to an empty value, as if the original URL
2726 had a lone "?".
2727
2728 \obsolete Use setQuery, which has the same null / empty behavior.
2729
2730 \sa encodedQuery(), hasQuery()
2731*/
2732
2733/*!
2734 \overload
2735 \since 5.0
2736 Sets the query string of the URL to \a query.
2737
    This function reconstructs the query string from the QUrlQuery object and
    sets it on this QUrl object. This function does not have parsing parameters
    because the QUrlQuery contains data that is already parsed.
2741
2742 \sa query(), hasQuery()
2743*/
2744void QUrl::setQuery(const QUrlQuery &query)
2745{
2746 detach();
2747 d->clearError();
2748
2749 // we know the data is in the right format
2750 d->query = query.toString();
2751 if (query.isEmpty())
2752 d->sectionIsPresent &= ~QUrlPrivate::Query;
2753 else
2754 d->sectionIsPresent |= QUrlPrivate::Query;
2755}
2756
2757/*!
2758 \fn void QUrl::setQueryItems(const QList<QPair<QString, QString> > &query)
2759 \deprecated
2760
2761 Sets the query string of the URL to an encoded version of \a
2762 query. The contents of \a query are converted to a string
2763 internally, each pair delimited by the character returned by
2764 \l {QUrlQuery::queryPairDelimiter()}{queryPairDelimiter()}, and the key and value are delimited by
    \l {QUrlQuery::queryValueDelimiter()}{queryValueDelimiter()}.
2766
2767 \note This method does not encode spaces (ASCII 0x20) as plus (+) signs,
2768 like HTML forms do. If you need that kind of encoding, you must encode
2769 the value yourself and use QUrl::setEncodedQueryItems.
2770
2771 \obsolete Use QUrlQuery and setQuery().
2772
2773 \sa queryItems(), setEncodedQueryItems()
2774*/
2775
2776/*!
2777 \fn void QUrl::setEncodedQueryItems(const QList<QPair<QByteArray, QByteArray> > &query)
2778 \deprecated
2779 \since 4.4
2780
2781 Sets the query string of the URL to the encoded version of \a
2782 query. The contents of \a query are converted to a string
2783 internally, each pair delimited by the character returned by
2784 \l {QUrlQuery::queryPairDelimiter()}{queryPairDelimiter()}, and the key and value are delimited by
2785 \l {QUrlQuery::queryValueDelimiter()}{queryValueDelimiter()}.
2786
2787 \obsolete Use QUrlQuery and setQuery().
2788
2789 \sa encodedQueryItems(), setQueryItems()
2790*/
2791
2792/*!
2793 \fn void QUrl::addQueryItem(const QString &key, const QString &value)
2794 \deprecated
2795
2796 Inserts the pair \a key = \a value into the query string of the
2797 URL.
2798
    The key-value pair is encoded before it is added to the query. The
    pair is converted into separate strings internally. The \a key and
    \a value are first encoded into UTF-8 and then delimited by the
    character returned by \l {QUrlQuery::queryValueDelimiter()}{queryValueDelimiter()}.
    Each key-value pair is delimited by the character returned by
    \l {QUrlQuery::queryPairDelimiter()}{queryPairDelimiter()}.
2805
2806 \note This method does not encode spaces (ASCII 0x20) as plus (+) signs,
2807 like HTML forms do. If you need that kind of encoding, you must encode
2808 the value yourself and use QUrl::addEncodedQueryItem.
2809
2810 \obsolete Use QUrlQuery and setQuery().
2811
2812 \sa addEncodedQueryItem()
2813*/
2814
2815/*!
2816 \fn void QUrl::addEncodedQueryItem(const QByteArray &key, const QByteArray &value)
2817 \deprecated
2818 \since 4.4
2819
2820 Inserts the pair \a key = \a value into the query string of the
2821 URL.
2822
2823 \obsolete Use QUrlQuery and setQuery().
2824
2825 \sa addQueryItem()
2826*/
2827
2828/*!
2829 \fn QList<QPair<QString, QString> > QUrl::queryItems() const
2830 \deprecated
2831
    Returns the query string of the URL, as a list of key-value pairs.

    \note This method does not decode plus (+) signs as spaces (ASCII
    0x20), like HTML forms do. If you need that kind of decoding, you must
    use QUrl::encodedQueryItems and decode the data yourself.
2837
2838 \obsolete Use QUrlQuery.
2839
2840 \sa setQueryItems(), setEncodedQuery()
2841*/
2842
2843/*!
2844 \fn QList<QPair<QByteArray, QByteArray> > QUrl::encodedQueryItems() const
2845 \deprecated
2846 \since 4.4
2847
    Returns the query string of the URL, as a list of encoded key-value pairs.
2849
2850 \obsolete Use QUrlQuery.
2851
2852 \sa setEncodedQueryItems(), setQueryItems(), setEncodedQuery()
2853*/
2854
2855/*!
2856 \fn bool QUrl::hasQueryItem(const QString &key) const
2857 \deprecated
2858
2859 Returns \c true if there is a query string pair whose key is equal
2860 to \a key from the URL.
2861
2862 \obsolete Use QUrlQuery.
2863
2864 \sa hasEncodedQueryItem()
2865*/
2866
2867/*!
2868 \fn bool QUrl::hasEncodedQueryItem(const QByteArray &key) const
2869 \deprecated
2870 \since 4.4
2871
2872 Returns \c true if there is a query string pair whose key is equal
2873 to \a key from the URL.
2874
2875 \obsolete Use QUrlQuery.
2876
2877 \sa hasQueryItem()
2878*/
2879
2880/*!
2881 \fn QString QUrl::queryItemValue(const QString &key) const
2882 \deprecated
2883
2884 Returns the first query string value whose key is equal to \a key
2885 from the URL.
2886
    \note This method does not decode plus (+) signs as spaces (ASCII
    0x20), like HTML forms do. If you need that kind of decoding, you must
    use QUrl::encodedQueryItemValue and decode the data yourself.
2890
2891 \obsolete Use QUrlQuery.
2892
2893 \sa allQueryItemValues()
2894*/
2895
2896/*!
2897 \fn QByteArray QUrl::encodedQueryItemValue(const QByteArray &key) const
2898 \deprecated
2899 \since 4.4
2900
2901 Returns the first query string value whose key is equal to \a key
2902 from the URL.
2903
2904 \obsolete Use QUrlQuery.
2905
2906 \sa queryItemValue(), allQueryItemValues()
2907*/
2908
2909/*!
2910 \fn QStringList QUrl::allQueryItemValues(const QString &key) const
2911 \deprecated
2912
    Returns a list of query string values whose key is equal to
    \a key from the URL.

    \note This method does not decode plus (+) signs as spaces (ASCII
    0x20), like HTML forms do. If you need that kind of decoding, you must
    use QUrl::allEncodedQueryItemValues and decode the data yourself.
2919
2920 \obsolete Use QUrlQuery.
2921
2922 \sa queryItemValue()
2923*/
2924
2925/*!
2926 \fn QList<QByteArray> QUrl::allEncodedQueryItemValues(const QByteArray &key) const
2927 \deprecated
2928 \since 4.4
2929
    Returns a list of query string values whose key is equal to
    \a key from the URL.
2932
2933 \obsolete Use QUrlQuery.
2934
2935 \sa allQueryItemValues(), queryItemValue(), encodedQueryItemValue()
2936*/
2937
2938/*!
2939 \fn void QUrl::removeQueryItem(const QString &key)
2940 \deprecated
2941
2942 Removes the first query string pair whose key is equal to \a key
2943 from the URL.
2944
2945 \obsolete Use QUrlQuery.
2946
2947 \sa removeAllQueryItems()
2948*/
2949
2950/*!
2951 \fn void QUrl::removeEncodedQueryItem(const QByteArray &key)
2952 \deprecated
2953 \since 4.4
2954
2955 Removes the first query string pair whose key is equal to \a key
2956 from the URL.
2957
2958 \obsolete Use QUrlQuery.
2959
2960 \sa removeQueryItem(), removeAllQueryItems()
2961*/
2962
2963/*!
2964 \fn void QUrl::removeAllQueryItems(const QString &key)
2965 \deprecated
2966
2967 Removes all the query string pairs whose key is equal to \a key
2968 from the URL.
2969
2970 \obsolete Use QUrlQuery.
2971
2972 \sa removeQueryItem()
2973*/
2974
2975/*!
2976 \fn void QUrl::removeAllEncodedQueryItems(const QByteArray &key)
2977 \deprecated
2978 \since 4.4
2979
2980 Removes all the query string pairs whose key is equal to \a key
2981 from the URL.
2982
2983 \obsolete Use QUrlQuery.
2984
2985 \sa removeQueryItem()
2986*/
2987
2988/*!
2989 \fn QByteArray QUrl::encodedQuery() const
2990 \deprecated
2991
2992 Returns the query string of the URL in percent encoded form.
2993
2994 \obsolete Use query(QUrl::FullyEncoded).toLatin1()
2995
2996 \sa setEncodedQuery(), query()
2997*/
2998
2999/*!
3000 Returns the query string of the URL if there's a query string, or an empty
3001 result if not. To determine if the parsed URL contained a query string, use
3002 hasQuery().
3003
3004 The \a options argument controls how to format the query component. All
3005 values produce an unambiguous result. With QUrl::FullyDecoded, all
3006 percent-encoded sequences are decoded; otherwise, the returned value may
3007 contain some percent-encoded sequences for some control sequences not
3008 representable in decoded form in QString.
3009
3010 Note that use of QUrl::FullyDecoded in queries is discouraged, as queries
3011 often contain data that is supposed to remain percent-encoded, including
3012 the use of the "%2B" sequence to represent a plus character ('+').
3013
3014 \sa setQuery(), hasQuery()
3015*/
3016QString QUrl::query(ComponentFormattingOptions options) const
3017{
3018 QString result;
3019 if (d) {
3020 d->appendQuery(result, options, QUrlPrivate::Query);
3021 if (d->hasQuery() && result.isNull())
3022 result.detach();
3023 }
3024 return result;
3025}
3026
3027/*!
3028 Sets the fragment of the URL to \a fragment. The fragment is the
3029 last part of the URL, represented by a '#' followed by a string of
3030 characters. It is typically used in HTTP for referring to a
3031 certain link or point on a page:
3032
3033 \image qurl-fragment.png
3034
3035 The fragment is sometimes also referred to as the URL "reference".
3036
3037 Passing an argument of QString() (a null QString) will unset the fragment.
3038 Passing an argument of QString("") (an empty but not null QString) will set the
3039 fragment to an empty string (as if the original URL had a lone "#").
3040
3041 The \a fragment data is interpreted according to \a mode: in StrictMode,
3042 any '%' characters must be followed by exactly two hexadecimal characters
3043 and some characters (including space) are not allowed in undecoded form. In
3044 TolerantMode, all characters are accepted in undecoded form and the
3045 tolerant parser will correct stray '%' not followed by two hex characters.
3046 In DecodedMode, '%' stand for themselves and encoded characters are not
3047 possible.
3048
3049 QUrl::DecodedMode should be used when setting the fragment from a data
3050 source which is not a URL or with a fragment obtained by calling
3051 fragment() with the QUrl::FullyDecoded formatting option.
3052
3053 \sa fragment(), hasFragment()
3054*/
3055void QUrl::setFragment(const QString &fragment, ParsingMode mode)
3056{
3057 detach();
3058 d->clearError();
3059
3060 QString data = fragment;
3061 if (mode == DecodedMode) {
3062 parseDecodedComponent(data);
3063 mode = TolerantMode;
3064 }
3065
3066 d->setFragment(data, 0, data.length());
3067 if (fragment.isNull())
3068 d->sectionIsPresent &= ~QUrlPrivate::Fragment;
3069 else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Fragment, fragment))
3070 d->fragment.clear();
3071}
3072
3073/*!
3074 Returns the fragment of the URL. To determine if the parsed URL contained a
3075 fragment, use hasFragment().
3076
3077 The \a options argument controls how to format the fragment component. All
3078 values produce an unambiguous result. With QUrl::FullyDecoded, all
3079 percent-encoded sequences are decoded; otherwise, the returned value may
3080 contain some percent-encoded sequences for some control sequences not
3081 representable in decoded form in QString.
3082
3083 Note that QUrl::FullyDecoded may cause data loss if those non-representable
3084 sequences are present. It is recommended to use that value when the result
3085 will be used in a non-URL context.
3086
3087 \sa setFragment(), hasFragment()
3088*/
3089QString QUrl::fragment(ComponentFormattingOptions options) const
3090{
3091 QString result;
3092 if (d) {
3093 d->appendFragment(result, options, QUrlPrivate::Fragment);
3094 if (d->hasFragment() && result.isNull())
3095 result.detach();
3096 }
3097 return result;
3098}
3099
3100/*!
3101 \fn void QUrl::setEncodedFragment(const QByteArray &fragment)
3102 \deprecated
3103 \since 4.4
3104
3105 Sets the URL's fragment to the percent-encoded \a fragment. The fragment is the
3106 last part of the URL, represented by a '#' followed by a string of
3107 characters. It is typically used in HTTP for referring to a
3108 certain link or point on a page:
3109
3110 \image qurl-fragment.png
3111
3112 The fragment is sometimes also referred to as the URL "reference".
3113
3114 Passing an argument of QByteArray() (a null QByteArray) will unset the fragment.
3115 Passing an argument of QByteArray("") (an empty but not null QByteArray)
3116 will set the fragment to an empty string (as if the original URL
3117 had a lone "#").
3118
3119 \obsolete Use setFragment(), which has the same behavior of null / empty.
3120
3121 \sa setFragment(), encodedFragment()
3122*/
3123
3124/*!
3125 \fn QByteArray QUrl::encodedFragment() const
3126 \deprecated
3127 \since 4.4
3128
3129 Returns the fragment of the URL if it is defined; otherwise an
3130 empty string is returned. The returned value will have its
3131 non-ASCII and other control characters percent-encoded, as in
3132 toEncoded().
3133
    \obsolete Use fragment(QUrl::FullyEncoded).toLatin1().
3135
3136 \sa setEncodedFragment(), toEncoded()
3137*/
3138
3139/*!
3140 \since 4.2
3141
3142 Returns \c true if this URL contains a fragment (i.e., if # was seen on it).
3143
3144 \sa fragment(), setFragment()
3145*/
3146bool QUrl::hasFragment() const
3147{
3148 if (!d) return false;
3149 return d->hasFragment();
3150}
3151
3152#if QT_CONFIG(topleveldomain)
3153/*!
3154 \since 4.8
3155
    Returns the TLD (Top-Level Domain) of the URL (e.g. .co.uk, .net).
3157 Note that the return value is prefixed with a '.' unless the
3158 URL does not contain a valid TLD, in which case the function returns
3159 an empty string.
3160
    Note that this function considers a TLD to be any domain under which users
    can register subdomains, including many home, dynamic DNS websites and
3163 blogging providers. This is useful for determining whether two websites
3164 belong to the same infrastructure and communication should be allowed, such
3165 as browser cookies: two domains should be considered part of the same
3166 website if they share at least one label in addition to the value
3167 returned by this function.
3168
3169 \list
3170 \li \c{foo.co.uk} and \c{foo.com} do not share a top-level domain
3171 \li \c{foo.co.uk} and \c{bar.co.uk} share the \c{.co.uk} domain, but the next label is different
3172 \li \c{www.foo.co.uk} and \c{ftp.foo.co.uk} share the same top-level domain and one more label,
3173 so they are considered part of the same site
3174 \endlist
3175
3176 If \a options includes EncodeUnicode, the returned string will be in
3177 ASCII Compatible Encoding.
3178*/
3179QString QUrl::topLevelDomain(ComponentFormattingOptions options) const
3180{
3181 QString tld = qTopLevelDomain(host());
3182 if (options & EncodeUnicode) {
3183 return qt_ACE_do(tld, ToAceOnly, AllowLeadingDot);
3184 }
3185 return tld;
3186}
3187#endif
3188
3189/*!
3190 Returns the result of the merge of this URL with \a relative. This
3191 URL is used as a base to convert \a relative to an absolute URL.
3192
3193 If \a relative is not a relative URL, this function will return \a
3194 relative directly. Otherwise, the paths of the two URLs are
3195 merged, and the new URL returned has the scheme and authority of
3196 the base URL, but with the merged path, as in the following
3197 example:
3198
3199 \snippet code/src_corelib_io_qurl.cpp 5
3200
3201 Calling resolved() with ".." returns a QUrl whose directory is
3202 one level higher than the original. Similarly, calling resolved()
3203 with "../.." removes two levels from the path. If \a relative is
3204 "/", the path becomes "/".
3205
3206 \sa isRelative()
3207*/
3208QUrl QUrl::resolved(const QUrl &relative) const
3209{
3210 if (!d) return relative;
3211 if (!relative.d) return *this;
3212
3213 QUrl t;
3214 if (!relative.d->scheme.isEmpty()) {
3215 t = relative;
3216 t.detach();
3217 } else {
3218 if (relative.d->hasAuthority()) {
3219 t = relative;
3220 t.detach();
3221 } else {
3222 t.d = new QUrlPrivate;
3223
3224 // copy the authority
3225 t.d->userName = d->userName;
3226 t.d->password = d->password;
3227 t.d->host = d->host;
3228 t.d->port = d->port;
3229 t.d->sectionIsPresent = d->sectionIsPresent & QUrlPrivate::Authority;
3230
3231 if (relative.d->path.isEmpty()) {
3232 t.d->path = d->path;
3233 if (relative.d->hasQuery()) {
3234 t.d->query = relative.d->query;
3235 t.d->sectionIsPresent |= QUrlPrivate::Query;
3236 } else if (d->hasQuery()) {
3237 t.d->query = d->query;
3238 t.d->sectionIsPresent |= QUrlPrivate::Query;
3239 }
3240 } else {
3241 t.d->path = relative.d->path.startsWith(QLatin1Char('/'))
3242 ? relative.d->path
3243 : d->mergePaths(relative.d->path);
3244 if (relative.d->hasQuery()) {
3245 t.d->query = relative.d->query;
3246 t.d->sectionIsPresent |= QUrlPrivate::Query;
3247 }
3248 }
3249 }
3250 t.d->scheme = d->scheme;
3251 if (d->hasScheme())
3252 t.d->sectionIsPresent |= QUrlPrivate::Scheme;
3253 else
3254 t.d->sectionIsPresent &= ~QUrlPrivate::Scheme;
3255 t.d->flags |= d->flags & QUrlPrivate::IsLocalFile;
3256 }
3257 t.d->fragment = relative.d->fragment;
3258 if (relative.d->hasFragment())
3259 t.d->sectionIsPresent |= QUrlPrivate::Fragment;
3260 else
3261 t.d->sectionIsPresent &= ~QUrlPrivate::Fragment;
3262
3263 removeDotsFromPath(&t.d->path);
3264
3265#if defined(QURL_DEBUG)
3266 qDebug("QUrl(\"%ls\").resolved(\"%ls\") = \"%ls\"",
3267 qUtf16Printable(url()),
3268 qUtf16Printable(relative.url()),
3269 qUtf16Printable(t.url()));
3270#endif
3271 return t;
3272}
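
/*
    Illustrative sketch, not part of QUrl: the path handling that resolved()
    delegates to mergePaths() and removeDotsFromPath() can be pictured with the
    standalone helpers below. They follow the general idea of RFC 3986
    sections 5.2.3 and 5.2.4 using std::string, and assume that the base path
    is absolute (starts with '/'). The names resolvePath() and
    removeDotSegments() are hypothetical, and the private Qt helpers may
    differ in edge cases.

```cpp
#include <string>
#include <vector>

// Collapses "." and ".." segments of an absolute path. A ".." at the
// root is dropped, and a trailing "." or ".." leaves a trailing slash.
std::string removeDotSegments(const std::string &path)
{
    std::vector<std::string> kept;
    std::size_t start = 1; // skip the leading '/'
    while (start <= path.size()) {
        std::size_t end = path.find('/', start);
        if (end == std::string::npos)
            end = path.size();
        const std::string segment = path.substr(start, end - start);
        const bool isLast = (end == path.size());
        if (segment == "." || segment == "..") {
            if (segment == ".." && !kept.empty())
                kept.pop_back();
            if (isLast)
                kept.push_back(std::string()); // preserves the trailing '/'
        } else {
            kept.push_back(segment); // empty segments are kept as-is
        }
        start = end + 1;
    }
    std::string result;
    for (const std::string &segment : kept) {
        result += '/';
        result += segment;
    }
    return result.empty() ? "/" : result;
}

// Resolves relativePath against basePath: an absolute relative path
// replaces the base path, otherwise it replaces the base's last segment.
std::string resolvePath(const std::string &basePath, const std::string &relativePath)
{
    if (!relativePath.empty() && relativePath[0] == '/')
        return removeDotSegments(relativePath);
    const std::size_t slash = basePath.rfind('/');
    return removeDotSegments(basePath.substr(0, slash + 1) + relativePath);
}
```

    With a base path of "/b/c/d", resolving ".." gives "/b/" and resolving
    "../.." gives "/", matching the behavior described for resolved() above.
*/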
3273
3274/*!
3275 Returns \c true if the URL is relative; otherwise returns \c false. A URL is
    a relative reference if its scheme is undefined; this function is therefore
3277 equivalent to calling scheme().isEmpty().
3278
3279 Relative references are defined in RFC 3986 section 4.2.
3280
3281 \sa {Relative URLs vs Relative Paths}
3282*/
3283bool QUrl::isRelative() const
3284{
3285 if (!d) return true;
3286 return !d->hasScheme();
3287}
3288
3289/*!
3290 Returns a string representation of the URL. The output can be customized by
3291 passing flags with \a options. The option QUrl::FullyDecoded is not
3292 permitted in this function since it would generate ambiguous data.
3293
3294 The resulting QString can be passed back to a QUrl later on.
3295
3296 Synonym for toString(options).
3297
3298 \sa FormattingOptions, toEncoded(), toString()
3299*/
3300QString QUrl::url(FormattingOptions options) const
3301{
3302 return toString(options);
3303}
3304
3305/*!
3306 Returns a string representation of the URL. The output can be customized by
3307 passing flags with \a options. The option QUrl::FullyDecoded is not
3308 permitted in this function since it would generate ambiguous data.
3309
3310 The default formatting option is \l{QUrl::FormattingOptions}{PrettyDecoded}.
3311
3312 \sa FormattingOptions, url(), setUrl()
3313*/
3314QString QUrl::toString(FormattingOptions options) const
3315{
3316 QString url;
3317 if (!isValid()) {
3318 // also catches isEmpty()
3319 return url;
3320 }
3321 if (options == QUrl::FullyDecoded) {
3322 qWarning("QUrl: QUrl::FullyDecoded is not permitted when reconstructing the full URL");
3323 options = QUrl::PrettyDecoded;
3324 }
3325
3326 // return just the path if:
3327 // - QUrl::PreferLocalFile is passed
3328 // - QUrl::RemovePath isn't passed (rather stupid if the user did...)
3329 // - there's no query or fragment to return
3330 // that is, either they aren't present, or we're removing them
3331 // - it's a local file
3332 if (options.testFlag(QUrl::PreferLocalFile) && !options.testFlag(QUrl::RemovePath)
3333 && (!d->hasQuery() || options.testFlag(QUrl::RemoveQuery))
3334 && (!d->hasFragment() || options.testFlag(QUrl::RemoveFragment))
3335 && isLocalFile()) {
3336 url = d->toLocalFile(options);
3337 return url;
3338 }
3339
3340 // for the full URL, we consider that the reserved characters are prettier if encoded
3341 if (options & DecodeReserved)
3342 options &= ~EncodeReserved;
3343 else
3344 options |= EncodeReserved;
3345
3346 if (!(options & QUrl::RemoveScheme) && d->hasScheme())
3347 url += d->scheme + QLatin1Char(':');
3348
3349 bool pathIsAbsolute = d->path.startsWith(QLatin1Char('/'));
3350 if (!((options & QUrl::RemoveAuthority) == QUrl::RemoveAuthority) && d->hasAuthority()) {
3351 url += QLatin1String("//");
3352 d->appendAuthority(url, options, QUrlPrivate::FullUrl);
3353 } else if (isLocalFile() && pathIsAbsolute) {
3354 // Comply with the XDG file URI spec, which requires triple slashes.
3355 url += QLatin1String("//");
3356 }
3357
3358 if (!(options & QUrl::RemovePath))
3359 d->appendPath(url, options, QUrlPrivate::FullUrl);
3360
3361 if (!(options & QUrl::RemoveQuery) && d->hasQuery()) {
3362 url += QLatin1Char('?');
3363 d->appendQuery(url, options, QUrlPrivate::FullUrl);
3364 }
3365 if (!(options & QUrl::RemoveFragment) && d->hasFragment()) {
3366 url += QLatin1Char('#');
3367 d->appendFragment(url, options, QUrlPrivate::FullUrl);
3368 }
3369
3370 return url;
3371}
3372
3373/*!
3374 \since 5.0
3375
3376 Returns a human-displayable string representation of the URL.
3377 The output can be customized by passing flags with \a options.
3378 The option RemovePassword is always enabled, since passwords
3379 should never be shown back to users.
3380
3381 With the default options, the resulting QString can be passed back
3382 to a QUrl later on, but any password that was present initially will
3383 be lost.
3384
3385 \sa FormattingOptions, toEncoded(), toString()
3386*/
3387
3388QString QUrl::toDisplayString(FormattingOptions options) const
3389{
3390 return toString(options | RemovePassword);
3391}
3392
3393/*!
3394 \since 5.2
3395
3396 Returns an adjusted version of the URL.
3397 The output can be customized by passing flags with \a options.
3398
3399 The encoding options from QUrl::ComponentFormattingOption don't make
3400 much sense for this method, nor does QUrl::PreferLocalFile.
3401
3402 This is always equivalent to QUrl(url.toString(options)).
3403
3404 \sa FormattingOptions, toEncoded(), toString()
3405*/
3406QUrl QUrl::adjusted(QUrl::FormattingOptions options) const
3407{
3408 if (!isValid()) {
3409 // also catches isEmpty()
3410 return QUrl();
3411 }
3412 QUrl that = *this;
3413 if (options & RemoveScheme)
3414 that.setScheme(QString());
3415 if ((options & RemoveAuthority) == RemoveAuthority) {
3416 that.setAuthority(QString());
3417 } else {
3418 if ((options & RemoveUserInfo) == RemoveUserInfo)
3419 that.setUserInfo(QString());
3420 else if (options & RemovePassword)
3421 that.setPassword(QString());
3422 if (options & RemovePort)
3423 that.setPort(-1);
3424 }
3425 if (options & RemoveQuery)
3426 that.setQuery(QString());
3427 if (options & RemoveFragment)
3428 that.setFragment(QString());
3429 if (options & RemovePath) {
3430 that.setPath(QString());
3431 } else if (options & (StripTrailingSlash | RemoveFilename | NormalizePathSegments)) {
3432 that.detach();
3433 QString path;
3434 d->appendPath(path, options | FullyEncoded, QUrlPrivate::Path);
3435 that.d->setPath(path, 0, path.length());
3436 }
3437 return that;
3438}
3439
3440/*!
3441 Returns the encoded representation of the URL if it's valid;
3442 otherwise an empty QByteArray is returned. The output can be
3443 customized by passing flags with \a options.
3444
3445 The user info, path and fragment are all converted to UTF-8, and
3446 all non-ASCII characters are then percent encoded. The host name
3447 is encoded using Punycode.
3448*/
3449QByteArray QUrl::toEncoded(FormattingOptions options) const
3450{
3451 options &= ~(FullyDecoded | FullyEncoded);
3452 return toString(options | FullyEncoded).toLatin1();
3453}
3454
3455/*!
3456 \fn QUrl QUrl::fromEncoded(const QByteArray &input, ParsingMode parsingMode)
3457
3458 Parses \a input and returns the corresponding QUrl. \a input is
3459 assumed to be in encoded form, containing only ASCII characters.
3460
3461 Parses the URL using \a parsingMode. See setUrl() for more information on
3462 this parameter. QUrl::DecodedMode is not permitted in this context.
3463
3464 \sa toEncoded(), setUrl()
3465*/
3466QUrl QUrl::fromEncoded(const QByteArray &input, ParsingMode mode)
3467{
3468 return QUrl(QString::fromUtf8(input.constData(), input.size()), mode);
3469}
3470
3471/*!
3472 Returns a decoded copy of \a input. \a input is first decoded from
    percent encoding, then converted from UTF-8 to Unicode.
3474
3475 \note Given invalid input (such as a string containing the sequence "%G5",
3476 which is not a valid hexadecimal number) the output will be invalid as
3477 well. As an example: the sequence "%G5" could be decoded to 'W'.
3478*/
3479QString QUrl::fromPercentEncoding(const QByteArray &input)
3480{
3481 QByteArray ba = QByteArray::fromPercentEncoding(input);
3482 return QString::fromUtf8(ba, ba.size());
3483}
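
/*
    Illustrative sketch, not part of QUrl: the percent-decoding step described
    for fromPercentEncoding() above, written against std::string. The names
    hexValue() and percentDecode() are hypothetical. Unlike the Qt behavior
    mentioned in the note above, this sketch deliberately copies malformed
    sequences such as "%G5" through unchanged; that is a choice made here, not
    a description of QByteArray::fromPercentEncoding().

```cpp
#include <string>

// Returns the value of a hexadecimal digit, or -1 if c is not one.
int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Replaces each well-formed "%XY" with the byte it encodes. The result
// is a byte string; interpreting it as UTF-8 is left to the caller.
std::string percentDecode(const std::string &input)
{
    std::string out;
    out.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '%' && i + 2 < input.size()) {
            const int hi = hexValue(input[i + 1]);
            const int lo = hexValue(input[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2; // skip the two hex digits just consumed
                continue;
            }
        }
        out += input[i]; // ordinary character or malformed escape
    }
    return out;
}
```

    For example, percentDecode("a%20b") gives "a b" and percentDecode("%2B")
    gives "+".
*/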
3484
3485/*!
3486 Returns an encoded copy of \a input. \a input is first converted
    to UTF-8, and all characters that are not in the unreserved group
3488 are percent encoded. To prevent characters from being percent encoded
3489 pass them to \a exclude. To force characters to be percent encoded pass
3490 them to \a include.
3491
3492 Unreserved is defined as:
3493 \tt {ALPHA / DIGIT / "-" / "." / "_" / "~"}
3494
3495 \snippet code/src_corelib_io_qurl.cpp 6
3496*/
3497QByteArray QUrl::toPercentEncoding(const QString &input, const QByteArray &exclude, const QByteArray &include)
3498{
3499 return input.toUtf8().toPercentEncoding(exclude, include);
3500}
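
/*
    Illustrative sketch, not part of QUrl: the encoding rule described for
    toPercentEncoding() above, written against std::string and assuming the
    input is already UTF-8. The name percentEncode() is hypothetical. In this
    sketch a character listed in both exclude and include is encoded, which is
    an assumption about precedence rather than a documented guarantee.

```cpp
#include <string>

// Percent-encodes every byte that is not unreserved
// (ALPHA / DIGIT / "-" / "." / "_" / "~"). Bytes in exclude are left
// alone; bytes in include are always encoded.
std::string percentEncode(const std::string &utf8,
                          const std::string &exclude = std::string(),
                          const std::string &include = std::string())
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        const char ch = utf8[i];
        const unsigned char c = static_cast<unsigned char>(ch);
        const bool unreserved =
            (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        const bool keep =
            (unreserved || exclude.find(ch) != std::string::npos)
            && include.find(ch) == std::string::npos;
        if (keep) {
            out += ch;
        } else {
            out += '%';
            out += hex[c >> 4];   // high nibble
            out += hex[c & 0x0F]; // low nibble
        }
    }
    return out;
}
```

    For example, percentEncode("a b") gives "a%20b", percentEncode("a/b", "/")
    leaves the slash alone, and percentEncode("a-b", "", "-") gives "a%2Db".
*/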
3501
3502/*!
3503 \internal
3504 \since 5.0
3505 Used in the setEncodedXXX compatibility functions. Converts \a ba to
3506 QString form.
3507*/
3508QString QUrl::fromEncodedComponent_helper(const QByteArray &ba)
3509{
3510 return qt_urlRecodeByteArray(ba);
3511}
3512
3513/*!
3514 \fn QByteArray QUrl::toPunycode(const QString &uc)
3515 \obsolete
    Returns \a uc in Punycode encoding.
3517
3518 Punycode is a Unicode encoding used for internationalized domain
3519 names, as defined in RFC3492. If you want to convert a domain name from
3520 Unicode to its ASCII-compatible representation, use toAce().
3521*/
3522
3523/*!
3524 \fn QString QUrl::fromPunycode(const QByteArray &pc)
3525 \obsolete
3526 Returns the Punycode decoded representation of \a pc.
3527
3528 Punycode is a Unicode encoding used for internationalized domain
3529 names, as defined in RFC3492. If you want to convert a domain from
3530 its ASCII-compatible encoding to the Unicode representation, use
3531 fromAce().
3532*/
3533
3534/*!
3535 \since 4.2
3536
3537 Returns the Unicode form of the given domain name
3538 \a domain, which is encoded in the ASCII Compatible Encoding (ACE).
3539 The result of this function is considered equivalent to \a domain.
3540
    If the value in \a domain cannot be decoded, it will be converted
3542 to QString and returned.
3543
3544 The ASCII Compatible Encoding (ACE) is defined by RFC 3490, RFC 3491
3545 and RFC 3492. It is part of the Internationalizing Domain Names in
3546 Applications (IDNA) specification, which allows for domain names
3547 (like \c "example.com") to be written using international
3548 characters.
3549*/
3550QString QUrl::fromAce(const QByteArray &domain)
3551{
3552 return qt_ACE_do(QString::fromLatin1(domain), NormalizeAce, ForbidLeadingDot /*FIXME: make configurable*/);
3553}
3554
3555/*!
3556 \since 4.2
3557
3558 Returns the ASCII Compatible Encoding of the given domain name \a domain.
3559 The result of this function is considered equivalent to \a domain.
3560
3561 The ASCII-Compatible Encoding (ACE) is defined by RFC 3490, RFC 3491
3562 and RFC 3492. It is part of the Internationalizing Domain Names in
3563 Applications (IDNA) specification, which allows for domain names
3564 (like \c "example.com") to be written using international
3565 characters.
3566
3567 This function returns an empty QByteArray if \a domain is not a valid
3568 hostname. Note, in particular, that IPv6 literals are not valid domain
3569 names.
3570*/
3571QByteArray QUrl::toAce(const QString &domain)
3572{
3573 return qt_ACE_do(domain, ToAceOnly, ForbidLeadingDot /*FIXME: make configurable*/).toLatin1();
3574}
3575
3576/*!
3577 \internal
3578
3579 Returns \c true if this URL is "less than" the given \a url. This
3580 provides a means of ordering URLs.
3581*/
3582bool QUrl::operator <(const QUrl &url) const
3583{
3584 if (!d || !url.d) {
3585 bool thisIsEmpty = !d || d->isEmpty();
3586 bool thatIsEmpty = !url.d || url.d->isEmpty();
3587
3588 // sort an empty URL first
3589 return thisIsEmpty && !thatIsEmpty;
3590 }
3591
3592 int cmp;
3593 cmp = d->scheme.compare(url.d->scheme);
3594 if (cmp != 0)
3595 return cmp < 0;
3596
3597 cmp = d->userName.compare(url.d->userName);
3598 if (cmp != 0)
3599 return cmp < 0;
3600
3601 cmp = d->password.compare(url.d->password);
3602 if (cmp != 0)
3603 return cmp < 0;
3604
3605 cmp = d->host.compare(url.d->host);
3606 if (cmp != 0)
3607 return cmp < 0;
3608
3609 if (d->port != url.d->port)
3610 return d->port < url.d->port;
3611
3612 cmp = d->path.compare(url.d->path);
3613 if (cmp != 0)
3614 return cmp < 0;
3615
3616 if (d->hasQuery() != url.d->hasQuery())
3617 return url.d->hasQuery();
3618
3619 cmp = d->query.compare(url.d->query);
3620 if (cmp != 0)
3621 return cmp < 0;
3622
3623 if (d->hasFragment() != url.d->hasFragment())
3624 return url.d->hasFragment();
3625
3626 cmp = d->fragment.compare(url.d->fragment);
3627 return cmp < 0;
3628}
3629
3630/*!
3631 Returns \c true if this URL and the given \a url are equal;
3632 otherwise returns \c false.
3633*/
3634bool QUrl::operator ==(const QUrl &url) const
3635{
3636 if (!d && !url.d)
3637 return true;
3638 if (!d)
3639 return url.d->isEmpty();
3640 if (!url.d)
3641 return d->isEmpty();
3642
3643 // First, compare which sections are present, since it speeds up the
3644 // processing considerably. We just have to ignore the host-is-present flag
3645 // for local files (the "file" protocol), due to the requirements of the
3646 // XDG file URI specification.
3647 int mask = QUrlPrivate::FullUrl;
3648 if (isLocalFile())
3649 mask &= ~QUrlPrivate::Host;
3650 return (d->sectionIsPresent & mask) == (url.d->sectionIsPresent & mask) &&
3651 d->scheme == url.d->scheme &&
3652 d->userName == url.d->userName &&
3653 d->password == url.d->password &&
3654 d->host == url.d->host &&
3655 d->port == url.d->port &&
3656 d->path == url.d->path &&
3657 d->query == url.d->query &&
3658 d->fragment == url.d->fragment;
3659}
3660
3661/*!
3662 \since 5.2
3663
3664 Returns \c true if this URL and the given \a url are equal after
3665 applying \a options to both; otherwise returns \c false.
3666
3667 This is equivalent to calling adjusted(options) on both URLs
3668 and comparing the resulting urls, but faster.
*/
bool QUrl::matches(const QUrl &url, FormattingOptions options) const
{
    if (!d && !url.d)
        return true;
    if (!d)
        return url.d->isEmpty();
    if (!url.d)
        return d->isEmpty();

    // First, compare which sections are present, since it speeds up the
    // processing considerably. We just have to ignore the host-is-present flag
    // for local files (the "file" protocol), due to the requirements of the
    // XDG file URI specification.
    int mask = QUrlPrivate::FullUrl;
    if (isLocalFile())
        mask &= ~QUrlPrivate::Host;

    if (options.testFlag(QUrl::RemoveScheme))
        mask &= ~QUrlPrivate::Scheme;
    else if (d->scheme != url.d->scheme)
        return false;

    if (options.testFlag(QUrl::RemovePassword))
        mask &= ~QUrlPrivate::Password;
    else if (d->password != url.d->password)
        return false;

    if (options.testFlag(QUrl::RemoveUserInfo))
        mask &= ~QUrlPrivate::UserName;
    else if (d->userName != url.d->userName)
        return false;

    if (options.testFlag(QUrl::RemovePort))
        mask &= ~QUrlPrivate::Port;
    else if (d->port != url.d->port)
        return false;

    if (options.testFlag(QUrl::RemoveAuthority))
        mask &= ~QUrlPrivate::Host;
    else if (d->host != url.d->host)
        return false;

    if (options.testFlag(QUrl::RemoveQuery))
        mask &= ~QUrlPrivate::Query;
    else if (d->query != url.d->query)
        return false;

    if (options.testFlag(QUrl::RemoveFragment))
        mask &= ~QUrlPrivate::Fragment;
    else if (d->fragment != url.d->fragment)
        return false;

    if ((d->sectionIsPresent & mask) != (url.d->sectionIsPresent & mask))
        return false;

    if (options.testFlag(QUrl::RemovePath))
        return true;

    // Compare paths, after applying path-related options
    QString path1;
    d->appendPath(path1, options, QUrlPrivate::Path);
    QString path2;
    url.d->appendPath(path2, options, QUrlPrivate::Path);
    return path1 == path2;
}

/*!
    Returns \c true if this URL and the given \a url are not equal;
    otherwise returns \c false.
*/
bool QUrl::operator !=(const QUrl &url) const
{
    return !(*this == url);
}

/*!
    Assigns the specified \a url to this object.
*/
QUrl &QUrl::operator =(const QUrl &url)
{
    if (!d) {
        if (url.d) {
            url.d->ref.ref();
            d = url.d;
        }
    } else {
        if (url.d)
            qAtomicAssign(d, url.d);
        else
            clear();
    }
    return *this;
}

/*!
    Assigns the specified \a url to this object. The string is parsed in
    TolerantMode; an empty string clears this URL.
*/
QUrl &QUrl::operator =(const QString &url)
{
    if (url.isEmpty()) {
        clear();
    } else {
        detach();
        d->parse(url, TolerantMode);
    }
    return *this;
}

/*!
    \fn void QUrl::swap(QUrl &other)
    \since 4.8

    Swaps URL \a other with this URL. This operation is very
    fast and never fails.
*/

/*!
    \internal

    Forces a detach.
*/
void QUrl::detach()
{
    if (!d)
        d = new QUrlPrivate;
    else
        qAtomicDetach(d);
}

/*!
    \internal
*/
bool QUrl::isDetached() const
{
    return !d || d->ref.loadRelaxed() == 1;
}


/*!
    Returns a QUrl representation of \a localFile, interpreted as a local
    file. This function accepts paths separated by slashes as well as the
    native separator for this platform.

    This function also accepts paths with a doubled leading slash (or
    backslash) to indicate a remote file, as in
    "//servername/path/to/file.txt". Note that only certain platforms can
    actually open this file using QFile::open().

    An empty \a localFile leads to an empty URL (since Qt 5.4).

    \snippet code/src_corelib_io_qurl.cpp 16

    In the first line of the snippet above, a file URL is constructed from a
    local, relative path. A file URL with a relative path only makes sense
    if there is a base URL to resolve it against. For example:

    \snippet code/src_corelib_io_qurl.cpp 17

    To resolve such a URL, it's necessary to remove the scheme beforehand:

    \snippet code/src_corelib_io_qurl.cpp 18

    For this reason, it is better to use a relative URL (that is, no scheme)
    for relative file paths:

    \snippet code/src_corelib_io_qurl.cpp 19
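
    The handling of drive letters and of the doubled leading slash can be
    pictured with the simplified sketch below. It is an illustration only:
    \c FileUrlParts and \c splitLocalFile() are hypothetical names, the input
    is assumed to use forward slashes already, and the Windows WebDAV
    ("@SSL") case and other edge cases are left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type; QUrl stores these parts internally.
struct FileUrlParts {
    std::string host;
    std::string path;
};

// Split a slash-separated local path into the host and path of a file URL.
FileUrlParts splitLocalFile(std::string p)
{
    FileUrlParts parts;
    if (p.size() > 1 && p[1] == ':' && p[0] != '/') {
        // Drive letter: "C:/x" becomes "/C:/x".
        p.insert(p.begin(), '/');
    } else if (p.compare(0, 2, "//") == 0) {
        // Remote file: "//servername/path" has the host "servername".
        const std::string::size_type slash = p.find('/', 2);
        if (slash == std::string::npos) {
            parts.host = p.substr(2);
            p.clear();
        } else {
            parts.host = p.substr(2, slash - 2);
            p.erase(0, slash);
        }
    }
    parts.path = p;
    return parts;
}
```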

    \sa toLocalFile(), isLocalFile(), QDir::toNativeSeparators()
*/
QUrl QUrl::fromLocalFile(const QString &localFile)
{
    QUrl url;
    if (localFile.isEmpty())
        return url;
    QString scheme = fileScheme();
    QString deslashified = QDir::fromNativeSeparators(localFile);

    // magic for drives on Windows
    if (deslashified.length() > 1 && deslashified.at(1) == QLatin1Char(':') && deslashified.at(0) != QLatin1Char('/')) {
        deslashified.prepend(QLatin1Char('/'));
    } else if (deslashified.startsWith(QLatin1String("//"))) {
        // magic for shared drives on Windows
        int indexOfPath = deslashified.indexOf(QLatin1Char('/'), 2);
        QStringRef hostSpec = deslashified.midRef(2, indexOfPath - 2);
        // Check for Windows-specific WebDAV specification: "//host@SSL/path".
        if (hostSpec.endsWith(webDavSslTag(), Qt::CaseInsensitive)) {
            hostSpec.truncate(hostSpec.size() - 4);
            scheme = webDavScheme();
        }
        url.setHost(hostSpec.toString());

        if (indexOfPath > 2)
            deslashified = deslashified.right(deslashified.length() - indexOfPath);
        else
            deslashified.clear();
    }

    url.setScheme(scheme);
    url.setPath(deslashified, DecodedMode);
    return url;
}

/*!
    Returns the path of this URL formatted as a local file path. The path
    returned will use forward slashes, even if it was originally created
    from one with backslashes.

    If this URL contains a non-empty hostname, it will be encoded in the
    returned value in the form found on SMB networks (for example,
    "//servername/path/to/file.txt").

    \snippet code/src_corelib_io_qurl.cpp 20

    Note: if the path component of this URL contains a non-UTF-8 binary
    sequence (such as %80), the behaviour of this function is undefined.

    \sa fromLocalFile(), isLocalFile()
*/
QString QUrl::toLocalFile() const
{
    // the call to isLocalFile() also ensures that we're parsed
    if (!isLocalFile())
        return QString();

    return d->toLocalFile(QUrl::FullyDecoded);
}

/*!
    \since 4.8
    Returns \c true if this URL is pointing to a local file path. A URL is a
    local file path if the scheme is "file".

    Note that this function considers URLs with hostnames to be local file
    paths, even if the eventual file path cannot be opened with
    QFile::open().

    \sa fromLocalFile(), toLocalFile()
*/
bool QUrl::isLocalFile() const
{
    return d && d->isLocalFile();
}

/*!
    Returns \c true if this URL is a parent of \a childUrl. \a childUrl is a child
    of this URL if the two URLs share the same scheme and authority,
    and this URL's path is a parent of the path of \a childUrl.
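
    The path part of that test can be sketched as follows. This is an
    illustration only: \c pathIsParentOf() is a hypothetical helper that
    uses std::string instead of QString and ignores scheme and authority.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of QUrl: mirrors only the path comparison.
bool pathIsParentOf(const std::string &parent, const std::string &child)
{
    // The child must be strictly longer and start with the parent path.
    if (child.size() <= parent.size()
        || child.compare(0, parent.size(), parent) != 0)
        return false;
    // If the parent already ends in '/', any longer child qualifies.
    if (!parent.empty() && parent.back() == '/')
        return true;
    // Otherwise the child must continue with '/', so that "/a/b" is not
    // treated as a parent of "/a/bc".
    return child[parent.size()] == '/';
}
```

    Under this sketch a path is never a parent of itself, because the child
    path has to be strictly longer than the parent path.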
*/
bool QUrl::isParentOf(const QUrl &childUrl) const
{
    QString childPath = childUrl.path();

    if (!d)
        return ((childUrl.scheme().isEmpty())
            && (childUrl.authority().isEmpty())
            && childPath.length() > 0 && childPath.at(0) == QLatin1Char('/'));

    QString ourPath = path();

    return ((childUrl.scheme().isEmpty() || d->scheme == childUrl.scheme())
            && (childUrl.authority().isEmpty() || authority() == childUrl.authority())
            && childPath.startsWith(ourPath)
            && ((ourPath.endsWith(QLatin1Char('/')) && childPath.length() > ourPath.length())
                || (!ourPath.endsWith(QLatin1Char('/'))
                    && childPath.length() > ourPath.length() && childPath.at(ourPath.length()) == QLatin1Char('/'))));
}


#ifndef QT_NO_DATASTREAM
/*! \relates QUrl

    Writes the URL \a url to the stream \a out and returns a reference
    to the stream.

    \sa {Serializing Qt Data Types}{Format of the QDataStream operators}
*/
QDataStream &operator<<(QDataStream &out, const QUrl &url)
{
    QByteArray u;
    if (url.isValid())
        u = url.toEncoded();
    out << u;
    return out;
}

/*! \relates QUrl

    Reads a URL into \a url from the stream \a in and returns a
    reference to the stream.

    \sa {Serializing Qt Data Types}{Format of the QDataStream operators}
*/
QDataStream &operator>>(QDataStream &in, QUrl &url)
{
    QByteArray u;
    in >> u;
    url.setUrl(QString::fromLatin1(u));
    return in;
}
#endif // QT_NO_DATASTREAM

#ifndef QT_NO_DEBUG_STREAM
QDebug operator<<(QDebug d, const QUrl &url)
{
    QDebugStateSaver saver(d);
    d.nospace() << "QUrl(" << url.toDisplayString() << ')';
    return d;
}
#endif

static QString errorMessage(QUrlPrivate::ErrorCode errorCode, const QString &errorSource, int errorPosition)
{
    QChar c = uint(errorPosition) < uint(errorSource.length()) ?
                errorSource.at(errorPosition) : QChar(QChar::Null);

    switch (errorCode) {
    case QUrlPrivate::NoError:
        Q_ASSERT_X(false, "QUrl::errorString",
                   "Impossible: QUrl::errorString should have treated this condition");
        Q_UNREACHABLE();
        return QString();

    case QUrlPrivate::InvalidSchemeError: {
        auto msg = QLatin1String("Invalid scheme (character '%1' not permitted)");
        return msg.arg(c);
    }

    case QUrlPrivate::InvalidUserNameError:
        return QLatin1String("Invalid user name (character '%1' not permitted)")
                .arg(c);

    case QUrlPrivate::InvalidPasswordError:
        return QLatin1String("Invalid password (character '%1' not permitted)")
                .arg(c);

    case QUrlPrivate::InvalidRegNameError:
        if (errorPosition != -1)
            return QLatin1String("Invalid hostname (character '%1' not permitted)")
                    .arg(c);
        else
            return QStringLiteral("Invalid hostname (contains invalid characters)");
    case QUrlPrivate::InvalidIPv4AddressError:
        return QString(); // doesn't happen yet
    case QUrlPrivate::InvalidIPv6AddressError:
        return QStringLiteral("Invalid IPv6 address");
    case QUrlPrivate::InvalidCharacterInIPv6Error:
        return QLatin1String("Invalid IPv6 address (character '%1' not permitted)").arg(c);
    case QUrlPrivate::InvalidIPvFutureError:
        return QLatin1String("Invalid IPvFuture address (character '%1' not permitted)").arg(c);
    case QUrlPrivate::HostMissingEndBracket:
        return QStringLiteral("Expected ']' to match '[' in hostname");

    case QUrlPrivate::InvalidPortError:
        return QStringLiteral("Invalid port or port number out of range");
    case QUrlPrivate::PortEmptyError:
        return QStringLiteral("Port field was empty");

    case QUrlPrivate::InvalidPathError:
        return QLatin1String("Invalid path (character '%1' not permitted)")
                .arg(c);

    case QUrlPrivate::InvalidQueryError:
        return QLatin1String("Invalid query (character '%1' not permitted)")
                .arg(c);

    case QUrlPrivate::InvalidFragmentError:
        return QLatin1String("Invalid fragment (character '%1' not permitted)")
                .arg(c);

    case QUrlPrivate::AuthorityPresentAndPathIsRelative:
        return QStringLiteral("Path component is relative and authority is present");
    case QUrlPrivate::AuthorityAbsentAndPathIsDoubleSlash:
        return QStringLiteral("Path component starts with '//' and authority is absent");
    case QUrlPrivate::RelativeUrlPathContainsColonBeforeSlash:
        return QStringLiteral("Relative URL's path component contains ':' before any '/'");
    }

    Q_ASSERT_X(false, "QUrl::errorString", "Cannot happen, unknown error");
    Q_UNREACHABLE();
    return QString();
}

static inline void appendComponentIfPresent(QString &msg, bool present, const char *componentName,
                                            const QString &component)
{
    if (present) {
        msg += QLatin1String(componentName);
        msg += QLatin1Char('"');
        msg += component;
        msg += QLatin1String("\",");
    }
}

/*!
    \since 4.2

    Returns an error message if the last operation that modified this QUrl
    object ran into a parsing error. If no error was detected, this function
    returns an empty string and isValid() returns \c true.

    The error message returned by this function is technical in nature and may
    not be understood by end users. It is mostly useful to developers trying to
    understand why QUrl will not accept some input.
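
    The component list at the end of the message is assembled roughly as in
    the sketch below. It is an illustration only: \c appendIfPresent() and
    \c describeComponents() are hypothetical std::string analogues, and
    presence is approximated here by a non-empty value rather than by the
    internal section flags.

```cpp
#include <cassert>
#include <string>

// Hypothetical analogue of the internal helper: appends the name, then the
// value in double quotes, then a comma, for a component that is present.
void appendIfPresent(std::string &msg, bool present,
                     const char *name, const std::string &value)
{
    if (present) {
        msg += name;
        msg += '"';
        msg += value;
        msg += "\",";
    }
}

// Build the component list and drop the trailing comma, if any.
std::string describeComponents(const std::string &scheme, const std::string &host)
{
    std::string msg;
    appendIfPresent(msg, !scheme.empty(), " scheme = ", scheme);
    appendIfPresent(msg, !host.empty(), " host = ", host);
    if (!msg.empty() && msg.back() == ',')
        msg.pop_back();
    return msg;
}
```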

    \sa QUrl::ParsingMode
*/
QString QUrl::errorString() const
{
    QString msg;
    if (!d)
        return msg;

    QString errorSource;
    int errorPosition = 0;
    QUrlPrivate::ErrorCode errorCode = d->validityError(&errorSource, &errorPosition);
    if (errorCode == QUrlPrivate::NoError)
        return msg;

    msg += errorMessage(errorCode, errorSource, errorPosition);
    msg += QLatin1String("; source was \"");
    msg += errorSource;
    msg += QLatin1String("\";");
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Scheme,
                             " scheme = ", d->scheme);
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::UserInfo,
                             " userinfo = ", userInfo());
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Host,
                             " host = ", d->host);
    appendComponentIfPresent(msg, d->port != -1,
                             " port = ", QString::number(d->port));
    appendComponentIfPresent(msg, !d->path.isEmpty(),
                             " path = ", d->path);
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Query,
                             " query = ", d->query);
    appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Fragment,
                             " fragment = ", d->fragment);
    if (msg.endsWith(QLatin1Char(',')))
        msg.chop(1);
    return msg;
}

/*!
    \since 5.1

    Converts a list of \a urls into a list of QString objects, using toString(\a options).
*/
QStringList QUrl::toStringList(const QList<QUrl> &urls, FormattingOptions options)
{
    QStringList lst;
    lst.reserve(urls.size());
    for (const QUrl &url : urls)
        lst.append(url.toString(options));
    return lst;
}

/*!
    \since 5.1

    Converts a list of strings representing \a urls into a list of URLs, using QUrl(str, \a mode).
    Note that this means all strings must be URLs, not, for instance, local paths.
*/
QList<QUrl> QUrl::fromStringList(const QStringList &urls, ParsingMode mode)
{
    QList<QUrl> lst;
    lst.reserve(urls.size());
    for (const QString &str : urls)
        lst.append(QUrl(str, mode));
    return lst;
}

/*!
    \typedef QUrl::DataPtr
    \internal
*/

/*!
    \fn DataPtr &QUrl::data_ptr()
    \internal
*/

/*!
    Returns the hash value for the \a url. If specified, \a seed is used to
    initialize the hash.

    \relates QHash
    \since 5.0
*/
uint qHash(const QUrl &url, uint seed) noexcept
{
    if (!