Donner 0.8.0-pre
Embeddable browser-grade SVG2 engine
donner::xml Namespace Reference

XML parsing and document model support. The top-level objects are donner::xml::XMLParser and donner::xml::XMLDocument.

Classes

struct  XMLAttributeAtSourceOffset
 Result of locating an attribute at a source offset.
struct  XMLEditIntent
 Source edit request from an editor/source view.
struct  XMLMutation
 XML DOM mutation emitted by an incremental source edit.
struct  ApplySourceEditResult
 Result from XMLDocument::applySourceEdit.
class  XMLDocument
 Represents an XML document, which holds a collection of XMLNode as the document tree.
class  XMLIncrementalParser
 Parser entry points for source-edit fragments.
struct  XMLAttributeSourceLocation
 Resolved source metadata for one serialized XML attribute.
class  XMLNode
 Represents an XML element belonging to a donner::xml::XMLDocument.
class  XMLParser
 Parses an XML document from a string.
struct  XMLQualifiedName
 Represents an XML attribute name with an optional namespace.
struct  XMLQualifiedNameRef
 Reference type for XMLQualifiedName, to pass the value to APIs without needing to allocate an RcString.
struct  SourceAnchorId
 Stable identifier for a source anchor stored in XMLSourceStore.
struct  ResolvedSourceSpan
 Current resolved byte span between two source anchors.
struct  SourceAnchorSpan
 Pair of anchors representing a source span.
struct  XMLSourceDelta
 Describes one applied source edit.
class  XMLSourceStore
 Owns XML source bytes and mutable source anchors.
struct  XMLToken
 A single token emitted by the XML tokenizer.

Enumerations

enum class  ReparseScope : std::uint8_t {
  AttributeValue ,
  OpeningTag ,
  TextNode ,
  ElementSubtree ,
  Document
}
 Local XML reparse scope chosen for a source edit.
enum class  SourceAnchorBias : std::uint8_t {
  Before ,
  After
}
 Controls how an anchor behaves when text is inserted exactly at its offset.
enum class  XMLTokenType : std::uint8_t {
  TagOpen ,
  TagName ,
  TagClose ,
  TagSelfClose ,
  AttributeName ,
  AttributeValue ,
  Comment ,
  CData ,
  TextContent ,
  XmlDeclaration ,
  Doctype ,
  EntityRef ,
  ProcessingInstruction ,
  Whitespace ,
  ErrorRecovery
}
 Token types emitted by the XML tokenizer (Tokenize).

Functions

std::ostream & operator<< (std::ostream &os, ReparseScope scope)
 Print a ReparseScope.
std::optional< RcString > EscapeAttributeValue (std::string_view value, char quoteChar='"')
 Escape a string for use as an XML attribute value, producing text that round-trips through donner::xml::XMLParser::Parse to recover the original bytes.
template<typename TokenSink>
void Tokenize (std::string_view source, TokenSink &&sink)
 Tokenize an XML source string, emitting XMLToken values to sink.
std::ostream & operator<< (std::ostream &os, XMLTokenType type)
 Ostream output operator for XMLTokenType.

Detailed Description

XML parsing and document model support. The top-level objects are donner::xml::XMLParser and donner::xml::XMLDocument.


Class Documentation

◆ donner::xml::XMLAttributeAtSourceOffset

struct donner::xml::XMLAttributeAtSourceOffset

Result of locating an attribute at a source offset.

Class Members
SourceRange location Current full source range of the attribute.
XMLQualifiedName name Attribute name.
XMLNode node Element node that owns the attribute.
char quote = '"' Quote delimiter used for the value.
SourceRange valueLocation Current unquoted value source range.

◆ donner::xml::XMLEditIntent

struct donner::xml::XMLEditIntent

Source edit request from an editor/source view.

Class Members
SourceRange range Source byte range to replace.
string_view replacement Replacement source bytes.
uint64_t sourceVersion = 0 Source version observed by the caller.
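
To make the three members above concrete, here is a minimal, self-contained sketch of applying a byte-range replacement to a source string. It does not use Donner's API: `ByteRange`, `EditIntentSketch`, and `applyEditSketch` are hypothetical stand-ins, the range is assumed to be half-open, and rejecting an edit whose `sourceVersion` does not match the current version is an assumption about how that field might be used, not documented behavior.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for a half-open byte range [start, end).
struct ByteRange {
  std::size_t start = 0;
  std::size_t end = 0;
};

// Hypothetical mirror of the three XMLEditIntent members listed above.
struct EditIntentSketch {
  ByteRange range;
  std::string_view replacement;
  std::uint64_t sourceVersion = 0;
};

// Returns the edited source, or std::nullopt if the intent is stale or out of bounds.
// Treating a version mismatch as a rejection is an assumption made for this sketch.
std::optional<std::string> applyEditSketch(std::string_view source,
                                           std::uint64_t currentVersion,
                                           const EditIntentSketch& intent) {
  if (intent.sourceVersion != currentVersion) {
    return std::nullopt;
  }
  if (intent.range.start > intent.range.end || intent.range.end > source.size()) {
    return std::nullopt;
  }

  // Splice: bytes before the range, then the replacement, then bytes after the range.
  std::string result;
  result.append(source.substr(0, intent.range.start));
  result.append(intent.replacement);
  result.append(source.substr(intent.range.end));
  return result;
}
```

With the source `<rect fill="red"/>`, an intent whose range covers bytes 12 to 15 (the unquoted `red`) and whose replacement is `blue` yields `<rect fill="blue"/>`.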

◆ donner::xml::ApplySourceEditResult

struct donner::xml::ApplySourceEditResult

Result from XMLDocument::applySourceEdit.

Class Members
bool applied = false True if source bytes were changed.
optional< ParseDiagnostic > diagnostic Diagnostic if local reparsing failed.
vector< XMLMutation > mutations DOM mutations emitted by this operation.
ReparseScope scope = ReparseScope::Document Reparse scope selected for the edit.
vector< XMLSourceDelta > sourceDeltas Source edits applied by this operation.

◆ donner::xml::XMLAttributeSourceLocation

struct donner::xml::XMLAttributeSourceLocation

Resolved source metadata for one serialized XML attribute.

Class Members
SourceRange fullRange Full serialized attribute range, e.g. fill="red".
char quote = '"' Quote delimiter used for the value.
SourceRange valueRange Unquoted value range, e.g. red.

◆ donner::xml::SourceAnchorSpan

struct donner::xml::SourceAnchorSpan

Pair of anchors representing a source span.

Class Members
SourceAnchorId end End anchor.
SourceAnchorId start Start anchor.

Enumeration Type Documentation

◆ ReparseScope

enum class donner::xml::ReparseScope : std::uint8_t

Local XML reparse scope chosen for a source edit.

Enumerator
AttributeValue 

Edit was contained inside one quoted attribute value.

OpeningTag 

Edit touched an element opening tag outside one attribute value.

TextNode 

Edit was contained inside a text-like node.

ElementSubtree 

Edit touched one element subtree.

Document 

Edit requires whole-document fallback.

◆ SourceAnchorBias

enum class donner::xml::SourceAnchorBias : std::uint8_t

Controls how an anchor behaves when text is inserted exactly at its offset.

Enumerator
Before 

Anchor remains before inserted text.

After 

Anchor moves after inserted text.
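
The two bias values only matter when an insertion lands exactly on the anchor. The sketch below shows one way the offset update could be written. It is an illustration of the rule described above, not XMLSourceStore's implementation; `AnchorBiasSketch` and `adjustAnchorForInsert` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Local stand-in mirroring the two enumerators documented above.
enum class AnchorBiasSketch : std::uint8_t { Before, After };

// Returns the anchor's new byte offset after `insertedLength` bytes are
// inserted at `insertOffset`.
std::size_t adjustAnchorForInsert(std::size_t anchorOffset, AnchorBiasSketch bias,
                                  std::size_t insertOffset, std::size_t insertedLength) {
  if (insertOffset < anchorOffset) {
    // Text inserted earlier in the source pushes the anchor forward.
    return anchorOffset + insertedLength;
  }
  if (insertOffset > anchorOffset) {
    // Text inserted later in the source leaves the anchor alone.
    return anchorOffset;
  }
  // Insertion exactly at the anchor: the bias breaks the tie.
  return bias == AnchorBiasSketch::After ? anchorOffset + insertedLength : anchorOffset;
}
```

For an anchor at offset 5 and a 3-byte insertion at offset 5, `Before` keeps the anchor at 5 and `After` moves it to 8.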

◆ XMLTokenType

enum class donner::xml::XMLTokenType : std::uint8_t

Token types emitted by the XML tokenizer (Tokenize).

The token stream is gap-free: concatenating every token's source range in order recovers the original input byte-for-byte. No byte is covered by two tokens, and no byte is left uncovered; trailing whitespace after the last element is emitted as TextContent.

Enumerator
TagOpen 

< (element open) or </ (closing tag).

TagName 

Element name, e.g. rect, svg.

TagClose 

> (end of opening/closing tag).

TagSelfClose 

/> (self-closing element).

AttributeName 

Attribute name, e.g. fill, xmlns:xlink.

AttributeValue 

Quoted attribute value including delimiters, e.g. "red".

Comment 

<!-- ... --> (entire comment including delimiters).

CData 

<![CDATA[ ... ]]> (entire CDATA section).

TextContent 

Raw text between tags.

XmlDeclaration 

<?xml ... ?> (entire declaration).

Doctype 

<!DOCTYPE ...> (entire doctype).

EntityRef 

&amp;, &#x20;, etc. (within text content).

ProcessingInstruction 

<?name ...?> (entire PI).

Whitespace 

Whitespace inside a tag (between attributes, around =).

ErrorRecovery 

Emitted for regions the tokenizer cannot parse; error recovery skips to the next < or > and continues.
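
The gap-free property can be checked mechanically. The sketch below assumes each token exposes a half-open byte range; the actual members of XMLToken are not shown in this section, so `TokenSpanSketch` and `isGapFree` are hypothetical stand-ins.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a token's source range, half-open [start, end).
struct TokenSpanSketch {
  std::size_t start;
  std::size_t end;
};

// Returns true if the spans, taken in order, tile `source` exactly:
// each span starts where the previous one ended, and the last ends at source.size().
// When this holds, concatenating the spans reproduces the source byte-for-byte.
bool isGapFree(const std::vector<TokenSpanSketch>& tokens, std::string_view source) {
  std::size_t cursor = 0;
  for (const TokenSpanSketch& token : tokens) {
    if (token.start != cursor) {
      return false;  // gap (start > cursor) or overlap (start < cursor)
    }
    if (token.end < token.start || token.end > source.size()) {
      return false;  // malformed or out-of-bounds span
    }
    cursor = token.end;
  }
  return cursor == source.size();
}
```

For `<rect/>`, the spans [0, 1), [1, 5), [5, 7) (corresponding to TagOpen, TagName, TagSelfClose) pass the check; dropping or shifting any one of them fails it.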

Function Documentation

◆ EscapeAttributeValue()

std::optional< RcString > donner::xml::EscapeAttributeValue ( std::string_view value,
char quoteChar = '"' )

Escape a string for use as an XML attribute value, producing text that round-trips through donner::xml::XMLParser::Parse to recover the original bytes.

The output is suitable for splicing between two delimiter characters of the requested quoteChar: the returned text does not include the surrounding quote chars, only the escaped value. Caller is responsible for emitting the delimiters.

Escape rules:

  • < → &lt;, & → &amp;, > → &gt;
  • " → &quot; when quoteChar is ", otherwise passthrough
  • ' → &apos; when quoteChar is ', otherwise passthrough
  • \t, \n, \r → numeric character references (&#9;, &#10;, &#13;), so the parser's attribute-value whitespace normalization does not collapse them into plain spaces on round-trip.
  • Valid multi-byte UTF-8 sequences pass through unchanged (non-ASCII bytes are not percent-encoded, because XML attribute values carry UTF-8 natively).

Returns std::nullopt for input that cannot be represented in a well-formed XML attribute value at all:

  • The NUL byte (\0).
  • C0 control characters other than \t, \n, \r (i.e. U+0001–U+0008, U+000B, U+000C, U+000E–U+001F); these are forbidden in XML 1.0.
  • Lone surrogates (U+D800–U+DFFF) encoded in UTF-8.
  • The non-characters U+FFFE and U+FFFF.
  • Overlong UTF-8 sequences or truncated multi-byte starts.

This function is total on the input space it accepts — any input that makes it through the reject-list above produces a valid escaped string.

Parameters
value  The raw attribute value bytes.
quoteChar  The quote delimiter the caller will surround the escaped value with. Must be '"' (double quote) or '\'' (single quote); any other value is treated as '"'.
Returns
The escaped value, or std::nullopt if value contains characters that cannot be represented in a well-formed XML attribute value.
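
The escape and reject rules above can be sketched as a small standalone function. This is an independent re-implementation written from the documented rules, not Donner's code: it returns `std::string` instead of `RcString`, and `escapeAttributeValueSketch` and `validUtf8SequenceLength` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Returns the byte length of the multi-byte UTF-8 sequence starting at s[i], or 0 if it is
// truncated, overlong, a surrogate, out of range, or one of U+FFFE / U+FFFF.
std::size_t validUtf8SequenceLength(std::string_view s, std::size_t i) {
  const auto lead = static_cast<unsigned char>(s[i]);
  std::size_t len = 0;
  std::uint32_t cp = 0;
  std::uint32_t minCp = 0;  // smallest code point that legitimately needs `len` bytes
  if ((lead & 0xE0) == 0xC0) {
    len = 2; cp = lead & 0x1F; minCp = 0x80;
  } else if ((lead & 0xF0) == 0xE0) {
    len = 3; cp = lead & 0x0F; minCp = 0x800;
  } else if ((lead & 0xF8) == 0xF0) {
    len = 4; cp = lead & 0x07; minCp = 0x10000;
  } else {
    return 0;  // stray continuation byte or invalid lead byte
  }
  if (i + len > s.size()) return 0;  // truncated sequence
  for (std::size_t k = 1; k < len; ++k) {
    const auto b = static_cast<unsigned char>(s[i + k]);
    if ((b & 0xC0) != 0x80) return 0;
    cp = (cp << 6) | (b & 0x3F);
  }
  if (cp < minCp || cp > 0x10FFFF) return 0;   // overlong or out of range
  if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  // surrogate
  if (cp == 0xFFFE || cp == 0xFFFF) return 0;  // non-character
  return len;
}

std::optional<std::string> escapeAttributeValueSketch(std::string_view value,
                                                      char quoteChar = '"') {
  if (quoteChar != '\'') quoteChar = '"';  // any other value is treated as '"'
  std::string out;
  for (std::size_t i = 0; i < value.size();) {
    const auto c = static_cast<unsigned char>(value[i]);
    if (c >= 0x80) {
      const std::size_t len = validUtf8SequenceLength(value, i);
      if (len == 0) return std::nullopt;
      out.append(value.substr(i, len));  // valid UTF-8 passes through unchanged
      i += len;
      continue;
    }
    switch (c) {
      case '<': out += "&lt;"; break;
      case '&': out += "&amp;"; break;
      case '>': out += "&gt;"; break;
      case '"': out += (quoteChar == '"') ? "&quot;" : "\""; break;
      case '\'': out += (quoteChar == '\'') ? "&apos;" : "'"; break;
      case '\t': out += "&#9;"; break;
      case '\n': out += "&#10;"; break;
      case '\r': out += "&#13;"; break;
      default:
        if (c < 0x20) return std::nullopt;  // NUL and the remaining C0 controls
        out += static_cast<char>(c);
    }
    ++i;
  }
  return out;
}
```

Because the result excludes the delimiters, a caller would emit `quoteChar`, then the escaped text, then `quoteChar` again when serializing the attribute.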

◆ Tokenize()

template<typename TokenSink>
void donner::xml::Tokenize ( std::string_view source,
TokenSink && sink )

Tokenize an XML source string, emitting XMLToken values to sink.

The sink must be callable as sink(XMLToken) — typically a lambda, a functor, or a std::vector<XMLToken>::push_back wrapper.

Template Parameters
TokenSink  Callable with signature void(XMLToken).
Parameters
source  The XML source text.
sink  The token consumer.
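
The sink contract is simply "callable with one token". The sketch below illustrates the lambda and functor forms of a sink without depending on Donner: `TokenSketch`, `TokenTypeSketch`, and `emitDemoTokens` are hypothetical stand-ins (the real XMLToken members are not shown here), and `emitDemoTokens` only mimics the template shape of Tokenize by emitting a fixed token sequence for `<a/>`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in token type; field names and layout are assumptions for this sketch.
enum class TokenTypeSketch : std::uint8_t { TagOpen, TagName, TagSelfClose };

struct TokenSketch {
  TokenTypeSketch type;
  std::size_t start;
  std::size_t end;
};

// Stand-in producer with the same template shape as Tokenize(source, sink).
// It calls sink(token) once per token, here for the fixed input "<a/>".
template <typename TokenSink>
void emitDemoTokens(TokenSink&& sink) {
  sink(TokenSketch{TokenTypeSketch::TagOpen, 0, 1});
  sink(TokenSketch{TokenTypeSketch::TagName, 1, 2});
  sink(TokenSketch{TokenTypeSketch::TagSelfClose, 2, 4});
}

// Functor sink: keeps a running count of the tokens it receives.
struct CountingSink {
  std::size_t count = 0;
  void operator()(const TokenSketch&) { ++count; }
};

// Lambda sink wrapping std::vector::push_back to collect every token.
std::vector<TokenSketch> collectDemoTokens() {
  std::vector<TokenSketch> tokens;
  emitDemoTokens([&tokens](TokenSketch token) { tokens.push_back(token); });
  return tokens;
}

// Passing the functor as an lvalue binds the forwarding reference to it,
// so its state is still visible after the call returns.
std::size_t countDemoTokens() {
  CountingSink sink;
  emitDemoTokens(sink);
  return sink.count;
}
```

Taking the sink by forwarding reference lets callers pass either a temporary lambda or a named, stateful functor without copying it.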