Gumbo
0.9.2
A C library for parsing HTML.
#include <gumbo.h>
Data Fields
GumboAllocatorFunction | allocator
GumboDeallocatorFunction | deallocator
void * | userdata
int | tab_stop
bool | stop_on_first_error
int | max_errors
GumboTag | fragment_context
GumboNamespaceEnum | fragment_namespace
Input struct containing configuration options for the parser. These options let you supply an alternate memory manager, change how errors are handled, and so on. Start from kGumboDefaultOptions for sensible defaults and override only the fields you need.
GumboAllocatorFunction GumboOptions::allocator
A memory allocator function. Default: malloc.
GumboDeallocatorFunction GumboOptions::deallocator
A memory deallocator function. Default: free.
void* GumboOptions::userdata
An opaque object that's passed in as the first argument to all callbacks used by this library. Default: NULL.
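As a rough illustration of how allocator, deallocator, and userdata fit together, the sketch below defines a pair of counting callbacks. It assumes the callback shapes are `void* (void* userdata, size_t size)` and `void (void* userdata, void* ptr)`, which matches the "userdata as first argument" description above but should be checked against the typedefs in your copy of gumbo.h. `AllocStats`, `counting_alloc`, and `counting_free` are hypothetical names, not part of the library.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping struct; not part of Gumbo. */
typedef struct {
  size_t allocations;      /* number of allocation requests seen */
  size_t frees;            /* number of non-NULL pointers released */
  size_t bytes_requested;  /* total bytes asked for */
} AllocStats;

/* Assumed callback shape: userdata first, then the requested size. */
static void* counting_alloc(void* userdata, size_t size) {
  AllocStats* stats = (AllocStats*) userdata;
  stats->allocations++;
  stats->bytes_requested += size;
  return malloc(size);
}

/* Assumed callback shape: userdata first, then the pointer to release. */
static void counting_free(void* userdata, void* ptr) {
  AllocStats* stats = (AllocStats*) userdata;
  if (ptr != NULL) {
    stats->frees++;
  }
  free(ptr);
}
```

Under those assumptions, client code would copy kGumboDefaultOptions into a local GumboOptions, set `allocator = counting_alloc`, `deallocator = counting_free`, and `userdata = &stats`, and pass the same options to both the parse and the destroy call so that memory is released by the matching deallocator.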
int GumboOptions::tab_stop
The tab-stop size, for computing positions in source code that uses tabs. Default: 8.
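To make the effect of tab_stop concrete, here is a minimal sketch of tab-aware column counting. It is illustrative only: it uses 0-based columns and treats every non-tab byte as one column, whereas the library's own convention (for example 1-based columns, or handling of multi-byte UTF-8) may differ. `advance_column` and `column_after` are hypothetical helpers, not Gumbo functions.

```c
#include <stddef.h>

/* Illustrative only: advance a 0-based column past one byte of input.
 * A tab jumps to the next multiple of tab_stop; any other byte moves
 * the column forward by one. */
static int advance_column(int column, char c, int tab_stop) {
  if (c == '\t') {
    return ((column / tab_stop) + 1) * tab_stop;
  }
  return column + 1;
}

/* Column reached after consuming the first len bytes of a single line. */
static int column_after(const char* line, size_t len, int tab_stop) {
  int column = 0;
  for (size_t i = 0; i < len; ++i) {
    column = advance_column(column, line[i], tab_stop);
  }
  return column;
}
```

With the default tab stop of 8, the three bytes `a`, `b`, tab end at column 8 in this scheme; with a tab stop of 4 they end at column 4. Setting tab_stop to match the editor settings the source was written with keeps reported positions aligned with what a reader sees.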
bool GumboOptions::stop_on_first_error
Whether or not to stop parsing when the first error is encountered. Default: false.
int GumboOptions::max_errors
The maximum number of errors before the parser stops recording them. This cap exists so that a badly malformed page cannot fill the errors vector with redundant entries and exhaust memory. Parsing itself continues after the cap is reached; only error recording stops. Set to -1 to disable the limit. Default: -1.
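The cap described above can be pictured as a simple guard evaluated before each error is stored. The sketch below is not Gumbo's internal code; `can_record_error` is a hypothetical helper, and treating every negative value (not just -1) as "no limit" is an assumption made for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not a Gumbo function: decide whether one more
 * error may be recorded, given how many are already stored. */
static bool can_record_error(int max_errors, size_t recorded) {
  if (max_errors < 0) {
    return true;  /* assumption: any negative value means no limit */
  }
  return recorded < (size_t) max_errors;
}
```

In this model a max_errors of 0 records nothing, and a max_errors of 3 stores the first three errors and silently drops the rest.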
GumboTag GumboOptions::fragment_context
The fragment context for parsing: https://html.spec.whatwg.org/multipage/syntax.html#parsing-html-fragments
If GUMBO_TAG_LAST is passed here, it means "no fragment": the input is parsed with the regular, full-document algorithm. Otherwise, pass the tag enum for the intended parent element of the parsed fragment. Only the tag enum is used, rather than a full node, because the tag is enough to establish the parsing context, and it lets client code parse a fragment even when a full HTML tree isn't available.
Default: GUMBO_TAG_LAST.
GumboNamespaceEnum GumboOptions::fragment_namespace
The namespace for the fragment context. This lets client code differentiate between, say, parsing a <title> tag in SVG vs. parsing it in HTML. Default: GUMBO_NAMESPACE_HTML.