.. -*- mode: rst -*- ======================== Write your own formatter ======================== As well as creating `your own lexer `_, writing a new formatter for Pygments is easy and straightforward. A formatter is a class that is initialized with some keyword arguments (the formatter options) and that must provides a `format()` method. Additionally a formatter should provide a `get_style_defs()` method that returns the style definitions from the style in a form usable for the formatter's output format. Quickstart ========== The most basic formatter shipped with Pygments is the `NullFormatter`. It just sends the value of a token to the output stream: .. sourcecode:: python from pygments.formatter import Formatter class NullFormatter(Formatter): def format(self, tokensource, outfile): for ttype, value in tokensource: outfile.write(value) As you can see, the `format()` method is passed two parameters: `tokensource` and `outfile`. The first is an iterable of ``(token_type, value)`` tuples, the latter a file like object with a `write()` method. Because the formatter is that basic it doesn't overwrite the `get_style_defs()` method. Styles ====== Styles aren't instantiated but their metaclass provides some class functions so that you can access the style definitions easily. Styles are iterable and yield tuples in the form ``(ttype, d)`` where `ttype` is a token and `d` is a dict with the following keys: ``'color'`` Hexadecimal color value (eg: ``'ff0000'`` for red) or `None` if not defined. ``'bold'`` `True` if the value should be bold ``'italic'`` `True` if the value should be italic ``'underline'`` `True` if the value should be underlined ``'bgcolor'`` Hexadecimal color value for the background (eg: ``'eeeeeee'`` for light gray) or `None` if not defined. ``'border'`` Hexadecimal color value for the border (eg: ``'0000aa'`` for a dark blue) or `None` for no border. Additional keys might appear in the future, formatters should ignore all keys they don't support. HTML 3.2 Formatter ================== For an more complex example, let's implement a HTML 3.2 Formatter. We don't use CSS but inline markup (````, ````, etc). Because this isn't good style this formatter isn't in the standard library ;-) .. sourcecode:: python from pygments.formatter import Formatter class OldHtmlFormatter(Formatter): def __init__(self, **options): Formatter.__init__(self, **options) # create a dict of (start, end) tuples that wrap the # value of a token so that we can use it in the format # method later self.styles = {} # we iterate over the `_styles` attribute of a style item # that contains the parsed style values. for token, style in self.style: start = end = '' # a style item is a tuple in the following form: # colors are readily specified in hex: 'RRGGBB' if style['color']: start += '' % style['color'] end = '' + end if style['bold']: start += '' end = '' + end if style['italic']: start += '' end = '' + end if style['underline']: start += '' end = '' + end self.styles[token] = (start, end) def format(self, tokensource, outfile): # lastval is a string we use for caching # because it's possible that an lexer yields a number # of consecutive tokens with the same token type. # to minimize the size of the generated html markup we # try to join the values of same-type tokens here lastval = '' lasttype = None # wrap the whole output with
            outfile.write('
')

            for ttype, value in tokensource:
                # if the token type doesn't exist in the stylemap
                # we try it with the parent of the token type
                # eg: parent of Token.Literal.String.Double is
                # Token.Literal.String
                while ttype not in self.styles:
                    ttype = ttype.parent
                if ttype == lasttype:
                    # the current token type is the same of the last
                    # iteration. cache it
                    lastval += value
                else:
                    # not the same token as last iteration, but we
                    # have some data in the buffer. wrap it with the
                    # defined style and write it to the output file
                    if lastval:
                        stylebegin, styleend = self.styles[lasttype]
                        outfile.write(stylebegin + lastval + styleend)
                    # set lastval/lasttype to current values
                    lastval = value
                    lasttype = ttype

            # if something is left in the buffer, write it to the
            # output file, then close the opened 
 tag
            if lastval:
                stylebegin, styleend = self.styles[lasttype]
                outfile.write(stylebegin + lastval + styleend)
            outfile.write('
\n') The comments should explain it. Again, this formatter doesn't override the `get_style_defs()` method. If we would have used CSS classes instead of inline HTML markup, we would need to generate the CSS first. For that purpose the `get_style_defs()` method exists: Generating Style Definitions ============================ Some formatters like the `LatexFormatter` and the `HtmlFormatter` don't output inline markup but reference either macros or css classes. Because the definitions of those are not part of the output, the `get_style_defs()` method exists. It is passed one parameter (if it's used and how it's used is up to the formatter) and has to return a string or ``None``.