# Chroma — A general purpose syntax highlighter in pure Go
[GoDoc](https://godoc.org/github.com/alecthomas/chroma) [CI](https://github.com/alecthomas/chroma/actions/workflows/ci.yml) [Slack](https://invite.slack.golangbridge.org/)

> **NOTE:** As Chroma has just been released, its API is still in flux. That said, the high-level interface should not change significantly.

Chroma takes source code and other structured text and converts it into syntax
highlighted HTML, ANSI-coloured text, etc.

Chroma is based heavily on [Pygments](http://pygments.org/), and includes
translators for Pygments lexers and styles.

<a id="markdown-table-of-contents" name="table-of-contents"></a>
## Table of Contents

<!-- TOC -->

1. [Table of Contents](#table-of-contents)
2. [Supported languages](#supported-languages)
3. [Try it](#try-it)
4. [Using the library](#using-the-library)
    1. [Quick start](#quick-start)
    2. [Identifying the language](#identifying-the-language)
    3. [Formatting the output](#formatting-the-output)
    4. [The HTML formatter](#the-html-formatter)
5. [More detail](#more-detail)
    1. [Lexers](#lexers)
    2. [Formatters](#formatters)
    3. [Styles](#styles)
6. [Command-line interface](#command-line-interface)
7. [What's missing compared to Pygments?](#whats-missing-compared-to-pygments)

<!-- /TOC -->

<a id="markdown-supported-languages" name="supported-languages"></a>
## Supported languages

Prefix | Language
:----: | --------
A | ABAP, ABNF, ActionScript, ActionScript 3, Ada, Angular2, ANTLR, ApacheConf, APL, AppleScript, Arduino, Awk
B | Ballerina, Base Makefile, Bash, Batchfile, BibTeX, Bicep, BlitzBasic, BNF, Brainfuck
C | C, C#, C++, Caddyfile, Caddyfile Directives, Cap'n Proto, Cassandra CQL, Ceylon, CFEngine3, cfstatement, ChaiScript, Chapel, Cheetah, Clojure, CMake, COBOL, CoffeeScript, Common Lisp, Coq, Crystal, CSS, Cython
D | D, Dart, Diff, Django/Jinja, Docker, DTD, Dylan
E | EBNF, Elixir, Elm, EmacsLisp, Erlang
F | Factor, Fish, Forth, Fortran, FSharp
G | GAS, GDScript, Genshi, Genshi HTML, Genshi Text, Gherkin, GLSL, Gnuplot, Go, Go HTML Template, Go Text Template, GraphQL, Groff, Groovy
H | Handlebars, Haskell, Haxe, HCL, Hexdump, HLB, HTML, HTTP, Hy
I | Idris, Igor, INI, Io
J | J, Java, JavaScript, JSON, Julia, Jungle
K | Kotlin
L | Lighttpd configuration file, LLVM, Lua
M | Mako, markdown, Mason, Mathematica, Matlab, MiniZinc, MLIR, Modula-2, MonkeyC, MorrowindScript, Myghty, MySQL
N | NASM, Newspeak, Nginx configuration file, Nim, Nix
O | Objective-C, OCaml, Octave, OnesEnterprise, OpenEdge ABL, OpenSCAD, Org Mode
P | PacmanConf, Perl, PHP, PHTML, Pig, PkgConfig, PL/pgSQL, plaintext, Pony, PostgreSQL SQL dialect, PostScript, POVRay, PowerShell, Prolog, PromQL, Properties, Protocol Buffer, Puppet, Python 2, Python
Q | QBasic
R | R, Racket, Ragel, Raku, react, ReasonML, reg, reStructuredText, Rexx, Ruby, Rust
S | SAS, Sass, Scala, Scheme, Scilab, SCSS, Smalltalk, Smarty, Snobol, Solidity, SPARQL, SQL, SquidConf, Standard ML, Stylus, Svelte, Swift, SYSTEMD, systemverilog
T | TableGen, TASM, Tcl, Tcsh, Termcap, Terminfo, Terraform, TeX, Thrift, TOML, TradingView, Transact-SQL, Turing, Turtle, Twig, TypeScript, TypoScript, TypoScriptCssData, TypoScriptHtmlData
V | VB.net, verilog, VHDL, VimL, vue
W | WDTE
X | XML, Xorg
Y | YAML, YANG
Z | Zig


_I will attempt to keep this section up to date, but an authoritative list can be
displayed with `chroma --list`._

<a id="markdown-try-it" name="try-it"></a>
## Try it

Try out various languages and styles on the [Chroma Playground](https://swapoff.org/chroma/playground/).

<a id="markdown-using-the-library" name="using-the-library"></a>
## Using the library

Chroma, like Pygments, has the concepts of
[lexers](https://github.com/alecthomas/chroma/tree/master/lexers),
[formatters](https://github.com/alecthomas/chroma/tree/master/formatters) and
[styles](https://github.com/alecthomas/chroma/tree/master/styles).

Lexers convert source text into a stream of tokens, styles specify how token
types are mapped to colours, and formatters convert tokens and styles into
formatted output.

A package exists for each of these, containing a global `Registry` variable
with all of the registered implementations. There are also helper functions
for using the registry in each package, such as looking up lexers by name or
matching filenames.

In all cases, if a lexer, formatter or style cannot be determined, `nil` will
be returned. In this situation you may want to default to the `Fallback`
value in each respective package, which provides sane defaults.

<a id="markdown-quick-start" name="quick-start"></a>
### Quick start

A convenience function exists that formats source text in a single call:

```go
err := quick.Highlight(os.Stdout, someSourceCode, "go", "html", "monokai")
```

<a id="markdown-identifying-the-language" name="identifying-the-language"></a>
### Identifying the language

To highlight code, you'll first have to identify what language the code is
written in. There are three primary ways to do that:

1. Detect the language from its filename.

    ```go
    lexer := lexers.Match("foo.go")
    ```

2. Explicitly specify the language by its Chroma syntax ID (a full list is available from `lexers.Names()`).

    ```go
    lexer := lexers.Get("go")
    ```

3. Detect the language from its content.

    ```go
    lexer := lexers.Analyse("package main\n\nfunc main()\n{\n}\n")
    ```

In all cases, `nil` will be returned if the language cannot be identified.

```go
if lexer == nil {
	lexer = lexers.Fallback
}
```

Note that some lexers can be extremely chatty, emitting many small tokens. To
mitigate this, you can use the coalescing lexer to coalesce runs of identical
token types into a single token:

```go
lexer = chroma.Coalesce(lexer)
```

<a id="markdown-formatting-the-output" name="formatting-the-output"></a>
### Formatting the output

Once a language is identified you will need to pick a formatter and a style (theme).

```go
style := styles.Get("swapoff")
if style == nil {
	style = styles.Fallback
}
formatter := formatters.Get("html")
if formatter == nil {
	formatter = formatters.Fallback
}
```

Then obtain an iterator over the tokens:

```go
contents, err := ioutil.ReadAll(r)
iterator, err := lexer.Tokenise(nil, string(contents))
```

And finally, format the tokens from the iterator:

```go
err := formatter.Format(w, style, iterator)
```

<a id="markdown-the-html-formatter" name="the-html-formatter"></a>
### The HTML formatter

By default the `html` registered formatter generates standalone HTML with
embedded CSS. More flexibility is available through the `formatters/html` package.

The output generated by the formatter can be customised with the
following constructor options:

- `Standalone()` - Generate standalone HTML with embedded CSS.
- `WithClasses()` - Use classes rather than inlined style attributes.
- `ClassPrefix(prefix)` - Prefix each generated CSS class.
- `TabWidth(width)` - Set the rendered tab width, in characters.
- `WithLineNumbers()` - Render line numbers (style with `LineNumbers`).
- `LinkableLineNumbers()` - Make each line number a link to itself.
- `HighlightLines(ranges)` - Highlight lines in these ranges (style with `LineHighlight`).
- `LineNumbersInTable()` - Use a table for formatting line numbers and code, rather than spans.

If `WithClasses()` is used, the corresponding CSS can be obtained from the formatter with:

```go
formatter := html.New(html.WithClasses(true))
err := formatter.WriteCSS(w, style)
```

<a id="markdown-more-detail" name="more-detail"></a>
## More detail

<a id="markdown-lexers" name="lexers"></a>
### Lexers

See the [Pygments documentation](http://pygments.org/docs/lexerdevelopment/)
for details on implementing lexers. Most concepts apply directly to Chroma,
but see existing lexer implementations for real examples.

In many cases lexers can be automatically converted directly from Pygments by
using the included Python 3 script `pygments2chroma.py`. I use something like
the following:

```sh
python3 _tools/pygments2chroma.py \
  pygments.lexers.jvm.KotlinLexer \
  > lexers/k/kotlin.go \
  && gofmt -s -w lexers/k/kotlin.go
```

See notes in [pygments-lexers.txt](https://github.com/alecthomas/chroma/blob/master/pygments-lexers.txt)
for a list of lexers, and notes on some of the issues importing them.

<a id="markdown-formatters" name="formatters"></a>
### Formatters

Chroma supports HTML output, as well as terminal output in 8-colour, 256-colour, and true-colour.

A `noop` formatter is included that outputs the token text only, and a `tokens`
formatter outputs raw tokens. The latter is useful for debugging lexers.

<a id="markdown-styles" name="styles"></a>
### Styles

Chroma styles use the [same syntax](http://pygments.org/docs/styles/) as Pygments.

All Pygments styles have been converted to Chroma using the `_tools/style.py` script.

When you work with one of [Chroma's styles](https://github.com/alecthomas/chroma/tree/master/styles), know that the `chroma.Background` token type provides the default style for tokens. It does so by defining a foreground color and background color.

For example, this gives each token name not defined in the style a default color of `#f8f8f2` and uses `#000000` for the highlighted code block's background:

~~~go
chroma.Background: "#f8f8f2 bg:#000000",
~~~

Also, token types in a style file are hierarchical. For instance, when `CommentSpecial` is not defined, Chroma uses the token style from `Comment`. So when several comment tokens use the same color, you'll only need to define `Comment` and override the one that has a different color.

For a quick overview of the available styles and how they look, check out the [Chroma Style Gallery](https://xyproto.github.io/splash/docs/).

<a id="markdown-command-line-interface" name="command-line-interface"></a>
## Command-line interface

A command-line interface to Chroma is included.

Binaries are available to install from [the releases page](https://github.com/alecthomas/chroma/releases).

The CLI can be used as a preprocessor to colorise the output of `less(1)`;
see the documentation for the `LESSOPEN` environment variable.

The `--fail` flag can be used to suppress output and return with exit status
1 to facilitate falling back to some other preprocessor in case chroma
does not resolve a specific lexer to use for the given file. For example:

```shell
export LESSOPEN='| p() { chroma --fail "$1" || cat "$1"; }; p "%s"'
```

Replace `cat` with your favourite fallback preprocessor.

When invoked as `.lessfilter`, the `--fail` flag is automatically turned
on under the hood for easy integration with [lesspipe shipping with
Debian and derivatives](https://manpages.debian.org/lesspipe#USER_DEFINED_FILTERS);
for that setup the `chroma` executable can simply be symlinked to `~/.lessfilter`.

<a id="markdown-whats-missing-compared-to-pygments" name="whats-missing-compared-to-pygments"></a>
## What's missing compared to Pygments?

- Quite a few lexers, for various reasons (pull-requests welcome):
    - Pygments lexers for complex languages often include custom code to
      handle certain aspects, such as Raku's ability to nest code inside
      regular expressions. These require time and effort to convert.
    - I mostly only converted languages I had heard of, to reduce the porting cost.
- Some more esoteric features of Pygments are omitted for simplicity.
- Though the Chroma API supports content detection, very few languages support it.
  I have plans to implement a statistical analyser at some point, but not enough time.