Roll Your Own Glyph Data

  • by Rainer Erich Scheichelbauer
  • Tutorial

Ever wondered why Glyphs knows so much about the glyphs you create? For instance, when you enter the name of a glyph, it sets the Unicode value, and when you choose Glyph > Set Anchors (Cmd-U) or Glyph > Reset Anchors (Cmd-Opt-U), it sets the right diacritical anchors. Where does it store all that information?

Tucked away inside the Glyphs application, there is a file called GlyphData.xml. You are not supposed to touch the one inside the app, but you can make a copy of it in the Application Support folder and keep your personal customizations there. Glyphs will use your glyph info to override the built-in settings.

Make a Copy of GlyphData.xml

Okay, so we have to dig into the application and fish out the XML file. To do that, you first make sure that your Glyphs app is called Glyphs and resides in your Applications folder. Then, copy this line into the clipboard:


Then, in Finder, choose Go > Go to Folder… (Cmd-Shift-G). Paste the line we just copied into the appearing dialog:

Hit Go and the Finder will show you the GlyphData.xml, buried somewhere in the Glyphs application:

While the XML file is selected, choose Edit > Copy GlyphData.xml (Cmd-C). You can close the Finder window now.

Now, we have to navigate to the Application Support folder. The easiest way to get there is to choose Script > Open Scripts Folder (Cmd-Shift-Y) in Glyphs. This will take you to the Scripts folder inside the Application Support folder. All you need to do is make sure you are in the enclosing folder. Next to the Scripts folder, you may see Temp and Plugins folders as well as a CustomFilter.plist file. If you do not see an Info folder there yet, it’s time to create it (Cmd-Shift-N). Inside the Info folder, paste the GlyphData.xml:
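If you would rather script the copy-and-paste step, here is a minimal Python sketch. It assumes you pass in both locations yourself: the path of the GlyphData.xml you fished out of the app, and the folder that encloses your Scripts folder. The function name install_glyph_data is made up for this illustration and is not part of Glyphs.

```python
import shutil
from pathlib import Path


def install_glyph_data(source_xml, app_support_dir):
    """Copy GlyphData.xml into an Info folder inside app_support_dir.

    Both paths come from the caller; this sketch does not guess where
    Glyphs keeps its files.
    """
    info_dir = Path(app_support_dir) / "Info"
    # Create the Info folder if it is not there yet (like Cmd-Shift-N in Finder).
    info_dir.mkdir(parents=True, exist_ok=True)
    target = info_dir / "GlyphData.xml"
    # Copy rather than move, so the file inside the app stays untouched.
    shutil.copy2(Path(source_xml), target)
    return target
```

The function returns the path of the new copy, so you can open it in your editor right away.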

Editing GlyphData.xml

Open the XML file with your favorite XML or plaintext editor. Personally, I recommend TextMate. Many people also like Sublime Text and Atom, and you may also be happy with BBEdit or its free counterpart TextWrangler.

Both XML files, the one buried in the app as well as the one we just copied into the Info folder, contain all relevant glyph info. They complement each other, so you can limit your copy of the XML file to just the letters you need. To do so, remove everything between <glyphData> (should be located around line 20) and </glyphData> (the last line). So you end up with something like this:
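The trimming itself can be sketched in a few lines of Python. This is only an illustration: the helper name empty_glyph_data is hypothetical, and the sample text is a shortened stand-in, not the real file, which carries more header lines before the opening tag.

```python
def empty_glyph_data(xml_text):
    """Remove everything between <glyphData> and </glyphData>.

    Whatever precedes the opening tag is kept as it is.
    """
    open_tag = "<glyphData>"
    close_tag = "</glyphData>"
    # index() and rindex() raise ValueError if a tag is missing.
    start = xml_text.index(open_tag) + len(open_tag)
    end = xml_text.rindex(close_tag)
    if end < start:
        raise ValueError("closing tag appears before opening tag")
    return xml_text[:start] + "\n" + xml_text[end:]


# Shortened, made-up sample input for demonstration purposes.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<glyphData>
<glyph name="A" category="Letter" description="LATIN CAPITAL LETTER A" />
</glyphData>
"""

print(empty_glyph_data(SAMPLE))
```

What remains is the header plus an empty glyphData element, ready to take your own entries.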

Now, an entry for a letter must adhere to a certain form. Let’s take an example. Imagine you want to encode a letter of your favourite script, Tengwar: say, U+E000 TENGWAR LETTER TINCO, and maybe throw in U+E046 TENGWAR SIGN ACUTE as well. We would add these lines to our copy of GlyphData.xml:

<glyph unicode="E000" name="tinco-tengwar" decompose="longCarrier-tengwar, ooreStemless-tengwar" category="Letter" subCategory="Primary" script="tengwar" altNames="tincoTengwar, tengwarTinco" production="uniE000" description="TENGWAR LETTER TINCO" anchors="top, bottom" accents="threeDotsAbove-tengwar, threeDotsBelow-tengwar, twoDotsAbove-tengwar, twoDotsBelow-tengwar, dotAbove-tengwar, dotBelow-tengwar, acute-tengwar, doubleAcute-tengwar, rightCurl-tengwar, doubleRightCurl-tengwar, leftCurl-tengwar, doubleLeftCurl-tengwar, nasalizer-tengwar, doubler-tengwar, tilde-tengwar, breve-tengwar"  />
<glyph unicode="E046" name="acute-tengwar" category="Mark" subCategory="Nonspacing" script="tengwar" altNames="acuteTengwar, tengwarAcute, andaith" production="uniE046" description="TENGWAR SIGN ACUTE" />

Now, it should look like this (I admit I cheated a bit with the indentation here):

Save it, restart Glyphs, take a look in Window > Glyph Info, and search for tengwar to see if Glyphs accepted your addition. And if you did everything right, you will see something like this:

Mission accomplished. Or wait a minute, not quite. It still lacks all the other Tengwar glyphs. But don’t strain yourself, Toshi Omagari has already beaten you to it.

XML Specification

You see, if you want to add a new glyph to the database, you have to add an XML element called glyph. Its basic structure is:

<glyph attribute="value" />

Each glyph element can take various attributes of the structure attribute="value". And every glyph entry needs these three required attributes:

  • name is the name of the glyph. Glyphs recognizes your glyph by its name, so this must be set to a valid glyph name, and it must be unique all throughout your glyph data.
  • description is the Unicode-style descriptive name of your glyph. If you have an encoded glyph, you can find the official name with Unicode Checker or the unofficial name from the ConScript Unicode Registry.
  • category is the category or group of the glyph. Possible values are: Letter (letters like ‘x’ or ‘ä’), Number (figures like ‘3’), Mark (e.g. the acute mark), Punctuation (like the period or the comma), Separator (like the wordspace), Symbol (like the Emoji signs, arrows or the Radioactive symbol).
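The three required attributes are easy to check by script. The following Python sketch uses the standard library’s xml.etree.ElementTree. It is an assumption-laden illustration rather than what Glyphs does internally: the function name check_entries is hypothetical, and the category set only contains the six values listed above, so treat an "unlisted category" message as a hint, not as an error.

```python
import xml.etree.ElementTree as ET

REQUIRED = ("name", "description", "category")
# Only the six categories named in this tutorial; an assumption, not a full list.
CATEGORIES = {"Letter", "Number", "Mark", "Punctuation", "Separator", "Symbol"}


def check_entries(xml_text):
    """Return a list of human-readable problems found in the glyph entries."""
    problems = []
    seen_names = set()
    root = ET.fromstring(xml_text)
    for index, glyph in enumerate(root.iter("glyph"), start=1):
        name = glyph.get("name")
        label = name or f"entry {index}"
        # Every entry needs name, description and category.
        for attribute in REQUIRED:
            if not glyph.get(attribute):
                problems.append(f"{label}: missing {attribute}")
        category = glyph.get("category")
        if category and category not in CATEGORIES:
            problems.append(f"{label}: unlisted category {category!r}")
        # Names must be unique throughout the glyph data.
        if name:
            if name in seen_names:
                problems.append(f"{label}: duplicate name")
            seen_names.add(name)
    return problems


SAMPLE = """<glyphData>
<glyph name="acute-tengwar" category="Mark" description="TENGWAR SIGN ACUTE" />
<glyph name="tinco-tengwar" description="TENGWAR LETTER TINCO" />
<glyph name="acute-tengwar" category="Mark" description="TENGWAR SIGN ACUTE" />
</glyphData>"""

for problem in check_entries(SAMPLE):
    print(problem)
```

In the sample, the second entry lacks its category and the third one repeats a name, so both get reported.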

You can (and where possible, should) make use of these optional attributes:

  • unicode is the hexadecimal Unicode value (code point) of the glyph, e.g. E000. Leave it out if you want to create an unencoded glyph like a ligature.
  • subCategory helps you further define the kind of the glyph. This, of course, depends on the category. This could be Fraction for a Number or Lowercase for a Letter etc. Take a look at Window > Glyph Info for possible subcategories. Roughly put, categories and subcategories are what you see in the sidebar of the Font tab, or as separators in grid view.
  • script defines the writing system the glyph belongs to. It can be left out if the glyph doesn’t belong to any script (e.g. for math symbols). Possible values include latin, arabic, cyrillic, devanagari, ethiopic, greek, han etc. You get the idea.
  • anchors is a comma-separated list of possible diacritical anchors for the glyph. The usual suspects are top, bottom, center, ogonek, topleft, topright, bottomleft, bottomright, left, right. Corresponding mark anchors need preceding underscores, e.g. _top. Stackable combining marks can have both kinds of anchors. Omit this attribute if your glyph cannot be a base for a diacritic or vice versa.
  • accents defines the possible accents the glyph can take. This mainly helps Glyphs draw the mark cloud when you click on an anchor.
  • altNames is a comma-separated list of alternate glyph names that are recognised by the application, so the glyph can be sorted or renamed correctly. E.g., oslash was sometimes called ostroke. When you open a legacy font that uses this weird name, Glyphs can update it to oslash when you run Glyph > Update Glyph Info.
  • production is what the glyph is renamed to at export time. It is usually the legacy Adobe Glyph List (AGL) name. You probably want to make use of this attribute wherever the AGL uses uni followed by the 4-digit Unicode or u and the 5-digit code. E.g. the glyph element for Romanian and Moldovan Tcommaaccent has both a name="Tcommaaccent" and a production="uni021A" attribute.
  • decompose defines the components of a compound glyph. In other words, the parts that make up the glyph. This information is used when you construct such a letter using the Glyph > Make Component Glyph command (Cmd-Opt-Shift-C). Make sure the base letter comes first and all the diacritics follow. This is also useful for ligatures. In that case, you add the names of the glyphs that comprise the ligature, e.g. decompose="f, f, k" for an f_f_k ligature.
  • sortName: by default, Glyphs orders the glyphs alphabetically within their category. If you want to manipulate the display order, add this attribute. For instance, to make sure that AE comes after all the A diacritics instead of between Adieresis and Agrave, there’s a sortName="Az" attribute in it. This is very important for figures, where the sortName can look like Number.dnom.4 etc.
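Two of these attributes follow simple patterns that a script can reproduce. The Python sketch below derives a uni/u style production name from a hexadecimal Unicode value and splits comma-separated attributes such as anchors, accents or decompose. Both helper names are hypothetical, and the naming rule is only the one described above, not a replacement for looking up the AGL.

```python
def production_name(code_point_hex):
    """Build a uniXXXX or uXXXXX name from a hexadecimal Unicode value."""
    value = int(code_point_hex, 16)
    if value <= 0xFFFF:
        # Four hex digits: uni prefix, zero-padded.
        return f"uni{value:04X}"
    # Beyond four digits: u prefix, at least five hex digits.
    return f"u{value:05X}"


def split_list(value):
    """Split a comma-separated attribute value into a clean list of names."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


print(production_name("E000"))
print(split_list("top, bottom"))
print(split_list("f, f, k"))
```

The list helper tolerates a missing attribute, which is handy because anchors, accents and decompose are all optional.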

There’s an App for That!

Since May 2017, there is a smarter and more convenient way to edit your glyph data: it is called EditGlyphData and you will find it in the Tools section of our website. Use it also to merge the data from multiple XML files into one, and, maybe best of all, export the data to a tab-separated text file, so you can edit it in your favourite spreadsheet app. Plus, avoid all XML inconsistency problems (see below). Cool.
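To get a feeling for what a tab-separated export of glyph data could look like, here is a small Python sketch. It is not how EditGlyphData works internally; the function name glyph_data_to_tsv and the choice and order of columns are assumptions made for this illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed column selection; adapt it to the attributes you actually use.
COLUMNS = ["name", "unicode", "category", "subCategory",
           "script", "production", "description"]


def glyph_data_to_tsv(xml_text):
    """Return the glyph entries as tab-separated text, one glyph per line."""
    root = ET.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for glyph in root.iter("glyph"):
        # Attributes that are not set become empty cells.
        writer.writerow([glyph.get(column, "") for column in COLUMNS])
    return buffer.getvalue()
```

The resulting text can be saved to a file and opened in a spreadsheet app. For the round trip back to XML, the app is the more convenient route.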

Potential Pitfalls

Be careful and precise. If you mess up your glyph data, you will run into problems. Here are a few common problems, so you can avoid them right from the start:

First, make sure you always fill out the three required attributes, especially category. Always.

Secondly, it is tempting to create your own naming schemes with this trick. But keep in mind that Glyphs expects certain names for automatically building OpenType features. So, you may have to write your own feature code as well, once you roll your own glyph data.

Thirdly, and this is important: Glyphs will ignore your custom glyph data if your GlyphData.xml contains broken XML. So, make sure you properly validate your XML from time to time. Many tools like TextMate sport a built-in validator. You can, of course, also copy and paste your XML into a web-based validator such as the W3C Markup Validation Service. Or, use the EditGlyphData app (see above).
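A quick check for broken XML also fits into a few lines of Python, using the standard library parser. Note the limits of this sketch: it only tests whether the text is well-formed (tags closed, attributes quoted and so on), it does not validate against a DTD, and the function name well_formedness_error is made up for this example.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if the XML parses, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        # The message includes the line and column of the problem.
        return str(error)
    return None


good = '<glyphData><glyph name="acute-tengwar" /></glyphData>'
bad = '<glyphData><glyph name="acute-tengwar"></glyphData>'

print(well_formedness_error(good))
print(well_formedness_error(bad))
```

The second sample forgets the closing slash of the glyph element, which is exactly the kind of slip that makes Glyphs ignore the whole file.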

That’s all there is to injecting your custom info into Glyphs. If you feel that other people could also profit from your additions, you can put your GlyphData.xml file on GitHub, post a suggestion in the Glyphs Forum, or, if you think your changes should be the default for every Glyphs user, file a feature request in the forum or otherwise get in touch with us.

Update 2013-02-03: added note on how to navigate to GlyphData.xml in 10.6 (thanks @typefacts); two minor text improvements.
Update 2013-02-12: added ‘Potential pitfalls’.
Update 2013-03-25: corrected XML file name typo (thanks George Thomas).
Update 2014-12-11: Updated to new notation for dotless glyphs, changed the sortName example to AE.
Update 2015-07-08: Partial rewrite. Updated to Glyphs 2, new and updated links. Removed outdated passages. Added Tengwar and links (thanks @Tosche_E).
Update 2017-05-30: Added reference to EditGlyphData app, removed deprecated bugreport link.