Nerdy internals of an Apple text editor 👨🏻‍🔧

🍿  Fascinating engineering details behind Paper.
  20 min read

In this article, we dive into the details of the way Paper functions as a TextView-based text editor.

The first article was just a warm-up — here is where we get to truly geek out! 🤓

Before we start, I’ll add that for the time being Paper is built on the older TextKit 1 framework, so this article describes TextKit 1. That said, all of the concepts, abstractions, and principles discussed here still exist in TextKit 2 either unchanged or under a better API.

Text view

To get an understanding of how text editing works in native Apple text editors, we need to first discuss the centerpiece of the whole system — the TextView class. Technically, NSTextView and UITextView have their differences, but the API is similar enough that we can treat them as a single TextView class. I will highlight the differences where necessary.

TextView is a massive component that only grows in complexity with each release of respective operating systems. The TextEdit app consists almost entirely of a single TextView. When a single class can be used to build an entire text editor, you know it’s a beast.

Luckily, TextView is not just one huge pile of code. Apple tried to subdivide it into a system of layers, each represented by a flagship class. The layers build on top of each other to create a text editing experience.

A diagram showing the classes that make up the text view. NSTextStorage and NSTextContainer flow into NSLayoutManager which then flows into TextView. Finally, TextView flows into ScrollView. Each next class in the diagram uses the information from the previous one to, in the end, construct a full text editor.


NSTextStorage

  • Stores the raw text string.
  • Stores the attributes (string-value pairs) that are applied to various ranges of text.

    • Visual ones like font or color (defined by the framework).
    • Any string-value pair that acts as metadata for your needs.
  • Emits events about text and attribute changes.


NSTextContainer

  • Defines the shape and dimensions of the area in which text symbols (glyphs) can be placed.
  • Most of the time it’s a rectangle, but can be any shape.


NSLayoutManager

  • Figures out the dimensions and spacings of everything by looking at ranges of attributes applied to the text string in NSTextStorage.

    • Extracts vector glyphs from the font.
    • Converts each text character to one or more glyphs. Some symbols and languages need more than one.
    • Calculates the size of each glyph.
    • Calculates the distances between glyphs.
    • Calculates the distance between lines of glyphs.
  • Lays out each glyph, line by line, into the shape defined by NSTextContainer.

    • Calculates where every line of text starts and ends.
    • Calculates how many lines there are and what is the total height of the text.


TextView

  • Draws the glyph layout generated by NSLayoutManager.
  • Syncs the height of the view with the current height of laid out text.
  • Manages text selection and caret.
  • Manages typing attributes.

    • Attributes that are applied to the newly inserted text.
  • Can define margins (textContainerInset) around the NSTextContainer.
  • Manages all the additional bells and whistles such as dictation, copy-paste, spell check, etc.


ScrollView

  • Shows the visible portion of TextView.
  • Manages scrolling, scroll bars, and zooming.
  • Can have its own margins (contentInset) in addition to textContainerInset defined by the TextView.
  • Implementation details:

    • AppKit

      • NSScrollView contains NSClipView and two instances of NSScroller.
      • NSClipView contains NSTextView.
      • So many separate classes work together to make the scrolling effect.
    • UIKit

      • UITextView extends from UIScrollView.
      • So UITextView holds everything, including the scrolling logic.
      • Another notable detail: moving the caret outside the visible area of UITextView (bounded by contentInset) triggers autoscrolling to bring the caret back into view. The bottom part of contentInset should therefore always be set to the height of the keyboard, so that the caret does not get lost behind it.

A diagram breaking down the interface of the Mac app. Areas of the interface are outlined with different colors to show what classes are responsible for them.

A diagram breaking down the interface of the iPad app. Areas of the interface are outlined with different colors to show what classes are responsible for them.

A diagram breaking down the interface of the iPhone app. Areas of the interface are outlined with different colors to show what classes are responsible for them.


It’s worth having a closer look at NSTextStorage, or rather its parent class NSAttributedString, as it is the foundation of rich text editing in Apple’s frameworks.

NSAttributedString consists of two main parts:

  1. A regular text string.
  2. String-value pairs of attributes attached to ranges within the string.

And this is how you would make an attributed string via the API:

NSMutableAttributedString *string = [NSMutableAttributedString.alloc
  initWithString:@"The quick brown fox jumps over the lazy dog."];

// Paragraph style and default font for the whole string.
NSMutableParagraphStyle *style = NSMutableParagraphStyle.new;
style.firstLineHeadIndent = 30.0;

[string addAttribute:NSParagraphStyleAttributeName
               value:style
               range:NSMakeRange(0, string.length)];
[string addAttribute:NSFontAttributeName
               value:[NSFont systemFontOfSize:25.0]
               range:NSMakeRange(0, string.length)];

// "brown"
[string addAttribute:NSForegroundColorAttributeName
               value:NSColor.brownColor
               range:NSMakeRange(10, 5)];

// "jumps"
[string addAttribute:NSFontAttributeName
               value:[NSFont boldSystemFontOfSize:25.0]
               range:NSMakeRange(20, 5)];

// "over" (color assumed; the value is missing from the source)
[string addAttribute:NSBackgroundColorAttributeName
               value:NSColor.yellowColor
               range:NSMakeRange(26, 4)];

// "lazy" (underline style assumed; the value is missing from the source)
[string addAttribute:NSUnderlineStyleAttributeName
               value:@(NSUnderlineStyleSingle)
               range:NSMakeRange(35, 4)];
[string addAttribute:NSFontAttributeName
               value:[NSFontManager.sharedFontManager
                      convertFont:[NSFont boldSystemFontOfSize:25.0]
                      toHaveTrait:NSItalicFontMask]
               range:NSMakeRange(35, 4)];

NSRange is a structure consisting of a location and a length. So NSMakeRange(10, 5) means a range of 5 characters starting at position 10, or an inclusive range between positions 10 and 14. If different ranges define the same attribute at the same position, the range applied last takes precedence. In the example above, the bold and italic fonts overwrite the default font that is applied to the whole string.
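
To make the range arithmetic concrete, here is a minimal sketch in plain C (which also compiles as Objective-C). The Range struct, range_max, and apply_attribute are hypothetical stand-ins rather than Foundation APIs, and the one-value-per-character array is a deliberately naive model of how a later range overwrites an earlier one.

```c
#include <stddef.h>

/* Hypothetical stand-in for NSRange: a start position and a length. */
typedef struct {
  size_t location;
  size_t length;
} Range;

/* Index of the last position covered by the range (inclusive).
   Only meaningful for a non-empty range. */
size_t range_max(Range r) {
  return r.location + r.length - 1;
}

/* Naive model of a single attribute: one value ID per character, 0 = not set.
   A later call overwrites whatever an earlier call left at the same positions. */
void apply_attribute(int *values, size_t count, Range r, int value) {
  for (size_t i = r.location; i < r.location + r.length && i < count; i++) {
    values[i] = value;
  }
}
```

With value 1 standing for the default font and value 2 for the bold font, applying 1 to the whole 44-character string and then 2 to the range (20, 5) leaves positions 20 through 24 at 2 and everything else at 1.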

The code above can be easily visualized in TextEdit as it is basically an NSTextView with some buttons.

TextEdit app window with the text “The quick brown fox jumps over the lazy dog.” The text is styled with different fonts, colors, and background colors. Every style is labelled with the names of attributes that are applied to it.

A big part of the attributed string is the API that lets you check what attributes are applied to certain ranges, and iterate over all attributes within a particular range. The API itself is quite peculiar. A lot of thought has been put into making it fast and efficient, but as a result the usage can be a bit of a pain.

For example, if you want to check whether a certain attribute exists at a certain position, you would use this method:

id value = [string attribute:NSFontAttributeName
                     atIndex:6
              effectiveRange:nil];

If the value is nil then it does not exist. Otherwise it is the value of the attribute, which in this case is an NSFont object. So this method can be used both to query the value and to check the existence of the attribute.

But it gets better. You can pass a pointer to the NSRange structure as the last argument (the good old C technique to return multiple values from a single function call):

NSRange effectiveRange;
id value = [string attribute:NSFontAttributeName
                     atIndex:6
              effectiveRange:&effectiveRange];

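And it will return either the range over which the attribute has that same value or, when the value is nil, the range over which the attribute is absent: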

Two diagrams with text “The quick brown”. The word “brown” is in brown color. In the first diagram the NSFontAttributeName attribute is sampled at index 6. The result is “nil” and the effective range is between indexes 0 and 9 inclusive. In the second diagram the NSFontAttributeName attribute is sampled at index 11. The result is an NSFont.brownColor object and the effective range is between indexes 10 and 14 inclusive.

But don’t get your hopes too high yet. You see, the effectiveRange is actually not what you think. Quoting the documentation:

The range isn’t necessarily the maximum range covered by the attribute, and its extent is implementation-dependent.

This basically says that it can be the maximum accurate range… but might not be — just perfect. 😫

To get the accurate range you need to use another method.

NSRange effectiveRange;
id value = [string attribute:NSFontAttributeName
                     atIndex:6
       longestEffectiveRange:&effectiveRange
                     inRange:NSMakeRange(0, string.length)];

The same methods exist to query the NSDictionary of all attributes at a position, along with the effectiveRange that this unique combination of attributes spans.

NSRange effectiveRange;
NSDictionary<NSAttributedStringKey, id> *attributes =
  [string attributesAtIndex:6
      longestEffectiveRange:&effectiveRange
                    inRange:NSMakeRange(0, string.length)];

Finally, there is a convenience method to iterate over attributes within a range. It comes with possibly the longest constant name in existence, which tells it that you don’t want it to waste time merging ranges.

[string enumerateAttribute:NSFontAttributeName
                   inRange:NSMakeRange(0, string.length)
                   options:NSAttributedStringEnumerationLongestEffectiveRangeNotRequired
                usingBlock:^(id value, NSRange range, BOOL *stop) {
  // do something
}];

That’s it for now. We’ll get into the actual use-cases of the NSAttributedString API in the following chapters.


Now it’s time to talk about how Paper applies the concepts mentioned above to style the text inside the editor.

As discussed above, styling means applying attributes to ranges of text. Paper has two types of attributes:

  1. Meta attributes — defined by the Markdown parser to identify individual parts of the Markdown syntax.
  2. Styling attributes — the visual attributes applied on top of the parts identified by meta attributes.
The Mac app with the text “The quick brown fox **jumps** ==over== the ~_lazy_~ dog.” in the center. Meta and styling attributes are labelled.

There are a total of three events that cause the attributes to be updated:

  1. Document opened — full update of meta attributes and styling attributes.
  2. Text changed — update of meta attributes and styling attributes for the affected part. Usually just the edited text. Sometimes the whole paragraph. More on that in the next chapter.
  3. Setting changed — full update of styling attributes, but not meta attributes.
A diagram showing the three events and how much they update the meta and styling attributes.

Every attribute update consists of the same steps:

  1. Start text editing transaction
  2. Parse the Markdown structure

    • Skipped for setting change.
  3. Apply layout-affecting attributes
  4. End text editing transaction
  5. Apply decorative attributes
A diagram showing the five steps of the attribute update process on text “**jumps** ==over”. During step 3 the font attribute is applied to “jumps”. During step 5 the light gray color is applied to Markdown tags and the background color to “over”.

A transaction is needed because otherwise every attribute change would trigger a layout recalculation by the NSLayoutManager. We want to batch all the changes and then trigger the layout only once at the end.
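
The effect of batching can be illustrated with a small model in plain C (which also compiles as Objective-C). This is only a sketch: the Storage struct and its functions are hypothetical and merely mimic the idea behind wrapping changes in an editing transaction. They are not how NSTextStorage or NSLayoutManager are implemented.

```c
#include <stdbool.h>

/* Hypothetical model of a text storage that notifies a layout manager. */
typedef struct {
  int editing_depth;  /* > 0 while a transaction is open */
  bool has_pending;   /* a change was recorded during the transaction */
  int layout_passes;  /* how many times layout was recalculated */
} Storage;

void begin_editing(Storage *s) {
  s->editing_depth++;
}

/* Records an attribute change; lays out right away only outside a transaction. */
void change_attribute(Storage *s) {
  if (s->editing_depth > 0) {
    s->has_pending = true;
  } else {
    s->layout_passes++;
  }
}

/* Closing the outermost transaction triggers a single layout pass. */
void end_editing(Storage *s) {
  if (s->editing_depth == 0) return; /* unbalanced call, ignore */
  s->editing_depth--;
  if (s->editing_depth == 0 && s->has_pending) {
    s->has_pending = false;
    s->layout_passes++;
  }
}
```

Three calls to change_attribute outside a transaction cost three layout passes in this model, while the same three calls between begin_editing and end_editing cost one.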

Step two simply looks at the text structure and tries to spot the patterns inside the text that look like Markdown syntax. The parser then applies the invisible meta attributes to denote the various pieces.

The next step is looking at the parsed meta attributes and applying the layout-affecting attributes. These are all the attributes that can influence the position or size of the glyphs in the text view. The most significant of them is NSParagraphStyle. It defines the bulk of the settings that affect the layout of lines and paragraphs.

A diagram breaking down which parts of text and in which way are affected by the NSParagraphStyle attribute in the Mac app.

Finally, the decorative attributes (or rendering attributes in Apple’s terminology) are applied outside the transaction. The reason is simple: they don’t affect the layout, so updating them is not expensive. And they are not even aware of the transaction since they live in the NSLayoutManager itself, not in the NSTextStorage.

The last bit of attributes is the typing attributes. They are tied to the attributes preceding the caret or inside the selection. Once you type a character, the typing attributes are automatically applied to the newly inserted text. In a Markdown editor they are not that important as the styling is completely derived from the Markdown syntax, but they are crucial for rich text editors where the styles “stick” to the caret until you turn them off or move the caret to a new location. However, although Paper is a Markdown editor, it does have an editable Markdown preview called Preview Mode. In this mode the editor behaves just like a rich text editor with “togglable” typing attributes being highlighted, for example, on the toolbar in the iOS app.


The separation of meta, layout, and decorative attributes helps keep certain things fast. For instance, toggling between light and dark mode requires updating only decorative attributes, which is very fast as it does not trigger any layout changes. A setting change like font size does require a relayout of the whole document, but that is still reasonably fast compared to doing the relayout plus a full reparsing of Markdown.

Text change is the most crucial piece to get right. Due to how Markdown works, any text change has the potential to affect the styling of the whole paragraph.

Thus the logical thing to do is to re-parse and re-style the whole paragraph on every keystroke. The problem is that while this is technically the most correct approach, it can slow down your editor for longer paragraphs. At the same time, if you’re just typing out a longer sentence, the Markdown structure does not change. There is really no need to restyle everything all the time and slow down your editing experience.

To make things speedy I’ve built an algorithm that looks at the character being typed, as well as the characters around it, and then decides whether to just apply the typing attributes or to re-parse and re-style the whole paragraph. The gist of the logic is that if the typed character or one of its neighbors is a special Markdown symbol then the paragraph should be reparsed, otherwise there is no need. It’s a simple algorithm that does marvels for the speed of the editor.
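
A rough approximation of that decision, written in plain C (which also compiles as Objective-C), might look like the sketch below. It is not Paper’s actual code: the list of special characters is an assumption, the function names are hypothetical, and only insertion is handled. A deletion would need the same check applied to the removed character.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed set of characters that carry Markdown meaning;
   the real list used by Paper is not given in the article. */
static const char *const kMarkdownSymbols = "*_=~`#>[]()!-+\\";

bool is_markdown_symbol(char c) {
  return c != '\0' && strchr(kMarkdownSymbols, c) != NULL;
}

/* text: the paragraph before the edit; position: insertion index;
   typed: the character being inserted.
   Returns true when the whole paragraph should be re-parsed and re-styled. */
bool should_reparse(const char *text, size_t position, char typed) {
  size_t length = strlen(text);
  char before = position > 0 && position <= length ? text[position - 1] : '\0';
  char after = position < length ? text[position] : '\0';
  return is_markdown_symbol(typed) ||
         is_markdown_symbol(before) ||
         is_markdown_symbol(after);
}
```

Inserting “p” between “m” and “s” in “**jums**” touches no special character, so the sketch reports that typing attributes are enough. Typing right next to an asterisk, or typing an asterisk itself, asks for a full reparse of the paragraph.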

Two diagrams. In the first diagram the letter “p” is inserted to text “**jums** ==over=” between “m” and “s”. The newly inserted letter “p” is restyled as a result. In the second diagram the letter “*” is deleted from “**jumps** ==over”. The whole paragraph is restyled as a result and “jumps” is no longer bold.

The only nasty exception to the above is when you have code blocks in the document. Code blocks are the only multi-paragraph Markdown constructs in Paper. A keystroke has the potential to restyle the whole document.

For now I just decided to ignore code blocks in documents beyond a certain size. It keeps the editor fast for the majority of users who don’t care about code, while still making Paper more useful for a dev-adjacent audience.

The final technique I use to speed things up is to cache every complex object that acts as the value in the key-value attribute pair.

They are re-assigned on every keystroke and never change unless a setting is changed, so it makes sense to reuse them instead of creating new instances every time.

Meta attributes

Besides the visual styling, meta attributes also play a crucial role in various features that need to know the structure of the text.

Formatting shortcuts

  • Toggling styles on a selected piece of Markdown text requires a lot of information about the existing Markdown styles inside the selection.
  • If the selection completely encloses the same style then the style is removed.
  • If the selection partially encloses the same style then the style is extended.
  • If the selection does not have the same style then the new style is just added.
  • You also need to be careful not to mix together styles that cannot be mixed. The conflicting styles need to be removed first before a new style can be added. For example, styles that define the type of the paragraph such as heading and blockquote cannot be mixed.
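
The three toggle rules above can be sketched as a small classifier in plain C (which also compiles as Objective-C). This is one possible reading of those rules, not Paper’s implementation: the Range type, the StyleAction enum, and toggle_action are hypothetical, the styled ranges are assumed to have been collected from the meta attributes beforehand, and conflicting styles are not handled.

```c
#include <stddef.h>

typedef struct {
  size_t location;
  size_t length;
} Range;

typedef enum { StyleAdd, StyleExtend, StyleRemove } StyleAction;

/* selection: the selected range; styled: ranges that already carry the style. */
StyleAction toggle_action(Range selection, const Range *styled, size_t count) {
  size_t sel_end = selection.location + selection.length;
  StyleAction action = StyleAdd;
  for (size_t i = 0; i < count; i++) {
    size_t start = styled[i].location;
    size_t end = start + styled[i].length;
    if (selection.location <= start && end <= sel_end) {
      return StyleRemove;    /* the styled range lies fully inside the selection */
    }
    if (start < sel_end && selection.location < end) {
      action = StyleExtend;  /* partial overlap with the selection */
    }
  }
  return action;
}
```

For “The **quick** brown” with the bold range at (4, 9), selecting the whole string yields StyleRemove, selecting the first 8 characters yields StyleExtend, and selecting only “brown” yields StyleAdd.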

Jumping between chapters

  • Paper has a feature which allows you to jump to the previous or to the next chapter edge.
  • To implement it you need to be able to find the headings relative to the position of the caret.
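
Assuming the heading positions have already been collected from the meta attributes into a sorted array, finding the jump target is a simple search. The sketch below is plain C (which also compiles as Objective-C) with hypothetical names. It uses a linear scan for clarity, although a binary search would work just as well on sorted input.

```c
#include <stddef.h>

#define NOT_FOUND ((size_t)-1)

/* headings: start positions of all headings, sorted in ascending order. */
size_t next_heading(const size_t *headings, size_t count, size_t caret) {
  for (size_t i = 0; i < count; i++) {
    if (headings[i] > caret) return headings[i];
  }
  return NOT_FOUND;
}

size_t previous_heading(const size_t *headings, size_t count, size_t caret) {
  for (size_t i = count; i > 0; i--) {
    if (headings[i - 1] < caret) return headings[i - 1];
  }
  return NOT_FOUND;
}
```

With headings at positions 0, 120, and 450 and the caret at 200, the next heading is at 450 and the previous one at 120. At the very start of the document there is no previous heading, so NOT_FOUND is returned.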


Outline

  • The outline feature relies on being able to traverse every heading.
  • Pressing on the item in the outline jumps the caret there.

Rearranging chapters

  • Paper also has a feature that allows rearranging chapters in the outline.
  • Shuffling text back and forth is so much easier when you know where stuff is.

Converting formats

  • Converting the Markdown content to RTF, HTML, and DOCX all rely on knowing the structure of the text.
  • Since Paper does not rely on any external libraries, having a pre-parsed model of the text allows me to traverse the structure, building the respective output format in the process.

- (NSString *)toHtml:(NSMutableAttributedString *)string {
  // Inline styles first, each meta attribute mapped to its HTML tag pair.
  [self encloseInHtmlTags:string :@{
    MdStrong: @[ @"<strong>", @"</strong>" ]
  }];
  [self encloseInHtmlTags:string :@{
    MdEmphasis: @[ @"<em>", @"</em>" ]
  }];
  [self encloseInHtmlTags:string :@{
    MdUnderline: @[ @"<u>", @"</u>" ]
  }];
  [self encloseInHtmlTags:string :@{
    MdStrikethrough: @[ @"<s>", @"</s>" ]
  }];
  [self encloseInHtmlTags:string :@{
    MdHighlight: @[ @"<mark>", @"</mark>" ]
  }];
  [self encloseInHtmlTags:string :@{
    MdCode: @[ @"<code>", @"</code>" ]
  }];
  // Then the paragraph-level constructs.
  [self encloseInHtmlTags:string :@{
    MdHeading1: @[ @"<h1>", @"</h1>" ],
    MdHeading2: @[ @"<h2>", @"</h2>" ],
    MdHeading3: @[ @"<h3>", @"</h3>" ],
    MdHeading4: @[ @"<h4>", @"</h4>" ],
    MdHeading5: @[ @"<h5>", @"</h5>" ],
    MdHeading6: @[ @"<h6>", @"</h6>" ]
  }];
  [self encloseInHtmlTags:string :@{
    Paragraph: @[ @"<p>", @"</p>" ]
  }];

  [self encloseInBlockquoteHtmlTags:string];
  [self encloseInListHtmlTags:string];
  [self transformFootnotesForHtml:string];
  [self deleteCharactersWithAttributes:string :MetaAttributes.tags];
  [self insertHtmlBreaksOnEmptyLines:string];

  // Return the plain string to match the declared NSString return type.
  return string.string;
}


I am obsessed with symmetry in Paper. Whenever possible I try to keep the text container visually symmetrical. Sometimes this means making it actually asymmetrical and then faking the symmetry with paragraph indentation, as in the case where the heading tags are placed visually to the left of the regular flow of text.

A diagram breaking down the interface of the Mac app. Various gaps and dimensions are labelled.

While there is enough space, the layout keeps the margins visually symmetrical. If there is not enough space, it breaks the symmetry in favor of keeping the max content width, but only until the padding on the right is eaten up. Minimal insets take precedence over keeping the max content width.

You can achieve this gradual collapsing of the insets with a combination of min and max functions. It takes a second or two to get your head around this. But once you do, it feels quite elegant. I love this kind of simple mathy code that leads to beautiful visual results.

- (CGFloat)leftInset {
  // Second argument inferred from the collapsing behavior described above.
  return (self.availableInsetWidth - fmin(
    self.availableInsetWidth - self.totalMinInset,
    self.leftPadding
  )) / 2.0;
}

- (CGFloat)rightInset {
  return self.availableInsetWidth - self.leftInset;
}

- (CGFloat)availableInsetWidth {
  return self.availableWidth - self.textContainerWidth;
}

- (CGFloat)textContainerWidth {
  // First argument inferred: the container never exceeds the max content width.
  return fmin(
    self.maxContentWidth,
    self.availableWidth - self.totalMinInset
  );
}

- (CGFloat)maxContentWidth {
  return self.lineLength * self.characterWidth + self.leftPadding;
}

- (CGFloat)availableWidth {
  return CGRectGetWidth(self.clipView.bounds);
}

- (CGFloat)totalMinInset {
  return self.minInset * 2.0;
}

- (CGFloat)minInset {
  // The rest of this expression did not survive in the source.
  return CGRectGetMinX(self.window.titlebarButtonGroupBoundingRect_) + …;
}

- (CGFloat)leftPadding {
  return [@"### " sizeWithAttributes:@{
    NSFontAttributeName: Font.body
  }].width;
}

Selection anchoring

Text selection always has an anchor point. It’s something we are so used to that we never stop to think about.

On the Mac we click and drag to select text, and instinctively we know that the selection will increase if we drag to the right and decrease if we drag to the left. But only until we hit the point of the click. Then the opposite happens.

On iOS the selection is a bit more interactive. We can drag one edge and then the other one becomes the anchor, and vice versa.

The same logic applies when we extend the selection with the keyboard. Hold the Option key plus a left or a right arrow and you can jump between the edges of words. Do the same while holding the Shift key in addition to the Option key, and you can select in word increments. And again, it remembers where you started.

It even works naturally when you first click and drag and then continue extending or shrinking the selection with the keyboard. The initial point of the click remains the anchor.
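
One way to model this behavior is to store the selection as an anchor plus a moving edge and derive the range from the two. The sketch below is plain C (which also compiles as Objective-C). The Selection and Range types and both functions are hypothetical and say nothing about how TextView stores its selection internally.

```c
#include <stddef.h>

typedef struct {
  size_t anchor; /* where the selection started; it stays put */
  size_t head;   /* the moving edge that follows the pointer or the arrow keys */
} Selection;

typedef struct {
  size_t location;
  size_t length;
} Range;

/* The selected range always spans from the smaller edge to the larger one. */
Range selected_range(Selection s) {
  size_t start = s.anchor < s.head ? s.anchor : s.head;
  size_t end = s.anchor < s.head ? s.head : s.anchor;
  return (Range){start, end - start};
}

/* Dragging or Shift-arrowing only ever moves the head. */
Selection extend_to(Selection s, size_t position) {
  s.head = position;
  return s;
}
```

Starting with a click at position 10, extending to 15 selects (10, 5), pulling back to 12 shrinks it to (10, 2), and crossing the anchor to position 4 flips the selection to (4, 6) while the anchor stays at 10.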

Selection affinity

Another fascinating concept of text editing that you definitely don’t know about is selection affinity. Quoting Apple’s documentation:

Selection affinity determines whether, for example, the insertion point appears after the last character on a line or before the first character on the following line in cases where text wraps across line boundaries.

My guess is you still have no clue what it means, so let’s see it in action.

Pay attention in the screencast below. When I move the caret with arrow keys it just switches lines when moving around the wrapping point denoted by the space character. However, if I move the caret to the end of the line with the shortcut, it attaches itself to the right side of the wrapping space while staying on the same line.

There are also other instances where TextView decides to play this trick. It’s a tiny detail and sort of makes sense when you think about it, but quite hard to actually notice.

Uniform Type Identifiers

The next chapter will focus on cross-app data exchange, but first we need to discuss the system that underpins it — UTIs. It’s a hierarchical system where data types conform to (inherit from) parent data types.

A diagram showing the hierarchical structure of UTIs. At the top is “”, below it “public.text”. Then it splits to “public.plain-text” and “public.rtf”. Below “public.plain-text” is “net.daringfireball.markdown”.

The benefit of the hierarchical system is that, for example, if your app can open text formats, then you don’t need to list all of them. You can just say that it can work with public.text and it will cover all text formats. And indeed, Paper declares that it can open any text file and, although you won’t get any highlighting, you can still open .html, .rtf or any other text format.
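
The conformance check itself boils down to walking up the hierarchy. Here is a minimal sketch in plain C (which also compiles as Objective-C) that covers only the types from the diagram above. The table and the conforms_to function are hypothetical, and the sketch assumes a single parent per type, whereas real UTIs can conform to several parents.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
  const char *type;
  const char *parent;
} Conformance;

/* Only the part of the hierarchy mentioned in the article. */
static const Conformance kConformances[] = {
  {"public.plain-text", "public.text"},
  {"public.rtf", "public.text"},
  {"net.daringfireball.markdown", "public.plain-text"},
};

/* Returns true if type equals ancestor or inherits from it. */
bool conforms_to(const char *type, const char *ancestor) {
  while (type != NULL) {
    if (strcmp(type, ancestor) == 0) return true;
    const char *parent = NULL;
    for (size_t i = 0; i < sizeof kConformances / sizeof kConformances[0]; i++) {
      if (strcmp(kConformances[i].type, type) == 0) {
        parent = kConformances[i].parent;
        break;
      }
    }
    type = parent; /* NULL when the top of the known hierarchy is reached */
  }
  return false;
}
```

An app that declares support for public.text would accept net.daringfireball.markdown in this model, because the walk goes from Markdown to public.plain-text to public.text. The reverse does not hold: public.rtf does not conform to public.plain-text.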

When exchanging data via a programmatic interface such as the clipboard, UTIs can be used directly. Files, however, are a bit trickier. A file is a cross-platform concept, and the de facto identifiers for files in the cross-platform realm are file extensions. Even if Apple were to redo their systems to rely on some file-level UTI metadata field instead of the file extension, other systems would not know anything about it. So in order to stay compatible, every UTI can define one or more file extensions that are associated with it.

Now most of the time you work with either public UTIs or private ones that you’ve created specifically for your app. Things are relatively straightforward in this scenario. The harder case is when you have a format that’s widely accepted, but not defined by Apple. This is exactly the case with Markdown. I will explain some of the annoying edge cases with these semi-public UTIs in the next chapter.


UTIs transition nicely into the topic of cross-app exchange driven primarily by the clipboard or in Apple’s technical terms — the pasteboard.

The pasteboard is nothing more than a dictionary where UTIs are mapped to serialized data — in either textual or binary format. In fact, if you download Additional Tools for Xcode, you can inspect the pasteboard with Clipboard Viewer.

As you can see a single copy action writes multiple representations of the same data at once (for backward-compatibility some apps also write legacy non-UTI identifiers like NeXT Rich Text Format v1.0 pasteboard type). That’s how, for instance, if you copy from Pages and paste it into MarkEdit, you get just the text, but if you paste it into TextEdit you get the whole shebang.

As a general rule, editors pick whatever is the richest format they can handle. Some apps provide ways to force a specific format to be used. For example, a common menu item in the Edit menu of rich text editors is Paste and Match Style or Paste as Plain Text. It tells the app to use the plain text format from the pasteboard. The styles applied to the pasted text are usually the ones surrounding the text selection (the typing attributes).
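
The “pick the richest format you can handle” rule can be sketched in plain C (which also compiles as Objective-C) as a lookup over a list of type/data pairs. The PasteboardItem struct and richest_item are hypothetical and only model the idea. A real app would go through the pasteboard APIs instead.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
  const char *type; /* UTI of the representation */
  const char *data; /* serialized payload */
} PasteboardItem;

/* preferred: the UTIs the receiving app can read, richest first.
   Returns the entry for the highest-ranked type that is present, or NULL. */
const PasteboardItem *richest_item(const PasteboardItem *items,
                                   size_t item_count,
                                   const char *const *preferred,
                                   size_t preferred_count) {
  for (size_t p = 0; p < preferred_count; p++) {
    for (size_t i = 0; i < item_count; i++) {
      if (strcmp(items[i].type, preferred[p]) == 0) return &items[i];
    }
  }
  return NULL;
}
```

Given a pasteboard holding both public.plain-text and public.rtf, a rich text editor that lists public.rtf first gets the RTF payload, while a plain text editor that lists only public.plain-text gets just the text. “Paste and Match Style” amounts to calling the same lookup with the plain text type alone.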

A fun fact is that drag and drop is also powered by the pasteboard, but a different one. The standard one is called the general pasteboard and it’s used for copy-paste. You can even create your own pasteboards for custom cross-app interactions.

Another fun fact is that RTF is basically the serialized form of NSAttributedString. Or vice versa, NSAttributedString is the programmatic interface for RTF.

NSAttributedString *string = [NSAttributedString.alloc initWithString:
  @"The quick brown fox jumps over the lazy dog."];
// Serialize the attributed string as RTF.
NSData *data = [string dataFromRange:NSMakeRange(0, string.length)
                  documentAttributes:@{
  NSDocumentTypeDocumentAttribute: NSRTFTextDocumentType
} error:nil];

// {\rtf1\ansi\ansicpg1252\cocoartf2759…
NSLog(@"%@", [NSString.alloc initWithData:data
                                 encoding:NSUTF8StringEncoding]);

This means that TextView is out-of-the-box compatible with pasteboard, since it works on top of NSTextStorage — the child class of NSAttributedString. No extra coding is required to copy the contents to the pasteboard.

Now, as I mentioned in the last chapter, this is all great for public UTIs. But what about semi-public ones like Markdown? From my experience, the cross-app exchange is a mixed bag… Imagine you want to copy from one Markdown editor and paste into another one. Let’s say both have implemented the standard protocol to export formats with various levels of richness, and to import the richest format given. Copying from the first editor exports Markdown as public.text and the rich text representation as public.rtf. When pasting into the second editor, it will pick public.rtf instead of the native Markdown format, since there is no indication that the text is indeed Markdown. You end up with a weird double conversion that leads to all sorts of small formatting issues, such as extra newlines caused by slight variations in the way Markdown↔RTF translation works in both apps, as well as fundamental styling differences between Markdown and RTF. For the user it is obvious: I copy Markdown from here and paste it there, so it should just copy 1:1. Under the hood, though, there is a lot of needless conversion.

For this to work nicely both apps should magically agree to export the net.daringfireball.markdown UTI as well, and to prefer it over public.rtf. If only one of the apps does it, it won’t make a difference. Paper tried to be a good citizen and export the Markdown UTI, but none of the other apps seem to prefer it over rich text. In addition to that, Pages has a weird behaviour where it does prefer net.daringfireball.markdown over public.rtf, but in doing so it just inserts the raw Markdown string as is without converting it to rich text (why-y-y??? 😫). For these reasons I had to drop the Markdown UTI.

But why export RTF at all? Markdown is all about plain text — drop RTF and problem solved — you might think. Well, that’s true, but I want to provide a seamless copy-paste experience from Paper to rich text editors as well. And, being a good OS citizen, you should provide many formats that represent the copied data, so that the receiving application could pick the richest one it can handle. In Paper you can copy the Markdown text in the editor and paste it to the Mail app, and it would paste as nicely formatted rich text, not as some variant of Markdown. This is a great experience in my opinion. The only problem is that it often leads to less than ideal UX in other cases.

A concept closely related to the pasteboard is sharing on iOS. It’s quite similar to copy-paste, only with a bit more UI on top. Your app exports data in various formats and the receiving app decides what formats it wants to grab. Strangely enough UTIs are not used to identify the data. Rather classes such as NSAttributedString and NSURL are directly used to represent the type. Probably because there is no need for serialization — objects are being passed directly from one app to another.

The apps also have to explicitly declare that they want to be in that list of apps by providing a share extension with a custom UI.

PS — I like sweating the details. If you think I might be useful to you → reach out. 😉