NSRegularExpression capture groups: Chord Symbols

Swift Language


How to use NSRegularExpression with capture groups in Swift.


I want to parse chord symbols using a regular expression. The first step is to define a valid regular expression and the second is to determine how to use NSRegularExpression to retrieve matches from input.

The Regular Expression

Of course, we need to define a valid regular expression to use NSRegularExpression.

I want to match chord symbols using Weber’s roman numeral notation for chord roots.

Here are a few examples:

The tonic chord in a scale: I
The dominant seventh chord in a scale: V7
A Neapolitan seventh: biimaj
Typical jazz chord: Vm7b5

There are a few things I need from each chord specification:
What is the root? Is the root altered? What is the symbol if it exists?
So let’s break that up into a few groups.

The root is going to be a roman numeral. I don’t need all possible numerals, just the ones used in music.
The character I can occur consecutively at most 3 times. In some numbers it doesn’t appear.
Let’s make I, or II, or III match: I{0,3}

I can be preceded by V.
Let’s make any of I, II, III, V, VI, VII, VIII match by adding V zero or one time (using ?): V?I{0,3}

That’s most of them! We’re missing IV, so let’s add that using an “or” (|): IV|V?I{0,3}

So there’s a regexp to match roman numerals I to VIII – which are the only ones we need.
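
The three steps above can be checked with a short sketch. The helper matchesWhole is hypothetical (it is not part of the original post); it wraps each pattern in ^(?:...)$ so that only whole-string matches count.

```swift
import Foundation

// Hypothetical helper: true when `pattern` matches the whole of `input`.
// The ^(?:...)$ wrapper is added here only so partial matches don't count.
func matchesWhole(_ input: String, _ pattern: String) -> Bool {
    return input.range(of: "^(?:\(pattern))$", options: .regularExpression) != nil
}

let numerals = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII"]

// Step 1: up to three consecutive I characters.
let step1 = numerals.filter { matchesWhole($0, "I{0,3}") }
// Step 2: an optional leading V.
let step2 = numerals.filter { matchesWhole($0, "V?I{0,3}") }
// Step 3: IV added as an alternative.
let step3 = numerals.filter { matchesWhole($0, "IV|V?I{0,3}") }

print(step1) // ["I", "II", "III"]
print(step2) // ["I", "II", "III", "V", "VI", "VII", "VIII"]
print(step3 == numerals) // true
```

One thing to keep in mind: every piece of V?I{0,3} is optional, so the pattern also matches the empty string.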

Capture Groups

I want to be able to alter any of those roots with a flat or sharp. So, bII or #IV would be grokked. So begin the regexp with an optional b|#. But remember what we’re going to do with the matches: get the root, then alter it (and then get the chord symbol to create a chord). So there are two separate bits of info to retrieve to get the entire root: the degree specified by the roman numeral and any alteration.

A regular expression capture group allows you to get parts of a match. Groups are delineated by parentheses (). For example, the root alteration (optional) group would be (b|#)?. Even if the input does not contain an alteration, the parser will still refer to the first group as 1 (and it would be empty if there is no alteration).

Here are two groups. Note that the alteration and the root are both contained in parentheses. The alteration is optional, but the root is not.
let regexp = "(b|#)?(IV|V?I{0,3})"
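
Here is a minimal sketch of how the two groups are numbered when the pattern is run through NSRegularExpression. The helper groupRanges is a hypothetical name used only for illustration. Note how NSRegularExpression reports a group that did not take part in the match: its range has the location NSNotFound.

```swift
import Foundation

let twoGroups = "(b|#)?(IV|V?I{0,3})"
// try! is tolerable in a sketch because the pattern is a known-good literal.
let twoGroupRegex = try! NSRegularExpression(pattern: twoGroups)

// Hypothetical helper: the ranges of group 1 (alteration) and group 2 (root).
func groupRanges(_ input: String) -> (alteration: NSRange, root: NSRange)? {
    let whole = NSRange(input.startIndex..., in: input)
    guard let m = twoGroupRegex.firstMatch(in: input, options: [], range: whole) else {
        return nil
    }
    return (m.range(at: 1), m.range(at: 2))
}

if let r = groupRanges("bII") {
    print(r.alteration.location, r.alteration.length) // 0 1
    print(r.root.location, r.root.length)             // 1 2
}
if let r = groupRanges("IV") {
    print(r.alteration.location == NSNotFound)        // true
    print(r.root.location, r.root.length)             // 0 2
}
```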

That leaves the chord symbol. Yup, one more group to add. You can go nuts and try to specify actual chord symbols, or simply say “any alphanumeric character or sharp zero or more times”.
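
As a sketch, the full pattern with the third group might look like the one below. The character class [A-Za-z0-9#] is an assumption reconstructed from the description "any alphanumeric character or sharp"; the sample inputs are also just illustrations.

```swift
import Foundation

// Assumed third group: letters, digits or #, zero or more times.
let chordPattern = "(b|#)?(IV|V?I{0,3})([A-Za-z0-9#]*)"
let chordRegex = try! NSRegularExpression(pattern: chordPattern)

print(chordRegex.numberOfCaptureGroups) // 3

// Each sample should be consumed entirely by the first match.
let samples = ["I", "V7", "bIImaj", "Vm7b5", "#IVdim7"]
let allWhole = samples.allSatisfy { s in
    let whole = NSRange(s.startIndex..., in: s)
    return chordRegex.firstMatch(in: s, options: [], range: whole)?.range == whole
}
print(allWhole) // true
```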

So there it is. You can check it out with an online regular expression tester like the one at regex101.com. Back in the 80s when I was first learning regexps, I had to use pencil and paper – and I also had to walk to school 5 miles in the snow, uphill, both ways.


The init function for NSRegularExpression throws an error if the pattern you specify is invalid. I use fatalError to handle this, and then go and fix the pattern, probably with the help of a validator. The init allows you to specify some options. Here I specify that matches are case insensitive. My pattern already contains [A-Za-z], so chord symbols match in either case anyway; the option matters for the roman numerals and the accidental. It’s up to you: do you want to bother with using II for major supertonic and ii for minor supertonic? I’m just going with symbols: iimaj or iimin (or IImaj, IImin).
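
A minimal sketch of the init step, assuming the three-group pattern described earlier. The function name makeChordRegex is hypothetical; the do/catch with fatalError follows the approach described above.

```swift
import Foundation

// Hypothetical factory: builds the regex or stops the program with a message.
func makeChordRegex(pattern: String) -> NSRegularExpression {
    do {
        return try NSRegularExpression(pattern: pattern, options: [.caseInsensitive])
    } catch {
        // An invalid pattern is a programmer error, so fail loudly.
        fatalError("Invalid regular expression \(pattern): \(error.localizedDescription)")
    }
}

let caseInsensitiveRegex = makeChordRegex(pattern: "(b|#)?(IV|V?I{0,3})([A-Za-z0-9#]*)")
print(caseInsensitiveRegex.options.contains(.caseInsensitive)) // true

// An unbalanced parenthesis makes the init throw; try? turns that into nil.
let broken = try? NSRegularExpression(pattern: "(b|#")
print(broken == nil) // true
```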

Retrieve an array of matches from an input string using NSRegularExpression’s matches(in:options:range:). Here I’m specifying that the entire input string be searched.
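
A sketch of that call, assuming the three-group pattern and a sample input of my choosing. The range is expressed in UTF-16 units because NSRange counts the way NSString does.

```swift
import Foundation

let chordInput = "bIImaj7"
let matchRegex = try! NSRegularExpression(pattern: "(b|#)?(IV|V?I{0,3})([A-Za-z0-9#]*)",
                                          options: [.caseInsensitive])

// Search the whole input: location 0, length in UTF-16 code units.
let fullRange = NSRange(location: 0, length: chordInput.utf16.count)
let chordMatches = matchRegex.matches(in: chordInput, options: [], range: fullRange)

// The pattern can match the empty string, so the array may hold more
// than one element; the first one is the interesting one here.
print(chordMatches.count)
if let first = chordMatches.first {
    print(first.range.location, first.range.length) // 0 7
}
```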

In most cases, the first element in the matches array will contain what you’re looking for. This is another place where using an online regexp checker is very helpful; you can see the matches.

Using the first match, I access the capture groups using match.range(at: X) where X is the number of the capture group (starting from 1, not zero!). The match does have a range property, but that range covers the entire match (the same as range(at: 0)). Useful sometimes, but not here.
In my chord regexp, the accidental is capture group 1, the roman root is group 2, and the symbol is group 3. So for each, retrieve the appropriate range and then check that it’s valid by comparing its location to NSNotFound. This is an NSRange instance. To extract a Swift String though, we are going to need a Swift Range<String.Index>, so we create one from the NSRange and the input string. Then you can use that range to extract the matching substring from the input string. In Swift 4, string[range] returns a Substring instance and not a String. So to get a String, you have to pass the Substring to a String initializer: String(string[range]).
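
The NSRange to Range to Substring to String chain can be sketched like this. The helper captureGroup is a hypothetical name; it returns nil for a group that did not take part in the match.

```swift
import Foundation

// Hypothetical helper: the text of one capture group, or nil if that
// group did not participate. `index` must be less than match.numberOfRanges.
func captureGroup(_ index: Int, of match: NSTextCheckingResult, in input: String) -> String? {
    let nsRange = match.range(at: index)
    guard nsRange.location != NSNotFound,
          let range = Range(nsRange, in: input) else {
        return nil
    }
    // input[range] is a Substring; String(...) copies it into a String.
    return String(input[range])
}

let sample = "#IVdim"
let sampleRegex = try! NSRegularExpression(pattern: "(b|#)?(IV|V?I{0,3})([A-Za-z0-9#]*)",
                                           options: [.caseInsensitive])
let sampleMatch = sampleRegex.firstMatch(in: sample, options: [],
                                         range: NSRange(sample.startIndex..., in: sample))

if let m = sampleMatch {
    print(captureGroup(1, of: m, in: sample) ?? "none") // #
    print(captureGroup(2, of: m, in: sample) ?? "none") // IV
    print(captureGroup(3, of: m, in: sample) ?? "none") // dim
}
```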

Here’s my final func for parsing the chord symbol. I return the values of the three capture groups as a tuple for convenience. Each value is initialized to an empty string, so the caller needs to check them.
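
The original listing did not survive here, so what follows is a reconstruction sketch under stated assumptions rather than the author's exact code: the name parseChord, the tuple labels, and the third group's character class are all my guesses based on the description above.

```swift
import Foundation

// Sketch: returns the three capture groups as a tuple of Strings.
// Every value starts out empty, so callers must check for "".
func parseChord(_ input: String) -> (accidental: String, root: String, symbol: String) {
    var result = (accidental: "", root: "", symbol: "")
    let pattern = "(b|#)?(IV|V?I{0,3})([A-Za-z0-9#]*)" // assumed pattern

    let regex: NSRegularExpression
    do {
        regex = try NSRegularExpression(pattern: pattern, options: [.caseInsensitive])
    } catch {
        fatalError("Invalid pattern: \(error.localizedDescription)")
    }

    let whole = NSRange(input.startIndex..., in: input)
    guard let match = regex.matches(in: input, options: [], range: whole).first else {
        return result
    }

    // Text of capture group i, or "" when the group is missing.
    func text(_ i: Int) -> String {
        let r = match.range(at: i)
        guard r.location != NSNotFound, let range = Range(r, in: input) else { return "" }
        return String(input[range])
    }

    result.accidental = text(1)
    result.root = text(2)
    result.symbol = text(3)
    return result
}

let jazz = parseChord("Vm7b5")
print(jazz.root, jazz.symbol)                   // V m7b5
let neapolitan = parseChord("biimaj")
print(neapolitan.accidental, neapolitan.root, neapolitan.symbol) // b ii maj
```

Because of the case insensitive option, the root comes back in whatever case the input used (ii here), so a caller that cares about the degree would probably uppercase it first.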


NSRegularExpression isn’t horrible. It works. It makes sense after you’ve seen it work, but getting there is a bit of a pita.

