An xAPI adopter asks:

“What happens when you have a Verb in mind, but can’t find it in the xAPI Registry or other Communities of Practice (CoP)? There are, however, other verbs already defined that mean approximately the same thing. Still, you want to use your verb because it’s used in your application and is, therefore, more recognizable and meaningful to your use case. You want your Statements to reflect an Activity both syntactically and semantically.”

So, how should you proceed? Do you coin your own verb or do you use a verb that already exists in the public domain? The answer to this question is not a short one and requires diving deeper into semantics and how that plays into making sure the data you generate is useful and usable across systems. We need to consider natural language and how that impacts semantic interoperability. Natural language uses different words to mean more or less the same thing, and we sometimes use them interchangeably. In terms of semantic interoperability, how does xAPI accommodate related concepts?

We can sum this up as two distinct questions to tackle:

  • From a best-practice standpoint, is it appropriate to create your own verb with a similar or matching definition to an existing one in the public domain?
  • Does the xAPI specification allow an LRS or other system to correlate or equate different terms for the sake of shared data, reporting and analytics?

I’m going to venture into very pedantic territory here because I feel these questions deserve a precise answer. In the end, the simple answer is that you should do what’s practical and allows you to get your job done. So with that caveat in place, let’s take on each question in more detail.

Part 1: From a best-practice standpoint, is it appropriate to create your own verb with a similar or matching definition to an existing one in the public domain?

Note: when discussing verbs in the public domain, I’m referring to existing URIs. Because Verb IDs are URIs, many verbs can be resolved to a location (URL), and when they resolve they can provide additional metadata. Several years ago, I wrote about some of these concepts in the blog post “Deep Dive: Verbs,” in case you care to learn more.

Generally I would say no to the concept of a “matching” definition and maybe to a “similar” definition. But it is extremely important to determine the degree to which something is “similar” in both definition and intent.

To help illustrate the complexity and implications of this question, let’s use this example:

“Bob uploaded a video.”

The verb “uploaded” isn’t found among the ADL Profiles or Tin Can Registry. Existing, related verbs are “imported” and “added” (and there are two instances of “added” with nearly the same definition).
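As a rough sketch, the sentence above could be expressed as an xAPI Statement along these lines, assuming you did coin your own verb. The verb IRI, the activity IRI, and Bob’s email address are all placeholders made up for illustration; none of them come from a registry or profile.

```python
import json

# Hypothetical IRI for a coined verb; example.com is a placeholder,
# not an entry in any registry.
UPLOADED = "https://example.com/xapi/verbs/uploaded"


def build_statement(name, mbox, verb_id, verb_label, object_id, object_name):
    """Assemble a minimal actor/verb/object Statement as a plain dict."""
    return {
        "actor": {"objectType": "Agent", "name": name, "mbox": mbox},
        "verb": {"id": verb_id, "display": {"en-US": verb_label}},
        "object": {
            "objectType": "Activity",
            "id": object_id,
            "definition": {"name": {"en-US": object_name}},
        },
    }


statement = build_statement(
    name="Bob",
    mbox="mailto:bob@example.com",  # placeholder address
    verb_id=UPLOADED,
    verb_label="uploaded",
    object_id="https://example.com/videos/123",  # placeholder activity IRI
    object_name="A video",
)

print(json.dumps(statement, indent=2))
```

Note that only the verb `id` carries meaning for a consuming system; the `display` value is just a human-readable label.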

This is a good example because it addresses both similar and matching verbs that exist in the public domain. Both “imported” and “added” exist in The Registry. I would argue that the definitions of “imported” and “added” are not similar (or at least not similar enough) to be used interchangeably. So a good case can be made that, although the two are similar, each verb ID has a valid place in the registry. The fact that there are two instances of “added” is a perfect example of why you shouldn’t coin two terms that are semantically the same.

Now, having said that, even though these two IDs (themselves being unique) use the same “added” descriptor, the verb usage might be distinguished by the activity type of the `object` that the statement captures, in which case they aren’t “semantically the same.” The second instance of “added” is not exactly the same as the first: the first clearly states that it is an action adding the object to the target, while the second indicates that the object is being added to a collection specifically (the activity definition type might indicate a “collection”).

Additionally, the indication that the latter specifies adding “one or more” is important because it implies that a system will have to be able to handle a set being added to a set, which the first case does not provide for. This is also where the concept of a “profile” becomes very important. The first verb is specifically in the collection of “version 1 of Activity Streams,” which itself provided for both the concept of a “target” and an “object.” So this tells you a bit about how that verb is to be interpreted, which is itself somewhat difficult in the context of xAPI since xAPI does not specifically outline these two separate concepts.

Going back to “imported,” its definition is “the act of moving an object into another location or system.” That is far more specific about how the verb works: it is about an object going from one system to another. Considering this definition, “imported” shouldn’t be interchanged with either of the “added” verbs, which give no indication about crossing system boundaries.

Now, you could “import” an electronic photo into a digital photo album on another system, but the key to watch for here is what information you care about capturing in this instance. Is it important (meaningful) information that a photo was “imported” (in other words, do we care about the mechanism used), or is it important that a specific photo is being “added” to a photo album?

Here’s another example. When talking with people about xAPI, I often hear them wanting to use the verb “clicked”. But this is rarely the correct verb choice. In most cases, the important action is not the actual “clicking” by a user. Rather, “clicking” is more often a mechanism to achieve a goal, and it’s information about that achievement we’re interested in capturing.

For instance “clicking” on a link isn’t interesting, but “opening” or “visiting” that link is the important information. After all, maybe a user was navigating with their keyboard, using touch on a mobile device, or using voice command software, etc. So the act of “clicking” a button on a mouse-like device isn’t the correct expression.
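One way to picture this is a small mapping layer in the activity provider that collapses different input mechanisms into the one action worth recording. This is only a sketch: the event names and the `visited` verb IRI below are hypothetical placeholders, not taken from any registry or UI framework.

```python
# Hypothetical verb IRI; a placeholder, not taken from any registry.
VISITED = "https://example.com/xapi/verbs/visited"

# Hypothetical UI event names. Each is a different mechanism for
# achieving the same goal: following a link.
LINK_ACTIVATIONS = {"click", "tap", "enter_key", "voice_command"}


def verb_for_link_event(event_type):
    """Return the verb to record for a UI event on a link.

    Returns None when the event does not amount to following the link,
    so nothing is recorded for it.
    """
    if event_type in LINK_ACTIVATIONS:
        return {"id": VISITED, "display": {"en-US": "visited"}}
    return None


for event in ("click", "voice_command", "hover"):
    print(event, "->", verb_for_link_event(event))
```

However the user got there, the Statement records the same verb, so the data describes the achievement rather than the input device.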

So for the photo album example, the meaningful action is most likely that a photo was “added” to an album. Again, having said that, it might be meaningful that a file was imported to a system because the system needs to track storage space, bandwidth use, etc. So in that use case, the meaningful information is that it was an import.

Takeaway: Verb definition and the meaning of the verb for your specific use case are both critical. When it comes to using an existing verb ID that is similar, be careful that it’s not just ‘close enough’. The ramifications of settling for using an existing verb ID rather than adding a new verb to the public domain will be shown below. On the flip side, using an existing verb ID when appropriate can be very beneficial as you will also see.

Part 2: Does the xAPI specification allow an LRS or other system to correlate or equate different terms for the sake of shared data, reporting and analytics?

Now that I’ve provided some insight on “semantic” within this context, “interoperability” is the other side of that coin to help us answer the question around correlation or equating different terms.

The xAPI specification is intentionally very narrow in its definition. To answer directly, the specification itself doesn’t address this concept at all. It is (and always has been) intended that additional systems that “consume” the xAPI data will necessarily have to correlate or equate disparate terms in a semantic manner. The important concept here is that the xAPI specification was intentional in its use of verb identifiers that are IRIs and not a single English (or any language) word. So the only thing that can be said is that when two statements have the exact same verb identifier, those verbs should be considered semantically equivalent. Beyond that, how different identifiers are used is really up to the context and the system consuming them.
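To make that concrete, here is a minimal sketch of what such correlation might look like inside a reporting tool. The equivalence table is purely a decision made by the consuming system, not something xAPI defines, and all three verb IRIs are hypothetical placeholders.

```python
from collections import Counter

# Hypothetical verb IRIs; placeholders, not taken from any registry.
ADDED = "https://example.com/xapi/verbs/added"
IMPORTED = "https://example.com/xapi/verbs/imported"
UPLOADED = "https://vendor.example.org/verbs/uploaded"

# A consumer-side choice, not part of xAPI: for this report,
# treat the vendor's "uploaded" as equivalent to "added".
EQUIVALENT_TO = {UPLOADED: ADDED}


def canonical_verb(verb_id):
    """Map a verb IRI to the IRI this consumer reports it under."""
    return EQUIVALENT_TO.get(verb_id, verb_id)


def count_by_verb(statements):
    """Count statements per canonical verb IRI."""
    return dict(Counter(canonical_verb(s["verb"]["id"]) for s in statements))


statements = [
    {"verb": {"id": ADDED}},
    {"verb": {"id": UPLOADED}},
    {"verb": {"id": IMPORTED}},
]

counts = count_by_verb(statements)
print(counts)
```

Identical IRIs are grouped with no extra work. Everything else needs a table like `EQUIVALENT_TO`, which someone has to build and maintain for every new verb ID a system encounters.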

The key to “interoperability” is that disparate systems that have already come across similar data can derive value from a new system’s data when it uses the same terms. So the primary reason to use a shared term, in this case the verb ID, is to reduce the amount of work a new system in an environment has to do to understand that existing environment.

A key tenet of xAPI is data portability. Data generated by xAPI activity providers is not intended to be siloed; so to really take advantage of this, you have to think in these terms. Coining all new terms (e.g. creating new verb IDs) may make the data “look” better, but ultimately means doing more work to incorporate other systems that may not yet be familiar with that kind of data. It ultimately silos the data and potentially calls into question why one would bother to use xAPI at all, as there are more efficient mechanisms for capturing event data that will only ever work with a single system.

Takeaway: If the Statement you are designing will potentially be consumed by multiple systems, it is worthwhile to be specific and intentional in your choices when it comes to coining new IDs or utilizing existing IDs. If the data from your learning record provider will never interact with other systems, then using xAPI may not be the right choice.

The importance of being pedantic

The key with “natural language” is that the system intended to use it (aka the human brain) is far more advanced at interconnecting related data than anything that we’ve produced. Computers are exceptional at handling large quantities of heavily specified data, which is how we end up with different verb identifiers for the same natural language verbs. Even as natural language processing systems have become more advanced, they’re still so far behind what a toddler is capable of that it’s staggering, at least in my opinion.

And while we use verbs interchangeably (e.g. “imported” and “added”), how well that works in practice also depends on context (which includes tone, physical communication, etc.). The degree to which someone is “pedantic” matters a great deal, depending on their environment.

For instance, I’m very pedantic when working with the xAPI specification group because the very thing we’re defining is a specification for how others should interpret a document to build a system. At most other times, that level of pedantry is either unnecessary or at times aggravating to other participants in a conversation. The irony is (as the length of this message shows) that the clearer our communication is, the less misinterpretation and resulting trouble we’ll run into later.

Hopefully this helps some, at least with the intentions of the xAPI specification.

Brian

Brian Miller is the Senior Director of Engineering and is also the most pedantic person at the office, which is saying something. That skill makes him great at ensuring our products support the standards, which is precisely what he spends his days doing. Brian is an IEEE LTSC voting member working on the advancement of learning standards, like xAPI and cmi5.