title: Data Validation in Typescript Using the Either Pattern
published: true
description: You can go with this, or you can go with that
tags: typescript, haskell, webdev, javascript


This summer, I worked on an internship project which involved creating a CRUD (Create, Read, Update, Destroy) application which handles Hackathons. During this project, my team and I discovered that we had no comprehensive solution or pattern for validating data coming into the application with the Create and Update actions.

In the end, our API methods would always consist of checking for the presence of a field, then checking some value based on that field, and so on. Instead of using the strong type checking abilities of Typescript, we resorted to frequent use of any and optional fields on our models. It was a mess (as an internship project might be).

```typescript
interface Hackathon {
    name: string;
    endDate?: number; // UNIX epoch timestamp
    startDate?: number;
    // ...
}

// Class methods (the surrounding class is omitted)
validateHackathon(hackathon: any): void {
    if (hackathon['endDate'] && hackathon['startDate']) {
        if (hackathon['endDate'] < 0) {
            throw new Error("End date cannot be negative!");
        }
        if (hackathon['startDate'] < 0) {
            throw new Error("Start date cannot be negative!");
        }
        if (hackathon['startDate'] > hackathon['endDate']) {
            throw new Error("Start date must be before end date!");
        }
    }
    // ... various property checks and data validation steps ...
}

async updateHackathon(hackathon: any): Promise<void> {
    this.validateHackathon(hackathon);
    // If the program gets to this step, then the object must have correct data and the correct type
    await this.repository.updateItem(hackathon as Hackathon);
}
```

While I was working on this project, I was also learning Haskell, a powerful purely functional programming language. Since this post isn't meant to convince you to learn Haskell, I'll just introduce one powerful pattern which can be found in the language's base library: Either. Or, more specifically, Either a b. We'll discuss how this pattern can be introduced into Typescript, and how, with some setup and background, it can make data validation a lot simpler.

What is Either?

Essentially, Either is a type which can represent one of two other types. In Haskell, this idea is written as Either a b, where a and b represent the two other types. But only one type can be represented at a time. So, as its name suggests, at runtime, Either a b can only be a or b, but not both. An Either Int String will hold either an Int or a String.

In order to determine which form Either is taking at any given time, the two options of types will be wrapped in a special value. In Haskell, these options are called Left and Right. So an Either Int String can be a Left Int or a Right String. In general, this pattern is known as a Tagged or Discriminated Union (Wikipedia). The two separate types have been combined into one type through the use of an object which "tags," or indicates, which type is in use.

In Haskell, the definition for Either takes the form of an algebraic data type:

```haskell
data Either a b = Left a | Right b
```

Here, the vertical bar | refers to a logical OR, where, again, Either a b can be Left a OR Right b. We'll reuse this syntax when we write Either in Typescript.

The power of Either comes from its use in error handling. By convention, the Left type is the "error" type, and the Right type is the "value" type. As an Either value is passed through a program, operations are performed on the Right value. If an error occurs, the error's information can be "stored" in the Left type. The program will then continue, checking if an error is present, and passing the error's information along, performing no other computation in the process.

Therefore, a sequence of operations, such as data validation, can be written such that each validation step can throw its own error, and the first error found will be propagated through the operation sequence, rather than branching out from the normal logic of the program.

Either in Typescript

We can see that the Either pattern is really powerful just from its theoretical definitions. But can we write it in Typescript? Yes! Luckily, Typescript includes support for discriminated unions, as long as we write a few other methods which help the Typescript compiler infer which tagged type is actually in use. So let's write Either in Typescript.

First, we want to define interfaces which have the shared (tagged) property (also known as the "discriminant"). We'll need to leverage Generics, as well, so that any type can be held within our union objects. Since we are working with Left and Right, we'll make those our interface names, and we'll use two properties in each interface to create the structure of the union: value will hold the actual typed value of the object, and tag will purely refer to which type of container is in use.

```typescript
interface Left<A> {
    value: A;
    tag: 'left'
}

interface Right<B> {
    value: B;
    tag: 'right'
}
```

(Both interfaces could have used A to refer to the generic type, but it can be confusing to see the same letter.)

Now that we have our separate interfaces, we need to declare a type alias which will refer to either Left or Right:

```typescript
type Either<A,B> = Left<A> | Right<B>;
```

If we had written just Either<A>, we wouldn't have gotten the behavior we wanted: Both sides of the Either would have had to hold the same type, not two different types.

Finally, we can write the helper functions that Typescript requires to translate the tagged value into a type inference.

```typescript
function isLeft<A>(val: any): val is Left<A> {
    if ((val as Left<A>).tag === 'left') return true;
    return false;
}

function isRight<B>(val: any): val is Right<B> {
    if ((val as Right<B>).tag === 'right') return true;
    return false;
}
```

These functions, simply put, cast their incoming value as a Left or Right, and then check the value of the tag field. The strange return value of val is Left<A> is the annotation for the compiler that, in the coming context, the type of val is Left<A>.
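
To make that a bit more concrete, here is a minimal sketch of the narrowing in action. The summarize function and the two sample values are made up for illustration; the rest repeats the definitions from above so the snippet stands on its own.

```typescript
interface Left<A> { value: A; tag: 'left' }
interface Right<B> { value: B; tag: 'right' }
type Either<A, B> = Left<A> | Right<B>;

function isLeft<A>(val: any): val is Left<A> {
    return (val as Left<A>).tag === 'left';
}

function isRight<B>(val: any): val is Right<B> {
    return (val as Right<B>).tag === 'right';
}

// Hypothetical helper: the compiler narrows `e` inside each branch
function summarize(e: Either<string, number>): string {
    if (isLeft(e)) {
        // Here `e` is a Left<string>, so string methods are allowed
        return "error: " + e.value.toUpperCase();
    }
    // Here `e` can only be a Right<number>, so number methods are allowed
    return "value: " + e.value.toFixed(1);
}

const failure: Either<string, number> = { value: "not a number", tag: 'left' };
const success: Either<string, number> = { value: 42, tag: 'right' };

console.log(summarize(failure)); // error: NOT A NUMBER
console.log(summarize(success)); // value: 42.0
```

Without the type guard, the compiler would only know that value is a string or a number, and it would reject both toUpperCase and toFixed.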

Finally, we're going to write some constructors for the Left and Right types. Whereas the interface definitions above tell us what a Left and Right value might look like, we can write a method which acts like a constructor to make creating these objects explicit:

```typescript
function Left<A>(val: A) : Left<A> {
    return { value: val, tag: 'left' };
}

function Right<B>(val: B) : Right<B> {
    return { value: val, tag: 'right' };
}
```

When we wrote the interfaces above, we essentially defined types called "Left" and "Right." Here, we are writing functions with the same names, and Typescript can figure it out because types and values live in separate namespaces.

What does this have to do with Hackathons?

Let's actually put this together to do some data validation! Say that the only information we need about an error that occurs during validation is a string. Let's make a quick type alias to make that clear in our method signatures:

```typescript
type MyError = string;
```

Super simple. Now, we can write the validateHackathon method from above, but using Either:

```typescript
// Assumes startDate and endDate are already known to be present at this point
function validateHackathon(h: Hackathon) : Either<MyError, Hackathon> {
    if (h.endDate < 0) {
        return Left<MyError>("End date cannot be negative!");
    }
    if (h.startDate < 0) {
        return Left<MyError>("Start date cannot be negative!");
    }
    if (h.startDate > h.endDate) {
        return Left<MyError>("Start date must be before end date!");
    }
    // etc
    return Right<Hackathon>(h);
}
```

You might be asking yourself, how can we return Left at one point and Right at another? This comes from the logical OR aspect of our definition of Either. Either can be a Left or a Right type, so as long as the return value is a Left OR Right, the type signature holds.

Also, notice here that we are requiring the incoming value to be of type Hackathon, whereas in the function above it was an any type and we casted to Hackathon at the end. Part of cleaning up the validation is separating the structure of the incoming data from any limits that we might have on its values. Validating the structure of the data can be something done with a JSON Schema and validator. Validating the limits that we have on the values of the incoming data is what will be addressed with our Either methods.
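
As a rough sketch of that split, the structural half could look something like the following. This is a hand-written type guard standing in for a real JSON Schema validator, not what we used in the project; the name hasHackathonShape is made up, and I'm assuming a Hackathon whose two dates are required.

```typescript
// Assumption: both dates are required once the structure has been checked
interface Hackathon {
    name: string;
    startDate: number; // UNIX epoch timestamp
    endDate: number;
}

// Hypothetical structural check: it only looks at which fields exist and
// what types they have, and says nothing about whether the values make sense
function hasHackathonShape(input: unknown): input is Hackathon {
    if (typeof input !== 'object' || input === null) return false;
    const candidate = input as Record<string, unknown>;
    return typeof candidate['name'] === 'string'
        && typeof candidate['startDate'] === 'number'
        && typeof candidate['endDate'] === 'number';
}

const incoming: unknown = JSON.parse('{"name":"Summer Hack","startDate":10,"endDate":-20}');

if (hasHackathonShape(incoming)) {
    // `incoming` is typed as Hackathon here. The negative endDate still gets
    // through, because limits on values are the job of the Either checks.
    console.log(incoming.name);
}
```

Once the structure is settled, the Either-based functions can take a Hackathon and focus purely on its values.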

So, this method is interesting, but it isn't really that different from what we had before. Now we just have a funky method signature, and we use these Left and Right constructors instead of just throwing an error or returning a value. What's so special?

Creating Predicate functions

If we squint hard enough at our existing validation function, we can see that it has a repetitive structure: Using an if statement, we check some property of the incoming value. If the condition doesn't hold, we throw the corresponding error. We do this over and over again for different properties and their errors.

Any function which takes a value and returns true or false is called a predicate. Using Either, we can write a function that evaluates some object against the predicate, and if the predicate doesn't pass, the resulting Either takes the Left error form. We can call this method predicateEither. We'll also create a type alias for a predicate function, so I don't have to re-write these predicate signatures in each helper method signature:

```typescript
type Predicate<N> = (val: N) => boolean;

function predicateEither<A, B>(value: B, error: A, predicate: Predicate<B>) : Either<A, B> {
    if (!predicate(value)) return Left(error);
    return Right(value);
}
```

So now, for example, we can validate on negative dates with a predicate:

```typescript
const StartDateMustBePositive = (h: Hackathon) => h.startDate > 0;

let badHackathon : Hackathon = { name: "Bad", startDate: -10, endDate: -10 };

let result = predicateEither(badHackathon, "Start Date must be positive!", StartDateMustBePositive);

// Result = Left "Start Date must be positive!"

let goodHackathon : Hackathon = { name: "Good", startDate: 10, endDate: -10 };

result = predicateEither(goodHackathon, "Start Date must be positive!", StartDateMustBePositive);

// Result = Right (goodHackathon)
```

Notice that we don't need to include Generic type indicators anywhere because Typescript can fill in the blanks for us!

Combining Predicates

But wait, you might be saying. "Good Hackathon" isn't actually good, it still has a negative end date!

You're right, and so we should write another predicate function for that. But how do we combine that with the first predicate? We don't want to check the result value each time we use predicateEither, since then we might as well be doing manual error handling, and we'll create a lot of branches in our program:

```typescript
const EndDateMustBePositive = (h: Hackathon) => h.endDate > 0;

function validateHackathon(h: Hackathon) : Either<MyError, Hackathon> {
    let result = predicateEither(h, "Start Date must be positive!", StartDateMustBePositive);
    if (isLeft(result)) return result; // Branch!
    result = predicateEither(h, "End Date must be positive!", EndDateMustBePositive);
    if (isLeft(result)) return result; // Repetitive!
    return result;
}
```

One of my favorite programming principles is DRY (Don't Repeat Yourself), and we are certainly violating that here. So let's write one final helper function which will make this whole endeavor worth it.

This method is called firstLeft. It takes an initial value, a list of predicates, and a list of errors. The value is tested against each predicate until one fails, in which case the corresponding error is returned. If no predicates fail, the value will be returned.

```typescript
function firstLeft<A, B>(val: B, predicates: Predicate<B>[], errors: A[]) : Either<A, B> {
    for (let i = 0; i < predicates.length; i++) {
        let p = predicates[i];
        if (!p(val)) return Left(errors[i]);
    }
    return Right(val);
}
```

With this structure, we can create a list of predicates and their errors, and trust that the first error found will be the one that we are alerted to:

```typescript
let predicates = [ StartDateMustBePositive, EndDateMustBePositive ];
let messages = [ "Start Date must be positive!", "End Date must be positive!" ];

function validateHackathon(h: Hackathon) : Either<MyError, Hackathon> {
    return firstLeft(h, predicates, messages);
}

// Class method (the surrounding class is omitted)
async updateHackathon(h: Hackathon) : Promise<void> {
    let result = validateHackathon(h);
    if (isLeft(result)) {
        console.error(result.value);
        return;
    }
    await this.repository.updateItem(h);
}
```

Dope! We've just transformed our repetitive, branching mess into a single line, and we've ensured that, at the first sign of a validation error, the original logic won't continue.

A "Spec" for Validation

I could stop here, but I want to change our firstLeft method just a bit. Having the predicates and messages as two separate arrays feels wrong; what if someone added a predicate but forgot to add a corresponding error message? Javascript doesn't throw on an out-of-bounds array index, it just returns undefined, so a failing check would silently produce a Left holding undefined instead of a message.
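
To see concretely what happens when the two arrays drift apart, here is a small sketch. The checks and texts arrays and the sample Hackathon are made up for the demonstration, and I'm assuming a Hackathon with required dates; firstLeft is the two-array version from above.

```typescript
interface Left<A> { value: A; tag: 'left' }
interface Right<B> { value: B; tag: 'right' }
type Either<A, B> = Left<A> | Right<B>;
type Predicate<N> = (val: N) => boolean;

function Left<A>(val: A): Left<A> { return { value: val, tag: 'left' }; }
function Right<B>(val: B): Right<B> { return { value: val, tag: 'right' }; }

function firstLeft<A, B>(val: B, predicates: Predicate<B>[], errors: A[]): Either<A, B> {
    for (let i = 0; i < predicates.length; i++) {
        if (!predicates[i](val)) return Left(errors[i]);
    }
    return Right(val);
}

// Assumption: both dates are required here
interface Hackathon { name: string; startDate: number; endDate: number; }

const checks: Predicate<Hackathon>[] = [h => h.startDate > 0, h => h.endDate > 0];
const texts: string[] = ["Start Date must be positive!"]; // second message forgotten

const oops: Hackathon = { name: "Oops", startDate: 10, endDate: -10 };
const outcome = firstLeft(oops, checks, texts);

// The second check fails, but there is no texts[1], so the Left carries undefined
console.log(outcome.tag, outcome.value);
```

Nothing crashes and the compiler doesn't complain with default settings, which is exactly what makes this kind of mistake easy to miss.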

In this case I want to take advantage of tuples. Typescript does have array-style tuple types such as [A, B], but a small object with named fields is easier to read. If we use a tuple-style object, we can effectively create a big list of predicates and their corresponding error messages. This big list can act as a "spec" for the object: any property that the object must satisfy can be found in the list.

Let's make a little "Pair" type and use it to create such a spec:

```typescript
interface Pair<A, B> {
    first: A;
    second: B;
}

function firstLeft<A, B>(val: B, predicatePairs: Pair<Predicate<B>, A>[]): Either<A, B> {
    for (let i = 0; i < predicatePairs.length; i++) {
        let p = predicatePairs[i].first;
        let e = predicatePairs[i].second;
        if (!p(val)) return Left(e);
    }
    return Right(val);
}

const HackathonSpec : Pair<Predicate<Hackathon>, MyError>[] = [
    { first: StartDateMustBePositive, second: "Start Date must be positive!" },
    { first: EndDateMustBePositive, second: "End Date must be positive!" }
];

function validateHackathon(h: Hackathon) : Either<MyError, Hackathon> {
    return firstLeft(h, HackathonSpec);
}
```
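
Here is a self-contained sketch of a spec being run against a few inputs, to show that the first failing entry is the one reported. The reportFor helper and the sample data are made up, I'm assuming a Hackathon with required dates, and the third spec entry is borrowed from the start-before-end check in the very first version of validateHackathon.

```typescript
interface Left<A> { value: A; tag: 'left' }
interface Right<B> { value: B; tag: 'right' }
type Either<A, B> = Left<A> | Right<B>;
type Predicate<N> = (val: N) => boolean;
interface Pair<A, B> { first: A; second: B; }

function Left<A>(val: A): Left<A> { return { value: val, tag: 'left' }; }
function Right<B>(val: B): Right<B> { return { value: val, tag: 'right' }; }

function firstLeft<A, B>(val: B, predicatePairs: Pair<Predicate<B>, A>[]): Either<A, B> {
    for (const pair of predicatePairs) {
        if (!pair.first(val)) return Left(pair.second);
    }
    return Right(val);
}

// Assumption: both dates are required here
interface Hackathon { name: string; startDate: number; endDate: number; }

const spec: Pair<Predicate<Hackathon>, string>[] = [
    { first: h => h.startDate > 0, second: "Start Date must be positive!" },
    { first: h => h.endDate > 0, second: "End Date must be positive!" },
    // Extra rule for illustration: the start may not come after the end
    { first: h => h.startDate <= h.endDate, second: "Start date must be before end date!" },
];

// Hypothetical helper that turns a validation result into a one-line report
function reportFor(h: Hackathon): string {
    const checked = firstLeft(h, spec);
    return checked.tag === 'left'
        ? `${h.name}: rejected (${checked.value})`
        : `${h.name}: accepted`;
}

console.log(reportFor({ name: "A", startDate: -10, endDate: -10 })); // A: rejected (Start Date must be positive!)
console.log(reportFor({ name: "D", startDate: 10, endDate: 20 }));   // D: accepted
```

Hackathon "A" breaks the first two rules, but only the first one is reported, since firstLeft stops at the first predicate that fails.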

More complicated predicates

This pattern is really cool when you're using simple predicates, but business logic is hardly ever simple. How can we adapt this pattern for more complicated predicates, which require more than one input?

The answer is that we can write any kind of complex logic in our predicates, as long as we find a way to ensure they take one input and return a boolean. For example, in our internship project, we had to ensure that the dates for an incoming Hackathon didn't overlap with any existing Hackathon dates.

To test this predicate, we have to examine the incoming Hackathon against every other Hackathon. You might imagine that this would mean our predicate must have two inputs: (incomingHackathon: Hackathon, existingHackathons: Hackathon[]). But we can instead use closures to introduce the existing Hackathons inside of the predicate function:

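```typescript
class HackathonController {
    getAllHackathons(): Hackathon[] {
        return []; // placeholder: load the existing Hackathons here
    }

    // True when h overlaps with none of the existing Hackathons:
    // each existing one must end before h starts, or start after h ends.
    // every() also returns true for an empty list, where reduce() without
    // an initial value would throw.
    DatesMustNotOverlap = (h: Hackathon) => {
        return this.getAllHackathons()
                   .every(v => v.endDate < h.startDate
                            || v.startDate > h.endDate);
    };
    // etc
}
```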
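
The same closure idea also works outside of a class. Below is a minimal sketch of a factory function that captures the existing Hackathons and hands back a one-argument predicate. The names datesMustNotOverlap, scheduled, and noOverlap are made up, I'm assuming a Hackathon with required dates, and I'm treating two Hackathons that share an endpoint as overlapping.

```typescript
// Assumption: both dates are required here
interface Hackathon { name: string; startDate: number; endDate: number; }
type Predicate<N> = (val: N) => boolean;

// Hypothetical factory: `existing` is captured by the returned closure,
// so the predicate itself still takes a single Hackathon
function datesMustNotOverlap(existing: Hackathon[]): Predicate<Hackathon> {
    return h => existing.every(
        // Each existing Hackathon must end before h starts, or start after h ends
        v => v.endDate < h.startDate || v.startDate > h.endDate
    );
}

const scheduled: Hackathon[] = [{ name: "Spring", startDate: 10, endDate: 20 }];
const noOverlap = datesMustNotOverlap(scheduled);

console.log(noOverlap({ name: "Summer", startDate: 30, endDate: 40 })); // true
console.log(noOverlap({ name: "Clash", startDate: 15, endDate: 25 }));  // false
```

Because noOverlap has the plain Predicate shape, it can sit in a spec list right next to the simple date checks.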

In Conclusion

Overall, using Either in this way creates a powerful pattern that allows for data validation steps to become much clearer and for their error messages to be more helpful. There are a lot of other things that can be done with Either, Pairs, and discriminated unions, which I hope to explore and discuss more in the future!

Footnote for those of you who know what you are talking about

I should say: I'm still very much new to Haskell and its powerful ideas, like Monads, Functors, Applicative, and Transformers. I'm still working on learning and fully understanding these ideas. Either is an interesting concept that I have found I can much more fully understand through implementation in Typescript (after all, Javascript was the first language I learned).

Because Typescript lacks a few powerful aspects of functional programming that truly elevate Either and other Monadic patterns to a new level (most notably partial function application), this implementation isn't nearly as powerful as Haskell's! But that's okay.