Sawtooth Software: The Survey Software of Choice

Validating Email Address Formats

I have a question in my survey that asks for the respondent's email address. But I want to add something that makes sure they type in a valid email address. How do I do that?

While we can't check whether an email address is valid (without sending and tracking an email to see if that email address actually exists), we can check whether it is properly formatted and follows the patterns established by Internet standards. That is something we can do with a little help from JavaScript and a feature called regular expressions.

Proper Email Address Formats

First, let's talk about what a properly formatted email address looks like. It must have a local part and a domain part. The local part usually consists of a user name and is everything to the left of the @ symbol, and the domain part is everything to the right of that symbol, i.e. username@domain. Both the local part and the domain part may be broken down into smaller parts that are often separated by one or more periods. The very last part of the domain name is called the top-level domain, and is typically .com, .net, .mil, .edu, .gov, .org, etc. The top-level domain may also be a country code designation such as .uk, .ca, or .mx. Domains also must follow rules, like the following:

  • Spaces are not allowed
  • It is not case sensitive
  • Dashes or numbers are allowed
  • These special characters are not allowed: ! @ # $ % ^ & < * ( ) { } | [ ] or >
  • Each period-separated part cannot exceed 63 characters, and the whole domain cannot exceed 253 characters
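
The domain rules above can be sketched in a few lines of JavaScript. This is only a minimal sketch for ordinary host names, not an implementation of the standards: the function names splitAddress and isPlausibleDomain are hypothetical, and applying the 63-character limit to each period-separated part is an assumption about what the length rule refers to.

```javascript
// Split an address into its local part and domain part at the last @ symbol.
// Returns null when there is no @, or when either part is empty.
function splitAddress(address) {
  var at = address.lastIndexOf("@");
  if (at < 1 || at === address.length - 1) {
    return null;
  }
  return { local: address.slice(0, at), domain: address.slice(at + 1) };
}

// Check a domain against the simplified rules listed above:
// letters, digits, and dashes only, in period-separated parts
// of 1 to 63 characters each (the per-part limit is an assumption).
function isPlausibleDomain(domain) {
  var labels = domain.split(".");
  for (var i = 0; i < labels.length; i++) {
    if (!/^[A-Za-z0-9-]{1,63}$/.test(labels[i])) {
      return false;
    }
  }
  return true;
}

var parts = splitAddress("username@mail.example.com");
console.log(parts.local);                          // username
console.log(parts.domain);                         // mail.example.com
console.log(isPlausibleDomain(parts.domain));      // true
console.log(isPlausibleDomain("bad domain.com"));  // false (contains a space)
```

A check like this is deliberately stricter than the standards; for instance, it would reject a bracketed IP address used in place of a domain name.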

But there are lots and lots of extra rules that govern the formatting of email addresses. Thousands of rules in fact. To see some of them, do an Internet search for RFC 5322, RFC 2822 or RFC 3696, which were published by the Internet Engineering Task Force (IETF), whose prime directive is to make the Internet run smoother.

For example, here are some valid email addresses:

  • user@[IPv6:2001:db8:1ff::a0b:dbd0]
  • "much.more unusual"@example.com
  • "very.(),:;<>[]\".VERY.\"very@\\ \"very\".unusual"@strange.example.com
  • postbox@com
  • admin@mailserver1
  • !#$%&'*+-/=?^_`{}|~@example.org
  • "()<>[]:,;@\\\"!#$%&'*+-/=?^_`{}| ~.a"@example.org
  • " "@example.org
  • üñîçøðé@example.com

Solving the Problem

Now with that history behind us, let's figure out how to actually solve the problem at hand and validate the format of an email address. We want to check to make sure it contains a user name, the @ symbol, and a domain. We can accomplish this with some JavaScript error validation that uses something called a regular expression. Basically what it does is check the input to see if it conforms to a specific format or pattern. If it doesn't then it throws an error on the screen that says something like, "Hey, Dummy. I meant a VALID email address. Try again." And it won't let you proceed to another page until you enter in a valid email address.

So let's say you wanted to add this functionality to an open-end question called "UserEmail". In the question dialog box, you would click the "Advanced..." button, and then select the "Custom JavaScript Verification" tab. Then, after checking the "after" checkbox at the top of the dialog, you'd paste in the following code:

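// Returns an error message if the address does not match the pattern,
// or an empty string if it does.
function emailvalidate(address, pattern) {
  if (!pattern.test(address)) {
    return "You entered an invalid email address.";
  }
  return "";
}

{
  // Read the respondent's answer to the "UserEmail" question.
  var address = SSI_GetValue("UserEmail");
  var pattern = /^([A-Za-z])+([A-Za-z0-9_\.])+\@([A-Za-z0-9_\-\.])+\.([A-Za-z]{2,4})$/;
  // A non-empty strErrorMessage is shown on screen and keeps the respondent on the page.
  strErrorMessage = emailvalidate(address, pattern);
}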

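This particular regex string ^([A-Za-z])+([A-Za-z0-9_\.])+\@([A-Za-z0-9_\-\.])+\.([A-Za-z]{2,4})$ was designed to let the vast majority of email addresses pass its pattern test. However, it does reject some valid addresses. The local part must start with a letter and may contain only letters, numbers, underscores, and periods, so an address with a plus sign or dash before the @ symbol fails. It also won't allow domain extensions that are longer than four characters, such as the .museum top level domain.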
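
To see where this pattern draws the line, you can run it against a few sample addresses outside of the survey. The snippet below is only a quick sketch; the sample addresses are made up for illustration.

```javascript
// The pattern from the verification code above.
var pattern = /^([A-Za-z])+([A-Za-z0-9_\.])+\@([A-Za-z0-9_\-\.])+\.([A-Za-z]{2,4})$/;

// Hypothetical sample addresses chosen to probe the pattern's limits.
var samples = [
  "jane.doe@example.com",     // ordinary address
  "jane@example.co.uk",       // two-letter country code
  "curator@example.museum",   // top level domain longer than four letters
  "jane+survey@example.com",  // plus sign in the local part
  "jane@example"              // no top level domain
];

samples.forEach(function (address) {
  var verdict = pattern.test(address) ? "accepted" : "rejected";
  console.log(address + " -> " + verdict);
});
```

The first two addresses are accepted and the last three are rejected. Note that the plus-sign address is a perfectly deliverable format, which is the kind of trade off to weigh before settling on a pattern.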

There are myriad regex patterns which you may use within this code. Of course, each one comes with trade offs. You'll need to pick the pattern that is most likely to match the emails of your panel or potential respondents. Just make sure you thoroughly test a pattern before you use it, because it will do exactly what you tell it to do.

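For example, the following pattern allows any two-letter country code top level domain, but only specific generic top level domains, like .museum: ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.(?:[A-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)$ Because the pattern is written with uppercase letters only, it needs the case-insensitive flag (a trailing i, as in /pattern/i) to match addresses typed in lowercase.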

Or you could go with something looser. For instance, this pattern ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$ would allow top level domains like .museum, but it would also allow something like john@mail.office, which may not be what you want. It is far more likely that John just forgot to type in the .com top level domain.
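
The trade off is easy to see by testing the looser pattern directly. This is a small sketch with a hypothetical variable name; the i flag is added because the pattern is written in uppercase.

```javascript
// The looser pattern: any top level domain of two to six letters.
var loose = /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$/i;

console.log(loose.test("curator@example.museum")); // true: six-letter top level domain
console.log(loose.test("john@mail.office"));       // true: "office" also has six letters
console.log(loose.test("john@mail"));              // false: no period after the @ symbol
```

The pattern only checks the shape of the address, so it cannot tell a real top level domain from a word that happens to have the right length.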

For more information about regular expressions and validating emails, please visit the following link: http://www.regular-expressions.info/email.html.
