Raise your hand if you know how to validate an email address. For those of you with your hand in the air, put it down quickly before someone sees you. It’s an odd sight to see someone sitting alone at the keyboard raising his or her hand. I was speaking metaphorically.

Before yesterday I would have raised my hand (metaphorically) as well. I needed to validate an email address on the server. Something I’ve done a hundred thousand times (seriously, I counted) using a handy dandy regular expression in my personal library.

This time, for some reason, I decided to take a look at my underlying assumptions. I had never actually read (or even skimmed) the RFC for an email address. I simply based my implementation on my preconceived assumptions about what makes a valid email address.

Nearly 100% of regular expressions on the web purporting to validate an email address are too strict.

It turns out that the local part of an email address, the part before the @ sign, allows a lot more characters than you’d expect. According to section 2.3.10 of RFC 2821 which defines SMTP, the part before the @ sign is called the local part (the part after being the host domain) and it is only intended to be interpreted by the receiving host…

"Consequently, and due to a long history of problems when intermediate hosts have attempted to optimize transport by modifying them, the local-part MUST be interpreted and assigned semantics only by the host specified in the domain part of the address."

Section 3.4.1 of RFC 2822 goes into more detail about the specification of an email address (emphasis mine).

"An addr-spec is a specific Internet identifier that contains a locally interpreted string followed by the at-sign character ("@", ASCII value 64) followed by an Internet domain. The locally interpreted string is either a quoted-string or a dot-atom."

A dot-atom is a dot delimited series of atoms. An atom is defined in section 3.2.4 as a series of alphanumeric characters and may include the following characters (all the ones you need to swear in a comic strip)…

Not only that, but it’s also valid (though not recommended and very uncommon) to have quoted local parts which allow pretty much any character. Quoting can be done via the backslash character (what is commonly known as escaping) or via surrounding the local part in double quotes.

RFC 3696, Application Techniques for Checking and Transformation of Names, was written by the author of the SMTP protocol (RFC 2821) as a human readable guide to SMTP. In section 3, he gives some examples of valid email addresses. Gotta love the author for using my favorite example person, Joe Blow.

Quick, run these through your favorite email validation method. Do they all pass?

For fun, I decided to try and write a regular expression (yes, I know I now have two problems. Thanks.) that would validate all of these. I am not worrying about checking my assumptions for the domain part for now.

public void EmailTests(string email, bool expected)

Before you call me a completely anal nitpicky numnut (you might be right, but wait anyways), I don’t think this level of detail in email validation is absolutely necessary. Most email providers have stricter rules than are required for email addresses. For example, Yahoo requires that an email start with a letter. There seems to be a standard stricter set of rules most email providers follow, but as far as I can tell it is undocumented.

I think I’ll sign up for an email address like and start bitching at sites that require emails but don’t let me create an account with this new email address.

The lesson here is that it is healthy to challenge your preconceptions and assumptions once in a while and to never let me near an RFC.

UPDATES: Corrected some mistakes I made in reading the RFC. See! Even after reading the RFC I still don’t know what the hell I’m doing! Just goes to show that programmers can’t read. I updated the post to point to RFC 822 as well.

It seems regular expressions come up a lot when someone mentions validating email addresses. Curiously, and just as incorrect in my opinion, regex are also mentioned in the same sentence as parsing HTML. Notice the parsing bit, but that's not the topic of this post. I've presented on the topic of regular expressions a number of times at user groups and I always put up a slide showing this regex. It's the regex used by the Perl module Mail::RFC822::Address and it's nasty.
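The local part rules quoted from RFC 2822 above (a dot-atom or a quoted-string) can be sketched in a few lines. This is a rough Python illustration, not the regular expression from this post: the names `is_valid_local_part` and `looks_like_address` are made up for the example, the set of special characters is the atext list from RFC 2822 section 3.2.4 as I read it, the quoted-string pattern is simplified, and backslash escapes are only accepted inside double quotes. Like the post, it makes no attempt to check the domain part.

```python
import re

# Special characters allowed in an atom besides letters and digits
# (assumption: taken from the atext rule in RFC 2822 section 3.2.4).
ATEXT_SPECIALS = "!#$%&'*+-/=?^_`{|}~"
ATOM = "[A-Za-z0-9" + re.escape(ATEXT_SPECIALS) + "]+"

# dot-atom: one or more atoms separated by single dots.
DOT_ATOM = re.compile(ATOM + r"(?:\." + ATOM + r")*")

# quoted-string (simplified): double quotes around ordinary characters
# or backslash-escaped pairs. Folding whitespace and comments are ignored.
QUOTED_STRING = re.compile(r'"(?:[^"\\\r\n]|\\.)*"')


def is_valid_local_part(local: str) -> bool:
    """Return True if `local` is a dot-atom or a (simplified) quoted-string."""
    return bool(DOT_ATOM.fullmatch(local) or QUOTED_STRING.fullmatch(local))


def looks_like_address(address: str) -> bool:
    """Check only the local part; the domain just has to be non-empty."""
    # Split at the LAST '@' so a quoted local part may itself contain '@'.
    local, at_sign, domain = address.rpartition("@")
    return bool(at_sign) and bool(domain) and is_valid_local_part(local)


# Illustrative sample addresses (hypothetical, on the reserved example.com domain).
for sample in [
    "customer/department=shipping@example.com",
    "!def!xyz%abc@example.com",
    '"Fred Bloggs"@example.com',
    "a..b@example.com",
]:
    print(sample, looks_like_address(sample))
```

Splitting on the last at-sign is a deliberate choice here: a quoted local part can legally contain an at-sign, so splitting on the first one would cut such an address in the wrong place.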