draft-ietf-appsawg-malformed-mail-10.txt   draft-ietf-appsawg-malformed-mail.txt 
APPSAWG M. Kucherawy APPSAWG M. Kucherawy
Internet-Draft G. Shapiro Internet-Draft G. Shapiro
Intended status: Informational N. Freed Intended status: Informational N. Freed
Expires: May 10, 2014 November 6, 2013 Expires: June 3, 2014 November 30, 2013
Advice for Safe Handling of Malformed Messages Advice for Safe Handling of Malformed Messages
draft-ietf-appsawg-malformed-mail-10 draft-ietf-appsawg-malformed-mail-12
Abstract Abstract
Although Internet mail formats have been precisely defined since the Although Internet mail formats have been precisely defined since the
1970s, authoring and handling software often show only mild 1970s, authoring and handling software often show only mild
conformance to the specifications. The malformed messages that conformance to the specifications. The malformed messages that
result are non-standard. Nonetheless, decades of experience have result are non-standard. Nonetheless, decades of experience have
shown that handling with some tolerance the malformations that result shown that handling with some tolerance the malformations that result
is often an acceptable approach, and is better than rejecting the is often an acceptable approach, and is better than rejecting the
messages outright as nonconformant. This document includes a messages outright as nonconformant. This document includes a
skipping to change at page 1, line 38 skipping to change at page 1, line 38
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 10, 2014. This Internet-Draft will expire on June 3, 2014.
Copyright Notice Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 18 skipping to change at page 2, line 18
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. The Purpose Of This Work . . . . . . . . . . . . . . . . . 3 1.1. The Purpose Of This Work . . . . . . . . . . . . . . . . . 3
1.2. Not The Purpose Of This Work . . . . . . . . . . . . . . . 4 1.2. Not The Purpose Of This Work . . . . . . . . . . . . . . . 4
1.3. General Considerations . . . . . . . . . . . . . . . . . . 4 1.3. General Considerations . . . . . . . . . . . . . . . . . . 4
2. Document Conventions . . . . . . . . . . . . . . . . . . . . . 5 2. Document Conventions . . . . . . . . . . . . . . . . . . . . . 5
2.1. Examples . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.1. Examples . . . . . . . . . . . . . . . . . . . . . . . . . 5
3. Background . . . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Background . . . . . . . . . . . . . . . . . . . . . . . . . . 5
4. Invariant Content . . . . . . . . . . . . . . . . . . . . . . 6 4. Invariant Content . . . . . . . . . . . . . . . . . . . . . . 5
5. Mail Submission Agents . . . . . . . . . . . . . . . . . . . . 7 5. Mail Submission Agents . . . . . . . . . . . . . . . . . . . . 6
6. Line Termination . . . . . . . . . . . . . . . . . . . . . . . 7 6. Line Termination . . . . . . . . . . . . . . . . . . . . . . . 7
7. Header Anomalies . . . . . . . . . . . . . . . . . . . . . . . 8 7. Header Anomalies . . . . . . . . . . . . . . . . . . . . . . . 7
7.1. Converting Obsolete and Invalid Syntaxes . . . . . . . . . 8 7.1. Converting Obsolete and Invalid Syntaxes . . . . . . . . . 7
7.1.1. Host-Address Syntax . . . . . . . . . . . . . . . . . 8 7.1.1. Host-Address Syntax . . . . . . . . . . . . . . . . . 8
7.1.2. Excessive Angle Brackets . . . . . . . . . . . . . . . 8 7.1.2. Excessive Angle Brackets . . . . . . . . . . . . . . . 8
7.1.3. Unbalanced Angle Brackets . . . . . . . . . . . . . . 8 7.1.3. Unbalanced Angle Brackets . . . . . . . . . . . . . . 8
7.1.4. Unbalanced Parentheses . . . . . . . . . . . . . . . . 9 7.1.4. Unbalanced Parentheses . . . . . . . . . . . . . . . . 9
7.1.5. Commas in Address Lists . . . . . . . . . . . . . . . 9 7.1.5. Commas in Address Lists . . . . . . . . . . . . . . . 9
7.1.6. Unbalanced Quotes . . . . . . . . . . . . . . . . . . 10 7.1.6. Unbalanced Quotes . . . . . . . . . . . . . . . . . . 9
7.1.7. Naked Local-Parts . . . . . . . . . . . . . . . . . . 10 7.1.7. Naked Local-Parts . . . . . . . . . . . . . . . . . . 10
7.2. Non-Header Lines . . . . . . . . . . . . . . . . . . . . . 10 7.2. Non-Header Lines . . . . . . . . . . . . . . . . . . . . . 10
7.3. Unusual Spacing . . . . . . . . . . . . . . . . . . . . . 12 7.3. Unusual Spacing . . . . . . . . . . . . . . . . . . . . . 11
7.4. Header Malformations . . . . . . . . . . . . . . . . . . . 12 7.4. Header Malformations . . . . . . . . . . . . . . . . . . . 12
7.5. Header Field Counts . . . . . . . . . . . . . . . . . . . 13 7.5. Header Field Counts . . . . . . . . . . . . . . . . . . . 13
7.5.1. Repeated Header Fields . . . . . . . . . . . . . . . . 14 7.5.1. Repeated Header Fields . . . . . . . . . . . . . . . . 14
7.5.2. Missing Header Fields . . . . . . . . . . . . . . . . 15 7.5.2. Missing Header Fields . . . . . . . . . . . . . . . . 15
7.5.3. Return-Path . . . . . . . . . . . . . . . . . . . . . 16 7.5.3. Return-Path . . . . . . . . . . . . . . . . . . . . . 16
7.6. Missing or Incorrect Charset Information . . . . . . . . . 16 7.6. Missing or Incorrect Charset Information . . . . . . . . . 16
7.7. Eight-Bit Data . . . . . . . . . . . . . . . . . . . . . . 17 7.7. Eight-Bit Data . . . . . . . . . . . . . . . . . . . . . . 17
8. MIME Anomalies . . . . . . . . . . . . . . . . . . . . . . . . 18 8. MIME Anomalies . . . . . . . . . . . . . . . . . . . . . . . . 18
8.1. Missing MIME-Version Field . . . . . . . . . . . . . . . . 18 8.1. Missing MIME-Version Field . . . . . . . . . . . . . . . . 18
8.2. Faulty Encodings . . . . . . . . . . . . . . . . . . . . . 18 8.2. Faulty Encodings . . . . . . . . . . . . . . . . . . . . . 18
9. Body Anomalies . . . . . . . . . . . . . . . . . . . . . . . . 19 9. Body Anomalies . . . . . . . . . . . . . . . . . . . . . . . . 19
9.1. Oversized Lines . . . . . . . . . . . . . . . . . . . . . 19 9.1. Oversized Lines . . . . . . . . . . . . . . . . . . . . . 19
10. Security Considerations . . . . . . . . . . . . . . . . . . . 19 10. Security Considerations . . . . . . . . . . . . . . . . . . . 19
11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 19
12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20 12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20
12.1. Normative References . . . . . . . . . . . . . . . . . . . 20 12.1. Normative References . . . . . . . . . . . . . . . . . . . 20
12.2. Informative References . . . . . . . . . . . . . . . . . . 20 12.2. Informative References . . . . . . . . . . . . . . . . . . 20
Appendix A. RFC Editor Notes . . . . . . . . . . . . . . . . . . 21 Appendix A. RFC Editor Notes . . . . . . . . . . . . . . . . . . 21
Appendix B. Acknowledgements . . . . . . . . . . . . . . . . . . 21 Appendix B. Acknowledgements . . . . . . . . . . . . . . . . . . 21
1. Introduction 1. Introduction
1.1. The Purpose Of This Work 1.1. The Purpose Of This Work
skipping to change at page 3, line 49 skipping to change at page 3, line 49
prompted adjustments to receiving software, to handle these prompted adjustments to receiving software, to handle these
variations, rather than trying to gain better conformance by senders, variations, rather than trying to gain better conformance by senders,
since the receiving operator is primarily driven by complaints from since the receiving operator is primarily driven by complaints from
recipient users and has no authority over the sending side of the recipient users and has no authority over the sending side of the
system. Processing with such flexibility comes at some cost, since system. Processing with such flexibility comes at some cost, since
mail software is faced with decisions about whether to permit non- mail software is faced with decisions about whether to permit non-
conforming messages to continue toward their destinations unaltered, conforming messages to continue toward their destinations unaltered,
adjust them to conform (possibly at the cost of losing some of the adjust them to conform (possibly at the cost of losing some of the
original message), or reject them outright. original message), or reject them outright.
A core requirement for interoperability is that both sides of an
exchange work from the same details and semantics. By having
receivers be flexible, beyond the specifications, there can be -- and
often has been -- a good chance that a message will not be fully
interoperable. Worse, a well-established pattern of tolerance for
variations can sometimes be used as an attack vector.
This document includes a collection of the best advice available This document includes a collection of the best advice available
regarding a variety of common malformed mail situations, to be used regarding a variety of common malformed mail situations, to be used
as implementation guidance. These malformations are typically based as implementation guidance. These malformations are typically based
around loose interpretations or implementations of specifications around loose interpretations or implementations of specifications
such as Internet Message Format [MAIL] and Multipurpose Internet Mail such as Internet Message Format [MAIL] and Multipurpose Internet Mail
Extensions [MIME]. Extensions [MIME].
It must be emphasized, however, that the intent of this document is
not to standardize malformations or otherwise encourage their
proliferation. The messages are manifestly malformed, and the code
and culture that generates them needs to be fixed. Therefore, these
messages should be rejected outright if at all possible.
Nevertheless, many malformed messages from otherwise legitimate
senders are in circulation and will be for some time, and,
unfortunately, commercial reality shows that we cannot always simply
reject or discard them. Accordingly, this document presents
alternatives for dealing with them in ways that seem to do the least
additional harm until the infrastructure is tightened up to match the
standards.
1.2. Not The Purpose Of This Work 1.2. Not The Purpose Of This Work
It is important to understand that this work is not an effort to It is important to understand that this work is not an effort to
endorse or standardize certain common malformations. The code and endorse or standardize certain common malformations. The code and
culture that introduces such messages into the mail stream needs to culture that introduces such messages into the mail stream needs to
be repaired, as the security penalty now being paid for this lax be repaired, as the security penalty now being paid for this lax
processing arguably outweighs the reduction in support costs to end processing arguably outweighs the reduction in support costs to end
users who are not expected to understand the standards. However, the users who are not expected to understand the standards. However, the
reality is that this will not be fixed quickly. reality is that this will not be fixed quickly.
skipping to change at page 6, line 46 skipping to change at page 6, line 23
render a complaint inactionable as the system receiving the report render a complaint inactionable as the system receiving the report
may be unable to identify the original message as one of its own. may be unable to identify the original message as one of its own.
Some message changes alter syntax without changing semantics. For Some message changes alter syntax without changing semantics. For
example, Section 7.4 describes a situation where an agent removes example, Section 7.4 describes a situation where an agent removes
additional header whitespace. This is a syntax change without a additional header whitespace. This is a syntax change without a
change in semantics, though some systems (such as DKIM) are sensitive change in semantics, though some systems (such as DKIM) are sensitive
to such changes. Message system developers need to be aware of the to such changes. Message system developers need to be aware of the
downstream impact of making either kind of change. downstream impact of making either kind of change.
Where a change to content between modules is unavoidable, it is a
good idea to add standard trace data to indicate a "visible" handoff
between modules has occurred. The only advisable way to do this is
to prepend Received fields with the appropriate information, as
described in Section 3.6.7 of [MAIL].
There will always be local handling exceptions, but these guidelines There will always be local handling exceptions, but these guidelines
should be useful for developing integrated message processing should be useful for developing integrated message processing
environments. environments.
In most cases, this document only discusses techniques used on In most cases, this document only discusses techniques used on
internal representations. It is occasionally necessary to make internal representations. It is occasionally necessary to make
changes between the input and output versions; such cases will be changes between the input and output versions; such cases will be
called out explicitly. called out explicitly.
5. Mail Submission Agents 5. Mail Submission Agents
skipping to change at page 8, line 17 skipping to change at page 7, line 49
are used, the unusual character sequences are not visible in the raw are used, the unusual character sequences are not visible in the raw
SMTP stream. SMTP stream.
7. Header Anomalies 7. Header Anomalies
This section covers common syntactic and semantic anomalies found in This section covers common syntactic and semantic anomalies found in
a message header, and presents suggested mitigations. a message header, and presents suggested mitigations.
7.1. Converting Obsolete and Invalid Syntaxes 7.1. Converting Obsolete and Invalid Syntaxes
A message using an obsolete header syntax might confound an agent A message using an obsolete header syntax (see Section 4 of [MAIL])
that is attempting to be robust in its handling of syntax variations. might confound an agent that is attempting to be robust in its
A bad actor could exploit such a weakness in order to get abusive or handling of syntax variations. A bad actor could exploit such a
malicious content through a filter. This section presents some weakness in order to get abusive or malicious content through a
examples of such variations. Messages including them ought to be filter. This section presents some examples of such variations.
rejected; where this is not possible, recommended internal Messages including them ought to be rejected; where this is not
interpretations are provided. possible, recommended internal interpretations are provided.
7.1.1. Host-Address Syntax 7.1.1. Host-Address Syntax
The following obsolete syntax attempts to specify source routing: The following obsolete syntax attempts to specify source routing:
To: <@example.net:fran@example.com> To: <@example.net:fran@example.com>
This means "send to fran@example.com via the mail service at This means "send to fran@example.com via the mail service at
example.net". It can safely be interpreted as: example.net". It can safely be interpreted as:
skipping to change at page 10, line 11 skipping to change at page 9, line 46
usually best interpreted as: usually best interpreted as:
To: third@example.net, fourth@example.net To: third@example.net, fourth@example.net
7.1.6. Unbalanced Quotes 7.1.6. Unbalanced Quotes
The following use of unbalanced quotation marks: The following use of unbalanced quotation marks:
To: "Joe <joe@example.com> To: "Joe <joe@example.com>
leaves software with no obvious "good" interpretation. If it is leaves software with no unambiguous interpretation. One possible
essential to extract an address from the above, one possible
interpretation is: interpretation is:
To: "Joe <joe@example.com>"@example.net To: "Joe <joe@example.com>"@example.net
where "example.net" is the domain name or host name of the handling where "example.net" is the domain name or host name of the handling
agent making the interpretation. Another possible interpretation is agent making the interpretation. However, the more obvious and
simply: likely best interpretation is simply:
To: "Joe" <joe@example.com> To: "Joe" <joe@example.com>
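The second interpretation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name repair_unbalanced_quote and the regular expression are hypothetical, and the pattern deliberately handles just the single case shown (one opening quote, a display name, and a trailing angle-bracketed address). Anything else is returned unchanged.

```python
import re

# Hypothetical pattern: an opening quote that is never closed, followed by
# display-name text and a single <local@domain> at the end of the value.
_UNCLOSED_QUOTE = re.compile(r'^\s*"([^"<>]*?)\s*(<[^<>\s]+@[^<>\s]+>)\s*$')


def repair_unbalanced_quote(value: str) -> str:
    """Close a dangling quote just before the angle-bracketed address."""
    match = _UNCLOSED_QUOTE.match(value)
    if match is None:
        # Balanced or unrecognized input is left exactly as received.
        return value
    display_name, angle_addr = match.groups()
    return f'"{display_name}" {angle_addr}'


if __name__ == "__main__":
    print(repair_unbalanced_quote('"Joe <joe@example.com>'))
```

A repair like this belongs on the internal representation only; whether the repaired form is ever emitted is a separate local policy decision.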
7.1.7. Naked Local-Parts 7.1.7. Naked Local-Parts
[MAIL] defines a local-part as the user portion of an email address, [MAIL] defines a local-part as the user portion of an email address,
and the display-name as the "user-friendly" label that accompanies and the display-name as the "user-friendly" label that accompanies
the address specification. the address specification.
Some broken submission agents might introduce messages with only a Some broken submission agents might introduce messages with only a
skipping to change at page 13, line 4 skipping to change at page 12, line 41
Among the many possible malformations, a common one is insertion of Among the many possible malformations, a common one is insertion of
whitespace at unusual locations, such as: whitespace at unusual locations, such as:
From: user@example.com {1} From: user@example.com {1}
To: userpal@example.net {2} To: userpal@example.net {2}
Subject: This is your reminder {3} Subject: This is your reminder {3}
MIME-Version : 1.0 {4} MIME-Version : 1.0 {4}
Content-Type: text/plain {5} Content-Type: text/plain {5}
Date: Wed, 20 Oct 2010 20:53:35 -0400 {6} Date: Wed, 20 Oct 2010 20:53:35 -0400 {6}
Don't forget to meet us for the tailgate party! {8} Don't forget to meet us for the tailgate party! {8}
Note the addition of whitespace in line {4} after the header field Note the addition of whitespace in line {4} after the header field
name but before the colon that separates the name from the value. name but before the colon that separates the name from the value.
The acceptance grammar of [MAIL] permits that extra whitespace, so it The obsolete grammar of Section 4 of [MAIL] permits that extra
cannot be considered invalid. However, a consensus of whitespace, so it cannot be considered invalid. However, a consensus
implementations prefers to remove that whitespace. There is no of implementations prefers to remove that whitespace. There is no
perceived change to the semantics of the header field being altered perceived change to the semantics of the header field being altered
as the whitespace is itself semantically meaningless. Therefore, it as the whitespace is itself semantically meaningless. Therefore, it
is best to remove all whitespace after the field name but before the is best to remove all whitespace after the field name but before the
colon and to emit the field in this modified form. colon and to emit the field in this modified form.
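As a minimal sketch of the whitespace removal described above, the following Python fragment rewrites a single header line. The function name strip_space_before_colon is hypothetical, and the character class for field names is an assumption based on the printable US-ASCII characters other than colon that [MAIL] allows in a field name. Folded continuation lines, which begin with whitespace, are left untouched.

```python
import re

# Field name (printable US-ASCII except colon and space), then optional
# spaces or tabs, then the colon that ends the field name.
_FIELD_NAME = re.compile(r"^([!-9;-~]+)[ \t]*:")


def strip_space_before_colon(line: str) -> str:
    """Return the header line with whitespace before the colon removed."""
    # Only the first colon on the line is considered; the value is untouched.
    return _FIELD_NAME.sub(r"\1:", line, count=1)


if __name__ == "__main__":
    print(strip_space_before_colon("MIME-Version : 1.0"))
```

Because signature schemes can be sensitive to such rewrites, an agent might apply this step before any signature it generates is computed rather than after.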
7.5. Header Field Counts 7.5. Header Field Counts
Section 3.6 of [MAIL] prescribes specific header field counts for a Section 3.6 of [MAIL] prescribes specific header field counts for a
valid message. Few agents actually enforce these in the sense that a valid message. Few agents actually enforce these in the sense that a
message whose header contents exceed one or more limits set there are message whose header contents exceed one or more limits set there are
skipping to change at page 13, line 37 skipping to change at page 13, line 26
the input is valid before proceeding. Some popular open source the input is valid before proceeding. Some popular open source
filtering programs and some popular Mailing List Management (MLM) filtering programs and some popular Mailing List Management (MLM)
packages select either the first or last instance of a particular packages select either the first or last instance of a particular
field name, such as From, to decide who sent a message. Absent field name, such as From, to decide who sent a message. Absent
strict enforcement of [MAIL], an attacker can craft a message with strict enforcement of [MAIL], an attacker can craft a message with
multiple instances of the same field if that attacker knows multiple instances of the same field if that attacker knows
the filter will make a decision based on one but the user will be the filter will make a decision based on one but the user will be
shown the others. shown the others.
This situation is exacerbated when message validity is assessed, such This situation is exacerbated when message validity is assessed, such
as through enhanced authentication methods. Such methods might cover as through enhanced authentication methods like DomainKeys Identified
one instance of a constrained field but not another, taking the wrong Mail [DKIM]. Such methods might cover one instance of a constrained
one as "good" or "safe". An MUA, for example, could show the first of field but not another, taking the wrong one as "good" or "safe". An
two From fields to an end user as "good" or "safe" while an MUA, for example, could show the first of two From fields to an end
authentication method actually only verified the second. user as "good" or "safe" while an authentication method actually only
verified the second.
In attempting to counter this exposure, one of the following can be In attempting to counter this exposure, one of the following
enacted: strategies can be used:
1. reject outright or refuse to process further any input message 1. reject outright or refuse to process further any input message
that does not conform to Section 3.6 of [MAIL]; that does not conform to Section 3.6 of [MAIL];
2. remove or, in the case of an MUA, refuse to render any instances 2. remove or, in the case of an MUA, refuse to render any instances
of a header field whose presence exceeds a limit prescribed in of a header field whose presence exceeds a limit prescribed in
Section 3.6 of [MAIL] when generating its output; Section 3.6 of [MAIL] when generating its output;
3. where a field has a limited instance count, combine additional
instances into a single compound instance;
4. where a field can contain multiple distinct values (such as From) 3. where a field can contain multiple distinct values (such as From)
or is free-form text (such as Subject), combine them into a or is free-form text (such as Subject), combine them into a
semantically identical single header field of the same name (see semantically identical single header field of the same name (see
Section 7.5.1); Section 7.5.1);
5. alter the name of any header field whose presence exceeds a limit 4. alter the name of any header field whose presence exceeds a limit
prescribed in Section 3.6 of [MAIL] when generating its output so prescribed in Section 3.6 of [MAIL] when generating its output so
that later agents can produce a consistent result. Any that later agents can produce a consistent result. Any
alteration likely to cause the field to be ignored by downstream alteration likely to cause the field to be ignored by downstream
agents is acceptable. A common approach is to prefix the field agents is acceptable. A common approach is to prefix the field
names with a string such as "BAD-". names with a string such as "BAD-".
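The renaming strategy at the end of the list above might look like the following Python sketch. Everything here is illustrative: the name rename_excess_fields is hypothetical, the header is assumed to be available as a list of (name, value) pairs in top-to-bottom order, and keeping the first instance while renaming later ones is an assumption, not something the text prescribes. The set of single-instance fields reflects the limits table in Section 3.6 of [MAIL].

```python
# Fields that Section 3.6 of [MAIL] limits to at most one instance.
SINGLE_INSTANCE = {
    "date", "from", "sender", "reply-to", "to", "cc", "bcc",
    "message-id", "in-reply-to", "references", "subject",
}


def rename_excess_fields(fields, prefix="BAD-"):
    """Prefix every instance beyond the first of a single-instance field."""
    seen = set()
    result = []
    for name, value in fields:
        key = name.lower()  # field names are case-insensitive
        if key in SINGLE_INSTANCE:
            if key in seen:
                name = prefix + name  # downstream agents will ignore it
            else:
                seen.add(key)
        result.append((name, value))
    return result


if __name__ == "__main__":
    header = [("From", "a@example.com"), ("From", "b@example.net")]
    print(rename_excess_fields(header))
```

Trace fields such as Received are not in the set and so pass through unchanged, however many times they appear.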
Selecting a mitigation action from the above list, or some other Selecting a mitigation action from the above list, or some other
action, must consider the needs of the operator making the decision, action, must consider the needs of the operator making the decision,
and the nature of its user base. and the nature of its user base.
skipping to change at page 16, line 23 skipping to change at page 16, line 12
An MTA that encounters a message missing this field should synthesize An MTA that encounters a message missing this field should synthesize
a valid one and add it to the external representation, since many a valid one and add it to the external representation, since many
deployed tools use the content of that field as a common unique deployed tools use the content of that field as a common unique
message reference, so its absence inhibits correlation of message message reference, so its absence inhibits correlation of message
processing. Section 3.6.4 of [MAIL] describes advisable practise for processing. Section 3.6.4 of [MAIL] describes advisable practise for
synthesizing the content of this field when it is absent, and synthesizing the content of this field when it is absent, and
establishes a requirement that it be globally unique. establishes a requirement that it be globally unique.
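A rough Python sketch of the synthesis step follows. It assumes the field under discussion is Message-ID, as the citation of Section 3.6.4 of [MAIL] suggests. The function name ensure_message_id and the (name, value) list representation are hypothetical. The standard library helper email.utils.make_msgid is used here as one convenient way to build a unique identifier; it is not the only acceptable method.

```python
from email.utils import make_msgid


def ensure_message_id(fields, domain):
    """Return the header with a synthesized Message-ID if none is present."""
    if any(name.lower() == "message-id" for name, _ in fields):
        return list(fields)
    # make_msgid combines a timestamp, the process id and random bits with
    # the given domain, which makes collisions very unlikely.
    return list(fields) + [("Message-ID", make_msgid(domain=domain))]


if __name__ == "__main__":
    print(ensure_message_id([("From", "user@example.com")], "mta.example.net"))
```

Using the handling agent's own domain on the right-hand side keeps the synthesized value traceable to the system that created it.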
7.5.3. Return-Path 7.5.3. Return-Path
A valid message will have exactly one Return-Path header field, as While legitimate messages can contain more than one Return-Path
per Section 4.4 of [SMTP]. Should a message be encountered bearing header field, such usage is often an error rather than a valid
more than one, all but the topmost one is to be disregarded, as it is message containing multiple header field blocks as described in
most likely to have been added nearest to the mailbox that received Section 3.6 of [MAIL]. Accordingly, when a message containing
that message. multiple Return-Path header fields is encountered, all but the
topmost one is to be disregarded, as it is most likely to have been
added nearest to the mailbox that received that message.
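Selecting the topmost Return-Path can be sketched as below. The function name effective_return_path is hypothetical, and the header is assumed to be a list of (name, value) pairs in the order they appear in the message, topmost first.

```python
def effective_return_path(fields):
    """Return the value of the topmost Return-Path field, or None."""
    for name, value in fields:
        if name.lower() == "return-path":
            # The first match is the topmost instance; later ones are ignored.
            return value.strip()
    return None


if __name__ == "__main__":
    header = [
        ("Return-Path", "<bounce@example.net>"),
        ("Received", "from mx.example.net"),
        ("Return-Path", "<other@example.com>"),
    ]
    print(effective_return_path(header))
```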
7.6. Missing or Incorrect Charset Information 7.6. Missing or Incorrect Charset Information
MIME provides the means to include textual material employing MIME provides the means to include textual material employing
character sets ("charsets") other than US-ASCII. Such material is character sets ("charsets") other than US-ASCII. Such material is
required to have an identified charset. Charset identification is required to have an identified charset. Charset identification is
done using a "charset" parameter in the Content-Type header field, a done using a "charset" parameter in the Content-Type header field, a
charset label within the MIME entity itself, or the charset can be charset label within the MIME entity itself, or the charset can be
implicitly specified by the Content-Type (see [CHARSET]). implicitly specified by the Content-Type (see [CHARSET]).
 End of changes. 22 change blocks. 
60 lines changed or deleted 47 lines changed or added
