4539 lines
183 KiB
Plaintext
4539 lines
183 KiB
Plaintext
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Network Working Group N. Borenstein
|
|||
|
Request for Comments: 1521 Bellcore
|
|||
|
Obsoletes: 1341 N. Freed
|
|||
|
Category: Standards Track Innosoft
|
|||
|
September 1993
|
|||
|
|
|||
|
|
|||
|
MIME (Multipurpose Internet Mail Extensions) Part One:
|
|||
|
Mechanisms for Specifying and Describing
|
|||
|
the Format of Internet Message Bodies
|
|||
|
|
|||
|
Status of this Memo
|
|||
|
|
|||
|
This RFC specifies an Internet standards track protocol for the
|
|||
|
Internet community, and requests discussion and suggestions for
|
|||
|
improvements. Please refer to the current edition of the "Internet
|
|||
|
Official Protocol Standards" for the standardization state and status
|
|||
|
of this protocol. Distribution of this memo is unlimited.
|
|||
|
|
|||
|
Abstract
|
|||
|
|
|||
|
STD 11, RFC 822 defines a message representation protocol which
|
|||
|
specifies considerable detail about message headers, but which leaves
|
|||
|
the message content, or message body, as flat ASCII text. This
|
|||
|
document redefines the format of message bodies to allow multi-part
|
|||
|
textual and non-textual message bodies to be represented and
|
|||
|
exchanged without loss of information. This is based on earlier work
|
|||
|
documented in RFC 934 and STD 11, RFC 1049, but extends and revises
|
|||
|
that work. Because RFC 822 said so little about message bodies, this
|
|||
|
document is largely orthogonal to (rather than a revision of) RFC
|
|||
|
822.
|
|||
|
|
|||
|
In particular, this document is designed to provide facilities to
|
|||
|
include multiple objects in a single message, to represent body text
|
|||
|
in character sets other than US-ASCII, to represent formatted multi-
|
|||
|
font text messages, to represent non-textual material such as images
|
|||
|
and audio fragments, and generally to facilitate later extensions
|
|||
|
defining new types of Internet mail for use by cooperating mail
|
|||
|
agents.
|
|||
|
|
|||
|
This document does NOT extend Internet mail header fields to permit
|
|||
|
anything other than US-ASCII text data. Such extensions are the
|
|||
|
subject of a companion document [RFC-1522].
|
|||
|
|
|||
|
This document is a revision of RFC 1341. Significant differences
|
|||
|
from RFC 1341 are summarized in Appendix H.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 1]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Table of Contents
|
|||
|
|
|||
|
1. Introduction....................................... 3
|
|||
|
2. Notations, Conventions, and Generic BNF Grammar.... 6
|
|||
|
3. The MIME-Version Header Field...................... 7
|
|||
|
4. The Content-Type Header Field...................... 9
|
|||
|
5. The Content-Transfer-Encoding Header Field......... 13
|
|||
|
5.1. Quoted-Printable Content-Transfer-Encoding......... 18
|
|||
|
5.2. Base64 Content-Transfer-Encoding................... 21
|
|||
|
6. Additional Content-Header Fields................... 23
|
|||
|
6.1. Optional Content-ID Header Field................... 23
|
|||
|
6.2. Optional Content-Description Header Field.......... 24
|
|||
|
7. The Predefined Content-Type Values................. 24
|
|||
|
7.1. The Text Content-Type.............................. 24
|
|||
|
7.1.1. The charset parameter.............................. 25
|
|||
|
7.1.2. The Text/plain subtype............................. 28
|
|||
|
7.2. The Multipart Content-Type......................... 28
|
|||
|
7.2.1. Multipart: The common syntax...................... 29
|
|||
|
7.2.2. The Multipart/mixed (primary) subtype.............. 34
|
|||
|
7.2.3. The Multipart/alternative subtype.................. 34
|
|||
|
7.2.4. The Multipart/digest subtype....................... 36
|
|||
|
7.2.5. The Multipart/parallel subtype..................... 37
|
|||
|
7.2.6. Other Multipart subtypes........................... 37
|
|||
|
7.3. The Message Content-Type........................... 38
|
|||
|
7.3.1. The Message/rfc822 (primary) subtype............... 38
|
|||
|
7.3.2. The Message/Partial subtype........................ 39
|
|||
|
7.3.3. The Message/External-Body subtype.................. 42
|
|||
|
7.3.3.1. The "ftp" and "tftp" access-types............... 44
|
|||
|
7.3.3.2. The "anon-ftp" access-type...................... 45
|
|||
|
7.3.3.3. The "local-file" and "afs" access-types......... 45
|
|||
|
7.3.3.4. The "mail-server" access-type................... 45
|
|||
|
7.3.3.5. Examples and Further Explanations............... 46
|
|||
|
7.4. The Application Content-Type....................... 49
|
|||
|
7.4.1. The Application/Octet-Stream (primary) subtype..... 50
|
|||
|
7.4.2. The Application/PostScript subtype................. 50
|
|||
|
7.4.3. Other Application subtypes......................... 53
|
|||
|
7.5. The Image Content-Type............................. 53
|
|||
|
7.6. The Audio Content-Type............................. 54
|
|||
|
7.7. The Video Content-Type............................. 54
|
|||
|
7.8. Experimental Content-Type Values................... 54
|
|||
|
8. Summary............................................ 56
|
|||
|
9. Security Considerations............................ 56
|
|||
|
10. Authors' Addresses................................. 57
|
|||
|
11. Acknowledgements................................... 58
|
|||
|
Appendix A -- Minimal MIME-Conformance.................... 60
|
|||
|
Appendix B -- General Guidelines For Sending Email Data... 63
|
|||
|
Appendix C -- A Complex Multipart Example................. 66
|
|||
|
Appendix D -- Collected Grammar........................... 68
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 2]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Appendix E -- IANA Registration Procedures................ 72
|
|||
|
E.1 Registration of New Content-type/subtype Values...... 72
|
|||
|
E.2 Registration of New Access-type Values
|
|||
|
for Message/external-body............................ 73
|
|||
|
Appendix F -- Summary of the Seven Content-types.......... 74
|
|||
|
Appendix G -- Canonical Encoding Model.................... 76
|
|||
|
Appendix H -- Changes from RFC 1341....................... 78
|
|||
|
References................................................ 80
|
|||
|
|
|||
|
1. Introduction
|
|||
|
|
|||
|
Since its publication in 1982, STD 11, RFC 822 [RFC-822] has defined
|
|||
|
the standard format of textual mail messages on the Internet. Its
|
|||
|
success has been such that the RFC 822 format has been adopted,
|
|||
|
wholly or partially, well beyond the confines of the Internet and the
|
|||
|
Internet SMTP transport defined by STD 10, RFC 821 [RFC-821]. As the
|
|||
|
format has seen wider use, a number of limitations have proven
|
|||
|
increasingly restrictive for the user community.
|
|||
|
|
|||
|
RFC 822 was intended to specify a format for text messages. As such,
|
|||
|
non-text messages, such as multimedia messages that might include
|
|||
|
audio or images, are simply not mentioned. Even in the case of text,
|
|||
|
however, RFC 822 is inadequate for the needs of mail users whose
|
|||
|
languages require the use of character sets richer than US ASCII
|
|||
|
[US-ASCII]. Since RFC 822 does not specify mechanisms for mail
|
|||
|
containing audio, video, Asian language text, or even text in most
|
|||
|
European languages, additional specifications are needed.
|
|||
|
|
|||
|
One of the notable limitations of RFC 821/822 based mail systems is
|
|||
|
the fact that they limit the contents of electronic mail messages to
|
|||
|
relatively short lines of seven-bit ASCII. This forces users to
|
|||
|
convert any non-textual data that they may wish to send into seven-
|
|||
|
bit bytes representable as printable ASCII characters before invoking
|
|||
|
a local mail UA (User Agent, a program with which human users send
|
|||
|
and receive mail). Examples of such encodings currently used in the
|
|||
|
Internet include pure hexadecimal, uuencode, the 3-in-4 base 64
|
|||
|
scheme specified in RFC 1421, the Andrew Toolkit Representation
|
|||
|
[ATK], and many others.
|
|||
|
|
|||
|
The limitations of RFC 822 mail become even more apparent as gateways
|
|||
|
are designed to allow for the exchange of mail messages between RFC
|
|||
|
822 hosts and X.400 hosts. X.400 [X400] specifies mechanisms for the
|
|||
|
inclusion of non-textual body parts within electronic mail messages.
|
|||
|
The current standards for the mapping of X.400 messages to RFC 822
|
|||
|
messages specify either that X.400 non-textual body parts must be
|
|||
|
converted to (not encoded in) an ASCII format, or that they must be
|
|||
|
discarded, notifying the RFC 822 user that discarding has occurred.
|
|||
|
This is clearly undesirable, as information that a user may wish to
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 3]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
receive is lost. Even though a user's UA may not have the capability
|
|||
|
of dealing with the non-textual body part, the user might have some
|
|||
|
mechanism external to the UA that can extract useful information from
|
|||
|
the body part. Moreover, it does not allow for the fact that the
|
|||
|
message may eventually be gatewayed back into an X.400 message
|
|||
|
handling system (i.e., the X.400 message is "tunneled" through
|
|||
|
Internet mail), where the non-textual information would definitely
|
|||
|
become useful again.
|
|||
|
|
|||
|
This document describes several mechanisms that combine to solve most
|
|||
|
of these problems without introducing any serious incompatibilities
|
|||
|
with the existing world of RFC 822 mail. In particular, it
|
|||
|
describes:
|
|||
|
|
|||
|
1. A MIME-Version header field, which uses a version number to
|
|||
|
declare a message to be conformant with this specification and
|
|||
|
allows mail processing agents to distinguish between such
|
|||
|
messages and those generated by older or non-conformant software,
|
|||
|
which is presumed to lack such a field.
|
|||
|
|
|||
|
2. A Content-Type header field, generalized from RFC 1049 [RFC-1049],
|
|||
|
which can be used to specify the type and subtype of data in the
|
|||
|
body of a message and to fully specify the native representation
|
|||
|
(encoding) of such data.
|
|||
|
|
|||
|
2.a. A "text" Content-Type value, which can be used to represent
|
|||
|
textual information in a number of character sets and
|
|||
|
formatted text description languages in a standardized
|
|||
|
manner.
|
|||
|
|
|||
|
2.b. A "multipart" Content-Type value, which can be used to
|
|||
|
combine several body parts, possibly of differing types of
|
|||
|
data, into a single message.
|
|||
|
|
|||
|
2.c. An "application" Content-Type value, which can be used to
|
|||
|
transmit application data or binary data, and hence, among
|
|||
|
other uses, to implement an electronic mail file transfer
|
|||
|
service.
|
|||
|
|
|||
|
2.d. A "message" Content-Type value, for encapsulating another
|
|||
|
mail message.
|
|||
|
|
|||
|
2.e An "image" Content-Type value, for transmitting still image
|
|||
|
(picture) data.
|
|||
|
|
|||
|
2.f. An "audio" Content-Type value, for transmitting audio or
|
|||
|
voice data.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 4]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
2.g. A "video" Content-Type value, for transmitting video or
|
|||
|
moving image data, possibly with audio as part of the
|
|||
|
composite video data format.
|
|||
|
|
|||
|
3. A Content-Transfer-Encoding header field, which can be used to
|
|||
|
specify an auxiliary encoding that was applied to the data in
|
|||
|
order to allow it to pass through mail transport mechanisms which
|
|||
|
may have data or character set limitations.
|
|||
|
|
|||
|
4. Two additional header fields that can be used to further describe
|
|||
|
the data in a message body, the Content-ID and Content-
|
|||
|
Description header fields.
|
|||
|
|
|||
|
MIME has been carefully designed as an extensible mechanism, and it
|
|||
|
is expected that the set of content-type/subtype pairs and their
|
|||
|
associated parameters will grow significantly with time. Several
|
|||
|
other MIME fields, notably including character set names, are likely
|
|||
|
to have new values defined over time. In order to ensure that the
|
|||
|
set of such values is developed in an orderly, well-specified, and
|
|||
|
public manner, MIME defines a registration process which uses the
|
|||
|
Internet Assigned Numbers Authority (IANA) as a central registry for
|
|||
|
such values. Appendix E provides details about how IANA registration
|
|||
|
is accomplished.
|
|||
|
|
|||
|
Finally, to specify and promote interoperability, Appendix A of this
|
|||
|
document provides a basic applicability statement for a subset of the
|
|||
|
above mechanisms that defines a minimal level of "conformance" with
|
|||
|
this document.
|
|||
|
|
|||
|
HISTORICAL NOTE: Several of the mechanisms described in this
|
|||
|
document may seem somewhat strange or even baroque at first
|
|||
|
reading. It is important to note that compatibility with existing
|
|||
|
standards AND robustness across existing practice were two of the
|
|||
|
highest priorities of the working group that developed this
|
|||
|
document. In particular, compatibility was always favored over
|
|||
|
elegance.
|
|||
|
|
|||
|
MIME was first defined and published as RFCs 1341 and 1342 [RFC-1341]
|
|||
|
[RFC-1342]. This document is a relatively minor updating of RFC
|
|||
|
1341, and is intended to supersede it. The differences between this
|
|||
|
document and RFC 1341 are summarized in Appendix H. Please refer to
|
|||
|
the current edition of the "IAB Official Protocol Standards" for the
|
|||
|
standardization state and status of this protocol. Several other RFC
|
|||
|
documents will be of interest to the MIME implementor, in particular
|
|||
|
[RFC 1343], [RFC-1344], and [RFC-1345].
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 5]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
2. Notations, Conventions, and Generic BNF Grammar
|
|||
|
|
|||
|
This document is being published in two versions, one as plain ASCII
|
|||
|
text and one as PostScript (PostScript is a trademark of Adobe
|
|||
|
Systems Incorporated.). While the text version is the official
|
|||
|
specification, some will find the PostScript version easier to read.
|
|||
|
The textual contents are identical. An Andrew-format copy of this
|
|||
|
document is also available from the first author (Borenstein).
|
|||
|
|
|||
|
Although the mechanisms specified in this document are all described
|
|||
|
in prose, most are also described formally in the modified BNF
|
|||
|
notation of RFC 822. Implementors will need to be familiar with this
|
|||
|
notation in order to understand this specification, and are referred
|
|||
|
to RFC 822 for a complete explanation of the modified BNF notation.
|
|||
|
|
|||
|
Some of the modified BNF in this document makes reference to
|
|||
|
syntactic entities that are defined in RFC 822 and not in this
|
|||
|
document. A complete formal grammar, then, is obtained by combining
|
|||
|
the collected grammar appendix of this document with that of RFC 822
|
|||
|
plus the modifications to RFC 822 defined in RFC 1123, which
|
|||
|
specifically changes the syntax for `return', `date' and `mailbox'.
|
|||
|
|
|||
|
The term CRLF, in this document, refers to the sequence of the two
|
|||
|
ASCII characters CR (13) and LF (10) which, taken together, in this
|
|||
|
order, denote a line break in RFC 822 mail.
|
|||
|
|
|||
|
The term "character set" is used in this document to refer to a
|
|||
|
method used with one or more tables to convert encoded text to a
|
|||
|
series of octets. This definition is intended to allow various kinds
|
|||
|
of text encodings, from simple single-table mappings such as ASCII to
|
|||
|
complex table switching methods such as those that use ISO 2022's
|
|||
|
techniques. However, a MIME character set name must fully specify
|
|||
|
the mapping to be performed.
|
|||
|
|
|||
|
The term "message", when not further qualified, means either the
|
|||
|
(complete or "top-level") message being transferred on a network, or
|
|||
|
a message encapsulated in a body of type "message".
|
|||
|
|
|||
|
The term "body part", in this document, means one of the parts of the
|
|||
|
body of a multipart entity. A body part has a header and a body, so
|
|||
|
it makes sense to speak about the body of a body part.
|
|||
|
|
|||
|
The term "entity", in this document, means either a message or a body
|
|||
|
part. All kinds of entities share the property that they have a
|
|||
|
header and a body.
|
|||
|
|
|||
|
The term "body", when not further qualified, means the body of an
|
|||
|
entity, that is the body of either a message or of a body part.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 6]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
NOTE: The previous four definitions are clearly circular. This is
|
|||
|
unavoidable, since the overall structure of a MIME message is
|
|||
|
indeed recursive.
|
|||
|
|
|||
|
In this document, all numeric and octet values are given in decimal
|
|||
|
notation.
|
|||
|
|
|||
|
It must be noted that Content-Type values, subtypes, and parameter
|
|||
|
names as defined in this document are case-insensitive. However,
|
|||
|
parameter values are case-sensitive unless otherwise specified for
|
|||
|
the specific parameter.
|
|||
|
|
|||
|
FORMATTING NOTE: This document has been carefully formatted for
|
|||
|
ease of reading. The PostScript version of this document, in
|
|||
|
particular, places notes like this one, which may be skipped by
|
|||
|
the reader, in a smaller, italicized, font, and indents it as
|
|||
|
well. In the text version, only the indentation is preserved, so
|
|||
|
if you are reading the text version of this you might consider
|
|||
|
using the PostScript version instead. However, all such notes will
|
|||
|
be indented and preceded by "NOTE:" or some similar introduction,
|
|||
|
even in the text version.
|
|||
|
|
|||
|
The primary purpose of these non-essential notes is to convey
|
|||
|
information about the rationale of this document, or to place this
|
|||
|
document in the proper historical or evolutionary context. Such
|
|||
|
information may be skipped by those who are focused entirely on
|
|||
|
building a conformant implementation, but may be of use to those
|
|||
|
who wish to understand why this document is written as it is.
|
|||
|
|
|||
|
For ease of recognition, all BNF definitions have been placed in a
|
|||
|
fixed-width font in the PostScript version of this document.
|
|||
|
|
|||
|
3. The MIME-Version Header Field
|
|||
|
|
|||
|
Since RFC 822 was published in 1982, there has really been only one
|
|||
|
format standard for Internet messages, and there has been little
|
|||
|
perceived need to declare the format standard in use. This document
|
|||
|
is an independent document that complements RFC 822. Although the
|
|||
|
extensions in this document have been defined in such a way as to be
|
|||
|
compatible with RFC 822, there are still circumstances in which it
|
|||
|
might be desirable for a mail-processing agent to know whether a
|
|||
|
message was composed with the new standard in mind.
|
|||
|
|
|||
|
Therefore, this document defines a new header field, "MIME-Version",
|
|||
|
which is to be used to declare the version of the Internet message
|
|||
|
body format standard in use.
|
|||
|
|
|||
|
Messages composed in accordance with this document MUST include such
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 7]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
a header field, with the following verbatim text:
|
|||
|
|
|||
|
MIME-Version: 1.0
|
|||
|
|
|||
|
The presence of this header field is an assertion that the message
|
|||
|
has been composed in compliance with this document.
|
|||
|
|
|||
|
Since it is possible that a future document might extend the message
|
|||
|
format standard again, a formal BNF is given for the content of the
|
|||
|
MIME-Version field:
|
|||
|
|
|||
|
version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT
|
|||
|
|
|||
|
Thus, future format specifiers, which might replace or extend "1.0",
|
|||
|
are constrained to be two integer fields, separated by a period. If
|
|||
|
a message is received with a MIME-version value other than "1.0", it
|
|||
|
cannot be assumed to conform with this specification.
|
|||
|
|
|||
|
Note that the MIME-Version header field is required at the top level
|
|||
|
of a message. It is not required for each body part of a multipart
|
|||
|
entity. It is required for the embedded headers of a body of type
|
|||
|
"message" if and only if the embedded message is itself claimed to be
|
|||
|
MIME-conformant.
|
|||
|
|
|||
|
It is not possible to fully specify how a mail reader that conforms
|
|||
|
with MIME as defined in this document should treat a message that
|
|||
|
might arrive in the future with some value of MIME-Version other than
|
|||
|
"1.0". However, conformant software is encouraged to check the
|
|||
|
version number and at least warn the user if an unrecognized MIME-
|
|||
|
version is encountered.
|
|||
|
|
|||
|
It is also worth noting that version control for specific content-
|
|||
|
types is not accomplished using the MIME-Version mechanism. In
|
|||
|
particular, some formats (such as application/postscript) have
|
|||
|
version numbering conventions that are internal to the document
|
|||
|
format. Where such conventions exist, MIME does nothing to supersede
|
|||
|
them. Where no such conventions exist, a MIME type might use a
|
|||
|
"version" parameter in the content-type field if necessary.
|
|||
|
|
|||
|
NOTE TO IMPLEMENTORS: All header fields defined in this document,
|
|||
|
including MIME-Version, Content-type, etc., are subject to the
|
|||
|
general syntactic rules for header fields specified in RFC 822. In
|
|||
|
particular, all can include comments, which means that the following
|
|||
|
two MIME-Version fields are equivalent:
|
|||
|
|
|||
|
MIME-Version: 1.0
|
|||
|
MIME-Version: 1.0 (Generated by GBD-killer 3.7)
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 8]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
4. The Content-Type Header Field
|
|||
|
|
|||
|
The purpose of the Content-Type field is to describe the data
|
|||
|
contained in the body fully enough that the receiving user agent can
|
|||
|
pick an appropriate agent or mechanism to present the data to the
|
|||
|
user, or otherwise deal with the data in an appropriate manner.
|
|||
|
|
|||
|
HISTORICAL NOTE: The Content-Type header field was first defined in
|
|||
|
RFC 1049. RFC 1049 Content-types used a simpler and less powerful
|
|||
|
syntax, but one that is largely compatible with the mechanism given
|
|||
|
here.
|
|||
|
|
|||
|
The Content-Type header field is used to specify the nature of the
|
|||
|
data in the body of an entity, by giving type and subtype
|
|||
|
identifiers, and by providing auxiliary information that may be
|
|||
|
required for certain types. After the type and subtype names, the
|
|||
|
remainder of the header field is simply a set of parameters,
|
|||
|
specified in an attribute/value notation. The set of meaningful
|
|||
|
parameters differs for the different types. In particular, there are
|
|||
|
NO globally-meaningful parameters that apply to all content-types.
|
|||
|
Global mechanisms are best addressed, in the MIME model, by the
|
|||
|
definition of additional Content-* header fields. The ordering of
|
|||
|
parameters is not significant. Among the defined parameters is a
|
|||
|
"charset" parameter by which the character set used in the body may
|
|||
|
be declared. Comments are allowed in accordance with RFC 822 rules
|
|||
|
for structured header fields.
|
|||
|
|
|||
|
In general, the top-level Content-Type is used to declare the general
|
|||
|
type of data, while the subtype specifies a specific format for that
|
|||
|
type of data. Thus, a Content-Type of "image/xyz" is enough to tell
|
|||
|
a user agent that the data is an image, even if the user agent has no
|
|||
|
knowledge of the specific image format "xyz". Such information can
|
|||
|
be used, for example, to decide whether or not to show a user the raw
|
|||
|
data from an unrecognized subtype -- such an action might be
|
|||
|
reasonable for unrecognized subtypes of text, but not for
|
|||
|
unrecognized subtypes of image or audio. For this reason, registered
|
|||
|
subtypes of audio, image, text, and video, should not contain
|
|||
|
embedded information that is really of a different type. Such
|
|||
|
compound types should be represented using the "multipart" or
|
|||
|
"application" types.
|
|||
|
|
|||
|
Parameters are modifiers of the content-subtype, and do not
|
|||
|
fundamentally affect the requirements of the host system. Although
|
|||
|
most parameters make sense only with certain content-types, others
|
|||
|
are "global" in the sense that they might apply to any subtype. For
|
|||
|
example, the "boundary" parameter makes sense only for the
|
|||
|
"multipart" content-type, but the "charset" parameter might make
|
|||
|
sense with several content-types.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 9]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
An initial set of seven Content-Types is defined by this document.
|
|||
|
This set of top-level names is intended to be substantially complete.
|
|||
|
It is expected that additions to the larger set of supported types
|
|||
|
can generally be accomplished by the creation of new subtypes of
|
|||
|
these initial types. In the future, more top-level types may be
|
|||
|
defined only by an extension to this standard. If another primary
|
|||
|
type is to be used for any reason, it must be given a name starting
|
|||
|
with "X-" to indicate its non-standard status and to avoid a
|
|||
|
potential conflict with a future official name.
|
|||
|
|
|||
|
In the Augmented BNF notation of RFC 822, a Content-Type header field
|
|||
|
value is defined as follows:
|
|||
|
|
|||
|
content := "Content-Type" ":" type "/" subtype *(";"
|
|||
|
parameter)
|
|||
|
; case-insensitive matching of type and subtype
|
|||
|
|
|||
|
type := "application" / "audio"
|
|||
|
/ "image" / "message"
|
|||
|
/ "multipart" / "text"
|
|||
|
/ "video" / extension-token
|
|||
|
; All values case-insensitive
|
|||
|
|
|||
|
extension-token := x-token / iana-token
|
|||
|
|
|||
|
iana-token := <a publicly-defined extension token,
|
|||
|
registered with IANA, as specified in
|
|||
|
appendix E>
|
|||
|
|
|||
|
x-token := <The two characters "X-" or "x-" followed, with
|
|||
|
no intervening white space, by any token>
|
|||
|
|
|||
|
subtype := token ; case-insensitive
|
|||
|
|
|||
|
parameter := attribute "=" value
|
|||
|
|
|||
|
attribute := token ; case-insensitive
|
|||
|
|
|||
|
value := token / quoted-string
|
|||
|
|
|||
|
token := 1*<any (ASCII) CHAR except SPACE, CTLs,
|
|||
|
or tspecials>
|
|||
|
|
|||
|
tspecials := "(" / ")" / "<" / ">" / "@"
|
|||
|
/ "," / ";" / ":" / "\" / <">
|
|||
|
/ "/" / "[" / "]" / "?" / "="
|
|||
|
; Must be in quoted-string,
|
|||
|
; to use within parameter values
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 10]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Note that the definition of "tspecials" is the same as the RFC 822
|
|||
|
definition of "specials" with the addition of the three characters
|
|||
|
"/", "?", and "=", and the removal of ".".
|
|||
|
|
|||
|
Note also that a subtype specification is MANDATORY. There are no
|
|||
|
default subtypes.
|
|||
|
|
|||
|
The type, subtype, and parameter names are not case sensitive. For
|
|||
|
example, TEXT, Text, and TeXt are all equivalent. Parameter values
|
|||
|
are normally case sensitive, but certain parameters are interpreted
|
|||
|
to be case-insensitive, depending on the intended use. (For example,
|
|||
|
multipart boundaries are case-sensitive, but the "access-type" for
|
|||
|
message/External-body is not case-sensitive.)
|
|||
|
|
|||
|
Beyond this syntax, the only constraint on the definition of subtype
|
|||
|
names is the desire that their uses must not conflict. That is, it
|
|||
|
would be undesirable to have two different communities using
|
|||
|
"Content-Type: application/foobar" to mean two different things. The
|
|||
|
process of defining new content-subtypes, then, is not intended to be
|
|||
|
a mechanism for imposing restrictions, but simply a mechanism for
|
|||
|
publicizing the usages. There are, therefore, two acceptable
|
|||
|
mechanisms for defining new Content-Type subtypes:
|
|||
|
|
|||
|
1. Private values (starting with "X-") may be
|
|||
|
defined bilaterally between two cooperating
|
|||
|
agents without outside registration or
|
|||
|
standardization.
|
|||
|
|
|||
|
2. New standard values must be documented,
|
|||
|
registered with, and approved by IANA, as
|
|||
|
described in Appendix E. Where intended for
|
|||
|
public use, the formats they refer to must
|
|||
|
also be defined by a published specification,
|
|||
|
and possibly offered for standardization.
|
|||
|
|
|||
|
The seven standard initial predefined Content-Types are detailed in
|
|||
|
the bulk of this document. They are:
|
|||
|
|
|||
|
text -- textual information. The primary subtype,
|
|||
|
"plain", indicates plain (unformatted) text. No
|
|||
|
special software is required to get the full
|
|||
|
meaning of the text, aside from support for the
|
|||
|
indicated character set. Subtypes are to be used
|
|||
|
for enriched text in forms where application
|
|||
|
software may enhance the appearance of the text,
|
|||
|
but such software must not be required in order to
|
|||
|
get the general idea of the content. Possible
|
|||
|
subtypes thus include any readable word processor
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 11]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
format. A very simple and portable subtype,
|
|||
|
richtext, was defined in RFC 1341, with a future
|
|||
|
revision expected.
|
|||
|
|
|||
|
multipart -- data consisting of multiple parts of
|
|||
|
independent data types. Four initial subtypes
|
|||
|
are defined, including the primary "mixed"
|
|||
|
subtype, "alternative" for representing the same
|
|||
|
data in multiple formats, "parallel" for parts
|
|||
|
intended to be viewed simultaneously, and "digest"
|
|||
|
for multipart entities in which each part is of
|
|||
|
type "message".
|
|||
|
|
|||
|
message -- an encapsulated message. A body of
|
|||
|
Content-Type "message" is itself all or part of a
|
|||
|
fully formatted RFC 822 conformant message which
|
|||
|
may contain its own different Content-Type header
|
|||
|
field. The primary subtype is "rfc822". The
|
|||
|
"partial" subtype is defined for partial messages,
|
|||
|
to permit the fragmented transmission of bodies
|
|||
|
that are thought to be too large to be passed
|
|||
|
through mail transport facilities. Another
|
|||
|
subtype, "External-body", is defined for
|
|||
|
specifying large bodies by reference to an
|
|||
|
external data source.
|
|||
|
|
|||
|
image -- image data. Image requires a display device
|
|||
|
(such as a graphical display, a printer, or a FAX
|
|||
|
machine) to view the information. Initial
|
|||
|
subtypes are defined for two widely-used image
|
|||
|
formats, jpeg and gif.
|
|||
|
|
|||
|
audio -- audio data, with initial subtype "basic".
|
|||
|
Audio requires an audio output device (such as a
|
|||
|
speaker or a telephone) to "display" the contents.
|
|||
|
|
|||
|
video -- video data. Video requires the capability to
|
|||
|
display moving images, typically including
|
|||
|
specialized hardware and software. The initial
|
|||
|
subtype is "mpeg".
|
|||
|
|
|||
|
application -- some other kind of data, typically
|
|||
|
either uninterpreted binary data or information to
|
|||
|
be processed by a mail-based application. The
|
|||
|
primary subtype, "octet-stream", is to be used in
|
|||
|
the case of uninterpreted binary data, in which
|
|||
|
case the simplest recommended action is to offer
|
|||
|
to write the information into a file for the user.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 12]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
An additional subtype, "PostScript", is defined
|
|||
|
for transporting PostScript documents in bodies.
|
|||
|
Other expected uses for "application" include
|
|||
|
spreadsheets, data for mail-based scheduling
|
|||
|
systems, and languages for "active"
|
|||
|
(computational) email. (Note that active email
|
|||
|
and other application data may entail several
|
|||
|
security considerations, which are discussed later
|
|||
|
in this memo, particularly in the context of
|
|||
|
application/PostScript.)
|
|||
|
|
|||
|
Default RFC 822 messages are typed by this protocol as plain text in
|
|||
|
the US-ASCII character set, which can be explicitly specified as
|
|||
|
"Content-type: text/plain; charset=us-ascii". If no Content-Type is
|
|||
|
specified, this default is assumed. In the presence of a MIME-
|
|||
|
Version header field, a receiving User Agent can also assume that
|
|||
|
plain US-ASCII text was the sender's intent. In the absence of a
|
|||
|
MIME-Version specification, plain US-ASCII text must still be
|
|||
|
assumed, but the sender's intent might have been otherwise.
|
|||
|
|
|||
|
RATIONALE: In the absence of any Content-Type header field or
|
|||
|
MIME-Version header field, it is impossible to be certain that a
|
|||
|
message is actually text in the US-ASCII character set, since it
|
|||
|
might well be a message that, using the conventions that predate
|
|||
|
this document, includes text in another character set or non-
|
|||
|
textual data in a manner that cannot be automatically recognized
|
|||
|
(e.g., a uuencoded compressed UNIX tar file). Although there is
|
|||
|
no fully acceptable alternative to treating such untyped messages
|
|||
|
as "text/plain; charset=us-ascii", implementors should remain
|
|||
|
aware that if a message lacks both the MIME-Version and the
|
|||
|
Content-Type header fields, it may in practice contain almost
|
|||
|
anything.
|
|||
|
|
|||
|
It should be noted that the list of Content-Type values given here
|
|||
|
may be augmented in time, via the mechanisms described above, and
|
|||
|
that the set of subtypes is expected to grow substantially.
|
|||
|
|
|||
|
When a mail reader encounters mail with an unknown Content-type
|
|||
|
value, it should generally treat it as equivalent to
|
|||
|
"application/octet-stream", as described later in this document.
|
|||
|
|
|||
|
5. The Content-Transfer-Encoding Header Field
|
|||
|
|
|||
|
Many Content-Types which could usefully be transported via email are
|
|||
|
represented, in their "natural" format, as 8-bit character or binary
|
|||
|
data. Such data cannot be transmitted over some transport protocols.
|
|||
|
For example, RFC 821 restricts mail messages to 7-bit US-ASCII data
|
|||
|
with lines no longer than 1000 characters.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 13]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
It is necessary, therefore, to define a standard mechanism for re-
|
|||
|
encoding such data into a 7-bit short-line format. This document
|
|||
|
specifies that such encodings will be indicated by a new "Content-
|
|||
|
Transfer-Encoding" header field. The Content-Transfer-Encoding field
|
|||
|
is used to indicate the type of transformation that has been used in
|
|||
|
order to represent the body in an acceptable manner for transport.
|
|||
|
|
|||
|
Unlike Content-Types, a proliferation of Content-Transfer-Encoding
|
|||
|
values is undesirable and unnecessary. However, establishing only a
|
|||
|
single Content-Transfer-Encoding mechanism does not seem possible.
|
|||
|
There is a tradeoff between the desire for a compact and efficient
|
|||
|
encoding of largely-binary data and the desire for a readable
|
|||
|
encoding of data that is mostly, but not entirely, 7-bit data. For
|
|||
|
this reason, at least two encoding mechanisms are necessary: a
|
|||
|
"readable" encoding and a "dense" encoding.
|
|||
|
|
|||
|
The Content-Transfer-Encoding field is designed to specify an
|
|||
|
invertible mapping between the "native" representation of a type of
|
|||
|
data and a representation that can be readily exchanged using 7 bit
|
|||
|
mail transport protocols, such as those defined by RFC 821 (SMTP).
|
|||
|
This field has not been defined by any previous standard. The field's
|
|||
|
value is a single token specifying the type of encoding, as
|
|||
|
enumerated below. Formally:
|
|||
|
|
|||
|
encoding := "Content-Transfer-Encoding" ":" mechanism
|
|||
|
|
|||
|
mechanism := "7bit" ; case-insensitive
|
|||
|
/ "quoted-printable"
|
|||
|
/ "base64"
|
|||
|
/ "8bit"
|
|||
|
/ "binary"
|
|||
|
/ x-token
|
|||
|
|
|||
|
These values are not case sensitive. That is, Base64 and BASE64 and
|
|||
|
bAsE64 are all equivalent. An encoding type of 7BIT requires that
|
|||
|
the body is already in a seven-bit mail-ready representation. This
|
|||
|
is the default value -- that is, "Content-Transfer-Encoding: 7BIT" is
|
|||
|
assumed if the Content-Transfer-Encoding header field is not present.
|
|||
|
|
|||
|
The values "8bit", "7bit", and "binary" all mean that NO encoding has
|
|||
|
been performed. However, they are potentially useful as indications
|
|||
|
of the kind of data contained in the object, and therefore of the
|
|||
|
kind of encoding that might need to be performed for transmission in
|
|||
|
a given transport system. In particular:
|
|||
|
|
|||
|
"7bit" means that the data is all represented as short
|
|||
|
lines of US-ASCII data.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 14]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
"8bit" means that the lines are short, but there may be
|
|||
|
non-ASCII characters (octets with the high-order
|
|||
|
bit set).
|
|||
|
|
|||
|
"Binary" means that not only may non-ASCII characters
|
|||
|
be present, but also that the lines are not
|
|||
|
necessarily short enough for SMTP transport.
|
|||
|
|
|||
|
The difference between "8bit" (or any other conceivable bit-width
|
|||
|
token) and the "binary" token is that "binary" does not require
|
|||
|
adherence to any limits on line length or to the SMTP CRLF semantics,
|
|||
|
while the bit-width tokens do require such adherence. If the body
|
|||
|
contains data in any bit-width other than 7-bit, the appropriate
|
|||
|
bit-width Content-Transfer-Encoding token must be used (e.g., "8bit"
|
|||
|
for unencoded 8 bit wide data). If the body contains binary data,
|
|||
|
the "binary" Content-Transfer-Encoding token must be used.
|
|||
|
|
|||
|
NOTE: The distinction between the Content-Transfer-Encoding values
|
|||
|
of "binary", "8bit", etc. may seem unimportant, in that all of
|
|||
|
them really mean "none" -- that is, there has been no encoding of
|
|||
|
the data for transport. However, clear labeling will be of
|
|||
|
enormous value to gateways between future mail transport systems
|
|||
|
with differing capabilities in transporting data that do not meet
|
|||
|
the restrictions of RFC 821 transport.
|
|||
|
|
|||
|
Mail transport for unencoded 8-bit data is defined in RFC-1426
|
|||
|
[RFC-1426]. As of the publication of this document, there are no
|
|||
|
standardized Internet mail transports for which it is legitimate
|
|||
|
to include unencoded binary data in mail bodies. Thus there are
|
|||
|
no circumstances in which the "binary" Content-Transfer-Encoding
|
|||
|
is actually legal on the Internet. However, in the event that
|
|||
|
binary mail transport becomes a reality in Internet mail, or when
|
|||
|
this document is used in conjunction with any other binary-capable
|
|||
|
transport mechanism, binary bodies should be labeled as such using
|
|||
|
this mechanism.
|
|||
|
|
|||
|
NOTE: The five values defined for the Content-Transfer-Encoding
|
|||
|
field imply nothing about the Content-Type other than the
|
|||
|
algorithm by which it was encoded or the transport system
|
|||
|
requirements if unencoded.
|
|||
|
|
|||
|
Implementors may, if necessary, define new Content-Transfer-Encoding
|
|||
|
values, but must use an x-token, which is a name prefixed by "X-" to
|
|||
|
indicate its non-standard status, e.g., "Content-Transfer-Encoding:
|
|||
|
x-my-new-encoding". However, unlike Content-Types and subtypes, the
|
|||
|
creation of new Content-Transfer-Encoding values is explicitly and
|
|||
|
strongly discouraged, as it seems likely to hinder interoperability
|
|||
|
with little potential benefit. Their use is allowed only as the
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 15]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
result of an agreement between cooperating user agents.
|
|||
|
|
|||
|
If a Content-Transfer-Encoding header field appears as part of a
|
|||
|
message header, it applies to the entire body of that message. If a
|
|||
|
Content-Transfer-Encoding header field appears as part of a body
|
|||
|
part's headers, it applies only to the body of that body part. If an
|
|||
|
entity is of type "multipart" or "message", the Content-Transfer-
|
|||
|
Encoding is not permitted to have any value other than a bit width
|
|||
|
(e.g., "7bit", "8bit", etc.) or "binary".
|
|||
|
|
|||
|
It should be noted that email is character-oriented, so that the
|
|||
|
mechanisms described here are mechanisms for encoding arbitrary octet
|
|||
|
streams, not bit streams. If a bit stream is to be encoded via one
|
|||
|
of these mechanisms, it must first be converted to an 8-bit byte
|
|||
|
stream using the network standard bit order ("big-endian"), in which
|
|||
|
the earlier bits in a stream become the higher-order bits in a byte.
|
|||
|
A bit stream not ending at an 8-bit boundary must be padded with
|
|||
|
zeroes. This document provides a mechanism for noting the addition
|
|||
|
of such padding in the case of the application Content-Type, which
|
|||
|
has a "padding" parameter.
|
|||
|
|
|||
|
The encoding mechanisms defined here explicitly encode all data in
|
|||
|
ASCII. Thus, for example, suppose an entity has header fields such
|
|||
|
as:
|
|||
|
|
|||
|
Content-Type: text/plain; charset=ISO-8859-1
|
|||
|
Content-transfer-encoding: base64
|
|||
|
|
|||
|
This must be interpreted to mean that the body is a base64 ASCII
|
|||
|
encoding of data that was originally in ISO-8859-1, and will be in
|
|||
|
that character set again after decoding.
|
|||
|
|
|||
|
The following sections will define the two standard encoding
|
|||
|
mechanisms. The definition of new content-transfer-encodings is
|
|||
|
explicitly discouraged and should only occur when absolutely
|
|||
|
necessary. All content-transfer-encoding namespace except that
|
|||
|
beginning with "X-" is explicitly reserved to the IANA for future
|
|||
|
use. Private agreements about content-transfer-encodings are also
|
|||
|
explicitly discouraged.
|
|||
|
|
|||
|
Certain Content-Transfer-Encoding values may only be used on certain
|
|||
|
Content-Types. In particular, it is expressly forbidden to use any
|
|||
|
encodings other than "7bit", "8bit", or "binary" with any Content-
|
|||
|
Type that recursively includes other Content-Type fields, notably the
|
|||
|
"multipart" and "message" Content-Types. All encodings that are
|
|||
|
desired for bodies of type multipart or message must be done at the
|
|||
|
innermost level, by encoding the actual body that needs to be
|
|||
|
encoded.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 16]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
NOTE ON ENCODING RESTRICTIONS: Though the prohibition against
|
|||
|
using content-transfer-encodings on data of type multipart or
|
|||
|
message may seem overly restrictive, it is necessary to prevent
|
|||
|
nested encodings, in which data are passed through an encoding
|
|||
|
algorithm multiple times, and must be decoded multiple times in
|
|||
|
order to be properly viewed. Nested encodings add considerable
|
|||
|
complexity to user agents: aside from the obvious efficiency
|
|||
|
problems with such multiple encodings, they can obscure the basic
|
|||
|
structure of a message. In particular, they can imply that
|
|||
|
several decoding operations are necessary simply to find out what
|
|||
|
types of objects a message contains. Banning nested encodings may
|
|||
|
complicate the job of certain mail gateways, but this seems less
|
|||
|
of a problem than the effect of nested encodings on user agents.
|
|||
|
|
|||
|
NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND CONTENT-
|
|||
|
TRANSFER-ENCODING: It may seem that the Content-Transfer-Encoding
|
|||
|
could be inferred from the characteristics of the Content-Type
|
|||
|
that is to be encoded, or, at the very least, that certain
|
|||
|
Content-Transfer-Encodings could be mandated for use with specific
|
|||
|
Content-Types. There are several reasons why this is not the case.
|
|||
|
First, given the varying types of transports used for mail, some
|
|||
|
encodings may be appropriate for some Content-Type/transport
|
|||
|
combinations and not for others. (For example, in an 8-bit
|
|||
|
transport, no encoding would be required for text in certain
|
|||
|
character sets, while such encodings are clearly required for 7-
|
|||
|
bit SMTP.) Second, certain Content-Types may require different
|
|||
|
types of transfer encoding under different circumstances. For
|
|||
|
example, many PostScript bodies might consist entirely of short
|
|||
|
lines of 7-bit data and hence require little or no encoding.
|
|||
|
Other PostScript bodies (especially those using Level 2
|
|||
|
PostScript's binary encoding mechanism) may only be reasonably
|
|||
|
represented using a binary transport encoding. Finally, since
|
|||
|
Content-Type is intended to be an open-ended specification
|
|||
|
mechanism, strict specification of an association between
|
|||
|
Content-Types and encodings effectively couples the specification
|
|||
|
of an application protocol with a specific lower-level transport.
|
|||
|
This is not desirable since the developers of a Content-Type
|
|||
|
should not have to be aware of all the transports in use and what
|
|||
|
their limitations are.
|
|||
|
|
|||
|
NOTE ON TRANSLATING ENCODINGS: The quoted-printable and base64
|
|||
|
encodings are designed so that conversion between them is
|
|||
|
possible. The only issue that arises in such a conversion is the
|
|||
|
handling of line breaks. When converting from quoted-printable to
|
|||
|
base64 a line break must be converted into a CRLF sequence.
|
|||
|
Similarly, a CRLF sequence in base64 data must be converted to a
|
|||
|
quoted-printable line break, but ONLY when converting text data.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 17]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
NOTE ON CANONICAL ENCODING MODEL: There was some confusion, in
|
|||
|
earlier drafts of this memo, regarding the model for when email
|
|||
|
data was to be converted to canonical form and encoded, and in
|
|||
|
particular how this process would affect the treatment of CRLFs,
|
|||
|
given that the representation of newlines varies greatly from
|
|||
|
system to system, and the relationship between content-transfer-
|
|||
|
encodings and character sets. For this reason, a canonical model
|
|||
|
for encoding is presented as Appendix G.
|
|||
|
|
|||
|
5.1. Quoted-Printable Content-Transfer-Encoding
|
|||
|
|
|||
|
The Quoted-Printable encoding is intended to represent data that
|
|||
|
largely consists of octets that correspond to printable characters in
|
|||
|
the ASCII character set. It encodes the data in such a way that the
|
|||
|
resulting octets are unlikely to be modified by mail transport. If
|
|||
|
the data being encoded are mostly ASCII text, the encoded form of the
|
|||
|
data remains largely recognizable by humans. A body which is
|
|||
|
entirely ASCII may also be encoded in Quoted-Printable to ensure the
|
|||
|
integrity of the data should the message pass through a character-
|
|||
|
translating, and/or line-wrapping gateway.
|
|||
|
|
|||
|
In this encoding, octets are to be represented as determined by the
|
|||
|
following rules:
|
|||
|
|
|||
|
Rule #1: (General 8-bit representation) Any octet, except those
|
|||
|
indicating a line break according to the newline convention of the
|
|||
|
canonical (standard) form of the data being encoded, may be
|
|||
|
represented by an "=" followed by a two digit hexadecimal
|
|||
|
representation of the octet's value. The digits of the
|
|||
|
hexadecimal alphabet, for this purpose, are "0123456789ABCDEF".
|
|||
|
Uppercase letters must be used when sending hexadecimal data,
|
|||
|
though a robust implementation may choose to recognize lowercase
|
|||
|
letters on receipt. Thus, for example, the value 12 (ASCII form
|
|||
|
feed) can be represented by "=0C", and the value 61 (ASCII EQUAL
|
|||
|
SIGN) can be represented by "=3D". Except when the following
|
|||
|
rules allow an alternative encoding, this rule is mandatory.
|
|||
|
|
|||
|
Rule #2: (Literal representation) Octets with decimal values of 33
|
|||
|
through 60 inclusive, and 62 through 126, inclusive, MAY be
|
|||
|
represented as the ASCII characters which correspond to those
|
|||
|
octets (EXCLAMATION POINT through LESS THAN, and GREATER THAN
|
|||
|
through TILDE, respectively).
|
|||
|
|
|||
|
Rule #3: (White Space): Octets with values of 9 and 32 MAY be
|
|||
|
represented as ASCII TAB (HT) and SPACE characters, respectively,
|
|||
|
but MUST NOT be so represented at the end of an encoded line. Any
|
|||
|
TAB (HT) or SPACE characters on an encoded line MUST thus be
|
|||
|
followed on that line by a printable character. In particular, an
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 18]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
"=" at the end of an encoded line, indicating a soft line break
|
|||
|
(see rule #5) may follow one or more TAB (HT) or SPACE characters.
|
|||
|
It follows that an octet with value 9 or 32 appearing at the end
|
|||
|
of an encoded line must be represented according to Rule #1. This
|
|||
|
rule is necessary because some MTAs (Message Transport Agents,
|
|||
|
programs which transport messages from one user to another, or
|
|||
|
perform a part of such transfers) are known to pad lines of text
|
|||
|
with SPACEs, and others are known to remove "white space"
|
|||
|
characters from the end of a line. Therefore, when decoding a
|
|||
|
Quoted-Printable body, any trailing white space on a line must be
|
|||
|
deleted, as it will necessarily have been added by intermediate
|
|||
|
transport agents.
|
|||
|
|
|||
|
Rule #4 (Line Breaks): A line break in a text body, independent of
|
|||
|
what its representation is following the canonical representation
|
|||
|
of the data being encoded, must be represented by a (RFC 822) line
|
|||
|
break, which is a CRLF sequence, in the Quoted-Printable encoding.
|
|||
|
Since the canonical representation of types other than text do not
|
|||
|
generally include the representation of line breaks, no hard line
|
|||
|
breaks (i.e. line breaks that are intended to be meaningful and
|
|||
|
to be displayed to the user) should occur in the quoted-printable
|
|||
|
encoding of such types. Of course, occurrences of "=0D", "=0A",
|
|||
|
"0A=0D" and "=0D=0A" will eventually be encountered. In general,
|
|||
|
however, base64 is preferred over quoted-printable for binary
|
|||
|
data.
|
|||
|
|
|||
|
Note that many implementations may elect to encode the local
|
|||
|
representation of various content types directly, as described in
|
|||
|
Appendix G. In particular, this may apply to plain text material
|
|||
|
on systems that use newline conventions other than CRLF
|
|||
|
delimiters. Such an implementation is permissible, but the
|
|||
|
generation of line breaks must be generalized to account for the
|
|||
|
case where alternate representations of newline sequences are
|
|||
|
used.
|
|||
|
|
|||
|
Rule #5 (Soft Line Breaks): The Quoted-Printable encoding REQUIRES
|
|||
|
that encoded lines be no more than 76 characters long. If longer
|
|||
|
lines are to be encoded with the Quoted-Printable encoding, 'soft'
|
|||
|
line breaks must be used. An equal sign as the last character on a
|
|||
|
encoded line indicates such a non-significant ('soft') line break
|
|||
|
in the encoded text. Thus if the "raw" form of the line is a
|
|||
|
single unencoded line that says:
|
|||
|
|
|||
|
Now's the time for all folk to come to the aid of
|
|||
|
their country.
|
|||
|
|
|||
|
This can be represented, in the Quoted-Printable encoding, as
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 19]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Now's the time =
|
|||
|
for all folk to come=
|
|||
|
to the aid of their country.
|
|||
|
|
|||
|
This provides a mechanism with which long lines are encoded in
|
|||
|
such a way as to be restored by the user agent. The 76 character
|
|||
|
limit does not count the trailing CRLF, but counts all other
|
|||
|
characters, including any equal signs.
|
|||
|
|
|||
|
Since the hyphen character ("-") is represented as itself in the
|
|||
|
Quoted-Printable encoding, care must be taken, when encapsulating a
|
|||
|
quoted-printable encoded body in a multipart entity, to ensure that
|
|||
|
the encapsulation boundary does not appear anywhere in the encoded
|
|||
|
body. (A good strategy is to choose a boundary that includes a
|
|||
|
character sequence such as "=_" which can never appear in a quoted-
|
|||
|
printable body. See the definition of multipart messages later in
|
|||
|
this document.)
|
|||
|
|
|||
|
NOTE: The quoted-printable encoding represents something of a
|
|||
|
compromise between readability and reliability in transport.
|
|||
|
Bodies encoded with the quoted-printable encoding will work
|
|||
|
reliably over most mail gateways, but may not work perfectly over
|
|||
|
a few gateways, notably those involving translation into EBCDIC.
|
|||
|
(In theory, an EBCDIC gateway could decode a quoted-printable body
|
|||
|
and re-encode it using base64, but such gateways do not yet
|
|||
|
exist.) A higher level of confidence is offered by the base64
|
|||
|
Content-Transfer-Encoding. A way to get reasonably reliable
|
|||
|
transport through EBCDIC gateways is to also quote the ASCII
|
|||
|
characters
|
|||
|
|
|||
|
!"#$@[\]^`{|}~
|
|||
|
|
|||
|
according to rule #1. See Appendix B for more information.
|
|||
|
|
|||
|
Because quoted-printable data is generally assumed to be line-
|
|||
|
oriented, it is to be expected that the representation of the breaks
|
|||
|
between the lines of quoted printable data may be altered in
|
|||
|
transport, in the same manner that plain text mail has always been
|
|||
|
altered in Internet mail when passing between systems with differing
|
|||
|
newline conventions. If such alterations are likely to constitute a
|
|||
|
corruption of the data, it is probably more sensible to use the
|
|||
|
base64 encoding rather than the quoted-printable encoding.
|
|||
|
|
|||
|
WARNING TO IMPLEMENTORS: If binary data are encoded in quoted-
|
|||
|
printable, care must be taken to encode CR and LF characters as "=0D"
|
|||
|
and "=0A", respectively. In particular, a CRLF sequence in binary
|
|||
|
data should be encoded as "=0D=0A". Otherwise, if CRLF were
|
|||
|
represented as a hard line break, it might be incorrectly decoded on
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 20]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
platforms with different line break conventions.
|
|||
|
|
|||
|
For formalists, the syntax of quoted-printable data is described by
|
|||
|
the following grammar:
|
|||
|
|
|||
|
quoted-printable := ([*(ptext / SPACE / TAB) ptext] ["="] CRLF)
|
|||
|
; Maximum line length of 76 characters excluding CRLF
|
|||
|
|
|||
|
ptext := octet /<any ASCII character except "=", SPACE, or TAB>
|
|||
|
; characters not listed as "mail-safe" in Appendix B
|
|||
|
; are also not recommended.
|
|||
|
|
|||
|
octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F")
|
|||
|
; octet must be used for characters > 127, =, SPACE, or TAB,
|
|||
|
; and is recommended for any characters not listed in
|
|||
|
; Appendix B as "mail-safe".
|
|||
|
|
|||
|
5.2. Base64 Content-Transfer-Encoding
|
|||
|
|
|||
|
The Base64 Content-Transfer-Encoding is designed to represent
|
|||
|
arbitrary sequences of octets in a form that need not be humanly
|
|||
|
readable. The encoding and decoding algorithms are simple, but the
|
|||
|
encoded data are consistently only about 33 percent larger than the
|
|||
|
unencoded data. This encoding is virtually identical to the one used
|
|||
|
in Privacy Enhanced Mail (PEM) applications, as defined in RFC 1421.
|
|||
|
The base64 encoding is adapted from RFC 1421, with one change: base64
|
|||
|
eliminates the "*" mechanism for embedded clear text.
|
|||
|
|
|||
|
A 65-character subset of US-ASCII is used, enabling 6 bits to be
|
|||
|
represented per printable character. (The extra 65th character, "=",
|
|||
|
is used to signify a special processing function.)
|
|||
|
|
|||
|
NOTE: This subset has the important property that it is
|
|||
|
represented identically in all versions of ISO 646, including US
|
|||
|
ASCII, and all characters in the subset are also represented
|
|||
|
identically in all versions of EBCDIC. Other popular encodings,
|
|||
|
such as the encoding used by the uuencode utility and the base85
|
|||
|
encoding specified as part of Level 2 PostScript, do not share
|
|||
|
these properties, and thus do not fulfill the portability
|
|||
|
requirements a binary transport encoding for mail must meet.
|
|||
|
|
|||
|
The encoding process represents 24-bit groups of input bits as output
|
|||
|
strings of 4 encoded characters. Proceeding from left to right, a
|
|||
|
24-bit input group is formed by concatenating 3 8-bit input groups.
|
|||
|
These 24 bits are then treated as 4 concatenated 6-bit groups, each
|
|||
|
of which is translated into a single digit in the base64 alphabet.
|
|||
|
When encoding a bit stream via the base64 encoding, the bit stream
|
|||
|
must be presumed to be ordered with the most-significant-bit first.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 21]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
That is, the first bit in the stream will be the high-order bit in
|
|||
|
the first byte, and the eighth bit will be the low-order bit in the
|
|||
|
first byte, and so on.
|
|||
|
|
|||
|
Each 6-bit group is used as an index into an array of 64 printable
|
|||
|
characters. The character referenced by the index is placed in the
|
|||
|
output string. These characters, identified in Table 1, below, are
|
|||
|
selected so as to be universally representable, and the set excludes
|
|||
|
characters with particular significance to SMTP (e.g., ".", CR, LF)
|
|||
|
and to the encapsulation boundaries defined in this document (e.g.,
|
|||
|
"-").
|
|||
|
|
|||
|
Table 1: The Base64 Alphabet
|
|||
|
|
|||
|
Value Encoding Value Encoding Value Encoding Value Encoding
|
|||
|
0 A 17 R 34 i 51 z
|
|||
|
1 B 18 S 35 j 52 0
|
|||
|
2 C 19 T 36 k 53 1
|
|||
|
3 D 20 U 37 l 54 2
|
|||
|
4 E 21 V 38 m 55 3
|
|||
|
5 F 22 W 39 n 56 4
|
|||
|
6 G 23 X 40 o 57 5
|
|||
|
7 H 24 Y 41 p 58 6
|
|||
|
8 I 25 Z 42 q 59 7
|
|||
|
9 J 26 a 43 r 60 8
|
|||
|
10 K 27 b 44 s 61 9
|
|||
|
11 L 28 c 45 t 62 +
|
|||
|
12 M 29 d 46 u 63 /
|
|||
|
13 N 30 e 47 v
|
|||
|
14 O 31 f 48 w (pad) =
|
|||
|
15 P 32 g 49 x
|
|||
|
16 Q 33 h 50 y
|
|||
|
|
|||
|
The output stream (encoded bytes) must be represented in lines of no
|
|||
|
more than 76 characters each. All line breaks or other characters
|
|||
|
not found in Table 1 must be ignored by decoding software. In base64
|
|||
|
data, characters other than those in Table 1, line breaks, and other
|
|||
|
white space probably indicate a transmission error, about which a
|
|||
|
warning message or even a message rejection might be appropriate
|
|||
|
under some circumstances.
|
|||
|
|
|||
|
Special processing is performed if fewer than 24 bits are available
|
|||
|
at the end of the data being encoded. A full encoding quantum is
|
|||
|
always completed at the end of a body. When fewer than 24 input bits
|
|||
|
are available in an input group, zero bits are added (on the right)
|
|||
|
to form an integral number of 6-bit groups. Padding at the end of
|
|||
|
the data is performed using the '=' character. Since all base64
|
|||
|
input is an integral number of octets, only the following cases can
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 22]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
arise: (1) the final quantum of encoding input is an integral
|
|||
|
multiple of 24 bits; here, the final unit of encoded output will be
|
|||
|
an integral multiple of 4 characters with no "=" padding, (2) the
|
|||
|
final quantum of encoding input is exactly 8 bits; here, the final
|
|||
|
unit of encoded output will be two characters followed by two "="
|
|||
|
padding characters, or (3) the final quantum of encoding input is
|
|||
|
exactly 16 bits; here, the final unit of encoded output will be three
|
|||
|
characters followed by one "=" padding character.
|
|||
|
|
|||
|
Because it is used only for padding at the end of the data, the
|
|||
|
occurrence of any '=' characters may be taken as evidence that the
|
|||
|
end of the data has been reached (without truncation in transit). No
|
|||
|
such assurance is possible, however, when the number of octets
|
|||
|
transmitted was a multiple of three.
|
|||
|
|
|||
|
Any characters outside of the base64 alphabet are to be ignored in
|
|||
|
base64-encoded data. The same applies to any illegal sequence of
|
|||
|
characters in the base64 encoding, such as "====="
|
|||
|
|
|||
|
Care must be taken to use the proper octets for line breaks if base64
|
|||
|
encoding is applied directly to text material that has not been
|
|||
|
converted to canonical form. In particular, text line breaks must be
|
|||
|
converted into CRLF sequences prior to base64 encoding. The important
|
|||
|
thing to note is that this may be done directly by the encoder rather
|
|||
|
than in a prior canonicalization step in some implementations.
|
|||
|
|
|||
|
NOTE: There is no need to worry about quoting apparent
|
|||
|
encapsulation boundaries within base64-encoded parts of multipart
|
|||
|
entities because no hyphen characters are used in the base64
|
|||
|
encoding.
|
|||
|
|
|||
|
6. Additional Content-Header Fields
|
|||
|
|
|||
|
6.1. Optional Content-ID Header Field
|
|||
|
|
|||
|
In constructing a high-level user agent, it may be desirable to allow
|
|||
|
one body to make reference to another. Accordingly, bodies may be
|
|||
|
labeled using the "Content-ID" header field, which is syntactically
|
|||
|
identical to the "Message-ID" header field:
|
|||
|
|
|||
|
id := "Content-ID" ":" msg-id
|
|||
|
Like the Message-ID values, Content-ID values must be generated to be
|
|||
|
world-unique.
|
|||
|
|
|||
|
The Content-ID value may be used for uniquely identifying MIME
|
|||
|
entities in several contexts, particularly for cacheing data
|
|||
|
referenced by the message/external-body mechanism. Although the
|
|||
|
Content-ID header is generally optional, its use is mandatory in
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 23]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
implementations which generate data of the optional MIME Content-type
|
|||
|
"message/external-body". That is, each message/external-body entity
|
|||
|
must have a Content-ID field to permit cacheing of such data.
|
|||
|
|
|||
|
It is also worth noting that the Content-ID value has special
|
|||
|
semantics in the case of the multipart/alternative content-type.
|
|||
|
This is explained in the section of this document dealing with
|
|||
|
multipart/alternative.
|
|||
|
|
|||
|
6.2. Optional Content-Description Header Field
|
|||
|
|
|||
|
The ability to associate some descriptive information with a given
|
|||
|
body is often desirable. For example, it may be useful to mark an
|
|||
|
"image" body as "a picture of the Space Shuttle Endeavor." Such text
|
|||
|
may be placed in the Content-Description header field.
|
|||
|
|
|||
|
description := "Content-Description" ":" *text
|
|||
|
|
|||
|
The description is presumed to be given in the US-ASCII character
|
|||
|
set, although the mechanism specified in [RFC-1522] may be used for
|
|||
|
non-US-ASCII Content-Description values.
|
|||
|
|
|||
|
7. The Predefined Content-Type Values
|
|||
|
|
|||
|
This document defines seven initial Content-Type values and an
|
|||
|
extension mechanism for private or experimental types. Further
|
|||
|
standard types must be defined by new published specifications. It
|
|||
|
is expected that most innovation in new types of mail will take place
|
|||
|
as subtypes of the seven types defined here. The most essential
|
|||
|
characteristics of the seven content-types are summarized in Appendix
|
|||
|
F.
|
|||
|
|
|||
|
7.1 The Text Content-Type
|
|||
|
|
|||
|
The text Content-Type is intended for sending material which is
|
|||
|
principally textual in form. It is the default Content-Type. A
|
|||
|
"charset" parameter may be used to indicate the character set of the
|
|||
|
body text for some text subtypes, notably including the primary
|
|||
|
subtype, "text/plain", which indicates plain (unformatted) text. The
|
|||
|
default Content-Type for Internet mail is "text/plain; charset=us-
|
|||
|
ascii".
|
|||
|
|
|||
|
Beyond plain text, there are many formats for representing what might
|
|||
|
be known as "extended text" -- text with embedded formatting and
|
|||
|
presentation information. An interesting characteristic of many such
|
|||
|
representations is that they are to some extent readable even without
|
|||
|
the software that interprets them. It is useful, then, to
|
|||
|
distinguish them, at the highest level, from such unreadable data as
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 24]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
images, audio, or text represented in an unreadable form. In the
|
|||
|
absence of appropriate interpretation software, it is reasonable to
|
|||
|
show subtypes of text to the user, while it is not reasonable to do
|
|||
|
so with most nontextual data.
|
|||
|
|
|||
|
Such formatted textual data should be represented using subtypes of
|
|||
|
text. Plausible subtypes of text are typically given by the common
|
|||
|
name of the representation format, e.g., "text/richtext" [RFC-1341].
|
|||
|
|
|||
|
7.1.1. The charset parameter
|
|||
|
|
|||
|
A critical parameter that may be specified in the Content-Type field
|
|||
|
for text/plain data is the character set. This is specified with a
|
|||
|
"charset" parameter, as in:
|
|||
|
|
|||
|
Content-type: text/plain; charset=us-ascii
|
|||
|
|
|||
|
Unlike some other parameter values, the values of the charset
|
|||
|
parameter are NOT case sensitive. The default character set, which
|
|||
|
must be assumed in the absence of a charset parameter, is US-ASCII.
|
|||
|
|
|||
|
The specification for any future subtypes of "text" must specify
|
|||
|
whether or not they will also utilize a "charset" parameter, and may
|
|||
|
possibly restrict its values as well. When used with a particular
|
|||
|
body, the semantics of the "charset" parameter should be identical to
|
|||
|
those specified here for "text/plain", i.e., the body consists
|
|||
|
entirely of characters in the given charset. In particular, definers
|
|||
|
of future text subtypes should pay close attention the the
|
|||
|
implications of multibyte character sets for their subtype
|
|||
|
definitions.
|
|||
|
|
|||
|
This RFC specifies the definition of the charset parameter for the
|
|||
|
purposes of MIME to be a unique mapping of a byte stream to glyphs, a
|
|||
|
mapping which does not require external profiling information.
|
|||
|
|
|||
|
An initial list of predefined character set names can be found at the
|
|||
|
end of this section. Additional character sets may be registered
|
|||
|
with IANA, although the standardization of their use requires the
|
|||
|
usual IESG [RFC-1340] review and approval. Note that if the
|
|||
|
specified character set includes 8-bit data, a Content-Transfer-
|
|||
|
Encoding header field and a corresponding encoding on the data are
|
|||
|
required in order to transmit the body via some mail transfer
|
|||
|
protocols, such as SMTP.
|
|||
|
|
|||
|
The default character set, US-ASCII, has been the subject of some
|
|||
|
confusion and ambiguity in the past. Not only were there some
|
|||
|
ambiguities in the definition, there have been wide variations in
|
|||
|
practice. In order to eliminate such ambiguity and variations in the
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 25]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
future, it is strongly recommended that new user agents explicitly
|
|||
|
specify a character set via the Content-Type header field. "US-
|
|||
|
ASCII" does not indicate an arbitrary seven-bit character code, but
|
|||
|
specifies that the body uses character coding that uses the exact
|
|||
|
correspondence of codes to characters specified in ASCII. National
|
|||
|
use variations of ISO 646 [ISO-646] are NOT ASCII and their use in
|
|||
|
Internet mail is explicitly discouraged. The omission of the ISO 646
|
|||
|
character set is deliberate in this regard. The character set name
|
|||
|
of "US-ASCII" explicitly refers to ANSI X3.4-1986 [US-ASCII] only.
|
|||
|
The character set name "ASCII" is reserved and must not be used for
|
|||
|
any purpose.
|
|||
|
|
|||
|
NOTE: RFC 821 explicitly specifies "ASCII", and references an
|
|||
|
earlier version of the American Standard. Insofar as one of the
|
|||
|
purposes of specifying a Content-Type and character set is to
|
|||
|
permit the receiver to unambiguously determine how the sender
|
|||
|
intended the coded message to be interpreted, assuming anything
|
|||
|
other than "strict ASCII" as the default would risk unintentional
|
|||
|
and incompatible changes to the semantics of messages now being
|
|||
|
transmitted. This also implies that messages containing
|
|||
|
characters coded according to national variations on ISO 646, or
|
|||
|
using code-switching procedures (e.g., those of ISO 2022), as well
|
|||
|
as 8-bit or multiple octet character encodings MUST use an
|
|||
|
appropriate character set specification to be consistent with this
|
|||
|
specification.
|
|||
|
|
|||
|
The complete US-ASCII character set is listed in [US-ASCII]. Note
|
|||
|
that the control characters including DEL (0-31, 127) have no defined
|
|||
|
meaning apart from the combination CRLF (ASCII values 13 and 10)
|
|||
|
indicating a new line. Two of the characters have de facto meanings
|
|||
|
in wide use: FF (12) often means "start subsequent text on the
|
|||
|
beginning of a new page"; and TAB or HT (9) often (though not always)
|
|||
|
means "move the cursor to the next available column after the current
|
|||
|
position where the column number is a multiple of 8 (counting the
|
|||
|
first column as column 0)." Apart from this, any use of the control
|
|||
|
characters or DEL in a body must be part of a private agreement
|
|||
|
between the sender and recipient. Such private agreements are
|
|||
|
discouraged and should be replaced by the other capabilities of this
|
|||
|
document.
|
|||
|
|
|||
|
NOTE: Beyond US-ASCII, an enormous proliferation of character sets
|
|||
|
is possible. It is the opinion of the IETF working group that a
|
|||
|
large number of character sets is NOT a good thing. We would
|
|||
|
prefer to specify a single character set that can be used
|
|||
|
universally for representing all of the world's languages in
|
|||
|
electronic mail. Unfortunately, existing practice in several
|
|||
|
communities seems to point to the continued use of multiple
|
|||
|
character sets in the near future. For this reason, we define
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 26]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
names for a small number of character sets for which a strong
|
|||
|
constituent base exists.
|
|||
|
|
|||
|
The defined charset values are:
|
|||
|
|
|||
|
US-ASCII -- as defined in [US-ASCII].
|
|||
|
|
|||
|
ISO-8859-X -- where "X" is to be replaced, as necessary, for the
|
|||
|
parts of ISO-8859 [ISO-8859]. Note that the ISO 646
|
|||
|
character sets have deliberately been omitted in favor of
|
|||
|
their 8859 replacements, which are the designated character
|
|||
|
sets for Internet mail. As of the publication of this
|
|||
|
document, the legitimate values for "X" are the digits 1
|
|||
|
through 9.
|
|||
|
|
|||
|
The character sets specified above are the ones that were relatively
|
|||
|
uncontroversial during the drafting of MIME. This document does not
|
|||
|
endorse the use of any particular character set other than US-ASCII,
|
|||
|
and recognizes that the future evolution of world character sets
|
|||
|
remains unclear. It is expected that in the future, additional
|
|||
|
character sets will be registered for use in MIME.
|
|||
|
|
|||
|
Note that the character set used, if anything other than US-ASCII,
|
|||
|
must always be explicitly specified in the Content-Type field.
|
|||
|
|
|||
|
No other character set name may be used in Internet mail without the
|
|||
|
publication of a formal specification and its registration with IANA,
|
|||
|
or by private agreement, in which case the character set name must
|
|||
|
begin with "X-".
|
|||
|
|
|||
|
Implementors are discouraged from defining new character sets for
|
|||
|
mail use unless absolutely necessary.
|
|||
|
|
|||
|
The "charset" parameter has been defined primarily for the purpose of
|
|||
|
textual data, and is described in this section for that reason.
|
|||
|
However, it is conceivable that non-textual data might also wish to
|
|||
|
specify a charset value for some purpose, in which case the same
|
|||
|
syntax and values should be used.
|
|||
|
|
|||
|
In general, mail-sending software must always use the "lowest common
|
|||
|
denominator" character set possible. For example, if a body contains
|
|||
|
only US-ASCII characters, it must be marked as being in the US-ASCII
|
|||
|
character set, not ISO-8859-1, which, like all the ISO-8859 family of
|
|||
|
character sets, is a superset of US-ASCII. More generally, if a
|
|||
|
widely-used character set is a subset of another character set, and a
|
|||
|
body contains only characters in the widely-used subset, it must be
|
|||
|
labeled as being in that subset. This will increase the chances that
|
|||
|
the recipient will be able to view the mail correctly.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 27]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
7.1.2. The Text/plain subtype
|
|||
|
|
|||
|
The primary subtype of text is "plain". This indicates plain
|
|||
|
(unformatted) text. The default Content-Type for Internet mail,
|
|||
|
"text/plain; charset=us-ascii", describes existing Internet practice.
|
|||
|
That is, it is the type of body defined by RFC 822.
|
|||
|
|
|||
|
No other text subtype is defined by this document.
|
|||
|
|
|||
|
The formal grammar for the content-type header field for text is as
|
|||
|
follows:
|
|||
|
|
|||
|
text-type := "text" "/" text-subtype [";" "charset" "=" charset]
|
|||
|
|
|||
|
text-subtype := "plain" / extension-token
|
|||
|
|
|||
|
charset := "us-ascii"/ "iso-8859-1"/ "iso-8859-2"/ "iso-8859-3"
|
|||
|
/ "iso-8859-4"/ "iso-8859-5"/ "iso-8859-6"/ "iso-8859-7"
|
|||
|
/ "iso-8859-8" / "iso-8859-9" / extension-token
|
|||
|
; case insensitive
|
|||
|
|
|||
|
7.2. The Multipart Content-Type
|
|||
|
|
|||
|
In the case of multiple part entities, in which one or more different
|
|||
|
sets of data are combined in a single body, a "multipart" Content-
|
|||
|
Type field must appear in the entity's header. The body must then
|
|||
|
contain one or more "body parts," each preceded by an encapsulation
|
|||
|
boundary, and the last one followed by a closing boundary. Each part
|
|||
|
starts with an encapsulation boundary, and then contains a body part
|
|||
|
consisting of header area, a blank line, and a body area. Thus a
|
|||
|
body part is similar to an RFC 822 message in syntax, but different
|
|||
|
in meaning.
|
|||
|
|
|||
|
A body part is NOT to be interpreted as actually being an RFC 822
|
|||
|
message. To begin with, NO header fields are actually required in
|
|||
|
body parts. A body part that starts with a blank line, therefore, is
|
|||
|
allowed and is a body part for which all default values are to be
|
|||
|
assumed. In such a case, the absence of a Content-Type header field
|
|||
|
implies that the corresponding body is plain US-ASCII text. The only
|
|||
|
header fields that have defined meaning for body parts are those the
|
|||
|
names of which begin with "Content-". All other header fields are
|
|||
|
generally to be ignored in body parts. Although they should
|
|||
|
generally be retained in mail processing, they may be discarded by
|
|||
|
gateways if necessary. Such other fields are permitted to appear in
|
|||
|
body parts but must not be depended on. "X-" fields may be created
|
|||
|
for experimental or private purposes, with the recognition that the
|
|||
|
information they contain may be lost at some gateways.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 28]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
NOTE: The distinction between an RFC 822 message and a body part
|
|||
|
is subtle, but important. A gateway between Internet and X.400
|
|||
|
mail, for example, must be able to tell the difference between a
|
|||
|
body part that contains an image and a body part that contains an
|
|||
|
encapsulated message, the body of which is an image. In order to
|
|||
|
represent the latter, the body part must have "Content-Type:
|
|||
|
message", and its body (after the blank line) must be the
|
|||
|
encapsulated message, with its own "Content-Type: image" header
|
|||
|
field. The use of similar syntax facilitates the conversion of
|
|||
|
messages to body parts, and vice versa, but the distinction
|
|||
|
between the two must be understood by implementors. (For the
|
|||
|
special case in which all parts actually are messages, a "digest"
|
|||
|
subtype is also defined.)
|
|||
|
|
|||
|
As stated previously, each body part is preceded by an encapsulation
|
|||
|
boundary. The encapsulation boundary MUST NOT appear inside any of
|
|||
|
the encapsulated parts. Thus, it is crucial that the composing agent
|
|||
|
be able to choose and specify the unique boundary that will separate
|
|||
|
the parts.
|
|||
|
|
|||
|
All present and future subtypes of the "multipart" type must use an
|
|||
|
identical syntax. Subtypes may differ in their semantics, and may
|
|||
|
impose additional restrictions on syntax, but must conform to the
|
|||
|
required syntax for the multipart type. This requirement ensures
|
|||
|
that all conformant user agents will at least be able to recognize
|
|||
|
and separate the parts of any multipart entity, even of an
|
|||
|
unrecognized subtype.
|
|||
|
|
|||
|
As stated in the definition of the Content-Transfer-Encoding field,
|
|||
|
no encoding other than "7bit", "8bit", or "binary" is permitted for
|
|||
|
entities of type "multipart". The multipart delimiters and header
|
|||
|
fields are always represented as 7-bit ASCII in any case (though the
|
|||
|
header fields may encode non-ASCII header text as per [RFC-1522]),
|
|||
|
and data within the body parts can be encoded on a part-by-part
|
|||
|
basis, with Content-Transfer-Encoding fields for each appropriate
|
|||
|
body part.
|
|||
|
|
|||
|
Mail gateways, relays, and other mail handling agents are commonly
|
|||
|
known to alter the top-level header of an RFC 822 message. In
|
|||
|
particular, they frequently add, remove, or reorder header fields.
|
|||
|
Such alterations are explicitly forbidden for the body part headers
|
|||
|
embedded in the bodies of messages of type "multipart."
|
|||
|
|
|||
|
7.2.1. Multipart: The common syntax
|
|||
|
|
|||
|
All subtypes of "multipart" share a common syntax, defined in this
|
|||
|
section. A simple example of a multipart message also appears in
|
|||
|
this section. An example of a more complex multipart message is
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 29]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
given in Appendix C.
|
|||
|
|
|||
|
The Content-Type field for multipart entities requires one parameter,
|
|||
|
"boundary", which is used to specify the encapsulation boundary. The
|
|||
|
encapsulation boundary is defined as a line consisting entirely of
|
|||
|
two hyphen characters ("-", decimal code 45) followed by the boundary
|
|||
|
parameter value from the Content-Type header field.
|
|||
|
|
|||
|
NOTE: The hyphens are for rough compatibility with the earlier RFC
|
|||
|
934 method of message encapsulation, and for ease of searching for
|
|||
|
the boundaries in some implementations. However, it should be
|
|||
|
noted that multipart messages are NOT completely compatible with
|
|||
|
RFC 934 encapsulations; in particular, they do not obey RFC 934
|
|||
|
quoting conventions for embedded lines that begin with hyphens.
|
|||
|
This mechanism was chosen over the RFC 934 mechanism because the
|
|||
|
latter causes lines to grow with each level of quoting. The
|
|||
|
combination of this growth with the fact that SMTP implementations
|
|||
|
sometimes wrap long lines made the RFC 934 mechanism unsuitable
|
|||
|
for use in the event that deeply-nested multipart structuring is
|
|||
|
ever desired.
|
|||
|
|
|||
|
WARNING TO IMPLEMENTORS: The grammar for parameters on the Content-
|
|||
|
type field is such that it is often necessary to enclose the
|
|||
|
boundaries in quotes on the Content-type line. This is not always
|
|||
|
necessary, but never hurts. Implementors should be sure to study the
|
|||
|
grammar carefully in order to avoid producing illegal Content-type
|
|||
|
fields. Thus, a typical multipart Content-Type header field might
|
|||
|
look like this:
|
|||
|
|
|||
|
Content-Type: multipart/mixed;
|
|||
|
boundary=gc0p4Jq0M2Yt08jU534c0p
|
|||
|
|
|||
|
But the following is illegal:
|
|||
|
|
|||
|
Content-Type: multipart/mixed;
|
|||
|
boundary=gc0p4Jq0M:2Yt08jU534c0p
|
|||
|
|
|||
|
(because of the colon) and must instead be represented as
|
|||
|
|
|||
|
Content-Type: multipart/mixed;
|
|||
|
boundary="gc0p4Jq0M:2Yt08jU534c0p"
|
|||
|
|
|||
|
This indicates that the entity consists of several parts, each itself
|
|||
|
with a structure that is syntactically identical to an RFC 822
|
|||
|
message, except that the header area might be completely empty, and
|
|||
|
that the parts are each preceded by the line
|
|||
|
|
|||
|
--gc0p4Jq0M:2Yt08jU534c0p
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 30]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Note that the encapsulation boundary must occur at the beginning of a
|
|||
|
line, i.e., following a CRLF, and that the initial CRLF is considered
|
|||
|
to be attached to the encapsulation boundary rather than part of the
|
|||
|
preceding part. The boundary must be followed immediately either by
|
|||
|
another CRLF and the header fields for the next part, or by two
|
|||
|
CRLFs, in which case there are no header fields for the next part
|
|||
|
(and it is therefore assumed to be of Content-Type text/plain).
|
|||
|
|
|||
|
NOTE: The CRLF preceding the encapsulation line is conceptually
|
|||
|
attached to the boundary so that it is possible to have a part
|
|||
|
that does not end with a CRLF (line break). Body parts that must
|
|||
|
be considered to end with line breaks, therefore, must have two
|
|||
|
CRLFs preceding the encapsulation line, the first of which is part
|
|||
|
of the preceding body part, and the second of which is part of the
|
|||
|
encapsulation boundary.
|
|||
|
|
|||
|
Encapsulation boundaries must not appear within the encapsulations,
|
|||
|
and must be no longer than 70 characters, not counting the two
|
|||
|
leading hyphens.
|
|||
|
|
|||
|
The encapsulation boundary following the last body part is a
|
|||
|
distinguished delimiter that indicates that no further body parts
|
|||
|
will follow. Such a delimiter is identical to the previous
|
|||
|
delimiters, with the addition of two more hyphens at the end of the
|
|||
|
line:
|
|||
|
|
|||
|
--gc0p4Jq0M2Yt08jU534c0p--
|
|||
|
|
|||
|
There appears to be room for additional information prior to the
|
|||
|
first encapsulation boundary and following the final boundary. These
|
|||
|
areas should generally be left blank, and implementations must ignore
|
|||
|
anything that appears before the first boundary or after the last
|
|||
|
one.
|
|||
|
|
|||
|
NOTE: These "preamble" and "epilogue" areas are generally not used
|
|||
|
because of the lack of proper typing of these parts and the lack
|
|||
|
of clear semantics for handling these areas at gateways,
|
|||
|
particularly X.400 gateways. However, rather than leaving the
|
|||
|
preamble area blank, many MIME implementations have found this to
|
|||
|
be a convenient place to insert an explanatory note for recipients
|
|||
|
who read the message with pre-MIME software, since such notes will
|
|||
|
be ignored by MIME-compliant software.
|
|||
|
|
|||
|
NOTE: Because encapsulation boundaries must not appear in the body
|
|||
|
parts being encapsulated, a user agent must exercise care to
|
|||
|
choose a unique boundary. The boundary in the example above could
|
|||
|
have been the result of an algorithm designed to produce
|
|||
|
boundaries with a very low probability of already existing in the
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 31]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
data to be encapsulated without having to prescan the data.
|
|||
|
Alternate algorithms might result in more 'readable' boundaries
|
|||
|
for a recipient with an old user agent, but would require more
|
|||
|
attention to the possibility that the boundary might appear in the
|
|||
|
encapsulated part. The simplest boundary possible is something
|
|||
|
like "---", with a closing boundary of "-----".
|
|||
|
|
|||
|
As a very simple example, the following multipart message has two
|
|||
|
parts, both of them plain text, one of them explicitly typed and one
|
|||
|
of them implicitly typed:
|
|||
|
|
|||
|
From: Nathaniel Borenstein <nsb@bellcore.com>
|
|||
|
To: Ned Freed <ned@innosoft.com>
|
|||
|
Subject: Sample message
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-type: multipart/mixed; boundary="simple
|
|||
|
boundary"
|
|||
|
|
|||
|
This is the preamble. It is to be ignored, though it
|
|||
|
is a handy place for mail composers to include an
|
|||
|
explanatory note to non-MIME conformant readers.
|
|||
|
--simple boundary
|
|||
|
|
|||
|
This is implicitly typed plain ASCII text.
|
|||
|
It does NOT end with a linebreak.
|
|||
|
--simple boundary
|
|||
|
Content-type: text/plain; charset=us-ascii
|
|||
|
|
|||
|
This is explicitly typed plain ASCII text.
|
|||
|
It DOES end with a linebreak.
|
|||
|
|
|||
|
--simple boundary--
|
|||
|
This is the epilogue. It is also to be ignored.
|
|||
|
|
|||
|
The use of a Content-Type of multipart in a body part within another
|
|||
|
multipart entity is explicitly allowed. In such cases, for obvious
|
|||
|
reasons, care must be taken to ensure that each nested multipart
|
|||
|
entity must use a different boundary delimiter. See Appendix C for an
|
|||
|
example of nested multipart entities.
|
|||
|
|
|||
|
The use of the multipart Content-Type with only a single body part
|
|||
|
may be useful in certain contexts, and is explicitly permitted.
|
|||
|
|
|||
|
The only mandatory parameter for the multipart Content-Type is the
|
|||
|
boundary parameter, which consists of 1 to 70 characters from a set
|
|||
|
of characters known to be very robust through email gateways, and NOT
|
|||
|
ending with white space. (If a boundary appears to end with white
|
|||
|
space, the white space must be presumed to have been added by a
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 32]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
gateway, and must be deleted.) It is formally specified by the
|
|||
|
following BNF:
|
|||
|
|
|||
|
boundary := 0*69<bchars> bcharsnospace
|
|||
|
|
|||
|
bchars := bcharsnospace / " "
|
|||
|
|
|||
|
bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" /"_"
|
|||
|
/ "," / "-" / "." / "/" / ":" / "=" / "?"
|
|||
|
|
|||
|
Overall, the body of a multipart entity may be specified as
|
|||
|
follows:
|
|||
|
|
|||
|
multipart-body := preamble 1*encapsulation
|
|||
|
close-delimiter epilogue
|
|||
|
|
|||
|
encapsulation := delimiter body-part CRLF
|
|||
|
|
|||
|
delimiter := "--" boundary CRLF ; taken from Content-Type field.
|
|||
|
; There must be no space
|
|||
|
; between "--" and boundary.
|
|||
|
|
|||
|
close-delimiter := "--" boundary "--" CRLF ; Again, no space
|
|||
|
by "--",
|
|||
|
|
|||
|
preamble := discard-text ; to be ignored upon receipt.
|
|||
|
|
|||
|
epilogue := discard-text ; to be ignored upon receipt.
|
|||
|
|
|||
|
discard-text := *(*text CRLF)
|
|||
|
|
|||
|
body-part := <"message" as defined in RFC 822,
|
|||
|
with all header fields optional, and with the
|
|||
|
specified delimiter not occurring anywhere in
|
|||
|
the message body, either on a line by itself
|
|||
|
or as a substring anywhere. Note that the
|
|||
|
semantics of a part differ from the semantics
|
|||
|
of a message, as described in the text.>
|
|||
|
|
|||
|
NOTE: In certain transport enclaves, RFC 822 restrictions such as
|
|||
|
the one that limits bodies to printable ASCII characters may not
|
|||
|
be in force. (That is, the transport domains may resemble
|
|||
|
standard Internet mail transport as specified in RFC821 and
|
|||
|
assumed by RFC822, but without certain restrictions.) The
|
|||
|
relaxation of these restrictions should be construed as locally
|
|||
|
extending the definition of bodies, for example to include octets
|
|||
|
outside of the ASCII range, as long as these extensions are
|
|||
|
supported by the transport and adequately documented in the
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 33]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Content-Transfer-Encoding header field. However, in no event are
|
|||
|
headers (either message headers or body-part headers) allowed to
|
|||
|
contain anything other than ASCII characters.
|
|||
|
|
|||
|
NOTE: Conspicuously missing from the multipart type is a notion of
|
|||
|
structured, related body parts. In general, it seems premature to
|
|||
|
try to standardize interpart structure yet. It is recommended
|
|||
|
that those wishing to provide a more structured or integrated
|
|||
|
multipart messaging facility should define a subtype of multipart
|
|||
|
that is syntactically identical, but that always expects the
|
|||
|
inclusion of a distinguished part that can be used to specify the
|
|||
|
structure and integration of the other parts, probably referring
|
|||
|
to them by their Content-ID field. If this approach is used,
|
|||
|
other implementations will not recognize the new subtype, but will
|
|||
|
treat it as the primary subtype (multipart/mixed) and will thus be
|
|||
|
able to show the user the parts that are recognized.
|
|||
|
|
|||
|
7.2.2. The Multipart/mixed (primary) subtype
|
|||
|
|
|||
|
The primary subtype for multipart, "mixed", is intended for use when
|
|||
|
the body parts are independent and need to be bundled in a particular
|
|||
|
order. Any multipart subtypes that an implementation does not
|
|||
|
recognize must be treated as being of subtype "mixed".
|
|||
|
|
|||
|
7.2.3. The Multipart/alternative subtype
|
|||
|
|
|||
|
The multipart/alternative type is syntactically identical to
|
|||
|
multipart/mixed, but the semantics are different. In particular,
|
|||
|
each of the parts is an "alternative" version of the same
|
|||
|
information.
|
|||
|
|
|||
|
Systems should recognize that the content of the various parts are
|
|||
|
interchangeable. Systems should choose the "best" type based on the
|
|||
|
local environment and preferences, in some cases even through user
|
|||
|
interaction. As with multipart/mixed, the order of body parts is
|
|||
|
significant. In this case, the alternatives appear in an order of
|
|||
|
increasing faithfulness to the original content. In general, the best
|
|||
|
choice is the LAST part of a type supported by the recipient system's
|
|||
|
local environment.
|
|||
|
|
|||
|
Multipart/alternative may be used, for example, to send mail in a
|
|||
|
fancy text format in such a way that it can easily be displayed
|
|||
|
anywhere:
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 34]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
From: Nathaniel Borenstein <nsb@bellcore.com>
|
|||
|
To: Ned Freed <ned@innosoft.com>
|
|||
|
Subject: Formatted text mail
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-Type: multipart/alternative; boundary=boundary42
|
|||
|
|
|||
|
--boundary42
|
|||
|
|
|||
|
Content-Type: text/plain; charset=us-ascii
|
|||
|
|
|||
|
...plain text version of message goes here....
|
|||
|
--boundary42
|
|||
|
Content-Type: text/richtext
|
|||
|
|
|||
|
.... RFC 1341 richtext version of same message goes here ...
|
|||
|
--boundary42
|
|||
|
Content-Type: text/x-whatever
|
|||
|
|
|||
|
.... fanciest formatted version of same message goes here
|
|||
|
...
|
|||
|
--boundary42--
|
|||
|
|
|||
|
In this example, users whose mail system understood the "text/x-
|
|||
|
whatever" format would see only the fancy version, while other users
|
|||
|
would see only the richtext or plain text version, depending on the
|
|||
|
capabilities of their system.
|
|||
|
|
|||
|
In general, user agents that compose multipart/alternative entities
|
|||
|
must place the body parts in increasing order of preference, that is,
|
|||
|
with the preferred format last. For fancy text, the sending user
|
|||
|
agent should put the plainest format first and the richest format
|
|||
|
last. Receiving user agents should pick and display the last format
|
|||
|
they are capable of displaying. In the case where one of the
|
|||
|
alternatives is itself of type "multipart" and contains unrecognized
|
|||
|
sub-parts, the user agent may choose either to show that alternative,
|
|||
|
an earlier alternative, or both.
|
|||
|
|
|||
|
NOTE: From an implementor's perspective, it might seem more
|
|||
|
sensible to reverse this ordering, and have the plainest
|
|||
|
alternative last. However, placing the plainest alternative first
|
|||
|
is the friendliest possible option when multipart/alternative
|
|||
|
entities are viewed using a non-MIME-conformant mail reader.
|
|||
|
While this approach does impose some burden on conformant mail
|
|||
|
readers, interoperability with older mail readers was deemed to be
|
|||
|
more important in this case.
|
|||
|
|
|||
|
It may be the case that some user agents, if they can recognize more
|
|||
|
than one of the formats, will prefer to offer the user the choice of
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 35]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
which format to view. This makes sense, for example, if mail
|
|||
|
includes both a nicely-formatted image version and an easily-edited
|
|||
|
text version. What is most critical, however, is that the user not
|
|||
|
automatically be shown multiple versions of the same data. Either
|
|||
|
the user should be shown the last recognized version or should be
|
|||
|
given the choice.
|
|||
|
|
|||
|
NOTE ON THE SEMANTICS OF CONTENT-ID IN MULTIPART/ALTERNATIVE: Each
|
|||
|
part of a multipart/alternative entity represents the same data, but
|
|||
|
the mappings between the two are not necessarily without information
|
|||
|
loss. For example, information is lost when translating ODA to
|
|||
|
PostScript or plain text. It is recommended that each part should
|
|||
|
have a different Content-ID value in the case where the information
|
|||
|
content of the two parts is not identical. However, where the
|
|||
|
information content is identical -- for example, where several parts
|
|||
|
of type "application/external- body" specify alternate ways to access
|
|||
|
the identical data -- the same Content-ID field value should be used,
|
|||
|
to optimize any cacheing mechanisms that might be present on the
|
|||
|
recipient's end. However, it is recommended that the Content-ID
|
|||
|
values used by the parts should not be the same Content-ID value that
|
|||
|
describes the multipart/alternative as a whole, if there is any such
|
|||
|
Content-ID field. That is, one Content-ID value will refer to the
|
|||
|
multipart/alternative entity, while one or more other Content-ID
|
|||
|
values will refer to the parts inside it.
|
|||
|
|
|||
|
7.2.4. The Multipart/digest subtype
|
|||
|
|
|||
|
This document defines a "digest" subtype of the multipart Content-
|
|||
|
Type. This type is syntactically identical to multipart/mixed, but
|
|||
|
the semantics are different. In particular, in a digest, the default
|
|||
|
Content-Type value for a body part is changed from "text/plain" to
|
|||
|
"message/rfc822". This is done to allow a more readable digest
|
|||
|
format that is largely compatible (except for the quoting convention)
|
|||
|
with RFC 934.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 36]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
A digest in this format might, then, look something like this:
|
|||
|
|
|||
|
From: Moderator-Address
|
|||
|
To: Recipient-List
|
|||
|
MIME-Version: 1.0
|
|||
|
Subject: Internet Digest, volume 42
|
|||
|
Content-Type: multipart/digest;
|
|||
|
boundary="---- next message ----"
|
|||
|
|
|||
|
------ next message ----
|
|||
|
|
|||
|
From: someone-else
|
|||
|
Subject: my opinion
|
|||
|
|
|||
|
...body goes here ...
|
|||
|
|
|||
|
------ next message ----
|
|||
|
|
|||
|
From: someone-else-again
|
|||
|
Subject: my different opinion
|
|||
|
|
|||
|
... another body goes here...
|
|||
|
|
|||
|
------ next message ------
|
|||
|
|
|||
|
7.2.5. The Multipart/parallel subtype
|
|||
|
|
|||
|
This document defines a "parallel" subtype of the multipart Content-
|
|||
|
Type. This type is syntactically identical to multipart/mixed, but
|
|||
|
the semantics are different. In particular, in a parallel entity,
|
|||
|
the order of body parts is not significant.
|
|||
|
|
|||
|
A common presentation of this type is to display all of the parts
|
|||
|
simultaneously on hardware and software that are capable of doing so.
|
|||
|
However, composing agents should be aware that many mail readers will
|
|||
|
lack this capability and will show the parts serially in any event.
|
|||
|
|
|||
|
7.2.6. Other Multipart subtypes
|
|||
|
|
|||
|
Other multipart subtypes are expected in the future. MIME
|
|||
|
implementations must in general treat unrecognized subtypes of
|
|||
|
multipart as being equivalent to "multipart/mixed".
|
|||
|
|
|||
|
The formal grammar for content-type header fields for multipart data
|
|||
|
is given by:
|
|||
|
|
|||
|
multipart-type := "multipart" "/" multipart-subtype
|
|||
|
";" "boundary" "=" boundary
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 37]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
multipart-subtype := "mixed" / "parallel" / "digest"
|
|||
|
/ "alternative" / extension-token
|
|||
|
|
|||
|
7.3. The Message Content-Type
|
|||
|
|
|||
|
It is frequently desirable, in sending mail, to encapsulate another
|
|||
|
mail message. For this common operation, a special Content-Type,
|
|||
|
"message", is defined. The primary subtype, message/rfc822, has no
|
|||
|
required parameters in the Content-Type field. Additional subtypes,
|
|||
|
"partial" and "External-body", do have required parameters. These
|
|||
|
subtypes are explained below.
|
|||
|
|
|||
|
NOTE: It has been suggested that subtypes of message might be
|
|||
|
defined for forwarded or rejected messages. However, forwarded
|
|||
|
and rejected messages can be handled as multipart messages in
|
|||
|
which the first part contains any control or descriptive
|
|||
|
information, and a second part, of type message/rfc822, is the
|
|||
|
forwarded or rejected message. Composing rejection and forwarding
|
|||
|
messages in this manner will preserve the type information on the
|
|||
|
original message and allow it to be correctly presented to the
|
|||
|
recipient, and hence is strongly encouraged.
|
|||
|
|
|||
|
As stated in the definition of the Content-Transfer-Encoding field,
|
|||
|
no encoding other than "7bit", "8bit", or "binary" is permitted for
|
|||
|
messages or parts of type "message". Even stronger restrictions
|
|||
|
apply to the subtypes "message/partial" and "message/external-body",
|
|||
|
as specified below. The message header fields are always US-ASCII in
|
|||
|
any case, and data within the body can still be encoded, in which
|
|||
|
case the Content-Transfer-Encoding header field in the encapsulated
|
|||
|
message will reflect this. Non-ASCII text in the headers of an
|
|||
|
encapsulated message can be specified using the mechanisms described
|
|||
|
in [RFC-1522].
|
|||
|
|
|||
|
Mail gateways, relays, and other mail handling agents are commonly
|
|||
|
known to alter the top-level header of an RFC 822 message. In
|
|||
|
particular, they frequently add, remove, or reorder header fields.
|
|||
|
Such alterations are explicitly forbidden for the encapsulated
|
|||
|
headers embedded in the bodies of messages of type "message."
|
|||
|
|
|||
|
7.3.1. The Message/rfc822 (primary) subtype
|
|||
|
|
|||
|
A Content-Type of "message/rfc822" indicates that the body contains
|
|||
|
an encapsulated message, with the syntax of an RFC 822 message.
|
|||
|
However, unlike top-level RFC 822 messages, it is not required that
|
|||
|
each message/rfc822 body must include a "From", "Subject", and at
|
|||
|
least one destination header.
|
|||
|
|
|||
|
It should be noted that, despite the use of the numbers "822", a
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 38]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
message/rfc822 entity can include enhanced information as defined in
|
|||
|
this document. In other words, a message/rfc822 message may be a
|
|||
|
MIME message.
|
|||
|
|
|||
|
7.3.2. The Message/Partial subtype
|
|||
|
|
|||
|
A subtype of message, "partial", is defined in order to allow large
|
|||
|
objects to be delivered as several separate pieces of mail and
|
|||
|
automatically reassembled by the receiving user agent. (The concept
|
|||
|
is similar to IP fragmentation/reassembly in the basic Internet
|
|||
|
Protocols.) This mechanism can be used when intermediate transport
|
|||
|
agents limit the size of individual messages that can be sent.
|
|||
|
Content-Type "message/partial" thus indicates that the body contains
|
|||
|
a fragment of a larger message.
|
|||
|
|
|||
|
Three parameters must be specified in the Content-Type field of type
|
|||
|
message/partial: The first, "id", is a unique identifier, as close to
|
|||
|
a world-unique identifier as possible, to be used to match the parts
|
|||
|
together. (In general, the identifier is essentially a message-id;
|
|||
|
if placed in double quotes, it can be any message-id, in accordance
|
|||
|
with the BNF for "parameter" given earlier in this specification.)
|
|||
|
The second, "number", an integer, is the part number, which indicates
|
|||
|
where this part fits into the sequence of fragments. The third,
|
|||
|
"total", another integer, is the total number of parts. This third
|
|||
|
subfield is required on the final part, and is optional (though
|
|||
|
encouraged) on the earlier parts. Note also that these parameters
|
|||
|
may be given in any order.
|
|||
|
|
|||
|
Thus, part 2 of a 3-part message may have either of the following
|
|||
|
header fields:
|
|||
|
|
|||
|
Content-Type: Message/Partial;
|
|||
|
number=2; total=3;
|
|||
|
id="oc=jpbe0M2Yt4s@thumper.bellcore.com"
|
|||
|
|
|||
|
Content-Type: Message/Partial;
|
|||
|
id="oc=jpbe0M2Yt4s@thumper.bellcore.com";
|
|||
|
number=2
|
|||
|
|
|||
|
But part 3 MUST specify the total number of parts:
|
|||
|
|
|||
|
Content-Type: Message/Partial;
|
|||
|
number=3; total=3;
|
|||
|
id="oc=jpbe0M2Yt4s@thumper.bellcore.com"
|
|||
|
|
|||
|
Note that part numbering begins with 1, not 0.
|
|||
|
|
|||
|
When the parts of a message broken up in this manner are put
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 39]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
together, the result is a complete MIME entity, which may have its
|
|||
|
own Content-Type header field, and thus may contain any other data
|
|||
|
type.
|
|||
|
|
|||
|
Message fragmentation and reassembly: The semantics of a reassembled
|
|||
|
partial message must be those of the "inner" message, rather than of
|
|||
|
a message containing the inner message. This makes it possible, for
|
|||
|
example, to send a large audio message as several partial messages,
|
|||
|
and still have it appear to the recipient as a simple audio message
|
|||
|
rather than as an encapsulated message containing an audio message.
|
|||
|
That is, the encapsulation of the message is considered to be
|
|||
|
"transparent".
|
|||
|
|
|||
|
When generating and reassembling the parts of a message/partial
|
|||
|
message, the headers of the encapsulated message must be merged with
|
|||
|
the headers of the enclosing entities. In this process the following
|
|||
|
rules must be observed:
|
|||
|
|
|||
|
(1) All of the header fields from the initial enclosing entity
|
|||
|
(part one), except those that start with "Content-" and the
|
|||
|
specific header fields "Message-ID", "Encrypted", and "MIME-
|
|||
|
Version", must be copied, in order, to the new message.
|
|||
|
|
|||
|
(2) Only those header fields in the enclosed message which start
|
|||
|
with "Content-" and "Message-ID", "Encrypted", and "MIME-Version"
|
|||
|
must be appended, in order, to the header fields of the new
|
|||
|
message. Any header fields in the enclosed message which do not
|
|||
|
start with "Content-" (except for "Message-ID", "Encrypted", and
|
|||
|
"MIME-Version") will be ignored.
|
|||
|
|
|||
|
(3) All of the header fields from the second and any subsequent
|
|||
|
messages will be ignored.
|
|||
|
|
|||
|
For example, if an audio message is broken into two parts, the first
|
|||
|
part might look something like this:
|
|||
|
|
|||
|
X-Weird-Header-1: Foo
|
|||
|
From: Bill@host.com
|
|||
|
To: joe@otherhost.com
|
|||
|
Subject: Audio mail
|
|||
|
Message-ID: <id1@host.com>
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-type: message/partial;
|
|||
|
id="ABC@host.com";
|
|||
|
number=1; total=2
|
|||
|
|
|||
|
X-Weird-Header-1: Bar
|
|||
|
X-Weird-Header-2: Hello
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 40]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Message-ID: <anotherid@foo.com>
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-type: audio/basic
|
|||
|
Content-transfer-encoding: base64
|
|||
|
|
|||
|
... first half of encoded audio data goes here...
|
|||
|
|
|||
|
and the second half might look something like this:
|
|||
|
|
|||
|
From: Bill@host.com
|
|||
|
To: joe@otherhost.com
|
|||
|
Subject: Audio mail
|
|||
|
MIME-Version: 1.0
|
|||
|
Message-ID: <id2@host.com>
|
|||
|
Content-type: message/partial;
|
|||
|
id="ABC@host.com"; number=2; total=2
|
|||
|
|
|||
|
... second half of encoded audio data goes here...
|
|||
|
|
|||
|
Then, when the fragmented message is reassembled, the resulting
|
|||
|
message to be displayed to the user should look something like this:
|
|||
|
|
|||
|
X-Weird-Header-1: Foo
|
|||
|
From: Bill@host.com
|
|||
|
To: joe@otherhost.com
|
|||
|
Subject: Audio mail
|
|||
|
Message-ID: <anotherid@foo.com>
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-type: audio/basic
|
|||
|
Content-transfer-encoding: base64
|
|||
|
|
|||
|
... first half of encoded audio data goes here...
|
|||
|
... second half of encoded audio data goes here...
|
|||
|
|
|||
|
Note on encoding of MIME entities encapsulated inside message/partial
|
|||
|
entities: Because data of type "message" may never be encoded in
|
|||
|
base64 or quoted-printable, a problem might arise if message/partial
|
|||
|
entities are constructed in an environment that supports binary or
|
|||
|
8-bit transport. The problem is that the binary data would be split
|
|||
|
into multiple message/partial objects, each of them requiring binary
|
|||
|
transport. If such objects were encountered at a gateway into a 7-
|
|||
|
bit transport environment, there would be no way to properly encode
|
|||
|
them for the 7-bit world, aside from waiting for all of the parts,
|
|||
|
reassembling the message, and then encoding the reassembled data in
|
|||
|
base64 or quoted-printable. Since it is possible that different
|
|||
|
parts might go through different gateways, even this is not an
|
|||
|
acceptable solution. For this reason, it is specified that MIME
|
|||
|
entities of type message/partial must always have a content-
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 41]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
transfer-encoding of 7-bit (the default). In particular, even in
|
|||
|
environments that support binary or 8-bit transport, the use of a
|
|||
|
content-transfer-encoding of "8bit" or "binary" is explicitly
|
|||
|
prohibited for entities of type message/partial.
|
|||
|
|
|||
|
It should be noted that, because some message transfer agents may
|
|||
|
choose to automatically fragment large messages, and because such
|
|||
|
agents may use different fragmentation thresholds, it is possible
|
|||
|
that the pieces of a partial message, upon reassembly, may prove
|
|||
|
themselves to comprise a partial message. This is explicitly
|
|||
|
permitted.
|
|||
|
|
|||
|
It should also be noted that the inclusion of a "References" field in
|
|||
|
the headers of the second and subsequent pieces of a fragmented
|
|||
|
message that references the Message-Id on the previous piece may be
|
|||
|
of benefit to mail readers that understand and track references.
|
|||
|
However, the generation of such "References" fields is entirely
|
|||
|
optional.
|
|||
|
|
|||
|
Finally, it should be noted that the "Encrypted" header field has
|
|||
|
been made obsolete by Privacy Enhanced Messaging (PEM), but the rules
|
|||
|
above are believed to describe the correct way to treat it if it is
|
|||
|
encountered in the context of conversion to and from message/partial
|
|||
|
fragments.
|
|||
|
|
|||
|
7.3.3. The Message/External-Body subtype
|
|||
|
|
|||
|
The external-body subtype indicates that the actual body data are not
|
|||
|
included, but merely referenced. In this case, the parameters
|
|||
|
describe a mechanism for accessing the external data.
|
|||
|
|
|||
|
When an entity is of type "message/external-body", it consists of a
|
|||
|
header, two consecutive CRLFs, and the message header for the
|
|||
|
encapsulated message. If another pair of consecutive CRLFs appears,
|
|||
|
this of course ends the message header for the encapsulated message.
|
|||
|
However, since the encapsulated message's body is itself external, it
|
|||
|
does NOT appear in the area that follows. For example, consider the
|
|||
|
following message:
|
|||
|
|
|||
|
Content-type: message/external-body; access-
|
|||
|
type=local-file;
|
|||
|
|
|||
|
name="/u/nsb/Me.gif"
|
|||
|
|
|||
|
Content-type: image/gif
|
|||
|
Content-ID: <id42@guppylake.bellcore.com>
|
|||
|
Content-Transfer-Encoding: binary
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 42]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
THIS IS NOT REALLY THE BODY!
|
|||
|
|
|||
|
The area at the end, which might be called the "phantom body", is
|
|||
|
ignored for most external-body messages. However, it may be used to
|
|||
|
contain auxiliary information for some such messages, as indeed it is
|
|||
|
when the access-type is "mail-server". Of the access-types defined
|
|||
|
by this document, the phantom body is used only when the access-type
|
|||
|
is "mail-server". In all other cases, the phantom body is ignored.
|
|||
|
|
|||
|
The only always-mandatory parameter for message/external-body is
|
|||
|
"access-type"; all of the other parameters may be mandatory or
|
|||
|
optional depending on the value of access-type.
|
|||
|
|
|||
|
ACCESS-TYPE -- A case-insensitive word, indicating the supported
|
|||
|
access mechanism by which the file or data may be obtained.
|
|||
|
Values include, but are not limited to, "FTP", "ANON-FTP", "TFTP",
|
|||
|
"AFS", "LOCAL-FILE", and "MAIL-SERVER". Future values, except for
|
|||
|
experimental values beginning with "X-" must be registered with
|
|||
|
IANA, as described in Appendix E .
|
|||
|
|
|||
|
In addition, the following three parameters are optional for ALL
|
|||
|
access-types:
|
|||
|
|
|||
|
EXPIRATION -- The date (in the RFC 822 "date-time" syntax, as
|
|||
|
extended by RFC 1123 to permit 4 digits in the year field) after
|
|||
|
which the existence of the external data is not guaranteed.
|
|||
|
|
|||
|
SIZE -- The size (in octets) of the data. The intent of this
|
|||
|
parameter is to help the recipient decide whether or not to expend
|
|||
|
the necessary resources to retrieve the external data. Note that
|
|||
|
this describes the size of the data in its canonical form, that
|
|||
|
is, before any Content- Transfer-Encoding has been applied or
|
|||
|
after the data have been decoded.
|
|||
|
|
|||
|
PERMISSION -- A case-insensitive field that indicates whether or
|
|||
|
not it is expected that clients might also attempt to overwrite
|
|||
|
the data. By default, or if permission is "read", the assumption
|
|||
|
is that they are not, and that if the data is retrieved once, it
|
|||
|
is never needed again. If PERMISSION is "read-write", this
|
|||
|
assumption is invalid, and any local copy must be considered no
|
|||
|
more than a cache. "Read" and "Read-write" are the only defined
|
|||
|
values of permission.
|
|||
|
|
|||
|
The precise semantics of the access-types defined here are described
|
|||
|
in the sections that follow.
|
|||
|
|
|||
|
The encapsulated headers in ALL message/external-body entities MUST
|
|||
|
include a Content-ID header field to give a unique identifier by
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 43]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
which to reference the data. This identifier may be used for
|
|||
|
cacheing mechanisms, and for recognizing the receipt of the data when
|
|||
|
the access-type is "mail-server".
|
|||
|
|
|||
|
Note that, as specified here, the tokens that describe external-body
|
|||
|
data, such as file names and mail server commands, are required to be
|
|||
|
in the US-ASCII character set. If this proves problematic in
|
|||
|
practice, a new mechanism may be required as a future extension to
|
|||
|
MIME, either as newly defined access-types for message/external-body
|
|||
|
or by some other mechanism.
|
|||
|
|
|||
|
As with message/partial, it is specified that MIME entities of type
|
|||
|
message/external-body must always have a content-transfer-encoding of
|
|||
|
7-bit (the default). In particular, even in environments that
|
|||
|
support binary or 8-bit transport, the use of a content-transfer-
|
|||
|
encoding of "8bit" or "binary" is explicitly prohibited for entities
|
|||
|
of type message/external-body.
|
|||
|
|
|||
|
7.3.3.1. The "ftp" and "tftp" access-types
|
|||
|
|
|||
|
An access-type of FTP or TFTP indicates that the message body is
|
|||
|
accessible as a file using the FTP [RFC-959] or TFTP [RFC-783]
|
|||
|
protocols, respectively. For these access-types, the following
|
|||
|
additional parameters are mandatory:
|
|||
|
|
|||
|
NAME -- The name of the file that contains the actual body data.
|
|||
|
|
|||
|
SITE -- A machine from which the file may be obtained, using the
|
|||
|
given protocol. This must be a fully qualified domain name, not a
|
|||
|
nickname.
|
|||
|
|
|||
|
Before any data are retrieved, using FTP, the user will generally
|
|||
|
need to be asked to provide a login id and a password for the machine
|
|||
|
named by the site parameter. For security reasons, such an id and
|
|||
|
password are not specified as content-type parameters, but must be
|
|||
|
obtained from the user.
|
|||
|
|
|||
|
In addition, the following parameters are optional:
|
|||
|
|
|||
|
DIRECTORY -- A directory from which the data named by NAME should
|
|||
|
be retrieved.
|
|||
|
|
|||
|
MODE -- A case-insensitive string indicating the mode to be used
|
|||
|
when retrieving the information. The legal values for access-type
|
|||
|
"TFTP" are "NETASCII", "OCTET", and "MAIL", as specified by the
|
|||
|
TFTP protocol [RFC-783]. The legal values for access-type "FTP"
|
|||
|
are "ASCII", "EBCDIC", "IMAGE", and "LOCALn" where "n" is a
|
|||
|
decimal integer, typically 8. These correspond to the
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 44]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
representation types "A" "E" "I" and "L n" as specified by the FTP
|
|||
|
protocol [RFC-959]. Note that "BINARY" and "TENEX" are not valid
|
|||
|
values for MODE, but that "OCTET" or "IMAGE" or "LOCAL8" should be
|
|||
|
used instead. IF MODE is not specified, the default value is
|
|||
|
"NETASCII" for TFTP and "ASCII" otherwise.
|
|||
|
|
|||
|
7.3.3.2. The "anon-ftp" access-type
|
|||
|
|
|||
|
The "anon-ftp" access-type is identical to the "ftp" access type,
|
|||
|
except that the user need not be asked to provide a name and password
|
|||
|
for the specified site. Instead, the ftp protocol will be used with
|
|||
|
login "anonymous" and a password that corresponds to the user's email
|
|||
|
address.
|
|||
|
|
|||
|
7.3.3.3. The "local-file" and "afs" access-types
|
|||
|
|
|||
|
An access-type of "local-file" indicates that the actual body is
|
|||
|
accessible as a file on the local machine. An access-type of "afs"
|
|||
|
indicates that the file is accessible via the global AFS file system.
|
|||
|
In both cases, only a single parameter is required:
|
|||
|
|
|||
|
NAME -- The name of the file that contains the actual body data.
|
|||
|
|
|||
|
The following optional parameter may be used to describe the locality
|
|||
|
of reference for the data, that is, the site or sites at which the
|
|||
|
file is expected to be visible:
|
|||
|
|
|||
|
SITE -- A domain specifier for a machine or set of machines that
|
|||
|
are known to have access to the data file. Asterisks may be used
|
|||
|
for wildcard matching to a part of a domain name, such as
|
|||
|
"*.bellcore.com", to indicate a set of machines on which the data
|
|||
|
should be directly visible, while a single asterisk may be used to
|
|||
|
indicate a file that is expected to be universally available,
|
|||
|
e.g., via a global file system.
|
|||
|
|
|||
|
7.3.3.4. The "mail-server" access-type
|
|||
|
|
|||
|
The "mail-server" access-type indicates that the actual body is
|
|||
|
available from a mail server. The mandatory parameter for this
|
|||
|
access-type is:
|
|||
|
|
|||
|
SERVER -- The email address of the mail server from which the
|
|||
|
actual body data can be obtained.
|
|||
|
|
|||
|
Because mail servers accept a variety of syntaxes, some of which is
|
|||
|
multiline, the full command to be sent to a mail server is not
|
|||
|
included as a parameter on the content-type line. Instead, it is
|
|||
|
provided as the "phantom body" when the content-type is
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 45]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
message/external-body and the access- type is mail-server.
|
|||
|
|
|||
|
An optional parameter for this access-type is:
|
|||
|
|
|||
|
SUBJECT -- The subject that is to be used in the mail that is sent
|
|||
|
to obtain the data. Note that keying mail servers on Subject lines
|
|||
|
is NOT recommended, but such mail servers are known to exist.
|
|||
|
|
|||
|
Note that MIME does not define a mail server syntax. Rather, it
|
|||
|
allows the inclusion of arbitrary mail server commands in the phantom
|
|||
|
body. Implementations must include the phantom body in the body of
|
|||
|
the message it sends to the mail server address to retrieve the
|
|||
|
relevant data.
|
|||
|
|
|||
|
It is worth noting that, unlike other access-types, mail-server
|
|||
|
access is asynchronous and will happen at an unpredictable time in
|
|||
|
the future. For this reason, it is important that there be a
|
|||
|
mechanism by which the returned data can be matched up with the
|
|||
|
original message/external-body entity. MIME mailservers must use the
|
|||
|
same Content-ID field on the returned message that was used in the
|
|||
|
original message/external-body entity, to facilitate such matching.
|
|||
|
|
|||
|
7.3.3.5. Examples and Further Explanations
|
|||
|
|
|||
|
With the emerging possibility of very wide-area file systems, it
|
|||
|
becomes very hard to know in advance the set of machines where a file
|
|||
|
will and will not be accessible directly from the file system.
|
|||
|
Therefore it may make sense to provide both a file name, to be tried
|
|||
|
directly, and the name of one or more sites from which the file is
|
|||
|
known to be accessible. An implementation can try to retrieve remote
|
|||
|
files using FTP or any other protocol, using anonymous file retrieval
|
|||
|
or prompting the user for the necessary name and password. If an
|
|||
|
external body is accessible via multiple mechanisms, the sender may
|
|||
|
include multiple parts of type message/external-body within an entity
|
|||
|
of type multipart/alternative.
|
|||
|
|
|||
|
However, the external-body mechanism is not intended to be limited to
|
|||
|
file retrieval, as shown by the mail-server access-type. Beyond
|
|||
|
this, one can imagine, for example, using a video server for external
|
|||
|
references to video clips.
|
|||
|
|
|||
|
If an entity is of type "message/external-body", then the body of the
|
|||
|
entity will contain the header fields of the encapsulated message.
|
|||
|
The body itself is to be found in the external location. This means
|
|||
|
that if the body of the "message/external-body" message contains two
|
|||
|
consecutive CRLFs, everything after those pairs is NOT part of the
|
|||
|
message itself. For most message/external-body messages, this
|
|||
|
trailing area must simply be ignored. However, it is a convenient
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 46]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
place for additional data that cannot be included in the content-type
|
|||
|
header field. In particular, if the "access-type" value is "mail-
|
|||
|
server", then the trailing area must contain commands to be sent to
|
|||
|
the mail server at the address given by the value of the SERVER
|
|||
|
parameter.
|
|||
|
|
|||
|
The embedded message header fields which appear in the body of the
|
|||
|
message/external-body data must be used to declare the Content-type
|
|||
|
of the external body if it is anything other than plain ASCII text,
|
|||
|
since the external body does not have a header section to declare its
|
|||
|
type. Similarly, any Content-transfer-encoding other than "7bit"
|
|||
|
must also be declared here. Thus a complete message/external-body
|
|||
|
message, referring to a document in PostScript format, might look
|
|||
|
like this:
|
|||
|
|
|||
|
From: Whomever
|
|||
|
To: Someone
|
|||
|
Subject: whatever
|
|||
|
MIME-Version: 1.0
|
|||
|
Message-ID: <id1@host.com>
|
|||
|
Content-Type: multipart/alternative; boundary=42
|
|||
|
Content-ID: <id001@guppylake.bellcore.com>
|
|||
|
|
|||
|
--42
|
|||
|
Content-Type: message/external-body;
|
|||
|
name="BodyFormats.ps";
|
|||
|
site="thumper.bellcore.com";
|
|||
|
access-type=ANON-FTP;
|
|||
|
directory="pub";
|
|||
|
mode="image";
|
|||
|
expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
|
|||
|
|
|||
|
Content-type: application/postscript
|
|||
|
Content-ID: <id42@guppylake.bellcore.com>
|
|||
|
|
|||
|
--42
|
|||
|
Content-Type: message/external-body;
|
|||
|
name="/u/nsb/writing/rfcs/RFC-MIME.ps";
|
|||
|
site="thumper.bellcore.com";
|
|||
|
access-type=AFS
|
|||
|
expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
|
|||
|
|
|||
|
Content-type: application/postscript
|
|||
|
Content-ID: <id42@guppylake.bellcore.com>
|
|||
|
|
|||
|
--42
|
|||
|
Content-Type: message/external-body;
|
|||
|
access-type=mail-server
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 47]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
server="listserv@bogus.bitnet";
|
|||
|
expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
|
|||
|
|
|||
|
Content-type: application/postscript
|
|||
|
Content-ID: <id42@guppylake.bellcore.com>
|
|||
|
|
|||
|
get RFC-MIME.DOC
|
|||
|
|
|||
|
--42--
|
|||
|
|
|||
|
Note that in the above examples, the default Content-transfer-
|
|||
|
encoding of "7bit" is assumed for the external postscript data.
|
|||
|
|
|||
|
Like the message/partial type, the message/external-body type is
|
|||
|
intended to be transparent, that is, to convey the data type in the
|
|||
|
external body rather than to convey a message with a body of that
|
|||
|
type. Thus the headers on the outer and inner parts must be merged
|
|||
|
using the same rules as for message/partial. In particular, this
|
|||
|
means that the Content-type header is overridden, but the From and
|
|||
|
Subject headers are preserved.
|
|||
|
|
|||
|
Note that since the external bodies are not transported as mail, they
|
|||
|
need not conform to the 7-bit and line length requirements, but might
|
|||
|
in fact be binary files. Thus a Content-Transfer-Encoding is not
|
|||
|
generally necessary, though it is permitted.
|
|||
|
|
|||
|
Note that the body of a message of type "message/external-body" is
|
|||
|
governed by the basic syntax for an RFC 822 message. In particular,
|
|||
|
anything before the first consecutive pair of CRLFs is header
|
|||
|
information, while anything after it is body information, which is
|
|||
|
ignored for most access-types.
|
|||
|
|
|||
|
The formal grammar for content-type header fields for data of type
|
|||
|
message is given by:
|
|||
|
|
|||
|
message-type := "message" "/" message-subtype
|
|||
|
|
|||
|
message-subtype := "rfc822"
|
|||
|
/ "partial" 2#3partial-param
|
|||
|
/ "external-body" 1*external-param
|
|||
|
/ extension-token
|
|||
|
|
|||
|
partial-param := (";" "id" "=" value)
|
|||
|
/ (";" "number" "=" 1*DIGIT)
|
|||
|
/ (";" "total" "=" 1*DIGIT)
|
|||
|
; id & number required; total required for last part
|
|||
|
|
|||
|
external-param := (";" "access-type" "=" atype)
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 48]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
/ (";" "expiration" "=" date-time)
|
|||
|
; Note that date-time is quoted
|
|||
|
/ (";" "size" "=" 1*DIGIT)
|
|||
|
/ (";" "permission" "=" ("read" / "read-write"))
|
|||
|
; Permission is case-insensitive
|
|||
|
/ (";" "name" "=" value)
|
|||
|
/ (";" "site" "=" value)
|
|||
|
/ (";" "dir" "=" value)
|
|||
|
/ (";" "mode" "=" value)
|
|||
|
/ (";" "server" "=" value)
|
|||
|
/ (";" "subject" "=" value)
|
|||
|
; access-type required;others required based on access-type
|
|||
|
|
|||
|
atype := "ftp" / "anon-ftp" / "tftp" / "local-file"
|
|||
|
/ "afs" / "mail-server" / extension-token
|
|||
|
; Case-insensitive
|
|||
|
|
|||
|
7.4. The Application Content-Type
|
|||
|
|
|||
|
The "application" Content-Type is to be used for data which do not
|
|||
|
fit in any of the other categories, and particularly for data to be
|
|||
|
processed by mail-based uses of application programs. This is
|
|||
|
information which must be processed by an application before it is
|
|||
|
viewable or usable to a user. Expected uses for Content-Type
|
|||
|
application include mail-based file transfer, spreadsheets, data for
|
|||
|
mail-based scheduling systems, and languages for "active"
|
|||
|
(computational) email. (The latter, in particular, can pose security
|
|||
|
problems which must be understood by implementors, and are considered
|
|||
|
in detail in the discussion of the application/PostScript content-
|
|||
|
type.)
|
|||
|
|
|||
|
For example, a meeting scheduler might define a standard
|
|||
|
representation for information about proposed meeting dates. An
|
|||
|
intelligent user agent would use this information to conduct a dialog
|
|||
|
with the user, and might then send further mail based on that dialog.
|
|||
|
More generally, there have been several "active" messaging languages
|
|||
|
developed in which programs in a suitably specialized language are
|
|||
|
sent through the mail and automatically run in the recipient's
|
|||
|
environment.
|
|||
|
|
|||
|
Such applications may be defined as subtypes of the "application"
|
|||
|
Content-Type. This document defines two subtypes: octet-stream, and
|
|||
|
PostScript.
|
|||
|
|
|||
|
In general, the subtype of application will often be the name of the
|
|||
|
application for which the data are intended. This does not mean,
|
|||
|
however, that any application program name may be used freely as a
|
|||
|
subtype of application. Such usages (other than subtypes beginning
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 49]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
with "x-") must be registered with IANA, as described in Appendix E.
|
|||
|
|
|||
|
7.4.1. The Application/Octet-Stream (primary) subtype
|
|||
|
|
|||
|
The primary subtype of application, "octet-stream", may be used to
|
|||
|
indicate that a body contains binary data. The set of possible
|
|||
|
parameters includes, but is not limited to:
|
|||
|
|
|||
|
TYPE -- the general type or category of binary data. This is
|
|||
|
intended as information for the human recipient rather than for
|
|||
|
any automatic processing.
|
|||
|
|
|||
|
PADDING -- the number of bits of padding that were appended to the
|
|||
|
bit-stream comprising the actual contents to produce the enclosed
|
|||
|
byte-oriented data. This is useful for enclosing a bit-stream in
|
|||
|
a body when the total number of bits is not a multiple of the byte
|
|||
|
size.
|
|||
|
|
|||
|
An additional parameter, "conversions", was defined in [RFC-1341] but
|
|||
|
has been removed.
|
|||
|
|
|||
|
RFC 1341 also defined the use of a "NAME" parameter which gave a
|
|||
|
suggested file name to be used if the data were to be written to a
|
|||
|
file. This has been deprecated in anticipation of a separate
|
|||
|
Content-Disposition header field, to be defined in a subsequent RFC.
|
|||
|
|
|||
|
The recommended action for an implementation that receives
|
|||
|
application/octet-stream mail is to simply offer to put the data in a
|
|||
|
file, with any Content-Transfer-Encoding undone, or perhaps to use it
|
|||
|
as input to a user-specified process.
|
|||
|
|
|||
|
To reduce the danger of transmitting rogue programs through the mail,
|
|||
|
it is strongly recommended that implementations NOT implement a
|
|||
|
path-search mechanism whereby an arbitrary program named in the
|
|||
|
Content-Type parameter (e.g., an "interpreter=" parameter) is found
|
|||
|
and executed using the mail body as input.
|
|||
|
|
|||
|
7.4.2. The Application/PostScript subtype
|
|||
|
|
|||
|
A Content-Type of "application/postscript" indicates a PostScript
|
|||
|
program. Currently two variants of the PostScript language are
|
|||
|
allowed; the original level 1 variant is described in [POSTSCRIPT]
|
|||
|
and the more recent level 2 variant is described in [POSTSCRIPT2].
|
|||
|
|
|||
|
PostScript is a registered trademark of Adobe Systems, Inc. Use of
|
|||
|
the MIME content-type "application/postscript" implies recognition of
|
|||
|
that trademark and all the rights it entails.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 50]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
The PostScript language definition provides facilities for internal
|
|||
|
labeling of the specific language features a given program uses. This
|
|||
|
labeling, called the PostScript document structuring conventions, is
|
|||
|
very general and provides substantially more information than just
|
|||
|
the language level.
|
|||
|
|
|||
|
The use of document structuring conventions, while not required, is
|
|||
|
strongly recommended as an aid to interoperability. Documents which
|
|||
|
lack proper structuring conventions cannot be tested to see whether
|
|||
|
or not they will work in a given environment. As such, some systems
|
|||
|
may assume the worst and refuse to process unstructured documents.
|
|||
|
|
|||
|
The execution of general-purpose PostScript interpreters entails
|
|||
|
serious security risks, and implementors are discouraged from simply
|
|||
|
sending PostScript email bodies to "off-the-shelf" interpreters.
|
|||
|
While it is usually safe to send PostScript to a printer, where the
|
|||
|
potential for harm is greatly constrained, implementors should
|
|||
|
consider all of the following before they add interactive display of
|
|||
|
PostScript bodies to their mail readers.
|
|||
|
|
|||
|
The remainder of this section outlines some, though probably not all,
|
|||
|
of the possible problems with sending PostScript through the mail.
|
|||
|
|
|||
|
Dangerous operations in the PostScript language include, but may not
|
|||
|
be limited to, the PostScript operators deletefile, renamefile,
|
|||
|
filenameforall, and file. File is only dangerous when applied to
|
|||
|
something other than standard input or output. Implementations may
|
|||
|
also define additional nonstandard file operators; these may also
|
|||
|
pose a threat to security. Filenameforall, the wildcard file search
|
|||
|
operator, may appear at first glance to be harmless. Note, however,
|
|||
|
that this operator has the potential to reveal information about what
|
|||
|
files the recipient has access to, and this information may itself be
|
|||
|
sensitive. Message senders should avoid the use of potentially
|
|||
|
dangerous file operators, since these operators are quite likely to
|
|||
|
be unavailable in secure PostScript implementations. Message-
|
|||
|
receiving and -displaying software should either completely disable
|
|||
|
all potentially dangerous file operators or take special care not to
|
|||
|
delegate any special authority to their operation. These operators
|
|||
|
should be viewed as being done by an outside agency when interpreting
|
|||
|
PostScript documents. Such disabling and/or checking should be done
|
|||
|
completely outside of the reach of the PostScript language itself;
|
|||
|
care should be taken to insure that no method exists for re-enabling
|
|||
|
full-function versions of these operators.
|
|||
|
|
|||
|
The PostScript language provides facilities for exiting the normal
|
|||
|
interpreter, or server, loop. Changes made in this "outer"
|
|||
|
environment are customarily retained across documents, and may in
|
|||
|
some cases be retained semipermanently in nonvolatile memory. The
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 51]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
operators associated with exiting the interpreter loop have the
|
|||
|
potential to interfere with subsequent document processing. As such,
|
|||
|
their unrestrained use constitutes a threat of service denial.
|
|||
|
PostScript operators that exit the interpreter loop include, but may
|
|||
|
not be limited to, the exitserver and startjob operators. Message-
|
|||
|
sending software should not generate PostScript that depends on
|
|||
|
exiting the interpreter loop to operate. The ability to exit will
|
|||
|
probably be unavailable in secure PostScript implementations.
|
|||
|
Message-receiving and -displaying software should, if possible,
|
|||
|
disable the ability to make retained changes to the PostScript
|
|||
|
environment, and eliminate the startjob and exitserver commands. If
|
|||
|
these commands cannot be eliminated, the password associated with
|
|||
|
them should at least be set to a hard-to-guess value.
|
|||
|
|
|||
|
PostScript provides operators for setting system-wide and device-
|
|||
|
specific parameters. These parameter settings may be retained across
|
|||
|
jobs and may potentially pose a threat to the correct operation of
|
|||
|
the interpreter. The PostScript operators that set system and device
|
|||
|
parameters include, but may not be limited to, the setsystemparams
|
|||
|
and setdevparams operators. Message-sending software should not
|
|||
|
generate PostScript that depends on the setting of system or device
|
|||
|
parameters to operate correctly. The ability to set these parameters
|
|||
|
will probably be unavailable in secure PostScript implementations.
|
|||
|
Message-receiving and -displaying software should, if possible,
|
|||
|
disable the ability to change system and device parameters. If these
|
|||
|
operators cannot be disabled, the password associated with them
|
|||
|
should at least be set to a hard-to-guess value.
|
|||
|
|
|||
|
Some PostScript implementations provide nonstandard facilities for
|
|||
|
the direct loading and execution of machine code. Such facilities
|
|||
|
are quite obviously open to substantial abuse. Message-sending
|
|||
|
software should not make use of such features. Besides being totally
|
|||
|
hardware- specific, they are also likely to be unavailable in secure
|
|||
|
implementations of PostScript. Message-receiving and -displaying
|
|||
|
software should not allow such operators to be used if they exist.
|
|||
|
|
|||
|
PostScript is an extensible language, and many, if not most,
|
|||
|
implementations of it provide a number of their own extensions. This
|
|||
|
document does not deal with such extensions explicitly since they
|
|||
|
constitute an unknown factor. Message-sending software should not
|
|||
|
make use of nonstandard extensions; they are likely to be missing
|
|||
|
from some implementations. Message-receiving and -displaying software
|
|||
|
should make sure that any nonstandard PostScript operators are secure
|
|||
|
and don't present any kind of threat.
|
|||
|
|
|||
|
It is possible to write PostScript that consumes huge amounts of
|
|||
|
various system resources. It is also possible to write PostScript
|
|||
|
programs that loop infinitely. Both types of programs have the
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 52]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
potential to cause damage if sent to unsuspecting recipients.
|
|||
|
Message-sending software should avoid the construction and
|
|||
|
dissemination of such programs, which is antisocial. Message-
|
|||
|
receiving and -displaying software should provide appropriate
|
|||
|
mechanisms to abort processing of a document after a reasonable
|
|||
|
amount of time has elapsed. In addition, PostScript interpreters
|
|||
|
should be limited to the consumption of only a reasonable amount of
|
|||
|
any given system resource.
|
|||
|
|
|||
|
Finally, bugs may exist in some PostScript interpreters which could
|
|||
|
possibly be exploited to gain unauthorized access to a recipient's
|
|||
|
system. Apart from noting this possibility, there is no specific
|
|||
|
action to take to prevent this, apart from the timely correction of
|
|||
|
such bugs if any are found.
|
|||
|
|
|||
|
7.4.3. Other Application subtypes
|
|||
|
|
|||
|
It is expected that many other subtypes of application will be
|
|||
|
defined in the future. MIME implementations must generally treat any
|
|||
|
unrecognized subtypes as being equivalent to application/octet-
|
|||
|
stream.
|
|||
|
|
|||
|
The formal grammar for content-type header fields for application
|
|||
|
data is given by:
|
|||
|
|
|||
|
application-type := "application" "/" application-subtype
|
|||
|
|
|||
|
application-subtype := ("octet-stream" *stream-param)
|
|||
|
/ "postscript" / extension-token
|
|||
|
|
|||
|
stream-param := (";" "type" "=" value)
|
|||
|
/ (";" "padding" "=" padding)
|
|||
|
|
|||
|
padding := "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7"
|
|||
|
|
|||
|
7.5. The Image Content-Type
|
|||
|
|
|||
|
A Content-Type of "image" indicates that the body contains an image.
|
|||
|
The subtype names the specific image format. These names are case
|
|||
|
insensitive. Two initial subtypes are "jpeg" for the JPEG format,
|
|||
|
JFIF encoding, and "gif" for GIF format [GIF].
|
|||
|
|
|||
|
The list of image subtypes given here is neither exclusive nor
|
|||
|
exhaustive, and is expected to grow as more types are registered with
|
|||
|
IANA, as described in Appendix E.
|
|||
|
|
|||
|
The formal grammar for the content-type header field for data of type
|
|||
|
image is given by:
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 53]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
image-type := "image" "/" ("gif" / "jpeg" / extension-token)
|
|||
|
|
|||
|
7.6. The Audio Content-Type
|
|||
|
|
|||
|
A Content-Type of "audio" indicates that the body contains audio
|
|||
|
data. Although there is not yet a consensus on an "ideal" audio
|
|||
|
format for use with computers, there is a pressing need for a format
|
|||
|
capable of providing interoperable behavior.
|
|||
|
|
|||
|
The initial subtype of "basic" is specified to meet this requirement
|
|||
|
by providing an absolutely minimal lowest common denominator audio
|
|||
|
format. It is expected that richer formats for higher quality and/or
|
|||
|
lower bandwidth audio will be defined by a later document.
|
|||
|
|
|||
|
The content of the "audio/basic" subtype is audio encoded using 8-bit
|
|||
|
ISDN mu-law [PCM]. When this subtype is present, a sample rate of
|
|||
|
8000 Hz and a single channel is assumed.
|
|||
|
|
|||
|
The formal grammar for the content-type header field for data of type
|
|||
|
audio is given by:
|
|||
|
|
|||
|
audio-type := "audio" "/" ("basic" / extension-token)
|
|||
|
|
|||
|
7.7. The Video Content-Type
|
|||
|
|
|||
|
A Content-Type of "video" indicates that the body contains a time-
|
|||
|
varying-picture image, possibly with color and coordinated sound.
|
|||
|
The term "video" is used extremely generically, rather than with
|
|||
|
reference to any particular technology or format, and is not meant to
|
|||
|
preclude subtypes such as animated drawings encoded compactly. The
|
|||
|
subtype "mpeg" refers to video coded according to the MPEG standard
|
|||
|
[MPEG].
|
|||
|
|
|||
|
Note that although in general this document strongly discourages the
|
|||
|
mixing of multiple media in a single body, it is recognized that many
|
|||
|
so-called "video" formats include a representation for synchronized
|
|||
|
audio, and this is explicitly permitted for subtypes of "video".
|
|||
|
|
|||
|
The formal grammar for the content-type header field for data of type
|
|||
|
video is given by:
|
|||
|
|
|||
|
video-type := "video" "/" ("mpeg" / extension-token)
|
|||
|
|
|||
|
7.8. Experimental Content-Type Values
|
|||
|
|
|||
|
A Content-Type value beginning with the characters "X-" is a private
|
|||
|
value, to be used by consenting mail systems by mutual agreement.
|
|||
|
Any format without a rigorous and public definition must be named
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 54]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
with an "X-" prefix, and publicly specified values shall never begin
|
|||
|
with "X-". (Older versions of the widely-used Andrew system use the
|
|||
|
"X-BE2" name, so new systems should probably choose a different
|
|||
|
name.)
|
|||
|
|
|||
|
In general, the use of "X-" top-level types is strongly discouraged.
|
|||
|
Implementors should invent subtypes of the existing types whenever
|
|||
|
possible. The invention of new types is intended to be restricted
|
|||
|
primarily to the development of new media types for email, such as
|
|||
|
digital odors or holography, and not for new data formats in general.
|
|||
|
In many cases, a subtype of application will be more appropriate than
|
|||
|
a new top-level type.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 55]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
8. Summary
|
|||
|
|
|||
|
Using the MIME-Version, Content-Type, and Content-Transfer-Encoding
|
|||
|
header fields, it is possible to include, in a standardized way,
|
|||
|
arbitrary types of data objects with RFC 822 conformant mail
|
|||
|
messages. No restrictions imposed by either RFC 821 or RFC 822 are
|
|||
|
violated, and care has been taken to avoid problems caused by
|
|||
|
additional restrictions imposed by the characteristics of some
|
|||
|
Internet mail transport mechanisms (see Appendix B). The "multipart"
|
|||
|
and "message" Content-Types allow mixing and hierarchical structuring
|
|||
|
of objects of different types in a single message. Further Content-
|
|||
|
Types provide a standardized mechanism for tagging messages or body
|
|||
|
parts as audio, image, or several other kinds of data. A
|
|||
|
distinguished parameter syntax allows further specification of data
|
|||
|
format details, particularly the specification of alternate character
|
|||
|
sets. Additional optional header fields provide mechanisms for
|
|||
|
certain extensions deemed desirable by many implementors. Finally, a
|
|||
|
number of useful Content-Types are defined for general use by
|
|||
|
consenting user agents, notably message/partial, and
|
|||
|
message/external-body.
|
|||
|
|
|||
|
9. Security Considerations
|
|||
|
|
|||
|
Security issues are discussed in Section 7.4.2 and in Appendix F.
|
|||
|
Implementors should pay special attention to the security
|
|||
|
implications of any mail content-types that can cause the remote
|
|||
|
execution of any actions in the recipient's environment. In such
|
|||
|
cases, the discussion of the application/postscript content-type in
|
|||
|
Section 7.4.2 may serve as a model for considering other content-
|
|||
|
types with remote execution capabilities.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 56]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
10. Authors' Addresses
|
|||
|
|
|||
|
For more information, the authors of this document may be contacted
|
|||
|
via Internet mail:
|
|||
|
|
|||
|
Nathaniel S. Borenstein
|
|||
|
MRE 2D-296, Bellcore
|
|||
|
445 South St.
|
|||
|
Morristown, NJ 07962-1910
|
|||
|
|
|||
|
Phone: +1 201 829 4270
|
|||
|
Fax: +1 201 829 7019
|
|||
|
Email: nsb@bellcore.com
|
|||
|
|
|||
|
|
|||
|
Ned Freed
|
|||
|
Innosoft International, Inc.
|
|||
|
250 West First Street
|
|||
|
Suite 240
|
|||
|
Claremont, CA 91711
|
|||
|
|
|||
|
Phone: +1 909 624 7907
|
|||
|
Fax: +1 909 621 5319
|
|||
|
Email: ned@innosoft.com
|
|||
|
|
|||
|
MIME is a result of the work of the Internet Engineering Task Force
|
|||
|
Working Group on Email Extensions. The chairman of that group, Greg
|
|||
|
Vaudreuil, may be reached at:
|
|||
|
|
|||
|
Gregory M. Vaudreuil
|
|||
|
Tigon Corporation
|
|||
|
17060 Dallas Parkway
|
|||
|
Dallas Texas, 75248
|
|||
|
|
|||
|
Phone: +1 214-733-2722
|
|||
|
EMail: gvaudre@cnri.reston.va.us
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 57]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
11. Acknowledgements
|
|||
|
|
|||
|
This document is the result of the collective effort of a large
|
|||
|
number of people, at several IETF meetings, on the IETF-SMTP and
|
|||
|
IETF-822 mailing lists, and elsewhere. Although any enumeration
|
|||
|
seems doomed to suffer from egregious omissions, the following are
|
|||
|
among the many contributors to this effort:
|
|||
|
|
|||
|
Harald Tveit Alvestrand Timo Lehtinen
|
|||
|
Randall Atkinson John R. MacMillan
|
|||
|
Philippe Brandon Rick McGowan
|
|||
|
Kevin Carosso Leo Mclaughlin
|
|||
|
Uhhyung Choi Goli Montaser-Kohsari
|
|||
|
Cristian Constantinof Keith Moore
|
|||
|
Mark Crispin Tom Moore
|
|||
|
Dave Crocker Erik Naggum
|
|||
|
Terry Crowley Mark Needleman
|
|||
|
Walt Daniels John Noerenberg
|
|||
|
Frank Dawson Mats Ohrman
|
|||
|
Hitoshi Doi Julian Onions
|
|||
|
Kevin Donnelly Michael Patton
|
|||
|
Keith Edwards David J. Pepper
|
|||
|
Chris Eich Blake C. Ramsdell
|
|||
|
Johnny Eriksson Luc Rooijakkers
|
|||
|
Craig Everhart Marshall T. Rose
|
|||
|
Patrik Faeltstroem Jonathan Rosenberg
|
|||
|
Erik E. Fair Jan Rynning
|
|||
|
Roger Fajman Harri Salminen
|
|||
|
Alain Fontaine Michael Sanderson
|
|||
|
James M. Galvin Masahiro Sekiguchi
|
|||
|
Philip Gladstone Mark Sherman
|
|||
|
Thomas Gordon Keld Simonsen
|
|||
|
Phill Gross Bob Smart
|
|||
|
James Hamilton Peter Speck
|
|||
|
Steve Hardcastle-Kille Henry Spencer
|
|||
|
David Herron Einar Stefferud
|
|||
|
Bruce Howard Michael Stein
|
|||
|
Bill Janssen Klaus Steinberger
|
|||
|
Olle Jaernefors Peter Svanberg
|
|||
|
Risto Kankkunen James Thompson
|
|||
|
Phil Karn Steve Uhler
|
|||
|
Alan Katz Stuart Vance
|
|||
|
Tim Kehres Erik van der Poel
|
|||
|
Neil Katin Guido van Rossum
|
|||
|
Kyuho Kim Peter Vanderbilt
|
|||
|
Anders Klemets Greg Vaudreuil
|
|||
|
John Klensin Ed Vielmetti
|
|||
|
Valdis Kletniek Ryan Waldron
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 58]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Jim Knowles Wally Wedel
|
|||
|
Stev Knowles Sven-Ove Westberg
|
|||
|
Bob Kummerfeld Brian Wideen
|
|||
|
Pekka Kytolaakso John Wobus
|
|||
|
Stellan Lagerstrom Glenn Wright
|
|||
|
Vincent Lau Rayan Zachariassen
|
|||
|
Donald Lindsay David Zimmerman
|
|||
|
Marc Andreessen Bob Braden
|
|||
|
Brian Capouch Peter Clitherow
|
|||
|
Dave Collier-Brown John Coonrod
|
|||
|
Stephen Crocker Jim Davis
|
|||
|
Axel Deininger Dana S Emery
|
|||
|
Martin Forssen Stephen Gildea
|
|||
|
Terry Gray Mark Horton
|
|||
|
Warner Losh Carlyn Lowery
|
|||
|
Laurence Lundblade Charles Lynn
|
|||
|
Larry Masinter Michael J. McInerny
|
|||
|
Jon Postel Christer Romson
|
|||
|
Yutaka Sato Markku Savela
|
|||
|
Richard Alan Schafer Larry W. Virden
|
|||
|
Rhys Weatherly Jay Weber
|
|||
|
Dave Wecker
|
|||
|
|
|||
|
The authors apologize for any omissions from this list, which are
|
|||
|
certainly unintentional.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 59]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Appendix A -- Minimal MIME-Conformance
|
|||
|
|
|||
|
The mechanisms described in this document are open-ended. It is
|
|||
|
definitely not expected that all implementations will support all of
|
|||
|
the Content-Types described, nor that they will all share the same
|
|||
|
extensions. In order to promote interoperability, however, it is
|
|||
|
useful to define the concept of "MIME-conformance" to define a
|
|||
|
certain level of implementation that allows the useful interworking
|
|||
|
of messages with content that differs from US ASCII text. In this
|
|||
|
section, we specify the requirements for such conformance.
|
|||
|
|
|||
|
A mail user agent that is MIME-conformant MUST:
|
|||
|
|
|||
|
1. Always generate a "MIME-Version: 1.0" header field.
|
|||
|
|
|||
|
2. Recognize the Content-Transfer-Encoding header field, and
|
|||
|
decode all received data encoded with either the quoted-printable
|
|||
|
or base64 implementations. Encode any data sent that is not in
|
|||
|
seven-bit mail-ready representation using one of these
|
|||
|
transformations and include the appropriate Content-Transfer-
|
|||
|
Encoding header field, unless the underlying transport mechanism
|
|||
|
supports non-seven-bit data, as SMTP does not.
|
|||
|
|
|||
|
3. Recognize and interpret the Content-Type header field, and
|
|||
|
avoid showing users raw data with a Content-Type field other than
|
|||
|
text. Be able to send at least text/plain messages, with the
|
|||
|
character set specified as a parameter if it is not US-ASCII.
|
|||
|
|
|||
|
4. Explicitly handle the following Content-Type values, to at
|
|||
|
least the following extents:
|
|||
|
|
|||
|
Text:
|
|||
|
|
|||
|
-- Recognize and display "text" mail
|
|||
|
with the character set "US-ASCII."
|
|||
|
|
|||
|
-- Recognize other character sets at
|
|||
|
least to the extent of being able
|
|||
|
to inform the user about what
|
|||
|
character set the message uses.
|
|||
|
|
|||
|
-- Recognize the "ISO-8859-*" character
|
|||
|
sets to the extent of being able to
|
|||
|
display those characters that are
|
|||
|
common to ISO-8859-* and US-ASCII,
|
|||
|
namely all characters represented
|
|||
|
by octet values 0-127.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 60]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
-- For unrecognized subtypes, show or
|
|||
|
offer to show the user the "raw"
|
|||
|
version of the data after
|
|||
|
conversion of the content from
|
|||
|
canonical form to local form.
|
|||
|
|
|||
|
Message:
|
|||
|
|
|||
|
-- Recognize and display at least the
|
|||
|
primary (822) encapsulation.
|
|||
|
|
|||
|
Multipart:
|
|||
|
|
|||
|
-- Recognize the primary (mixed)
|
|||
|
subtype. Display all relevant
|
|||
|
information on the message level
|
|||
|
and the body part header level and
|
|||
|
then display or offer to display
|
|||
|
each of the body parts individually.
|
|||
|
|
|||
|
-- Recognize the "alternative" subtype,
|
|||
|
and avoid showing the user
|
|||
|
redundant parts of
|
|||
|
multipart/alternative mail.
|
|||
|
|
|||
|
-- Treat any unrecognized subtypes as if
|
|||
|
they were "mixed".
|
|||
|
|
|||
|
Application:
|
|||
|
|
|||
|
-- Offer the ability to remove either of
|
|||
|
the two types of Content-Transfer-
|
|||
|
Encoding defined in this document
|
|||
|
and put the resulting information
|
|||
|
in a user file.
|
|||
|
|
|||
|
5. Upon encountering any unrecognized Content- Type, an
|
|||
|
implementation must treat it as if it had a Content-Type of
|
|||
|
"application/octet-stream" with no parameter sub-arguments. How
|
|||
|
such data are handled is up to an implementation, but likely
|
|||
|
options for handling such unrecognized data include offering the
|
|||
|
user to write it into a file (decoded from its mail transport
|
|||
|
format) or offering the user to name a program to which the
|
|||
|
decoded data should be passed as input. Unrecognized predefined
|
|||
|
types, which in a MIME-conformant mailer might still include
|
|||
|
audio, image, or video, should also be treated in this way.
|
|||
|
|
|||
|
A user agent that meets the above conditions is said to be MIME-
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 61]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
conformant. The meaning of this phrase is that it is assumed to be
|
|||
|
"safe" to send virtually any kind of properly-marked data to users of
|
|||
|
such mail systems, because such systems will at least be able to
|
|||
|
treat the data as undifferentiated binary, and will not simply splash
|
|||
|
it onto the screen of unsuspecting users. There is another sense in
|
|||
|
which it is always "safe" to send data in a format that is MIME-
|
|||
|
conformant, which is that such data will not break or be broken by
|
|||
|
any known systems that are conformant with RFC 821 and RFC 822. User
|
|||
|
agents that are MIME-conformant have the additional guarantee that
|
|||
|
the user will not be shown data that were never intended to be viewed
|
|||
|
as text.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 62]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Appendix B -- General Guidelines For Sending Email Data
|
|||
|
|
|||
|
Internet email is not a perfect, homogeneous system. Mail may become
|
|||
|
corrupted at several stages in its travel to a final destination.
|
|||
|
Specifically, email sent throughout the Internet may travel across
|
|||
|
many networking technologies. Many networking and mail technologies
|
|||
|
do not support the full functionality possible in the SMTP transport
|
|||
|
environment. Mail traversing these systems is likely to be modified
|
|||
|
in such a way that it can be transported.
|
|||
|
|
|||
|
There exist many widely-deployed non-conformant MTAs in the Internet.
|
|||
|
These MTAs, speaking the SMTP protocol, alter messages on the fly to
|
|||
|
take advantage of the internal data structure of the hosts they are
|
|||
|
implemented on, or are just plain broken.
|
|||
|
|
|||
|
The following guidelines may be useful to anyone devising a data
|
|||
|
format (Content-Type) that will survive the widest range of
|
|||
|
networking technologies and known broken MTAs unscathed. Note that
|
|||
|
anything encoded in the base64 encoding will satisfy these rules, but
|
|||
|
that some well-known mechanisms, notably the UNIX uuencode facility,
|
|||
|
will not. Note also that anything encoded in the Quoted-Printable
|
|||
|
encoding will survive most gateways intact, but possibly not some
|
|||
|
gateways to systems that use the EBCDIC character set.
|
|||
|
|
|||
|
(1) Under some circumstances the encoding used for data may change
|
|||
|
as part of normal gateway or user agent operation. In particular,
|
|||
|
conversion from base64 to quoted-printable and vice versa may be
|
|||
|
necessary. This may result in the confusion of CRLF sequences with
|
|||
|
line breaks in text bodies. As such, the persistence of CRLF as
|
|||
|
something other than a line break must not be relied on.
|
|||
|
|
|||
|
(2) Many systems may elect to represent and store text data using
|
|||
|
local newline conventions. Local newline conventions may not match
|
|||
|
the RFC822 CRLF convention -- systems are known that use plain CR,
|
|||
|
plain LF, CRLF, or counted records. The result is that isolated
|
|||
|
CR and LF characters are not well tolerated in general; they may
|
|||
|
be lost or converted to delimiters on some systems, and hence must
|
|||
|
not be relied on.
|
|||
|
|
|||
|
(3) TAB (HT) characters may be misinterpreted or may be
|
|||
|
automatically converted to variable numbers of spaces. This is
|
|||
|
unavoidable in some environments, notably those not based on the
|
|||
|
ASCII character set. Such conversion is STRONGLY DISCOURAGED, but
|
|||
|
it may occur, and mail formats must not rely on the persistence of
|
|||
|
TAB (HT) characters.
|
|||
|
|
|||
|
(4) Lines longer than 76 characters may be wrapped or truncated in
|
|||
|
some environments. Line wrapping and line truncation are STRONGLY
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 63]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
DISCOURAGED, but unavoidable in some cases. Applications which
|
|||
|
require long lines must somehow differentiate between soft and
|
|||
|
hard line breaks. (A simple way to do this is to use the quoted-
|
|||
|
printable encoding.)
|
|||
|
|
|||
|
(5) Trailing "white space" characters (SPACE, TAB (HT)) on a line
|
|||
|
may be discarded by some transport agents, while other transport
|
|||
|
agents may pad lines with these characters so that all lines in a
|
|||
|
mail file are of equal length. The persistence of trailing white
|
|||
|
space, therefore, must not be relied on.
|
|||
|
|
|||
|
(6) Many mail domains use variations on the ASCII character set,
|
|||
|
or use character sets such as EBCDIC which contain most but not
|
|||
|
all of the US-ASCII characters. The correct translation of
|
|||
|
characters not in the "invariant" set cannot be depended on across
|
|||
|
character converting gateways. For example, this situation is a
|
|||
|
problem when sending uuencoded information across BITNET, an
|
|||
|
EBCDIC system. Similar problems can occur without crossing a
|
|||
|
gateway, since many Internet hosts use character sets other than
|
|||
|
ASCII internally. The definition of Printable Strings in X.400
|
|||
|
adds further restrictions in certain special cases. In
|
|||
|
particular, the only characters that are known to be consistent
|
|||
|
across all gateways are the 73 characters that correspond to the
|
|||
|
upper and lower case letters A-Z and a-z, the 10 digits 0-9, and
|
|||
|
the following eleven special characters:
|
|||
|
|
|||
|
"'" (ASCII code 39)
|
|||
|
"(" (ASCII code 40)
|
|||
|
")" (ASCII code 41)
|
|||
|
"+" (ASCII code 43)
|
|||
|
"," (ASCII code 44)
|
|||
|
"-" (ASCII code 45)
|
|||
|
"." (ASCII code 46)
|
|||
|
"/" (ASCII code 47)
|
|||
|
":" (ASCII code 58)
|
|||
|
"=" (ASCII code 61)
|
|||
|
"?" (ASCII code 63)
|
|||
|
|
|||
|
A maximally portable mail representation, such as the base64
|
|||
|
encoding, will confine itself to relatively short lines of text in
|
|||
|
which the only meaningful characters are taken from this set of 73
|
|||
|
characters.
|
|||
|
|
|||
|
(7) Some mail transport agents will corrupt data that includes
|
|||
|
certain literal strings. In particular, a period (".") alone on a
|
|||
|
line is known to be corrupted by some (incorrect) SMTP
|
|||
|
implementations, and a line that starts with the five characters
|
|||
|
"From " (the fifth character is a SPACE) are commonly corrupted as
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 64]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
well. A careful composition agent can prevent these corruptions
|
|||
|
by encoding the data (e.g., in the quoted-printable encoding,
|
|||
|
"=46rom " in place of "From " at the start of a line, and "=2E" in
|
|||
|
place of "." alone on a line.
|
|||
|
|
|||
|
Please note that the above list is NOT a list of recommended
|
|||
|
practices for MTAs. RFC 821 MTAs are prohibited from altering the
|
|||
|
character of white space or wrapping long lines. These BAD and
|
|||
|
illegal practices are known to occur on established networks, and
|
|||
|
implementations should be robust in dealing with the bad effects they
|
|||
|
can cause.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 65]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Appendix C -- A Complex Multipart Example
|
|||
|
|
|||
|
What follows is the outline of a complex multipart message. This
|
|||
|
message has five parts to be displayed serially: two introductory
|
|||
|
plain text parts, an embedded multipart message, a richtext part, and
|
|||
|
a closing encapsulated text message in a non-ASCII character set.
|
|||
|
The embedded multipart message has two parts to be displayed in
|
|||
|
parallel, a picture and an audio fragment.
|
|||
|
|
|||
|
MIME-Version: 1.0
|
|||
|
From: Nathaniel Borenstein <nsb@bellcore.com>
|
|||
|
To: Ned Freed <ned@innosoft.com>
|
|||
|
Subject: A multipart example
|
|||
|
Content-Type: multipart/mixed;
|
|||
|
boundary=unique-boundary-1
|
|||
|
|
|||
|
This is the preamble area of a multipart message.
|
|||
|
Mail readers that understand multipart format
|
|||
|
should ignore this preamble.
|
|||
|
If you are reading this text, you might want to
|
|||
|
consider changing to a mail reader that understands
|
|||
|
how to properly display multipart messages.
|
|||
|
--unique-boundary-1
|
|||
|
|
|||
|
...Some text appears here...
|
|||
|
[Note that the preceding blank line means
|
|||
|
no header fields were given and this is text,
|
|||
|
with charset US ASCII. It could have been
|
|||
|
done with explicit typing as in the next part.]
|
|||
|
|
|||
|
--unique-boundary-1
|
|||
|
Content-type: text/plain; charset=US-ASCII
|
|||
|
|
|||
|
This could have been part of the previous part,
|
|||
|
but illustrates explicit versus implicit
|
|||
|
typing of body parts.
|
|||
|
|
|||
|
--unique-boundary-1
|
|||
|
Content-Type: multipart/parallel;
|
|||
|
boundary=unique-boundary-2
|
|||
|
|
|||
|
|
|||
|
--unique-boundary-2
|
|||
|
Content-Type: audio/basic
|
|||
|
Content-Transfer-Encoding: base64
|
|||
|
|
|||
|
... base64-encoded 8000 Hz single-channel
|
|||
|
mu-law-format audio data goes here....
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 66]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
--unique-boundary-2
|
|||
|
Content-Type: image/gif
|
|||
|
Content-Transfer-Encoding: base64
|
|||
|
|
|||
|
... base64-encoded image data goes here....
|
|||
|
|
|||
|
--unique-boundary-2--
|
|||
|
|
|||
|
--unique-boundary-1
|
|||
|
Content-type: text/richtext
|
|||
|
|
|||
|
This is <bold><italic>richtext.</italic></bold>
|
|||
|
<smaller>as defined in RFC 1341</smaller>
|
|||
|
<nl><nl>Isn't it
|
|||
|
<bigger><bigger>cool?</bigger></bigger>
|
|||
|
|
|||
|
--unique-boundary-1
|
|||
|
Content-Type: message/rfc822
|
|||
|
|
|||
|
From: (mailbox in US-ASCII)
|
|||
|
To: (address in US-ASCII)
|
|||
|
Subject: (subject in US-ASCII)
|
|||
|
Content-Type: Text/plain; charset=ISO-8859-1
|
|||
|
Content-Transfer-Encoding: Quoted-printable
|
|||
|
|
|||
|
... Additional text in ISO-8859-1 goes here ...
|
|||
|
|
|||
|
--unique-boundary-1--
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 67]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Appendix D -- Collected Grammar
|
|||
|
|
|||
|
This appendix contains the complete BNF grammar for all the syntax
|
|||
|
specified by this document.
|
|||
|
|
|||
|
By itself, however, this grammar is incomplete. It refers to several
|
|||
|
entities that are defined by RFC 822. Rather than reproduce those
|
|||
|
definitions here, and risk unintentional differences between the two,
|
|||
|
this document simply refers the reader to RFC 822 for the remaining
|
|||
|
definitions. Wherever a term is undefined, it refers to the RFC 822
|
|||
|
definition.
|
|||
|
|
|||
|
application-subtype := ("octet-stream" *stream-param)
|
|||
|
/ "postscript" / extension-token
|
|||
|
|
|||
|
application-type := "application" "/" application-subtype
|
|||
|
|
|||
|
attribute := token ; case-insensitive
|
|||
|
|
|||
|
atype := "ftp" / "anon-ftp" / "tftp" / "local-file"
|
|||
|
/ "afs" / "mail-server" / extension-token
|
|||
|
; Case-insensitive
|
|||
|
|
|||
|
audio-type := "audio" "/" ("basic" / extension-token)
|
|||
|
|
|||
|
body-part := <"message" as defined in RFC 822,
|
|||
|
with all header fields optional, and with the
|
|||
|
specified delimiter not occurring anywhere in
|
|||
|
the message body, either on a line by itself
|
|||
|
or as a substring anywhere.>
|
|||
|
|
|||
|
NOTE: In certain transport enclaves, RFC 822 restrictions such as
|
|||
|
the one that limits bodies to printable ASCII characters may not
|
|||
|
be in force. (That is, the transport domains may resemble
|
|||
|
standard Internet mail transport as specified in RFC821 and
|
|||
|
assumed by RFC822, but without certain restrictions.) The
|
|||
|
relaxation of these restrictions should be construed as locally
|
|||
|
extending the definition of bodies, for example to include octets
|
|||
|
outside of the ASCII range, as long as these extensions are
|
|||
|
supported by the transport and adequately documented in the
|
|||
|
Content-Transfer-Encoding header field. However, in no event are
|
|||
|
headers (either message headers or body-part headers) allowed to
|
|||
|
contain anything other than ASCII characters.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 68]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
boundary := 0*69<bchars> bcharsnospace
|
|||
|
|
|||
|
bchars := bcharsnospace / " "
|
|||
|
|
|||
|
bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_"
|
|||
|
/ "," / "-" / "." / "/" / ":" / "=" / "?"
|
|||
|
|
|||
|
charset := "us-ascii" / "iso-8859-1" / "iso-8859-2"/ "iso-8859-3"
|
|||
|
/ "iso-8859-4" / "iso-8859-5" / "iso-8859-6" / "iso-8859-7"
|
|||
|
/ "iso-8859-8" / "iso-8859-9" / extension-token
|
|||
|
; case insensitive
|
|||
|
|
|||
|
close-delimiter := "--" boundary "--" CRLF;Again,no space by "--",
|
|||
|
|
|||
|
content := "Content-Type" ":" type "/" subtype *(";" parameter)
|
|||
|
; case-insensitive matching of type and subtype
|
|||
|
|
|||
|
delimiter := "--" boundary CRLF ;taken from Content-Type field.
|
|||
|
; There must be no space
|
|||
|
; between "--" and boundary.
|
|||
|
|
|||
|
description := "Content-Description" ":" *text
|
|||
|
|
|||
|
discard-text := *(*text CRLF)
|
|||
|
|
|||
|
encapsulation := delimiter body-part CRLF
|
|||
|
|
|||
|
encoding := "Content-Transfer-Encoding" ":" mechanism
|
|||
|
|
|||
|
epilogue := discard-text ; to be ignored upon receipt.
|
|||
|
|
|||
|
extension-token := x-token / iana-token
|
|||
|
|
|||
|
external-param := (";" "access-type" "=" atype)
|
|||
|
/ (";" "expiration" "=" date-time)
|
|||
|
|
|||
|
; Note that date-time is quoted
|
|||
|
/ (";" "size" "=" 1*DIGIT)
|
|||
|
/ (";" "permission" "=" ("read" / "read-write"))
|
|||
|
; Permission is case-insensitive
|
|||
|
/ (";" "name" "=" value)
|
|||
|
/ (";" "site" "=" value)
|
|||
|
/ (";" "dir" "=" value)
|
|||
|
/ (";" "mode" "=" value)
|
|||
|
/ (";" "server" "=" value)
|
|||
|
/ (";" "subject" "=" value)
|
|||
|
;access-type required; others required based on access-type
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 69]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
iana-token := <a publicly-defined extension token,
|
|||
|
registered with IANA, as specified in
|
|||
|
appendix E>
|
|||
|
|
|||
|
id := "Content-ID" ":" msg-id
|
|||
|
|
|||
|
image-type := "image" "/" ("gif" / "jpeg" / extension-token)
|
|||
|
|
|||
|
mechanism := "7bit" ; case-insensitive
|
|||
|
/ "quoted-printable"
|
|||
|
/ "base64"
|
|||
|
/ "8bit"
|
|||
|
/ "binary"
|
|||
|
/ x-token
|
|||
|
|
|||
|
message-subtype := "rfc822"
|
|||
|
/ "partial" 2#3partial-param
|
|||
|
/ "external-body" 1*external-param
|
|||
|
/ extension-token
|
|||
|
|
|||
|
message-type := "message" "/" message-subtype
|
|||
|
|
|||
|
multipart-body :=preamble 1*encapsulation close-delimiter epilogue
|
|||
|
|
|||
|
multipart-subtype := "mixed" / "parallel" / "digest"
|
|||
|
/ "alternative" / extension-token
|
|||
|
|
|||
|
multipart-type := "multipart" "/" multipart-subtype
|
|||
|
";" "boundary" "=" boundary
|
|||
|
|
|||
|
octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F")
|
|||
|
; octet must be used for characters > 127, =, SPACE, or
|
|||
|
TAB,
|
|||
|
; and is recommended for any characters not listed in
|
|||
|
; Appendix B as "mail-safe".
|
|||
|
|
|||
|
padding := "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7"
|
|||
|
|
|||
|
parameter := attribute "=" value
|
|||
|
|
|||
|
partial-param := (";" "id" "=" value)
|
|||
|
/ (";" "number" "=" 1*DIGIT)
|
|||
|
/ (";" "total" "=" 1*DIGIT)
|
|||
|
; id & number required;total required for last part
|
|||
|
|
|||
|
preamble := discard-text ; to be ignored upon receipt.
|
|||
|
|
|||
|
ptext := octet / <any ASCII character except "=", SPACE, or TAB>
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 70]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
; characters not listed as "mail-safe" in Appendix B
|
|||
|
; are also not recommended.
|
|||
|
|
|||
|
quoted-printable := ([*(ptext / SPACE / TAB) ptext] ["="] CRLF)
|
|||
|
; Maximum line length of 76 characters excluding CRLF
|
|||
|
|
|||
|
stream-param := (";" "type" "=" value)
|
|||
|
/ (";" "padding" "=" padding)
|
|||
|
|
|||
|
subtype := token ; case-insensitive
|
|||
|
|
|||
|
text-subtype := "plain" / extension-token
|
|||
|
|
|||
|
text-type := "text" "/" text-subtype [";" "charset" "=" charset]
|
|||
|
|
|||
|
token := 1*<any (ASCII) CHAR except SPACE, CTLs, or tspecials>
|
|||
|
|
|||
|
tspecials := "(" / ")" / "<" / ">" / "@"
|
|||
|
/ "," / ";" / ":" / "\" / <">
|
|||
|
/ "/" / "[" / "]" / "?" / "="
|
|||
|
; Must be in quoted-string,
|
|||
|
; to use within parameter values
|
|||
|
|
|||
|
|
|||
|
type := "application" / "audio" ; case-insensitive
|
|||
|
/ "image" / "message"
|
|||
|
/ "multipart" / "text"
|
|||
|
/ "video" / extension-token
|
|||
|
; All values case-insensitive
|
|||
|
|
|||
|
value := token / quoted-string
|
|||
|
|
|||
|
version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT
|
|||
|
|
|||
|
video-type := "video" "/" ("mpeg" / extension-token)
|
|||
|
|
|||
|
x-token := <The two characters "X-" or "x-" followed, with no
|
|||
|
intervening white space, by any token>
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 71]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Appendix E -- IANA Registration Procedures
|
|||
|
|
|||
|
MIME has been carefully designed to have extensible mechanisms, and
|
|||
|
it is expected that the set of content-type/subtype pairs and their
|
|||
|
associated parameters will grow significantly with time. Several
|
|||
|
other MIME fields, notably character set names, access-type
|
|||
|
parameters for the message/external-body type, and possibly even
|
|||
|
Content-Transfer-Encoding values, are likely to have new values
|
|||
|
defined over time. In order to ensure that the set of such values is
|
|||
|
developed in an orderly, well-specified, and public manner, MIME
|
|||
|
defines a registration process which uses the Internet Assigned
|
|||
|
Numbers Authority (IANA) as a central registry for such values.
|
|||
|
|
|||
|
In general, parameters in the content-type header field are used to
|
|||
|
convey supplemental information for various content types, and their
|
|||
|
use is defined when the content-type and subtype are defined. New
|
|||
|
parameters should not be defined as a way to introduce new
|
|||
|
functionality.
|
|||
|
|
|||
|
In order to simplify and standardize the registration process, this
|
|||
|
appendix gives templates for the registration of new values with
|
|||
|
IANA. Each of these is given in the form of an email message
|
|||
|
template, to be filled in by the registering party.
|
|||
|
|
|||
|
E.1 Registration of New Content-type/subtype Values
|
|||
|
|
|||
|
Note that MIME is generally expected to be extended by subtypes. If
|
|||
|
a new fundamental top-level type is needed, its specification must be
|
|||
|
published as an RFC or submitted in a form suitable to become an RFC,
|
|||
|
and be subject to the Internet standards process.
|
|||
|
|
|||
|
To: IANA@isi.edu
|
|||
|
Subject: Registration of new MIME
|
|||
|
content-type/subtype
|
|||
|
|
|||
|
MIME type name:
|
|||
|
|
|||
|
(If the above is not an existing top-level MIME type,
|
|||
|
please explain why an existing type cannot be used.)
|
|||
|
|
|||
|
MIME subtype name:
|
|||
|
|
|||
|
Required parameters:
|
|||
|
|
|||
|
Optional parameters:
|
|||
|
|
|||
|
Encoding considerations:
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 72]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Security considerations:
|
|||
|
|
|||
|
Published specification:
|
|||
|
|
|||
|
(The published specification must be an Internet RFC or
|
|||
|
RFC-to-be if a new top-level type is being defined, and
|
|||
|
must be a publicly available specification in any
|
|||
|
case.)
|
|||
|
|
|||
|
Person & email address to contact for further information:
|
|||
|
|
|||
|
E.2 Registration of New Access-type Values
|
|||
|
for Message/external-body
|
|||
|
|
|||
|
To: IANA@isi.edu
|
|||
|
Subject: Registration of new MIME Access-type for
|
|||
|
Message/external-body content-type
|
|||
|
|
|||
|
MIME access-type name:
|
|||
|
|
|||
|
Required parameters:
|
|||
|
|
|||
|
Optional parameters:
|
|||
|
|
|||
|
Published specification:
|
|||
|
|
|||
|
(The published specification must be an Internet RFC or
|
|||
|
RFC-to-be.)
|
|||
|
|
|||
|
Person & email address to contact for further information:
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 73]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Appendix F -- Summary of the Seven Content-types
|
|||
|
|
|||
|
Content-type: text
|
|||
|
|
|||
|
Subtypes defined by this document: plain
|
|||
|
|
|||
|
Important Parameters: charset
|
|||
|
|
|||
|
Encoding notes: quoted-printable generally preferred if an encoding
|
|||
|
is needed and the character set is mostly an ASCII superset.
|
|||
|
|
|||
|
Security considerations: Rich text formats such as TeX and Troff
|
|||
|
often contain mechanisms for executing arbitrary commands or file
|
|||
|
system operations, and should not be used automatically unless
|
|||
|
these security problems have been addressed. Even plain text may
|
|||
|
contain control characters that can be used to exploit the
|
|||
|
capabilities of "intelligent" terminals and cause security
|
|||
|
violations. User interfaces designed to run on such terminals
|
|||
|
should be aware of and try to prevent such problems.
|
|||
|
|
|||
|
________________________________________________________
|
|||
|
Content-type: multipart
|
|||
|
|
|||
|
Subtypes defined by this document: mixed, alternative,
|
|||
|
digest, parallel.
|
|||
|
|
|||
|
Important Parameters: boundary
|
|||
|
|
|||
|
Encoding notes: No content-transfer-encoding is permitted.
|
|||
|
|
|||
|
________________________________________________________
|
|||
|
Content-type: message
|
|||
|
|
|||
|
Subtypes defined by this document: rfc822, partial, external-body
|
|||
|
|
|||
|
Important Parameters: id, number, total, access-type, expiration,
|
|||
|
size, permission, name, site, directory, mode, server, subject
|
|||
|
|
|||
|
Encoding notes: No content-transfer-encoding is permitted.
|
|||
|
Specifically, only "7bit" is permitted for "message/partial" or
|
|||
|
"message/external-body", and only "7bit", "8bit", or "binary" are
|
|||
|
permitted for other subtypes of "message".
|
|||
|
______________________________________________________________
|
|||
|
Content-type: application
|
|||
|
|
|||
|
Subtypes defined by this document: octet-stream, postscript
|
|||
|
|
|||
|
Important Parameters: type, padding
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 74]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Deprecated Parameters: name and conversions were
|
|||
|
defined in RFC 1341.
|
|||
|
|
|||
|
Encoding notes: base64 preferred for unreadable subtypes.
|
|||
|
|
|||
|
Security considerations: This type is intended for the
|
|||
|
transmission of data to be interpreted by locally-installed
|
|||
|
programs. If used, for example, to transmit executable
|
|||
|
binary programs or programs in general-purpose interpreted
|
|||
|
languages, such as LISP programs or shell scripts, severe
|
|||
|
security problems could result. Authors of mail-reading
|
|||
|
agents are cautioned against giving their systems the power
|
|||
|
to execute mail-based application data without carefully
|
|||
|
considering the security implications. While it is
|
|||
|
certainly possible to define safe application formats and
|
|||
|
even safe interpreters for unsafe formats, each interpreter
|
|||
|
should be evaluated separately for possible security
|
|||
|
problems.
|
|||
|
________________________________________________________________
|
|||
|
Content-type: image
|
|||
|
|
|||
|
Subtypes defined by this document: jpeg, gif
|
|||
|
|
|||
|
Important Parameters: none
|
|||
|
|
|||
|
Encoding notes: base64 generally preferred
|
|||
|
________________________________________________________________
|
|||
|
Content-type: audio
|
|||
|
|
|||
|
Subtypes defined by this document: basic
|
|||
|
|
|||
|
Important Parameters: none
|
|||
|
|
|||
|
Encoding notes: base64 generally preferred
|
|||
|
________________________________________________________________
|
|||
|
Content-type: video
|
|||
|
|
|||
|
Subtypes defined by this document: mpeg
|
|||
|
|
|||
|
Important Parameters: none
|
|||
|
|
|||
|
Encoding notes: base64 generally preferred
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 75]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Appendix G -- Canonical Encoding Model
|
|||
|
|
|||
|
There was some confusion, in earlier drafts of this memo, regarding
|
|||
|
the model for when email data was to be converted to canonical form
|
|||
|
and encoded, and in particular how this process would affect the
|
|||
|
treatment of CRLFs, given that the representation of newlines varies
|
|||
|
greatly from system to system. For this reason, a canonical model
|
|||
|
for encoding is presented below.
|
|||
|
|
|||
|
The process of composing a MIME entity can be modeled as being done
|
|||
|
in a number of steps. Note that these steps are roughly similar to
|
|||
|
those steps used in RFC 1421 and are performed for each 'innermost
|
|||
|
level' body:
|
|||
|
|
|||
|
Step 1. Creation of local form.
|
|||
|
|
|||
|
The body to be transmitted is created in the system's native format.
|
|||
|
The native character set is used, and where appropriate local end of
|
|||
|
line conventions are used as well. The body may be a UNIX-style text
|
|||
|
file, or a Sun raster image, or a VMS indexed file, or audio data in
|
|||
|
a system-dependent format stored only in memory, or anything else
|
|||
|
that corresponds to the local model for the representation of some
|
|||
|
form of information. Fundamentally, the data is created in the
|
|||
|
"native" form specified by the type/subtype information.
|
|||
|
|
|||
|
Step 2. Conversion to canonical form.
|
|||
|
|
|||
|
The entire body, including "out-of-band" information such as record
|
|||
|
lengths and possibly file attribute information, is converted to a
|
|||
|
universal canonical form. The specific content type of the body as
|
|||
|
well as its associated attributes dictate the nature of the canonical
|
|||
|
form that is used. Conversion to the proper canonical form may
|
|||
|
involve character set conversion, transformation of audio data,
|
|||
|
compression, or various other operations specific to the various
|
|||
|
content types. If character set conversion is involved, however,
|
|||
|
care must be taken to understand the semantics of the content-type,
|
|||
|
which may have strong implications for any character set conversion,
|
|||
|
e.g. with regard to syntactically meaningful characters in a text
|
|||
|
subtype other than "plain".
|
|||
|
|
|||
|
For example, in the case of text/plain data, the text must be
|
|||
|
converted to a supported character set and lines must be delimited
|
|||
|
with CRLF delimiters in accordance with RFC822. Note that the
|
|||
|
restriction on line lengths implied by RFC822 is eliminated if the
|
|||
|
next step employs either quoted-printable or base64 encoding.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 76]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Step 3. Apply transfer encoding.
|
|||
|
|
|||
|
A Content-Transfer-Encoding appropriate for this body is applied.
|
|||
|
Note that there is no fixed relationship between the content type and
|
|||
|
the transfer encoding. In particular, it may be appropriate to base
|
|||
|
the choice of base64 or quoted-printable on character frequency
|
|||
|
counts which are specific to a given instance of a body.
|
|||
|
|
|||
|
Step 4. Insertion into entity.
|
|||
|
|
|||
|
The encoded object is inserted into a MIME entity with appropriate
|
|||
|
headers. The entity is then inserted into the body of a higher-level
|
|||
|
entity (message or multipart) if needed.
|
|||
|
|
|||
|
It is vital to note that these steps are only a model; they are
|
|||
|
specifically NOT a blueprint for how an actual system would be built.
|
|||
|
In particular, the model fails to account for two common designs:
|
|||
|
|
|||
|
1. In many cases the conversion to a canonical form prior to
|
|||
|
encoding will be subsumed into the encoder itself, which
|
|||
|
understands local formats directly. For example, the local
|
|||
|
newline convention for text bodies might be carried through to the
|
|||
|
encoder itself along with knowledge of what that format is.
|
|||
|
|
|||
|
2. The output of the encoders may have to pass through one or
|
|||
|
more additional steps prior to being transmitted as a message. As
|
|||
|
such, the output of the encoder may not be conformant with the
|
|||
|
formats specified by RFC822. In particular, once again it may be
|
|||
|
appropriate for the converter's output to be expressed using local
|
|||
|
newline conventions rather than using the standard RFC822 CRLF
|
|||
|
delimiters.
|
|||
|
|
|||
|
Other implementation variations are conceivable as well. The vital
|
|||
|
aspect of this discussion is that, in spite of any optimizations,
|
|||
|
collapsings of required steps, or insertion of additional processing,
|
|||
|
the resulting messages must be consistent with those produced by the
|
|||
|
model described here. For example, a message with the following
|
|||
|
header fields:
|
|||
|
|
|||
|
Content-type: text/foo; charset=bar
|
|||
|
Content-Transfer-Encoding: base64
|
|||
|
|
|||
|
must be first represented in the text/foo form, then (if necessary)
|
|||
|
represented in the "bar" character set, and finally transformed via
|
|||
|
the base64 algorithm into a mail-safe form.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 77]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
Appendix H -- Changes from RFC 1341
|
|||
|
|
|||
|
This document is a relatively minor revision of RFC 1341. For
|
|||
|
the convenience of those familiar with RFC 1341, the technical
|
|||
|
changes from that document are summarized in this appendix.
|
|||
|
|
|||
|
1. The definition of "tspecials" has been changed to no longer
|
|||
|
include ".".
|
|||
|
|
|||
|
2. The Content-ID field is now mandatory for message/external-body
|
|||
|
parts.
|
|||
|
|
|||
|
3. The text/richtext type (including the old Section 7.1.3 and
|
|||
|
Appendix D) has been moved to a separate document.
|
|||
|
|
|||
|
4. The rules on header merging for message/partial data have been
|
|||
|
changed to treat the Encrypted and MIME-Version headers as special
|
|||
|
cases.
|
|||
|
|
|||
|
5. The definition of the external-body access-type parameter has
|
|||
|
been changed so that it can only indicate a single access method
|
|||
|
(which was all that made sense).
|
|||
|
|
|||
|
6. There is a new "Subject" parameter for message/external-body,
|
|||
|
access-type mail-server, to permit MIME-based use of mail servers
|
|||
|
that rely on Subject field information.
|
|||
|
|
|||
|
7. The "conversions" parameter for application/octet-stream has been
|
|||
|
removed.
|
|||
|
|
|||
|
8. Section 7.4.1 now deprecates the use of the "name" parameter for
|
|||
|
application/octet-stream, as this will be superseded in the future by
|
|||
|
a Content-Disposition header.
|
|||
|
|
|||
|
9. The formal grammar for multipart bodies has been changed so that
|
|||
|
a CRLF is no longer required before the first boundary line.
|
|||
|
|
|||
|
10. MIME entities of type "message/partial" and "message/external-
|
|||
|
body" are now required to use only the "7bit" transfer-encoding.
|
|||
|
(Specifically, "binary" and "8bit" are not permitted.)
|
|||
|
|
|||
|
11. The "application/oda" content-type has been removed.
|
|||
|
|
|||
|
12. A note has been added to the end of section 7.2.3, explaining
|
|||
|
the semantics of Content-ID in a multipart/alternative MIME entity.
|
|||
|
|
|||
|
13. The formal syntax for the "MIME-Version" field has been
|
|||
|
tightened, but in a way that is completely compatible with the only
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 78]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
version number defined in RFC 1341.
|
|||
|
|
|||
|
14. In Section 7.3.1, the definition of message/rfc822 has been
|
|||
|
relaxed regarding mandatory fields.
|
|||
|
|
|||
|
All other changes from RFC 1341 were editorial changes and do not
|
|||
|
affect the technical content of MIME. Considerable formal grammar
|
|||
|
has been added, but this reflects the prose specification that was
|
|||
|
already in place.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 79]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
References
|
|||
|
|
|||
|
[US-ASCII] Coded Character Set--7-Bit American Standard Code for
|
|||
|
Information Interchange, ANSI X3.4-1986.
|
|||
|
|
|||
|
[ATK] Borenstein, Nathaniel S., Multimedia Applications Development
|
|||
|
with the Andrew Toolkit, Prentice-Hall, 1990.
|
|||
|
|
|||
|
[GIF] Graphics Interchange Format (Version 89a), Compuserve, Inc.,
|
|||
|
Columbus, Ohio, 1990.
|
|||
|
|
|||
|
[ISO-2022] International Standard--Information Processing--ISO 7-bit
|
|||
|
and 8-bit coded character sets--Code extension techniques, ISO
|
|||
|
2022:1986.
|
|||
|
|
|||
|
[ISO-8859] Information Processing -- 8-bit Single-Byte Coded Graphic
|
|||
|
Character Sets -- Part 1: Latin Alphabet No. 1, ISO 8859-1:1987. Part
|
|||
|
2: Latin alphabet No. 2, ISO 8859-2, 1987. Part 3: Latin alphabet
|
|||
|
No. 3, ISO 8859-3, 1988. Part 4: Latin alphabet No. 4, ISO 8859-4,
|
|||
|
1988. Part 5: Latin/Cyrillic alphabet, ISO 8859-5, 1988. Part 6:
|
|||
|
Latin/Arabic alphabet, ISO 8859-6, 1987. Part 7: Latin/Greek
|
|||
|
alphabet, ISO 8859-7, 1987. Part 8: Latin/Hebrew alphabet, ISO
|
|||
|
8859-8, 1988. Part 9: Latin alphabet No. 5, ISO 8859-9, 1990.
|
|||
|
|
|||
|
[ISO-646] International Standard--Information Processing--ISO 7-bit
|
|||
|
coded character set for information interchange, ISO 646:1983.
|
|||
|
|
|||
|
[MPEG] Video Coding Draft Standard ISO 11172 CD, ISO IEC/TJC1/SC2/WG11
|
|||
|
(Motion Picture Experts Group), May, 1991.
|
|||
|
|
|||
|
[PCM] CCITT, Fascicle III.4 - Recommendation G.711, Geneva, 1972,
|
|||
|
"Pulse Code Modulation (PCM) of Voice Frequencies".
|
|||
|
|
|||
|
[POSTSCRIPT] Adobe Systems, Inc., PostScript Language Reference
|
|||
|
Manual, Addison-Wesley, 1985.
|
|||
|
|
|||
|
[POSTSCRIPT2] Adobe Systems, Inc., PostScript Language Reference
|
|||
|
Manual, Addison-Wesley, Second Edition, 1990.
|
|||
|
|
|||
|
[X400] Schicker, Pietro, "Message Handling Systems, X.400", Message
|
|||
|
Handling Systems and Distributed Applications, E. Stefferud, O-j.
|
|||
|
Jacobsen, and P. Schicker, eds., North-Holland, 1989, pp. 3-41.
|
|||
|
|
|||
|
[RFC-783] Sollins, K., "TFTP Protocol (revision 2)", RFC 783, MIT,
|
|||
|
June 1981.
|
|||
|
|
|||
|
[RFC-821] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC
|
|||
|
821, USC/Information Sciences Institute, August 1982.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 80]
|
|||
|
|
|||
|
RFC 1521 MIME September 1993
|
|||
|
|
|||
|
|
|||
|
[RFC-822] Crocker, D., "Standard for the Format of ARPA Internet Text
|
|||
|
Messages", STD 11, RFC 822, UDEL, August 1982.
|
|||
|
|
|||
|
[RFC-934] Rose, M., and E. Stefferud, "Proposed Standard for Message
|
|||
|
Encapsulation", RFC 934, Delaware and NMA, January 1985.
|
|||
|
|
|||
|
[RFC-959] Postel, J. and J. Reynolds, "File Transfer Protocol",
|
|||
|
STD 9, RFC 959, USC/Information Sciences Institute, October 1985.
|
|||
|
|
|||
|
[RFC-1049] Sirbu, M., "Content-Type Header Field for Internet
|
|||
|
Messages", STD 11, RFC 1049, CMU, March 1988.
|
|||
|
|
|||
|
[RFC-1421] Linn, J., "Privacy Enhancement for Internet Electronic Mail:
|
|||
|
Part I - Message Encryption and Authentication Procedures", RFC
|
|||
|
1421, IAB IRTF PSRG, IETF PEM WG, February 1993.
|
|||
|
|
|||
|
[RFC-1154] Robinson, D. and R. Ullmann, "Encoding Header Field for
|
|||
|
Internet Messages", RFC 1154, Prime Computer, Inc., April 1990.
|
|||
|
|
|||
|
[RFC-1341] Borenstein, N., and N. Freed, "MIME (Multipurpose Internet
|
|||
|
Mail Extensions): Mechanisms for Specifying and Describing the Format
|
|||
|
of Internet Message Bodies", RFC 1341, Bellcore, Innosoft, June 1992.
|
|||
|
|
|||
|
[RFC-1342] Moore, K., "Representation of Non-Ascii Text in Internet
|
|||
|
Message Headers", RFC 1342, University of Tennessee, June 1992.
|
|||
|
|
|||
|
[RFC-1343] Borenstein, N., "A User Agent Configuration Mechanism
|
|||
|
for Multimedia Mail Format Information", RFC 1343, Bellcore, June
|
|||
|
1992.
|
|||
|
|
|||
|
[RFC-1344] Borenstein, N., "Implications of MIME for Internet
|
|||
|
Mail Gateways", RFC 1344, Bellcore, June 1992.
|
|||
|
|
|||
|
[RFC-1345] Simonsen, K., "Character Mnemonics & Character Sets",
|
|||
|
RFC 1345, Rationel Almen Planlaegning, June 1992.
|
|||
|
|
|||
|
[RFC-1426] Klensin, J., (WG Chair), Freed, N., (Editor), Rose, M.,
|
|||
|
Stefferud, E., and D. Crocker, "SMTP Service Extension for 8bit-MIME
|
|||
|
transport", RFC 1426, United Nations Universit, Innosoft, Dover Beach
|
|||
|
Consulting, Inc., Network Management Associates, Inc., The Branch
|
|||
|
Office, February 1993.
|
|||
|
|
|||
|
[RFC-1522] Moore, K., "Representation of Non-Ascii Text in Internet
|
|||
|
Message Headers" RFC 1522, University of Tennessee, September 1993.
|
|||
|
|
|||
|
[RFC-1340] Reynolds, J., and J. Postel, "Assigned Numbers", STD 2, RFC
|
|||
|
1340, USC/Information Sciences Institute, July 1992.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 81]
|
|||
|
|