2468 lines
103 KiB
Plaintext
2468 lines
103 KiB
Plaintext
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Network Working Group N. Freed
|
|||
|
Request for Comments: 2046 Innosoft
|
|||
|
Obsoletes: 1521, 1522, 1590 N. Borenstein
|
|||
|
Category: Standards Track First Virtual
|
|||
|
November 1996
|
|||
|
|
|||
|
|
|||
|
Multipurpose Internet Mail Extensions
|
|||
|
(MIME) Part Two:
|
|||
|
Media Types
|
|||
|
|
|||
|
Status of this Memo
|
|||
|
|
|||
|
This document specifies an Internet standards track protocol for the
|
|||
|
Internet community, and requests discussion and suggestions for
|
|||
|
improvements. Please refer to the current edition of the "Internet
|
|||
|
Official Protocol Standards" (STD 1) for the standardization state
|
|||
|
and status of this protocol. Distribution of this memo is unlimited.
|
|||
|
|
|||
|
Abstract
|
|||
|
|
|||
|
STD 11, RFC 822 defines a message representation protocol specifying
|
|||
|
considerable detail about US-ASCII message headers, but which leaves
|
|||
|
the message content, or message body, as flat US-ASCII text. This
|
|||
|
set of documents, collectively called the Multipurpose Internet Mail
|
|||
|
Extensions, or MIME, redefines the format of messages to allow for
|
|||
|
|
|||
|
(1) textual message bodies in character sets other than
|
|||
|
US-ASCII,
|
|||
|
|
|||
|
(2) an extensible set of different formats for non-textual
|
|||
|
message bodies,
|
|||
|
|
|||
|
(3) multi-part message bodies, and
|
|||
|
|
|||
|
(4) textual header information in character sets other than
|
|||
|
US-ASCII.
|
|||
|
|
|||
|
These documents are based on earlier work documented in RFC 934, STD
|
|||
|
11, and RFC 1049, but extends and revises them. Because RFC 822 said
|
|||
|
so little about message bodies, these documents are largely
|
|||
|
orthogonal to (rather than a revision of) RFC 822.
|
|||
|
|
|||
|
The initial document in this set, RFC 2045, specifies the various
|
|||
|
headers used to describe the structure of MIME messages. This second
|
|||
|
document defines the general structure of the MIME media typing
|
|||
|
system and defines an initial set of media types. The third document,
|
|||
|
RFC 2047, describes extensions to RFC 822 to allow non-US-ASCII text
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 1]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
data in Internet mail header fields. The fourth document, RFC 2048,
|
|||
|
specifies various IANA registration procedures for MIME-related
|
|||
|
facilities. The fifth and final document, RFC 2049, describes MIME
|
|||
|
conformance criteria as well as providing some illustrative examples
|
|||
|
of MIME message formats, acknowledgements, and the bibliography.
|
|||
|
|
|||
|
These documents are revisions of RFCs 1521 and 1522, which themselves
|
|||
|
were revisions of RFCs 1341 and 1342. An appendix in RFC 2049
|
|||
|
describes differences and changes from previous versions.
|
|||
|
|
|||
|
Table of Contents
|
|||
|
|
|||
|
1. Introduction ......................................... 3
|
|||
|
2. Definition of a Top-Level Media Type ................. 4
|
|||
|
3. Overview Of The Initial Top-Level Media Types ........ 4
|
|||
|
4. Discrete Media Type Values ........................... 6
|
|||
|
4.1 Text Media Type ..................................... 6
|
|||
|
4.1.1 Representation of Line Breaks ..................... 7
|
|||
|
4.1.2 Charset Parameter ................................. 7
|
|||
|
4.1.3 Plain Subtype ..................................... 11
|
|||
|
4.1.4 Unrecognized Subtypes ............................. 11
|
|||
|
4.2 Image Media Type .................................... 11
|
|||
|
4.3 Audio Media Type .................................... 11
|
|||
|
4.4 Video Media Type .................................... 12
|
|||
|
4.5 Application Media Type .............................. 12
|
|||
|
4.5.1 Octet-Stream Subtype .............................. 13
|
|||
|
4.5.2 PostScript Subtype ................................ 14
|
|||
|
4.5.3 Other Application Subtypes ........................ 17
|
|||
|
5. Composite Media Type Values .......................... 17
|
|||
|
5.1 Multipart Media Type ................................ 17
|
|||
|
5.1.1 Common Syntax ..................................... 19
|
|||
|
5.1.2 Handling Nested Messages and Multiparts ........... 24
|
|||
|
5.1.3 Mixed Subtype ..................................... 24
|
|||
|
5.1.4 Alternative Subtype ............................... 24
|
|||
|
5.1.5 Digest Subtype .................................... 26
|
|||
|
5.1.6 Parallel Subtype .................................. 27
|
|||
|
5.1.7 Other Multipart Subtypes .......................... 28
|
|||
|
5.2 Message Media Type .................................. 28
|
|||
|
5.2.1 RFC822 Subtype .................................... 28
|
|||
|
5.2.2 Partial Subtype ................................... 29
|
|||
|
5.2.2.1 Message Fragmentation and Reassembly ............ 30
|
|||
|
5.2.2.2 Fragmentation and Reassembly Example ............ 31
|
|||
|
5.2.3 External-Body Subtype ............................. 33
|
|||
|
5.2.4 Other Message Subtypes ............................ 40
|
|||
|
6. Experimental Media Type Values ....................... 40
|
|||
|
7. Summary .............................................. 41
|
|||
|
8. Security Considerations .............................. 41
|
|||
|
9. Authors' Addresses ................................... 42
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 2]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
A. Collected Grammar .................................... 43
|
|||
|
|
|||
|
1. Introduction
|
|||
|
|
|||
|
The first document in this set, RFC 2045, defines a number of header
|
|||
|
fields, including Content-Type. The Content-Type field is used to
|
|||
|
specify the nature of the data in the body of a MIME entity, by
|
|||
|
giving media type and subtype identifiers, and by providing auxiliary
|
|||
|
information that may be required for certain media types. After the
|
|||
|
type and subtype names, the remainder of the header field is simply a
|
|||
|
set of parameters, specified in an attribute/value notation. The
|
|||
|
ordering of parameters is not significant.
|
|||
|
|
|||
|
In general, the top-level media type is used to declare the general
|
|||
|
type of data, while the subtype specifies a specific format for that
|
|||
|
type of data. Thus, a media type of "image/xyz" is enough to tell a
|
|||
|
user agent that the data is an image, even if the user agent has no
|
|||
|
knowledge of the specific image format "xyz". Such information can
|
|||
|
be used, for example, to decide whether or not to show a user the raw
|
|||
|
data from an unrecognized subtype -- such an action might be
|
|||
|
reasonable for unrecognized subtypes of "text", but not for
|
|||
|
unrecognized subtypes of "image" or "audio". For this reason,
|
|||
|
registered subtypes of "text", "image", "audio", and "video" should
|
|||
|
not contain embedded information that is really of a different type.
|
|||
|
Such compound formats should be represented using the "multipart" or
|
|||
|
"application" types.
|
|||
|
|
|||
|
Parameters are modifiers of the media subtype, and as such do not
|
|||
|
fundamentally affect the nature of the content. The set of
|
|||
|
meaningful parameters depends on the media type and subtype. Most
|
|||
|
parameters are associated with a single specific subtype. However, a
|
|||
|
given top-level media type may define parameters which are applicable
|
|||
|
to any subtype of that type. Parameters may be required by their
|
|||
|
defining media type or subtype or they may be optional. MIME
|
|||
|
implementations must also ignore any parameters whose names they do
|
|||
|
not recognize.
|
|||
|
|
|||
|
MIME's Content-Type header field and media type mechanism has been
|
|||
|
carefully designed to be extensible, and it is expected that the set
|
|||
|
of media type/subtype pairs and their associated parameters will grow
|
|||
|
significantly over time. Several other MIME facilities, such as
|
|||
|
transfer encodings and "message/external-body" access types, are
|
|||
|
likely to have new values defined over time. In order to ensure that
|
|||
|
the set of such values is developed in an orderly, well-specified,
|
|||
|
and public manner, MIME sets up a registration process which uses the
|
|||
|
Internet Assigned Numbers Authority (IANA) as a central registry for
|
|||
|
MIME's various areas of extensibility. The registration process for
|
|||
|
these areas is described in a companion document, RFC 2048.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 3]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
The initial seven standard top-level media type are defined and
|
|||
|
described in the remainder of this document.
|
|||
|
|
|||
|
2. Definition of a Top-Level Media Type
|
|||
|
|
|||
|
The definition of a top-level media type consists of:
|
|||
|
|
|||
|
(1) a name and a description of the type, including
|
|||
|
criteria for whether a particular type would qualify
|
|||
|
under that type,
|
|||
|
|
|||
|
(2) the names and definitions of parameters, if any, which
|
|||
|
are defined for all subtypes of that type (including
|
|||
|
whether such parameters are required or optional),
|
|||
|
|
|||
|
(3) how a user agent and/or gateway should handle unknown
|
|||
|
subtypes of this type,
|
|||
|
|
|||
|
(4) general considerations on gatewaying entities of this
|
|||
|
top-level type, if any, and
|
|||
|
|
|||
|
(5) any restrictions on content-transfer-encodings for
|
|||
|
entities of this top-level type.
|
|||
|
|
|||
|
3. Overview Of The Initial Top-Level Media Types
|
|||
|
|
|||
|
The five discrete top-level media types are:
|
|||
|
|
|||
|
(1) text -- textual information. The subtype "plain" in
|
|||
|
particular indicates plain text containing no
|
|||
|
formatting commands or directives of any sort. Plain
|
|||
|
text is intended to be displayed "as-is". No special
|
|||
|
software is required to get the full meaning of the
|
|||
|
text, aside from support for the indicated character
|
|||
|
set. Other subtypes are to be used for enriched text in
|
|||
|
forms where application software may enhance the
|
|||
|
appearance of the text, but such software must not be
|
|||
|
required in order to get the general idea of the
|
|||
|
content. Possible subtypes of "text" thus include any
|
|||
|
word processor format that can be read without
|
|||
|
resorting to software that understands the format. In
|
|||
|
particular, formats that employ embeddded binary
|
|||
|
formatting information are not considered directly
|
|||
|
readable. A very simple and portable subtype,
|
|||
|
"richtext", was defined in RFC 1341, with a further
|
|||
|
revision in RFC 1896 under the name "enriched".
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 4]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
(2) image -- image data. "Image" requires a display device
|
|||
|
(such as a graphical display, a graphics printer, or a
|
|||
|
FAX machine) to view the information. An initial
|
|||
|
subtype is defined for the widely-used image format
|
|||
|
JPEG. . subtypes are defined for two widely-used image
|
|||
|
formats, jpeg and gif.
|
|||
|
|
|||
|
(3) audio -- audio data. "Audio" requires an audio output
|
|||
|
device (such as a speaker or a telephone) to "display"
|
|||
|
the contents. An initial subtype "basic" is defined in
|
|||
|
this document.
|
|||
|
|
|||
|
(4) video -- video data. "Video" requires the capability
|
|||
|
to display moving images, typically including
|
|||
|
specialized hardware and software. An initial subtype
|
|||
|
"mpeg" is defined in this document.
|
|||
|
|
|||
|
(5) application -- some other kind of data, typically
|
|||
|
either uninterpreted binary data or information to be
|
|||
|
processed by an application. The subtype "octet-
|
|||
|
stream" is to be used in the case of uninterpreted
|
|||
|
binary data, in which case the simplest recommended
|
|||
|
action is to offer to write the information into a file
|
|||
|
for the user. The "PostScript" subtype is also defined
|
|||
|
for the transport of PostScript material. Other
|
|||
|
expected uses for "application" include spreadsheets,
|
|||
|
data for mail-based scheduling systems, and languages
|
|||
|
for "active" (computational) messaging, and word
|
|||
|
processing formats that are not directly readable.
|
|||
|
Note that security considerations may exist for some
|
|||
|
types of application data, most notably
|
|||
|
"application/PostScript" and any form of active
|
|||
|
messaging. These issues are discussed later in this
|
|||
|
document.
|
|||
|
|
|||
|
The two composite top-level media types are:
|
|||
|
|
|||
|
(1) multipart -- data consisting of multiple entities of
|
|||
|
independent data types. Four subtypes are initially
|
|||
|
defined, including the basic "mixed" subtype specifying
|
|||
|
a generic mixed set of parts, "alternative" for
|
|||
|
representing the same data in multiple formats,
|
|||
|
"parallel" for parts intended to be viewed
|
|||
|
simultaneously, and "digest" for multipart entities in
|
|||
|
which each part has a default type of "message/rfc822".
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 5]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
(2) message -- an encapsulated message. A body of media
|
|||
|
type "message" is itself all or a portion of some kind
|
|||
|
of message object. Such objects may or may not in turn
|
|||
|
contain other entities. The "rfc822" subtype is used
|
|||
|
when the encapsulated content is itself an RFC 822
|
|||
|
message. The "partial" subtype is defined for partial
|
|||
|
RFC 822 messages, to permit the fragmented transmission
|
|||
|
of bodies that are thought to be too large to be passed
|
|||
|
through transport facilities in one piece. Another
|
|||
|
subtype, "external-body", is defined for specifying
|
|||
|
large bodies by reference to an external data source.
|
|||
|
|
|||
|
It should be noted that the list of media type values given here may
|
|||
|
be augmented in time, via the mechanisms described above, and that
|
|||
|
the set of subtypes is expected to grow substantially.
|
|||
|
|
|||
|
4. Discrete Media Type Values
|
|||
|
|
|||
|
Five of the seven initial media type values refer to discrete bodies.
|
|||
|
The content of these types must be handled by non-MIME mechanisms;
|
|||
|
they are opaque to MIME processors.
|
|||
|
|
|||
|
4.1. Text Media Type
|
|||
|
|
|||
|
The "text" media type is intended for sending material which is
|
|||
|
principally textual in form. A "charset" parameter may be used to
|
|||
|
indicate the character set of the body text for "text" subtypes,
|
|||
|
notably including the subtype "text/plain", which is a generic
|
|||
|
subtype for plain text. Plain text does not provide for or allow
|
|||
|
formatting commands, font attribute specifications, processing
|
|||
|
instructions, interpretation directives, or content markup. Plain
|
|||
|
text is seen simply as a linear sequence of characters, possibly
|
|||
|
interrupted by line breaks or page breaks. Plain text may allow the
|
|||
|
stacking of several characters in the same position in the text.
|
|||
|
Plain text in scripts like Arabic and Hebrew may also include
|
|||
|
facilitites that allow the arbitrary mixing of text segments with
|
|||
|
opposite writing directions.
|
|||
|
|
|||
|
Beyond plain text, there are many formats for representing what might
|
|||
|
be known as "rich text". An interesting characteristic of many such
|
|||
|
representations is that they are to some extent readable even without
|
|||
|
the software that interprets them. It is useful, then, to
|
|||
|
distinguish them, at the highest level, from such unreadable data as
|
|||
|
images, audio, or text represented in an unreadable form. In the
|
|||
|
absence of appropriate interpretation software, it is reasonable to
|
|||
|
show subtypes of "text" to the user, while it is not reasonable to do
|
|||
|
so with most nontextual data. Such formatted textual data should be
|
|||
|
represented using subtypes of "text".
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 6]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
4.1.1. Representation of Line Breaks
|
|||
|
|
|||
|
The canonical form of any MIME "text" subtype MUST always represent a
|
|||
|
line break as a CRLF sequence. Similarly, any occurrence of CRLF in
|
|||
|
MIME "text" MUST represent a line break. Use of CR and LF outside of
|
|||
|
line break sequences is also forbidden.
|
|||
|
|
|||
|
This rule applies regardless of format or character set or sets
|
|||
|
involved.
|
|||
|
|
|||
|
NOTE: The proper interpretation of line breaks when a body is
|
|||
|
displayed depends on the media type. In particular, while it is
|
|||
|
appropriate to treat a line break as a transition to a new line when
|
|||
|
displaying a "text/plain" body, this treatment is actually incorrect
|
|||
|
for other subtypes of "text" like "text/enriched" [RFC-1896].
|
|||
|
Similarly, whether or not line breaks should be added during display
|
|||
|
operations is also a function of the media type. It should not be
|
|||
|
necessary to add any line breaks to display "text/plain" correctly,
|
|||
|
whereas proper display of "text/enriched" requires the appropriate
|
|||
|
addition of line breaks.
|
|||
|
|
|||
|
NOTE: Some protocols defines a maximum line length. E.g. SMTP [RFC-
|
|||
|
821] allows a maximum of 998 octets before the next CRLF sequence.
|
|||
|
To be transported by such protocols, data which includes too long
|
|||
|
segments without CRLF sequences must be encoded with a suitable
|
|||
|
content-transfer-encoding.
|
|||
|
|
|||
|
4.1.2. Charset Parameter
|
|||
|
|
|||
|
A critical parameter that may be specified in the Content-Type field
|
|||
|
for "text/plain" data is the character set. This is specified with a
|
|||
|
"charset" parameter, as in:
|
|||
|
|
|||
|
Content-type: text/plain; charset=iso-8859-1
|
|||
|
|
|||
|
Unlike some other parameter values, the values of the charset
|
|||
|
parameter are NOT case sensitive. The default character set, which
|
|||
|
must be assumed in the absence of a charset parameter, is US-ASCII.
|
|||
|
|
|||
|
The specification for any future subtypes of "text" must specify
|
|||
|
whether or not they will also utilize a "charset" parameter, and may
|
|||
|
possibly restrict its values as well. For other subtypes of "text"
|
|||
|
than "text/plain", the semantics of the "charset" parameter should be
|
|||
|
defined to be identical to those specified here for "text/plain",
|
|||
|
i.e., the body consists entirely of characters in the given charset.
|
|||
|
In particular, definers of future "text" subtypes should pay close
|
|||
|
attention to the implications of multioctet character sets for their
|
|||
|
subtype definitions.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 7]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
The charset parameter for subtypes of "text" gives a name of a
|
|||
|
character set, as "character set" is defined in RFC 2045. The rules
|
|||
|
regarding line breaks detailed in the previous section must also be
|
|||
|
observed -- a character set whose definition does not conform to
|
|||
|
these rules cannot be used in a MIME "text" subtype.
|
|||
|
|
|||
|
An initial list of predefined character set names can be found at the
|
|||
|
end of this section. Additional character sets may be registered
|
|||
|
with IANA.
|
|||
|
|
|||
|
Other media types than subtypes of "text" might choose to employ the
|
|||
|
charset parameter as defined here, but with the CRLF/line break
|
|||
|
restriction removed. Therefore, all character sets that conform to
|
|||
|
the general definition of "character set" in RFC 2045 can be
|
|||
|
registered for MIME use.
|
|||
|
|
|||
|
Note that if the specified character set includes 8-bit characters
|
|||
|
and such characters are used in the body, a Content-Transfer-Encoding
|
|||
|
header field and a corresponding encoding on the data are required in
|
|||
|
order to transmit the body via some mail transfer protocols, such as
|
|||
|
SMTP [RFC-821].
|
|||
|
|
|||
|
The default character set, US-ASCII, has been the subject of some
|
|||
|
confusion and ambiguity in the past. Not only were there some
|
|||
|
ambiguities in the definition, there have been wide variations in
|
|||
|
practice. In order to eliminate such ambiguity and variations in the
|
|||
|
future, it is strongly recommended that new user agents explicitly
|
|||
|
specify a character set as a media type parameter in the Content-Type
|
|||
|
header field. "US-ASCII" does not indicate an arbitrary 7-bit
|
|||
|
character set, but specifies that all octets in the body must be
|
|||
|
interpreted as characters according to the US-ASCII character set.
|
|||
|
National and application-oriented versions of ISO 646 [ISO-646] are
|
|||
|
usually NOT identical to US-ASCII, and in that case their use in
|
|||
|
Internet mail is explicitly discouraged. The omission of the ISO 646
|
|||
|
character set from this document is deliberate in this regard. The
|
|||
|
character set name of "US-ASCII" explicitly refers to the character
|
|||
|
set defined in ANSI X3.4-1986 [US- ASCII]. The new international
|
|||
|
reference version (IRV) of the 1991 edition of ISO 646 is identical
|
|||
|
to US-ASCII. The character set name "ASCII" is reserved and must not
|
|||
|
be used for any purpose.
|
|||
|
|
|||
|
NOTE: RFC 821 explicitly specifies "ASCII", and references an earlier
|
|||
|
version of the American Standard. Insofar as one of the purposes of
|
|||
|
specifying a media type and character set is to permit the receiver
|
|||
|
to unambiguously determine how the sender intended the coded message
|
|||
|
to be interpreted, assuming anything other than "strict ASCII" as the
|
|||
|
default would risk unintentional and incompatible changes to the
|
|||
|
semantics of messages now being transmitted. This also implies that
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 8]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
messages containing characters coded according to other versions of
|
|||
|
ISO 646 than US-ASCII and the 1991 IRV, or using code-switching
|
|||
|
procedures (e.g., those of ISO 2022), as well as 8bit or multiple
|
|||
|
octet character encodings MUST use an appropriate character set
|
|||
|
specification to be consistent with MIME.
|
|||
|
|
|||
|
The complete US-ASCII character set is listed in ANSI X3.4- 1986.
|
|||
|
Note that the control characters including DEL (0-31, 127) have no
|
|||
|
defined meaning in apart from the combination CRLF (US-ASCII values
|
|||
|
13 and 10) indicating a new line. Two of the characters have de
|
|||
|
facto meanings in wide use: FF (12) often means "start subsequent
|
|||
|
text on the beginning of a new page"; and TAB or HT (9) often (though
|
|||
|
not always) means "move the cursor to the next available column after
|
|||
|
the current position where the column number is a multiple of 8
|
|||
|
(counting the first column as column 0)." Aside from these
|
|||
|
conventions, any use of the control characters or DEL in a body must
|
|||
|
either occur
|
|||
|
|
|||
|
(1) because a subtype of text other than "plain"
|
|||
|
specifically assigns some additional meaning, or
|
|||
|
|
|||
|
(2) within the context of a private agreement between the
|
|||
|
sender and recipient. Such private agreements are
|
|||
|
discouraged and should be replaced by the other
|
|||
|
capabilities of this document.
|
|||
|
|
|||
|
NOTE: An enormous proliferation of character sets exist beyond US-
|
|||
|
ASCII. A large number of partially or totally overlapping character
|
|||
|
sets is NOT a good thing. A SINGLE character set that can be used
|
|||
|
universally for representing all of the world's languages in Internet
|
|||
|
mail would be preferrable. Unfortunately, existing practice in
|
|||
|
several communities seems to point to the continued use of multiple
|
|||
|
character sets in the near future. A small number of standard
|
|||
|
character sets are, therefore, defined for Internet use in this
|
|||
|
document.
|
|||
|
|
|||
|
The defined charset values are:
|
|||
|
|
|||
|
(1) US-ASCII -- as defined in ANSI X3.4-1986 [US-ASCII].
|
|||
|
|
|||
|
(2) ISO-8859-X -- where "X" is to be replaced, as
|
|||
|
necessary, for the parts of ISO-8859 [ISO-8859]. Note
|
|||
|
that the ISO 646 character sets have deliberately been
|
|||
|
omitted in favor of their 8859 replacements, which are
|
|||
|
the designated character sets for Internet mail. As of
|
|||
|
the publication of this document, the legitimate values
|
|||
|
for "X" are the digits 1 through 10.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 9]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
Characters in the range 128-159 has no assigned meaning in ISO-8859-
|
|||
|
X. Characters with values below 128 in ISO-8859-X have the same
|
|||
|
assigned meaning as they do in US-ASCII.
|
|||
|
|
|||
|
Part 6 of ISO 8859 (Latin/Arabic alphabet) and part 8 (Latin/Hebrew
|
|||
|
alphabet) includes both characters for which the normal writing
|
|||
|
direction is right to left and characters for which it is left to
|
|||
|
right, but do not define a canonical ordering method for representing
|
|||
|
bi-directional text. The charset values "ISO-8859-6" and "ISO-8859-
|
|||
|
8", however, specify that the visual method is used [RFC-1556].
|
|||
|
|
|||
|
All of these character sets are used as pure 7bit or 8bit sets
|
|||
|
without any shift or escape functions. The meaning of shift and
|
|||
|
escape sequences in these character sets is not defined.
|
|||
|
|
|||
|
The character sets specified above are the ones that were relatively
|
|||
|
uncontroversial during the drafting of MIME. This document does not
|
|||
|
endorse the use of any particular character set other than US-ASCII,
|
|||
|
and recognizes that the future evolution of world character sets
|
|||
|
remains unclear.
|
|||
|
|
|||
|
Note that the character set used, if anything other than US- ASCII,
|
|||
|
must always be explicitly specified in the Content-Type field.
|
|||
|
|
|||
|
No character set name other than those defined above may be used in
|
|||
|
Internet mail without the publication of a formal specification and
|
|||
|
its registration with IANA, or by private agreement, in which case
|
|||
|
the character set name must begin with "X-".
|
|||
|
|
|||
|
Implementors are discouraged from defining new character sets unless
|
|||
|
absolutely necessary.
|
|||
|
|
|||
|
The "charset" parameter has been defined primarily for the purpose of
|
|||
|
textual data, and is described in this section for that reason.
|
|||
|
However, it is conceivable that non-textual data might also wish to
|
|||
|
specify a charset value for some purpose, in which case the same
|
|||
|
syntax and values should be used.
|
|||
|
|
|||
|
In general, composition software should always use the "lowest common
|
|||
|
denominator" character set possible. For example, if a body contains
|
|||
|
only US-ASCII characters, it SHOULD be marked as being in the US-
|
|||
|
ASCII character set, not ISO-8859-1, which, like all the ISO-8859
|
|||
|
family of character sets, is a superset of US-ASCII. More generally,
|
|||
|
if a widely-used character set is a subset of another character set,
|
|||
|
and a body contains only characters in the widely-used subset, it
|
|||
|
should be labelled as being in that subset. This will increase the
|
|||
|
chances that the recipient will be able to view the resulting entity
|
|||
|
correctly.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 10]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
4.1.3. Plain Subtype
|
|||
|
|
|||
|
The simplest and most important subtype of "text" is "plain". This
|
|||
|
indicates plain text that does not contain any formatting commands or
|
|||
|
directives. Plain text is intended to be displayed "as-is", that is,
|
|||
|
no interpretation of embedded formatting commands, font attribute
|
|||
|
specifications, processing instructions, interpretation directives,
|
|||
|
or content markup should be necessary for proper display. The
|
|||
|
default media type of "text/plain; charset=us-ascii" for Internet
|
|||
|
mail describes existing Internet practice. That is, it is the type
|
|||
|
of body defined by RFC 822.
|
|||
|
|
|||
|
No other "text" subtype is defined by this document.
|
|||
|
|
|||
|
4.1.4. Unrecognized Subtypes
|
|||
|
|
|||
|
Unrecognized subtypes of "text" should be treated as subtype "plain"
|
|||
|
as long as the MIME implementation knows how to handle the charset.
|
|||
|
Unrecognized subtypes which also specify an unrecognized charset
|
|||
|
should be treated as "application/octet- stream".
|
|||
|
|
|||
|
4.2. Image Media Type
|
|||
|
|
|||
|
A media type of "image" indicates that the body contains an image.
|
|||
|
The subtype names the specific image format. These names are not
|
|||
|
case sensitive. An initial subtype is "jpeg" for the JPEG format
|
|||
|
using JFIF encoding [JPEG].
|
|||
|
|
|||
|
The list of "image" subtypes given here is neither exclusive nor
|
|||
|
exhaustive, and is expected to grow as more types are registered with
|
|||
|
IANA, as described in RFC 2048.
|
|||
|
|
|||
|
Unrecognized subtypes of "image" should at a miniumum be treated as
|
|||
|
"application/octet-stream". Implementations may optionally elect to
|
|||
|
pass subtypes of "image" that they do not specifically recognize to a
|
|||
|
secure and robust general-purpose image viewing application, if such
|
|||
|
an application is available.
|
|||
|
|
|||
|
NOTE: Using of a generic-purpose image viewing application this way
|
|||
|
inherits the security problems of the most dangerous type supported
|
|||
|
by the application.
|
|||
|
|
|||
|
4.3. Audio Media Type
|
|||
|
|
|||
|
A media type of "audio" indicates that the body contains audio data.
|
|||
|
Although there is not yet a consensus on an "ideal" audio format for
|
|||
|
use with computers, there is a pressing need for a format capable of
|
|||
|
providing interoperable behavior.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 11]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
The initial subtype of "basic" is specified to meet this requirement
|
|||
|
by providing an absolutely minimal lowest common denominator audio
|
|||
|
format. It is expected that richer formats for higher quality and/or
|
|||
|
lower bandwidth audio will be defined by a later document.
|
|||
|
|
|||
|
The content of the "audio/basic" subtype is single channel audio
|
|||
|
encoded using 8bit ISDN mu-law [PCM] at a sample rate of 8000 Hz.
|
|||
|
|
|||
|
Unrecognized subtypes of "audio" should at a miniumum be treated as
|
|||
|
"application/octet-stream". Implementations may optionally elect to
|
|||
|
pass subtypes of "audio" that they do not specifically recognize to a
|
|||
|
robust general-purpose audio playing application, if such an
|
|||
|
application is available.
|
|||
|
|
|||
|
4.4. Video Media Type
|
|||
|
|
|||
|
A media type of "video" indicates that the body contains a time-
|
|||
|
varying-picture image, possibly with color and coordinated sound.
|
|||
|
The term 'video' is used in its most generic sense, rather than with
|
|||
|
reference to any particular technology or format, and is not meant to
|
|||
|
preclude subtypes such as animated drawings encoded compactly. The
|
|||
|
subtype "mpeg" refers to video coded according to the MPEG standard
|
|||
|
[MPEG].
|
|||
|
|
|||
|
Note that although in general this document strongly discourages the
|
|||
|
mixing of multiple media in a single body, it is recognized that many
|
|||
|
so-called video formats include a representation for synchronized
|
|||
|
audio, and this is explicitly permitted for subtypes of "video".
|
|||
|
|
|||
|
Unrecognized subtypes of "video" should at a minumum be treated as
|
|||
|
"application/octet-stream". Implementations may optionally elect to
|
|||
|
pass subtypes of "video" that they do not specifically recognize to a
|
|||
|
robust general-purpose video display application, if such an
|
|||
|
application is available.
|
|||
|
|
|||
|
4.5. Application Media Type
|
|||
|
|
|||
|
The "application" media type is to be used for discrete data which do
|
|||
|
not fit in any of the other categories, and particularly for data to
|
|||
|
be processed by some type of application program. This is
|
|||
|
information which must be processed by an application before it is
|
|||
|
viewable or usable by a user. Expected uses for the "application"
|
|||
|
media type include file transfer, spreadsheets, data for mail-based
|
|||
|
scheduling systems, and languages for "active" (computational)
|
|||
|
material. (The latter, in particular, can pose security problems
|
|||
|
which must be understood by implementors, and are considered in
|
|||
|
detail in the discussion of the "application/PostScript" media type.)
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 12]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
For example, a meeting scheduler might define a standard
|
|||
|
representation for information about proposed meeting dates. An
|
|||
|
intelligent user agent would use this information to conduct a dialog
|
|||
|
with the user, and might then send additional material based on that
|
|||
|
dialog. More generally, there have been several "active" messaging
|
|||
|
languages developed in which programs in a suitably specialized
|
|||
|
language are transported to a remote location and automatically run
|
|||
|
in the recipient's environment.
|
|||
|
|
|||
|
Such applications may be defined as subtypes of the "application"
|
|||
|
media type. This document defines two subtypes:
|
|||
|
|
|||
|
octet-stream, and PostScript.
|
|||
|
|
|||
|
The subtype of "application" will often be either the name or include
|
|||
|
part of the name of the application for which the data are intended.
|
|||
|
This does not mean, however, that any application program name may be
|
|||
|
used freely as a subtype of "application".
|
|||
|
|
|||
|
4.5.1. Octet-Stream Subtype
|
|||
|
|
|||
|
The "octet-stream" subtype is used to indicate that a body contains
|
|||
|
arbitrary binary data. The set of currently defined parameters is:
|
|||
|
|
|||
|
(1) TYPE -- the general type or category of binary data.
|
|||
|
This is intended as information for the human recipient
|
|||
|
rather than for any automatic processing.
|
|||
|
|
|||
|
(2) PADDING -- the number of bits of padding that were
|
|||
|
appended to the bit-stream comprising the actual
|
|||
|
contents to produce the enclosed 8bit byte-oriented
|
|||
|
data. This is useful for enclosing a bit-stream in a
|
|||
|
body when the total number of bits is not a multiple of
|
|||
|
8.
|
|||
|
|
|||
|
Both of these parameters are optional.
|
|||
|
|
|||
|
An additional parameter, "CONVERSIONS", was defined in RFC 1341 but
|
|||
|
has since been removed. RFC 1341 also defined the use of a "NAME"
|
|||
|
parameter which gave a suggested file name to be used if the data
|
|||
|
were to be written to a file. This has been deprecated in
|
|||
|
anticipation of a separate Content-Disposition header field, to be
|
|||
|
defined in a subsequent RFC.
|
|||
|
|
|||
|
The recommended action for an implementation that receives an
|
|||
|
"application/octet-stream" entity is to simply offer to put the data
|
|||
|
in a file, with any Content-Transfer-Encoding undone, or perhaps to
|
|||
|
use it as input to a user-specified process.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 13]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
To reduce the danger of transmitting rogue programs, it is strongly
|
|||
|
recommended that implementations NOT implement a path-search
|
|||
|
mechanism whereby an arbitrary program named in the Content-Type
|
|||
|
parameter (e.g., an "interpreter=" parameter) is found and executed
|
|||
|
using the message body as input.
|
|||
|
|
|||
|
4.5.2. PostScript Subtype
|
|||
|
|
|||
|
A media type of "application/postscript" indicates a PostScript
|
|||
|
program. Currently two variants of the PostScript language are
|
|||
|
allowed; the original level 1 variant is described in [POSTSCRIPT]
|
|||
|
and the more recent level 2 variant is described in [POSTSCRIPT2].
|
|||
|
|
|||
|
PostScript is a registered trademark of Adobe Systems, Inc. Use of
|
|||
|
the MIME media type "application/postscript" implies recognition of
|
|||
|
that trademark and all the rights it entails.
|
|||
|
|
|||
|
The PostScript language definition provides facilities for internal
|
|||
|
labelling of the specific language features a given program uses.
|
|||
|
This labelling, called the PostScript document structuring
|
|||
|
conventions, or DSC, is very general and provides substantially more
|
|||
|
information than just the language level. The use of document
|
|||
|
structuring conventions, while not required, is strongly recommended
|
|||
|
as an aid to interoperability. Documents which lack proper
|
|||
|
structuring conventions cannot be tested to see whether or not they
|
|||
|
will work in a given environment. As such, some systems may assume
|
|||
|
the worst and refuse to process unstructured documents.
|
|||
|
|
|||
|
The execution of general-purpose PostScript interpreters entails
|
|||
|
serious security risks, and implementors are discouraged from simply
|
|||
|
sending PostScript bodies to "off- the-shelf" interpreters. While it
|
|||
|
is usually safe to send PostScript to a printer, where the potential
|
|||
|
for harm is greatly constrained by typical printer environments,
|
|||
|
implementors should consider all of the following before they add
|
|||
|
interactive display of PostScript bodies to their MIME readers.
|
|||
|
|
|||
|
The remainder of this section outlines some, though probably not all,
|
|||
|
of the possible problems with the transport of PostScript entities.
|
|||
|
|
|||
|
(1) Dangerous operations in the PostScript language
|
|||
|
include, but may not be limited to, the PostScript
|
|||
|
operators "deletefile", "renamefile", "filenameforall",
|
|||
|
and "file". "File" is only dangerous when applied to
|
|||
|
something other than standard input or output.
|
|||
|
Implementations may also define additional nonstandard
|
|||
|
file operators; these may also pose a threat to
|
|||
|
security. "Filenameforall", the wildcard file search
|
|||
|
operator, may appear at first glance to be harmless.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 14]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
Note, however, that this operator has the potential to
|
|||
|
reveal information about what files the recipient has
|
|||
|
access to, and this information may itself be
|
|||
|
sensitive. Message senders should avoid the use of
|
|||
|
potentially dangerous file operators, since these
|
|||
|
operators are quite likely to be unavailable in secure
|
|||
|
PostScript implementations. Message receiving and
|
|||
|
displaying software should either completely disable
|
|||
|
all potentially dangerous file operators or take
|
|||
|
special care not to delegate any special authority to
|
|||
|
their operation. These operators should be viewed as
|
|||
|
being done by an outside agency when interpreting
|
|||
|
PostScript documents. Such disabling and/or checking
|
|||
|
should be done completely outside of the reach of the
|
|||
|
PostScript language itself; care should be taken to
|
|||
|
insure that no method exists for re-enabling full-
|
|||
|
function versions of these operators.
|
|||
|
|
|||
|
(2) The PostScript language provides facilities for exiting
|
|||
|
the normal interpreter, or server, loop. Changes made
|
|||
|
in this "outer" environment are customarily retained
|
|||
|
across documents, and may in some cases be retained
|
|||
|
semipermanently in nonvolatile memory. The operators
|
|||
|
associated with exiting the interpreter loop have the
|
|||
|
potential to interfere with subsequent document
|
|||
|
processing. As such, their unrestrained use
|
|||
|
constitutes a threat of service denial. PostScript
|
|||
|
operators that exit the interpreter loop include, but
|
|||
|
may not be limited to, the exitserver and startjob
|
|||
|
operators. Message sending software should not
|
|||
|
generate PostScript that depends on exiting the
|
|||
|
interpreter loop to operate, since the ability to exit
|
|||
|
will probably be unavailable in secure PostScript
|
|||
|
implementations. Message receiving and displaying
|
|||
|
software should completely disable the ability to make
|
|||
|
retained changes to the PostScript environment by
|
|||
|
eliminating or disabling the "startjob" and
|
|||
|
"exitserver" operations. If these operations cannot be
|
|||
|
eliminated or completely disabled the password
|
|||
|
associated with them should at least be set to a hard-
|
|||
|
to-guess value.
|
|||
|
|
|||
|
(3) PostScript provides operators for setting system-wide
|
|||
|
and device-specific parameters. These parameter
|
|||
|
settings may be retained across jobs and may
|
|||
|
potentially pose a threat to the correct operation of
|
|||
|
the interpreter. The PostScript operators that set
|
|||
|
system and device parameters include, but may not be
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 15]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
limited to, the "setsystemparams" and "setdevparams"
|
|||
|
operators. Message sending software should not
|
|||
|
generate PostScript that depends on the setting of
|
|||
|
system or device parameters to operate correctly. The
|
|||
|
ability to set these parameters will probably be
|
|||
|
unavailable in secure PostScript implementations.
|
|||
|
Message receiving and displaying software should
|
|||
|
disable the ability to change system and device
|
|||
|
parameters. If these operators cannot be completely
|
|||
|
disabled the password associated with them should at
|
|||
|
least be set to a hard-to-guess value.
|
|||
|
|
|||
|
(4) Some PostScript implementations provide nonstandard
|
|||
|
facilities for the direct loading and execution of
|
|||
|
machine code. Such facilities are quite obviously open
|
|||
|
to substantial abuse. Message sending software should
|
|||
|
not make use of such features. Besides being totally
|
|||
|
hardware-specific, they are also likely to be
|
|||
|
unavailable in secure implementations of PostScript.
|
|||
|
Message receiving and displaying software should not
|
|||
|
allow such operators to be used if they exist.
|
|||
|
|
|||
|
(5) PostScript is an extensible language, and many, if not
|
|||
|
most, implementations of it provide a number of their
|
|||
|
own extensions. This document does not deal with such
|
|||
|
extensions explicitly since they constitute an unknown
|
|||
|
factor. Message sending software should not make use
|
|||
|
of nonstandard extensions; they are likely to be
|
|||
|
missing from some implementations. Message receiving
|
|||
|
and displaying software should make sure that any
|
|||
|
nonstandard PostScript operators are secure and don't
|
|||
|
present any kind of threat.
|
|||
|
|
|||
|
(6) It is possible to write PostScript that consumes huge
|
|||
|
amounts of various system resources. It is also
|
|||
|
possible to write PostScript programs that loop
|
|||
|
indefinitely. Both types of programs have the
|
|||
|
potential to cause damage if sent to unsuspecting
|
|||
|
recipients. Message-sending software should avoid the
|
|||
|
construction and dissemination of such programs, which
|
|||
|
is antisocial. Message receiving and displaying
|
|||
|
software should provide appropriate mechanisms to abort
|
|||
|
processing after a reasonable amount of time has
|
|||
|
elapsed. In addition, PostScript interpreters should be
|
|||
|
limited to the consumption of only a reasonable amount
|
|||
|
of any given system resource.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 16]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
(7) It is possible to include raw binary information inside
|
|||
|
PostScript in various forms. This is not recommended
|
|||
|
for use in Internet mail, both because it is not
|
|||
|
supported by all PostScript interpreters and because it
|
|||
|
significantly complicates the use of a MIME Content-
|
|||
|
Transfer-Encoding. (Without such binary, PostScript
|
|||
|
may typically be viewed as line-oriented data. The
|
|||
|
treatment of CRLF sequences becomes extremely
|
|||
|
problematic if binary and line-oriented data are mixed
|
|||
|
in a single Postscript data stream.)
|
|||
|
|
|||
|
(8) Finally, bugs may exist in some PostScript interpreters
|
|||
|
which could possibly be exploited to gain unauthorized
|
|||
|
access to a recipient's system. Apart from noting this
|
|||
|
possibility, there is no specific action to take to
|
|||
|
prevent this, apart from the timely correction of such
|
|||
|
bugs if any are found.
|
|||
|
|
|||
|
4.5.3. Other Application Subtypes
|
|||
|
|
|||
|
It is expected that many other subtypes of "application" will be
|
|||
|
defined in the future. MIME implementations must at a minimum treat
|
|||
|
any unrecognized subtypes as being equivalent to "application/octet-
|
|||
|
stream".
|
|||
|
|
|||
|
5. Composite Media Type Values
|
|||
|
|
|||
|
The remaining two of the seven initial Content-Type values refer to
|
|||
|
composite entities. Composite entities are handled using MIME
|
|||
|
mechanisms -- a MIME processor typically handles the body directly.
|
|||
|
|
|||
|
5.1. Multipart Media Type
|
|||
|
|
|||
|
In the case of multipart entities, in which one or more different
|
|||
|
sets of data are combined in a single body, a "multipart" media type
|
|||
|
field must appear in the entity's header. The body must then contain
|
|||
|
one or more body parts, each preceded by a boundary delimiter line,
|
|||
|
and the last one followed by a closing boundary delimiter line.
|
|||
|
After its boundary delimiter line, each body part then consists of a
|
|||
|
header area, a blank line, and a body area. Thus a body part is
|
|||
|
similar to an RFC 822 message in syntax, but different in meaning.
|
|||
|
|
|||
|
A body part is an entity and hence is NOT to be interpreted as
|
|||
|
actually being an RFC 822 message. To begin with, NO header fields
|
|||
|
are actually required in body parts. A body part that starts with a
|
|||
|
blank line, therefore, is allowed and is a body part for which all
|
|||
|
default values are to be assumed. In such a case, the absence of a
|
|||
|
Content-Type header usually indicates that the corresponding body has
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 17]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
a content-type of "text/plain; charset=US-ASCII".
|
|||
|
|
|||
|
The only header fields that have defined meaning for body parts are
|
|||
|
those the names of which begin with "Content-". All other header
|
|||
|
fields may be ignored in body parts. Although they should generally
|
|||
|
be retained if at all possible, they may be discarded by gateways if
|
|||
|
necessary. Such other fields are permitted to appear in body parts
|
|||
|
but must not be depended on. "X-" fields may be created for
|
|||
|
experimental or private purposes, with the recognition that the
|
|||
|
information they contain may be lost at some gateways.
|
|||
|
|
|||
|
NOTE: The distinction between an RFC 822 message and a body part is
|
|||
|
subtle, but important. A gateway between Internet and X.400 mail,
|
|||
|
for example, must be able to tell the difference between a body part
|
|||
|
that contains an image and a body part that contains an encapsulated
|
|||
|
message, the body of which is a JPEG image. In order to represent
|
|||
|
the latter, the body part must have "Content-Type: message/rfc822",
|
|||
|
and its body (after the blank line) must be the encapsulated message,
|
|||
|
with its own "Content-Type: image/jpeg" header field. The use of
|
|||
|
similar syntax facilitates the conversion of messages to body parts,
|
|||
|
and vice versa, but the distinction between the two must be
|
|||
|
understood by implementors. (For the special case in which parts
|
|||
|
actually are messages, a "digest" subtype is also defined.)
|
|||
|
|
|||
|
As stated previously, each body part is preceded by a boundary
|
|||
|
delimiter line that contains the boundary delimiter. The boundary
|
|||
|
delimiter MUST NOT appear inside any of the encapsulated parts, on a
|
|||
|
line by itself or as the prefix of any line. This implies that it is
|
|||
|
crucial that the composing agent be able to choose and specify a
|
|||
|
unique boundary parameter value that does not contain the boundary
|
|||
|
parameter value of an enclosing multipart as a prefix.
|
|||
|
|
|||
|
All present and future subtypes of the "multipart" type must use an
|
|||
|
identical syntax. Subtypes may differ in their semantics, and may
|
|||
|
impose additional restrictions on syntax, but must conform to the
|
|||
|
required syntax for the "multipart" type. This requirement ensures
|
|||
|
that all conformant user agents will at least be able to recognize
|
|||
|
and separate the parts of any multipart entity, even those of an
|
|||
|
unrecognized subtype.
|
|||
|
|
|||
|
As stated in the definition of the Content-Transfer-Encoding field
|
|||
|
[RFC 2045], no encoding other than "7bit", "8bit", or "binary" is
|
|||
|
permitted for entities of type "multipart". The "multipart" boundary
|
|||
|
delimiters and header fields are always represented as 7bit US-ASCII
|
|||
|
in any case (though the header fields may encode non-US-ASCII header
|
|||
|
text as per RFC 2047) and data within the body parts can be encoded
|
|||
|
on a part-by-part basis, with Content-Transfer-Encoding fields for
|
|||
|
each appropriate body part.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 18]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
5.1.1. Common Syntax
|
|||
|
|
|||
|
This section defines a common syntax for subtypes of "multipart".
|
|||
|
All subtypes of "multipart" must use this syntax. A simple example
|
|||
|
of a multipart message also appears in this section. An example of a
|
|||
|
more complex multipart message is given in RFC 2049.
|
|||
|
|
|||
|
The Content-Type field for multipart entities requires one parameter,
|
|||
|
"boundary". The boundary delimiter line is then defined as a line
|
|||
|
consisting entirely of two hyphen characters ("-", decimal value 45)
|
|||
|
followed by the boundary parameter value from the Content-Type header
|
|||
|
field, optional linear whitespace, and a terminating CRLF.
|
|||
|
|
|||
|
NOTE: The hyphens are for rough compatibility with the earlier RFC
|
|||
|
934 method of message encapsulation, and for ease of searching for
|
|||
|
the boundaries in some implementations. However, it should be noted
|
|||
|
that multipart messages are NOT completely compatible with RFC 934
|
|||
|
encapsulations; in particular, they do not obey RFC 934 quoting
|
|||
|
conventions for embedded lines that begin with hyphens. This
|
|||
|
mechanism was chosen over the RFC 934 mechanism because the latter
|
|||
|
causes lines to grow with each level of quoting. The combination of
|
|||
|
this growth with the fact that SMTP implementations sometimes wrap
|
|||
|
long lines made the RFC 934 mechanism unsuitable for use in the event
|
|||
|
that deeply-nested multipart structuring is ever desired.
|
|||
|
|
|||
|
WARNING TO IMPLEMENTORS: The grammar for parameters on the Content-
|
|||
|
type field is such that it is often necessary to enclose the boundary
|
|||
|
parameter values in quotes on the Content-type line. This is not
|
|||
|
always necessary, but never hurts. Implementors should be sure to
|
|||
|
study the grammar carefully in order to avoid producing invalid
|
|||
|
Content-type fields. Thus, a typical "multipart" Content-Type header
|
|||
|
field might look like this:
|
|||
|
|
|||
|
Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08j34c0p
|
|||
|
|
|||
|
But the following is not valid:
|
|||
|
|
|||
|
Content-Type: multipart/mixed; boundary=gc0pJq0M:08jU534c0p
|
|||
|
|
|||
|
(because of the colon) and must instead be represented as
|
|||
|
|
|||
|
Content-Type: multipart/mixed; boundary="gc0pJq0M:08jU534c0p"
|
|||
|
|
|||
|
This Content-Type value indicates that the content consists of one or
|
|||
|
more parts, each with a structure that is syntactically identical to
|
|||
|
an RFC 822 message, except that the header area is allowed to be
|
|||
|
completely empty, and that the parts are each preceded by the line
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 19]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
--gc0pJq0M:08jU534c0p
|
|||
|
|
|||
|
The boundary delimiter MUST occur at the beginning of a line, i.e.,
|
|||
|
following a CRLF, and the initial CRLF is considered to be attached
|
|||
|
to the boundary delimiter line rather than part of the preceding
|
|||
|
part. The boundary may be followed by zero or more characters of
|
|||
|
linear whitespace. It is then terminated by either another CRLF and
|
|||
|
the header fields for the next part, or by two CRLFs, in which case
|
|||
|
there are no header fields for the next part. If no Content-Type
|
|||
|
field is present it is assumed to be "message/rfc822" in a
|
|||
|
"multipart/digest" and "text/plain" otherwise.
|
|||
|
|
|||
|
NOTE: The CRLF preceding the boundary delimiter line is conceptually
|
|||
|
attached to the boundary so that it is possible to have a part that
|
|||
|
does not end with a CRLF (line break). Body parts that must be
|
|||
|
considered to end with line breaks, therefore, must have two CRLFs
|
|||
|
preceding the boundary delimiter line, the first of which is part of
|
|||
|
the preceding body part, and the second of which is part of the
|
|||
|
encapsulation boundary.
|
|||
|
|
|||
|
Boundary delimiters must not appear within the encapsulated material,
|
|||
|
and must be no longer than 70 characters, not counting the two
|
|||
|
leading hyphens.
|
|||
|
|
|||
|
The boundary delimiter line following the last body part is a
|
|||
|
distinguished delimiter that indicates that no further body parts
|
|||
|
will follow. Such a delimiter line is identical to the previous
|
|||
|
delimiter lines, with the addition of two more hyphens after the
|
|||
|
boundary parameter value.
|
|||
|
|
|||
|
--gc0pJq0M:08jU534c0p--
|
|||
|
|
|||
|
NOTE TO IMPLEMENTORS: Boundary string comparisons must compare the
|
|||
|
boundary value with the beginning of each candidate line. An exact
|
|||
|
match of the entire candidate line is not required; it is sufficient
|
|||
|
that the boundary appear in its entirety following the CRLF.
|
|||
|
|
|||
|
There appears to be room for additional information prior to the
|
|||
|
first boundary delimiter line and following the final boundary
|
|||
|
delimiter line. These areas should generally be left blank, and
|
|||
|
implementations must ignore anything that appears before the first
|
|||
|
boundary delimiter line or after the last one.
|
|||
|
|
|||
|
NOTE: These "preamble" and "epilogue" areas are generally not used
|
|||
|
because of the lack of proper typing of these parts and the lack of
|
|||
|
clear semantics for handling these areas at gateways, particularly
|
|||
|
X.400 gateways. However, rather than leaving the preamble area
|
|||
|
blank, many MIME implementations have found this to be a convenient
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 20]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
place to insert an explanatory note for recipients who read the
|
|||
|
message with pre-MIME software, since such notes will be ignored by
|
|||
|
MIME-compliant software.
|
|||
|
|
|||
|
NOTE: Because boundary delimiters must not appear in the body parts
|
|||
|
being encapsulated, a user agent must exercise care to choose a
|
|||
|
unique boundary parameter value. The boundary parameter value in the
|
|||
|
example above could have been the result of an algorithm designed to
|
|||
|
produce boundary delimiters with a very low probability of already
|
|||
|
existing in the data to be encapsulated without having to prescan the
|
|||
|
data. Alternate algorithms might result in more "readable" boundary
|
|||
|
delimiters for a recipient with an old user agent, but would require
|
|||
|
more attention to the possibility that the boundary delimiter might
|
|||
|
appear at the beginning of some line in the encapsulated part. The
|
|||
|
simplest boundary delimiter line possible is something like "---",
|
|||
|
with a closing boundary delimiter line of "-----".
|
|||
|
|
|||
|
As a very simple example, the following multipart message has two
|
|||
|
parts, both of them plain text, one of them explicitly typed and one
|
|||
|
of them implicitly typed:
|
|||
|
|
|||
|
From: Nathaniel Borenstein <nsb@bellcore.com>
|
|||
|
To: Ned Freed <ned@innosoft.com>
|
|||
|
Date: Sun, 21 Mar 1993 23:56:48 -0800 (PST)
|
|||
|
Subject: Sample message
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-type: multipart/mixed; boundary="simple boundary"
|
|||
|
|
|||
|
This is the preamble. It is to be ignored, though it
|
|||
|
is a handy place for composition agents to include an
|
|||
|
explanatory note to non-MIME conformant readers.
|
|||
|
|
|||
|
--simple boundary
|
|||
|
|
|||
|
This is implicitly typed plain US-ASCII text.
|
|||
|
It does NOT end with a linebreak.
|
|||
|
--simple boundary
|
|||
|
Content-type: text/plain; charset=us-ascii
|
|||
|
|
|||
|
This is explicitly typed plain US-ASCII text.
|
|||
|
It DOES end with a linebreak.
|
|||
|
|
|||
|
--simple boundary--
|
|||
|
|
|||
|
This is the epilogue. It is also to be ignored.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 21]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
The use of a media type of "multipart" in a body part within another
|
|||
|
"multipart" entity is explicitly allowed. In such cases, for obvious
|
|||
|
reasons, care must be taken to ensure that each nested "multipart"
|
|||
|
entity uses a different boundary delimiter. See RFC 2049 for an
|
|||
|
example of nested "multipart" entities.
|
|||
|
|
|||
|
The use of the "multipart" media type with only a single body part
|
|||
|
may be useful in certain contexts, and is explicitly permitted.
|
|||
|
|
|||
|
NOTE: Experience has shown that a "multipart" media type with a
|
|||
|
single body part is useful for sending non-text media types. It has
|
|||
|
the advantage of providing the preamble as a place to include
|
|||
|
decoding instructions. In addition, a number of SMTP gateways move
|
|||
|
or remove the MIME headers, and a clever MIME decoder can take a good
|
|||
|
guess at multipart boundaries even in the absence of the Content-Type
|
|||
|
header and thereby successfully decode the message.
|
|||
|
|
|||
|
The only mandatory global parameter for the "multipart" media type is
|
|||
|
the boundary parameter, which consists of 1 to 70 characters from a
|
|||
|
set of characters known to be very robust through mail gateways, and
|
|||
|
NOT ending with white space. (If a boundary delimiter line appears to
|
|||
|
end with white space, the white space must be presumed to have been
|
|||
|
added by a gateway, and must be deleted.) It is formally specified
|
|||
|
by the following BNF:
|
|||
|
|
|||
|
boundary := 0*69<bchars> bcharsnospace
|
|||
|
|
|||
|
bchars := bcharsnospace / " "
|
|||
|
|
|||
|
bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" /
|
|||
|
"+" / "_" / "," / "-" / "." /
|
|||
|
"/" / ":" / "=" / "?"
|
|||
|
|
|||
|
Overall, the body of a "multipart" entity may be specified as
|
|||
|
follows:
|
|||
|
|
|||
|
dash-boundary := "--" boundary
|
|||
|
; boundary taken from the value of
|
|||
|
; boundary parameter of the
|
|||
|
; Content-Type field.
|
|||
|
|
|||
|
multipart-body := [preamble CRLF]
|
|||
|
dash-boundary transport-padding CRLF
|
|||
|
body-part *encapsulation
|
|||
|
close-delimiter transport-padding
|
|||
|
[CRLF epilogue]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 22]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
transport-padding := *LWSP-char
|
|||
|
; Composers MUST NOT generate
|
|||
|
; non-zero length transport
|
|||
|
; padding, but receivers MUST
|
|||
|
; be able to handle padding
|
|||
|
; added by message transports.
|
|||
|
|
|||
|
encapsulation := delimiter transport-padding
|
|||
|
CRLF body-part
|
|||
|
|
|||
|
delimiter := CRLF dash-boundary
|
|||
|
|
|||
|
close-delimiter := delimiter "--"
|
|||
|
|
|||
|
preamble := discard-text
|
|||
|
|
|||
|
epilogue := discard-text
|
|||
|
|
|||
|
discard-text := *(*text CRLF) *text
|
|||
|
; May be ignored or discarded.
|
|||
|
|
|||
|
body-part := MIME-part-headers [CRLF *OCTET]
|
|||
|
; Lines in a body-part must not start
|
|||
|
; with the specified dash-boundary and
|
|||
|
; the delimiter must not appear anywhere
|
|||
|
; in the body part. Note that the
|
|||
|
; semantics of a body-part differ from
|
|||
|
; the semantics of a message, as
|
|||
|
; described in the text.
|
|||
|
|
|||
|
OCTET := <any 0-255 octet value>
|
|||
|
|
|||
|
IMPORTANT: The free insertion of linear-white-space and RFC 822
|
|||
|
comments between the elements shown in this BNF is NOT allowed since
|
|||
|
this BNF does not specify a structured header field.
|
|||
|
|
|||
|
NOTE: In certain transport enclaves, RFC 822 restrictions such as
|
|||
|
the one that limits bodies to printable US-ASCII characters may not
|
|||
|
be in force. (That is, the transport domains may exist that resemble
|
|||
|
standard Internet mail transport as specified in RFC 821 and assumed
|
|||
|
by RFC 822, but without certain restrictions.) The relaxation of
|
|||
|
these restrictions should be construed as locally extending the
|
|||
|
definition of bodies, for example to include octets outside of the
|
|||
|
US-ASCII range, as long as these extensions are supported by the
|
|||
|
transport and adequately documented in the Content- Transfer-Encoding
|
|||
|
header field. However, in no event are headers (either message
|
|||
|
headers or body part headers) allowed to contain anything other than
|
|||
|
US-ASCII characters.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 23]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
NOTE: Conspicuously missing from the "multipart" type is a notion of
|
|||
|
structured, related body parts. It is recommended that those wishing
|
|||
|
to provide more structured or integrated multipart messaging
|
|||
|
facilities should define subtypes of multipart that are syntactically
|
|||
|
identical but define relationships between the various parts. For
|
|||
|
example, subtypes of multipart could be defined that include a
|
|||
|
distinguished part which in turn is used to specify the relationships
|
|||
|
between the other parts, probably referring to them by their
|
|||
|
Content-ID field. Old implementations will not recognize the new
|
|||
|
subtype if this approach is used, but will treat it as
|
|||
|
multipart/mixed and will thus be able to show the user the parts that
|
|||
|
are recognized.
|
|||
|
|
|||
|
5.1.2. Handling Nested Messages and Multiparts
|
|||
|
|
|||
|
The "message/rfc822" subtype defined in a subsequent section of this
|
|||
|
document has no terminating condition other than running out of data.
|
|||
|
Similarly, an improperly truncated "multipart" entity may not have
|
|||
|
any terminating boundary marker, and can turn up operationally due to
|
|||
|
mail system malfunctions.
|
|||
|
|
|||
|
It is essential that such entities be handled correctly when they are
|
|||
|
themselves imbedded inside of another "multipart" structure. MIME
|
|||
|
implementations are therefore required to recognize outer level
|
|||
|
boundary markers at ANY level of inner nesting. It is not sufficient
|
|||
|
to only check for the next expected marker or other terminating
|
|||
|
condition.
|
|||
|
|
|||
|
5.1.3. Mixed Subtype
|
|||
|
|
|||
|
The "mixed" subtype of "multipart" is intended for use when the body
|
|||
|
parts are independent and need to be bundled in a particular order.
|
|||
|
Any "multipart" subtypes that an implementation does not recognize
|
|||
|
must be treated as being of subtype "mixed".
|
|||
|
|
|||
|
5.1.4. Alternative Subtype
|
|||
|
|
|||
|
The "multipart/alternative" type is syntactically identical to
|
|||
|
"multipart/mixed", but the semantics are different. In particular,
|
|||
|
each of the body parts is an "alternative" version of the same
|
|||
|
information.
|
|||
|
|
|||
|
Systems should recognize that the content of the various parts are
|
|||
|
interchangeable. Systems should choose the "best" type based on the
|
|||
|
local environment and references, in some cases even through user
|
|||
|
interaction. As with "multipart/mixed", the order of body parts is
|
|||
|
significant. In this case, the alternatives appear in an order of
|
|||
|
increasing faithfulness to the original content. In general, the
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 24]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
best choice is the LAST part of a type supported by the recipient
|
|||
|
system's local environment.
|
|||
|
|
|||
|
"Multipart/alternative" may be used, for example, to send a message
|
|||
|
in a fancy text format in such a way that it can easily be displayed
|
|||
|
anywhere:
|
|||
|
|
|||
|
From: Nathaniel Borenstein <nsb@bellcore.com>
|
|||
|
To: Ned Freed <ned@innosoft.com>
|
|||
|
Date: Mon, 22 Mar 1993 09:41:09 -0800 (PST)
|
|||
|
Subject: Formatted text mail
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-Type: multipart/alternative; boundary=boundary42
|
|||
|
|
|||
|
--boundary42
|
|||
|
Content-Type: text/plain; charset=us-ascii
|
|||
|
|
|||
|
... plain text version of message goes here ...
|
|||
|
|
|||
|
--boundary42
|
|||
|
Content-Type: text/enriched
|
|||
|
|
|||
|
... RFC 1896 text/enriched version of same message
|
|||
|
goes here ...
|
|||
|
|
|||
|
--boundary42
|
|||
|
Content-Type: application/x-whatever
|
|||
|
|
|||
|
... fanciest version of same message goes here ...
|
|||
|
|
|||
|
--boundary42--
|
|||
|
|
|||
|
In this example, users whose mail systems understood the
|
|||
|
"application/x-whatever" format would see only the fancy version,
|
|||
|
while other users would see only the enriched or plain text version,
|
|||
|
depending on the capabilities of their system.
|
|||
|
|
|||
|
In general, user agents that compose "multipart/alternative" entities
|
|||
|
must place the body parts in increasing order of preference, that is,
|
|||
|
with the preferred format last. For fancy text, the sending user
|
|||
|
agent should put the plainest format first and the richest format
|
|||
|
last. Receiving user agents should pick and display the last format
|
|||
|
they are capable of displaying. In the case where one of the
|
|||
|
alternatives is itself of type "multipart" and contains unrecognized
|
|||
|
sub-parts, the user agent may choose either to show that alternative,
|
|||
|
an earlier alternative, or both.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 25]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
NOTE: From an implementor's perspective, it might seem more sensible
|
|||
|
to reverse this ordering, and have the plainest alternative last.
|
|||
|
However, placing the plainest alternative first is the friendliest
|
|||
|
possible option when "multipart/alternative" entities are viewed
|
|||
|
using a non-MIME-conformant viewer. While this approach does impose
|
|||
|
some burden on conformant MIME viewers, interoperability with older
|
|||
|
mail readers was deemed to be more important in this case.
|
|||
|
|
|||
|
It may be the case that some user agents, if they can recognize more
|
|||
|
than one of the formats, will prefer to offer the user the choice of
|
|||
|
which format to view. This makes sense, for example, if a message
|
|||
|
includes both a nicely- formatted image version and an easily-edited
|
|||
|
text version. What is most critical, however, is that the user not
|
|||
|
automatically be shown multiple versions of the same data. Either
|
|||
|
the user should be shown the last recognized version or should be
|
|||
|
given the choice.
|
|||
|
|
|||
|
THE SEMANTICS OF CONTENT-ID IN MULTIPART/ALTERNATIVE: Each part of a
|
|||
|
"multipart/alternative" entity represents the same data, but the
|
|||
|
mappings between the two are not necessarily without information
|
|||
|
loss. For example, information is lost when translating ODA to
|
|||
|
PostScript or plain text. It is recommended that each part should
|
|||
|
have a different Content-ID value in the case where the information
|
|||
|
content of the two parts is not identical. And when the information
|
|||
|
content is identical -- for example, where several parts of type
|
|||
|
"message/external-body" specify alternate ways to access the
|
|||
|
identical data -- the same Content-ID field value should be used, to
|
|||
|
optimize any caching mechanisms that might be present on the
|
|||
|
recipient's end. However, the Content-ID values used by the parts
|
|||
|
should NOT be the same Content-ID value that describes the
|
|||
|
"multipart/alternative" as a whole, if there is any such Content-ID
|
|||
|
field. That is, one Content-ID value will refer to the
|
|||
|
"multipart/alternative" entity, while one or more other Content-ID
|
|||
|
values will refer to the parts inside it.
|
|||
|
|
|||
|
5.1.5. Digest Subtype
|
|||
|
|
|||
|
This document defines a "digest" subtype of the "multipart" Content-
|
|||
|
Type. This type is syntactically identical to "multipart/mixed", but
|
|||
|
the semantics are different. In particular, in a digest, the default
|
|||
|
Content-Type value for a body part is changed from "text/plain" to
|
|||
|
"message/rfc822". This is done to allow a more readable digest
|
|||
|
format that is largely compatible (except for the quoting convention)
|
|||
|
with RFC 934.
|
|||
|
|
|||
|
Note: Though it is possible to specify a Content-Type value for a
|
|||
|
body part in a digest which is other than "message/rfc822", such as a
|
|||
|
"text/plain" part containing a description of the material in the
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 26]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
digest, actually doing so is undesireble. The "multipart/digest"
|
|||
|
Content-Type is intended to be used to send collections of messages.
|
|||
|
If a "text/plain" part is needed, it should be included as a seperate
|
|||
|
part of a "multipart/mixed" message.
|
|||
|
|
|||
|
A digest in this format might, then, look something like this:
|
|||
|
|
|||
|
From: Moderator-Address
|
|||
|
To: Recipient-List
|
|||
|
Date: Mon, 22 Mar 1994 13:34:51 +0000
|
|||
|
Subject: Internet Digest, volume 42
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-Type: multipart/mixed;
|
|||
|
boundary="---- main boundary ----"
|
|||
|
|
|||
|
------ main boundary ----
|
|||
|
|
|||
|
...Introductory text or table of contents...
|
|||
|
|
|||
|
------ main boundary ----
|
|||
|
Content-Type: multipart/digest;
|
|||
|
boundary="---- next message ----"
|
|||
|
|
|||
|
------ next message ----
|
|||
|
|
|||
|
From: someone-else
|
|||
|
Date: Fri, 26 Mar 1993 11:13:32 +0200
|
|||
|
Subject: my opinion
|
|||
|
|
|||
|
...body goes here ...
|
|||
|
|
|||
|
------ next message ----
|
|||
|
|
|||
|
From: someone-else-again
|
|||
|
Date: Fri, 26 Mar 1993 10:07:13 -0500
|
|||
|
Subject: my different opinion
|
|||
|
|
|||
|
... another body goes here ...
|
|||
|
|
|||
|
------ next message ------
|
|||
|
|
|||
|
------ main boundary ------
|
|||
|
|
|||
|
5.1.6. Parallel Subtype
|
|||
|
|
|||
|
This document defines a "parallel" subtype of the "multipart"
|
|||
|
Content-Type. This type is syntactically identical to
|
|||
|
"multipart/mixed", but the semantics are different. In particular,
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 27]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
in a parallel entity, the order of body parts is not significant.
|
|||
|
|
|||
|
A common presentation of this type is to display all of the parts
|
|||
|
simultaneously on hardware and software that are capable of doing so.
|
|||
|
However, composing agents should be aware that many mail readers will
|
|||
|
lack this capability and will show the parts serially in any event.
|
|||
|
|
|||
|
5.1.7. Other Multipart Subtypes
|
|||
|
|
|||
|
Other "multipart" subtypes are expected in the future. MIME
|
|||
|
implementations must in general treat unrecognized subtypes of
|
|||
|
"multipart" as being equivalent to "multipart/mixed".
|
|||
|
|
|||
|
5.2. Message Media Type
|
|||
|
|
|||
|
It is frequently desirable, in sending mail, to encapsulate another
|
|||
|
mail message. A special media type, "message", is defined to
|
|||
|
facilitate this. In particular, the "rfc822" subtype of "message" is
|
|||
|
used to encapsulate RFC 822 messages.
|
|||
|
|
|||
|
NOTE: It has been suggested that subtypes of "message" might be
|
|||
|
defined for forwarded or rejected messages. However, forwarded and
|
|||
|
rejected messages can be handled as multipart messages in which the
|
|||
|
first part contains any control or descriptive information, and a
|
|||
|
second part, of type "message/rfc822", is the forwarded or rejected
|
|||
|
message. Composing rejection and forwarding messages in this manner
|
|||
|
will preserve the type information on the original message and allow
|
|||
|
it to be correctly presented to the recipient, and hence is strongly
|
|||
|
encouraged.
|
|||
|
|
|||
|
Subtypes of "message" often impose restrictions on what encodings are
|
|||
|
allowed. These restrictions are described in conjunction with each
|
|||
|
specific subtype.
|
|||
|
|
|||
|
Mail gateways, relays, and other mail handling agents are commonly
|
|||
|
known to alter the top-level header of an RFC 822 message. In
|
|||
|
particular, they frequently add, remove, or reorder header fields.
|
|||
|
These operations are explicitly forbidden for the encapsulated
|
|||
|
headers embedded in the bodies of messages of type "message."
|
|||
|
|
|||
|
5.2.1. RFC822 Subtype
|
|||
|
|
|||
|
A media type of "message/rfc822" indicates that the body contains an
|
|||
|
encapsulated message, with the syntax of an RFC 822 message.
|
|||
|
However, unlike top-level RFC 822 messages, the restriction that each
|
|||
|
"message/rfc822" body must include a "From", "Date", and at least one
|
|||
|
destination header is removed and replaced with the requirement that
|
|||
|
at least one of "From", "Subject", or "Date" must be present.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 28]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
It should be noted that, despite the use of the numbers "822", a
|
|||
|
"message/rfc822" entity isn't restricted to material in strict
|
|||
|
conformance to RFC822, nor are the semantics of "message/rfc822"
|
|||
|
objects restricted to the semantics defined in RFC822. More
|
|||
|
specifically, a "message/rfc822" message could well be a News article
|
|||
|
or a MIME message.
|
|||
|
|
|||
|
No encoding other than "7bit", "8bit", or "binary" is permitted for
|
|||
|
the body of a "message/rfc822" entity. The message header fields are
|
|||
|
always US-ASCII in any case, and data within the body can still be
|
|||
|
encoded, in which case the Content-Transfer-Encoding header field in
|
|||
|
the encapsulated message will reflect this. Non-US-ASCII text in the
|
|||
|
headers of an encapsulated message can be specified using the
|
|||
|
mechanisms described in RFC 2047.
|
|||
|
|
|||
|
5.2.2. Partial Subtype
|
|||
|
|
|||
|
The "partial" subtype is defined to allow large entities to be
|
|||
|
delivered as several separate pieces of mail and automatically
|
|||
|
reassembled by a receiving user agent. (The concept is similar to IP
|
|||
|
fragmentation and reassembly in the basic Internet Protocols.) This
|
|||
|
mechanism can be used when intermediate transport agents limit the
|
|||
|
size of individual messages that can be sent. The media type
|
|||
|
"message/partial" thus indicates that the body contains a fragment of
|
|||
|
a larger entity.
|
|||
|
|
|||
|
Because data of type "message" may never be encoded in base64 or
|
|||
|
quoted-printable, a problem might arise if "message/partial" entities
|
|||
|
are constructed in an environment that supports binary or 8bit
|
|||
|
transport. The problem is that the binary data would be split into
|
|||
|
multiple "message/partial" messages, each of them requiring binary
|
|||
|
transport. If such messages were encountered at a gateway into a
|
|||
|
7bit transport environment, there would be no way to properly encode
|
|||
|
them for the 7bit world, aside from waiting for all of the fragments,
|
|||
|
reassembling the inner message, and then encoding the reassembled
|
|||
|
data in base64 or quoted-printable. Since it is possible that
|
|||
|
different fragments might go through different gateways, even this is
|
|||
|
not an acceptable solution. For this reason, it is specified that
|
|||
|
entities of type "message/partial" must always have a content-
|
|||
|
transfer-encoding of 7bit (the default). In particular, even in
|
|||
|
environments that support binary or 8bit transport, the use of a
|
|||
|
content- transfer-encoding of "8bit" or "binary" is explicitly
|
|||
|
prohibited for MIME entities of type "message/partial". This in turn
|
|||
|
implies that the inner message must not use "8bit" or "binary"
|
|||
|
encoding.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 29]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
Because some message transfer agents may choose to automatically
|
|||
|
fragment large messages, and because such agents may use very
|
|||
|
different fragmentation thresholds, it is possible that the pieces of
|
|||
|
a partial message, upon reassembly, may prove themselves to comprise
|
|||
|
a partial message. This is explicitly permitted.
|
|||
|
|
|||
|
Three parameters must be specified in the Content-Type field of type
|
|||
|
"message/partial": The first, "id", is a unique identifier, as close
|
|||
|
to a world-unique identifier as possible, to be used to match the
|
|||
|
fragments together. (In general, the identifier is essentially a
|
|||
|
message-id; if placed in double quotes, it can be ANY message-id, in
|
|||
|
accordance with the BNF for "parameter" given in RFC 2045.) The
|
|||
|
second, "number", an integer, is the fragment number, which indicates
|
|||
|
where this fragment fits into the sequence of fragments. The third,
|
|||
|
"total", another integer, is the total number of fragments. This
|
|||
|
third subfield is required on the final fragment, and is optional
|
|||
|
(though encouraged) on the earlier fragments. Note also that these
|
|||
|
parameters may be given in any order.
|
|||
|
|
|||
|
Thus, the second piece of a 3-piece message may have either of the
|
|||
|
following header fields:
|
|||
|
|
|||
|
Content-Type: Message/Partial; number=2; total=3;
|
|||
|
id="oc=jpbe0M2Yt4s@thumper.bellcore.com"
|
|||
|
|
|||
|
Content-Type: Message/Partial;
|
|||
|
id="oc=jpbe0M2Yt4s@thumper.bellcore.com";
|
|||
|
number=2
|
|||
|
|
|||
|
But the third piece MUST specify the total number of fragments:
|
|||
|
|
|||
|
Content-Type: Message/Partial; number=3; total=3;
|
|||
|
id="oc=jpbe0M2Yt4s@thumper.bellcore.com"
|
|||
|
|
|||
|
Note that fragment numbering begins with 1, not 0.
|
|||
|
|
|||
|
When the fragments of an entity broken up in this manner are put
|
|||
|
together, the result is always a complete MIME entity, which may have
|
|||
|
its own Content-Type header field, and thus may contain any other
|
|||
|
data type.
|
|||
|
|
|||
|
5.2.2.1. Message Fragmentation and Reassembly
|
|||
|
|
|||
|
The semantics of a reassembled partial message must be those of the
|
|||
|
"inner" message, rather than of a message containing the inner
|
|||
|
message. This makes it possible, for example, to send a large audio
|
|||
|
message as several partial messages, and still have it appear to the
|
|||
|
recipient as a simple audio message rather than as an encapsulated
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 30]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
message containing an audio message. That is, the encapsulation of
|
|||
|
the message is considered to be "transparent".
|
|||
|
|
|||
|
When generating and reassembling the pieces of a "message/partial"
|
|||
|
message, the headers of the encapsulated message must be merged with
|
|||
|
the headers of the enclosing entities. In this process the following
|
|||
|
rules must be observed:
|
|||
|
|
|||
|
(1) Fragmentation agents must split messages at line
|
|||
|
boundaries only. This restriction is imposed because
|
|||
|
splits at points other than the ends of lines in turn
|
|||
|
depends on message transports being able to preserve
|
|||
|
the semantics of messages that don't end with a CRLF
|
|||
|
sequence. Many transports are incapable of preserving
|
|||
|
such semantics.
|
|||
|
|
|||
|
(2) All of the header fields from the initial enclosing
|
|||
|
message, except those that start with "Content-" and
|
|||
|
the specific header fields "Subject", "Message-ID",
|
|||
|
"Encrypted", and "MIME-Version", must be copied, in
|
|||
|
order, to the new message.
|
|||
|
|
|||
|
(3) The header fields in the enclosed message which start
|
|||
|
with "Content-", plus the "Subject", "Message-ID",
|
|||
|
"Encrypted", and "MIME-Version" fields, must be
|
|||
|
appended, in order, to the header fields of the new
|
|||
|
message. Any header fields in the enclosed message
|
|||
|
which do not start with "Content-" (except for the
|
|||
|
"Subject", "Message-ID", "Encrypted", and "MIME-
|
|||
|
Version" fields) will be ignored and dropped.
|
|||
|
|
|||
|
(4) All of the header fields from the second and any
|
|||
|
subsequent enclosing messages are discarded by the
|
|||
|
reassembly process.
|
|||
|
|
|||
|
5.2.2.2. Fragmentation and Reassembly Example
|
|||
|
|
|||
|
If an audio message is broken into two pieces, the first piece might
|
|||
|
look something like this:
|
|||
|
|
|||
|
X-Weird-Header-1: Foo
|
|||
|
From: Bill@host.com
|
|||
|
To: joe@otherhost.com
|
|||
|
Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
|
|||
|
Subject: Audio mail (part 1 of 2)
|
|||
|
Message-ID: <id1@host.com>
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-type: message/partial; id="ABC@host.com";
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 31]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
number=1; total=2
|
|||
|
|
|||
|
X-Weird-Header-1: Bar
|
|||
|
X-Weird-Header-2: Hello
|
|||
|
Message-ID: <anotherid@foo.com>
|
|||
|
Subject: Audio mail
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-type: audio/basic
|
|||
|
Content-transfer-encoding: base64
|
|||
|
|
|||
|
... first half of encoded audio data goes here ...
|
|||
|
|
|||
|
and the second half might look something like this:
|
|||
|
|
|||
|
From: Bill@host.com
|
|||
|
To: joe@otherhost.com
|
|||
|
Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
|
|||
|
Subject: Audio mail (part 2 of 2)
|
|||
|
MIME-Version: 1.0
|
|||
|
Message-ID: <id2@host.com>
|
|||
|
Content-type: message/partial;
|
|||
|
id="ABC@host.com"; number=2; total=2
|
|||
|
|
|||
|
... second half of encoded audio data goes here ...
|
|||
|
|
|||
|
Then, when the fragmented message is reassembled, the resulting
|
|||
|
message to be displayed to the user should look something like this:
|
|||
|
|
|||
|
X-Weird-Header-1: Foo
|
|||
|
From: Bill@host.com
|
|||
|
To: joe@otherhost.com
|
|||
|
Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
|
|||
|
Subject: Audio mail
|
|||
|
Message-ID: <anotherid@foo.com>
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-type: audio/basic
|
|||
|
Content-transfer-encoding: base64
|
|||
|
|
|||
|
... first half of encoded audio data goes here ...
|
|||
|
... second half of encoded audio data goes here ...
|
|||
|
|
|||
|
The inclusion of a "References" field in the headers of the second
|
|||
|
and subsequent pieces of a fragmented message that references the
|
|||
|
Message-Id on the previous piece may be of benefit to mail readers
|
|||
|
that understand and track references. However, the generation of
|
|||
|
such "References" fields is entirely optional.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 32]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
Finally, it should be noted that the "Encrypted" header field has
|
|||
|
been made obsolete by Privacy Enhanced Messaging (PEM) [RFC-1421,
|
|||
|
RFC-1422, RFC-1423, RFC-1424], but the rules above are nevertheless
|
|||
|
believed to describe the correct way to treat it if it is encountered
|
|||
|
in the context of conversion to and from "message/partial" fragments.
|
|||
|
|
|||
|
5.2.3. External-Body Subtype
|
|||
|
|
|||
|
The external-body subtype indicates that the actual body data are not
|
|||
|
included, but merely referenced. In this case, the parameters
|
|||
|
describe a mechanism for accessing the external data.
|
|||
|
|
|||
|
When a MIME entity is of type "message/external-body", it consists of
|
|||
|
a header, two consecutive CRLFs, and the message header for the
|
|||
|
encapsulated message. If another pair of consecutive CRLFs appears,
|
|||
|
this of course ends the message header for the encapsulated message.
|
|||
|
However, since the encapsulated message's body is itself external, it
|
|||
|
does NOT appear in the area that follows. For example, consider the
|
|||
|
following message:
|
|||
|
|
|||
|
Content-type: message/external-body;
|
|||
|
access-type=local-file;
|
|||
|
name="/u/nsb/Me.jpeg"
|
|||
|
|
|||
|
Content-type: image/jpeg
|
|||
|
Content-ID: <id42@guppylake.bellcore.com>
|
|||
|
Content-Transfer-Encoding: binary
|
|||
|
|
|||
|
THIS IS NOT REALLY THE BODY!
|
|||
|
|
|||
|
The area at the end, which might be called the "phantom body", is
|
|||
|
ignored for most external-body messages. However, it may be used to
|
|||
|
contain auxiliary information for some such messages, as indeed it is
|
|||
|
when the access-type is "mail- server". The only access-type defined
|
|||
|
in this document that uses the phantom body is "mail-server", but
|
|||
|
other access-types may be defined in the future in other
|
|||
|
specifications that use this area.
|
|||
|
|
|||
|
The encapsulated headers in ALL "message/external-body" entities MUST
|
|||
|
include a Content-ID header field to give a unique identifier by
|
|||
|
which to reference the data. This identifier may be used for caching
|
|||
|
mechanisms, and for recognizing the receipt of the data when the
|
|||
|
access-type is "mail-server".
|
|||
|
|
|||
|
Note that, as specified here, the tokens that describe external-body
|
|||
|
data, such as file names and mail server commands, are required to be
|
|||
|
in the US-ASCII character set.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 33]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
If this proves problematic in practice, a new mechanism may be
|
|||
|
required as a future extension to MIME, either as newly defined
|
|||
|
access-types for "message/external-body" or by some other mechanism.
|
|||
|
|
|||
|
As with "message/partial", MIME entities of type "message/external-
|
|||
|
body" MUST have a content-transfer-encoding of 7bit (the default).
|
|||
|
In particular, even in environments that support binary or 8bit
|
|||
|
transport, the use of a content- transfer-encoding of "8bit" or
|
|||
|
"binary" is explicitly prohibited for entities of type
|
|||
|
"message/external-body".
|
|||
|
|
|||
|
5.2.3.1. General External-Body Parameters
|
|||
|
|
|||
|
The parameters that may be used with any "message/external- body"
|
|||
|
are:
|
|||
|
|
|||
|
(1) ACCESS-TYPE -- A word indicating the supported access
|
|||
|
mechanism by which the file or data may be obtained.
|
|||
|
This word is not case sensitive. Values include, but
|
|||
|
are not limited to, "FTP", "ANON-FTP", "TFTP", "LOCAL-
|
|||
|
FILE", and "MAIL-SERVER". Future values, except for
|
|||
|
experimental values beginning with "X-", must be
|
|||
|
registered with IANA, as described in RFC 2048.
|
|||
|
This parameter is unconditionally mandatory and MUST be
|
|||
|
present on EVERY "message/external-body".
|
|||
|
|
|||
|
(2) EXPIRATION -- The date (in the RFC 822 "date-time"
|
|||
|
syntax, as extended by RFC 1123 to permit 4 digits in
|
|||
|
the year field) after which the existence of the
|
|||
|
external data is not guaranteed. This parameter may be
|
|||
|
used with ANY access-type and is ALWAYS optional.
|
|||
|
|
|||
|
(3) SIZE -- The size (in octets) of the data. The intent
|
|||
|
of this parameter is to help the recipient decide
|
|||
|
whether or not to expend the necessary resources to
|
|||
|
retrieve the external data. Note that this describes
|
|||
|
the size of the data in its canonical form, that is,
|
|||
|
before any Content-Transfer-Encoding has been applied
|
|||
|
or after the data have been decoded. This parameter
|
|||
|
may be used with ANY access-type and is ALWAYS
|
|||
|
optional.
|
|||
|
|
|||
|
(4) PERMISSION -- A case-insensitive field that indicates
|
|||
|
whether or not it is expected that clients might also
|
|||
|
attempt to overwrite the data. By default, or if
|
|||
|
permission is "read", the assumption is that they are
|
|||
|
not, and that if the data is retrieved once, it is
|
|||
|
never needed again. If PERMISSION is "read-write",
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 34]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
this assumption is invalid, and any local copy must be
|
|||
|
considered no more than a cache. "Read" and "Read-
|
|||
|
write" are the only defined values of permission. This
|
|||
|
parameter may be used with ANY access-type and is
|
|||
|
ALWAYS optional.
|
|||
|
|
|||
|
The precise semantics of the access-types defined here are described
|
|||
|
in the sections that follow.
|
|||
|
|
|||
|
5.2.3.2. The 'ftp' and 'tftp' Access-Types
|
|||
|
|
|||
|
An access-type of FTP or TFTP indicates that the message body is
|
|||
|
accessible as a file using the FTP [RFC-959] or TFTP [RFC- 783]
|
|||
|
protocols, respectively. For these access-types, the following
|
|||
|
additional parameters are mandatory:
|
|||
|
|
|||
|
(1) NAME -- The name of the file that contains the actual
|
|||
|
body data.
|
|||
|
|
|||
|
(2) SITE -- A machine from which the file may be obtained,
|
|||
|
using the given protocol. This must be a fully
|
|||
|
qualified domain name, not a nickname.
|
|||
|
|
|||
|
(3) Before any data are retrieved, using FTP, the user will
|
|||
|
generally need to be asked to provide a login id and a
|
|||
|
password for the machine named by the site parameter.
|
|||
|
For security reasons, such an id and password are not
|
|||
|
specified as content-type parameters, but must be
|
|||
|
obtained from the user.
|
|||
|
|
|||
|
In addition, the following parameters are optional:
|
|||
|
|
|||
|
(1) DIRECTORY -- A directory from which the data named by
|
|||
|
NAME should be retrieved.
|
|||
|
|
|||
|
(2) MODE -- A case-insensitive string indicating the mode
|
|||
|
to be used when retrieving the information. The valid
|
|||
|
values for access-type "TFTP" are "NETASCII", "OCTET",
|
|||
|
and "MAIL", as specified by the TFTP protocol [RFC-
|
|||
|
783]. The valid values for access-type "FTP" are
|
|||
|
"ASCII", "EBCDIC", "IMAGE", and "LOCALn" where "n" is a
|
|||
|
decimal integer, typically 8. These correspond to the
|
|||
|
representation types "A" "E" "I" and "L n" as specified
|
|||
|
by the FTP protocol [RFC-959]. Note that "BINARY" and
|
|||
|
"TENEX" are not valid values for MODE and that "OCTET"
|
|||
|
or "IMAGE" or "LOCAL8" should be used instead. IF MODE
|
|||
|
is not specified, the default value is "NETASCII" for
|
|||
|
TFTP and "ASCII" otherwise.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 35]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
5.2.3.3. The 'anon-ftp' Access-Type
|
|||
|
|
|||
|
The "anon-ftp" access-type is identical to the "ftp" access type,
|
|||
|
except that the user need not be asked to provide a name and password
|
|||
|
for the specified site. Instead, the ftp protocol will be used with
|
|||
|
login "anonymous" and a password that corresponds to the user's mail
|
|||
|
address.
|
|||
|
|
|||
|
5.2.3.4. The 'local-file' Access-Type
|
|||
|
|
|||
|
An access-type of "local-file" indicates that the actual body is
|
|||
|
accessible as a file on the local machine. Two additional parameters
|
|||
|
are defined for this access type:
|
|||
|
|
|||
|
(1) NAME -- The name of the file that contains the actual
|
|||
|
body data. This parameter is mandatory for the
|
|||
|
"local-file" access-type.
|
|||
|
|
|||
|
(2) SITE -- A domain specifier for a machine or set of
|
|||
|
machines that are known to have access to the data
|
|||
|
file. This optional parameter is used to describe the
|
|||
|
locality of reference for the data, that is, the site
|
|||
|
or sites at which the file is expected to be visible.
|
|||
|
Asterisks may be used for wildcard matching to a part
|
|||
|
of a domain name, such as "*.bellcore.com", to indicate
|
|||
|
a set of machines on which the data should be directly
|
|||
|
visible, while a single asterisk may be used to
|
|||
|
indicate a file that is expected to be universally
|
|||
|
available, e.g., via a global file system.
|
|||
|
|
|||
|
5.2.3.5. The 'mail-server' Access-Type
|
|||
|
|
|||
|
The "mail-server" access-type indicates that the actual body is
|
|||
|
available from a mail server. Two additional parameters are defined
|
|||
|
for this access-type:
|
|||
|
|
|||
|
(1) SERVER -- The addr-spec of the mail server from which
|
|||
|
the actual body data can be obtained. This parameter
|
|||
|
is mandatory for the "mail-server" access-type.
|
|||
|
|
|||
|
(2) SUBJECT -- The subject that is to be used in the mail
|
|||
|
that is sent to obtain the data. Note that keying mail
|
|||
|
servers on Subject lines is NOT recommended, but such
|
|||
|
mail servers are known to exist. This is an optional
|
|||
|
parameter.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 36]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
Because mail servers accept a variety of syntaxes, some of which is
|
|||
|
multiline, the full command to be sent to a mail server is not
|
|||
|
included as a parameter in the content-type header field. Instead,
|
|||
|
it is provided as the "phantom body" when the media type is
|
|||
|
"message/external-body" and the access-type is mail-server.
|
|||
|
|
|||
|
Note that MIME does not define a mail server syntax. Rather, it
|
|||
|
allows the inclusion of arbitrary mail server commands in the phantom
|
|||
|
body. Implementations must include the phantom body in the body of
|
|||
|
the message it sends to the mail server address to retrieve the
|
|||
|
relevant data.
|
|||
|
|
|||
|
Unlike other access-types, mail-server access is asynchronous and
|
|||
|
will happen at an unpredictable time in the future. For this reason,
|
|||
|
it is important that there be a mechanism by which the returned data
|
|||
|
can be matched up with the original "message/external-body" entity.
|
|||
|
MIME mail servers must use the same Content-ID field on the returned
|
|||
|
message that was used in the original "message/external-body"
|
|||
|
entities, to facilitate such matching.
|
|||
|
|
|||
|
5.2.3.6. External-Body Security Issues
|
|||
|
|
|||
|
"Message/external-body" entities give rise to two important security
|
|||
|
issues:
|
|||
|
|
|||
|
(1) Accessing data via a "message/external-body" reference
|
|||
|
effectively results in the message recipient performing
|
|||
|
an operation that was specified by the message
|
|||
|
originator. It is therefore possible for the message
|
|||
|
originator to trick a recipient into doing something
|
|||
|
they would not have done otherwise. For example, an
|
|||
|
originator could specify a action that attempts
|
|||
|
retrieval of material that the recipient is not
|
|||
|
authorized to obtain, causing the recipient to
|
|||
|
unwittingly violate some security policy. For this
|
|||
|
reason, user agents capable of resolving external
|
|||
|
references must always take steps to describe the
|
|||
|
action they are to take to the recipient and ask for
|
|||
|
explicit permisssion prior to performing it.
|
|||
|
|
|||
|
The 'mail-server' access-type is particularly
|
|||
|
vulnerable, in that it causes the recipient to send a
|
|||
|
new message whose contents are specified by the
|
|||
|
original message's originator. Given the potential for
|
|||
|
abuse, any such request messages that are constructed
|
|||
|
should contain a clear indication that they were
|
|||
|
generated automatically (e.g. in a Comments: header
|
|||
|
field) in an attempt to resolve a MIME
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 37]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
"message/external-body" reference.
|
|||
|
|
|||
|
(2) MIME will sometimes be used in environments that
|
|||
|
provide some guarantee of message integrity and
|
|||
|
authenticity. If present, such guarantees may apply
|
|||
|
only to the actual direct content of messages -- they
|
|||
|
may or may not apply to data accessed through MIME's
|
|||
|
"message/external-body" mechanism. In particular, it
|
|||
|
may be possible to subvert certain access mechanisms
|
|||
|
even when the messaging system itself is secure.
|
|||
|
|
|||
|
It should be noted that this problem exists either with
|
|||
|
or without the availabilty of MIME mechanisms. A
|
|||
|
casual reference to an FTP site containing a document
|
|||
|
in the text of a secure message brings up similar
|
|||
|
issues -- the only difference is that MIME provides for
|
|||
|
automatic retrieval of such material, and users may
|
|||
|
place unwarranted trust is such automatic retrieval
|
|||
|
mechanisms.
|
|||
|
|
|||
|
5.2.3.7. Examples and Further Explanations
|
|||
|
|
|||
|
When the external-body mechanism is used in conjunction with the
|
|||
|
"multipart/alternative" media type it extends the functionality of
|
|||
|
"multipart/alternative" to include the case where the same entity is
|
|||
|
provided in the same format but via different accces mechanisms.
|
|||
|
When this is done the originator of the message must order the parts
|
|||
|
first in terms of preferred formats and then by preferred access
|
|||
|
mechanisms. The recipient's viewer should then evaluate the list
|
|||
|
both in terms of format and access mechanisms.
|
|||
|
|
|||
|
With the emerging possibility of very wide-area file systems, it
|
|||
|
becomes very hard to know in advance the set of machines where a file
|
|||
|
will and will not be accessible directly from the file system.
|
|||
|
Therefore it may make sense to provide both a file name, to be tried
|
|||
|
directly, and the name of one or more sites from which the file is
|
|||
|
known to be accessible. An implementation can try to retrieve remote
|
|||
|
files using FTP or any other protocol, using anonymous file retrieval
|
|||
|
or prompting the user for the necessary name and password. If an
|
|||
|
external body is accessible via multiple mechanisms, the sender may
|
|||
|
include multiple entities of type "message/external-body" within the
|
|||
|
body parts of an enclosing "multipart/alternative" entity.
|
|||
|
|
|||
|
However, the external-body mechanism is not intended to be limited to
|
|||
|
file retrieval, as shown by the mail-server access-type. Beyond
|
|||
|
this, one can imagine, for example, using a video server for external
|
|||
|
references to video clips.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 38]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
The embedded message header fields which appear in the body of the
|
|||
|
"message/external-body" data must be used to declare the media type
|
|||
|
of the external body if it is anything other than plain US-ASCII
|
|||
|
text, since the external body does not have a header section to
|
|||
|
declare its type. Similarly, any Content-transfer-encoding other
|
|||
|
than "7bit" must also be declared here. Thus a complete
|
|||
|
"message/external-body" message, referring to an object in PostScript
|
|||
|
format, might look like this:
|
|||
|
|
|||
|
From: Whomever
|
|||
|
To: Someone
|
|||
|
Date: Whenever
|
|||
|
Subject: whatever
|
|||
|
MIME-Version: 1.0
|
|||
|
Message-ID: <id1@host.com>
|
|||
|
Content-Type: multipart/alternative; boundary=42
|
|||
|
Content-ID: <id001@guppylake.bellcore.com>
|
|||
|
|
|||
|
--42
|
|||
|
Content-Type: message/external-body; name="BodyFormats.ps";
|
|||
|
site="thumper.bellcore.com"; mode="image";
|
|||
|
access-type=ANON-FTP; directory="pub";
|
|||
|
expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
|
|||
|
|
|||
|
Content-type: application/postscript
|
|||
|
Content-ID: <id42@guppylake.bellcore.com>
|
|||
|
|
|||
|
--42
|
|||
|
Content-Type: message/external-body; access-type=local-file;
|
|||
|
name="/u/nsb/writing/rfcs/RFC-MIME.ps";
|
|||
|
site="thumper.bellcore.com";
|
|||
|
expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
|
|||
|
|
|||
|
Content-type: application/postscript
|
|||
|
Content-ID: <id42@guppylake.bellcore.com>
|
|||
|
|
|||
|
--42
|
|||
|
Content-Type: message/external-body;
|
|||
|
access-type=mail-server
|
|||
|
server="listserv@bogus.bitnet";
|
|||
|
expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
|
|||
|
|
|||
|
Content-type: application/postscript
|
|||
|
Content-ID: <id42@guppylake.bellcore.com>
|
|||
|
|
|||
|
get RFC-MIME.DOC
|
|||
|
|
|||
|
--42--
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 39]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
Note that in the above examples, the default Content-transfer-
|
|||
|
encoding of "7bit" is assumed for the external postscript data.
|
|||
|
|
|||
|
Like the "message/partial" type, the "message/external-body" media
|
|||
|
type is intended to be transparent, that is, to convey the data type
|
|||
|
in the external body rather than to convey a message with a body of
|
|||
|
that type. Thus the headers on the outer and inner parts must be
|
|||
|
merged using the same rules as for "message/partial". In particular,
|
|||
|
this means that the Content-type and Subject fields are overridden,
|
|||
|
but the From field is preserved.
|
|||
|
|
|||
|
Note that since the external bodies are not transported along with
|
|||
|
the external body reference, they need not conform to transport
|
|||
|
limitations that apply to the reference itself. In particular,
|
|||
|
Internet mail transports may impose 7bit and line length limits, but
|
|||
|
these do not automatically apply to binary external body references.
|
|||
|
Thus a Content-Transfer-Encoding is not generally necessary, though
|
|||
|
it is permitted.
|
|||
|
|
|||
|
Note that the body of a message of type "message/external-body" is
|
|||
|
governed by the basic syntax for an RFC 822 message. In particular,
|
|||
|
anything before the first consecutive pair of CRLFs is header
|
|||
|
information, while anything after it is body information, which is
|
|||
|
ignored for most access-types.
|
|||
|
|
|||
|
5.2.4. Other Message Subtypes
|
|||
|
|
|||
|
MIME implementations must in general treat unrecognized subtypes of
|
|||
|
"message" as being equivalent to "application/octet-stream".
|
|||
|
|
|||
|
Future subtypes of "message" intended for use with email should be
|
|||
|
restricted to "7bit" encoding. A type other than "message" should be
|
|||
|
used if restriction to "7bit" is not possible.
|
|||
|
|
|||
|
6. Experimental Media Type Values
|
|||
|
|
|||
|
A media type value beginning with the characters "X-" is a private
|
|||
|
value, to be used by consenting systems by mutual agreement. Any
|
|||
|
format without a rigorous and public definition must be named with an
|
|||
|
"X-" prefix, and publicly specified values shall never begin with
|
|||
|
"X-". (Older versions of the widely used Andrew system use the "X-
|
|||
|
BE2" name, so new systems should probably choose a different name.)
|
|||
|
|
|||
|
In general, the use of "X-" top-level types is strongly discouraged.
|
|||
|
Implementors should invent subtypes of the existing types whenever
|
|||
|
possible. In many cases, a subtype of "application" will be more
|
|||
|
appropriate than a new top-level type.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 40]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
7. Summary
|
|||
|
|
|||
|
The five discrete media types provide provide a standardized
|
|||
|
mechanism for tagging entities as "audio", "image", or several other
|
|||
|
kinds of data. The composite "multipart" and "message" media types
|
|||
|
allow mixing and hierarchical structuring of entities of different
|
|||
|
types in a single message. A distinguished parameter syntax allows
|
|||
|
further specification of data format details, particularly the
|
|||
|
specification of alternate character sets. Additional optional
|
|||
|
header fields provide mechanisms for certain extensions deemed
|
|||
|
desirable by many implementors. Finally, a number of useful media
|
|||
|
types are defined for general use by consenting user agents, notably
|
|||
|
"message/partial" and "message/external-body".
|
|||
|
|
|||
|
9. Security Considerations
|
|||
|
|
|||
|
Security issues are discussed in the context of the
|
|||
|
"application/postscript" type, the "message/external-body" type, and
|
|||
|
in RFC 2048. Implementors should pay special attention to the
|
|||
|
security implications of any media types that can cause the remote
|
|||
|
execution of any actions in the recipient's environment. In such
|
|||
|
cases, the discussion of the "application/postscript" type may serve
|
|||
|
as a model for considering other media types with remote execution
|
|||
|
capabilities.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 41]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
9. Authors' Addresses
|
|||
|
|
|||
|
For more information, the authors of this document are best contacted
|
|||
|
via Internet mail:
|
|||
|
|
|||
|
Ned Freed
|
|||
|
Innosoft International, Inc.
|
|||
|
1050 East Garvey Avenue South
|
|||
|
West Covina, CA 91790
|
|||
|
USA
|
|||
|
|
|||
|
Phone: +1 818 919 3600
|
|||
|
Fax: +1 818 919 3614
|
|||
|
EMail: ned@innosoft.com
|
|||
|
|
|||
|
|
|||
|
Nathaniel S. Borenstein
|
|||
|
First Virtual Holdings
|
|||
|
25 Washington Avenue
|
|||
|
Morristown, NJ 07960
|
|||
|
USA
|
|||
|
|
|||
|
Phone: +1 201 540 8967
|
|||
|
Fax: +1 201 993 3032
|
|||
|
EMail: nsb@nsb.fv.com
|
|||
|
|
|||
|
|
|||
|
MIME is a result of the work of the Internet Engineering Task Force
|
|||
|
Working Group on RFC 822 Extensions. The chairman of that group,
|
|||
|
Greg Vaudreuil, may be reached at:
|
|||
|
|
|||
|
Gregory M. Vaudreuil
|
|||
|
Octel Network Services
|
|||
|
17080 Dallas Parkway
|
|||
|
Dallas, TX 75248-1905
|
|||
|
USA
|
|||
|
|
|||
|
EMail: Greg.Vaudreuil@Octel.Com
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 42]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
Appendix A -- Collected Grammar
|
|||
|
|
|||
|
This appendix contains the complete BNF grammar for all the syntax
|
|||
|
specified by this document.
|
|||
|
|
|||
|
By itself, however, this grammar is incomplete. It refers by name to
|
|||
|
several syntax rules that are defined by RFC 822. Rather than
|
|||
|
reproduce those definitions here, and risk unintentional differences
|
|||
|
between the two, this document simply refers the reader to RFC 822
|
|||
|
for the remaining definitions. Wherever a term is undefined, it
|
|||
|
refers to the RFC 822 definition.
|
|||
|
|
|||
|
boundary := 0*69<bchars> bcharsnospace
|
|||
|
|
|||
|
bchars := bcharsnospace / " "
|
|||
|
|
|||
|
bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" /
|
|||
|
"+" / "_" / "," / "-" / "." /
|
|||
|
"/" / ":" / "=" / "?"
|
|||
|
|
|||
|
body-part := <"message" as defined in RFC 822, with all
|
|||
|
header fields optional, not starting with the
|
|||
|
specified dash-boundary, and with the
|
|||
|
delimiter not occurring anywhere in the
|
|||
|
body part. Note that the semantics of a
|
|||
|
part differ from the semantics of a message,
|
|||
|
as described in the text.>
|
|||
|
|
|||
|
close-delimiter := delimiter "--"
|
|||
|
|
|||
|
dash-boundary := "--" boundary
|
|||
|
; boundary taken from the value of
|
|||
|
; boundary parameter of the
|
|||
|
; Content-Type field.
|
|||
|
|
|||
|
delimiter := CRLF dash-boundary
|
|||
|
|
|||
|
discard-text := *(*text CRLF)
|
|||
|
; May be ignored or discarded.
|
|||
|
|
|||
|
encapsulation := delimiter transport-padding
|
|||
|
CRLF body-part
|
|||
|
|
|||
|
epilogue := discard-text
|
|||
|
|
|||
|
multipart-body := [preamble CRLF]
|
|||
|
dash-boundary transport-padding CRLF
|
|||
|
body-part *encapsulation
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 43]
|
|||
|
|
|||
|
RFC 2046 Media Types November 1996
|
|||
|
|
|||
|
|
|||
|
close-delimiter transport-padding
|
|||
|
[CRLF epilogue]
|
|||
|
|
|||
|
preamble := discard-text
|
|||
|
|
|||
|
transport-padding := *LWSP-char
|
|||
|
; Composers MUST NOT generate
|
|||
|
; non-zero length transport
|
|||
|
; padding, but receivers MUST
|
|||
|
; be able to handle padding
|
|||
|
; added by message transports.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Freed & Borenstein Standards Track [Page 44]
|
|||
|
|