8.1 Pixel and Overlay Data, and Related Data Elements
Pixel Data (7FE0,0010), Float Pixel Data (7FE0,0008), Double Float Pixel Data (7FE0,0009) and Overlay Data (60xx,3000) shall be used for the exchange of encoded graphical image data. These Data Elements along with additional Data Elements, specified as Attributes of the Image Information Entities defined in PS3.3, shall be used to describe the way in which the Pixel Data and Overlay Data are encoded and shall be interpreted. Finally, depending on the negotiated Transfer Syntax (see Section 10 and Annex A), Pixel Data may be compressed.
Pixel Data (7FE0,0010) and Overlay Data (60xx,3000) have a VR of OW or OB, depending on the negotiated Transfer Syntax (see Annex A). The only difference between OW and OB is that OB, an octet-stream, shall be unaffected by Byte Ordering (see Section 7.3).
Float Pixel Data (7FE0,0008) has a Value Representation of OF.
Double Float Pixel Data (7FE0,0009) has a Value Representation of OD.
For Pixel Data Values encoded in OF and OD, any value that is permitted by [IEEE 754] may be used, including NaN, Positive Infinity and Negative Infinity. See Table 6.2-1.
Note
Floating point binary32 and binary64 pixel data values are not arbitrarily constrained to finite numbers, since it may be important for the application to signal that the result of a calculation that produced a pixel is an infinite value or not a number.
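As an informal illustration of the statement above, the following Python sketch packs a few values, including NaN and both infinities, as binary32 and reads them back. The function names are hypothetical, and little endian byte ordering is an assumption made for the example; neither is defined by this Part.

```python
import math
import struct

def encode_float_pixels(values):
    """Pack values as little endian IEEE 754 binary32, as they might appear in an OF Value Field."""
    return struct.pack("<%df" % len(values), *values)

def decode_float_pixels(data):
    """Unpack a little endian binary32 byte string back into Python floats."""
    return list(struct.unpack("<%df" % (len(data) // 4), data))

# Non-finite values survive the round trip unchanged.
pixels = [1.5, float("nan"), float("inf"), float("-inf")]
encoded = encode_float_pixels(pixels)
decoded = decode_float_pixels(encoded)
```

A receiver that computes statistics over such data would need to test for these values (for example with `math.isnan` and `math.isinf`) rather than assume every pixel is finite.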
8.1.1 Pixel Data Encoding of Related Data Elements
Encoded Pixel Data of various bit depths shall be accommodated. The following three Data Elements shall define the Pixel structure:
-
Bits Allocated (0028,0100)
-
Bits Stored (0028,0101)
-
High Bit (0028,0102)
Each Pixel Cell shall contain a single Pixel Sample Value. The size of the Pixel Cell shall be specified by Bits Allocated (0028,0100). Bits Stored (0028,0101) defines the total number of these allocated bits that will be used to represent a Pixel Sample Value. Bits Stored (0028,0101) shall never be larger than Bits Allocated (0028,0100). High Bit (0028,0102) specifies where the high order bit of the Bits Stored (0028,0101) is to be placed with respect to the Bits Allocated (0028,0100) specification. Bits Allocated (0028,0100) shall either be 1, or a multiple of 8. High Bit (0028,0102) shall be one less than Bits Stored (0028,0101).
Note
-
For example, in Pixel Data with 16 bits (2 bytes) allocated, 12 bits stored, and bit 11 specified as the high bit, one pixel sample is encoded in each 16-bit word, with the 4 most significant bits of each word not containing Pixel Data. See Annex D for other examples of the basic encoding schemes.
-
Formerly, bits not used for Pixel Sample Values were described as being usable for overlay planes, but this usage has been retired. See PS3.5-2004.
-
Formerly, High Bit (0028,0102) was not restricted to be one less than Bits Stored (0028,0101) in this Part, or in the general case, though almost all Information Object Definitions in PS3.3 imposed such a restriction. See PS3.5-2014c.
-
Receiving applications may not assume anything about the contents of unused bits, and in particular may not assume that they are zero, or that they contain sign extension bits.
Additional restrictions that are placed on acceptable Values for Bits Allocated (0028,0100), Bits Stored (0028,0101), and High Bit (0028,0102) for Pixel Data (7FE0,0010) are specified in the Information Object Definitions in PS3.3.
Restrictions are placed on acceptable Values for Bits Allocated (0028,0100) for Float Pixel Data (7FE0,0008) and Double Float Pixel Data (7FE0,0009), such that only a single Pixel Cell entirely occupies the allocated bits specified by Bits Allocated (0028,0100), hence Bits Stored (0028,0101) and High Bit (0028,0102) are not sent.
Also, the Value Field containing Pixel Data, like all other Value Fields in DICOM, shall be an even number of bytes in length. This means that the Value Field may need to be padded with data that is not part of the image and shall not be considered significant. If needed, the padding bits shall be appended to the end of the Value Field, and shall be used only to extend the data to the next even byte increment of length.
Note
The 32-bit Value Length Field limits the maximum size of large data Value Fields such as Pixel Data sent in a Native Format (encoded in Transfer Syntaxes that use only the unencapsulated form).
In a multi-frame object that is transmitted in Native Format, the individual frames are not padded. The individual frames shall be concatenated and padding bits (if necessary) applied to the complete Value Field. At least one frame shall be present.
Note
-
Receiving applications should be aware that some older applications may send Pixel Data with excess padding, which was not explicitly prohibited in earlier versions of the Standard. Applications should be prepared to accept such Pixel Data Data Elements, but may delete the excess padding. In no case should a sending application place private data in the padding data.
-
In a multi-frame object with a Bits Allocated (0028,0100) of 1 that is transmitted in Native Format, the individual frames are not padded; therefore successive bits are packed into bytes or words as described in Section 8.2, i.e., a frame other than the first frame may start in the middle of a byte or word. This is consistent with the historical encoding of Multi-frame Overlays described in Section 8.1.2.
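The consequence of not padding individual 1-bit frames can be illustrated with a short Python sketch. It is not part of the Standard: the function name and the sample Value Field are hypothetical, and the Value Field is read byte by byte, least significant bit first.

```python
def extract_one_bit_frame(value_field, pixels_per_frame, frame_index):
    """Return the 0/1 pixel values of one frame from packed 1-bit Pixel Data.

    Frames are assumed to be concatenated without padding, so the first
    pixel of a frame sits at bit (frame_index * pixels_per_frame) of the
    whole Value Field, which need not be on a byte boundary.
    """
    start = frame_index * pixels_per_frame
    return [
        (value_field[bit // 8] >> (bit % 8)) & 1
        for bit in range(start, start + pixels_per_frame)
    ]

# Two hypothetical 3 x 3 frames (9 pixels each) packed into 18 bits,
# followed by padding to an even number of bytes.
value_field = bytes([0xFF, 0x03, 0x00, 0x00])
first = extract_one_bit_frame(value_field, 9, 0)
second = extract_one_bit_frame(value_field, 9, 1)
```

Here the second frame begins at bit 1 of the second byte, which is why a receiver cannot locate frames by rounding each frame up to a whole number of bytes.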
The field of bits representing the value of a Pixel Sample shall be a binary 2's complement integer or an unsigned integer, as specified by the Data Element Pixel Representation (0028,0103). The sign bit shall be the High Bit in a Pixel Sample Value that is a 2's complement integer. The minimum actual Pixel Sample Value encountered in the Pixel Data is specified by Smallest Image Pixel Value (0028,0106) while the maximum Value is specified by Largest Image Pixel Value (0028,0107).
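A minimal Python sketch of how a receiver might recover a Pixel Sample Value from a Pixel Cell follows. The function name is hypothetical; the sketch assumes High Bit (0028,0102) is one less than Bits Stored (0028,0101), and it discards the unused bits without assuming anything about their contents.

```python
def pixel_sample_value(cell, bits_stored, pixel_representation):
    """Return the Pixel Sample Value held in one Pixel Cell.

    cell                 -- the Pixel Cell as an unsigned integer
    bits_stored          -- Value of Bits Stored (0028,0101)
    pixel_representation -- 0 for unsigned, 1 for 2's complement
    """
    mask = (1 << bits_stored) - 1
    value = cell & mask                 # drop unused high bits, whatever they hold
    if pixel_representation == 1:
        sign_bit = 1 << (bits_stored - 1)   # the High Bit is the sign bit
        if value & sign_bit:
            value -= 1 << bits_stored
    return value
```

For the 16 bits allocated, 12 bits stored example given earlier, a cell of 0xF7FF yields 2047 regardless of the four unused bits, and a cell of 0x0FFF yields 4095 when unsigned or -1 when interpreted as 2's complement.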
8.1.2 Overlay Data Encoding of Related Data Elements
Encoded Overlay Planes always have a bit depth of 1, and are encoded separately from the Pixel Data in Overlay Data (60xx,3000). The following two Data Elements shall define the Overlay Plane structure:
-
Overlay Bits Allocated (60xx,0100)
-
Overlay Bit Position (60xx,0102)
Note
-
There is no Data Element analogous to Bits Stored (0028,0101) since Overlay Planes always have a bit depth of 1.
-
Restrictions on the allowed Values for these Data Elements are defined in PS3.3. Formerly overlay data stored in unused bits of Pixel Data (7FE0,0010) was described, and these Attributes had meaningful Values but this usage has been retired. See PS3.5-2004. For overlays encoded in Overlay Data (60xx,3000), Overlay Bits Allocated (60xx,0100) is always 1 and Overlay Bit Position (60xx,0102) is always 0.
For Overlay Data (60xx,3000), the Value Representation OW is most often required. The Value Representation OB may also be used for Overlay Data in cases where the Value Representation is explicitly conveyed (see Annex A).
Note
The DICOM Default Little Endian Transfer Syntax (Implicit VR Little Endian) does not explicitly convey Value Representation and therefore the VR of OB may not be used for Overlay Data when using the Default Transfer Syntax.
Overlay Data is encoded as the direct concatenation of the bits of a single Overlay Plane, where the first bit of an Overlay Plane is encoded in the least significant bit, immediately followed by the next bit of the Overlay Plane in the next most significant bit.
For a Multi-frame Overlay, the individual frames are not padded.
The individual frames shall be concatenated and padding bits (if necessary) applied to the complete Value Field.
When the Overlay Data crosses a word boundary in the OW case, or a byte boundary in the OB case, it shall continue to be encoded, least significant bit to most significant bit, in the next word, or byte, respectively (see Annex D). For Overlay Data encoded with the Value Representation OW, the byte ordering of the resulting 2-byte words is defined by the Little Endian Transfer Syntaxes negotiated at the Association Establishment (see Annex A).
Note
For Overlay Data encoded with the Value Representation OB, the Overlay Data encoding is unaffected by byte ordering.
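The bit ordering described above can be sketched in Python as follows. The two function names are hypothetical and the code is only an illustration: one function packs overlay bits into bytes (the OB case, padded to an even length), the other packs them into 16-bit words that are then written in little endian byte order (the OW case).

```python
import struct

def pack_bits_ob(bits):
    """Pack 0/1 values into bytes, least significant bit first, padded to an even length."""
    out = bytearray((len(bits) + 7) // 8)
    for index, bit in enumerate(bits):
        if bit:
            out[index // 8] |= 1 << (index % 8)
    if len(out) % 2:
        out.append(0)               # padding is not part of the overlay
    return bytes(out)

def pack_bits_ow_little_endian(bits):
    """Pack 0/1 values into 16-bit words, least significant bit first, written little endian."""
    words = [0] * ((len(bits) + 15) // 16)
    for index, bit in enumerate(bits):
        if bit:
            words[index // 16] |= 1 << (index % 16)
    return struct.pack("<%dH" % len(words), *words)

# A hypothetical 20-bit overlay plane.
plane = [1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1]
as_ob = pack_bits_ob(plane)
as_ow = pack_bits_ow_little_endian(plane)
```

With little endian words the two functions produce the same byte stream, since bit k of the plane lands in bit (k mod 8) of byte (k div 8) in both cases.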
8.2 Native or Encapsulated Format Encoding
Pixel data conveyed in the Pixel Data (7FE0,0010) may be sent either in a Native (uncompressed) Format or in an Encapsulated Format (e.g., compressed).
If Pixel Data (7FE0,0010) is sent in a Native Format, then the Photometric Interpretation (0028,0004) shall be other than:
-
YBR_RCT
-
YBR_ICT
-
YBR_PARTIAL_420
Note
These Values are not permitted because they are not encodable in an uncompressed form.
Pixel Data conveyed in the Float Pixel Data (7FE0,0008) or Double Float Pixel Data (7FE0,0009) shall be in a Native (uncompressed) Format if encoded in a Standard Transfer Syntax.
Note
-
In future, if Standard Transfer Syntaxes are defined for compression of Float Pixel Data (7FE0,0008) or Double Float Pixel Data (7FE0,0009), this constraint may be relaxed and Encapsulated Format permitted.
-
This constraint does not apply to Private Transfer Syntaxes.
If Pixel Data (7FE0,0010) is sent in a Native Format, the Value Representation OW is most often required. The Value Representation OB may also be used for Pixel Data (7FE0,0010) in cases where Bits Allocated has a Value less than or equal to 8, but only with Transfer Syntaxes where the Value Representation is explicitly conveyed (see Annex A).
Note
-
The DICOM Default Little Endian Transfer Syntax (Implicit VR Little Endian) does not explicitly convey Value Representation and therefore the VR of OB may not be used for Pixel Data (7FE0,0010) when using the Default Transfer Syntax.
-
The 32-bit Value Length Field limits the maximum size of large data Value Fields such as Pixel Data sent in a Native Format.
Float Pixel Data (7FE0,0008) is sent in Native Format; the Value Representation shall be OF, Bits Allocated (0028,0100) shall be 32, and Bits Stored (0028,0101), High Bit (0028,0102) and Pixel Representation (0028,0103) shall not be present.
Double Float Pixel Data (7FE0,0009) is sent in Native Format; the Value Representation shall be OD, Bits Allocated (0028,0100) shall be 64, and Bits Stored (0028,0101), High Bit (0028,0102) and Pixel Representation (0028,0103) shall not be present.
It is not permitted to have more than one of Pixel Data Provider URL (0028,7FE0), Pixel Data (7FE0,0010), Float Pixel Data (7FE0,0008) or Double Float Pixel Data (7FE0,0009) in the top level Data Set.
Note
Pixel Data encoded in Float Pixel Data (7FE0,0008) or Double Float Pixel Data (7FE0,0009) can be considered as consisting of Pixel Cells that entirely occupy the allocated bits, and therefore do not cross word boundaries.
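The constraints above on Float Pixel Data, Double Float Pixel Data and the mutual exclusion of the pixel data carrying Data Elements could be checked along the following lines. This is a sketch only: the function name is hypothetical, and the Data Set is represented as a plain Python dict keyed by (group, element) tuples, which is an assumption of the example and not a DICOM encoding.

```python
PIXEL_DATA_PROVIDER_URL = (0x0028, 0x7FE0)
PIXEL_DATA = (0x7FE0, 0x0010)
FLOAT_PIXEL_DATA = (0x7FE0, 0x0008)
DOUBLE_FLOAT_PIXEL_DATA = (0x7FE0, 0x0009)
BITS_ALLOCATED = (0x0028, 0x0100)
BITS_STORED = (0x0028, 0x0101)
HIGH_BIT = (0x0028, 0x0102)
PIXEL_REPRESENTATION = (0x0028, 0x0103)

def check_pixel_elements(dataset):
    """Return a list of problems found in a top level Data Set (dict of tag -> Value)."""
    problems = []
    carriers = [PIXEL_DATA_PROVIDER_URL, PIXEL_DATA,
                FLOAT_PIXEL_DATA, DOUBLE_FLOAT_PIXEL_DATA]
    if sum(tag in dataset for tag in carriers) > 1:
        problems.append("more than one pixel data carrier present")
    for tag, required_bits in ((FLOAT_PIXEL_DATA, 32), (DOUBLE_FLOAT_PIXEL_DATA, 64)):
        if tag not in dataset:
            continue
        if dataset.get(BITS_ALLOCATED) != required_bits:
            problems.append("Bits Allocated shall be %d" % required_bits)
        for absent in (BITS_STORED, HIGH_BIT, PIXEL_REPRESENTATION):
            if absent in dataset:
                problems.append("(%04X,%04X) shall not be present" % absent)
    return problems
```

An empty list means none of the rules checked here were violated; it does not establish conformance with the remaining requirements of this Section or of PS3.3.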
Native format Pixel Cells are encoded as the direct concatenation of the bits of each Pixel Cell. The least significant bit of each Pixel Cell is encoded in the least significant bit of the encoded word or byte, immediately followed by the next most significant bit of each Pixel Cell in the next most significant bit of the encoded word or byte, successively until all bits of the Pixel Cell have been encoded, then immediately followed by the least significant bit of the next Pixel Cell in the next most significant bit of the encoded word or byte. The number of bits of each Pixel Cell is defined by the Bits Allocated (0028,0100) Data Element Value. When a Pixel Cell crosses a word boundary in the OW case, or a byte boundary in the OB case, it shall continue to be encoded, least significant bit to most significant bit, in the next word, or byte, respectively (see Annex D). For Pixel Data (7FE0,0010) encoded with the Value Representation OW, the byte ordering of the resulting 2-byte words is defined by the Little Endian Transfer Syntaxes negotiated at the Association Establishment (see Annex A).
Note
-
For Pixel Data (7FE0,0010) encoded with the Value Representation OB, the Pixel Data (7FE0,0010) encoding is unaffected by byte ordering.
-
If encoding Pixel Data (7FE0,0010) with a Value for Bits Allocated (0028,0100) not equal to 16, be sure to read and understand Annex D.
If sent in an Encapsulated Format (i.e., other than the Native Format) the Value Representation OB is used. The Pixel Cells are encoded according to the encoding process defined by one of the negotiated Transfer Syntaxes (see Annex A).
A Fragmentable Encapsulated Transfer Syntax allows the encapsulated pixel stream of encoded pixel data to be split into one or more Fragments.
A Non-Fragmentable Encapsulated Transfer Syntax requires the entire encapsulated pixel stream of encoded pixel data to be encoded in a single Fragment.
Each Fragment conveys its own explicit even length (see Section A.4).
The Sequence of Fragments of the encapsulated pixel stream is terminated by a Sequence Delimiter Item, thus allowing the support of encoding processes where the resulting length of the entire pixel stream is not known until it is entirely encoded. Encapsulated Formats support both Single-Frame and Multi-Frame images (as defined in PS3.3). At least one Frame shall be present, and hence at least one Fragment will be present.
Note
-
Depending on the Fragmentable Encapsulated Transfer Syntax, a frame may be entirely contained within a single fragment, or may span multiple fragments to support buffering during compression or to avoid exceeding the maximum size of a fixed length fragment. A recipient can detect fragmentation of frames by comparing the number of fragments (the number of Items minus one for the Basic Offset Table) with the number of frames. Some performance optimizations may be available to a recipient in the absence of fragmentation of frames, but an implementation that fails to support such fragmentation does not conform to the Standard.
-
The total size of the encapsulated pixel stream, not including any trailing padding in the last Fragment, if known, may be encoded in Encapsulated Pixel Data Value Total Length (7FE0,0003); see PS3.3 Section C.7.6.6 “Multi-frame Module” and PS3.3 Section C.7.6.16 “Multi-frame Functional Groups Module”.
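The comparison of fragment count and frame count mentioned in the Note above might be sketched in Python as follows. The function names are hypothetical. The sketch assumes the Item layout of Section A.4: each Item starts with the tag (FFFE,E000) and a 32-bit length, the first Item is the Basic Offset Table, and the sequence ends with a Sequence Delimiter Item (FFFE,E0DD), all encoded little endian.

```python
import struct

ITEM = (0xFFFE, 0xE000)
SEQUENCE_DELIMITER = (0xFFFE, 0xE0DD)

def split_items(value_field):
    """Return the Item payloads of an encapsulated Value Field, Basic Offset Table first."""
    items = []
    offset = 0
    while offset < len(value_field):
        group, element, length = struct.unpack_from("<HHI", value_field, offset)
        offset += 8
        if (group, element) == SEQUENCE_DELIMITER:
            return items
        if (group, element) != ITEM:
            raise ValueError("unexpected tag (%04X,%04X)" % (group, element))
        items.append(value_field[offset:offset + length])
        offset += length
    raise ValueError("missing Sequence Delimiter Item")

def frames_are_fragmented(value_field, number_of_frames):
    """Compare the number of Fragments (Items minus the Basic Offset Table) with the number of frames."""
    fragments = split_items(value_field)[1:]
    return len(fragments) != number_of_frames
```

When `frames_are_fragmented` returns False, each Fragment can be handed to the decoder as one frame; otherwise the recipient has to locate frame boundaries by other means, such as the Basic Offset Table when it is not empty.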
8.2.1 JPEG Image Compression
DICOM provides a mechanism for supporting the use of JPEG Image Compression through the Encapsulated Format. Annex A defines a number of Transfer Syntaxes that reference the JPEG Standard and provide a number of lossless (bit preserving) and lossy compression schemes.
Note
The context where the usage of lossy compression of medical images is clinically acceptable is beyond the scope of the DICOM Standard. The policies associated with the selection of appropriate compression parameters (e.g., compression ratio) for JPEG lossy compression are also beyond the scope of this Standard.
In order to facilitate interoperability of implementations conforming to the DICOM Standard that elect to use one or more of the Transfer Syntaxes for JPEG Image Compression, the following policy is specified:
-
Any implementation that conforms to the DICOM Standard and has elected to support any one of the Transfer Syntaxes for Lossless JPEG Image Compression shall support the following lossless compression: the subset (first-order horizontal prediction [Selection Value 1]) of JPEG Process 14 (DPCM, non-hierarchical with Huffman coding) (see Annex F).
-
Any implementation that conforms to the DICOM Standard and has elected to support any one of the Transfer Syntaxes for 8-bit Lossy JPEG Image Compression, shall support the JPEG Baseline Compression (coding Process 1).
-
Any implementation that conforms to the DICOM Standard and has elected to support any one of the Transfer Syntaxes for 12-bit Lossy JPEG Image Compression, shall support the JPEG Compression Process 4.
Note
The DICOM conformance statement shall differentiate whether or not the implementation is capable of simply receiving or receiving and processing JPEG encoded images (see PS3.2).
The use of the DICOM Encapsulated Format to support JPEG Compressed Pixel Data requires that the Data Elements that are related to the Pixel Data encoding (e.g., Photometric Interpretation, Samples per Pixel, Planar Configuration, Bits Allocated, Bits Stored, High Bit, Pixel Representation, Rows, Columns, etc.) shall contain Values that are consistent with the characteristics of the compressed data stream.
The requirements when using a Standard Photometric Interpretation (i.e., a Defined Term from Section C.7.6.3.1.2 in PS3.3) are specified in Table 8.2.1-1 and Table 8.2.1-2. No other Standard Photometric Interpretation Values shall be used.
Table 8.2.1-1. Valid Values of Pixel Data Related Attributes for JPEG Lossy Transfer Syntaxes using Standard Photometric Interpretations
Photometric Interpretation | Transfer Syntax | Transfer Syntax UID | Samples per Pixel | Planar Configuration | Pixel Representation | Bits Allocated | Bits Stored | High Bit
MONOCHROME1, MONOCHROME2 | JPEG Baseline | 1.2.840.10008.1.2.4.50 | 1 | absent | 0 | 8 | 8 | 7
MONOCHROME1, MONOCHROME2 | JPEG Extended | 1.2.840.10008.1.2.4.51 | 1 | absent | 0 | 8 | 8 | 7
MONOCHROME1, MONOCHROME2 | JPEG Extended | 1.2.840.10008.1.2.4.51 | 1 | absent | 0 | 16 | 12 | 11
YBR_FULL_422, RGB | JPEG Baseline | 1.2.840.10008.1.2.4.50 | 3 | 0 | 0 | 8 | 8 | 7
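The rows of Table 8.2.1-1 lend themselves to a simple lookup. The Python sketch below is one hypothetical way to test whether a combination of Attribute Values matches a row; the function name and the use of `None` to stand for an absent Planar Configuration are assumptions of the example.

```python
JPEG_BASELINE = "1.2.840.10008.1.2.4.50"
JPEG_EXTENDED = "1.2.840.10008.1.2.4.51"
MONOCHROME = ("MONOCHROME1", "MONOCHROME2")

# Each row: (Photometric Interpretations, Transfer Syntax UID, Samples per Pixel,
#            Planar Configuration (None = absent), Pixel Representation,
#            Bits Allocated, Bits Stored, High Bit)
TABLE_8_2_1_1 = [
    (MONOCHROME, JPEG_BASELINE, 1, None, 0, 8, 8, 7),
    (MONOCHROME, JPEG_EXTENDED, 1, None, 0, 8, 8, 7),
    (MONOCHROME, JPEG_EXTENDED, 1, None, 0, 16, 12, 11),
    (("YBR_FULL_422", "RGB"), JPEG_BASELINE, 3, 0, 0, 8, 8, 7),
]

def is_valid_jpeg_lossy(photometric, uid, samples, planar,
                        representation, allocated, stored, high_bit):
    """Return True if the Values match one row of Table 8.2.1-1."""
    candidate = (uid, samples, planar, representation, allocated, stored, high_bit)
    return any(photometric in row[0] and candidate == row[1:]
               for row in TABLE_8_2_1_1)
```

The check applies only when a Standard Photometric Interpretation is used with one of the JPEG lossy Transfer Syntaxes listed in the table.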
Table 8.2.1-2. Valid Values of Pixel Data Related Attributes for JPEG Lossless Transfer Syntaxes using Standard Photometric Interpretations
Photometric Interpretation | Transfer Syntax | Transfer Syntax UID | Samples per Pixel | Planar Configuration | Pixel Representation | Bits Allocated | Bits Stored | High Bit
MONOCHROME1, MONOCHROME2 | JPEG Lossless, Non-Hierarchical | 1.2.840.10008.1.2.4.57 | 1 | absent | 0 or 1 | 8 or 16 | 1-16 | 0-15
MONOCHROME1, MONOCHROME2 | JPEG Lossless, Non-Hierarchical, SV1 | 1.2.840.10008.1.2.4.70 | 1 | absent | 0 or 1 | 8 or 16 | 1-16 | 0-15
PALETTE COLOR | JPEG Lossless, Non-Hierarchical | 1.2.840.10008.1.2.4.57 | 1 | absent | 0 | 8 or 16 | 1-16 | 0-15
PALETTE COLOR | JPEG Lossless, Non-Hierarchical, SV1 | 1.2.840.10008.1.2.4.70 | 1 | absent | 0 | 8 or 16 | 1-16 | 0-15
YBR_FULL, RGB | JPEG Lossless, Non-Hierarchical | 1.2.840.10008.1.2.4.57 | 3 | 0 | 0 | 8 or 16 | 1-16 | 0-15
YBR_FULL, RGB | JPEG Lossless, Non-Hierarchical, SV1 | 1.2.840.10008.1.2.4.70 | 3 | 0 | 0 | 8 or 16 | 1-16 | 0-15
The Pixel Data characteristics included in the JPEG Interchange Format shall be used to decode the compressed data stream.
If APP2 marker segments with an identifier of "ICC_PROFILE" (as defined in Annex B of [ISO 15076-1]) are present in the compressed data stream, their concatenated value shall be identical to the Value of ICC Profile (0028,2000) Attribute, if present, excluding padding.
Note
-
These requirements were formerly specified in terms of the "uncompressed pixel data from which the compressed data stream was derived". However, since the form of the "original" uncompressed data stream could vary between different implementations, this requirement is now specified in terms of consistency with what is encapsulated.
When decompressing, should the characteristics explicitly specified in the compressed data stream (e.g., spatial subsampling or number of components or planar configuration) be inconsistent with those specified in the DICOM Data Elements, those explicitly specified in the compressed data stream should be used to control the decompression. The DICOM Data Elements, if inconsistent, can be regarded as suggestions as to the form in which an uncompressed Data Set might be encoded, subject to the general and IOD-specific rules for uncompressed Photometric Interpretation and Planar Configuration, which may require that decompressed data be converted to one of the permitted forms.
-
Those characteristics not explicitly specified in the compressed data stream (e.g., the color space of the compressed components, which is not specified in the JPEG Interchange Format), or implied by the definition of the compression scheme (e.g., always unsigned in JPEG), can therefore be determined from the DICOM Data Element in the enclosing Data Set. For example a Photometric Interpretation of "YBR_FULL_422" would describe the color space that is commonly used to lossy compress images using JPEG. It is unusual to use an RGB color space for lossy compression, since no advantage is taken of correlation between the red, green and blue components (e.g., of luminance), and poor compression is achieved; however, for some applications this is permitted, e.g., Whole Slide Microscopy Images, to allow conversion to DICOM from proprietary formats without loss due to color space transformation.
-
The JPEG Interchange Format is distinct from the JPEG File Interchange Format (JFIF). The JPEG Interchange Format is defined in [ISO/IEC 10918-1] section 4.9.1, and refers to the inclusion of decoding tables, as distinct from the "abbreviated format" in which these tables are not sent (and the decoder is assumed to already have them). The JPEG Interchange Format does NOT specify the color space. The JPEG File Interchange Format, not part of the original JPEG standard, but defined in [ECMA TR-098] and [ISO/IEC 10918-5], is often used to store JPEG bit streams in consumer format files, and does include the ability to specify the color space of the components. The JFIF APP0 marker segment is NOT required to be present in DICOM encapsulated JPEG bit streams, and should not be relied upon to recognize the color space. Its presence is not forbidden (unlike the JP2 information for JPEG 2000 Transfer Syntaxes), but it is recommended that it be absent.
-
Should the compression process be incapable of encoding a particular form of pixel data representation (e.g., JPEG cannot encode signed integers, only unsigned integers), then ideally only the appropriate form should be "fed" into the compression process. However, for certain characteristics described in DICOM Data Elements but not explicitly described in the compressed data stream (such as Pixel Representation), then the DICOM Data Element should be considered to describe what has been compressed (e.g., the pixel data really is to be interpreted as signed if Pixel Representation so specifies).
-
DICOM Data Elements should not describe characteristics that are beyond the capability of the compression scheme used. For example, JPEG lossy processes are limited to 12 bits, hence the Value of Bits Stored should be 12 or less. Bits Allocated is irrelevant, and is likely to be constrained by the Information Object Definition in PS3.3 to Values of 8 or 16. Also, JPEG compressed data streams are always color-by-pixel and should be specified as such (a decoder can essentially ignore this Data Element however as the value for JPEG compressed data is already known).
-
If JPEG Compressed Pixel Data is decompressed and re-encoded in Native (uncompressed) form, then the Data Elements that are related to the Pixel Data encoding are updated accordingly. If color components are converted from YBR_FULL_422 to RGB during decompression and Native re-encoding, the Photometric Interpretation will be changed to RGB in the Data Set with the Native encoding.
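Since the characteristics carried in the compressed data stream control the decompression, a recipient may wish to read them before comparing them with the DICOM Data Elements. The Python sketch below reads the sample precision, dimensions and number of components from the frame header (SOFn marker segment) of a JPEG bit stream as laid out in [ISO/IEC 10918-1]. The function name is hypothetical, and the sketch is simplified: it assumes every marker before the frame header is followed by a two-byte length and that no fill bytes precede the markers.

```python
import struct

# SOFn markers are 0xC0-0xCF, excluding DHT (0xC4), JPG (0xC8) and DAC (0xCC).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}

def read_frame_header(stream):
    """Return precision, rows, columns and component count from the first SOFn segment."""
    if stream[:2] != b"\xFF\xD8":
        raise ValueError("missing SOI marker")
    offset = 2
    while offset + 4 <= len(stream):
        if stream[offset] != 0xFF:
            raise ValueError("expected a marker at offset %d" % offset)
        marker = stream[offset + 1]
        (length,) = struct.unpack_from(">H", stream, offset + 2)
        if marker in SOF_MARKERS:
            precision, rows, columns, components = struct.unpack_from(
                ">BHHB", stream, offset + 4)
            return {"precision": precision, "rows": rows,
                    "columns": columns, "components": components}
        offset += 2 + length        # the length counts its own two bytes
    raise ValueError("no frame header found")
```

The returned values can then be compared with Bits Stored, Rows, Columns and Samples per Pixel; where they disagree, the Notes above indicate that the values from the compressed data stream should control the decompression.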
8.2.2 Run Length Encoding Image Compression
DICOM provides a mechanism for supporting the use of Run Length Encoding (RLE) Image Compression, a byte-oriented lossless compression scheme, through the Encapsulated Format (see PS3.3 of this Standard). Annex G defines RLE Image Compression and its Transfer Syntax.
Note
The RLE Image Compression algorithm described in Annex G is the compression used in the TIFF 6.0 specification known as the "PackBits" scheme.
The use of the DICOM Encapsulated Format to support RLE Compressed Pixel Data requires that the Data Elements that are related to the Pixel Data encoding (e.g., Photometric Interpretation, Samples per Pixel, Planar Configuration, Bits Allocated, Bits Stored, High Bit, Pixel Representation, Rows, Columns, etc.) shall contain Values that are consistent with the compressed data.
The requirements when using a Standard Photometric Interpretation (i.e., a Defined Term from Section C.7.6.3.1.2 in PS3.3) are specified in Table 8.2.2-1. No other Standard Photometric Interpretation Values shall be used.
Table 8.2.2-1. Valid Values of Pixel Data Related Attributes for RLE Compression using Standard Photometric Interpretations
Photometric Interpretation | Samples per Pixel | Planar Configuration | Pixel Representation | Bits Allocated | Bits Stored | High Bit
MONOCHROME1, MONOCHROME2 | 1 | absent | 0 or 1 | 1, 8 or 16 | 1-16 | 0-15
PALETTE COLOR | 1 | absent | 0 | 8 or 16 | 1-16 | 0-15
YBR_FULL | 3 | 0 or 1 | 0 | 8 | 1-8 | 0-7
RGB | 3 | 0 or 1 | 0 | 8 or 16 | 1-16 | 0-15
Note
-
These requirements were formerly specified in terms of the "uncompressed pixel data from which the compressed data was derived". However, since the form of the "original" uncompressed data stream could vary between different implementations, this requirement is now specified in terms of consistency with what is encapsulated.
-
Those characteristics not implied by the definition of the compression scheme (e.g., always color-by-plane in RLE), can therefore be determined from the DICOM Data Element in the enclosing Data Set. For example a Photometric Interpretation of "YBR_FULL" would describe the color space that is commonly used to losslessly compress images using RLE. It is unusual to use an RGB color space for RLE compression, since no advantage is taken of correlation between the red, green and blue components (e.g., of luminance), and poor compression is achieved (note however that the conversion from RGB to YBR_FULL is itself lossy. A new photometric interpretation may be proposed in the future that allows lossless conversion from RGB and also results in better RLE compression ratios).
-
DICOM Data Elements should not describe characteristics that are beyond the capability of the compression scheme used. For example, RLE compressed data streams (using the algorithm mandated in the DICOM Standard) are always color-by-plane.
-
If RLE Compressed Pixel Data is decompressed and re-encoded in Native (uncompressed) form, then the Data Elements that are related to the Pixel Data encoding are updated accordingly. If color components are converted from YBR_FULL to RGB during decompression and Native re-encoding, the Photometric Interpretation will be changed to RGB in the Data Set with the Native encoding. It is permitted, however, to leave the YBR_FULL color components unconverted but decompressed in the Native format, in which case the Photometric Interpretation in the Data Set with the Native encoding would be YBR_FULL.
-
A Bits Allocated (0028,0100) of 1 for monochrome images supports compression of Segmentation IOD Pixel Data with a Segmentation Type (0062,0001) of BINARY.
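For orientation, the "PackBits" scheme referred to above can be decoded with a few lines of Python. The sketch below decodes a single run length encoded byte stream only; the function name is hypothetical, and the splitting of Pixel Data into RLE Segments and the RLE Header defined in Annex G are not covered.

```python
def decode_packbits(segment):
    """Decode one PackBits byte stream.

    Each control byte n, read as a signed value, means:
      0..127    copy the next n + 1 bytes literally
      -127..-1  repeat the next byte (-n + 1) times
      -128      no operation
    """
    out = bytearray()
    index = 0
    while index < len(segment):
        control = segment[index]
        index += 1
        if control < 128:                   # literal run
            out += segment[index:index + control + 1]
            index += control + 1
        elif control > 128:                 # replicate run; 257 - control == -n + 1
            out += segment[index:index + 1] * (257 - control)
            index += 1
        # control == 128 (-128 signed) is skipped
    return bytes(out)
```

A production decoder would normally stop once the expected number of bytes for the segment has been produced, so that any trailing padding is not interpreted as data.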
8.2.3 JPEG-LS Image Compression
DICOM provides a mechanism for supporting the use of JPEG-LS Image Compression through the Encapsulated Format. Annex A defines a number of Transfer Syntaxes that reference the JPEG-LS Standard and provide a number of lossless (bit preserving) and lossy (near-lossless) compression schemes.
Note
The context where the usage of lossy (near-lossless) compression of medical images is clinically acceptable is beyond the scope of the DICOM Standard. The policies associated with the selection of appropriate compression parameters (e.g., compression ratio) for JPEG-LS lossy (near-lossless) compression are also beyond the scope of this Standard.
The use of the DICOM Encapsulated Format to support JPEG-LS Compressed Pixel Data requires that the Data Elements that are related to the Pixel Data encoding (e.g., Photometric Interpretation, Samples per Pixel, Planar Configuration, Bits Allocated, Bits Stored, High Bit, Pixel Representation, Rows, Columns, etc.) shall contain Values that are consistent with the characteristics of the compressed data stream. The Pixel Data characteristics included in the JPEG-LS Interchange Format shall be used to decode the compressed data stream.
The requirements when using a Standard Photometric Interpretation (i.e., a Defined Term from Section C.7.6.3.1.2 in PS3.3) are specified in Table 8.2.3-1. No other Standard Photometric Interpretation Values shall be used.
Table 8.2.3-1. Valid Values of Pixel Data Related Attributes for JPEG-LS Compression using Standard Photometric Interpretations
Photometric Interpretation | Transfer Syntax | Transfer Syntax UID | Samples per Pixel | Planar Configuration | Pixel Representation | Bits Allocated | Bits Stored | High Bit
MONOCHROME1, MONOCHROME2 | JPEG-LS Lossless | 1.2.840.10008.1.2.4.80 | 1 | absent | 0 or 1 | 8 or 16 | 2-16 | 1-15
MONOCHROME1, MONOCHROME2 | JPEG-LS Lossy (Near-Lossless) | 1.2.840.10008.1.2.4.81 | 1 | absent | 0 or 1 | 8 or 16 | 2-16 | 1-15
PALETTE COLOR | JPEG-LS Lossless | 1.2.840.10008.1.2.4.80 | 1 | absent | 0 | 8 or 16 | 2-16 | 1-15
YBR_FULL | JPEG-LS Lossless | 1.2.840.10008.1.2.4.80 | 3 | 0 | 0 | 8 | 2-8 | 1-7
YBR_FULL | JPEG-LS Lossy (Near-Lossless) | 1.2.840.10008.1.2.4.81 | 3 | 0 | 0 | 8 | 2-8 | 1-7
RGB | JPEG-LS Lossless | 1.2.840.10008.1.2.4.80 | 3 | 0 | 0 | 8 or 16 | 2-16 | 1-15
RGB | JPEG-LS Lossy (Near-Lossless) | 1.2.840.10008.1.2.4.81 | 3 | 0 | 0 | 8 or 16 | 2-16 | 1-15
Note
-
See also the notes in Section 8.2.1.
-
No color transformation Photometric Interpretation specific for JPEG-LS is currently defined in DICOM. Annex F of ISO 14495-2 describes a "Sample transformation for inverse colour transform" and a marker segment to encode its parameters, but this is not known to have been implemented. Common practice is to compress the RGB components unconverted, which sacrifices compression performance, and send the Photometric Interpretation as RGB. Though the YBR_RCT Photometric Interpretation and component conversion could theoretically be used, in the absence of DC shifting it results in signed values to be encoded, which are not supported by JPEG-LS.
-
If JPEG-LS Compressed Pixel Data is decompressed and re-encoded in Native (uncompressed) form, then the Data Elements that are related to the Pixel Data encoding are updated accordingly. If color components are converted from any other Photometric Interpretation to RGB during decompression and Native re-encoding, the Photometric Interpretation will be changed to RGB in the Data Set with the Native encoding.
-
The lower limit of 2 on Bits Stored (0028,0101) reflects the minimum JPEG-LS sample precision of 2.
The Value of Planar Configuration (0028,0006) is irrelevant since the manner of encoding components is specified in the JPEG-LS bit stream as component, line or sample interleaved, hence it shall be set to 0.
8.2.4 JPEG 2000 Image Compression
DICOM provides a mechanism for supporting the use of JPEG 2000 Image Compression through the Encapsulated Format. Annex A defines a number of Transfer Syntaxes that reference the JPEG 2000 Standard and provide lossless (bit preserving) and lossy compression schemes.
Note
The context where the usage of lossy compression of medical images is clinically acceptable is beyond the scope of the DICOM Standard. The policies associated with the selection of appropriate compression parameters (e.g., compression ratio) for JPEG 2000 lossy compression are also beyond the scope of this Standard.
The use of the DICOM Encapsulated Format to support JPEG 2000 Compressed Pixel Data requires that the Data Elements that are related to the Pixel Data encoding (e.g., Photometric Interpretation, Samples per Pixel, Planar Configuration, Bits Allocated, Bits Stored, High Bit, Pixel Representation, Rows, Columns, etc.) shall contain Values that are consistent with the characteristics of the compressed data stream. The Pixel Data characteristics included in the JPEG 2000 bit stream shall be used to decode the compressed data stream.
The requirements when using a Standard Photometric Interpretation (i.e., a Defined Term from Section C.7.6.3.1.2 in PS3.3) are specified in Table 8.2.4-1. No other Standard Photometric Interpretation Values shall be used.
Table 8.2.4-1. Valid Values of Pixel Data Related Attributes for JPEG 2000 Transfer Syntaxes using Standard Photometric Interpretations
Photometric Interpretation | Transfer Syntax | Transfer Syntax UID | Samples per Pixel | Planar Configuration | Pixel Representation | Bits Allocated | Bits Stored | High Bit
MONOCHROME1, MONOCHROME2 | JPEG 2000 (Lossless Only) | 1.2.840.10008.1.2.4.90 | 1 | absent | 0 or 1 | 1, 8, 16, 24, 32 or 40 | 1-38 | 0-37
MONOCHROME1, MONOCHROME2 | JPEG 2000 | 1.2.840.10008.1.2.4.91 | 1 | absent | 0 or 1 | 1, 8, 16, 24, 32 or 40 | 1-38 | 0-37
PALETTE COLOR | JPEG 2000 (Lossless Only) | 1.2.840.10008.1.2.4.90 | 1 | absent | 0 | 8 or 16 | 1-16 | 0-15
YBR_RCT | JPEG 2000 (Lossless Only) | 1.2.840.10008.1.2.4.90 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
YBR_RCT | JPEG 2000 | 1.2.840.10008.1.2.4.91 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
YBR_ICT | JPEG 2000 | 1.2.840.10008.1.2.4.91 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
RGB | JPEG 2000 (Lossless Only) | 1.2.840.10008.1.2.4.90 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
RGB | JPEG 2000 | 1.2.840.10008.1.2.4.91 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
YBR_FULL | JPEG 2000 (Lossless Only) | 1.2.840.10008.1.2.4.90 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
YBR_FULL | JPEG 2000 | 1.2.840.10008.1.2.4.91 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
Note
These requirements are specified in terms of consistency with what is encapsulated, rather than in terms of the uncompressed pixel data from which the compressed data stream may have been derived.
When decompressing, should the characteristics explicitly specified in the compressed data stream be inconsistent with those specified in the DICOM Data Elements, those explicitly specified in the compressed data stream should be used to control the decompression. The DICOM Data Elements, if inconsistent, can be regarded as suggestions as to the form in which an uncompressed Data Set might be encoded, subject to the general and IOD-specific rules for uncompressed Photometric Interpretation and Planar Configuration, which may require that decompressed data be converted to one of the permitted forms.
The JPEG 2000 bit stream specifies whether or not a reversible or irreversible multi-component (color) transformation [ISO 15444-1 Annex G], if any, has been applied. If no multi-component transformation has been applied, then the components shall correspond to those specified by the DICOM Attribute Photometric Interpretation (0028,0004). If the JPEG 2000 Part 1 reversible multi-component transformation has been applied then the DICOM Attribute Photometric Interpretation (0028,0004) shall be YBR_RCT. If the JPEG 2000 Part 1 irreversible multi-component transformation has been applied then the DICOM Attribute Photometric Interpretation (0028,0004) shall be YBR_ICT.
Note
-
For example, a single component may be present, and the Photometric Interpretation (0028,0004) may be MONOCHROME2.
-
The application of a JPEG 2000 Part 1 reversible multi-component transformation is signaled in the JPEG 2000 bit stream by a value of 1 rather than 0 in the SGcod Multiple component transformation type of the COD marker segment [ISO 15444-1 Table A.17]. No other Value of Photometric Interpretation than YBR_RCT or YBR_ICT is permitted when SGcod Multiple component transformation type is 1.
-
Though it would be unusual, would not take advantage of correlation between the red, green and blue components, and would not achieve effective compression, a Photometric Interpretation of RGB could be specified as long as no multi-component transformation [ISO 15444-1 Annex G] was specified by the JPEG 2000 bit stream. For some applications the use of RGB is permitted, e.g., Whole Slide Microscopy Images, to allow conversion to DICOM from proprietary formats without loss due to color space transformation. Alternative methods of decorrelation of the color components than those specified in [ISO 15444-1 Annex G] are permitted as defined in PS3.3, such as a Photometric Interpretation of YBR_FULL; this may be useful when converting existing YBR_FULL Pixel Data (e.g., in a different Transfer Syntax) without further loss.
In either case (Photometric Interpretation of RGB or YBR_FULL), the value of SGcod Multiple component transformation type would be 0.
PS3.3 may constrain the Values of Photometric Interpretation for specific IODs.
-
Despite the application of a multi-component color transformation and its reflection in the Photometric Interpretation Attribute, the "color space" remains undefined. There is currently no means of conveying "standard color spaces" either by fixed values (such as sRGB) or by ICC profiles. Note in particular that the JP2 file header is not sent in the JPEG 2000 bit stream that is encapsulated in DICOM.
-
If JPEG 2000 Compressed Pixel Data is decompressed and re-encoded in Native (uncompressed) form, then the Data Elements that are related to the Pixel Data encoding are updated accordingly. If color components are converted from YBR_ICT or YBR_RCT to RGB during decompression and Native re-encoding, the Photometric Interpretation will be changed to RGB in the Data Set with the Native encoding.
-
The upper limit of 40 on Bits Allocated (0028,0100) and 38 on Bits Stored (0028,0101) reflects the maximum JPEG 2000 sample precision of 38 and the DICOM requirement to describe Bits Allocated (0028,0100) as multiples of bytes (octets).
-
A Bits Allocated (0028,0100) of 1 for monochrome images supports compression of Segmentation IOD Pixel Data with a Segmentation Type (0062,0001) of BINARY.
The JPEG 2000 bit stream is capable of encoding both signed and unsigned pixel values, hence the Value of Pixel Representation (0028,0103) may be either 0 or 1 for monochrome Photometric Interpretations depending on what has been encoded (as specified in the SIZ marker segment in the precision and sign of component parameter).
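As a rough illustration of how a decoder might read these characteristics from the bit stream itself, the following Python sketch walks the main header of a raw JPEG 2000 codestream. It extracts the sign and precision of the first component from the SIZ marker segment and the SGcod Multiple component transformation type from the COD marker segment. The function names are hypothetical. The byte offsets, and the pairing of the 5-3 reversible filter with YBR_RCT and the 9-7 irreversible filter with YBR_ICT, reflect one reading of [ISO 15444-1] Annex A and Annex G and should be verified against that standard before use.

```python
import struct

SOC, SIZ, COD, SOT = 0xFF4F, 0xFF51, 0xFF52, 0xFF90  # JPEG 2000 marker codes


def inspect_j2k_main_header(codestream: bytes) -> dict:
    """Collect sign, precision and colour-transform hints from the main header."""
    if struct.unpack_from(">H", codestream, 0)[0] != SOC:
        raise ValueError("not a raw JPEG 2000 codestream (SOC marker missing)")
    info = {}
    pos = 2
    while pos + 4 <= len(codestream):
        marker, length = struct.unpack_from(">HH", codestream, pos)
        if marker == SOT:  # first tile-part reached: the main header is over
            break
        # The length field counts itself but not the 2-byte marker.
        body = codestream[pos + 4 : pos + 2 + length]
        if marker == SIZ:
            # Rsiz (2 bytes), eight 4-byte geometry fields, Csiz (2 bytes),
            # then Ssiz/XRsiz/YRsiz for each component.
            info["samples_per_pixel"] = struct.unpack_from(">H", body, 34)[0]
            ssiz = body[36]  # first component only
            info["pixel_representation"] = ssiz >> 7  # 1 = signed
            info["bits_stored"] = (ssiz & 0x7F) + 1
        elif marker == COD:
            # Scod, progression order, layers (2 bytes), MCT, levels,
            # code-block width, height, style, wavelet transformation.
            info["mct"] = body[4]
            info["reversible"] = body[9] == 1  # 1 = 5-3 reversible filter
        pos += 2 + length
    return info


def photometric_from_header(info: dict, declared: str) -> str:
    """Photometric Interpretation implied by the COD marker segment."""
    if info.get("mct") == 1:
        return "YBR_RCT" if info["reversible"] else "YBR_ICT"
    return declared  # no transformation: the components are as declared
```

A receiver could compare the returned values with Pixel Representation (0028,0103), Bits Stored (0028,0101) and Photometric Interpretation (0028,0004) in the Data Set and, where they disagree, let the bit stream control the decompression.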
The Value of Planar Configuration (0028,0006) is irrelevant since the manner of encoding components is specified in the JPEG 2000 standard, hence it shall be set to 0.
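The constraints of Table 8.2.4-1 lend themselves to a simple lookup. The sketch below is one possible way to check a set of Attribute Values against that table; the function and variable names are hypothetical, and the checks that Bits Stored does not exceed Bits Allocated and that High Bit is one less than Bits Stored are taken from Section 8.1.1 rather than from the table itself.

```python
J2K_LOSSLESS = "1.2.840.10008.1.2.4.90"  # JPEG 2000 (Lossless Only)
J2K = "1.2.840.10008.1.2.4.91"           # JPEG 2000

BYTE_MULTIPLES = {8, 16, 24, 32, 40}
BOTH = {J2K_LOSSLESS, J2K}

# Photometric Interpretation -> (Transfer Syntaxes, Samples per Pixel,
#   Planar Configuration (None = absent), Pixel Representations,
#   Bits Allocated choices, maximum Bits Stored)
TABLE_8_2_4_1 = {
    "MONOCHROME1": (BOTH, 1, None, {0, 1}, BYTE_MULTIPLES | {1}, 38),
    "MONOCHROME2": (BOTH, 1, None, {0, 1}, BYTE_MULTIPLES | {1}, 38),
    "PALETTE COLOR": ({J2K_LOSSLESS}, 1, None, {0}, {8, 16}, 16),
    "YBR_RCT": (BOTH, 3, 0, {0}, BYTE_MULTIPLES, 38),
    "YBR_ICT": ({J2K}, 3, 0, {0}, BYTE_MULTIPLES, 38),
    "RGB": (BOTH, 3, 0, {0}, BYTE_MULTIPLES, 38),
    "YBR_FULL": (BOTH, 3, 0, {0}, BYTE_MULTIPLES, 38),
}


def check_j2k_attributes(ts_uid, photometric, samples, planar, pixel_rep,
                         bits_allocated, bits_stored, high_bit):
    """Return a list of violations; an empty list means the values are consistent."""
    if photometric not in TABLE_8_2_4_1:
        return [f"{photometric!r} is not a permitted Standard Photometric Interpretation"]
    syntaxes, n_samples, planar_req, reps, allocs, max_stored = TABLE_8_2_4_1[photometric]
    problems = []
    if ts_uid not in syntaxes:
        problems.append("Transfer Syntax not permitted for this Photometric Interpretation")
    if samples != n_samples:
        problems.append(f"Samples per Pixel shall be {n_samples}")
    if planar != planar_req:
        problems.append("Planar Configuration shall be "
                        + ("absent" if planar_req is None else str(planar_req)))
    if pixel_rep not in reps:
        problems.append(f"Pixel Representation shall be one of {sorted(reps)}")
    if bits_allocated not in allocs:
        problems.append(f"Bits Allocated shall be one of {sorted(allocs)}")
    if not 1 <= bits_stored <= min(max_stored, bits_allocated):
        problems.append("Bits Stored is out of range")
    if high_bit != bits_stored - 1:
        problems.append("High Bit shall be one less than Bits Stored")
    return problems
```

For instance, `check_j2k_attributes(J2K_LOSSLESS, "MONOCHROME2", 1, None, 1, 16, 12, 11)` returns an empty list, whereas pairing YBR_ICT with the Lossless Only Transfer Syntax is reported as a violation.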
8.2.5 MPEG2 Main Profile / Main Level Video Compression
DICOM provides a mechanism for supporting the use of MPEG2 Main Profile / Main Level Video Compression through the Encapsulated Format. Annex A defines Non-Fragmentable and Fragmentable Encapsulated Transfer Syntaxes that reference the MPEG2 Main Profile / Main Level Standard.
Note
MPEG2 compression is inherently lossy. The context where the usage of lossy compression of medical images is clinically acceptable is beyond the scope of the DICOM Standard. The policies associated with the selection of appropriate compression parameters (e.g., compression ratio) for MPEG2 Main Profile / Main Level are also beyond the scope of this Standard.
The use of the DICOM Encapsulated Format to support MPEG2 Main Profile / Main Level compressed pixel data requires that the Data Elements that are related to the Pixel Data encoding (e.g., Photometric Interpretation, Samples per Pixel, Planar Configuration, Bits Allocated, Bits Stored, High Bit, Pixel Representation, Rows, Columns, etc.) shall contain Values that are consistent with the characteristics of the compressed data stream, with some specific exceptions noted here. The Pixel Data characteristics included in the MPEG2 Main Profile / Main Level bit stream shall be used to decode the compressed data stream.
Note
These requirements are specified in terms of consistency with what is encapsulated, rather than in terms of the uncompressed pixel data from which the compressed data stream may have been derived.
When decompressing, should the characteristics explicitly specified in the compressed data stream be inconsistent with those specified in the DICOM Data Elements, those explicitly specified in the compressed data stream should be used to control the decompression. The DICOM Data Elements, if inconsistent, can be regarded as suggestions as to the form in which an uncompressed Data Set might be encoded, subject to the general and IOD-specific rules for uncompressed Photometric Interpretation and Planar Configuration, which may require that decompressed data be converted to one of the permitted forms.
The MPEG2 Main Profile / Main Level bit stream specifies whether or not a reversible or irreversible multi-component (color) transformation, if any, has been applied. If no multi-component transformation has been applied, then the components shall correspond to those specified by the DICOM Attribute Photometric Interpretation (0028,0004). MPEG2 Main Profile / Main Level applies an irreversible multi-component transformation, so DICOM Attribute Photometric Interpretation (0028,0004) shall be YBR_PARTIAL_420 in the case of multi-component data, and MONOCHROME2 in the case of single component data (even though the MPEG2 bit stream itself is always encoded as three components, one luminance and two chrominance).
Note
-
If MPEG2 Compressed Pixel Data is decompressed and re-encoded in Native (uncompressed) form, then the Data Elements that are related to the Pixel Data encoding are updated accordingly. If color components are converted from YBR_PARTIAL_420 to RGB during decompression and Native re-encoding, the Photometric Interpretation will be changed to RGB in the Data Set with the Native encoding.
-
MPEG2 provides for several video formats, each used in a different market, including ITU-R BT.470-2 System M for SD NTSC and ITU-R BT.470-2 System B/G for SD PAL/SECAM. A PAL-based system should therefore be based on ITU-R BT.470-2 System B for each of Color Primaries, Transfer Characteristic (gamma) and matrix coefficients, and each should take a value of 5 as defined in [ISO/IEC 13818-2].
The Value of Planar Configuration (0028,0006) is irrelevant since the manner of encoding components is specified in the MPEG2 Main Profile / Main Level standard, hence it shall be set to 0.
In summary:
-
Samples per Pixel (0028,0002) shall be 3
-
Photometric Interpretation (0028,0004) shall be YBR_PARTIAL_420
-
Bits Allocated (0028,0100) shall be 8
-
Bits Stored (0028,0101) shall be 8
-
High Bit (0028,0102) shall be 7
-
Pixel Representation (0028,0103) shall be 0
-
Planar Configuration (0028,0006) shall be 0
-
Rows (0028,0010), Columns (0028,0011), Cine Rate (0018,0040) and Frame Time (0018,1063) or Frame Time Vector (0018,1065) shall be consistent with the limitations of Main Profile / Main Level, as specified in Table 8-1.
Table 8-1. MPEG2 Main Profile / Main Level Image Transfer Syntax Rows and Columns Attributes
Video Type | Spatial resolution | Frame Rate (see Note 4) | Frame Time (see Note 5) | Maximum Rows | Maximum Columns
525-line NTSC | Full | 30 | 33.33 ms | 480 | 720
625-line PAL | Full | 25 | 40.0 ms | 576 | 720
Note
-
Although different combinations of Values for Rows and Columns are possible while respecting the maximum values listed above, it is recommended that the typical 4:3 ratio of image width to height be maintained in order to avoid image deformation by MPEG2 decoders. A common way to maintain the ratio of width to height is to pad the image with black areas on either side.
-
"Half" definition pictures (240x352 and 288x352 for NTSC and PAL, respectively) are always supported by decoders.
-
Main Profile / Main Level allows for various different display and pixel aspect ratios, including the use of square pixels, and the use of non-square pixels with display aspect ratios of 4:3 and 16:9. DICOM specifies no additional restrictions beyond what is provided for in Main Profile / Main Level. All permutations allowed by Main Profile / Main Level are valid and are required to be supported by all DICOM decoders.
-
The actual frame rate for NTSC MPEG2 is approximately 29.97 frames/sec.
-
The nominal Frame Time is supplied for the purpose of inclusion on the DICOM Cine Module Attributes, and should be calculated from the actual frame rate.
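The summary list and the limits of Table 8-1 can be expressed as a small consistency check. The following Python sketch is only illustrative: the dictionary keys are assumed attribute keywords, the function name is hypothetical, and the check follows the "In summary" list (which fixes Photometric Interpretation to YBR_PARTIAL_420) rather than attempting to cover every case discussed in the preceding text.

```python
# Fixed Values from the summary list for MPEG2 Main Profile / Main Level.
FIXED_VALUES = {
    "SamplesPerPixel": 3,
    "PhotometricInterpretation": "YBR_PARTIAL_420",
    "BitsAllocated": 8,
    "BitsStored": 8,
    "HighBit": 7,
    "PixelRepresentation": 0,
    "PlanarConfiguration": 0,
}

# Video Type -> (Maximum Rows, Maximum Columns), from Table 8-1.
MAXIMUM_SIZE = {
    "525-line NTSC": (480, 720),
    "625-line PAL": (576, 720),
}


def check_mpml(attrs: dict, video_type: str) -> list:
    """Return a list of violations; an empty list means no problem was found."""
    problems = [
        f"{name} is {attrs.get(name)!r}, expected {expected!r}"
        for name, expected in FIXED_VALUES.items()
        if attrs.get(name) != expected
    ]
    max_rows, max_columns = MAXIMUM_SIZE[video_type]
    if attrs.get("Rows", 0) > max_rows:
        problems.append(f"Rows exceeds {max_rows} for {video_type}")
    if attrs.get("Columns", 0) > max_columns:
        problems.append(f"Columns exceeds {max_columns} for {video_type}")
    return problems
```

A 576-row picture, for example, passes for "625-line PAL" but is reported as too tall for "525-line NTSC".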
For the Non-Fragmentable Encapsulated Transfer Syntax, one Fragment shall contain the whole MPEG2 stream.
For the Fragmentable Encapsulated Transfer Syntax, the stream may be segmented into multiple Fragments.
Note
-
If a video stream exceeds the maximum length of one fragment (2^32-2 bytes), it may be sent using a Fragmentable Encapsulated Transfer Syntax. Alternatively, it may be sent using a Non-Fragmentable Encapsulated Transfer Syntax as multiple SOP Instances, but each SOP Instance will contain an independent and playable bit stream, and not depend on the encoded bit stream in other (previous) instances. The manner in which such separate instances are related is not specified in the Standard, but mechanisms such as grouping into the same Series, and references to earlier instances using Referenced Image Sequence may be used.
-
Fragmentable Encapsulated Transfer Syntaxes allow for streams of essentially unlimited length; the only limit imposed is the maximum Number of Frames (0028,0008), which is 2^31-1 frames (largest positive Value in an Integer String VR).
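The fragment size limit can be illustrated with a minimal Python sketch that wraps a video stream into the Value of an encapsulated Pixel Data element: a zero-length Basic Offset Table Item, one or more Fragment Items, and a Sequence Delimitation Item. The Item layout (Little Endian tags (FFFE,E000) and (FFFE,E0DD), each followed by a 4-byte length) follows the author's understanding of the Encapsulated Format described in Annex A. The function name and the choice of a 00H byte to pad an odd-length stream are assumptions for illustration.

```python
import struct

ITEM_TAG = b"\xfe\xff\x00\xe0"       # (FFFE,E000), Little Endian
SEQ_DELIM_TAG = b"\xfe\xff\xdd\xe0"  # (FFFE,E0DD), Little Endian
MAX_FRAGMENT = 0xFFFFFFFE            # 2^32-2 bytes


def encapsulate_video(stream: bytes, fragment_size: int = MAX_FRAGMENT) -> bytes:
    """Wrap a video stream as Basic Offset Table + Fragments + delimiter."""
    if fragment_size % 2 or not 0 < fragment_size <= MAX_FRAGMENT:
        raise ValueError("fragment_size must be even and at most 2^32-2")
    if len(stream) % 2:
        stream += b"\x00"  # assumption: pad to even length with 00H
    parts = [ITEM_TAG, struct.pack("<I", 0)]  # empty Basic Offset Table
    for start in range(0, len(stream), fragment_size):
        fragment = stream[start:start + fragment_size]
        parts += [ITEM_TAG, struct.pack("<I", len(fragment)), fragment]
    parts += [SEQ_DELIM_TAG, struct.pack("<I", 0)]
    return b"".join(parts)
```

With the default `fragment_size`, any stream of up to 2^32-2 bytes yields a single Fragment, as a Non-Fragmentable Encapsulated Transfer Syntax requires; a smaller `fragment_size` produces multiple Fragments, which is only appropriate for a Fragmentable Encapsulated Transfer Syntax.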
The Basic Offset Table shall be empty (present but zero length).
Note
The Basic Offset Table is not used because MPEG2 contains its own mechanism for describing navigation of frames. To enable decoding of only a part of the sequence, MPEG2 manages a header in any group of pictures (GOP) containing a time_code - a 25-bit integer containing the following: drop_frame_flag, time_code_hours, time_code_minutes, marker_bit, time_code_seconds and time_code_pictures.
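To make the structure of the 25-bit time_code concrete, the sketch below unpacks it into the six fields listed above. The field widths (1, 5, 6, 1, 6 and 6 bits, most significant first) are not stated in this Standard; they are the author's recollection of the group of pictures header in [ISO/IEC 13818-2] and should be checked against it. The function name is hypothetical.

```python
def decode_gop_time_code(time_code: int) -> dict:
    """Split a 25-bit GOP time_code into its fields (most significant first)."""
    if not 0 <= time_code < (1 << 25):
        raise ValueError("time_code must fit in 25 bits")
    return {
        "drop_frame_flag": (time_code >> 24) & 0x01,     # 1 bit
        "time_code_hours": (time_code >> 19) & 0x1F,     # 5 bits
        "time_code_minutes": (time_code >> 13) & 0x3F,   # 6 bits
        "marker_bit": (time_code >> 12) & 0x01,          # 1 bit
        "time_code_seconds": (time_code >> 6) & 0x3F,    # 6 bits
        "time_code_pictures": time_code & 0x3F,          # 6 bits
    }
```

A player seeking to a given time could compare the decoded hours, minutes, seconds and pictures of successive GOP headers with the requested position, without relying on a Basic Offset Table.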
The container format for the video bit stream is not constrained. For example, it may be an MPEG-2 Transport Stream (MPEG-TS), MPEG-2 Program Stream (MPEG-PS), MPEG-2 Elementary Stream (MPEG-ES), MPEG-2 Packetized Elementary Stream (MPEG-PES) (see [ISO/IEC 13818-1]) or MPEG-4 (MP4) container (see [ISO/IEC 14496-12] and [ISO/IEC 14496-14]).
Any audio components present within the MPEG bit stream shall comply with the following restrictions:
-
CBR MPEG-1 LAYER III (MP3) Audio Standard
-
up to 24 bits
-
32 kHz, 44.1 kHz or 48 kHz for the main channel (the complementary channels can be sampled at the half rate, as defined in the Standard)
-
one main mono or stereo channel, and optionally one or more complementary channel(s)
Note
-
MPEG-1 Layer III is standardized in Part 3 of the MPEG-1 standard (see [ISO/IEC 11172-3]).
-
Although MPEG describes each channel as including up to 5 signals (e.g., for surround effects), it is recommended to limit each of the two channels to 2 signals (stereo).
8.2.6 MPEG2 Main Profile / High Level Video Compression
MPEG2 Main Profile / High Level corresponds to what is commonly known as HDTV ('High Definition Television'). DICOM provides a mechanism for supporting the use of MPEG2 Main Profile / High Level Video Compression through the Encapsulated Format. Annex A defines Non-Fragmentable and Fragmentable Encapsulated Transfer Syntaxes that reference the MPEG2 Main Profile / High Level Standard.
Note
MPEG2 compression is inherently lossy. The context where the usage of lossy compression of medical images is clinically acceptable is beyond the scope of the DICOM Standard. The policies associated with the selection of appropriate compression parameters (e.g., compression ratio) for MPEG2 Main Profile / High Level are also beyond the scope of this Standard.
The use of the DICOM Encapsulated Format to support MPEG2 Main Profile / High Level compressed pixel data requires that the Data Elements that are related to the Pixel Data encoding (e.g., Photometric Interpretation, Samples per Pixel, Planar Configuration, Bits Allocated, Bits Stored, High Bit, Pixel Representation, Rows, Columns, etc.) shall contain Values that are consistent with the characteristics of the compressed data stream, with some specific exceptions noted here. The Pixel Data characteristics included in the MPEG2 Main Profile / High Level bit stream shall be used to decode the compressed data stream.
Note
These requirements are specified in terms of consistency with what is encapsulated, rather than in terms of the uncompressed pixel data from which the compressed data stream may have been derived.
When decompressing, should the characteristics explicitly specified in the compressed data stream be inconsistent with those specified in the DICOM Data Elements, those explicitly specified in the compressed data stream should be used to control the decompression. The DICOM Data Elements, if inconsistent, can be regarded as suggestions as to the form in which an uncompressed Data Set might be encoded, subject to the general and IOD-specific rules for uncompressed Photometric Interpretation and Planar Configuration, which may require that decompressed data be converted to one of the permitted forms.
Note
If MPEG2 Compressed Pixel Data is decompressed and re-encoded in Native (uncompressed) form, then the Data Elements that are related to the Pixel Data encoding are updated accordingly. If color components are converted from YBR_PARTIAL_420 to RGB during decompression and Native re-encoding, the Photometric Interpretation will be changed to RGB in the Data Set with the Native encoding.
The requirements are:
-
Planar Configuration (0028,0006) shall be 0
Note
The Value of Planar Configuration (0028,0006) is irrelevant since the manner of encoding components is specified in the MPEG2 standard, hence it is set to 0.
-
Samples per Pixel (0028,0002) shall be 3
-
Photometric Interpretation (0028,0004) shall be YBR_PARTIAL_420 or MONOCHROME2
-
Bits Allocated (0028,0100) shall be 8
-
Bits Stored (0028,0101) shall be 8
-
High Bit (0028,0102) shall be 7
-
Pixel Representation (0028,0103) shall be 0
-
Rows (0028,0010) shall be either 720 or 1080
-
Columns (0028,0011) shall be 1280 if Rows is 720, or shall be 1920 if Rows is 1080.
-
The value of MPEG2 aspect_ratio_information shall be 0011 in the encapsulated MPEG2 data stream corresponding to a 'Display Aspect Ratio' (DAR) of 16:9.
-
The DICOM Attribute Pixel Aspect Ratio (0028,0034) shall be absent. This corresponds to a 'Sampling Aspect Ratio' (SAR) of 1:1.
-
Cine Rate (0018,0040) and Frame Time (0018,1063) or Frame Time Vector (0018,1065) shall be consistent with the limitations of Main Profile / High Level, as specified in Table 8-2.
Table 8-2. MPEG2 Main Profile / High Level Image Transfer Syntax Frame Rate Attributes
Video Type | Spatial resolution layer | Frame Rate (see Note 2) | Frame Time (see Note 3)
30 Hz HD | Single level, Enhancement | 30 | 33.33 ms
25 Hz HD | Single level, Enhancement | 25 | 40.0 ms
60 Hz HD | Single level, Enhancement | 60 | 16.67 ms
50 Hz HD | Single level, Enhancement | 50 | 20.00 ms
Note
-
The requirements on rows and columns are to maximize interoperability between software environments and commonly available hardware MPEG2 encoder/decoder implementations. Should the source picture have a lower value, it should be re-formatted accordingly by scaling and/or pixel padding prior to MPEG2 encoding.
-
The frame rate of the acquiring camera for '30 Hz HD' MPEG2 may be either 30 or 30/1.001 (approximately 29.97) frames/sec. Similarly, the frame rate in the case of 60 Hz may be either 60 or 60/1.001 (approximately 59.94) frames/sec. This may lead to small inconsistencies between the video timebase and real time.
-
The Frame Time (0018,1063) may be calculated from the frame rate of the acquiring camera. A frame time of 33.367 ms corresponds to 29.97 frames per second.
-
The value of chroma_format for this profile and level is defined by MPEG as 4:2:0.
-
Examples of screen resolutions supported by MPEG2 Main Profile / High Level are shown in Table 8-3. Frame rates of 50 Hz and 60 Hz (progressive) at the maximum resolution of 1080 by 1920 are not supported by Main Profile / High Level. Interlace at the maximum resolution is supported at a field rate of 50 Hz or 60 Hz, which corresponds to a frame rate of 25 Hz or 30 Hz respectively as described in Table 8-3.
-
An MPEG2 Main Profile / High Level decoder is able to decode bit streams conforming to lower levels. These include the 1080 by 1440 bit streams of MP@H-14, and the Main Level bit streams used in the existing MPEG2 Main Profile / Main Level Transfer Syntax in the Visible Light IOD.
-
MP@H-14 is not supported by this Transfer Syntax.
-
The restriction of DAR to 16:9 is required to ensure interoperability because of limitations in commonly available hardware chip set implementations for MPEG2 Main Profile / High Level.
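Where a source picture is smaller than the permitted Rows and Columns, the pixel padding mentioned in the notes can be planned as in the following Python sketch. It picks the smallest permitted format that contains the source and splits the padding evenly between opposite edges. Centring the picture is an assumption made for illustration (the Standard only calls for scaling and/or pixel padding), and the function name is hypothetical.

```python
# Permitted (Rows, Columns) pairs for MPEG2 Main Profile / High Level.
HD_FORMATS = [(720, 1280), (1080, 1920)]


def plan_padding(rows: int, columns: int) -> dict:
    """Choose the smallest permitted format and the padding on each edge."""
    for target_rows, target_columns in HD_FORMATS:
        if rows <= target_rows and columns <= target_columns:
            pad_rows = target_rows - rows
            pad_columns = target_columns - columns
            top, left = pad_rows // 2, pad_columns // 2
            return {
                "Rows": target_rows,
                "Columns": target_columns,
                "top": top,
                "bottom": pad_rows - top,
                "left": left,
                "right": pad_columns - left,
            }
    raise ValueError("source exceeds 1080 by 1920 and would have to be scaled down")
```

A 480 by 640 source, for example, would be placed in a 720 by 1280 picture, while a 768 by 1024 source needs the 1080 by 1920 format because it has more than 720 rows.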
Table 8-3. Examples of MPEG2 Main Profile / High Level Screen Resolution
Rows | Columns | Frame rate | Video Type | Progressive or Interlace
1080 | 1920 | 25 | 25 Hz HD | P
1080 | 1920 | 29.97, 30 | 30 Hz HD | P
1080 | 1920 | 25 | 25 Hz HD | I
1080 | 1920 | 29.97, 30 | 30 Hz HD | I
720 | 1280 | 25 | 25 Hz HD | P
720 | 1280 | 29.97, 30 | 30 Hz HD | P
720 | 1280 | 50 | 50 Hz HD | P
720 | 1280 | 59.94, 60 | 60 Hz HD | P
For the Non-Fragmentable Encapsulated Transfer Syntax, one Fragment shall contain the whole MPEG2 bit stream.
For the Fragmentable Encapsulated Transfer Syntax, the stream may be segmented into multiple Fragments.
Note
-
If a video stream exceeds the maximum length of one fragment (2^32-2 bytes), it may be sent using a Fragmentable Encapsulated Transfer Syntax. Alternatively, it may be sent using a Non-Fragmentable Encapsulated Transfer Syntax as multiple SOP Instances, but each SOP Instance will contain an independent and playable bit stream, and not depend on the encoded bit stream in other (previous) instances. The manner in which such separate instances are related is not specified in the Standard, but mechanisms such as grouping into the same Series, and references to earlier instances using Referenced Image Sequence may be used.
-
Fragmentable Encapsulated Transfer Syntaxes allow for streams of essentially unlimited length; the only limit imposed is the maximum Number of Frames (0028,0008), which is 2^31-1 frames (largest positive Value in an Integer String VR).
The Basic Offset Table in the Pixel Data (7FE0,0010) shall be empty (present but zero length).
Note
The Basic Offset Table is not used because MPEG2 contains its own mechanism for describing navigation of frames. To enable decoding of only a part of the sequence, MPEG2 manages a header in any group of pictures (GOP) containing a time_code - a 25-bit integer containing the following: drop_frame_flag, time_code_hours, time_code_minutes, marker_bit, time_code_seconds and time_code_pictures.
The container format for the video bit stream is not constrained. For example, it may be an MPEG-2 Transport Stream (MPEG-TS), MPEG-2 Program Stream (MPEG-PS), MPEG-2 Elementary Stream (MPEG-ES), MPEG-2 Packetized Elementary Stream (MPEG-PES) (see [ISO/IEC 13818-1]) or MPEG-4 (MP4) container (see [ISO/IEC 14496-12] and [ISO/IEC 14496-14]).
Any audio components present within the MPEG2 Main Profile / High Level bit stream shall comply with the restrictions as for MPEG2 Main Profile / Main Level as stated in Section 8.2.5.
8.2.7 MPEG-4 AVC/H.264 High Profile / Level 4.1 Video Compression
MPEG-4 AVC/H.264 High Profile / Level 4.1 corresponds to what is commonly known as HDTV ('High Definition Television'). DICOM provides a mechanism for supporting the use of MPEG-4 AVC/H.264 Image Compression through the Encapsulated Format. Annex A defines Non-Fragmentable and Fragmentable Encapsulated Transfer Syntaxes that reference the MPEG-4 AVC/H.264 Standard.
Note
MPEG-4 AVC/H.264 High Profile compression is inherently lossy. The context where the usage of lossy compression of medical images is clinically acceptable is beyond the scope of the DICOM Standard. The policies associated with the selection of appropriate compression parameters (e.g., compression ratio) for MPEG-4 AVC/H.264 High Profile / Level 4.1 are also beyond the scope of this Standard.
The use of the DICOM Encapsulated Format to support MPEG-4 AVC/H.264 compressed pixel data requires that the Data Elements that are related to the Pixel Data encoding (e.g., Photometric Interpretation, Samples per Pixel, Planar Configuration, Bits Allocated, Bits Stored, High Bit, Pixel Representation, Rows, Columns, etc.) shall contain Values that are consistent with the characteristics of the compressed data stream, with some specific exceptions noted here. The Pixel Data characteristics included in the MPEG-4 AVC/H.264 bit stream shall be used to decode the compressed data stream.
Note
These requirements are specified in terms of consistency with what is encapsulated, rather than in terms of the uncompressed pixel data from which the compressed data stream may have been derived.
When decompressing, should the characteristics explicitly specified in the compressed data stream be inconsistent with those specified in the DICOM Data Elements, those explicitly specified in the compressed data stream should be used to control the decompression. The DICOM Data Elements, if inconsistent, can be regarded as suggestions as to the form in which an uncompressed Data Set might be encoded, subject to the general and IOD-specific rules for uncompressed Photometric Interpretation and Planar Configuration, which may require that decompressed data be converted to one of the permitted forms.
Note
If MPEG-4 Compressed Pixel Data is decompressed and re-encoded in Native (uncompressed) form, then the Data Elements that are related to the Pixel Data encoding are updated accordingly. If color components are converted from YBR_PARTIAL_420 to RGB during decompression and Native re-encoding, the Photometric Interpretation will be changed to RGB in the Data Set with the Native encoding.
The requirements are:
-
Planar Configuration (0028,0006) shall be 0
-
Samples per Pixel (0028,0002) shall be 3
-
Photometric Interpretation (0028,0004) shall be YBR_PARTIAL_420
-
Bits Allocated (0028,0100) shall be 8
-
Bits Stored (0028,0101) shall be 8
-
High Bit (0028,0102) shall be 7
-
Pixel Representation (0028,0103) shall be 0
-
The value of MPEG-4 AVC/H.264 sample aspect_ratio_idc shall be 1 in the encapsulated MPEG-4 AVC/H.264 bit stream if aspect_ratio_info_present_flag is 1.
-
Pixel Aspect Ratio (0028,0034) shall be absent. This corresponds to a 'Sampling Aspect Ratio' (SAR) of 1:1.
-
The possible Values for Rows (0028,0010), Columns (0028,0011), Cine Rate (0018,0040), and Frame Time (0018,1063) or Frame Time Vector (0018,1065) depend on the used Transfer Syntax.
-
For MPEG-4 AVC/H.264 High Profile / Level 4.1 Transfer Syntax, the Values for these Data Elements shall be compliant with the High Profile / Level 4.1 of the MPEG-4 AVC/H.264 standard ([ISO/IEC 14496-10]) and restricted to a square pixel aspect ratio.
-
For MPEG-4 AVC/H.264 BD-compatible High Profile / Level 4.1 Transfer Syntax, the Values for these Data Elements shall be as specified in Table 8-4.
Table 8-4. Values Permitted for MPEG-4 AVC/H.264 BD-compatible High Profile / Level 4.1
Rows | Columns | Frame rate | Video Type | Progressive or Interlace
1080 | 1920 | 25 | 25 Hz HD | I
1080 | 1920 | 29.97 | 30 Hz HD | I
1080 | 1920 | 24 | 24 Hz HD | P
1080 | 1920 | 23.976 | 24 Hz HD | P
720 | 1280 | 50 | 50 Hz HD | P
720 | 1280 | 59.94 | 60 Hz HD | P
720 | 1280 | 24 | 24 Hz HD | P
720 | 1280 | 23.976 | 24 Hz HD | P
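Because Table 8-4 is a closed list, an encoder could test a candidate video format against it directly. The Python sketch below shows one possible approach; the function name and the tolerance used to match fractional rates such as 30000/1001 against the tabulated 29.97 are assumptions for illustration.

```python
# (Rows, Columns, frame rate, "P" progressive or "I" interlace), from Table 8-4.
BD_COMPATIBLE_FORMATS = [
    (1080, 1920, 25.0, "I"),
    (1080, 1920, 29.97, "I"),
    (1080, 1920, 24.0, "P"),
    (1080, 1920, 23.976, "P"),
    (720, 1280, 50.0, "P"),
    (720, 1280, 59.94, "P"),
    (720, 1280, 24.0, "P"),
    (720, 1280, 23.976, "P"),
]


def is_bd_compatible(rows: int, columns: int, frame_rate: float, scan: str) -> bool:
    """True if the format matches a row of Table 8-4 (rate within 0.005 Hz)."""
    return any(
        rows == r and columns == c and scan == s and abs(frame_rate - rate) < 0.005
        for r, c, rate, s in BD_COMPATIBLE_FORMATS
    )
```

For example, `is_bd_compatible(1080, 1920, 30000 / 1001, "I")` is true, whereas the same resolution and rate with progressive scan is not listed and yields false.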
Note
-
The Value of Planar Configuration (0028,0006) is irrelevant since the manner of encoding components is specified in the MPEG-4 AVC/H.264 standard, hence it is set to 0.
-
The limitations on rows and columns are to maximize interoperability between software environments and commonly available hardware MPEG-4 AVC/H.264 encoder/decoder implementations. Source pictures that have a lower value should be re-formatted by scaling and/or pixel padding prior to MPEG-4 AVC/H.264 encoding.
-
The frame rate of the acquiring camera for '30 Hz HD' MPEG-4 AVC/H.264 may be either 30 or 30/1.001 (approximately 29.97) frames/sec. Similarly, the frame rate in the case of 60 Hz may be either 60 or 60/1.001 (approximately 59.94) frames/sec. This may lead to small inconsistencies between the video timebase and real time. The relationship between frame rate and frame time is shown in Table 8-5.
-
The Frame Time (0018,1063) may be calculated from the frame rate of the acquiring camera. A frame rate of 29.97 frames per second corresponds to a frame time of 33.367 ms.
-
The value of chroma_format for this profile and level is defined by MPEG as 4:2:0.
-
Example screen resolutions supported by MPEG-4 AVC/H.264 High Profile / Level 4.1 can be taken from Table 8-4. Frame rates of 50 Hz and 60 Hz (progressive) at the maximum resolution of 1080 by 1920 are not supported by MPEG-4 AVC/H.264 High Profile / Level 4.1. Interlace at the maximum resolution is supported at a field rate of 50 Hz or 60 Hz, which corresponds to a frame rate of 25 Hz or 30 Hz respectively. Smaller resolutions may be used as long as they comply with the square pixel aspect ratio. An example is XGA resolution with an image resolution of 768 by 1024 pixels. Smaller resolutions allow higher frame rates, for example up to 80 Hz for XGA.
-
The display aspect ratio is defined implicitly by the pixel resolution of the video picture. Only square pixel aspect ratio is allowed. MPEG-4 AVC/H.264 BD-compatible High Profile / Level 4.1 will only support resolutions that result in a 16:9 display aspect ratio.
-
The permitted screen resolutions for the MPEG-4 AVC/H.264 BD-compatible High Profile / Level 4.1 are listed in Table 8-4. Only HD resolutions and no progressive frame rates for 25 or 29.97 frames per second are supported. Frame rates of 50 Hz and 60 Hz (progressive) at the maximum resolution of 1080 by 1920 are not supported.
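The frame rate limits quoted in these notes can be reproduced from the macroblock budget of the level. The following Python sketch is a simplified estimate under stated assumptions: the Level 4.1 limits (at most 8192 macroblocks per frame and 245760 macroblocks per second) are the author's recollection of the level limits table in [ISO/IEC 14496-10] and should be verified there, other constraints of the level (such as bit rate and decoded picture buffer size) are ignored, and the function name is hypothetical.

```python
import math

# Assumed Level 4.1 limits from the level table of the H.264 specification.
MAX_MACROBLOCKS_PER_SECOND = 245_760
MAX_MACROBLOCKS_PER_FRAME = 8_192


def max_frame_rate_level_4_1(rows: int, columns: int) -> float:
    """Upper bound on progressive frame rate from the macroblock rate alone."""
    # A macroblock covers 16 by 16 luma samples; partial blocks count as whole.
    macroblocks = math.ceil(rows / 16) * math.ceil(columns / 16)
    if macroblocks > MAX_MACROBLOCKS_PER_FRAME:
        raise ValueError("picture is too large for Level 4.1")
    return MAX_MACROBLOCKS_PER_SECOND / macroblocks
```

Under these assumptions XGA (768 by 1024) occupies 3072 macroblocks, giving 245760 / 3072 = 80 Hz, while 1080 by 1920 occupies 8160 macroblocks and is limited to just over 30 Hz, which is consistent with 50 Hz and 60 Hz progressive being unsupported at that resolution.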
Table 8-5. MPEG-4 AVC/H.264 High Profile / Level 4.1 Image Transfer Syntax Frame Rate Attributes
Video Type | Spatial resolution layer | Frame Rate (see Note 2) | Frame Time (see Note 3)
30 Hz HD | Single level, Enhancement | 30 | 33.33 ms
25 Hz HD | Single level, Enhancement | 25 | 40.0 ms
60 Hz HD | Single level, Enhancement | 60 | 16.67 ms
50 Hz HD | Single level, Enhancement | 50 | 20.00 ms
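The Frame Time values in Table 8-5 are simply the reciprocal of the frame rate expressed in milliseconds. A minimal Python sketch, with a hypothetical function name, that also handles the 1.001-divided rates exactly:

```python
from fractions import Fraction


def frame_time_ms(rate_numerator: int, rate_denominator: int = 1) -> Fraction:
    """Frame time in milliseconds for a rate of numerator/denominator frames/sec."""
    return Fraction(1000 * rate_denominator, rate_numerator)


nominal_30 = float(frame_time_ms(30))           # nominal 30 Hz
actual_2997 = float(frame_time_ms(30000, 1001)) # 30/1.001 frames/sec
```

Rounded to three decimals, `nominal_30` is 33.333 and `actual_2997` is 33.367, matching the 33.33 ms table entry and the 33.367 ms figure given for 29.97 frames per second.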
For the Non-Fragmentable Encapsulated Transfer Syntax, one Fragment shall contain the whole MPEG-4 AVC/H.264 bit stream.
For the Fragmentable Encapsulated Transfer Syntax, the stream may be segmented into multiple Fragments.
Note
-
If a video stream exceeds the maximum length of one fragment (2^32-2 bytes), it may be sent using a Fragmentable Encapsulated Transfer Syntax. Alternatively, it may be sent using a Non-Fragmentable Encapsulated Transfer Syntax as multiple SOP Instances, but each SOP Instance will contain an independent and playable bit stream, and not depend on the encoded bit stream in other (previous) instances. The manner in which such separate instances are related is not specified in the Standard, but mechanisms such as grouping into the same Series, and references to earlier instances using Referenced Image Sequence may be used.
-
Fragmentable Encapsulated Transfer Syntaxes allow for streams of essentially unlimited length; the only limit imposed is the maximum Number of Frames (0028,0008), which is 2^31-1 frames (largest positive Value in an Integer String VR).
The container format for the video bit stream shall be MPEG-2 Transport Stream, a.k.a. MPEG-TS (see [ISO/IEC 13818-1]) or MPEG-4, a.k.a. MP4 container (see [ISO/IEC 14496-12] and [ISO/IEC 14496-14]). The PTS/DTS of the transport stream shall be used in the MPEG coding.
Any audio components included in the data container shall follow the constraints detailed in Section 8.2.12 Constraints for Audio Data Integration in AVC and HEVC Compressed Bit Streams.
8.2.8 MPEG-4 AVC/H.264 High Profile / Level 4.2 Video Compression
DICOM provides a mechanism for supporting the use of MPEG-4 AVC/H.264 Image Compression through the Encapsulated Format. Annex A defines Transfer Syntaxes that reference the MPEG-4 AVC/H.264 Standard.
Note
MPEG-4 AVC/H.264 High Profile compression is inherently lossy. The context where the usage of lossy compression of medical images is clinically acceptable is beyond the scope of the DICOM Standard. The policies associated with the selection of appropriate compression parameters (e.g., compression ratio) for MPEG-4 AVC/H.264 High Profile / Level 4.2 are also beyond the scope of this Standard.
The use of the DICOM Encapsulated Format to support MPEG-4 AVC/H.264 compressed pixel data requires that the Data Elements that are related to the Pixel Data encoding (e.g., Photometric Interpretation, Samples per Pixel, Planar Configuration, Bits Allocated, Bits Stored, High Bit, Pixel Representation, Rows, Columns, etc.) shall contain Values that are consistent with the characteristics of the compressed data stream, with some specific exceptions noted here. The Pixel Data characteristics included in the MPEG-4 AVC/H.264 bit stream shall be used to decode the compressed data stream.
Note
These requirements are specified in terms of consistency with what is encapsulated, rather than in terms of the uncompressed pixel data from which the compressed data stream may have been derived.
When decompressing, should the characteristics explicitly specified in the compressed data stream be inconsistent with those specified in the DICOM Data Elements, those explicitly specified in the compressed data stream should be used to control the decompression. The DICOM Data Elements, if inconsistent, can be regarded as suggestions as to the form in which an uncompressed Data Set might be encoded, subject to the general and IOD-specific rules for uncompressed Photometric Interpretation and Planar Configuration, which may require that decompressed data be converted to one of the permitted forms.
Note
If MPEG-4 Compressed Pixel Data is decompressed and re-encoded in Native (uncompressed) form, then the Data Elements that are related to the Pixel Data encoding are updated accordingly. If color components are converted from YBR_PARTIAL_420 to RGB during decompression and Native re-encoding, the Photometric Interpretation will be changed to RGB in the Data Set with the Native encoding.
The requirements are:
-
Planar Configuration (0028,0006) shall be 0
-
Samples per Pixel (0028,0002) shall be 3
-
Photometric Interpretation (0028,0004) shall be YBR_PARTIAL_420
-
Bits Allocated (0028,0100) shall be 8
-
Bits Stored (0028,0101) shall be 8
-
High Bit (0028,0102) shall be 7
-
Pixel Representation (0028,0103) shall be 0
-
The value of MPEG-4 AVC/H.264 sample aspect_ratio_idc shall be 1 in the encapsulated MPEG-4 AVC/H.264 bit stream if aspect_ratio_info_present_flag is 1.
-
Pixel Aspect Ratio (0028,0034) shall be absent. This corresponds to a 'Sampling Aspect Ratio' (SAR) of 1:1.
-
The Values for Rows (0028,0010), Columns (0028,0011), Cine Rate (0018,0040), and Frame Time (0018,1063) or Frame Time Vector (0018,1065) shall be compliant with the High Profile / Level 4.2 of the MPEG-4 AVC/H.264 standard ([ISO/IEC 14496-10]) and restricted to a square pixel aspect ratio.
Note
-
The Value of Planar Configuration (0028,0006) is irrelevant since the manner of encoding components is specified in the MPEG-4 AVC/H.264 standard, hence it is set to 0.
-
The frame rate of the acquiring camera for '30 Hz HD' MPEG-4 AVC/H.264 may be either 30 or 30/1.001 (approximately 29.97) frames/sec. Similarly, the frame rate in the case of 60 Hz may be either 60 or 60/1.001 (approximately 59.94) frames/sec. This may lead to small inconsistencies between the video timebase and real time. The relationship between frame rate and frame time is shown in Table 8-7.
-
The Frame Time (0018,1063) may be calculated from the frame rate of the acquiring camera. A frame rate of 29.97 frames per second corresponds to a frame time of 33.367 ms.
-
The value of chroma_format for this profile and level is defined by MPEG as 4:2:0.
Table 8-7. MPEG-4 AVC/H.264 High Profile / Level 4.2 Image Transfer Syntax Frame Rate Attributes
Video Type | Frame Rate (see Note 2) | Frame Time (see Note 3)
30 Hz HD | 30 | 33.33 ms
25 Hz HD | 25 | 40.00 ms
60 Hz HD | 60 | 16.67 ms
50 Hz HD | 50 | 20.00 ms
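The relationship between frame rate and frame time in Table 8-7 is a simple reciprocal. The following is a minimal sketch that reproduces the tabulated values and the 30/1.001 case mentioned in the Note; the function name frame_time_ms is an illustrative assumption and is not defined by the Standard.

```python
def frame_time_ms(frame_rate):
    """Return the nominal frame time in milliseconds for a frame rate in frames/sec."""
    if frame_rate <= 0:
        raise ValueError("frame_rate must be positive")
    return 1000.0 / frame_rate


# Frame rates taken from Table 8-7, plus the 30/1.001 and 60/1.001 variants from the Note.
for label, rate in [
    ("30 Hz HD", 30),
    ("30 Hz HD (30/1.001)", 30 / 1.001),
    ("25 Hz HD", 25),
    ("60 Hz HD", 60),
    ("60 Hz HD (60/1.001)", 60 / 1.001),
    ("50 Hz HD", 50),
]:
    print(f"{label}: {frame_time_ms(rate):.3f} ms")
```

A value computed this way may be used to populate Frame Time (0018,1063); how many decimal places to retain is left to the implementation.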
Stereo Pairs Present (0022,0028) shall be YES if stereoscopic pairs are present, otherwise shall be NO or absent.
Table 8-8. MPEG-4 AVC/H.264 High Profile / Level 4.2 Image Transfer Syntax Stereo Attributes
Transfer Syntax | Stereo Pairs Present | Stereo Frame Packing Format
MPEG-4 AVC/H.264 High Profile / Level 4.2 for 2D Image Compression | NO or absent | absent
MPEG-4 AVC/H.264 High Profile / Level 4.2 for 3D Image Compression | YES | present
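The rules in Table 8-8 can be sketched as a small consistency check. This is an illustration only: the Data Set is modeled as a plain dictionary in which a missing key means the Attribute is absent, and the dictionary keys and the function name are assumptions made for this example.

```python
def check_stereo_attributes(is_3d_syntax, dataset):
    """Return a list of problems found against Table 8-8 (an empty list means consistent).

    dataset is a plain dict; a missing key models an absent Attribute.
    The key names are illustrative placeholders.
    """
    problems = []
    pairs = dataset.get("StereoPairsPresent")
    packing_present = "StereoFramePackingFormat" in dataset
    if is_3d_syntax:
        if pairs != "YES":
            problems.append("Stereo Pairs Present shall be YES for the 3D Transfer Syntax")
        if not packing_present:
            problems.append("Stereo Frame Packing Format shall be present for the 3D Transfer Syntax")
    else:
        if pairs not in (None, "NO"):
            problems.append("Stereo Pairs Present shall be NO or absent for the 2D Transfer Syntax")
        if packing_present:
            problems.append("Stereo Frame Packing Format shall be absent for the 2D Transfer Syntax")
    return problems
```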
For the Non-Fragmentable Encapsulated Transfer Syntax, one Fragment shall contain the whole MPEG-4 AVC/H.264 bit stream.
For the Fragmentable Encapsulated Transfer Syntax, the stream may be segmented into multiple Fragments.
Note
-
If a video stream exceeds the maximum length of one fragment (2^32-2 bytes), it may be sent using a Fragmentable Encapsulated Transfer Syntax. Alternatively, it may be sent using a Non-Fragmentable Encapsulated Transfer Syntax as multiple SOP Instances, but each SOP Instance will contain an independent and playable bit stream, and not depend on the encoded bit stream in other (previous) instances. The manner in which such separate instances are related is not specified in the Standard, but mechanisms such as grouping into the same Series, and references to earlier instances using Referenced Image Sequence may be used.
-
Fragmentable Encapsulated Transfer Syntaxes allow for streams of essentially unlimited length; the only limit imposed is the maximum Number of Frames (0028,0008), which is 2^31-1 frames (largest positive Value in an Integer String VR).
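The fragment length limit described in the Note above can be illustrated with a short sketch. It only shows the length test and a plain split into consecutive Fragments; it does not build the encapsulation Items themselves and does not address padding a final Fragment to even length. The function names are assumptions made for this example.

```python
MAX_FRAGMENT_LENGTH = 2**32 - 2  # maximum length of one Fragment, in bytes


def exceeds_single_fragment(stream_length):
    """True if a bit stream of this length cannot be carried in one Fragment."""
    return stream_length > MAX_FRAGMENT_LENGTH


def split_into_fragments(stream, max_len=MAX_FRAGMENT_LENGTH):
    """Split a bit stream into consecutive chunks of at most max_len bytes."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [stream[i:i + max_len] for i in range(0, len(stream), max_len)]


# A deliberately tiny limit is used here so that the behavior is visible.
fragments = split_into_fragments(b"abcdefghij", max_len=4)
print(fragments)
# Concatenating the Fragments in order restores the original bit stream.
print(b"".join(fragments) == b"abcdefghij")
```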
The container format for the video bit stream shall be MPEG-2 Transport Stream, a.k.a. MPEG-TS (see [ISO/IEC 13818-1]) or MPEG-4, a.k.a. MP4 container (see [ISO/IEC 14496-12] and [ISO/IEC 14496-14]). The PTS/DTS of the transport stream shall be used in the MPEG coding.
Any audio components included in the data container shall follow the constraints detailed in Section 8.2.12 Constraints for Audio Data Integration in AVC and HEVC Compressed Bit Streams.
8.2.9 MPEG-4 AVC/H.264 Stereo High Profile / Level 4.2 Video Compression
DICOM provides a mechanism for supporting the use of MPEG-4 AVC/H.264 Image Compression through the Encapsulated Format. Annex A defines Non-Fragmentable and Fragmentable Encapsulated Transfer Syntaxes that reference the MPEG-4 AVC/H.264 Standard.
MPEG-4 AVC/H.264 Stereo High Profile can achieve better compression by additionally making use of prediction between the base and dependent stereoscopic views. The base view frames make use of intra and inter prediction as in MPEG-4 AVC/H.264 High Profile. This makes it possible for decoders which do not know how to decode the stereoscopic data to decode only the base view. The dependent view is encoded to make use of redundancy due to prediction based upon similarities between the base and the dependent views.
MPEG-4 AVC/H.264 Stereo High Profile makes use of the Level table A-1 of the MPEG-4 specification to set throughput limits. The properties required by the MPEG-4 AVC/H.264 Stereo High Profile Compression are identical to the properties defined in Section 8.2.8, except that Stereo Pairs Present (0022,0028) shall always be YES.
The container format for the video bit stream shall be MPEG-2 Transport Stream, a.k.a. MPEG-TS (see [ISO/IEC 13818-1]) or MPEG-4, a.k.a. MP4 container (see [ISO/IEC 14496-12] and [ISO/IEC 14496-14]). The PTS/DTS of the transport stream shall be used in the MPEG coding.
Any audio components included in the data container shall follow the constraints detailed in Section 8.2.12 Constraints for Audio Data Integration in AVC and HEVC Compressed Bit Streams.
8.2.10 HEVC/H.265 Main Profile / Level 5.1 Video Compression
HEVC/H.265 Main Profile / Level 5.1 Main tier is designed for the compression of 4:2:0 video formats up to 4k at 60 frames per second with a bit depth of 8 bits. DICOM provides a mechanism for supporting the use of HEVC/H.265 Image Compression through the Encapsulated Format. Annex A defines a Fragmentable Encapsulated Transfer Syntax that references the HEVC/H.265 Standard.
The use of the DICOM Encapsulated Format to support HEVC/H.265 compressed pixel data requires that the Data Elements that are related to the Pixel Data encoding (e.g., Photometric Interpretation, Samples per Pixel, Planar Configuration, Bits Allocated, Bits Stored, High Bit, Pixel Representation, Rows, Columns, etc.) shall contain Values that are consistent with the characteristics of the compressed data stream, with some specific exceptions noted here. The Pixel Data characteristics included in the HEVC/H.265 bit stream shall be used to decode the compressed data stream.
Note
-
These requirements are specified in terms of consistency with what is encapsulated, rather than in terms of the uncompressed pixel data from which the compressed data stream may have been derived.
-
When decompressing, should the characteristics explicitly specified in the compressed data stream be inconsistent with those specified in the DICOM Data Elements, those explicitly specified in the compressed data stream should be used to control the decompression. The DICOM Data Elements, if inconsistent, can be regarded as suggestions as to the form in which an uncompressed Data Set might be encoded, subject to the general and IOD-specific rules for uncompressed Photometric Interpretation and Planar Configuration, which may require that decompressed data be converted to one of the permitted forms.
The requirements are:
-
Planar Configuration (0028,0006) shall be 0
-
Samples per Pixel (0028,0002) shall be 3
-
Photometric Interpretation (0028,0004) shall be YBR_PARTIAL_420
-
Bits Allocated (0028,0100) shall be 8
-
Bits Stored (0028,0101) shall be 8
-
High Bit (0028,0102) shall be 7
-
Pixel Representation (0028,0103) shall be 0
-
The value of HEVC/H.265 sample aspect_ratio_idc shall be 1 in the encapsulated HEVC/H.265 bit stream if aspect_ratio_info_present_flag is 1.
-
Pixel Aspect Ratio (0028,0034) shall be absent. This corresponds to a 'Sampling Aspect Ratio' (SAR) of 1:1.
-
The Values for Rows (0028,0010), Columns (0028,0011), Cine Rate (0018,0040) and Frame Time (0018,1063) or Frame Time Vector (0018,1065) shall be compliant with the Main Profile / Level 5.1 of the HEVC/H.265 standard [ISO/IEC 23008-2] and restricted to a square pixel aspect ratio.
Note
-
The Value of Planar Configuration (0028,0006) is irrelevant since the manner of encoding components is specified in the HEVC/H.265 standard, hence it is set to 0.
-
The limitations on rows and columns are intended to maximize interoperability between software environments and commonly available hardware HEVC/H.265 encoder/decoder implementations. Source pictures with smaller Rows or Columns Values should be reformatted by scaling and/or pixel padding prior to HEVC/H.265 encoding.
-
The Frame Time (0018,1063) may be calculated from the frame rate of the acquiring camera. A frame rate of 29.97 frames per second corresponds to a frame time of 33.367 ms.
-
The value of chroma_format_idc for this profile and level is equal to 1, indicating the usage of 4:2:0 content.
The encapsulated pixel data stream may be segmented into multiple fragments.
Note
The recipient is expected to concatenate the fragments while decoding them. This allows for essentially unlimited length streams; the only limit imposed is the maximum Value for Number of Frames (0028,0008) which is 2^31-1 frames (largest positive Value in an Integer String VR).
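To give a sense of scale for the Number of Frames limit, the following sketch converts the maximum frame count into a recording duration at a given frame rate. The function name is an assumption made for this example.

```python
MAX_NUMBER_OF_FRAMES = 2**31 - 1  # largest positive Value in an Integer String VR


def max_duration_days(frame_rate):
    """Return the longest stream duration, in days, at frame_rate frames/sec."""
    seconds = MAX_NUMBER_OF_FRAMES / frame_rate
    return seconds / 86400.0  # 86400 seconds per day


for rate in (25, 30, 50, 60):
    print(f"{rate} frames/sec: about {max_duration_days(rate):.0f} days")
```

Even at 60 frames per second the limit corresponds to more than a year of continuous video, which is why the length is described as essentially unlimited.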
The container format for the video bit stream shall be MPEG-2 Transport Stream, a.k.a. MPEG-TS (see [ISO/IEC 13818-1]) or MPEG-4, a.k.a. MP4 container (see [ISO/IEC 14496-12] and [ISO/IEC 14496-14]). The PTS/DTS of the transport stream shall be used in the MPEG coding.
Any audio components included in the data container shall follow the constraints detailed in Section 8.2.12 Constraints for Audio Data Integration in AVC and HEVC Compressed Bit Streams.
8.2.11 HEVC/H.265 Main 10 Profile / Level 5.1 Video Compression
HEVC/H.265 Main 10 Profile / Level 5.1 Main tier is designed for the compression of 4:2:0 video formats up to 4k at 60 frames per second with a bit depth of 10 bits. DICOM provides a mechanism for supporting the use of HEVC/H.265 Image Compression through the Encapsulated Format. Annex A defines a Fragmentable Encapsulated Transfer Syntax that references the HEVC/H.265 Standard.
The use of the DICOM Encapsulated Format to support HEVC/H.265 compressed pixel data requires that the Data Elements that are related to the Pixel Data encoding (e.g., Photometric Interpretation, Samples per Pixel, Planar Configuration, Bits Allocated, Bits Stored, High Bit, Pixel Representation, Rows, Columns, etc.) shall contain Values that are consistent with the characteristics of the compressed data stream, with some specific exceptions noted here. The Pixel Data characteristics included in the HEVC/H.265 bit stream shall be used to decode the compressed data stream.
Note
-
These requirements are specified in terms of consistency with what is encapsulated, rather than in terms of the uncompressed pixel data from which the compressed data stream may have been derived.
-
When decompressing, should the characteristics explicitly specified in the compressed data stream be inconsistent with those specified in the DICOM Data Elements, those explicitly specified in the compressed data stream should be used to control the decompression. The DICOM Data Elements, if inconsistent, can be regarded as suggestions as to the form in which an uncompressed Data Set might be encoded, subject to the general and IOD-specific rules for uncompressed Photometric Interpretation and Planar Configuration, which may require that decompressed data be converted to one of the permitted forms.
The requirements are:
-
Planar Configuration (0028,0006) shall be 0
-
Samples per Pixel (0028,0002) shall be 3
-
Photometric Interpretation (0028,0004) shall be YBR_PARTIAL_420
-
Bits Allocated (0028,0100) shall be 16
-
Bits Stored (0028,0101) shall be 10
-
High Bit (0028,0102) shall be 9
-
Pixel Representation (0028,0103) shall be 0
-
The value of HEVC/H.265 sample aspect_ratio_idc shall be 1 in the encapsulated HEVC/H.265 bit stream if aspect_ratio_info_present_flag is 1.
-
Pixel Aspect Ratio (0028,0034) shall be absent. This corresponds to a 'Sampling Aspect Ratio' (SAR) of 1:1.
-
The Values for Rows (0028,0010), Columns (0028,0011), Cine Rate (0018,0040), and Frame Time (0018,1063) or Frame Time Vector (0018,1065) shall be compliant with the Main 10 Profile / Level 5.1 of the HEVC/H.265 standard [ISO/IEC 23008-2] and restricted to a square pixel aspect ratio.
Note
-
The Value of Planar Configuration (0028,0006) is irrelevant since the manner of encoding components is specified in the HEVC/H.265 standard, hence it is set to 0.
-
The limitations on rows and columns are intended to maximize interoperability between software environments and commonly available hardware HEVC/H.265 encoder/decoder implementations. Source pictures with smaller Rows or Columns Values should be reformatted by scaling and/or pixel padding prior to HEVC/H.265 encoding.
-
The Frame Time (0018,1063) may be calculated from the frame rate of the acquiring camera. A frame rate of 29.97 frames per second corresponds to a frame time of 33.367 ms.
-
The value of chroma_format_idc for this profile and level is equal to 1, indicating the usage of 4:2:0 content.
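The fixed Attribute Values required for the HEVC/H.265 Main Profile (Section 8.2.10) and Main 10 Profile (this section) differ only in the bit depth related Attributes. The following sketch checks a Data Set against them. It is an illustration under stated assumptions: the Data Set is modeled as a plain dictionary keyed by Attribute keyword, a missing key means the Attribute is absent, and the function names are hypothetical. Rows, Columns and frame timing are not checked.

```python
def hevc_level_5_1_requirements(bit_depth):
    """Return the required fixed Values for bit_depth 8 (Main) or 10 (Main 10)."""
    if bit_depth not in (8, 10):
        raise ValueError("bit_depth must be 8 (Main) or 10 (Main 10)")
    return {
        "PlanarConfiguration": 0,
        "SamplesPerPixel": 3,
        "PhotometricInterpretation": "YBR_PARTIAL_420",
        "BitsAllocated": 8 if bit_depth == 8 else 16,
        "BitsStored": bit_depth,
        "HighBit": bit_depth - 1,
        "PixelRepresentation": 0,
    }


def check_hevc_attributes(dataset, bit_depth):
    """Return a list of problems (an empty list means the fixed Values match)."""
    problems = []
    for keyword, expected in hevc_level_5_1_requirements(bit_depth).items():
        actual = dataset.get(keyword)
        if actual != expected:
            problems.append(f"{keyword}: expected {expected!r}, found {actual!r}")
    if "PixelAspectRatio" in dataset:
        problems.append("PixelAspectRatio shall be absent")
    return problems


# Example Data Set for the Main 10 Profile.
example = hevc_level_5_1_requirements(10)
print(check_hevc_attributes(example, 10))
print(check_hevc_attributes(dict(example, BitsAllocated=8), 10))
```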
The encapsulated pixel data stream may be segmented into multiple fragments.
Note
The recipient is expected to concatenate the fragments while decoding them. This allows for essentially unlimited length streams; the only limit imposed is the maximum Value for Number of Frames (0028,0008) which is 2^31-1 frames (largest positive Value in an Integer String VR).
The container format for the video bit stream shall be MPEG-2 Transport Stream, a.k.a. MPEG-TS (see [ISO/IEC 13818-1]) or MPEG-4, a.k.a. MP4 container (see [ISO/IEC 14496-12] and [ISO/IEC 14496-14]). The PTS/DTS of the transport stream shall be used in the MPEG coding.
Any audio components included in the data container shall follow the constraints detailed in Section 8.2.12 Constraints for Audio Data Integration in AVC and HEVC Compressed Bit Streams.
8.2.12 Constraints for Audio Data Integration in AVC and HEVC Compressed Bit Streams
This section describes the constraints pertaining to the presence of audio data alongside pixel data in DICOM objects. It affects the following pixel data encapsulation Transfer Syntaxes:
-
MPEG-4 AVC/H.264 High Profile / Level 4.1
-
MPEG-4 AVC/H.264 BD-compatible High Profile / Level 4.1
-
MPEG-4 AVC/H.264 High Profile / Level 4.2 For 2D Video
-
MPEG-4 AVC/H.264 High Profile / Level 4.2 For 3D Video
-
MPEG-4 AVC/H.264 Stereo High Profile / Level 4.2
-
HEVC/H.265 Main Profile / Level 5.1
-
HEVC/H.265 Main 10 Profile / Level 5.1
Any audio components present within a bit stream whose Transfer Syntax is among those listed above shall be interleaved in either LPCM, AC-3, AAC, MP3 or MPEG-1 Layer II audio format and shall comply with the following restrictions:
Table 8.2.12-1. Allowed Audio Formats
Audio Format | MPEG-2 TS Container | MP4 Container
LPCM | Allowed | -
AC-3 | Allowed | -
AAC | Allowed | Allowed
MP3 | Allowed | Allowed
MPEG-1 Audio Layer II | Allowed | Allowed
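Table 8.2.12-1 can be expressed as a lookup. The sketch below assumes that a "-" entry in the table means the combination is not allowed; the container labels and the function name are illustrative choices, not identifiers defined by the Standard.

```python
# Audio format -> containers in which it is allowed (from Table 8.2.12-1).
ALLOWED_AUDIO = {
    "LPCM": {"MPEG-2 TS"},
    "AC-3": {"MPEG-2 TS"},
    "AAC": {"MPEG-2 TS", "MP4"},
    "MP3": {"MPEG-2 TS", "MP4"},
    "MPEG-1 Audio Layer II": {"MPEG-2 TS", "MP4"},
}


def audio_format_allowed(audio_format, container):
    """True if the audio format may be interleaved in the given container."""
    return container in ALLOWED_AUDIO.get(audio_format, set())


print(audio_format_allowed("AAC", "MP4"))
print(audio_format_allowed("LPCM", "MP4"))
```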
8.2.13 Constraints For SMPTE ST 2110-20 Uncompressed Active Video For DICOM-RTV
This section describes the constraints applying to pixel data carried in the DICOM-RTV Flow (separate from the DICOM-RTV Metadata Flow), which is fully described in [SMPTE ST 2110-20].
The following table describes constraints on the [SMPTE ST 2110-20] Video Flow in terms of the valid Values for the corresponding DICOM Attributes in the DICOM-RTV Metadata Flow:
-
Samples per Pixel
-
Bits Allocated
-
Bits Stored
-
High Bit
Table 8.2.13-1. Constraints Applicable to Attributes describing Pixel Data
Samples per Pixel (0028,0002) | Bits Allocated (0028,0100) | Bits Stored (0028,0101) | High Bit (0028,0102)
3 | 8, 16, 16, 16 | 8, 10, 12, 16 | 7, 9, 11, 15
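The Values in Table 8.2.13-1 appear to be listed positionally, so that the first Bits Allocated Value pairs with the first Bits Stored Value and the first High Bit Value, and so on. Under that reading, which is an assumption of this sketch, the permitted combinations can be checked as follows; the function name is hypothetical.

```python
# (Bits Allocated, Bits Stored, High Bit) combinations, read column-wise from Table 8.2.13-1.
RTV_BIT_COMBINATIONS = {
    (8, 8, 7),
    (16, 10, 9),
    (16, 12, 11),
    (16, 16, 15),
}


def rtv_pixel_attributes_valid(samples_per_pixel, bits_allocated, bits_stored, high_bit):
    """True if the four Attribute Values match one permitted combination."""
    return (
        samples_per_pixel == 3
        and (bits_allocated, bits_stored, high_bit) in RTV_BIT_COMBINATIONS
    )


print(rtv_pixel_attributes_valid(3, 16, 10, 9))
print(rtv_pixel_attributes_valid(3, 8, 10, 9))
```

In every permitted combination High Bit is one less than Bits Stored, and Bits Allocated is the smallest multiple of 8 that can hold Bits Stored.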
DICOM Photometric Interpretation is based on CCIR 601 (also known as ITU-R BT.601); therefore some restrictions apply to the possible combinations of Sampling System and Colorimetry parameters as stated by [SMPTE ST 2110-20].
Table 8.2.13-2. List of supported SMPTE ST 2110-20 Parameter Combinations
SMPTE ST 2110-20 Sampling system | SMPTE ST 2110-20 Colorimetry | DICOM Photometric Interpretation (0028,0004)
RGB | BT601 | RGB
YCbCr-4:4:4 | BT601 | YBR_FULL
YCbCr-4:2:2 | BT601 | YBR_FULL_422
YCbCr-4:2:0 | BT601 | YBR_PARTIAL_420
Some other [SMPTE ST 2110-20] parameter combinations do not correspond to existing DICOM photometric interpretations, so their use is currently not permitted. Table 8.2.13-3 lists the unsupported combinations.
Table 8.2.13-3. List of unsupported SMPTE ST 2110-20 Parameter Combinations
SMPTE ST 2110-20 Sampling system | SMPTE ST 2110-20 Colorimetry
RGB | BT2020, BT709, BT2100, ST2065-1, ST2065-3
YCbCr-4:4:4 | BT2020, BT709, BT2100
YCbCr-4:2:2 | BT2020, BT709, BT2100
YCbCr-4:2:0 | BT2020, BT709, BT2100
CLYCbCr-4:4:4 | BT2020
CLYCbCr-4:2:2 | BT2020
CLYCbCr-4:2:0 | BT2020
ICtCp-4:4:4 | BT2100
ICtCp-4:2:2 | BT2100
XYZ | XYZ
KEY |
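The supported combinations of Table 8.2.13-2 amount to a mapping from the SMPTE ST 2110-20 sampling system and colorimetry to a DICOM Photometric Interpretation. A minimal sketch follows; the function name is an assumption, and any combination not listed as supported simply yields None.

```python
# (sampling system, colorimetry) -> Photometric Interpretation (0028,0004), from Table 8.2.13-2.
SUPPORTED_COMBINATIONS = {
    ("RGB", "BT601"): "RGB",
    ("YCbCr-4:4:4", "BT601"): "YBR_FULL",
    ("YCbCr-4:2:2", "BT601"): "YBR_FULL_422",
    ("YCbCr-4:2:0", "BT601"): "YBR_PARTIAL_420",
}


def photometric_interpretation_for(sampling, colorimetry):
    """Return the Photometric Interpretation, or None if the combination is not supported."""
    return SUPPORTED_COMBINATIONS.get((sampling, colorimetry))


print(photometric_interpretation_for("YCbCr-4:2:2", "BT601"))
print(photometric_interpretation_for("YCbCr-4:2:2", "BT709"))
```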
8.2.14 High-Throughput JPEG 2000 Image Compression
DICOM provides a mechanism for supporting the use of High-Throughput JPEG 2000 (HTJ2K) Image Compression through the Encapsulated Format. Annex A defines three Transfer Syntaxes that reference the HTJ2K Standard and provide one lossy compression scheme, and two lossless compression schemes, the second of which is optimized for display of progressive bit streams.
Note
The context where the usage of lossy compression of medical images is clinically acceptable is beyond the scope of the DICOM Standard. The policies associated with the selection of appropriate compression parameters (e.g., compression ratio) for HTJ2K lossy compression are also beyond the scope of this Standard.
The use of the DICOM Encapsulated Format to support HTJ2K Compressed Pixel Data requires that the Data Elements that are related to the Pixel Data encoding (e.g., Photometric Interpretation, Samples per Pixel, Planar Configuration, Bits Allocated, Bits Stored, High Bit, Pixel Representation, Rows, Columns, etc.) shall contain Values that are consistent with the characteristics of the compressed data stream. The Pixel Data characteristics included in the HTJ2K bit stream shall be used to decode the compressed data stream.
The requirements when using a Standard Photometric Interpretation (i.e., a Defined Term from Section C.7.6.3.1.2 in PS3.3) are specified in Table 8.2.14-1. No other Standard Photometric Interpretation Values shall be used.
Table 8.2.14-1. Valid Values of Pixel Data Related Attributes for HTJ2K Transfer Syntaxes using Standard Photometric Interpretations
Photometric Interpretation | Transfer Syntax | Transfer Syntax UID | Samples per Pixel | Planar Configuration | Pixel Representation | Bits Allocated | Bits Stored | High Bit
MONOCHROME1, MONOCHROME2 | HTJ2K (Lossless Only) | 1.2.840.10008.1.2.4.201 | 1 | absent | 0 or 1 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
MONOCHROME1, MONOCHROME2 | HTJ2K (Lossless RPCL) | 1.2.840.10008.1.2.4.202 | 1 | absent | 0 or 1 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
MONOCHROME1, MONOCHROME2 | HTJ2K | 1.2.840.10008.1.2.4.203 | 1 | absent | 0 or 1 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
PALETTE COLOR | HTJ2K (Lossless Only) | 1.2.840.10008.1.2.4.201 | 1 | absent | 0 | 8 or 16 | 1-16 | 0-15
PALETTE COLOR | HTJ2K (Lossless RPCL) | 1.2.840.10008.1.2.4.202 | 1 | absent | 0 | 8 or 16 | 1-16 | 0-15
YBR_RCT | HTJ2K (Lossless Only) | 1.2.840.10008.1.2.4.201 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
YBR_RCT | HTJ2K (Lossless RPCL) | 1.2.840.10008.1.2.4.202 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
YBR_RCT | HTJ2K | 1.2.840.10008.1.2.4.203 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
YBR_ICT | HTJ2K | 1.2.840.10008.1.2.4.203 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
RGB | HTJ2K (Lossless Only) | 1.2.840.10008.1.2.4.201 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
RGB | HTJ2K (Lossless RPCL) | 1.2.840.10008.1.2.4.202 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
RGB | HTJ2K | 1.2.840.10008.1.2.4.203 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
YBR_FULL | HTJ2K (Lossless Only) | 1.2.840.10008.1.2.4.201 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
YBR_FULL | HTJ2K (Lossless RPCL) | 1.2.840.10008.1.2.4.202 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
YBR_FULL | HTJ2K | 1.2.840.10008.1.2.4.203 | 3 | 0 | 0 | 8, 16, 24, 32 or 40 | 1-38 | 0-37
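The rows of Table 8.2.14-1 can be encoded as data and checked mechanically. The sketch below is one possible encoding, not a normative one: None models an absent Planar Configuration, the function and constant names are assumptions, and the checks that Bits Stored does not exceed Bits Allocated and that High Bit is one less than Bits Stored come from Section 8.1.1 rather than from the table itself.

```python
LOSSLESS = "1.2.840.10008.1.2.4.201"       # HTJ2K (Lossless Only)
LOSSLESS_RPCL = "1.2.840.10008.1.2.4.202"  # HTJ2K (Lossless RPCL)
HTJ2K = "1.2.840.10008.1.2.4.203"          # HTJ2K
ALL_SYNTAXES = {LOSSLESS, LOSSLESS_RPCL, HTJ2K}
WIDE = (8, 16, 24, 32, 40)

# Photometric Interpretation ->
#   (Transfer Syntax UIDs, Samples per Pixel, Planar Configuration,
#    Pixel Representations, Bits Allocated choices, maximum Bits Stored)
HTJ2K_TABLE = {
    "MONOCHROME1": (ALL_SYNTAXES, 1, None, {0, 1}, WIDE, 38),
    "MONOCHROME2": (ALL_SYNTAXES, 1, None, {0, 1}, WIDE, 38),
    "PALETTE COLOR": ({LOSSLESS, LOSSLESS_RPCL}, 1, None, {0}, (8, 16), 16),
    "YBR_RCT": (ALL_SYNTAXES, 3, 0, {0}, WIDE, 38),
    "YBR_ICT": ({HTJ2K}, 3, 0, {0}, WIDE, 38),
    "RGB": (ALL_SYNTAXES, 3, 0, {0}, WIDE, 38),
    "YBR_FULL": (ALL_SYNTAXES, 3, 0, {0}, WIDE, 38),
}


def htj2k_attributes_valid(photometric, uid, samples, planar, pixel_rep,
                           bits_allocated, bits_stored, high_bit):
    """True if the Values match a row of Table 8.2.14-1."""
    row = HTJ2K_TABLE.get(photometric)
    if row is None:
        return False
    uids, n_samples, planar_cfg, pixel_reps, allocations, max_stored = row
    return (
        uid in uids
        and samples == n_samples
        and planar == planar_cfg
        and pixel_rep in pixel_reps
        and bits_allocated in allocations
        and 1 <= bits_stored <= min(max_stored, bits_allocated)
        and high_bit == bits_stored - 1
    )


print(htj2k_attributes_valid("MONOCHROME2", LOSSLESS, 1, None, 1, 16, 12, 11))
print(htj2k_attributes_valid("YBR_ICT", LOSSLESS, 3, 0, 0, 8, 8, 7))
```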
Note
These requirements are specified in terms of consistency with what is encapsulated, rather than in terms of the uncompressed pixel data from which the compressed data stream may have been derived.
When decompressing, should the characteristics explicitly specified in the compressed data stream be inconsistent with those specified in the DICOM Data Elements, those explicitly specified in the compressed data stream should be used to control the decompression. The DICOM Data Elements, if inconsistent, can be regarded as suggestions as to the form in which an uncompressed Data Set might be encoded, subject to the general and IOD-specific rules for uncompressed Photometric Interpretation and Planar Configuration, which may require that decompressed data be converted to one of the permitted forms.
The HTJ2K bit stream specifies whether or not a reversible or irreversible multi-component (color) transformation [ISO 15444-1 Annex G], if any, has been applied. If no multi-component transformation has been applied, then the components shall correspond to those specified by the DICOM Attribute Photometric Interpretation (0028,0004). If the JPEG 2000 Part 1 reversible multi-component transformation has been applied then the DICOM Attribute Photometric Interpretation (0028,0004) shall be YBR_RCT. If the JPEG 2000 Part 1 irreversible multi-component transformation has been applied then the DICOM Attribute Photometric Interpretation (0028,0004) shall be YBR_ICT.
Note
-
For example, a single component may be present, and the Photometric Interpretation (0028,0004) may be MONOCHROME2.
-
The application of a JPEG 2000 Part 1 reversible multi-component transformation is signaled in the JPEG 2000 bit stream by a value of 1 rather than 0 in the SGcod Multiple component transformation type of the COD marker segment [ISO 15444-1 Table A.17]. No other Value of Photometric Interpretation than YBR_RCT or YBR_ICT is permitted when SGcod Multiple component transformation type is 1.
-
Though it would be unusual, would not take advantage of correlation between the red, green and blue components, and would not achieve effective compression, a Photometric Interpretation of RGB could be specified as long as no multi-component transformation [ISO 15444-1 Annex G] was specified by the JPEG 2000 bit stream. For some applications the use of RGB is permitted, e.g., Whole Slide Microscopy Images, to allow conversion to DICOM from proprietary formats without loss due to color space transformation. Methods of decorrelating the color components other than those specified in [ISO 15444-1 Annex G] are permitted as defined in PS3.3, such as a Photometric Interpretation of YBR_FULL; this may be useful when converting existing YBR_FULL Pixel Data (e.g., in a different Transfer Syntax) without further loss.
In either case (Photometric Interpretation of RGB or YBR_FULL), the value of SGcod Multiple component transformation type would be 0.
PS3.3 may constrain the Values of Photometric Interpretation for specific IODs.
-
Despite the application of a multi-component color transformation and its reflection in the Photometric Interpretation Attribute, the "color space" remains undefined. There is currently no means of conveying "standard color spaces" either by fixed values (such as sRGB) or by ICC profiles. Note in particular that the JP2 file header is not sent in the HTJ2K bit stream that is encapsulated in DICOM.
-
If HTJ2K Compressed Pixel Data is decompressed and re-encoded in Native (uncompressed) form, then the Data Elements that are related to the Pixel Data encoding are updated accordingly. If color components are converted from YBR_ICT or YBR_RCT to RGB during decompression and Native re-encoding, the Photometric Interpretation will be changed to RGB in the Data Set with the Native encoding.
-
The upper limit of 40 on Bits Allocated (0028,0100) and 38 on Bits Stored (0028,0101) reflects the maximum HTJ2K sample precision of 38 and the DICOM requirement to describe Bits Allocated (0028,0100) as multiples of bytes (octets).
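The relationship between the multi-component transformation signaled in the bit stream and Photometric Interpretation (0028,0004) can be sketched as a consistency check. The sketch assumes the caller has already parsed the SGcod Multiple component transformation type (0 or 1) and knows whether the transformation is the reversible one; the parsing itself is out of scope, and both function names are hypothetical. Treating YBR_RCT or YBR_ICT as inconsistent when no transformation was applied is an inference from the text above rather than a quoted requirement.

```python
def required_photometric_interpretation(mct_type, reversible):
    """Return the required Photometric Interpretation, or None if the
    transformation type does not by itself determine one."""
    if mct_type == 0:
        return None  # no multi-component transformation applied
    if mct_type == 1:
        return "YBR_RCT" if reversible else "YBR_ICT"
    raise ValueError("mct_type must be 0 or 1")


def photometric_interpretation_consistent(mct_type, reversible, photometric):
    """True if the declared Photometric Interpretation agrees with the bit stream."""
    required = required_photometric_interpretation(mct_type, reversible)
    if required is None:
        # Assumption: without a transformation the components are not YBR_RCT or YBR_ICT.
        return photometric not in ("YBR_RCT", "YBR_ICT")
    return photometric == required


print(photometric_interpretation_consistent(1, True, "YBR_RCT"))
print(photometric_interpretation_consistent(1, False, "RGB"))
print(photometric_interpretation_consistent(0, False, "MONOCHROME2"))
```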
The HTJ2K bit stream is capable of encoding both signed and unsigned pixel values, hence the Value of Pixel Representation (0028,0103) may be either 0 or 1 for monochrome Photometric Interpretations depending on what has been encoded (as specified in the SIZ marker segment in the precision and sign of component parameter).
The Value of Planar Configuration (0028,0006) is irrelevant since the manner of encoding components is specified in the JPEG 2000 standard, hence it shall be set to 0.
8.2.15 JPEG XL Image Compression
DICOM provides a mechanism for supporting the use of JPEG XL Image Compression through the Encapsulated Format.
Annex A defines a number of Transfer Syntaxes that reference the JPEG XL Standard.
The JPEG XL Lossless Transfer Syntax provides a compression scheme that preserves the bits of the original image, i.e., lossless.
The JPEG XL JPEG Recompression Transfer Syntax preserves the bits of the (lossy) JPEG encoding.
The JPEG XL Transfer Syntax is a potentially lossy compression of the original image.
Note
The context where the usage of lossy compression of medical images is clinically acceptable is beyond the scope of the DICOM Standard.
The policies associated with the selection of appropriate compression parameters (e.g., compression ratio) for JPEG XL lossy compression are also beyond the scope of this Standard.
The use of the DICOM Encapsulated Format to support JPEG XL Compressed Pixel Data requires that the Data Elements that are related to the Pixel Data encoding (e.g., Photometric Interpretation, Samples per Pixel, Planar Configuration, Bits Allocated, Bits Stored, High Bit, Pixel Representation, Rows, Columns, etc.) shall contain Values that are consistent with the characteristics of the compressed data stream.
The Pixel Data characteristics included in the JPEG XL bit stream shall be used to decode the compressed data stream.
The requirements when using a Standard Photometric Interpretation (i.e., a Defined Term from Section C.7.6.3.1.2 in PS3.3) are specified in Table 8.2.15-1.
No other Standard Photometric Interpretation values shall be used.
Table 8.2.15-1. Valid Values of Pixel Data Related Attributes for JPEG XL Transfer Syntaxes using Standard Photometric Interpretations
Photometric Interpretation | Transfer Syntax | Transfer Syntax UID | Samples per Pixel | Planar Configuration | Pixel Representation | Bits Allocated | Bits Stored | High Bit
MONOCHROME1, MONOCHROME2 | JPEG XL Lossless | 1.2.840.10008.1.2.4.110 | 1 | absent | 0 or 1 | 1, 8, 16, 24 | 1-24 | 0-23
MONOCHROME1, MONOCHROME2 | JPEG XL | 1.2.840.10008.1.2.4.112 | 1 | absent | 0 or 1 | 1, 8, 16, 24 | 1-24 | 0-23
MONOCHROME2 | JPEG XL JPEG Recompression | 1.2.840.10008.1.2.4.111 | 1 | absent | 0 | 8 | 8 | 7
XYB, YBR_RCT, RGB | JPEG XL Lossless | 1.2.840.10008.1.2.4.110 | 3 | 0 | 0 | 8, 16, 24 | 8-24 | 7-23
XYB, YBR_RCT, RGB | JPEG XL | 1.2.840.10008.1.2.4.112 | 3 | 0 | 0 | 8, 16, 24 | 8-24 | 7-23
YBR_FULL_422, XYB, RGB | JPEG XL JPEG Recompression | 1.2.840.10008.1.2.4.111 | 3 | 0 | 0 | 8 | 8 | 7
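One way to use Table 8.2.15-1 is to look up which Photometric Interpretations are permitted for a given Transfer Syntax UID and Samples per Pixel. The following sketch encodes only those columns of the table; the constant and function names are assumptions made for this example.

```python
JXL_LOSSLESS = "1.2.840.10008.1.2.4.110"       # JPEG XL Lossless
JXL_RECOMPRESSION = "1.2.840.10008.1.2.4.111"  # JPEG XL JPEG Recompression
JXL = "1.2.840.10008.1.2.4.112"                # JPEG XL

# (Transfer Syntax UID, Samples per Pixel) -> permitted Photometric Interpretations.
PERMITTED = {
    (JXL_LOSSLESS, 1): {"MONOCHROME1", "MONOCHROME2"},
    (JXL, 1): {"MONOCHROME1", "MONOCHROME2"},
    (JXL_RECOMPRESSION, 1): {"MONOCHROME2"},
    (JXL_LOSSLESS, 3): {"XYB", "YBR_RCT", "RGB"},
    (JXL, 3): {"XYB", "YBR_RCT", "RGB"},
    (JXL_RECOMPRESSION, 3): {"YBR_FULL_422", "XYB", "RGB"},
}


def permitted_photometric_interpretations(uid, samples_per_pixel):
    """Return the sorted list of permitted Values (empty if the pair is not in the table)."""
    return sorted(PERMITTED.get((uid, samples_per_pixel), ()))


print(permitted_photometric_interpretations(JXL_RECOMPRESSION, 3))
print(permitted_photometric_interpretations(JXL_LOSSLESS, 1))
```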
Note
These requirements are specified in terms of consistency with what is encapsulated, rather than in terms of the uncompressed pixel data from which the compressed data stream may have been derived.
When decompressing, should the characteristics explicitly specified in the compressed data stream be inconsistent with those specified in the DICOM Data Elements, those explicitly specified in the compressed data stream should be used to control the decompression. The DICOM Data Elements, if inconsistent, can be regarded as suggestions as to the form in which an uncompressed Data Set might be encoded, subject to the general and IOD-specific rules for uncompressed Photometric Interpretation and Planar Configuration, which may require that decompressed data be converted to one of the permitted forms.
PS3.3 may constrain the values of Photometric Interpretation for specific IODs.
The JPEG XL bit stream is capable of encoding both signed and unsigned pixel values, hence the value of Pixel Representation (0028,0103) may be either 0 or 1 for monochrome Photometric Interpretations depending on what has been encoded.
The value of Planar Configuration (0028,0006) is irrelevant since the manner of encoding components is specified in the JPEG XL standard, hence it shall be set to 0.