Document AVC-107R
CCITT SGXV
Working Party XV/1
August 23, 1991
Experts Group for ATM Video Coding

SOURCE  : CHAIRMAN
TITLE   : REPORT OF THE SECOND MEETING OF THE EXPERTS GROUP FOR ATM VIDEO CODING IN SANTA CLARA (August 14-23, 1991) - PART II
Purpose : Report

-----------------

PART II - JOINT SESSIONS

1. Requirement Sub-group
2. Test Sub-group
3. Video Sub-group
4. System Sub-group
5. Implementation Sub-group

1. Requirement Sub-group (by Sakae Okubo)

1.1 General

The REQUIREMENTS and TEST subgroups met in cascade on August 20-22 under the chairmanship of Mr. S. Okubo and Mr. T. Hidaka. Before the sessions started, time and input document allocations were decided between the two sub-groups. These sessions were held jointly by MPEG and CCITT EG.

On August 20, the two groups had a one-hour joint meeting with the VIDEO subgroup to discuss MPEG-2 matters. In the evening, a close preview of demonstration tapes was held at Apple to allow critical viewers to observe processed pictures at an appropriate viewing distance.

The meeting confirmed the following targets for the REQUIREMENTS/TEST meetings;

1) Identification of additional general requirements to MPEG-2
2) Finalization of the practical matters for the Kurihama Tests
3) Finalization of the testing methods for the Kurihama Tests
4) Finalization for the video related part of the PPD document

1.2 Time schedule

August 20 (Tue)   9:00-13:00  REQ           - general requirements
                 14:00-15:00  VIDEO/REQ/TST - MPEG2 matters
                 15:30-17:30  REQ           - general requirements
August 21 (Wed)  11:00-13:00  TST           - testing and analysis methods
                 14:00-16:00  TST           - testing and analysis methods
                 16:00-17:00  REQ/TST       - reflection to PPD
August 22 (Thu)   9:00-13:00  REQ/TST       - meeting report

In the joint session among VIDEO, REQUIREMENTS and TESTS, possible methodologies for convergence after the Kurihama Tests were discussed. The details are contained in the report of VIDEO subgroup.
According to the discussion, the following request is added to Section 6.2.1 4) a.:

"Good documentation containing the following materials is requested to make convergence easier after the Kurihama tests (see Section 5.3 for objectives of the subjective tests)"

1.3 Documentation

REQ/TST : MPEG/100 and its revision (CHAIR-REQ)
REQ     : 094(CHAIR), 099(CMTT), 108(HUGHES), 115(MITSUBISHI), 139(CCITT), 140(CCITT), 144(CCITT), 151(CCITT), 155(MIT), 158(VADIS), 160(INTEL), 163(PRISM)
TEST    : 093(CHAIR-TST), 105(CMTT), 112(BNR), 116(JVC), 136(TCE), 168(HUGHES)

1.4 Agreements of REQUIREMENTS subgroup

1.4.1 Review of the draft PPD revision (MPEG91/100 and its revision - Issue 2 updated of July 29)

The revisions added to MPEG91/100 (AVC-70, parts indicated with a bar in the left hand margin) were approved with the following amendments;

1) p.2, Section 1.1: include the statement at the top of the page as part of the body.
2) p.4, Section 3.3: add "assuming CCIR-601 input" at the end of the sentence above the table showing two categories;
3) p.5, Section 3.4 1), CCIR601: change "240" to "243 (Note)" and add "Note - It should be noted that CCIR Rec. 601 defines active numbers per field as 243 for 525/60 systems."
4) p.5, Section 3.4 1), Progressive scan format: change "25/30" to "25/30/50/60".
5) p.10, Section 4.1: add "for existing television standards" after "-2 picture frames ... + 3 picture frames (see CCIR Rec. nnn)".
6) p.13, Section 6.2.1 3): add Section 3.2 3) of MPEG91/094 between g. and h.
   "h. The greatest degree of compatibility would be achieved by a core MPEG1 decoder operating on a bit stream in the range of 1.0-1.5 Mbit/s."

1.4.2 Review of the input documents to this meeting

1.4.2.1 MPEG91/099 (CMTT) Preliminary functional requirements for secondary distribution of digital TV and HDTV

It was agreed to incorporate the following parts into the PPD document;

1) Section 4/099 into Section 3.3 of PPD, after the table of categories.
2) Section 5/099 into Section 3.4 5) of PPD, at the end.

During the discussion, the following comments and questions were given:

- Quality objectives: We are considering whether future MPEG standards should guarantee performance in terms of picture quality. It would be appreciated if CMTT/2 could provide information on how the picture quality of video codecs conforming to Rec. 723 has been or is to be measured, and how the results are represented.
- Features: We believe that some features, such as random access, lead to common technical requirements for the video coding if we consider features such as the channel hopping required for secondary distribution.
- Compatibility: It would be appreciated if we could receive specific guidelines based on the "20-30 year long life of the standard" requirement so that we can translate it into design objectives for the video coding which we intend to standardize in the time frame of 1992-1993.

The meeting agreed to include these comments and questions in a liaison statement to CMTT/2 SRG.

1.4.2.2 MPEG91/108 (Hughes) Scalability

It was agreed to add a note to the second last item of Section 3.4 1) as follows;

    Broadcast television and scalable window system (Note) are desirable.
                                                    ^^^^^^

    Note - A scalable video format is defined as one where the parameters of decoding are independent of those of encoding. A bitstream is scalable when some coded bits can be disregarded and a usable image still results. Scalability facilitates decoding the images at different rates and resolution scales through the design of the bitstream or data representation itself (see MPEG91/108).

1.4.2.3 MPEG91/115 (Mitsubishi) A program to generate moving Campbell chart

Information was provided on a program which generates the moving Campbell chart for checking particular elements of the coding algorithm. Those who are interested can contact Mr. Nishida for further details.
1.4.2.4 MPEG91/139 (CCITT EG) Additional submission materials for the "Kurihama Tests"

It was agreed to adopt the proposed additions to the PPD document encouraging consideration of cell loss resilience;

1) Add to Section 6.2.1 3)

    i. Any claims for additional features (such as cell loss and random bit
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       error resilience) should be supported by demonstrations.
       ^^^^^^^^^^^^^^^^^

2) Add note to Section 6.2.1 4) c. the last hyphenated item

    - any other functionalities if claimed (Note)
                                           ^^^^^^

    Note - To demonstrate cell loss resilience, it is recommended to simulate 1E-3 cell loss ratio, e.g. by replacing 384 coded bits every 0.1 sec with '0', decoding the resultant bit stream and reconstructing pictures. Exact method of simulation should be described.

1.4.2.5 MPEG91/140 (CCITT EG) H.26X requirements

A list of updated requirements to the projected ATM video coding standard H.26X was presented. Items in Section 7 of MPEG91/140 are reflected in Section 3.4 7) b. as follows;

1) Add "high/low priority cell utilization" to the list of B-ISDN transport flexibility in the 4th hyphenated item.
2) Add at the end of list "The video coding must provide rate control for conforming to the usage parameter control of the network".
3) Add "(e.g. for low complexity)" to the 2nd last paragraph of Section 3.4 9).

1.4.2.6 MPEG91/144 (CCITT EG) Remote video surveillance as a B-ISDN service

It was agreed to add this item to the list of applications in the PPD document as follows;

1) Add "RVS Remote Video Surveillance" between "NDB" and "SSM" in Section 3.1.
2) Add "RVS" after "ENG/SNG" with the following parameters;
     Start from an arbitrary point:  yes
     Symmetry in allowed complexity: C << D

1.4.2.7 MPEG91/151 (EBU) Basic audio quality requirements for digital audio bit-rate reduction systems for broadcast emission and primary distribution

The meeting took note of the "indistinguishable from the compact disc quality" requirement and the precise definition of being distinguishable. How to reflect this content in the PPD document awaits AUDIO group discussion.

1.4.2.8 MPEG91/155 (MIT) Operations on the bitstream

The need for "manipulation on the coded bitstream" and some applications were presented. The need was also stressed by CCITT EG so that the continuous presence multipoint system can be realized. Further study on this topic is expected.

Having heard this presentation, the meeting agreed to make the following amendments in the PPD document;

1) Add "(see MPEG91/155)" at the end of Section 3.4 14)
2) Add the following items to the end of Section 6.2.2 2)
   "- partial decoding and recoding for rescaling"
   "- on-line transmission bit rate selection on demand"

1.4.2.9 MPEG91/158 (VADIS) EUREKA 625 (VADIS) press release 1

Requirements related activities in the European project VADIS were presented for information.

1.4.2.10 MPEG91/160 (Intel) Proposal for parallel decode requirement for MPEG-2

It was reported that a potential reduction in hardware may be obtained by using parallel structure implementations.

1.4.2.11 MPEG91/163 (PRISM) Some hooks for error recovery

The necessity of some hooks for error recovery was presented for information. It was pointed out that other means, such as use of INTRA, may give a solution. How to implement the error recovery is for further study.
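The cell loss simulation recommended in the Note under 1.4.2.4 (replacing 384 coded bits with '0' every 0.1 sec) can be sketched as follows. This is only an illustration under stated assumptions, not the method of any proposer: the function names, the choice to leave the first 0.1 sec of the stream intact, and the constant bit rate model are all hypothetical.

```python
def simulate_cell_loss(bitstream: bytes, bit_rate: float,
                       burst_bits: int = 384, interval_s: float = 0.1) -> bytes:
    """Return a copy of `bitstream` with `burst_bits` coded bits set to 0
    once per `interval_s` of stream time, assuming a constant bit rate.

    Assumption: the first burst is placed one interval into the stream,
    so the leading headers are left intact.
    """
    if burst_bits % 8 != 0:
        raise ValueError("burst_bits must be a whole number of bytes")
    bytes_per_interval = round(bit_rate * interval_s / 8)
    if bytes_per_interval <= 0:
        raise ValueError("bit_rate * interval_s is too small")
    burst_bytes = burst_bits // 8

    damaged = bytearray(bitstream)
    for start in range(bytes_per_interval, len(damaged), bytes_per_interval):
        end = min(start + burst_bytes, len(damaged))
        damaged[start:end] = bytes(end - start)  # overwrite with zero bytes
    return bytes(damaged)


def cell_loss_ratio(bit_rate: float, burst_bits: int = 384,
                    interval_s: float = 0.1) -> float:
    """Fraction of coded bits (equivalently, of 384-bit cells) that is lost."""
    return burst_bits / (bit_rate * interval_s)
```

Under these assumptions, 384 bits is the 48-byte payload of one ATM cell, and at a bit rate of 4 Mbit/s `cell_loss_ratio(4_000_000)` gives about 9.6E-4, i.e. close to the 1E-3 ratio named in the Note. The damaged stream would then be decoded as usual and the pictures reconstructed.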
1.4.3 Clarifications found necessary during the meeting

1.4.3.1 Small difference in number of coded picture elements

The meeting agreed to allow a small difference in the number of coded picture elements, such as 704 pels/line instead of 720 pels/line, and 240 lines/field instead of 243 lines/field, on the assumption that this small difference does not affect the coding performance. The algorithm proposers should include these parameters in the description.

According to this agreement, the second line of Section 6.2.1 3) f. should now read

    "Some coding algorithms may subsample or crop the input signal ..."
                                             ^^^^^^^

1.4.3.2 Random access specification for the test (Section 6.2.1 3) c. and d./PPD)

The following modifications were agreed to clarify the current specification:

1) Replace the current Item c. including Note with;
   "Maximum interval between two entry points for random access must be less than about 2/5 second (equivalent to 10 frames at 25 Hz, 12 frames at 30 Hz)."
2) Add at the end of Item d. "(including a single frame (namely two fields in a set) random access)".

1.4.3.3 Paper listing for the coded bitstream file

The following item is included at the end of Section 6.2.1 4) a. of the PPD document for completeness of the required documentation for proposal submission.

   "- Paper listing for each sequence which indicates the corresponding coded bit stream file in a format "ls -l" output (see Section 7 of this document)"

1.4.3.4 Obsolete items

Delete Section 3.4 of the PPD document.

1.4.3.5 Questionnaire for hardware complexity estimation

The following outcome of IMPLEMENTATION subgroup is included in the PPD document.

1) Add the following at the end of Section 8;

"It is a very desirable objective that it be possible to implement fully automatic encoders. The extent to which any non-automatic adjustment of coding parameters has been used to generate the pictures submitted to the Kurihama tests must be declared.
The nature of these must be described such that an assessment can be made of the feasibility of eventual automation in real encoders."

2) Add the following list at the end of Section 6.2.1 4) a.

"
- number and sizes of all picture buffers, including necessary display buffers
- size of coded data buffer
- for each module
    - size and width of memory (on/off case)
    - memory bandwidth (on/off case)
    - number and width of additions per second
    - number and width of multiplications per second
    - table sizes (specify fixed or downloaded)
    - number of table lookups per second (specify fixed or downloaded)
    - a functional description including data, control, and address generation function and flow. Similarity, if any, to well known modules such as DCT, VLC decoder should be pointed out. Any "tricks" that can simplify implementations e.g. symmetry in tables should be stated.
- global
    - Any implications on coder and decoder when altering modes, parameters etc. to suit applications.
    - Any claims with supporting justification that the algorithm is amenable to simple, cheap implementation.
    - Block diagrams etc. and explanation of algorithm operation (likely to be similar to, if not same as, information required by VIDEO group)
"

1.4.3.6 Test sequences for the Kurihama tests

Replace Table 2 with that in the report of TEST subgroup.

1.4.4 Outstanding items for the Kurihama tests

1.4.4.1 Schedule

Two alternatives were considered for the Kurihama meeting including the subjective test; November 14-22 (14-19 for subjective test, 18-22 for WG11) or November 18-26 (18-21 for subjective test, 18-26 for MPEG1). Some members expressed a preference for the first alternative, while others preferred the second. Most members remained silent. These views were conveyed to the Convenor to facilitate his decision.
1.4.4.2 Testing bit rates and sequences

After having reviewed the tape demonstrations, the meeting agreed to keep the testing bit rates as they are according to the agreement at the Paris meeting (see Section 3.3 of MPEG91/094). Testing sequences were agreed as in the report of TEST subgroup based on the number of pre-registered coding schemes and the capacity of the hosting laboratory.

1.4.4.3 Proposal document submission

Every proposer is requested to submit the proposal document according to the MPEG document management agreement (see Recommendation nn of the WG11 Santa Clara meeting). It is noted that the hosting laboratory, JVC, does not provide copying service for these algorithm proposal documents.

1.4.4.4 Deadline for D-1 demonstration tape

Each proposer is requested to bring his/her D-1 tape directly to the meeting for demonstration of those features other than normal play back (see Section 6.2.1 4) c.).

1.4.5 Modifications made during the MPEG Plenary on August 23

1) Section 3.4 7) c. 1st line: change "optimized" to "adequate"

2) Add the following as new Section 3.5

    3.5 Implementation aspects

    Algorithm developers seek solutions which allow flexibility in the choice of implementation architectures, thus giving manufacturers the opportunity to design equipment for the widest range of applications.

3) Section 4.1: change "-2 picture frames" to "-20 ms" and "+3 picture frames" to "+40 ms", and reference be made to CCIR Rec. 772.

4) Section 6.2.1 4) a.: add the following at the end

    - It is a very desirable objective that it be possible to implement fully automatic encoders. The extent to which any non-automatic adjustment of coding parameters has been used to generate the pictures submitted to the Kurihama tests must be declared. The nature of these must be described such that an assessment can be made of the feasibility of eventual automation in real encoders.
1.5 Improvement of document handling

Related to the discussion in Section 1.4.4.3, a proposal was raised to improve document handling in meetings: each document, or at least its abstract, should be distributed among participants in advance of the meeting. This allows participants to review the contributions before the meeting starts, thus increasing the productivity of the meeting. The meeting supported this proposal, but at the same time saw practical difficulties. The Convenor should be consulted on whether such a mechanism could be established.

1.6 Recommendations of REQUIREMENTS subgroup

1) Video coding algorithms be submitted according to the revised PPD document made available at the WG11 plenary of the Santa Clara meeting.
2) Identification of functional requirements for the audio and system be initiated at the next meeting.
3) Evaluation for functionalities of the proposed algorithms be initiated at the next meeting.

2. Test Sub-group (by Sakae Okubo)

2.1 General

The MPEG/TEST meeting took place in two sessions under the chairmanship of Mr. T. Hidaka; 14:00-17:00 of August 21 and 12:00-13:00 of August 22.

2.2 Finalization of the boundary conditions for the Kurihama test

2.2.1 Test sequences

Since the number of intended proposals is as large as 39, the number of test sequences has to be reduced as follows;

----------------------------------
Sequence 4M 9M
----------------------------------
Flower Garden X X
Popple X
Table Tennis X X
Mobile and Calendar X X
Football X
----------------------------------
No. of Test Sequences 4 4
----------------------------------

Note of Chairman - At the plenary of WG11 on August 23, Mr. Koster pointed out that the 625 line Exabyte version of the "Football" test sequence contains some errors. After the meeting, Mr. Morris checked the D-1 version but found the same errors. Due to the short time to the D-1 tape submission deadline, it was decided to drop "Football" from the Kurihama tests.
2.2.2 Assessors for the subjective tests

1) Two groups of assessors were agreed for the Kurihama tests;
2) 50 assessors with normal color vision are required; 25 assessors per group.
3) Algorithm proposers have the first priority to send assessors for the test;
   - one assessor per proposal
   - two to three assessors per joint proposal
4) Other MPEG/CCITT/CMTT members can take part in the subjective tests on the second priority basis.

During the meeting, an application sheet for assessors was circulated, obtaining about 40 registrants.

2.2.3 Test conditions

Mr. Mead, Hughes, proposed in MPEG91/168 the use of 2H and 4H viewing distances for the Kurihama tests to get better distinction among proposals. Some supported this proposal, while others stressed the use of the well established CCIR Rec. 500-3 method. The meeting concluded to maintain the official test method as it is, but at the same time agreed that JVC would provide additional test sessions beside the official ones to experiment with the shorter viewing distances. Volunteers are welcome to join this experiment.

2.2.4 Final schedule for the Kurihama test

- 4 to 5 sessions per day
- 4 days

2.3 Discussion on next steps in MPEG decision process; ranking criteria

Mr. Lemay, BNR, proposed to use the variance analysis method in MPEG91/112 for the Kurihama test results, which identifies a group of algorithms not significantly different from the best ranked algorithm. The meeting agreed to use this method in addition to the current method for better interpretation of the results.

2.4 Recommendations of the TEST sub-group

1) D-1 tapes shall be submitted to JVC-Kurihama by 18 October 1991.
2) In addition to formal subjective testing in accordance with CCIR Rec. 500-3, an experimental subjective test environment with viewing distances of 2H and 4H will be provided at the Kurihama meeting.

3. Video Sub-group (by Jacques Guichard)

The video group met four times during the week.
It also had a common meeting with the audio group and another with the requirements and test groups.

- The first meeting mainly concerned the review of the MPEG1 CD (Committee Draft). This document is very close to being finalised since very little modification was reported.

- The second meeting was held together with the requirements and test groups, and concerned:

    1 - Methodology for convergence
    2 - Technical inputs to MPEG2

  Point 1 was discussed with the three groups. The main issue was the procedure for the Kurihama tests. The outcome of the Kurihama tests should be followed by a clustering phase to reduce the 39 proposals to n, and then by a second phase to go from n to a single TM (Test Model). Based on the MPEG1 experience, the importance of a verification group during the tests was emphasized, as well as the concept of a core experiment model, which was very helpful.

  Point 2 was discussed within the video group only. It was limited to a brief presentation of the following documents:

    * H. Sandgrind (Norway Telecom) CCITT AVC-104
    * J. Guichard (CNET France Telecom) CCITT AVC-103
    * K. McCann (National Transcom Ltd.) MPEG91/157, 158, 159 (some work which is currently done in VADIS)
    * I. Park (British Telecom) CCITT AVC-100
    * A. Koster (RNL) CCITT AVC-94
    * D. Anastassiou (Columbia University) MPEG91/131 (some consideration regarding the advantages of coding pictures on a field basis versus a frame basis)

- The third session was devoted to tape demonstrations of coded pictures concerning the current work for MPEG/H.26X. Only European members disclosed information regarding their coding schemes.

- The last session was devoted to patent issues.

4. System Sub-group (by Barry G. Haskell)

The WG11 (MPEG) Systems Committee met in Santa Clara August 19-23, 1991. During this period a number of improvements and simplifications were made to the multiplexing specifications, culminating in Revision 8/22/91 of the Committee Draft (CD).
Recall that at the Paris meeting a System Target Decoder (STD) was defined for demultiplexing data packets into individual buffers, one for each elementary stream. Decoder buffer sizes are assumed known to the encode/multiplex system, and it is the responsibility of that system to not overflow or underflow the decoder buffers of the STD. Demux System Time Clock (STC) recovery is the responsibility of the designer.

In Santa Clara, the following agreements were arrived at:

1) A System Header Packet was agreed to that contains information about the ISO11172 (MPEG) Stream, including maximum data rate, maximum number of audio and video elementary streams, whether or not the data rate is fixed, and the maximum sizes of STD buffers for each constituent elementary stream. The first packet of the ISO11172 stream must be a System Header Packet.

2) Two private elementary stream types were defined. Private1 uses the normal packet header protocol, including optional stuffing, buffer size indication and time stamps. Private2 is only required to provide packet_start_code and packet_length.

3) STD buffer size specification now has two scales. Scale 0, required for audio, is in multiples of 128 bytes. Scale 1, required for video, is in multiples of 1024 bytes. Other elementary streams can use either scale.

4) A tolerance of 50 ppm was agreed to for the demultiplexer System Time Clock (STC).

5) A maximum time of 0.7 seconds between packs was agreed to.

6) A start was made on defining a Constrained System Parameter Stream (CSPS). In video, this concept had been referred to as "core", but the term seems to have fallen from favor. So far, we have agreed only on a maximum packet rate within a pack of 300 per second.

A little progress was made toward understanding how variable data rates might work. However, a common vocabulary has not yet been established, and the situation is far from being completely understood.
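The numeric agreements in items 3) to 5) above can be written down as simple checks. The following is a minimal sketch in Python, assuming hypothetical function and parameter names; it does not reproduce the field names or syntax of the Committee Draft, and the nominal STC frequency is left as a parameter because it is not stated in this report.

```python
# Units for the two STD buffer size scales agreed in item 3).
SCALE_UNIT_BYTES = {0: 128, 1: 1024}
STC_TOLERANCE_PPM = 50       # item 4)
MAX_PACK_INTERVAL_S = 0.7    # item 5)


def std_buffer_bytes(scale: int, size_units: int) -> int:
    """Buffer size in bytes for a size value expressed in the given scale."""
    return SCALE_UNIT_BYTES[scale] * size_units


def scale_allowed(stream_kind: str, scale: int) -> bool:
    """Scale 0 is required for audio, scale 1 for video; others may use either."""
    if scale not in SCALE_UNIT_BYTES:
        return False
    if stream_kind == "audio":
        return scale == 0
    if stream_kind == "video":
        return scale == 1
    return True


def stc_within_tolerance(nominal_hz: float, measured_hz: float) -> bool:
    """True if the measured clock deviates by at most 50 ppm from nominal."""
    return abs(measured_hz - nominal_hz) * 1_000_000 <= STC_TOLERANCE_PPM * nominal_hz


def pack_spacing_ok(pack_times_s: list[float]) -> bool:
    """True if no two consecutive packs are more than 0.7 seconds apart."""
    return all(later - earlier <= MAX_PACK_INTERVAL_S
               for earlier, later in zip(pack_times_s, pack_times_s[1:]))
```

For example, a size value of 46 in scale 1 corresponds to 46 x 1024 = 47104 bytes, while the same value in scale 0 corresponds to 5888 bytes; the finer scale suits the smaller audio buffers.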
For a constant (known) data rate, the System CD wording is fairly consistent. Bytes arrive at the STD at times known (fairly accurately) to the multiplexer and demultiplexer. If the DSM is master, the demux STC tries to lock itself to the SCRs as they are received. If the DSM is not master, the DSM (plus DSM buffer, if needed) must follow the master clock, in order to produce a (fairly) constant data rate.

For variable data rates, there are a number of scenarios. If the STD byte arrival times are known to the multiplexer, which then sets SCR values accordingly, there is not much problem, since the demux STC can still lock to the received SCRs. If the multiplexer does not know the STD byte arrival times, it has to work with hypothetical values.

For ATM networks, the byte transmission times may be sufficient if cell arrival jitter is small. In that case, a receiver buffer would try to output bytes to the STD so that demux STC and received SCR values match. Demux master clock and STC would use the fullness of this receiver buffer for tracking purposes.

For bursty networks or DSMs, the multiplexer might simply use a hypothetical constant data rate for setting SCRs. A large receiver buffer would then be needed to smooth the data rate, so that Demux STC and received SCRs could be made to match.

The System CD does not yet distinguish between hypothetical and actual STD byte arrival times. Further study and refinement of these and other aspects of multiplexing continue.

5. Implementation Sub-group (by Geoff Morrison)

The Implementation Studies Group met on the 21st and 22nd of August 1991. Discussions centered on three areas.

1) Verification of CD11172 by a decoder implementation operating in real time. A separate report on this has been prepared.

2) Document MPEG91/160 proposed that the list of requirements for MPEG-2 include the use of parallel structures for the video algorithm decoding operations which map on to parallel hardware implementations.
Although the meeting recognised that parallel processing can be useful especially at higher coded data and computation rates, there were also thought to be disadvantages such that parallel structured algorithms should not be especially favoured. For example, a parallel implementation generally requires a larger silicon area and hence higher cost. It was agreed that flexibility of implementation approaches was important and algorithm designers should endeavor to allow this. A recommendation was drafted to this effect.

3) The major part of the meeting was spent on finalising the procedure to be used for addressing the complexity of algorithms to be proposed for MPEG-2 and evaluated in November 1991. The basic principle of rank ordering had been established at the May 1991 meeting. Document MPEG91/113 proposed a methodology which was favourably reviewed. In particular, the use of standard deviation to flag inconsistencies between reviewers was agreed.

A suggestion at the previous meeting had been to provide a list of parameters etc. for proposers to provide information about. This was prepared for inclusion in the updated Proposal Package Description.

The meeting was of the opinion that encoders should also be assessed. The reasoning was that the cost and complexity of the encoder will be important in some applications and it would therefore be unwise for MPEG to proceed after the November subjective picture quality assessment exercise without relevant information. It was agreed to rank order the encoder and decoder complexity separately and that less precision in the encoder assessment would be acceptable.

It was pointed out that manual "tuning" of coding parameters for each test sequence, individual pictures within sequences and even individual blocks within pictures was possible under MPEG-2 proposal rules.
While the Implementation Group did not wish to limit the coded picture quality presented by proposers, there were obvious concerns about the practicality of achieving such results under real conditions. Accordingly a recommendation was drafted requiring proposers to state the extent and nature of such hand crafting so that an assessment can be made of the possibility and cost of its eventual automation.

The reason for performing the complexity evaluation is to bring the implementation cost considerations into the MPEG-2 decision making process. However, costs also depend on market size and hence general applicability of the chosen algorithm. A recommendation was prepared stating that all the requirements listed in the PPD be considered along with the picture quality and complexity results when selecting paths for further study.

The expected number of proposals for assessment at the November meeting raised several questions of logistics. It was agreed that the first 2 or 3 days of the November meeting be spent on a series of 30 minute interview sessions with proposers. These would consist of a 15 minute presentation followed by 15 minutes of a question and answer session. To aid the assessors in obtaining an understanding of the proposals by the first few days of the November meeting, it was agreed to request submission of all complexity assessment documentation from proposers by 1 November 1991 directly to all assessors (expected to number 12 to 15).

The remaining timetable for the complexity evaluation exercise would be as follows:

- By day 5 of November meeting - assessors make initial assessments and consult to check that the procedure is satisfactory, but these results not published.
- By 17 Dec 91 - Assessors finish individual rankings.
- On the first day of next meeting after November (i.e. 8 Jan 92 in Singapore) - Assessors meet to complete the evaluation process and issue the results.

END