CCITT SG XV

Working Party on Visual Telephony

Document #21

Specialists Group on Coding for Visual Telephony

Source: NTT, KDD, NEC and FUJITSU

Title: PROPOSALS OF FUNDAMENTAL ARCHITECTURE OF THE 384 kbit/s CODEC

## 1. Introduction

The following points require consideration for the 384 kbit/s codec:

(1) The codec should operate equally well for the 525/60 and the 625/50 television systems.

(2) A reasonably good picture quality, commensurate with the bit rate, should be achievable when the codec is operated at $384 \times n$ ($n \ge 2$) kbit/s.

(3) The codec structure should allow various implementations, in particular those utilizing LSI technology, except in the areas essential to the preservation of interconnectability.

To meet these conditions, the 'virtual digital video standard method' described in the contribution "Basic Parameters of 384 kbit/s Codec" (Document #20), also submitted to this meeting, would be useful. This contribution reports the results of the investigation made so far in our group concerning the basic codec architecture to be realized under the virtual digital video standard method.

## 2. Architecture based on the virtual digital video standard method

Details of the virtual digital video standard method are given in the companion document. The present contribution concentrates on the basic concepts and the structure of the encoder/decoder systems.

The basic standpoint of the virtual digital video standard method is that the encoder must be applicable to input video signals of both the 525/60 and the 625/50 television systems, in either composite or component format. The basic transmission rate is 384 kbit/s, but the codec must also be easily applicable to $384 \times n$ ($n \ge 2$) kbit/s.

Based on these standpoints, we propose that the basic architecture should consist of the following three components (also shown in Figure 1):

1. Video I/O processor
2. Video source coder/decoder
3. Transmission coder/decoder

Today's videoconferencing codecs, such as the conditional replenishment codec described in Rec. H.120, are built with a structure in which components (1) and (2) are combined into a single processing block called the video signal coder/decoder, while component (3), the transmission coder/decoder, is separate. In this proposal, however, we suggest that the former block be divided further into two blocks, a video I/O processor and a video source coder/decoder, to provide flexibility for the various video input and output signal types.

## 3. Outlines of the respective blocks

## 3.1 Video I/O processor

The agreement on the basic coding parameters is somewhat related to the question of what input video signal types will be used with the 384 kbit/s codec. These include the PAL, SECAM and NTSC standards, as well as composite signals and the component signal format standardized in CCIR Rec. 601.

In the existing codec architecture, operation of the video source coder depends on the input video signal clock. Hence the existing architecture requires a separate video source coder/decoder operation for each input/output video source, which would eventually eliminate the benefits of a unified codec: namely, common codec standardization, improved reliability through LSI utilization, reduced cost, and compactness of the unit.

The virtual digital video standard method proposes that a common digital signal format be used together with a common source coder block, allowing appropriate frame rates, line numbers, and horizontal sampling numbers to be selected for the various input video formats, under certain restrictions. This means that the virtual standard picture is defined according to the maximum value of each picture parameter among the input picture signals.
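The rule stated above (take the maximum of each picture parameter over the supported input signals) can be illustrated with a minimal sketch. The class `PictureFormat`, the function `virtual_standard`, and the numeric values are assumptions introduced for illustration only: the figures are the nominal total line counts, frame rates, and 13.5 MHz total samples per line of the two television systems, not the parameters actually proposed in Document #20.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class PictureFormat:
    """Picture parameters of one video format (hypothetical container)."""
    name: str
    frame_rate: Fraction    # frames per second
    lines: int              # lines per frame
    samples_per_line: int   # horizontal sampling number


def virtual_standard(formats):
    """Take the maximum of each parameter over all supported input formats."""
    return PictureFormat(
        name="virtual standard",
        frame_rate=max(f.frame_rate for f in formats),
        lines=max(f.lines for f in formats),
        samples_per_line=max(f.samples_per_line for f in formats),
    )


# Assumed example inputs; the real parameter values belong to Document #20.
INPUTS = [
    PictureFormat("525/60", Fraction(30000, 1001), 525, 858),
    PictureFormat("625/50", Fraction(25), 625, 864),
]

if __name__ == "__main__":
    print(virtual_standard(INPUTS))
```

Note that under this rule the resulting format need not coincide with either input: with the assumed values it combines the frame rate of the 525/60 system with the line count and sample count of the 625/50 system.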
The video I/O processor is defined as responsible for converting the input analog (or digital) signal into the virtual standard picture signal, and vice versa. As a result, an alignment is required between the input/output video signal clock (f1 and f1' in Figure 1) and the source coder clock (f2 and f2' in Figure 1), which can be handled through interconnecting buffer memories. At the same time, a sampling frequency type identification mark and input video signal clock information, especially video frame clock information, should be sent to the decoder side as attribute information. The decoder side can use this information optionally.

## 3.2 Video source coder/decoder

The video source coder/decoder performs the coding and decoding of the virtual digital video standard picture, which is created by the video I/O processor described in 3.1. At present, coders operate based on the video signal clock, but the basic concept proposed here is that the functional blocks should be operable independently. An alteration of the source coder clock f2 in response to a change in the input video signal clock f1 is also optional.

One such example is shown in Figure 2, where clocks f2 and f2' are independent of the video signal system and have values slightly larger than those of f1 and f1'. In this system, intermittent operation based on the input video frame signals is possible for all input picture signal formats. In this case, as long as the values of f2 and f2' are larger than those of f1 and f1', the rest of the configuration is totally flexible. It is possible, for example, to use a different clock f3, located in the transmission coder block, as the source clock for f2 and f2'.

We should note that this is only a single example, chosen to explain the independent operation of the video source coder/decoder and the video I/O processor.

## 3.3 Transmission coder/decoder

This block is essentially the same as that described in Rec. H.130, and its essential functions are as follows:

1. Multiplexing of picture, voice, data and necessary codec-to-codec information signals
2. Generation of the transmission frame format
3. Error correction function
4. Demand refresh function
5. Conversion to transmission codes
6. Encryption function (optional)
7. Ability to handle non-BSI transmission channel (optional)

In the 384 kbit/s codec the net transmission data rate is low, and it is conceivable that cyclic refresh alone would not provide sufficient recovery from transmission errors. A demand refresh function is therefore considered necessary. This holds even for unidirectional picture transmission (for example, where one direction carries voice and video while the other carries voice only), in which case some kind of backward path for demand refresh is required.

## Summary

A proposal was made for the basic architecture of the 384 kbit/s codec. In accordance with the virtual digital video standard method, the basic architecture should be composed of three independent processing blocks: the video I/O processor, the video source coder/decoder, and the transmission coder/decoder.

Figure 1 Basic configuration of the 384 kbit/s codec

Figure 2 An example of the video source coder operation for the 525/60 and the 625/50 television input signals
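The intermittent operation of section 3.2, illustrated in Figure 2, can be sketched with a toy timing model. It assumes, purely for illustration, that the source coder consumes one sample per f2 clock cycle, so that it is busy for the fraction f1/f2 of each input frame period and waits for the remainder. The function names and the clock values (13.5 MHz for f1, 14 MHz for f2) are hypothetical and are not taken from this contribution.

```python
from fractions import Fraction


def duty_cycle(f1_hz, f2_hz):
    """Fraction of time the source coder is busy when fed at f1 and clocked at f2."""
    f1, f2 = Fraction(f1_hz), Fraction(f2_hz)
    if f2 < f1:
        # The coder could not keep up and the interconnecting buffer would overflow.
        raise ValueError("f2 must not be lower than f1")
    return f1 / f2


def idle_time_per_frame(frame_rate, f1_hz, f2_hz):
    """Seconds per input frame period during which the coder waits for the next frame."""
    frame_period = 1 / Fraction(frame_rate)
    return frame_period * (1 - duty_cycle(f1_hz, f2_hz))


F1_HZ = 13_500_000  # assumed input sampling clock, same for both systems here
F2_HZ = 14_000_000  # hypothetical source coder clock, slightly above f1

if __name__ == "__main__":
    systems = (("525/60", Fraction(30000, 1001)), ("625/50", Fraction(25)))
    for name, rate in systems:
        busy = float(duty_cycle(F1_HZ, F2_HZ))
        idle_ms = float(idle_time_per_frame(rate, F1_HZ, F2_HZ)) * 1e3
        print(f"{name}: busy fraction {busy:.3f}, idle {idle_ms:.3f} ms per frame")
```

In this model the same f2 serves both television systems; only the idle interval per frame changes with the input frame rate, which is the independence between the video I/O processor and the source coder that the example in the text is meant to show.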