Any real-time video streaming codec is subject to the limitations of the network used to transfer the bitstream. The network imposes limited bandwidth and introduces jitter and, possibly, packet loss. The H.264 codec was designed with some awareness of the network, expressed through the Network Abstraction Layer (NAL). On the encoder side, this layer generates NAL units, each of which is intended to be transferred over the network as one piece. Very often the encoded data for a video frame is larger than the network's maximum transmission unit (MTU), so a frame may be divided into multiple NAL units. On the decoder side, the NAL units are received from the network (or lost in transit), and those that arrive are passed to the decoder.
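To make the MTU constraint concrete, the following is a minimal sketch of the arithmetic, not a description of how any particular encoder slices a frame. It assumes a 1500-byte Ethernet MTU and NAL units carried over RTP/UDP/IPv4 with the minimum header sizes; the function names `max_nal_unit_size` and `min_nal_units_for_frame` are hypothetical. A real encoder divides a frame at slice boundaries rather than at arbitrary byte offsets, so the result is only a lower bound.

```python
import math

# Assumed transport: RTP over UDP over IPv4 on Ethernet (not stated in the text).
ETHERNET_MTU = 1500  # bytes, typical Ethernet MTU
IPV4_HEADER = 20     # bytes, IPv4 header without options
UDP_HEADER = 8       # bytes
RTP_HEADER = 12      # bytes, fixed RTP header without CSRC entries or extensions


def max_nal_unit_size(mtu: int = ETHERNET_MTU) -> int:
    """Largest NAL unit (header byte included) that fits in one packet."""
    return mtu - IPV4_HEADER - UDP_HEADER - RTP_HEADER


def min_nal_units_for_frame(frame_bytes: int, mtu: int = ETHERNET_MTU) -> int:
    """Lower bound on the number of NAL units needed for one encoded frame."""
    budget = max_nal_unit_size(mtu)
    if budget <= 0:
        raise ValueError("MTU is too small to carry any payload")
    return math.ceil(frame_bytes / budget)


if __name__ == "__main__":
    print(max_nal_unit_size())             # per-packet budget in bytes
    print(min_nal_units_for_frame(20000))  # units needed for a 20000-byte frame
```

Under these assumptions each packet has room for 1460 bytes, so a 20000-byte frame needs at least 14 NAL units.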
The H.264 codec has two classes of NAL units: Video Coding Layer (VCL) and non-VCL. The non-VCL units include the sequence parameter set (SPS) and the picture parameter set (PPS), which must reach the decoder before the VCL units that refer to them. The VCL units carry the coded picture data itself, in the form of slices. The decoder cannot decode the VCL units without the SPS and PPS, so it is important to ensure that they are delivered. H.264 NAL units do not contain timestamp or sequence-number information, so a transport layer (such as RTP) is needed to make sure that NAL units are delivered to the decoder in the proper order and that decoded frames are presented at the correct time.
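The distinction between VCL and non-VCL units can be read from the first byte of each NAL unit. The sketch below decodes that one-byte header and holds back VCL units until both an SPS and a PPS have been seen. The header layout and type codes (1 to 5 for coded slices, 7 for SPS, 8 for PPS) follow the H.264 specification; the `ParameterSetGate` class and the function names are hypothetical, and a real receiver would also track parameter set IDs rather than just their presence.

```python
NAL_SPS = 7  # sequence parameter set
NAL_PPS = 8  # picture parameter set


def parse_nal_header(nal: bytes) -> dict:
    """Decode the one-byte NAL unit header."""
    if not nal:
        raise ValueError("empty NAL unit")
    first = nal[0]
    return {
        "forbidden_zero_bit": first >> 7,   # must be 0 in a valid stream
        "nal_ref_idc": (first >> 5) & 0x3,  # 0 means not used for reference
        "nal_unit_type": first & 0x1F,      # 5-bit type code
    }


def is_vcl(nal_unit_type: int) -> bool:
    """Types 1 to 5 carry coded slice data; other types are non-VCL."""
    return 1 <= nal_unit_type <= 5


class ParameterSetGate:
    """Hypothetical helper: reject VCL units until SPS and PPS have arrived."""

    def __init__(self) -> None:
        self.have_sps = False
        self.have_pps = False

    def accept(self, nal: bytes) -> bool:
        """Return True if this NAL unit may be passed to the decoder."""
        nal_type = parse_nal_header(nal)["nal_unit_type"]
        if nal_type == NAL_SPS:
            self.have_sps = True
        elif nal_type == NAL_PPS:
            self.have_pps = True
        elif is_vcl(nal_type):
            return self.have_sps and self.have_pps
        return True
```

For example, a unit whose first byte is `0x65` (an IDR slice, type 5) is rejected by a fresh gate and accepted once units starting with `0x67` (SPS) and `0x68` (PPS) have passed through it.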